Adi Levin's Blog for programmers

June 2, 2009

What is a thread?

Filed under: Multithreading — Adi Levin @ 9:01 pm
Tags: , , ,

A thread is the entity inside a process that can be scheduled for execution. A process always has at least one thread, but it can have many threads, all sharing the resources of the processes (address space, handles, executable code).  Whenever executable code (a function) is running, it is running in the context of a certain thread. The thread context includes a thread’s set of machine registers, and a stack. The stack holds the call stack and the addresses of the automatic variables (Variables not allocated on the heap, but declared normally inside a function).

Before I began to “think multithreaded”, whenever I was debugging a program, I looked at the call-stack of the program in the debugger, to understand what my program was doing at any moment. Now I know that there is no single call-stack of the program. Every thread has its own call-stack. Whenever I break the process for debugging, I look at which threads are running at the moment (Choose “threads” from the “debug” menu in Visual Studio), and then I look at the call-stack of each of the relevant threads.

Whenever an exception is thrown, you need to know where to look in the code, so you have to know which thread threw the exception. It is not true any more that a program throws an exception – from now on say that a thread throws an exception. Debugging is even more complicated in multithreaded programs, because a thread can throw an exception, because its is accessing data that has been corrupted by another thread!

What is the stack?

To understand multithreading, it is important to distinguish between variables that on the stack, and variables (data) on the heap. Thus, it is important to understand exactly what a stack is.

A stack is a range of addresses, which is used to store variables and to enable function calls. A local variable (or array) defined within the scope of a function is allocated on the stack. Such allocation is much more efficient than allocation on the heap (via malloc), because it takes zero time. The address of a variable is known, in compile time, as a fixed offset from the stack pointer that represents the beginning of the function. In contrast, when allocating an array on the heap, the C run-time library has to look for a range of addresses in the process address space which is not allocated yet. This search can take a long time (relatively).

When calling a function, the value of the arguments and some registers are pushed to the stack, and the stack pointer is updated. When returning from a function, the stack pointer is restored by popping its old value from the stack. This is how the “return” command knows where to return to.

Since the stack is part of the thread context, variables that are on the stack (i.e. local variables and function arguments) are local to the thread. If a function is executed simultaneously in two different threads, each thread hold its own instances of all the local variables. Thus, local variables are not resources that are shared between threads.

By default, the size of the stack in 1MB, which means, for example, that you can’t define an array of over 2^20 characters. This will result in a crush. When creating threads, you can control the size of the stack. In case the function that runs on the thread is guaranteed to be very simple, consider setting the stack size to be much less than the default.

Starting and Ending a thread

When the process is created, it is always created with one thread (called the primary thread).  The standard way to create new threads is to call the function CreateThread. CreateThread gets as input a pointer to a function that should run on the newly created thread, and a parameter to pass to that function. Immediately after the thread is created, the function starts with the given parameter value. When this function ends, the thread ends, and all the resources associated with it are freed (in particular, the stack). The value of the parameter can be used to send the pointer of an object, such that the thread function will operate on the object’s data.

Thread scheduling and priorities 

Once the thread is started, Windows schedules it to run on one of the CPUs according to certain scheduling rules. In case more threads are requested than the number of processors, Windows will schedule a thread to run on a CPU, then it will suspend (“preempt”) it, and let another thread run on that CPU. This method is known as preemptive scheduling. It enables a number of processes appear to run concurrently, even on a single processor machine.

If you count the threads from all of the running processes on your machine, you’ll see that there are probably hundreds of threads living in the system at any time. However, most of them are not doing any computation. Most of them are waiting for input (in the form of mouse click, Windows message, etc…) or sleeping for a prescribed amount of time. Of the threads that require actual computation, only 2 are currently running (in case of a dual-core machine), and the rest are suspended (preempted).

As a programmer you can’t control the scheduling process, but you can influence it via the mechanism of priorities. Each thread has a priority, which is represented by an integer number. By default, threads have the priority level called normal priority. The function SetThreadPriority can be used to change the priority, for example, to below normal or above normal. A thread with priority below normal will only be allowed to run if there is no thread of normal or higher priority that currently requests CPU time. Setting priority to below normal is particularly useful for performing background tasks that (such as havey computations) should not interfere with foreground activities (such as user interaction). If the worker thread that does the heavy computation has priority below normal, the user is free to interact with the primary thread of the application. If there is a lot of user interaction during the background computation, the computation will have less CPU time and it will take more time to complete. Otherwise, the computation will take the same time it would take if the worker thread had normal priority.

Be aware that Windows schedules threads and not processes. Therefore, a process running 2 concurrent threads will be given more CPU time that a process running 1 thread, if the two processes run simultaneously, and their priorities are the same.

How do threads communicate?

The simplest way to transfer data between one thread and the other is via the memory. Threads share the resources of the process, hence they have access to the address space of the process. When a thread is created, you supply a parameter to the thread function. That parameter is typically the pointer to an object that you define, that can be as complex as you want. The thread that called CreateThread also has a pointer to the same object. Thus, the two threads have shared access to an arbitrary amount of data.

Here is an example for how you can use it: Suppose you have a worker thread doing some heavy computation, and you want to be able to stop it upon demand, from the primary thread. The way to do it is to define a boolean variable called “abort_requested“, and pass it to the worker thread, with zero as an initial value. The function doing the heavy computation frequently checks if abort_requested is true, and when that happens, it exits. In order to stop the worker thread, all the primary thread needs to do is to set abort_requested=1. When doing so, it is adviable to define the variable abort_requested as volatile. This means that compiler optimizations will not apply to it.

Other methods of communication exist, and they include Windows Messages and Synchronization objects. However, it is advised to use these mechanisms only when they are really necessary, because they are more expensive.

Advertisements

4 Comments »

  1. corrections:
    1. “momenr” should be “moment”
    2. “multithreadnig” -> “multithreading”
    3. “By default, the size of the stack in 1MB” already mentioned at previous paragraph – (whose default size is 1MB)

    Sof-sof I know what a Thread is.

    How do threads communicate
    ————————–
    I remember issues of synchronization while reading/writing from/to a shared memory. The term was “critical segment”. What’s happen if one thread is reading from the shared memory while at the very same time the other thread is writing to it ?

    Comment by Doron Osovlanski — September 6, 2009 @ 2:37 pm | Reply

    • Thanks for the corrections, Doron. I fixed them.

      Regarding your question – unsynchornized read/write may result in undeterministic behavior, i.e. each run of the program can have different results. However, this may be fine, depending on the situation. When the non-determinism is critical to the correctness of the system, it is called a “race condition” (because threads are racing against each-other to see who gets first to that place in memory). I myself don’t hesitate to do it, when I know it is safe to do so, because I try to synchronize as little as possible – for simplifcation of code and for performance optimization.

      A race condition happens, for example, if you compose a null-terminated string while reading it from another thread. The write/read operations are not atomic (i.e. they are composed of many instructions, and context-switches can occur in the middle). Therefore, if you read before the write is complete, you don’t know what you’re going to get. Will there be a NULL character at the end of the string, or will it be a long string full or garbage, because the NULL character hasn’t been written yet?
      For another example of a race condition, take a look at my post on “locking mechanisms” in the last section on “interlocked functions”.

      On the other hand, if one thread is only checking if a certain flag is non-zero, and the other thread sets its value to be non-zero – this is fine. The write/read operations are atomic, and the non-deterministic behavior is what you would expect. On the hardware level, the read/write operations to the same place in physical memory cannot happen simultaneously, so there’s no need to worry about it. The hardware is in charge of deciding what happens first – the read or the write, depending on what instruction was processed first, and on memory management issues.

      Comment by adilevin — September 6, 2009 @ 3:30 pm | Reply

  2. Basically my question was answered while I was reading chapter 5 but thanks anyways.
    Unfortunatelly, new question arises: why checking is a certian flag is non-zero, considered as an atomic operation ?
    What’s happened to read-ask-write scenario ?

    Comment by Doron Osovlanski — September 8, 2009 @ 1:03 pm | Reply

    • You are right, of course – checking that the flag is non-zero requires a number of operations. But the reading itself is atomic. So, by the time you check the value of the flag, it might have been modified. Depending on the scenario, you have to decide whether this bothers you or not. In case of a flag that is being checked just as a way to signal to the thread that the user requests to abort in the middle of computation – the exact timing is not important, so if the flag is non-zero you can safely assume that the user requested an abort. If the flag is zero, it is possible that there was an abort request that you missed, but if the flag is checked at a high frequency, you will see the non-zero value in the next iteration, so there’s no problem.

      Comment by adilevin — September 8, 2009 @ 1:09 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: