12.1 A Motivating Problem: Monitoring File Descriptors
A blocking read operation causes the calling process to block until input becomes available. Such blocking creates difficulties when a process expects input from more than one source, since the process has no way of knowing which file descriptor will produce the next input. The multiple file descriptor problem commonly appears in client-server programming because the server expects input from multiple clients. Six general approaches to monitoring multiple file descriptors for input under POSIX are as follows.
1. A separate process monitors each file descriptor (Program 4.11).
2. select (Program 4.12 and Program 4.14).
3. poll (Program 4.17).
4. Nonblocking I/O with polling (Example 4.39).
5. POSIX asynchronous I/O (Program 8.14 and Program 8.16).
6. A separate thread monitors each file descriptor (Section 12.2).
In the separate process approach, the original process forks a child process to handle each file descriptor. This approach works for descriptors representing independent I/O streams, since once forked, the children don't share any variables. If processing of the descriptors is not independent, the children may use shared memory or message passing to exchange information.
Approaches two and three use blocking calls (select or poll) to explicitly wait for I/O on the descriptors. Once the blocking call returns, the calling program handles each ready file descriptor in turn. The code can be complicated when some of the file descriptors close while others remain open (e.g., Program 4.17). Furthermore, the program can do no useful processing while blocked.
The nonblocking strategy of the fourth approach works well when the program has "useful work" that it can perform between its intermittent checks to see if I/O is available. Unfortunately, most problems are difficult to structure in this way, and the strategy sometimes forces hard-coding of the timing for the I/O check relative to useful work. If the platform changes, the choice may no longer be appropriate. Without very careful programming and a very specific program structure, the nonblocking I/O strategy can lead to busy waiting and inefficient use of processor resources.
POSIX asynchronous I/O can be used with or without signal notification to overlap processing with monitoring of file descriptors. Without signal notification, asynchronous I/O relies on polling as in approach 4. With signal notification, the program does its useful work until it receives a signal advising that the I/O may be ready. The operating system transfers control to a handler to process the I/O. This method requires that the handler use only async-signal-safe functions. The signal handler must synchronize with the rest of the program to access the data, opening the potential for deadlocks and race conditions. Although asynchronous I/O can be tuned very efficiently, the approach is error-prone and difficult to implement.
The final approach uses a separate thread to handle each descriptor, in effect reducing the problem to one of processing a single file descriptor. The threaded code is simpler than the other implementations, and a program can overlap processing with waiting for input in a transparent way.
Threading is not as widely used as it might be because, until recently, threaded programs were not portable. Each vendor provided a proprietary thread library with different calls. The POSIX standard addresses the portability issue with POSIX threads, described in the POSIX:THR Threads Extension. Table E.1 on page 860 lists several additional extensions that relate to the more esoteric aspects of POSIX thread management. Section 12.2 introduces POSIX threads by solving the multiple file descriptor problem. Do not focus on the details of the calls when you first read this section. The remainder of this chapter discusses basic POSIX thread management and use of the library. Chapter 13 explains synchronization and signal handling with POSIX threads. Chapters 14 and 15 discuss the use of semaphores for synchronization. Semaphores are part of the POSIX:SEM Extension and the POSIX:XSI Extension and can be used with threads. Chapters 16 and 17 discuss projects that use threads and synchronization.
12.2 Use of Threads to Monitor Multiple File Descriptors
Multiple threads can simplify the problem of monitoring multiple file descriptors because a dedicated thread with relatively simple logic can handle each file descriptor. Threads also make the overlap of I/O and processing transparent to the programmer.
We begin by comparing the execution of a function by a separate thread to the execution of an ordinary function call within the same thread of execution. Figure 12.1 illustrates a call to the processfd function within the same thread of execution. The calling mechanism creates an activation record (usually on the stack) that contains the return address. The thread of execution jumps to processfd when the calling mechanism writes the starting address of processfd in the processor's program counter. The thread uses the newly created activation record as the environment for execution, creating automatic variables on the stack as part of the record. The thread of execution continues in processfd until reaching a return statement (or the end of the function). The return statement copies the return address that is stored in the activation record into the processor program counter, causing the thread of execution to jump back to the calling program.
Figure 12.2 illustrates the creation of a separate thread to execute the processfd function. The pthread_create call creates a new "schedulable entity" with its own value of the program counter, its own stack and its own scheduling parameters. The "schedulable entity" (i.e., thread) executes an independent stream of instructions, never returning to the point of the call. The calling program continues to execute concurrently. In contrast, when processfd is called as an ordinary function, the caller's thread of execution moves through the function code and returns to the point of the call, generating a single thread of execution rather than two separate ones.
We now turn to the specific problem of handling multiple file descriptors. The processfd function of Program 12.1 monitors a single file descriptor by calling a blocking read. The function returns when it encounters end-of-file or detects an error. The caller passes the file descriptor as a pointer to void, so processfd can be called either as an ordinary function or as a thread.
The processfd function uses the r_read function of Program 4.3 instead of read to restart reading if the thread is interrupted by a signal. However, we recommend a dedicated thread for signal handling, as explained in Section 13.5. In this case, the thread that executes processfd would have all signals blocked and could call read.
Program 12.1 processfd.c
The processfd function monitors a single file descriptor for input.
#include <stdio.h>
#include "restart.h"
#define BUFSIZE 1024
void docommand(char *cmd, int cmdsize);
void *processfd(void *arg) {            /* process commands read from file descriptor */
    char buf[BUFSIZE];
    int fd;
    ssize_t nbytes;

    fd = *((int *)(arg));
    for ( ; ; ) {
        if ((nbytes = r_read(fd, buf, BUFSIZE)) <= 0)
            break;
        docommand(buf, nbytes);
    }
    return NULL;
}
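The docommand function is declared in processfd.c but supplied by the application elsewhere in the book. As a hypothetical stand-in for testing processfd (our assumption, not the book's implementation), docommand can simply echo the bytes it receives.

#include <stdio.h>

/* Hypothetical test stub: echo the raw command bytes to standard output.
   The buffer is not null-terminated, so write exactly cmdsize bytes. */
void docommand(char *cmd, int cmdsize) {
    fwrite(cmd, 1, cmdsize, stdout);
    fflush(stdout);
}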
Example 12.1
The following code segment calls processfd as an ordinary function. The code assumes that fd is open for reading and passes it by reference to processfd.
void *processfd(void *);
int fd;
processfd(&fd);
Example 12.2
The following code segment creates a new thread to run processfd for the open file descriptor fd.
void *processfd(void *arg);
int error;
int fd;
pthread_t tid;
if (error = pthread_create(&tid, NULL, processfd, &fd))
fprintf(stderr, "Failed to create thread: %s\n", strerror(error));
The code of Example 12.1 has a single thread of execution, as illustrated in Figure 12.1. The thread of execution for the calling program traverses the statements in the function and then resumes execution at the statement after the call. Since processfd uses blocking I/O, the program blocks on r_read until input becomes available on the file descriptor. Remember that the thread of execution is really the sequence of statements that the thread executes. The sequence contains no timing information, so the fact that execution blocks on a read call is not directly visible to the caller. The code in Example 12.2 has two threads of execution. A separate thread executes processfd, as illustrated in Figure 12.2.
The function monitorfd of Program 12.2 uses threads to monitor an array of file descriptors. Compare this implementation with those of Program 4.14 and Program 4.17. The threaded version is considerably simpler and takes advantage of parallelism. If docommand causes the calling thread to block for some reason, the thread runtime system schedules another runnable thread. In this way, processing and reading are overlapped in a natural way. In contrast, blocking of docommand in the single-threaded implementation causes the entire process to block.
If monitorfd fails to create thread i, it sets the corresponding thread ID to itself to signify that creation failed. The last loop uses pthread_join, described in Section 12.3, to wait until all threads have completed.
Program 12.2 monitorfd.c
A function to monitor an array of file descriptors, using a separate thread for each descriptor.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void *processfd(void *arg);
void monitorfd(int fd[], int numfds) {      /* create threads to monitor fds */
    int error, i;
    pthread_t *tid;

    if ((tid = (pthread_t *)calloc(numfds, sizeof(pthread_t))) == NULL) {
        perror("Failed to allocate space for thread IDs");
        return;
    }
    for (i = 0; i < numfds; i++)            /* create a thread for each file descriptor */
        if (error = pthread_create(tid + i, NULL, processfd, (fd + i))) {
            fprintf(stderr, "Failed to create thread %d: %s\n",
                    i, strerror(error));
            tid[i] = pthread_self();
        }
    for (i = 0; i < numfds; i++) {
        if (pthread_equal(pthread_self(), tid[i]))
            continue;
        if (error = pthread_join(tid[i], NULL))
            fprintf(stderr, "Failed to join thread %d: %s\n", i, strerror(error));
    }
    free(tid);
    return;
}
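Neither Program 12.1 nor Program 12.2 includes a driver. A minimal hypothetical main (ours, not the book's) might open the files named on its command line and pass the descriptors to monitorfd.

#include <fcntl.h>
#include <stdio.h>
#define MAXFD 20

void monitorfd(int fd[], int numfds);

int main(int argc, char *argv[]) {      /* monitor the files given on the command line */
    int fds[MAXFD];
    int i;

    if ((argc < 2) || (argc > MAXFD + 1)) {
        fprintf(stderr, "Usage: %s file1 ... (at most %d files)\n", argv[0], MAXFD);
        return 1;
    }
    for (i = 1; i < argc; i++)
        if ((fds[i - 1] = open(argv[i], O_RDONLY)) == -1) {
            perror("Failed to open file");
            return 1;
        }
    monitorfd(fds, argc - 1);
    return 0;
}

Named pipes or network sockets opened for reading would work equally well here, since monitorfd needs only open descriptors.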
12.3 Thread Management
A thread package usually includes functions for thread creation and thread destruction, scheduling, enforcement of mutual exclusion and conditional waiting. A typical thread package also contains a runtime system to manage threads transparently (i.e., the user is not aware of the runtime system). When a thread is created, the runtime system allocates data structures to hold the thread's ID, stack and program counter value. The thread's internal data structure might also contain scheduling and usage information. The threads for a process share the entire address space of that process. They can modify global variables, access open file descriptors, and cooperate or interfere with each other in other ways.
POSIX threads are sometimes called pthreads because all the thread functions start with pthread. Table 12.1 summarizes the basic POSIX thread management functions introduced in this section. The programs listed in Section 12.1 used pthread_create to create threads and pthread_join to wait for threads to complete. Other management functions deal with thread termination, signals and comparison of thread IDs. Section 12.6 introduces the functions related to thread attribute objects, and Chapter 13 covers thread synchronization functions.
Table 12.1. POSIX thread management functions.

pthread_cancel    terminate another thread
pthread_create    create a thread
pthread_detach    set thread to release resources
pthread_equal     test two thread IDs for equality
pthread_exit      exit a thread without exiting process
pthread_join      wait for a thread
pthread_kill      send a signal to a thread
pthread_self      find out own thread ID
Most POSIX thread functions return 0 if successful and a nonzero error code if unsuccessful. They do not set errno, so the caller cannot use perror to report errors. Programs can use strerror if the issues of thread safety discussed in Section 12.4 are addressed. The POSIX standard specifically states that none of the POSIX thread functions returns EINTR and that POSIX thread functions do not have to be restarted if interrupted by a signal.
12.3.1 Referencing threads by ID
POSIX threads are referenced by an ID of type pthread_t. A thread can find out its ID by calling pthread_self.
SYNOPSIS
#include <pthread.h>
pthread_t pthread_self(void);
POSIX:THR
The pthread_self function returns the thread ID of the calling thread. No errors are defined for pthread_self.
Since pthread_t may be a structure, use pthread_equal to compare thread IDs for equality. The parameters of pthread_equal are the thread IDs to be compared.
SYNOPSIS
#include <pthread.h>
int pthread_equal(pthread_t t1, pthread_t t2);
POSIX:THR
If t1 equals t2, pthread_equal returns a nonzero value. If the thread IDs are not equal, pthread_equal returns 0. No errors are defined for pthread_equal.
Example 12.3
In the following code segment, a thread outputs a message if its thread ID is mytid.
pthread_t mytid;
if (pthread_equal(pthread_self(), mytid))
printf("My thread ID matches mytid\n");
12.3.2 Creating a thread
The pthread_create function creates a thread. Unlike some thread facilities, such as those provided by the Java programming language, the POSIX pthread_create automatically makes the thread runnable without requiring a separate start operation. The thread parameter of pthread_create points to the ID of the newly created thread. The attr parameter represents an attribute object that encapsulates the attributes of a thread. If attr is NULL, the new thread has the default attributes. Section 12.6 discusses the setting of thread attributes. The third parameter, start_routine, is the name of a function that the thread calls when it begins execution. The start_routine takes a single parameter specified by arg, a pointer to void. The start_routine returns a pointer to void, which is treated as an exit status by pthread_join.
SYNOPSIS
#include <pthread.h>
int pthread_create(pthread_t *restrict thread,
const pthread_attr_t *restrict attr,
void *(*start_routine)(void *), void *restrict arg);
POSIX:THR
If successful, pthread_create returns 0. If unsuccessful, pthread_create returns a nonzero error code. The following table lists the mandatory errors for pthread_create.
EAGAIN    system did not have the resources to create the thread, or would exceed system limit on total number of threads in a process
EINVAL    attr parameter is invalid
EPERM     caller does not have the appropriate permissions to set scheduling policy or parameters specified by attr
Do not let the prototype of pthread_create intimidate you; threads are easy to create and use.
Example 12.4
The following code segment creates a thread to execute the function processfd after opening the my.dat file for reading.
void *processfd(void *arg);
int error;
int fd;
pthread_t tid;
if ((fd = open("my.dat", O_RDONLY)) == -1)
perror("Failed to open my.dat");
else if (error = pthread_create(&tid, NULL, processfd, &fd))
fprintf(stderr, "Failed to create thread: %s\n", strerror(error));
else
printf("Thread created\n");
12.3.3 Detaching and joining
When a thread exits, it does not release its resources unless it is a detached thread. The pthread_detach function sets a thread's internal options to specify that storage for the thread can be reclaimed when the thread exits. Detached threads do not report their status when they exit. Threads that are not detached are joinable and do not release all their resources until another thread calls pthread_join for them or the entire process exits. The pthread_join function causes the caller to wait for the specified thread to exit, similar to waitpid at the process level. To prevent memory leaks, long-running programs should eventually call either pthread_detach or pthread_join for every thread.
The pthread_detach function has a single parameter, thread, the thread ID of the thread to be detached.
SYNOPSIS
#include <pthread.h>
int pthread_detach(pthread_t thread);
POSIX:THR
If successful, pthread_detach returns 0. If unsuccessful, pthread_detach returns a nonzero error code. The following table lists the mandatory errors for pthread_detach.
EINVAL    thread does not correspond to a joinable thread
ESRCH     no thread with ID thread
Example 12.5
The following code segment creates and then detaches a thread to execute the function processfd.
void *processfd(void *arg);
int error;
int fd;
pthread_t tid;
if (error = pthread_create(&tid, NULL, processfd, &fd))
    fprintf(stderr, "Failed to create thread: %s\n", strerror(error));
else if (error = pthread_detach(tid))
    fprintf(stderr, "Failed to detach thread: %s\n", strerror(error));
Example 12.6 detachfun.c
When detachfun is executed as a thread, it detaches itself.
#include <pthread.h>
#include <stdio.h>
void *detachfun(void *arg) {
    int i = *((int *)(arg));
    if (!pthread_detach(pthread_self()))
        fprintf(stderr, "My argument is %d\n", i);
    return NULL;
}
A nondetached thread's resources are not released until another thread calls pthread_join with the ID of the terminating thread as the first parameter. The pthread_join function suspends the calling thread until the target thread, specified by the first parameter, terminates. The value_ptr parameter provides a location for a pointer to the return status that the target thread passes to pthread_exit or return. If value_ptr is NULL, the caller does not retrieve the target thread return status.
SYNOPSIS
#include <pthread.h>
int pthread_join(pthread_t thread, void **value_ptr);
POSIX:THR
If successful, pthread_join returns 0. If unsuccessful, pthread_join returns a nonzero error code. The following table lists the mandatory errors for pthread_join.
EINVAL    thread does not correspond to a joinable thread
ESRCH     no thread with ID thread
Example 12.7
The following code illustrates how to retrieve the value passed to pthread_exit by a terminating thread.
int error;
int *exitcodep;
pthread_t tid;
if (error = pthread_join(tid, &exitcodep))
fprintf(stderr, "Failed to join thread: %s\n", strerror(error));
else
fprintf(stderr, "The exit code was %d\n", *exitcodep);
Exercise 12.8
What happens if a thread executes the following?
pthread_join(pthread_self(), NULL);
Answer:
Assuming the thread was joinable (not detached), this statement creates a deadlock. Some implementations detect a deadlock and force pthread_join to return with the error EDEADLK. However, this detection is not required by the POSIX:THR Extension.
Calling pthread_join is not the only way for the main thread to block until the other threads have completed. The main thread can use a semaphore or one of the methods discussed in Section 16.6 to wait for all threads to finish.
12.3.4 Exiting and cancellation
The process can terminate by calling exit directly, by executing return from main, or by having one of the other process threads call exit. In any of these cases, all threads terminate. If the main thread has no work to do after creating other threads, it should either block until all threads have completed or call pthread_exit(NULL).
A call to exit causes the entire process to terminate; a call to pthread_exit causes only the calling thread to terminate. A thread that executes return from its top level implicitly calls pthread_exit with the return value (a pointer) serving as the parameter to pthread_exit. A process will exit with a return status of 0 if its last thread calls pthread_exit.
The value_ptr value is available to a successful pthread_join. However, the value_ptr in pthread_exit must point to data that exists after the thread exits, so the thread should not use a pointer to automatic local data for value_ptr.
SYNOPSIS
#include <pthread.h>
void pthread_exit(void *value_ptr);
POSIX:THR
POSIX does not define any errors for pthread_exit.
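The following sketch illustrates the idiom mentioned above in which main creates a worker thread and then calls pthread_exit(NULL) so that the process keeps running until all other threads terminate. Monitoring standard input is our choice for the example.

#include <pthread.h>
#include <stdio.h>
#include <string.h>

void *processfd(void *arg);

int main(void) {
    static int fd = 0;          /* standard input; static so it outlives main's frame */
    int error;
    pthread_t tid;

    if ((error = pthread_create(&tid, NULL, processfd, &fd))) {
        fprintf(stderr, "Failed to create thread: %s\n", strerror(error));
        return 1;
    }
    pthread_exit(NULL);         /* main exits, but the process continues until
                                   all remaining threads have terminated */
}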
Threads can force other threads to return through the cancellation mechanism. A thread calls pthread_cancel to request that another thread be canceled. The target thread's type and cancellability state determine the result. The single parameter of pthread_cancel is the thread ID of the target thread to be canceled. The pthread_cancel function does not cause the caller to block while the cancellation completes. Rather, pthread_cancel returns after making the cancellation request.
SYNOPSIS
#include <pthread.h>
int pthread_cancel(pthread_t thread);
POSIX:THR
If successful, pthread_cancel returns 0. If unsuccessful, pthread_cancel returns a nonzero error code. No mandatory errors are defined for pthread_cancel.
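As a sketch of how a program might use pthread_cancel, the hypothetical helper below cancels a monitoring thread and then joins with it; a thread terminated by cancellation has the special exit status PTHREAD_CANCELED.

#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: request cancellation of a thread and reap its status. */
void cancelmonitor(pthread_t tid) {
    int error;
    void *status;

    if ((error = pthread_cancel(tid)))          /* a request; does not wait */
        fprintf(stderr, "Failed to cancel thread: %s\n", strerror(error));
    else if ((error = pthread_join(tid, &status)))
        fprintf(stderr, "Failed to join thread: %s\n", strerror(error));
    else if (status == PTHREAD_CANCELED)
        printf("Thread was canceled\n");
}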
What happens when a thread receives a cancellation request depends on its state and type. If a thread has the PTHREAD_CANCEL_ENABLE state, it receives cancellation requests. On the other hand, if the thread has the PTHREAD_CANCEL_DISABLE state, the cancellation requests are held pending. By default, threads have the PTHREAD_CANCEL_ENABLE state.
The pthread_setcancelstate function changes the cancellability state of the calling thread. The pthread_setcancelstate takes two parameters: state, specifying the new state to set; and oldstate, a pointer to an integer for holding the previous state.
SYNOPSIS
#include <pthread.h>
int pthread_setcancelstate(int state, int *oldstate);
POSIX:THR
If successful, pthread_setcancelstate returns 0. If unsuccessful, it returns a nonzero error code. No mandatory errors are defined for pthread_setcancelstate.
Program 12.3 shows a modification of the processfd function of Program 12.1 that explicitly disables cancellation before it calls docommand, to ensure that the command won't be canceled midstream. The original processfd always returns NULL. The processfdcancel function returns a pointer other than NULL if it cannot change the cancellation state. This function should not return a pointer to an automatic local variable, since local variables are deallocated when the function returns or the thread exits. Program 12.3 uses a parameter passed by the calling thread to return the pointer.
Program 12.3 processfdcancel.c
This function monitors a file descriptor for input and calls docommand to process the result. It explicitly disables cancellation before calling docommand.
#include <pthread.h>
#include "restart.h"
#define BUFSIZE 1024
void docommand(char *cmd, int cmdsize);
void *processfdcancel(void *arg) {      /* process commands with cancellation */
    char buf[BUFSIZE];
    int fd;
    ssize_t nbytes;
    int newstate, oldstate;

    fd = *((int *)(arg));
    for ( ; ; ) {
        if ((nbytes = r_read(fd, buf, BUFSIZE)) <= 0)
            break;
        if (pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate))
            return arg;
        docommand(buf, nbytes);
        if (pthread_setcancelstate(oldstate, &newstate))
            return arg;
    }
    return NULL;
}
As a general rule, a function that changes its cancellation state or its type should restore the value before returning. A caller cannot make reliable assumptions about the program behavior unless this rule is observed. The processfdcancel function saves the old state and restores it rather than just enabling cancellation after calling docommand.
Cancellation can cause difficulties if a thread holds resources, such as a lock or an open file descriptor, that must be released before exiting. A thread maintains a stack of cleanup routines, using pthread_cleanup_push and pthread_cleanup_pop, that a canceled thread executes before it exits; we do not discuss these routines in detail here. It is not always feasible to release resources in such a handler, and there may be points in the execution at which an exit would leave the program in an unacceptable state. The cancellation type allows a thread to control the point at which it exits in response to a cancellation request. When its cancellation type is PTHREAD_CANCEL_ASYNCHRONOUS, the thread can act on a cancellation request at any time. In contrast, a cancellation type of PTHREAD_CANCEL_DEFERRED causes the thread to act on cancellation requests only at specified cancellation points. By default, threads have the PTHREAD_CANCEL_DEFERRED type.
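The book does not develop cleanup handlers, but the following rough sketch shows how a monitoring thread could register one so that its descriptor is closed even if the thread is canceled while blocked in read, which is a cancellation point. The function name processfdcleanup is ours.

#include <pthread.h>
#include "restart.h"
#define BUFSIZE 1024

void docommand(char *cmd, int cmdsize);

static void closefd(void *arg) {                /* cleanup handler */
    r_close(*((int *)arg));
}

void *processfdcleanup(void *arg) {             /* sketch: close fd even if canceled */
    char buf[BUFSIZE];
    int fd;
    ssize_t nbytes;

    fd = *((int *)(arg));
    pthread_cleanup_push(closefd, &fd);         /* runs if the thread is canceled */
    for ( ; ; ) {
        if ((nbytes = r_read(fd, buf, BUFSIZE)) <= 0)
            break;
        docommand(buf, nbytes);
    }
    pthread_cleanup_pop(1);                     /* pop the handler and execute it */
    return NULL;
}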
The pthread_setcanceltype function changes the cancellability type of a thread as specified by its type parameter. The oldtype parameter is a pointer to a location for saving the previous type. A thread can set a cancellation point at a particular place in the code by calling pthread_testcancel. Certain blocking functions, such as read, are automatically treated as cancellation points. A thread with the PTHREAD_CANCEL_DEFERRED type accepts pending cancellation requests when it reaches such a cancellation point.
SYNOPSIS
#include <pthread.h>
int pthread_setcanceltype(int type, int *oldtype);
void pthread_testcancel(void);
POSIX:THR
If successful, pthread_setcanceltype returns 0. If unsuccessful, it returns a nonzero error code. No mandatory errors are defined for pthread_setcanceltype. The pthread_testcancel function has no return value.
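A compute-bound thread that makes no blocking calls never reaches an automatic cancellation point, so it must poll explicitly. A sketch, with hypothetical work functions, follows.

#include <pthread.h>

int morework(void);                 /* hypothetical: nonzero while work remains */
void doworkunit(void);              /* hypothetical: one bounded unit of computation */

void *computethread(void *arg) {    /* sketch: deferred cancellation in a CPU-bound loop */
    while (morework()) {
        doworkunit();
        pthread_testcancel();       /* act on any pending cancellation request here */
    }
    return arg;
}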
12.3.5 Passing parameters to threads and returning values
The creator of a thread may pass a single parameter to a thread at creation time, using a pointer to void. To communicate multiple values, the creator must use a pointer to an array or a structure. Program 12.4 illustrates how to pass a pointer to an array. The main program passes an array containing two open file descriptors to a thread that runs copyfilemalloc.
Program 12.4 callcopymalloc.c
This program creates a thread to copy a file.
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#define PERMS (S_IRUSR | S_IWUSR)
#define READ_FLAGS O_RDONLY
#define WRITE_FLAGS (O_WRONLY | O_CREAT | O_TRUNC)
void *copyfilemalloc(void *arg);
int main (int argc, char *argv[]) {     /* copy fromfile to tofile */
    int *bytesptr;
    int error;
    int fds[2];
    pthread_t tid;

    if (argc != 3) {
        fprintf(stderr, "Usage: %s fromfile tofile\n", argv[0]);
        return 1;
    }
    if (((fds[0] = open(argv[1], READ_FLAGS)) == -1) ||
        ((fds[1] = open(argv[2], WRITE_FLAGS, PERMS)) == -1)) {
        perror("Failed to open the files");
        return 1;
    }
    if (error = pthread_create(&tid, NULL, copyfilemalloc, fds)) {
        fprintf(stderr, "Failed to create thread: %s\n", strerror(error));
        return 1;
    }
    if (error = pthread_join(tid, (void **)&bytesptr)) {
        fprintf(stderr, "Failed to join thread: %s\n", strerror(error));
        return 1;
    }
    printf("Number of bytes copied: %d\n", *bytesptr);
    return 0;
}
Program 12.5 shows an implementation of copyfilemalloc, a function that reads from one file and outputs to another file. The arg parameter holds a pointer to a pair of open descriptors representing the source and destination files. The variables bytesp, infd and outfd are allocated on copyfilemalloc's local stack and are not directly accessible to other threads.
Program 12.5 also illustrates a strategy for returning values from a thread. The thread allocates memory for the total number of bytes copied, since it is not allowed to return a pointer to its local variables. POSIX requires that malloc be thread-safe. The copyfilemalloc function returns the bytesp pointer, which has the same effect as calling pthread_exit with bytesp as its parameter. It is the responsibility of the calling program (callcopymalloc) to free this space when it has finished using it. In this case, the program terminates right after using the value, so it is not necessary to call free.
Program 12.5 copyfilemalloc.c
The copyfilemalloc function copies the contents of one file to another by calling the copyfile function of Program 4.6 on page 100. It returns the number of bytes copied by dynamically allocating space for the return value.
#include <stdlib.h>
#include <unistd.h>
#include "restart.h"
void *copyfilemalloc(void *arg) {       /* copy infd to outfd with return value */
    int *bytesp;
    int infd;
    int outfd;

    infd = *((int *)(arg));
    outfd = *((int *)(arg) + 1);
    if ((bytesp = (int *)malloc(sizeof(int))) == NULL)
        return NULL;
    *bytesp = copyfile(infd, outfd);
    r_close(infd);
    r_close(outfd);
    return bytesp;
}
Exercise 12.9
What happens if copyfilemalloc stores the byte count in a variable with static storage class and returns a pointer to this static variable instead of dynamically allocating space for it?
Answer:
The program still works since only one thread is created. However, in a program with two copyfilemalloc threads, both store the byte count in the same place and one overwrites the other's value.
When a thread allocates space for a return value, some other thread is responsible for freeing that space. Whenever possible, a thread should clean up its own mess rather than requiring another thread to do it. It is also inefficient to dynamically allocate space to hold a single integer. An alternative to having the thread allocate space for the return value is for the creating thread to do it and pass a pointer to this space in the argument parameter of the thread. This approach avoids dynamic allocation completely if the space is on the stack of the creating thread.
Program 12.6 creates a copyfilepass thread to copy a file. The parameter to the thread is now an array of size 3. The first two entries of the array hold the file descriptors as in Program 12.4. The third array element stores the number of bytes copied. Program 12.6 can retrieve this value either through the array or through the second parameter of pthread_join. Alternatively, callcopypass could pass an array of size 2, and the thread could store the return value over one of the incoming file descriptors.
Program 12.6 callcopypass.c
A program that creates a thread to copy a file. The parameter of the thread is an array of three integers used for two file descriptors and the number of bytes copied.
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#define PERMS (S_IRUSR | S_IWUSR)
#define READ_FLAGS O_RDONLY
#define WRITE_FLAGS (O_WRONLY | O_CREAT | O_TRUNC)
void *copyfilepass(void *arg);
int main (int argc, char *argv[]) {
    int *bytesptr;
    int error;
    int targs[3];
    pthread_t tid;

    if (argc != 3) {
        fprintf(stderr, "Usage: %s fromfile tofile\n", argv[0]);
        return 1;
    }
    if (((targs[0] = open(argv[1], READ_FLAGS)) == -1) ||
        ((targs[1] = open(argv[2], WRITE_FLAGS, PERMS)) == -1)) {
        perror("Failed to open the files");
        return 1;
    }
    if (error = pthread_create(&tid, NULL, copyfilepass, targs)) {
        fprintf(stderr, "Failed to create thread: %s\n", strerror(error));
        return 1;
    }
    if (error = pthread_join(tid, (void **)&bytesptr)) {
        fprintf(stderr, "Failed to join thread: %s\n", strerror(error));
        return 1;
    }
    printf("Number of bytes copied: %d\n", *bytesptr);
    return 0;
}
The copyfilepass function of Program 12.7 uses an alternative way of accessing the pieces of the argument. Compare this with the method used by the copyfilemalloc function of Program 12.5.
Program 12.7 copyfilepass.c
A thread that can be used by callcopypass to copy a file.
#include <unistd.h>
#include "restart.h"
void *copyfilepass(void *arg) {
    int *argint;

    argint = (int *)arg;
    argint[2] = copyfile(argint[0], argint[1]);
    r_close(argint[0]);
    r_close(argint[1]);
    return argint + 2;
}
Exercise 12.10
Why have copyfilepass return a pointer to the number of bytes copied when callcopypass can access this value as args[2]?
Answer:
If a thread other than the creating thread joins with copyfilepass, it has access to the number of bytes copied through the parameter to pthread_join.
Program 12.8 shows a parallel file-copy program that uses the thread in Program 12.7. The main program has three command-line arguments: an input file basename, an output file basename and the number of files to copy. The program creates numcopiers threads. Thread i copies infile.i to outfile.i.
Exercise 12.11
What happens in Program 12.8 if a write call in copyfile of copyfilepass fails?
Answer:
The copyfilepass function returns the number of bytes successfully copied, and the main program does not detect an error. You can address this issue by having copyfilepass return an error indication and pass the number of bytes written in one of the elements of the array used as a parameter for thread creation.
When creating multiple threads, do not reuse the variable holding a thread's parameter until you are sure that the thread has finished accessing the parameter. Because the variable is passed by reference, it is a good practice to use a separate variable for each thread.
Program 12.8 copymultiple.c
A program that creates threads to copy multiple file descriptors.
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#define MAXNAME 80
#define R_FLAGS O_RDONLY
#define W_FLAGS (O_WRONLY | O_CREAT)
#define W_PERMS (S_IRUSR | S_IWUSR)
typedef struct {
    int args[3];
    pthread_t tid;
} copy_t;
void *copyfilepass(void *arg);
int main(int argc, char *argv[]) {
    int *bytesp;
    copy_t *copies;
    int error;
    char filename[MAXNAME];
    int i;
    int numcopiers;
    int totalbytes = 0;

    if (argc != 4) {
        fprintf(stderr, "Usage: %s infile outfile copies\n", argv[0]);
        return 1;
    }
    numcopiers = atoi(argv[3]);
    if ((copies = (copy_t *)calloc(numcopiers, sizeof(copy_t))) == NULL) {
        perror("Failed to allocate copier space");
        return 1;
    }
    /* open the source and destination files and create the threads */
    for (i = 0; i < numcopiers; i++) {
        copies[i].tid = pthread_self();     /* cannot be value for new thread */
        if (snprintf(filename, MAXNAME, "%s.%d", argv[1], i+1) == MAXNAME) {
            fprintf(stderr, "Input filename %s.%d too long", argv[1], i + 1);
            continue;
        }
        if ((copies[i].args[0] = open(filename, R_FLAGS)) == -1) {
            fprintf(stderr, "Failed to open source file %s: %s\n",
                    filename, strerror(errno));
            continue;
        }
        if (snprintf(filename, MAXNAME, "%s.%d", argv[2], i+1) == MAXNAME) {
            fprintf(stderr, "Output filename %s.%d too long", argv[2], i + 1);
            continue;
        }
        if ((copies[i].args[1] = open(filename, W_FLAGS, W_PERMS)) == -1) {
            fprintf(stderr, "Failed to open destination file %s: %s\n",
                    filename, strerror(errno));
            continue;
        }
        if (error = pthread_create((&copies[i].tid), NULL,
                                   copyfilepass, copies[i].args)) {
            fprintf(stderr, "Failed to create thread %d: %s\n", i + 1,
                    strerror(error));
            copies[i].tid = pthread_self();     /* cannot be value for new thread */
        }
    }
    /* wait for the threads to finish and report total bytes */
    for (i = 0; i < numcopiers; i++) {
        if (pthread_equal(copies[i].tid, pthread_self()))     /* not created */
            continue;
        if (error = pthread_join(copies[i].tid, (void**)&bytesp)) {
            fprintf(stderr, "Failed to join thread %d\n", i);
            continue;
        }
        if (bytesp == NULL) {
            fprintf(stderr, "Thread %d failed to return status\n", i);
            continue;
        }
        printf("Thread %d copied %d bytes from %s.%d to %s.%d\n",
               i, *bytesp, argv[1], i + 1, argv[2], i + 1);
        totalbytes += *bytesp;
    }
    printf("Total bytes copied = %d\n", totalbytes);
    return 0;
}
Program 12.9 shows a simple example of what can go wrong. The program creates 10 threads that each output the value of their parameter. The main program uses the thread creation loop index i as the parameter it passes to the threads. Each thread prints the value of the parameter it received. A thread can get an incorrect value if the main program changes i before the thread has a chance to print it.
Exercise 12.12
Run Program 12.9 and examine the results. What parameter value is reported by each thread?
Answer:
The results vary, depending on how the system schedules threads. One possibility is that main completes the loop creating the threads before any thread prints the value of the parameter. In this case, all the threads print the value 10.
Program 12.9 badparameters.c
A program that incorrectly passes parameters to multiple threads.
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#define NUMTHREADS 10
static void *printarg(void *arg) {
    fprintf(stderr, "Thread received %d\n", *(int *)arg);
    return NULL;
}

int main (void) {   /* program incorrectly passes parameters to threads */
    int error;
    int i;
    int j;
    pthread_t tid[NUMTHREADS];

    for (i = 0; i < NUMTHREADS; i++)
        if (error = pthread_create(tid + i, NULL, printarg, (void *)&i)) {
            fprintf(stderr, "Failed to create thread: %s\n", strerror(error));
            tid[i] = pthread_self();
        }
    for (j = 0; j < NUMTHREADS; j++) {
        if (pthread_equal(pthread_self(), tid[j]))
            continue;
        if (error = pthread_join(tid[j], NULL))
            fprintf(stderr, "Failed to join thread: %s\n", strerror(error));
    }
    printf("All threads done\n");
    return 0;
}
Exercise 12.13
For each of the following, start with Program 12.9 and make the specified modifications. Predict the output, and then run the program to see if you are correct.
1. Run the original program without any modification.
2. Put a call to sleep(1); at the start of printarg.
3. Put a call to sleep(1); inside the first for loop after the call to pthread_create.
4. Put a call to sleep(1); after the first for loop.
5.-8. Repeat each of the items above, using i as the loop index rather than j.
Answer:
The results may vary if it takes more than a second for the threads to execute. On a fast enough system, the result will be something like the following.
1. Output described in Exercise 12.12.
2. Each thread outputs the value 10, the value of i when main has finished its loop.
3. Each thread outputs the correct value since it executes before the value of i changes.
4. Same as in Exercise 12.12.
5. All threads output the value 0, the value of i when main waits for the first thread to terminate.
6. The results may vary; same as 5.
7. Same as 3.
8. Same as 4.
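One way to repair Program 12.9, sketched here under the recommendation given before Program 12.8, is to give each thread its own parameter slot so that the value cannot change before the thread reads it. The array name args is ours.

#include <pthread.h>
#include <stdio.h>
#include <string.h>
#define NUMTHREADS 10

static void *printarg(void *arg) {
    fprintf(stderr, "Thread received %d\n", *(int *)arg);
    return NULL;
}

int main(void) {                    /* each thread gets its own parameter slot */
    int args[NUMTHREADS];
    int error;
    int i;
    pthread_t tid[NUMTHREADS];

    for (i = 0; i < NUMTHREADS; i++) {
        args[i] = i;                                  /* fixed before the thread starts */
        if ((error = pthread_create(tid + i, NULL, printarg, args + i))) {
            fprintf(stderr, "Failed to create thread: %s\n", strerror(error));
            tid[i] = pthread_self();
        }
    }
    for (i = 0; i < NUMTHREADS; i++) {
        if (pthread_equal(pthread_self(), tid[i]))
            continue;
        if ((error = pthread_join(tid[i], NULL)))
            fprintf(stderr, "Failed to join thread: %s\n", strerror(error));
    }
    printf("All threads done\n");
    return 0;
}

Because main joins every created thread before returning, each args[i] remains valid for the lifetime of the thread that reads it.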
Exercise 12.14 whichexit.c
The whichexit function can be executed as a thread.
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
void *whichexit(void *arg) {
    int n;
    int np1[1];
    int *np2;
    char s1[10];
    char s2[] = "I am done";

    n = 3;
    np1[0] = n;
    np2 = (int *)malloc(sizeof(int *));
    *np2 = n;
    strcpy(s1, "Done");
    return(NULL);
}
Which of the following would be safe replacements for NULL as the parameter to pthread_exit? Assume no errors occur.
n &n (int *)n np1 np2 s1 s2 "This works" strerror(EINTR)
Answer:
1. The return value is a pointer, not an integer, so this is invalid.
2. The integer n has automatic storage class, so it is illegal to access it after the function terminates.
3. This is a common way to return an integer from a thread. The integer is cast to a pointer. When another thread calls pthread_join for this thread, it casts the pointer back to an integer. While this will probably work in most implementations, it should be avoided. The C standard [56, Section 6.3.2.3] says that an integer may be converted to a pointer or a pointer to an integer, but the result is implementation defined. It does not guarantee that the result of converting an integer to a pointer and back again yields the original integer.
4. The array np1 has automatic storage class, so it is illegal to access the array after the function terminates.
5. This is safe since the dynamically allocated space will be available until it is freed.
6. The array s1 has automatic storage class, so it is illegal to access the array after the function terminates.
7. The array s2 has automatic storage class, so it is illegal to access the array after the function terminates.
8. This is valid in C, since string literals have static storage duration.
9. This is certainly invalid if strerror is not thread-safe. Even on a system where strerror is thread-safe, the string produced is not guaranteed to be available after the thread terminates.
12.4 Thread Safety
A hidden problem with threads is that they may call library functions that are not thread-safe, possibly producing spurious results. A function is thread-safe if multiple threads can execute simultaneous active invocations of the function without interference. POSIX specifies that all the required functions, including the functions from the standard C library, be implemented in a thread-safe manner except for the specific functions listed in Table 12.2. Those functions whose traditional interfaces preclude making them thread-safe must have an alternative thread-safe version designated with an _r suffix.
An important example of a function that is not required to be thread-safe is strerror. Although strerror is not guaranteed to be thread-safe, many systems have implemented this function in a thread-safe manner. Unfortunately, because strerror is listed in Table 12.2, you cannot assume that it works correctly if multiple threads call it. We use strerror only in the main thread, often to produce error messages for pthread_create and pthread_join. Section 13.7 gives a thread-safe implementation called strerror_r.
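As an illustration only (Section 13.7 develops the book's own version), a worker thread might format error messages with the XSI strerror_r, which fills a caller-supplied buffer and returns a nonzero value on failure. The helper name reporterror is ours.

#include <stdio.h>
#include <string.h>
#define ERRBUFSIZE 128

/* Sketch: report an error code from any thread without relying on the
   thread safety of strerror. Assumes the XSI-conforming strerror_r. */
void reporterror(const char *msg, int error) {
    char buf[ERRBUFSIZE];

    if (strerror_r(error, buf, sizeof(buf)))
        fprintf(stderr, "%s: error %d\n", msg, error);
    else
        fprintf(stderr, "%s: %s\n", msg, buf);
}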
Another interaction problem occurs when threads access the same data. The individual copier threads in Program 12.8 work on independent problems and do not interact with each other. In more complicated applications, a thread may not exit after completing its assigned task. Instead, a worker thread may request additional tasks or share information. Chapter 13 explains how to control this type of interaction by using synchronization primitives such as mutex locks and condition variables.
Table 12.2. POSIX functions that are not required to be thread-safe.

asctime, basename, catgets, crypt, ctime, dbm_clearerr, dbm_close, dbm_delete, dbm_error, dbm_fetch, dbm_firstkey, dbm_nextkey, dbm_open, dbm_store, dirname, dlerror, drand48, ecvt, encrypt, endgrent, endpwent, endutxent, fcvt, ftw, gcvt, getc_unlocked, getchar_unlocked, getdate, getenv, getgrent, getgrgid, getgrnam, gethostbyaddr, gethostbyname, gethostent, getlogin, getnetbyaddr, getnetbyname, getnetent, getopt, getprotobyname, getprotobynumber, getprotoent, getpwent, getpwnam, getpwuid, getservbyname, getservbyport, getservent, getutxent, getutxid, getutxline, gmtime, hcreate, hdestroy, hsearch, inet_ntoa, l64a, lgamma, lgammaf, lgammal, localeconv, localtime, lrand48, mrand48, nftw, nl_langinfo, ptsname, putc_unlocked, putchar_unlocked, putenv, pututxline, rand, readdir, setenv, setgrent, setkey, setpwent, setutxent, strerror, strtok, ttyname, unsetenv, wcstombs, wctomb
In traditional UNIX implementations, errno is a global external variable that is set when system functions produce an error. This implementation does not work for multithreading (see Section 2.7), and in most thread implementations errno is a macro that returns thread-specific information. In essence, each thread has a private copy of errno. The main thread does not have direct access to errno for a joined thread, so if needed, this information must be returned through the last parameter of pthread_join.
12.5 User Threads versus Kernel Threads
The two traditional models of thread control are user-level threads and kernel-level threads. User-level threads, shown in Figure 12.3, usually run on top of an existing operating system. These threads are invisible to the kernel and compete among themselves for the resources allocated to their encapsulating process. The threads are scheduled by a thread runtime system that is part of the process code. Programs with user-level threads usually link to a special library in which each library function is enclosed by a jacket. The jacket function calls the thread runtime system to do thread management before and possibly after calling the jacketed library function.
Functions such as read or sleep can present a problem for user-level threads because they may cause the process to block. To avoid blocking the entire process on a blocking call, the user-level thread library replaces each potentially blocking call in the jacket by a nonblocking version. The thread runtime system tests to see if the call would cause the thread to block. If the call would not block, the runtime system does the call right away. If the call would block, however, the runtime system places the thread on a list of waiting threads, adds the call to a list of actions to try later, and picks another thread to run. All this control is invisible to the user and to the operating system.
User-level threads have low overhead, but they also have some disadvantages. The user thread model, which assumes that the thread runtime system will eventually regain control, can be thwarted by CPU-bound threads. A CPU-bound thread rarely performs library calls and may prevent the thread runtime system from regaining control to schedule other threads. The programmer has to avoid the lockout situation by explicitly forcing CPU-bound threads to yield control at appropriate points. A second problem is that user-level threads can share only processor resources allocated to their encapsulating process. This restriction limits the amount of available parallelism because the threads can run on only one processor at a time. Since one of the prime motivations for using threads is to take advantage of multiprocessor workstations, user-level threads alone are not an acceptable approach.
With kernel-level threads, the kernel is aware of each thread as a schedulable entity and threads compete systemwide for processor resources. Figure 12.4 illustrates the visibility of kernel-level threads. The scheduling of kernel-level threads can be almost as expensive as the scheduling of processes themselves, but kernel-level threads can take advantage of multiple processors. The synchronization and sharing of data for kernel-level threads is less expensive than for full processes, but kernel-level threads are considerably more expensive to manage than user-level threads.
Hybrid thread models have advantages of both user-level and kernel-level models by providing two levels of control. Figure 12.5 illustrates a typical hybrid approach. The user writes the program in terms of user-level threads and then specifies how many kernel-schedulable entities are associated with the process. The user-level threads are mapped into the kernel-schedulable entities at runtime to achieve parallelism. The level of control that a user has over the mapping depends on the implementation. In the Sun Solaris thread implementation, for example, the user-level threads are called threads and the kernel-schedulable entities are called lightweight processes. The user can specify that a particular thread be run by a dedicated lightweight process or that a particular group of threads be run by a pool of lightweight processes.
The POSIX thread scheduling model is a hybrid model that is flexible enough to support both user-level and kernel-level threads in particular implementations of the standard. The model consists of two levels of scheduling: threads and kernel entities. The threads are analogous to user-level threads. The kernel entities are scheduled by the kernel. The thread library decides how many kernel entities it needs and how they will be mapped.
POSIX introduces the idea of a thread-scheduling contention scope, which gives the programmer some control over how kernel entities are mapped to threads. A thread can have a contentionscope attribute of either PTHREAD_SCOPE_PROCESS or PTHREAD_SCOPE_SYSTEM. Threads with the PTHREAD_SCOPE_PROCESS attribute contend for processor resources with the other threads in their process. POSIX does not specify how such a thread contends with threads outside its own process, so PTHREAD_SCOPE_PROCESS threads can be strictly user-level threads or they can be mapped to a pool of kernel entities in some more complicated way.
Threads with the PTHREAD_SCOPE_SYSTEM attribute contend systemwide for processor resources, much like kernel-level threads. POSIX leaves the mapping between PTHREAD_SCOPE_SYSTEM threads and kernel entities up to the implementation, but the obvious mapping is to bind such a thread directly to a kernel entity. A POSIX thread implementation can support PTHREAD_SCOPE_PROCESS, PTHREAD_SCOPE_SYSTEM or both. You can get the scope with pthread_attr_getscope and set the scope with pthread_attr_setscope, provided that your POSIX implementation supports both the POSIX:THR Thread Extension and the POSIX:TPS Thread Execution Scheduling Extension.
12.6 Thread Attributes
POSIX takes an object-oriented approach to representation and assignment of properties by encapsulating properties such as stack size and scheduling policy into an object of type pthread_attr_t. The attribute object affects a thread only at the time of creation. You first create an attribute object and associate properties, such as stack size and scheduling policy, with the attribute object. You can then create multiple threads with the same properties by passing the same thread attribute object to pthread_create. By grouping the properties into a single object, POSIX avoids pthread_create calls with a large number of parameters.
Table 12.3 shows the settable properties of thread attributes and their associated functions. Other entities, such as condition variables and mutex locks, have their own attribute object types. Chapter 13 discusses these synchronization mechanisms.
Table 12.3. Summary of settable properties for POSIX thread attribute objects.

attribute objects    pthread_attr_destroy, pthread_attr_init
state                pthread_attr_getdetachstate, pthread_attr_setdetachstate
stack                pthread_attr_getguardsize, pthread_attr_setguardsize, pthread_attr_getstack, pthread_attr_setstack
scheduling           pthread_attr_getinheritsched, pthread_attr_setinheritsched, pthread_attr_getschedparam, pthread_attr_setschedparam, pthread_attr_getschedpolicy, pthread_attr_setschedpolicy, pthread_attr_getscope, pthread_attr_setscope
The pthread_attr_init function initializes a thread attribute object with the default values. The pthread_attr_destroy function sets the value of the attribute object to be invalid. POSIX does not specify the behavior of the object after it has been destroyed, but the variable can be initialized to a new thread attribute object. Both pthread_attr_init and pthread_attr_destroy take a single parameter that is a pointer to a pthread_attr_t attribute object.
SYNOPSIS
#include <pthread.h>
int pthread_attr_destroy(pthread_attr_t *attr);
int pthread_attr_init(pthread_attr_t *attr);
POSIX:THR
If successful, pthread_attr_destroy and pthread_attr_init return 0. If unsuccessful, these functions return a nonzero error code. The pthread_attr_init function returns ENOMEM if there is not enough memory to initialize the thread attribute object.
Most of the get/set thread attribute functions have two parameters. The first parameter is a pointer to a thread attribute object. The second parameter is the new value of the attribute for a set operation or a pointer to a location to hold the value for a get operation. The pthread_attr_getstack and pthread_attr_setstack functions each have one additional parameter.
12.6.1 The thread state
The pthread_attr_getdetachstate function examines the state of an attribute object, and the pthread_attr_setdetachstate function sets the state of an attribute object. The possible values of the thread state are PTHREAD_CREATE_JOINABLE and PTHREAD_CREATE_DETACHED. The attr parameter is a pointer to the attribute object. The detachstate parameter corresponds to the value to be set for pthread_attr_setdetachstate and to a pointer to the value to be retrieved for pthread_attr_getdetachstate.
SYNOPSIS
#include <pthread.h>
int pthread_attr_getdetachstate(const pthread_attr_t *attr,
int *detachstate);
int pthread_attr_setdetachstate(pthread_attr_t *attr, int detachstate);
POSIX:THR
If successful, these functions return 0. If unsuccessful, they return a nonzero error code. The pthread_attr_setdetachstate function returns EINVAL if detachstate is invalid.
Detached threads release their resources when they terminate, whereas joinable threads should be waited for with a pthread_join. A thread that is detached cannot be waited for with a pthread_join. By default, threads are joinable. You can detach a thread by calling the pthread_detach function after creating the thread. Alternatively, you can create a thread in the detached state by using an attribute object with thread state PTHREAD_CREATE_DETACHED.
Example 12.15
The following code segment creates a detached thread to run processfd.
int error, fd;
pthread_attr_t tattr;
pthread_t tid;
if (error = pthread_attr_init(&tattr))
fprintf(stderr, "Failed to create attribute object: %s\n",
strerror(error));
else if (error = pthread_attr_setdetachstate(&tattr,
PTHREAD_CREATE_DETACHED))
fprintf(stderr, "Failed to set attribute state to detached: %s\n",
strerror(error));
else if (error = pthread_create(&tid, &tattr, processfd, &fd))
fprintf(stderr, "Failed to create thread: %s\n", strerror(error));
12.6.2 The thread stack
A thread has a stack whose location and size are user-settable, a useful property if the thread stack must be placed in a particular region of memory. To define the placement and size of the stack for a thread, you must first create an attribute object with the specified stack attributes. Then, call pthread_create with this attribute object.
The pthread_attr_getstack function examines the stack parameters, and the pthread_attr_setstack function sets the stack parameters of an attribute object. The attr parameter of each function is a pointer to the attribute object. The pthread_attr_setstack function takes the stack address and stack size as additional parameters. The pthread_attr_getstack takes pointers to these items.
SYNOPSIS
#include <pthread.h>
int pthread_attr_getstack(const pthread_attr_t *restrict attr,
void **restrict stackaddr, size_t *restrict stacksize);
int pthread_attr_setstack(pthread_attr_t *attr,
void *stackaddr, size_t stacksize);
POSIX:THR,TSA,TSS
If successful, the pthread_attr_getstack and pthread_attr_setstack functions return 0. If unsuccessful, these functions return a nonzero error code. The pthread_attr_setstack function returns EINVAL if stacksize is out of range.
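A sketch of caller-supplied stack placement follows; it assumes support for the POSIX:TSA and POSIX:TSS options, and the helper name create_with_stack and the page-alignment choice are ours. The supplied stack cannot be reused or freed until the thread terminates.

#include <limits.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

void *processfd(void *arg);

int create_with_stack(pthread_t *tidp, int *fdp, size_t stacksize) {
    pthread_attr_t tattr;
    void *stackbase;
    int error;

    if (stacksize < PTHREAD_STACK_MIN)          /* respect the implementation minimum */
        stacksize = PTHREAD_STACK_MIN;
    if ((error = pthread_attr_init(&tattr)))
        return error;
    if ((error = posix_memalign(&stackbase, (size_t)sysconf(_SC_PAGESIZE), stacksize)))
        return error;                           /* page alignment is a common requirement */
    if ((error = pthread_attr_setstack(&tattr, stackbase, stacksize)))
        return error;
    return pthread_create(tidp, &tattr, processfd, fdp);
}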
POSIX also provides functions for examining or setting a guard for stack overflows if the stackaddr has not been set by the user. The pthread_attr_getguardsize function examines the guard parameters, and the pthread_attr_setguardsize function sets the guard parameters for controlling stack overflows in an attribute object. If the guardsize parameter is 0, the stack is unguarded. For a nonzero guardsize, the implementation allocates additional memory of at least guardsize. An overflow into this extra memory causes an error and may generate a SIGSEGV signal for the thread.
SYNOPSIS
#include <pthread.h>
int pthread_attr_getguardsize(const pthread_attr_t *restrict attr,
size_t *restrict guardsize);
int pthread_attr_setguardsize(pthread_attr_t *attr,
size_t guardsize);
POSIX:THR,XSI
If successful, pthread_attr_getguardsize and pthread_attr_setguardsize return 0. If unsuccessful, these functions return a nonzero error code. They return EINVAL if the attr or guardsize parameter is invalid. Guards require the POSIX:THR Extension and the POSIX:XSI Extension.
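A small sketch of requesting a one-page guard area, assuming XSI support; the helper name setpageguard is ours.

#include <pthread.h>
#include <unistd.h>

int setpageguard(pthread_attr_t *attr) {    /* sketch: guard of at least one page */
    long pagesize;

    if ((pagesize = sysconf(_SC_PAGESIZE)) == -1)
        return -1;                          /* could not determine the page size */
    return pthread_attr_setguardsize(attr, (size_t)pagesize);
}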
12.6.3 Thread scheduling
The contention scope attribute controls whether a thread competes within its process or at the system level for scheduling resources. The pthread_attr_getscope function examines the contention scope, and the pthread_attr_setscope function sets the contention scope of an attribute object. The attr parameter is a pointer to the attribute object. The contentionscope parameter corresponds to the value to be set for pthread_attr_setscope and to a pointer to the value to be retrieved for pthread_attr_getscope. The possible values of the contentionscope parameter are PTHREAD_SCOPE_PROCESS and PTHREAD_SCOPE_SYSTEM.
SYNOPSIS
#include <pthread.h>
int pthread_attr_getscope(const pthread_attr_t *restrict attr,
int *restrict contentionscope);
int pthread_attr_setscope(pthread_attr_t *attr, int contentionscope);
POSIX:THR,TPS
If successful, pthread_attr_getscope and pthread_attr_setscope return 0. If unsuccessful, these functions return a nonzero error code. No mandatory errors are defined for these functions.
Example 12.16
The following code segment creates a thread that contends for kernel resources.
int error;
int fd;
pthread_attr_t tattr;
pthread_t tid;
if (error = pthread_attr_init(&tattr))
fprintf(stderr, "Failed to create an attribute object:%s\n",
strerror(error));
else if (error = pthread_attr_setscope(&tattr, PTHREAD_SCOPE_SYSTEM))
fprintf(stderr, "Failed to set scope to system:%s\n",
strerror(error));
else if (error = pthread_create(&tid, &tattr, processfd, &fd))
fprintf(stderr, "Failed to create a thread:%s\n", strerror(error));
POSIX allows a thread to inherit a scheduling policy in different ways. The pthread_attr_getinheritsched function examines the scheduling inheritance policy, and the pthread_attr_setinheritsched function sets the scheduling inheritance policy of an attribute object.
The attr parameter is a pointer to the attribute object. The inheritsched parameter corresponds to the value to be set for pthread_attr_setinheritsched and to a pointer to the value to be retrieved for pthread_attr_getinheritsched. The two possible values of inheritsched are PTHREAD_INHERIT_SCHED and PTHREAD_EXPLICIT_SCHED. The value of inheritsched determines how the scheduling attributes of a created thread are set. With PTHREAD_INHERIT_SCHED, the scheduling attributes are inherited from the creating thread, and the scheduling attributes stored in this attribute object are ignored. With PTHREAD_EXPLICIT_SCHED, the scheduling attributes of this attribute object are used.
SYNOPSIS
#include <pthread.h>
int pthread_attr_getinheritsched(const pthread_attr_t *restrict attr,
int *restrict inheritsched);
int pthread_attr_setinheritsched(pthread_attr_t *attr,
int inheritsched);
POSIX:THR,TPS
If successful, these functions return 0. If unsuccessful, they return a nonzero error code. No mandatory errors are defined for these functions.
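As a sketch in the style of Example 12.16 (not from the text, and assuming the usual pthread.h, stdio.h and string.h headers), the following segment sets up an attribute object so that threads created with it take their scheduling attributes from the attribute object itself rather than inheriting them from the creating thread.
int error;
pthread_attr_t tattr;

if (error = pthread_attr_init(&tattr))
    fprintf(stderr, "Failed to create an attribute object:%s\n", strerror(error));
else if (error = pthread_attr_setinheritsched(&tattr, PTHREAD_EXPLICIT_SCHED))
    fprintf(stderr, "Failed to set explicit scheduling:%s\n", strerror(error));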
The pthread_attr_getschedparam function examines the scheduling parameters, and the pthread_attr_setschedparam sets the scheduling parameters of an attribute object. The attr parameter is a pointer to the attribute object. The param parameter is a pointer to the value to be set for pthread_attr_setschedparam and a pointer to the value to be retrieved for pthread_attr_getschedparam. Notice that unlike the other pthread_attr_set functions, the second parameter is a pointer because it corresponds to a structure rather than an integer. Passing a structure by value is inefficient.
SYNOPSIS
#include <pthread.h>
int pthread_attr_getschedparam(const pthread_attr_t *restrict attr,
struct sched_param *restrict param);
int pthread_attr_setschedparam(pthread_attr_t *restrict attr,
const struct sched_param *restrict param);
POSIX:THR
If successful, these functions return 0. If unsuccessful, they return a nonzero error code. No mandatory errors are defined for these functions.
The scheduling parameters depend on the scheduling policy. They are encapsulated in a struct sched_param structure defined in sched.h. The SCHED_FIFO and SCHED_RR scheduling policies require only the sched_priority member of struct sched_param. The sched_priority field holds an int priority value, with larger priority values corresponding to higher priorities. Implementations must support at least 32 priorities.
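The actual priority range for a given policy can be queried with sched_get_priority_min and sched_get_priority_max, declared in sched.h and available on implementations that support the POSIX priority scheduling functions. The following sketch (not from the text; the SCHED_FIFO choice is just an example) prints the range.
#include <sched.h>
#include <stdio.h>

int maxpriority;
int minpriority;

minpriority = sched_get_priority_min(SCHED_FIFO);
maxpriority = sched_get_priority_max(SCHED_FIFO);
if ((minpriority == -1) || (maxpriority == -1))
    perror("Failed to get the SCHED_FIFO priority range");
else
    printf("SCHED_FIFO priorities range from %d to %d\n", minpriority, maxpriority);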
Program 12.10 shows a function that creates a thread attribute object with a specified priority. All the other attributes have their default values. Program 12.10 returns a pointer to the created attribute object or NULL if the function failed, in which case it sets errno. Program 12.10 illustrates the general strategy for changing parameters: read the existing values first and change only the ones that you need to change.
Example 12.17
The following code segment creates a dothis thread with the default attributes, except that the priority is HIGHPRIORITY.
#define HIGHPRIORITY 10
int error;
int fd;
pthread_attr_t *tattr;
pthread_t tid;

if ((tattr = makepriority(HIGHPRIORITY)) == NULL)
    perror("Failed to create the attribute object");
else if (error = pthread_create(&tid, tattr, dothis, &fd))
    fprintf(stderr, "Failed to create dothis thread:%s\n", strerror(error));
Threads of the same priority compete for processor resources as specified by their scheduling policy. The sched.h header file defines SCHED_FIFO for first-in-first-out scheduling, SCHED_RR for round-robin scheduling and SCHED_OTHER for some other policy. One additional scheduling policy, SCHED_SPORADIC, is defined for implementations supporting the POSIX:SS Process Sporadic Server Extension and the POSIX:TSP Thread Sporadic Server Extension. Implementations may also define their own policies.
Program 12.10 makepriority.c
A function to create a thread attribute object with the specified priority.
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
pthread_attr_t *makepriority(int priority) { /* create attribute object */
pthread_attr_t *attr;
int error;
struct sched_param param;
if ((attr = (pthread_attr_t *)malloc(sizeof(pthread_attr_t))) == NULL)
return NULL;
if (!(error = pthread_attr_init(attr)) &&
!(error = pthread_attr_getschedparam(attr, &param))) {
param.sched_priority = priority;
error = pthread_attr_setschedparam(attr, &param);
}
if (error) { /* if failure, be sure to free memory */
free(attr);
errno = error;
return NULL;
}
return attr;
}
First-in-first-out scheduling policies (e.g., SCHED_FIFO) use a queue for threads in the runnable state at a specified priority. Blocked threads that become runnable are put at the end of the queue corresponding to their priority, whereas running threads that have been preempted are put at the front of their queue.
Round-robin scheduling (e.g., SCHED_RR) behaves similarly to first-in-first-out except that when a running thread has been running for its quantum, it is put at the end of the queue for its priority. The sched_rr_get_interval function returns the quantum.
Sporadic scheduling, which is similar to first-in-first-out, uses two parameters (the replenishment period and the execution capacity) to control the number of threads running at a given priority level. The rules are reasonably complex, but the policy allows a program to more easily regulate the number of threads competing for the processor as a function of available resources.
Preemptive priority policy is the most common implementation of SCHED_OTHER. A POSIX-compliant implementation can support any of these scheduling policies. The actual behavior of the policy in the implementation depends on the scheduling scope and other factors.
The pthread_attr_getschedpolicy function gets the scheduling policy, and the pthread_attr_setschedpolicy function sets the scheduling policy of an attribute object. The attr parameter is a pointer to the attribute object. The policy parameter is the value to be set for pthread_attr_setschedpolicy and a pointer to the value to be retrieved for pthread_attr_getschedpolicy. The scheduling policy values are described above.
SYNOPSIS
#include <pthread.h>
int pthread_attr_getschedpolicy(const pthread_attr_t *restrict attr,
int *restrict policy);
int pthread_attr_setschedpolicy(pthread_attr_t *attr, int policy);
POSIX:THR
If successful, these functions return 0. If unsuccessful, they return a nonzero error code. No mandatory errors are defined for these functions.
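As a sketch combining these calls (not from the text), the following segment requests round-robin scheduling for threads created with tattr. The policy takes effect only because inheritsched is set to PTHREAD_EXPLICIT_SCHED, and some implementations require appropriate privileges to use SCHED_RR.
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

int error;
pthread_attr_t tattr;

if (error = pthread_attr_init(&tattr))
    fprintf(stderr, "Failed to create an attribute object:%s\n", strerror(error));
else if (error = pthread_attr_setinheritsched(&tattr, PTHREAD_EXPLICIT_SCHED))
    fprintf(stderr, "Failed to set explicit scheduling:%s\n", strerror(error));
else if (error = pthread_attr_setschedpolicy(&tattr, SCHED_RR))
    fprintf(stderr, "Failed to set round-robin scheduling:%s\n", strerror(error));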
12.7 Exercise: Parallel File Copy
This section develops a parallel file copy as an extension of the copier application of Program 12.8. Be sure to use thread-safe calls in the implementation. The main program takes two command-line arguments that are directory names and copies everything from the first directory into the second directory. The copy program preserves subdirectory structure. The same filenames are used for source and destination. Implement the parallel file copy as follows.
Write a function called copydirectory that has the following prototype.
void *copydirectory(void *arg)
The copydirectory function copies all the files from one directory to another directory. The directory names are passed in arg as two consecutive strings (separated by a null character); a sketch of one way to pack the names in this format appears after this list of steps. Assume that both source and destination directories exist when copydirectory is called. In this version, only ordinary files are copied and subdirectories are ignored. For each file to be copied, create a thread to run the copyfilepass function of Program 12.7. For this version, wait for each thread to complete before creating the next one.
Write a main program that takes two command-line arguments for the source and destination directories. The main program creates a thread to run copydirectory and then does a pthread_join to wait for the copydirectory thread to complete. Use this program to test the first version of copydirectory.
Modify the copydirectory function so that if the destination directory does not exist, copydirectory creates the directory. Test the new version.
Modify copydirectory so that after it creates a thread to copy a file, it continues to create threads to copy the other files. Keep the thread ID and open file descriptors for each copyfilepass thread in a linked list with a node structure similar to the following.
typedef struct copy_struct {
char *namestring;
int sourcefd;
int destinationfd;
int bytescopied;
pthread_t tid;
struct copy_struct *next;
} copyinfo_t;
copyinfo_t *head = NULL;
copyinfo_t *tail = NULL;
After the copydirectory function creates threads to copy all the files in the directory, it does a pthread_join on each thread in its list and frees the copyinfo_t structure.
Modify the copyfilepass function of Program 12.7 so that its parameter is a pointer to a copyinfo_t structure. Test the new version of copyfilepass and copydirectory.
Modify copydirectory so that if a file is a directory instead of an ordinary file, copydirectory creates a thread to run copydirectory instead of copyfilepass. Test the new function.
Devise a method for performing timings to compare an ordinary copy with the threaded copy.
If run on a large directory, the program may attempt to open more file descriptors or more threads than are allowed for a process. Devise a method for handling this situation.
See whether there is a difference in running time if the threads have scope PTHREAD_SCOPE_SYSTEM instead of PTHREAD_SCOPE_PROCESS.
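The following sketch (not part of the exercise statement; the function name makecopyarg is hypothetical) shows one way to pack the two directory names into the single argument block that copydirectory expects.
#include <stdlib.h>
#include <string.h>

char *makecopyarg(const char *source, const char *dest) { /* hypothetical helper */
    char *arg;
    size_t len1 = strlen(source) + 1;     /* lengths include the terminating nulls */
    size_t len2 = strlen(dest) + 1;

    if ((arg = (char *)malloc(len1 + len2)) == NULL)
        return NULL;
    memcpy(arg, source, len1);            /* source name and its null character */
    memcpy(arg + len1, dest, len2);       /* destination name follows immediately */
    return arg;
}
Inside copydirectory, arg then points to the source name and arg + strlen(arg) + 1 points to the destination name.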
12.8 Additional Reading
A number of books on POSIX thread programming are available. They include Programming with POSIX(R) Threads by Butenhof [19], Pthreads Programming: A POSIX Standard for Better Multiprocessing by Nichols et al. [87], Multithreaded Programming with Pthreads by Lewis and Berg [72] and Thread Time: The Multithreaded Programming Guide by Norton and DiPasquale. All these books are based on the original POSIX standard. The book Distributed Operating Systems by Tanenbaum [121] presents an understandable general discussion of threads. Approaches to thread scheduling are discussed in [2, 12, 32, 78]. Finally, the POSIX standard [49, 51] is a surprisingly readable account of the conflicting issues and choices involved in implementing a usable threads package.
Chapter 13. Thread Synchronization
POSIX supports mutex locks for short-term locking and condition variables for waiting on events of unbounded duration. Signal handling in threaded programs presents additional complications that can be reduced if signal handlers are replaced with dedicated threads. This chapter illustrates these thread synchronization concepts by implementing controlled access to shared objects, reader-writer synchronization and barriers.
Learn the basics of thread synchronization
Experiment with mutex locks and condition variables
Explore classic synchronization problems
Use threads with signals
Understand design tradeoffs for synchronization
13.1 POSIX Synchronization Functions
This chapter discusses mutex locks, condition variables and read-write locks. Table 13.1 summarizes the synchronization functions that are available in the POSIX:THR Extension. Each synchronization mechanism provides an initialization function and a function for destroying the object. The mutex locks and condition variables allow static initialization. All three types of synchronization have associated attribute objects, but we work only with synchronization objects that have the default attributes.
Table 13.1. Synchronization functions for POSIX:THR threads.
mutex locks | pthread_mutex_destroy | pthread_mutex_init | pthread_mutex_lock | pthread_mutex_trylock | pthread_mutex_unlock
condition variables | pthread_cond_broadcast | pthread_cond_destroy | pthread_cond_init | pthread_cond_signal | pthread_cond_timedwait | pthread_cond_wait
read-write locks | pthread_rwlock_destroy | pthread_rwlock_init | pthread_rwlock_rdlock | pthread_rwlock_timedrdlock | pthread_rwlock_timedwrlock | pthread_rwlock_tryrdlock | pthread_rwlock_trywrlock | pthread_rwlock_wrlock
13.2 Mutex Locks
A mutex is a special variable that can be either in the locked state or the unlocked state. If the mutex is locked, it has a distinguished thread that holds or owns the mutex. If no thread holds the mutex, we say the mutex is unlocked, free or available. The mutex also has a queue for the threads that are waiting to hold the mutex. The order in which the threads in the mutex queue obtain the mutex is determined by the thread-scheduling policy, but POSIX does not require that any particular policy be implemented.
When the mutex is free and a thread attempts to acquire the mutex, that thread obtains the mutex and is not blocked. It is convenient to think of this case as first causing the thread to enter the queue and then automatically removing it from the queue and giving it the mutex.
The mutex or mutex lock is the simplest and most efficient thread synchronization mechanism. Programs use mutex locks to preserve critical sections and to obtain exclusive access to resources. A mutex is meant to be held for short periods of time. Mutex functions are not thread cancellation points and are not interrupted by signals. A thread that waits for a mutex is not logically interruptible except by termination of the process, termination of a thread with pthread_exit (from a signal handler), or asynchronous cancellation (which is normally not used).
Mutex locks are ideal for making changes to data structures in which the state of the data structure is temporarily inconsistent, as when updating pointers in a shared linked list. These locks are designed to be held for a short time. Use condition variables to synchronize on events of indefinite duration such as waiting for input.
13.2.1 Creating and initializing a mutex
POSIX uses variables of type pthread_mutex_t to represent mutex locks. A program must always initialize pthread_mutex_t variables before using them for synchronization. For statically allocated pthread_mutex_t variables, simply assign PTHREAD_MUTEX_INITIALIZER to the variable. For mutex variables that are dynamically allocated or that don't have the default mutex attributes, call pthread_mutex_init to perform initialization.
The mutex parameter of pthread_mutex_init is a pointer to the mutex to be initialized. Pass NULL for the attr parameter of pthread_mutex_init to initialize a mutex with the default attributes. Otherwise, first create and initialize a mutex attribute object in a manner similar to that used for thread attribute objects.
SYNOPSIS
#include <pthread.h>
int pthread_mutex_init(pthread_mutex_t *restrict mutex,
const pthread_mutexattr_t *restrict attr);
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
POSIX:THR
If successful, pthread_mutex_init returns 0. If unsuccessful, pthread_mutex_init returns a nonzero error code. The following table lists the mandatory errors for pthread_mutex_init.
EAGAIN | system lacks nonmemory resources needed to initialize *mutex
ENOMEM | system lacks memory resources needed to initialize *mutex
EPERM | caller does not have appropriate privileges
Example 13.1
The following code segment initializes the mylock mutex with the default attributes, using the static initializer.
pthread_mutex_t mylock = PTHREAD_MUTEX_INITIALIZER;
The mylock variable must be allocated statically.
Static initializers are usually more efficient than pthread_mutex_init, and they are guaranteed to be performed exactly once before any thread begins execution.
Example 13.2
The following code segment initializes the mylock mutex with the default attributes. The mylock variable must be accessible to all the threads that use it.
int error;
pthread_mutex_t mylock;
if (error = pthread_mutex_init(&mylock, NULL))
fprintf(stderr, "Failed to initialize mylock:%s\n", strerror(error));
Example 13.2 uses the strerror function to output a message associated with error. Unfortunately, POSIX does not require strerror to be thread-safe (though many implementations have made it thread-safe). If multiple threads don't call strerror at the same time, you can still use it in threaded programs. For example, if all functions return error indications and only the main thread prints error messages, the main thread can safely call strerror. Section 13.7 gives a thread-safe and signal-safe implementation, strerror_r.
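For example, the message of Example 13.2 could instead be produced with the POSIX version of strerror_r, which fills a caller-supplied buffer and returns 0 on success (some systems also ship a nonconforming variant that returns a char *, so treat this as a sketch, assuming stdio.h and string.h are included).
char buf[64];

if (strerror_r(error, buf, sizeof(buf)))    /* nonzero return means failure */
    fprintf(stderr, "Failed to initialize mylock: error code %d\n", error);
else
    fprintf(stderr, "Failed to initialize mylock:%s\n", buf);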
Exercise 13.3
What happens if a thread tries to initialize a mutex that has already been initialized?
Answer:
POSIX explicitly states that the behavior is not defined, so avoid this situation in your programs.
13.2.2 Destroying a mutex
The pthread_mutex_destroy function destroys the mutex referenced by its parameter. The mutex parameter is a pointer to the mutex to be destroyed. A pthread_mutex_t variable that has been destroyed with pthread_mutex_destroy can be reinitialized with pthread_mutex_init.
SYNOPSIS
#include <pthread.h>
int pthread_mutex_destroy(pthread_mutex_t *mutex);
POSIX:THR
If successful, pthread_mutex_destroy returns 0. If unsuccessful, it returns a nonzero error code. No mandatory errors are defined for pthread_mutex_destroy.
Example 13.4
The following code segment destroys a mutex.
pthread_mutex_t mylock;
if (error = pthread_mutex_destroy(&mylock))
fprintf(stderr, "Failed to destroy mylock:%s\n", strerror(error));
Exercise 13.5
What happens if a thread references a mutex after it has been destroyed? What happens if one thread calls pthread_mutex_destroy and another thread has the mutex locked?
Answer:
POSIX explicitly states that the behavior in both situations is not defined.
13.2.3 Locking and unlocking a mutex
POSIX provides two functions for acquiring a mutex: pthread_mutex_lock, which blocks until the mutex is available, and pthread_mutex_trylock, which always returns immediately. The pthread_mutex_unlock function releases the specified mutex. All three functions take a single parameter, mutex, a pointer to the mutex.
SYNOPSIS
#include <pthread.h>
int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_trylock(pthread_mutex_t *mutex);
int pthread_mutex_unlock(pthread_mutex_t *mutex);
POSIX:THR
If successful, these functions return 0. If unsuccessful, these functions return a nonzero error code. The following table lists the mandatory errors for the three functions.
EINVAL | mutex has protocol attribute PTHREAD_PRIO_PROTECT and caller's priority is higher than mutex's current priority ceiling (pthread_mutex_lock or pthread_mutex_trylock)
EBUSY | another thread holds the lock (pthread_mutex_trylock)
The PTHREAD_PRIO_PROTECT attribute prevents priority inversions of the sort described in Section 13.8.
Example 13.6
The following code segment uses a mutex to protect a critical section.
pthread_mutex_t mylock = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_lock(&mylock);
/* critical section */
pthread_mutex_unlock(&mylock);
The code omits error checking for clarity.
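The pthread_mutex_trylock function supports a nonblocking variation of this pattern: if another thread holds the mutex, the caller can do other work instead of waiting. The following sketch (not from the text) assumes errno.h has been included so that EBUSY is defined.
pthread_mutex_t mylock = PTHREAD_MUTEX_INITIALIZER;
int error;

error = pthread_mutex_trylock(&mylock);
if (error == 0) {                  /* acquired the mutex */
    /* critical section */
    pthread_mutex_unlock(&mylock);
} else if (error == EBUSY) {
    /* another thread holds the mutex; do something else and try again later */
} else {
    /* report the error */
}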
Locking and unlocking are voluntary in the sense that a program achieves mutual exclusion only when its threads correctly acquire the appropriate mutex before entering their critical sections and release the mutex when finished. Nothing prevents an uncooperative thread from entering its critical section without acquiring the mutex. One way to ensure exclusive access to objects is to permit access only through well-defined functions and to put the locking calls in these functions. The locking mechanism is then transparent to the calling threads.
Program 13.1 shows an example of a thread-safe counter that might be used for reference counts in a threaded program. The locking mechanisms are hidden in the functions, and the calling program does not have to worry about using mutex variables. The count and countlock variables have the static attribute, so these variables can be referenced only from within counter.c. Following the pattern of the POSIX threads library, the functions in Program 13.1 return 0 if successful or a nonzero error code if unsuccessful.
Exercise 13.7
What can go wrong in a threaded program if the count variable of Program 13.1 is not protected with mutex locks?
Answer:
Without locking, it is possible to get an incorrect value for count, since incrementing and decrementing a variable are not atomic operations on most machines. (Typically, incrementing consists of three distinct steps: loading a memory location into a CPU register, adding 1 to the register, and storing the value back in memory.) Suppose a thread is in the middle of the increment when the process quantum expires. The thread scheduler may select another thread to run when the process runs again. If the newly selected thread also tries to increment or decrement count, the variable's value will be incorrect when the original thread completes its operation.
Program 13.1 counter.c
A counter that can be accessed by multiple threads.
#include <pthread.h>
static int count = 0;
static pthread_mutex_t countlock = PTHREAD_MUTEX_INITIALIZER;
int increment(void) { /* increment the counter */
int error;
if (error = pthread_mutex_lock(&countlock))
return error;
count++;
return pthread_mutex_unlock(&countlock);
}
int decrement(void) { /* decrement the counter */
int error;
if (error = pthread_mutex_lock(&countlock))
return error;
count--;
return pthread_mutex_unlock(&countlock);
}
int getcount(int *countp) { /* retrieve the counter */
int error;
if (error = pthread_mutex_lock(&countlock))
return error;
*countp = count;
return pthread_mutex_unlock(&countlock);
}
13.2.4 Protecting unsafe library functions
A mutex can be used to protect an unsafe library function. The rand function from the C library takes no parameters and returns a pseudorandom integer in the range 0 to RAND_MAX. It is listed in the POSIX standard as being unsafe in multithreaded applications. The rand function can be used in a multithreaded environment if it is guaranteed that no two threads are concurrently calling it. Program 13.2 shows an implementation of the function randsafe that uses rand to produce a single per-process sequence of pseudorandom double values in the range from 0 to 1. Note that rand and therefore randsafe are not particularly good generators; avoid them in real applications.
Program 13.2 randsafe.c
A random number generator protected by a mutex.
#include <pthread.h>
#include <stdlib.h>
int randsafe(double *ranp) {
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
int error;
if (error = pthread_mutex_lock(&lock))
return error;
*ranp = (rand() + 0.5)/(RAND_MAX + 1.0);
return pthread_mutex_unlock(&lock);
}
13.2.5 Synchronizing flags and global values
Program 13.3 shows an implementation of a synchronized flag that is initially zero. The getdone function returns the value of the synchronized flag, and the setdone function changes the value of the synchronized flag to 1.
Program 13.3 doneflag.c
A synchronized flag that is 1 if setdone has been called at least once.
#include <pthread.h>
static int doneflag = 0;
static pthread_mutex_t donelock = PTHREAD_MUTEX_INITIALIZER;
int getdone(int *flag) { /* get the flag */
int error;
if (error = pthread_mutex_lock(&donelock))
return error;
*flag = doneflag;
return pthread_mutex_unlock(&donelock);
}
int setdone(void) { /* set the flag */
int error;
if (error = pthread_mutex_lock(&donelock))
return error;
doneflag = 1;
return pthread_mutex_unlock(&donelock);
}
Example 13.8
The following code segment uses the synchronized flag of Program 13.3 to decide whether to process another command in a threaded program.
void docommand(void);
int error = 0;
int done = 0;
while(!done && !error) {
docommand();
error = getdone(&done);
}
Program 13.4 shows a synchronized implementation of a global error value. Functions from different files can call seterror with return values from various functions. The seterror function returns immediately if the error parameter is zero, indicating no error. Otherwise, seterror acquires the mutex and assigns error to globalerror if globalerror is zero. In this way, globalerror holds the error code of the first error that it is assigned. Notice that seterror returns the original error unless there was a problem acquiring or releasing the internal mutex. In this case, the global error value may not be meaningful and both seterror and geterror return the error code from the locking problem.
Program 13.4 globalerror.c
A shared global error flag.
#include <pthread.h>
static int globalerror = 0;
static pthread_mutex_t errorlock = PTHREAD_MUTEX_INITIALIZER;
int geterror(int *error) { /* get the error flag */
int terror;
if (terror = pthread_mutex_lock(&errorlock))
return terror;
*error = globalerror;
return pthread_mutex_unlock(&errorlock);
}
int seterror(int error) { /* globalerror set to error if first error */
int terror;
if (!error) /* it wasn't an error, so don't change globalerror */
return error;
if (terror = pthread_mutex_lock(&errorlock)) /* couldn't get lock */
return terror;
if (!globalerror)
globalerror = error;
terror = pthread_mutex_unlock(&errorlock);
return terror? terror: error;
}
Program 13.5 shows a synchronized implementation of a shared sum object that uses the global error flag of Program 13.4.
Program 13.5 sharedsum.c
A shared sum object that uses the global error flag of Program 13.4.
#include <pthread.h>
#include "globalerror.h"
static int count = 0;
static double sum = 0.0;
static pthread_mutex_t sumlock = PTHREAD_MUTEX_INITIALIZER;
int add(double x) { /* add x to sum */
int error;
if (error = pthread_mutex_lock(&sumlock))
return seterror(error);
sum += x;
count++;
error = pthread_mutex_unlock(&sumlock);
return seterror(error);
}
int getsum(double *sump) { /* return sum */
int error;
if (error = pthread_mutex_lock(&sumlock))
return seterror(error);
*sump = sum;
error = pthread_mutex_unlock(&sumlock);
return seterror(error);
}
int getcountandsum(int *countp, double *sump) { /* return count and sum */
int error;
if (error = pthread_mutex_lock(&sumlock))
return seterror(error);
*countp = count;
*sump = sum;
error = pthread_mutex_unlock(&sumlock);
return seterror(error);
}
Because mutex locks must be accessible to all the threads that need to synchronize, they often appear as global variables (internal or external linkage). Although C is not object oriented, an object organization is often useful. Internal linkage should be used for those objects that do not need to be accessed from outside a given file. Programs 13.1 through 13.5 illustrate methods of doing this. We now illustrate how to use these synchronized objects in a program.
Program 13.6 shows a function that can be called as a thread to do a simple calculation. The computethread calculates the sine of a random number between 0 and 1 in a loop, adding the result to the synchronized sum given by Program 13.5. The computethread sleeps for a short time after each calculation, allowing other threads to use the CPU. The computethread thread uses the doneflag of Program 13.3 to terminate when another thread sets the flag.
Program 13.6 computethread.c
A thread that computes sums of random sines.
#include <math.h>
#include <stdlib.h>
#include <time.h>
#include "doneflag.h"
#include "globalerror.h"
#include "randsafe.h"
#include "sharedsum.h"
#define TEN_MILLION 10000000L
/* ARGSUSED */
void *computethread(void *arg1) { /* compute a random partial sum */
int error;
int localdone = 0;
struct timespec sleeptime;
double val;
sleeptime.tv_sec = 0;
sleeptime.tv_nsec = TEN_MILLION; /* 10 ms */
while (!localdone) {
if (error = randsafe(&val)) /* get a random number between 0.0 and 1.0 */
break;
if (error = add(sin(val)))
break;
if (error = getdone(&localdone))
break;
nanosleep(&sleeptime, NULL); /* let other threads in */
}
seterror(error);
return NULL;
}
Program 13.7 is a driver program that creates a number of computethread threads and allows them to compute for a given number of seconds before it sets a flag to end the calculations. The main program then calls the showresults function of Program 13.8 to retrieve the shared sum and number of the summed values. The showresults function computes the average from these values. It also calculates the theoretical average value of the sine function over the interval [0,1] and gives the total and percentage error of the average value.
The second command-line argument of computethreadmain is the number of seconds to sleep after creating the threads. After sleeping, computethreadmain calls setdone, causing the threads to terminate. The computethreadmain program then uses pthread_join to wait for the threads to finish and calls showresults. The showresults function uses geterror to check to see that all threads completed without reporting an error. If all is well, showresults displays the results.
Program 13.7 computethreadmain.c
A main program that creates a number of computethread threads and allows them to execute for a given number of seconds.
#include <math.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include "computethread.h"
#include "doneflag.h"
#include "globalerror.h"
#include "sharedsum.h"
int showresults(void);
int main(int argc, char *argv[]) {
int error;
int i;
int numthreads;
int sleeptime;
pthread_t *tids;
if (argc != 3) { /* pass number threads and sleeptime on command line */
fprintf(stderr, "Usage: %s numthreads sleeptime\n", argv[0]);
return 1;
}
numthreads = atoi(argv[1]); /* allocate an array for the thread ids */
sleeptime = atoi(argv[2]);
if ((tids = (pthread_t *)calloc(numthreads, sizeof(pthread_t))) == NULL) {
perror("Failed to allocate space for thread IDs");
return 1;
}
for (i = 0; i < numthreads; i++) /* create numthreads computethreads */
if (error = pthread_create(tids + i, NULL, computethread, NULL)) {
fprintf(stderr, "Failed to start thread %d:%s\n", i, strerror(error));
return 1;
}
sleep(sleeptime); /* give them some time to compute */
if (error = setdone()) { /* tell the computethreads to quit */
fprintf(stderr, "Failed to set done:%s\n", strerror(error));
return 1;
}
for (i = 0; i < numthreads; i++) /* make sure that they are all done */
if (error = pthread_join(tids[i], NULL)) {
fprintf(stderr, "Failed to join thread %d:%s\n", i, strerror(error));
return 1;
}
if (showresults())
return 1;
return 0;
}
Program 13.8 showresults.c
A function that displays the results of the computethread calculations.
#include <math.h>
#include <stdio.h>
#include <string.h>
#include "globalerror.h"
#include "sharedsum.h"
int showresults(void) {
double average;
double calculated;
int count;
double err;
int error;
int gerror;
double perr;
double sum;
if (((error = getcountandsum(&count, &sum)) != 0) ||
((error = geterror(&gerror)) != 0)) { /* get results */
fprintf(stderr, "Failed to get results: %s\n", strerror(error));
return -1;
}
if (gerror) { /* an error occurred in compute thread computation */
fprintf(stderr, "Failed to compute sum: %s\n", strerror(gerror));
return -1;
}
if (count == 0)
printf("No values were summed.\n");
else {
calculated = 1.0 - cos(1.0);
average = sum/count;
err = average - calculated;
perr = 100.0*err/calculated;
printf("The sum is %f and the count is %d\n", sum, count);
printf("The average is %f and error is %f or %f%%\n", average, err, perr);
}
return 0;
}
13.2.6 Making data structures thread-safe
Most shared data structures in a threaded program must be protected with synchronization mechanisms to ensure correct results. Program 13.9 illustrates how to use a single mutex to make the list object of Program 2.7 thread-safe. The listlib.c program should be included in the listlib_r.c file. All the functions in listlib.c should be qualified with the static attribute so that they are not accessible outside the file. The list object functions of Program 2.7 return -1 and set errno to report an error. The implementation of Program 13.9 preserves this handling of the errors. Since each thread has its own errno, setting errno in the listlib_r functions is not a problem. The implementation just wraps each function in a pair of mutex calls. Most of the code is for properly handling errors that occur during the mutex calls.
Program 13.9 listlib_r.c
Wrapper functions to make the list object of Program 2.7 thread-safe.
#include <errno.h>
#include <pthread.h>
static pthread_mutex_t listlock = PTHREAD_MUTEX_INITIALIZER;
int accessdata_r(void) { /* return nonnegative traversal key if successful */
int error;
int key;
if (error = pthread_mutex_lock(&listlock)) { /* no mutex, give up */
errno = error;
return -1;
}
key = accessdata();
if (key == -1) {
error = errno;
pthread_mutex_unlock(&listlock);
errno = error;
return -1;
}
if (error = pthread_mutex_unlock(&listlock)) {
errno = error;
return -1;
}
return key;
}
int adddata_r(data_t data) { /* allocate a node on list to hold data */
int error;
if (error = pthread_mutex_lock(&listlock)) { /* no mutex, give up */
errno = error;
return -1;
}
if (adddata(data) == -1) {
error = errno;
pthread_mutex_unlock(&listlock);
errno = error;
return -1;
}
if (error = pthread_mutex_unlock(&listlock)) {
errno = error;
return -1;
}
return 0;
}
int getdata_r(int key, data_t *datap) { /* retrieve node by key */
int error;
if (error = pthread_mutex_lock(&listlock)) { /* no mutex, give up */
errno = error;
return -1;
}
if (getdata(key, datap) == -1) {
error = errno;
pthread_mutex_unlock(&listlock);
errno = error;
return -1;
}
if (error = pthread_mutex_unlock(&listlock)) {
errno = error;
return -1;
}
return 0;
}
int freekey_r(int key) { /* free the key */
int error;
if (error = pthread_mutex_lock(&listlock)) { /* no mutex, give up */
errno = error;
return -1;
}
if (freekey(key) == -1) {
error = errno;
pthread_mutex_unlock(&listlock);
errno = error;
return -1;
}
if (error = pthread_mutex_unlock(&listlock)) {
errno = error;
return -1;
}
return 0;
}
The implementation of Program 13.9 uses a straight locking strategy that allows only one thread at a time to proceed. Section 13.6 revisits this problem with an implementation that allows multiple threads to execute the getdata function at the same time by using reader-writer synchronization.
13.3 At-Most-Once and At-Least-Once Execution
If a mutex isn't statically initialized, the program must call pthread_mutex_init before using any of the other mutex functions. For programs that have a well-defined initialization phase before they create additional threads, the main thread can perform this initialization. Not all problems fit this structure. Care must be taken to call pthread_mutex_init before any thread accesses a mutex, but having each thread initialize the mutex doesn't work either. The effect of calling pthread_mutex_init for a mutex that has already been initialized is not defined.
The notion of single initialization is so important that POSIX provides the pthread_once function to ensure these semantics. The once_control parameter must be statically initialized with PTHREAD_ONCE_INIT. The init_routine is called the first time pthread_once is called with a given once_control, and init_routine is not called on subsequent calls. When a thread returns from pthread_once without error, the init_routine has been completed by some thread.
SYNOPSIS
#include <pthread.h>
int pthread_once(pthread_once_t *once_control,
void (*init_routine)(void));
pthread_once_t once_control = PTHREAD_ONCE_INIT;
POSIX:THR
If successful, pthread_once returns 0. If unsuccessful, pthread_once returns a nonzero error code. No mandatory errors are defined for pthread_once.
Program 13.10 uses pthread_once to implement an initialization function, printinitonce. Notice that var isn't protected by a mutex because it is changed only once by printinitonce, and that modification occurs before any caller returns from printinitonce.
Program 13.10 printinitonce.c
A function that uses pthread_once to initialize a variable and print a statement at most once.
#include <pthread.h>
#include <stdio.h>
static pthread_once_t initonce = PTHREAD_ONCE_INIT;
int var;
static void initialization(void) {
var = 1;
printf("The variable was initialized to %d\n", var);
}
int printinitonce(void) { /* call initialization at most once */
return pthread_once(&initonce, initialization);
}
The initialization function of printinitonce has no parameters, making it hard to initialize var to something other than a fixed value. Program 13.11 shows an alternative implementation of at-most-once initialization that uses a statically initialized mutex. The printinitmutex function performs the initialization and printing at most once regardless of how many different variables or values are passed. If successful, printinitmutex returns 0. If unsuccessful, printinitmutex returns a nonzero error code. The mutex in printinitmutex is declared in the function so that it is accessible only inside the function. Giving the mutex static storage class guarantees that the same mutex is used every time the function is called.
Program 13.11 printinitmutex.c
A function that uses a statically initialized mutex to initialize a variable and print a statement at most once.
#include <pthread.h>
#include <stdio.h>
int printinitmutex(int *var, int value) {
static int done = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
int error;
if (error = pthread_mutex_lock(&lock))
return error;
if (!done) {
*var = value;
printf("The variable was initialized to %d\n", value);
done = 1;
}
return pthread_mutex_unlock(&lock);
}
Example 13.9
The following code segment initializes whichiteration to the index of the first loop iteration in which dostuff returns a nonzero value.
int whichiteration = -1;
void *thisthread(void *arg) {
    int i;
    for (i = 0; i < 100; i++)
        if (dostuff())
            printinitmutex(&whichiteration, i);
    return NULL;
}
The whichiteration value is changed at most once, even if the program creates several threads running thisthread.
The testandsetonce function of Program 13.12 atomically sets an internal variable to 1 and returns the previous value of the internal variable in its ovalue parameter. The first call to testandsetonce initializes done to 0, sets *ovalue to 0 and sets done to 1. Subsequent calls set *ovalue to 1. The mutex ensures that no two threads have ovalue set to 0. If successful, testandsetonce returns 0. If unsuccessful, testandsetonce returns a nonzero error code.
Exercise 13.10
What happens if you remove the static qualifier from the done and lock variables of testandsetonce of Program 13.12?
Answer:
The static qualifier for variables inside a block ensures that they remain in existence for subsequent executions of the block. Without the static qualifier, done and lock become automatic variables. In this case, each call to testandsetonce allocates new variables and each return deallocates them. The function no longer works.
Program 13.12 testandsetonce.c
A function that uses a mutex to set a variable to 1 at most once.
#include <pthread.h>
int testandsetonce(int *ovalue) {
static int done = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
int error;
if (error = pthread_mutex_lock(&lock))
return error;
*ovalue = done;
done = 1;
return pthread_mutex_unlock(&lock);
}
Exercise 13.11
Does testandsetonce still work if you move the declarations of done and lock outside the testandsetonce function?
Answer:
Yes, testandsetonce still works. However, now done and lock are accessible to other functions defined in the same file. Keeping them inside the function is safer for enforcing at-most-once semantics.
Exercise 13.12
Does the following use of testandsetonce of Program 13.12 ensure that the initialization of var and the printing of the message occur at most once?
int error;
int oflag;
int var;
error = testandsetonce(&oflag);
if (!error && !oflag) {
var = 1;
printf("The variable has been initialized to 1\n");
}
var++;
Answer:
No. Successive calls to testandsetonce of Program 13.12 can return before the variable has been initialized. Consider the following scenario in which var must be initialized before being incremented.
Thread A calls testandsetonce.
The testandsetonce call returns in thread A.
Thread A loses the CPU.
Thread B calls testandsetonce.
The testandsetonce call returns to thread B without printing or initializing var.
Thread B assumes that var has been initialized, and it increments the variable.
Thread A gets the CPU again and initializes var to 1.
In this case, var should have the value 2, since it was initialized to 1 and incremented once. Unfortunately, it has the value 1.
The strategies discussed in this section guarantee at-most-once execution. They do not guarantee that code has been executed at least once. At-least-once semantics are important for initialization. For example, suppose that you choose to use pthread_mutex_init rather than the static initializer to initialize a mutex. You need both at-most-once and at-least-once semantics. In other words, you need to perform an operation such as initialization exactly once. Sometimes the structure of the program ensures that this is the case: a main thread performs all necessary initialization before creating any threads. In other situations, each thread must call initialization when it starts executing, or each function must call the initialization before accessing the mutex. In these cases, you will need to use at-most-once strategies in conjunction with the calls.
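For example, pthread_once can provide both guarantees for dynamic mutex initialization. The following sketch (the function names lockit and initlock are hypothetical, not from the text) has every caller ensure exactly-once initialization before locking.
#include <pthread.h>

static pthread_mutex_t mylock;
static pthread_once_t lockinit = PTHREAD_ONCE_INIT;

static void initlock(void) {             /* run at most once by pthread_once */
    pthread_mutex_init(&mylock, NULL);   /* error handling omitted in the once routine */
}

int lockit(void) {    /* each caller gets at-least-once initialization before locking */
    int error;
    if (error = pthread_once(&lockinit, initlock))
        return error;
    return pthread_mutex_lock(&mylock);
}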
13.4 Condition Variables
Consider the problem of having a thread wait until some arbitrary condition is satisfied. For concreteness, assume that two variables, x and y, are shared by multiple threads. We want a thread to wait until x and y are equal. A typical incorrect busy-waiting solution is
while (x != y) ;
Having a thread use busy waiting is particularly troublesome. Depending on how the threads are scheduled, the thread doing the busy waiting may prevent other threads from ever using the CPU, in which case x and y never change. Also, access to shared variables should always be protected.
Here is the correct strategy for non-busy waiting for the predicate x==y to become true.
Lock a mutex.
Test the condition x==y.
If true, unlock the mutex and exit the loop.
If false, suspend the thread and unlock the mutex.
The mutex must be held until a test determines whether to suspend the thread. Holding the mutex prevents the condition x==y from changing between the test and the suspension of the thread. The mutex needs to be unlocked while the thread is suspended so that other threads can access x and y. The strategy assumes that the code protects all other access to the shared variables x and y with the mutex.
Applications manipulate mutex queues through well-defined system library functions such as pthread_mutex_lock and pthread_mutex_unlock. These functions are not sufficient to implement (in a simple manner) the queue manipulations required here. We need a new data type, one associated with a queue of processes waiting for an arbitrary condition such as x==y to become true. Such a data type is called a condition variable.
A classical condition variable is associated with a particular condition. In contrast, POSIX condition variables provide an atomic waiting mechanism but are not associated with particular conditions.
The function pthread_cond_wait takes a condition variable and a mutex as parameters. It atomically suspends the calling thread and unlocks the mutex. It can be thought of as placing the thread in a queue of threads waiting to be notified of a change in a condition. The function returns with the mutex reacquired when the thread receives a notification. The thread must test the condition again before proceeding.
Example 13.13
The following code segment illustrates how to wait for the condition x==y, using a POSIX condition variable v and a mutex m.
pthread_mutex_lock(&m);
while (x != y)
pthread_cond_wait(&v, &m);
/* modify x or y if necessary */
pthread_mutex_unlock(&m);
When the thread returns from pthread_cond_wait it owns m, so it can safely test the condition again. The code segment omits error checking for clarity.
The function pthread_cond_wait should be called only by a thread that owns the mutex, and the thread owns the mutex again when the function returns. The suspended thread has the illusion of uninterrupted mutex ownership because it owns the mutex before the call to pthread_cond_wait and owns the mutex when pthread_cond_wait returns. In reality, the mutex can be acquired by other threads during the suspension.
A thread that modifies x or y can call pthread_cond_signal to notify other threads of the change. The pthread_cond_signal function takes a condition variable as a parameter and attempts to wake up at least one of the threads waiting in the corresponding queue. Since the blocked thread cannot return from pthread_cond_wait without owning the mutex, pthread_cond_signal has the effect of moving the thread from the condition variable queue to the mutex queue.
Example 13.14
The following code might be used by another thread in conjunction with Example 13.13 to notify the waiting thread that it has incremented x.
pthread_mutex_lock(&m);
x++;
pthread_cond_signal(&v);
pthread_mutex_unlock(&m);
The code segment omits error checking for clarity.
In Example 13.14, the caller holds the mutex while calling pthread_cond_signal. POSIX does not require this to be the case, and the caller could have unlocked the mutex before signaling. In programs that have threads of different priorities, holding the mutex while signaling can prevent lower priority threads from acquiring the mutex and executing before a higher-priority thread is awakened.
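The alternative ordering mentioned above would look like the following (paralleling Example 13.14 and again omitting error checking). The notification is still correct because the waiter retests the predicate after reacquiring the mutex.
pthread_mutex_lock(&m);
x++;
pthread_mutex_unlock(&m);
pthread_cond_signal(&v);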
Several threads may use the same condition variables to wait on different predicates. The waiting threads must verify that the predicate is satisfied when they return from the wait. The threads that modify x or y do not need to know what conditions are being waited for; they just need to know which condition variable is being used.
Exercise 13.15
Compare the use of condition variables with the use of sigsuspend as described in Example 8.24 on page 275.
Answer:
The concepts are similar. Example 8.24 blocks the signal and tests the condition. Blocking the signal is analogous to locking the mutex since the signal handler cannot access the global variable sigreceived while the signal is blocked. The sigsuspend atomically unblocks the signal and suspends the process. When sigsuspend returns, the signal is blocked again. With condition variables, the thread locks the mutex to protect its critical section and tests the condition. The pthread_cond_wait atomically releases the mutex and suspends the process. When pthread_cond_wait returns, the thread owns the mutex again.
13.4.1 Creating and destroying condition variables
POSIX represents condition variables by variables of type pthread_cond_t. A program must always initialize pthread_cond_t variables before using them. For statically allocated pthread_cond_t variables with the default attributes, simply assign PTHREAD_COND_INITIALIZER to the variable. For variables that are dynamically allocated or don't have the default attributes, call pthread_cond_init to perform initialization. Pass NULL for the attr parameter of pthread_cond_init to initialize a condition variable with the default attributes. Otherwise, first create and initialize a condition variable attribute object in a manner similar to that used for thread attribute objects.
SYNOPSIS
#include <pthread.h>
int pthread_cond_init(pthread_cond_t *restrict cond,
const pthread_condattr_t *restrict attr);
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
POSIX:THR
If successful, pthread_cond_init returns 0. If unsuccessful, pthread_cond_init returns a nonzero error code. The following table lists the mandatory errors for pthread_cond_init.
EAGAIN | system lacked nonmemory resources needed to initialize *cond
ENOMEM | system lacked memory resources needed to initialize *cond
Example 13.16
The following code segment initializes a condition variable.
pthread_cond_t barrier;
int error;
if (error = pthread_cond_init(&barrier, NULL))
fprintf(stderr, "Failed to initialize barrier:%s\n", strerror(error));
The code assumes that strerror will not be called by multiple threads. Otherwise, strerror_r of Section 13.7 should be used.
Exercise 13.17
What happens if a thread tries to initialize a condition variable that has already been initialized?
Answer:
The POSIX standard explicitly states that the results are not defined, so you should avoid doing this.
The pthread_cond_destroy function destroys the condition variable referenced by its cond parameter. A pthread_cond_t variable that has been destroyed with pthread_cond_destroy can be reinitialized with pthread_cond_init.
SYNOPSIS
#include <pthread.h>
int pthread_cond_destroy(pthread_cond_t *cond);
POSIX:THR
If successful, pthread_cond_destroy returns 0. If unsuccessful, it returns a nonzero error code. No mandatory errors are defined for pthread_cond_destroy.
Example 13.18
The following code segment destroys the condition variable tcond.
pthread_cond_t tcond;
if (error = pthread_cond_destroy(&tcond))
fprintf(stderr, "Failed to destroy tcond:%s\n", strerror(error));
Exercise 13.19
What happens if a thread references a condition variable that has been destroyed?
Answer:
POSIX explicitly states that the results are not defined. The standard also does not define what happens when a thread attempts to destroy a condition variable on which other threads are blocked.
13.4.2 Waiting and signaling on condition variables
Condition variables derive their name from the fact that they are called in conjunction with testing a predicate or condition. Typically, a thread tests a predicate and calls pthread_cond_wait if the test fails. The pthread_cond_timedwait function can be used to wait for a limited time. The first parameter of these functions is cond, a pointer to the condition variable. The second parameter is mutex, a pointer to a mutex that the thread acquired before the call. The wait operation causes the thread to release this mutex when the thread is placed on the condition variable wait queue. The pthread_cond_timedwait function has a third parameter, a pointer to the time to return if a condition variable signal does not occur first. Notice that this value represents an absolute time, not a time interval.
SYNOPSIS
#include <pthread.h>
int pthread_cond_timedwait(pthread_cond_t *restrict cond,
pthread_mutex_t *restrict mutex,
const struct timespec *restrict abstime);
int pthread_cond_wait(pthread_cond_t *restrict cond,
pthread_mutex_t *restrict mutex);
POSIX:THR
If successful, pthread_cond_timedwait and pthread_cond_wait return 0. If unsuccessful, these functions return a nonzero error code. The pthread_cond_timedwait function returns ETIMEDOUT if the time specified by abstime has expired. If a signal is delivered while a thread is waiting for a condition variable, these functions may resume waiting upon return from the signal handler, or they may return 0 because of a spurious wakeup.
Example 13.20
The following code segment causes a thread to (nonbusy) wait until a is greater than or equal to b.
pthread_mutex_lock(&mutex);
while (a < b)
pthread_cond_wait(&cond, &mutex);
pthread_mutex_unlock(&mutex);
The code omits error checking for clarity.
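By contrast, a bounded wait with pthread_cond_timedwait needs an absolute deadline. The following sketch (not from the text; it assumes the same a, b, mutex and cond as Example 13.20) computes a deadline roughly five seconds in the future with clock_gettime, which is available on implementations supporting the POSIX timers functions; error checking other than the timeout is omitted.
#include <errno.h>
#include <pthread.h>
#include <time.h>

struct timespec deadline;
int error = 0;

clock_gettime(CLOCK_REALTIME, &deadline);   /* absolute time, not an interval */
deadline.tv_sec += 5;                       /* about five seconds from now */
pthread_mutex_lock(&mutex);
while ((a < b) && (error != ETIMEDOUT))
    error = pthread_cond_timedwait(&cond, &mutex, &deadline);
pthread_mutex_unlock(&mutex);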
The calling thread should obtain a mutex before it tests the predicate or calls pthread_cond_wait. The implementation guarantees that pthread_cond_wait causes the thread to atomically release the mutex and block.
Exercise 13.21
What happens if one thread executes the code of Example 13.20 by using mutex and another thread executes Example 13.20 by using mutexA?
Answer:
This is allowed as long as the two threads are not concurrent. The condition variable wait operations pthread_cond_wait and pthread_cond_timedwait effectively bind the condition variable to the specified mutex and release the binding on return. POSIX does not define what happens if threads use different mutex locks for concurrent wait operations on the same condition variable. The safest way to avoid this situation is to always use the same mutex with a given condition variable.
When another thread changes variables that might make the predicate true, it should awaken one or more threads that are waiting for the predicate to become true. The pthread_cond_signal function unblocks at least one of the threads that are blocked on the condition variable pointed to by cond. The pthread_cond_broadcast function unblocks all threads blocked on the condition variable pointed to by cond.
SYNOPSIS
#include <pthread.h>
int pthread_cond_broadcast(pthread_cond_t *cond);
int pthread_cond_signal(pthread_cond_t *cond);
POSIX:THR
If successful, pthread_cond_broadcast and pthread_cond_signal return 0. If unsuccessful, these functions return a nonzero error code.
Example 13.22
Suppose v is a condition variable and m is a mutex. The following is a proper use of the condition variable to access a resource if the predicate defined by test_condition() is true. This code omits error checking for clarity.
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t v = PTHREAD_COND_INITIALIZER;
pthread_mutex_lock(&m);
while (!test_condition()) /* get resource */
pthread_cond_wait(&v, &m);
/* do critical section, possibly changing test_condition() */
pthread_cond_signal(&v); /* inform another thread */
pthread_mutex_unlock(&m);
/* do other stuff */
When a thread executes the pthread_cond_wait in Example 13.22, it is holding the mutex m. It blocks atomically and releases the mutex, permitting another thread to acquire the mutex and modify the variables in the predicate. When a thread returns successfully from a pthread_cond_wait, it has acquired the mutex and can retest the predicate without explicitly reacquiring the mutex. Even if the program signals on a particular condition variable only when a certain predicate is true, waiting threads must still retest the predicate. The POSIX standard specifically allows pthread_cond_wait to return, even if no thread has called pthread_cond_signal or pthread_cond_broadcast.
Program 6.2 on page 187 implements a simple barrier by using a pipe. Program 13.13 implements a thread-safe barrier by using condition variables. The limit variable specifies how many threads must arrive at the barrier (execute the waitbarrier) before the threads are released from the barrier. The count variable specifies how many threads are currently waiting at the barrier. Both variables are declared with the static attribute to force access through initbarrier and waitbarrier. If successful, the initbarrier and waitbarrier functions return 0. If unsuccessful, these functions return a nonzero error code.
Remember that condition variables are not linked to particular predicates and that pthread_cond_wait can return because of spurious wakeups. Here are some rules for using condition variables.
Acquire the mutex before testing the predicate.
Retest the predicate after returning from a pthread_cond_wait, since the return might have been caused by some unrelated event or by a pthread_cond_signal that did not cause the predicate to become true.
Acquire the mutex before changing any of the variables appearing in the predicate.
Hold the mutex only for a short period of time, usually while testing the predicate or modifying shared variables.
Release the mutex either explicitly (with pthread_mutex_unlock) or implicitly (with pthread_cond_wait).
Program 13.13 tbarrier.c
Implementation of a thread-safe barrier.
#include <errno.h>
#include <pthread.h>
static pthread_cond_t bcond = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t bmutex = PTHREAD_MUTEX_INITIALIZER;
static int count = 0;
static int limit = 0;
int initbarrier(int n) { /* initialize the barrier to be size n */
int error;
if (error = pthread_mutex_lock(&bmutex)) /* couldn't lock, give up */
return error;
if (limit != 0) { /* barrier can only be initialized once */
pthread_mutex_unlock(&bmutex);
return EINVAL;
}
limit = n;
return pthread_mutex_unlock(&bmutex);
}
int waitbarrier(void) { /* wait at the barrier until all n threads arrive */
int berror = 0;
int error;
if (error = pthread_mutex_lock(&bmutex)) /* couldn't lock, give up */
return error;
if (limit <= 0) { /* make sure barrier initialized */
pthread_mutex_unlock(&bmutex);
return EINVAL;
}
count++;
while ((count < limit) && !berror)
berror = pthread_cond_wait(&bcond, &bmutex);
if (!berror)
berror = pthread_cond_broadcast(&bcond); /* wake up everyone */
error = pthread_mutex_unlock(&bmutex);
if (berror)
return berror;
return error;
}
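As a usage sketch (the worker function, the NTHREADS value and the tbarrier.h header are assumptions, not part of Program 13.13), the main thread would call initbarrier(NTHREADS) before creating the workers, and each worker waits at the barrier between its two phases.
#include <pthread.h>
#include "tbarrier.h"          /* assumed header declaring initbarrier and waitbarrier */

#define NTHREADS 4             /* illustrative thread count */

static void *worker(void *arg) {     /* hypothetical worker thread */
    /* ... phase 1 of the computation ... */
    if (waitbarrier())                /* block until all NTHREADS threads arrive */
        return NULL;
    /* ... phase 2, which relies on every thread having finished phase 1 ... */
    return NULL;
}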
13.6 Readers and Writers
The reader-writer problem refers to a situation in which a resource allows two types of access (reading and writing). One type of access must be granted exclusively (e.g., writing), but the other type may be shared (e.g., reading). For example, any number of processes can read from the same file without difficulty, but only one process should modify the file at a time.
Two common strategies for handling reader-writer synchronization are called strong reader synchronization and strong writer synchronization. Strong reader synchronization always gives preference to readers, granting access to readers as long as a writer is not currently writing. Strong writer synchronization always gives preference to writers, delaying readers until all waiting or active writers complete. An airline reservation system would use strong writer preference, since readers need the most up-to-date information. On the other hand, a library reference database might want to give readers preference.
POSIX provides read-write locks that allow multiple readers to acquire a lock, provided that a writer does not hold the lock. POSIX states that it is up to the implementation whether to allow a reader to acquire a lock if writers are blocked on the lock.
POSIX read-write locks are represented by variables of type pthread_rwlock_t. Programs must initialize pthread_rwlock_t variables before using them for synchronization by calling pthread_rwlock_init. The rwlock parameter is a pointer to a read-write lock. Pass NULL for the attr parameter of pthread_rwlock_init to initialize a read-write lock with the default attributes. Otherwise, first create and initialize a read-write lock attribute object in a manner similar to that used for thread attribute objects.
SYNOPSIS
#include <pthread.h>
int pthread_rwlock_init(pthread_rwlock_t *restrict rwlock,
const pthread_rwlockattr_t *restrict attr);
POSIX:THR
If successful, pthread_rwlock_init returns 0. If unsuccessful, it returns a nonzero error code. The following table lists the mandatory errors for pthread_rwlock_init.
EAGAIN    system lacked nonmemory resources needed to initialize *rwlock
ENOMEM    system lacked memory resources needed to initialize *rwlock
EPERM     caller does not have appropriate privileges
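For example, the following code segment (a minimal sketch, not one of the text's numbered programs) initializes a read-write lock through an explicitly created attribute object rather than by passing NULL. The setup_lock name is hypothetical; the attribute object is destroyed once the lock has been initialized.
#include <pthread.h>
int setup_lock(pthread_rwlock_t *lockp) { /* initialize *lockp with an attribute object */
   int error;
   pthread_rwlockattr_t attr;
   if (error = pthread_rwlockattr_init(&attr))       /* create attribute object */
      return error;
   /* attribute modifications, if any, would go here */
   if (error = pthread_rwlock_init(lockp, &attr)) {  /* initialize the lock */
      pthread_rwlockattr_destroy(&attr);
      return error;
   }
   return pthread_rwlockattr_destroy(&attr);         /* object no longer needed */
}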
Exercise 13.25
What happens when you try to initialize a read-write lock that has already been initialized?
Answer:
POSIX states that the behavior under these circumstances is not defined.
The pthread_rwlock_destroy function destroys the read-write lock referenced by its parameter. The rwlock parameter is a pointer to a read-write lock. A pthread_rwlock_t variable that has been destroyed with pthread_rwlock_destroy can be reinitialized with pthread_rwlock_init.
SYNOPSIS
#include <pthread.h>
int pthread_rwlock_destroy(pthread_rwlock_t *rwlock);
POSIX:THR
If successful, pthread_rwlock_destroy returns 0. If unsuccessful, it returns a nonzero error code. No mandatory errors are defined for pthread_rwlock_destroy.
Exercise 13.26
What happens if you reference a read-write lock that has been destroyed?
Answer:
POSIX states that the behavior under these circumstances is not defined.
The pthread_rwlock_rdlock and pthread_rwlock_tryrdlock functions allow a thread to acquire a read-write lock for reading. The pthread_rwlock_wrlock and pthread_rwlock_trywrlock functions allow a thread to acquire a read-write lock for writing. The pthread_rwlock_rdlock and pthread_rwlock_wrlock functions block until the lock is available, whereas pthread_rwlock_tryrdlock and pthread_rwlock_trywrlock return immediately. The pthread_rwlock_unlock function causes the lock to be released. These functions require that a pointer to the lock be passed as a parameter.
SYNOPSIS
#include <pthread.h>
int pthread_rwlock_rdlock(pthread_rwlock_t *rwlock);
int pthread_rwlock_tryrdlock(pthread_rwlock_t *rwlock);
int pthread_rwlock_wrlock(pthread_rwlock_t *rwlock);
int pthread_rwlock_trywrlock(pthread_rwlock_t *rwlock);
int pthread_rwlock_unlock(pthread_rwlock_t *rwlock);
POSIX:THR
If successful, these functions return 0. If unsuccessful, these functions return a nonzero error code. The pthread_rwlock_tryrdlock and pthread_rwlock_trywrlock functions return EBUSY if the lock could not be acquired because it was already held.
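The following sketch (not from the text) shows how a thread might use pthread_rwlock_tryrdlock to avoid blocking when a writer holds the lock. The pollread, dosomethingelse and readitem names are hypothetical placeholders for application code.
#include <errno.h>
#include <pthread.h>
void dosomethingelse(void);   /* placeholder for other useful work */
void readitem(void);          /* placeholder for the actual read */
int pollread(pthread_rwlock_t *lockp) { /* read if the lock is available now */
   int error;
   error = pthread_rwlock_tryrdlock(lockp);
   if (error == EBUSY) {      /* lock unavailable, do not block */
      dosomethingelse();
      return 0;
   }
   if (error)                 /* some other failure */
      return error;
   readitem();                /* read while holding the read lock */
   return pthread_rwlock_unlock(lockp);
}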
Exercise 13.27
What happens if a thread calls pthread_rwlock_rdlock on a lock that it has already acquired with pthread_rwlock_wrlock?
Answer:
POSIX states that a deadlock may occur. (Implementations are free to detect a deadlock and return an error, but they are not required to do so.)
Exercise 13.28
What happens if a thread calls pthread_rwlock_rdlock on a lock that it has already acquired with pthread_rwlock_rdlock?
Answer:
A thread may hold multiple concurrent read locks on the same read-write lock. It should make sure to match the number of unlock calls with the number of lock calls to release the lock.
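The following sketch (not from the text) illustrates the point of Exercise 13.28: a thread that acquires the same read lock twice, as might happen when one locked routine calls another, must issue a matching unlock for each successful rdlock. The doublereader name is hypothetical.
#include <pthread.h>
int doublereader(pthread_rwlock_t *lockp) { /* nested read locks on the same lock */
   int error;
   if (error = pthread_rwlock_rdlock(lockp))      /* first read lock */
      return error;
   if (error = pthread_rwlock_rdlock(lockp)) {    /* second read lock, same lock */
      pthread_rwlock_unlock(lockp);
      return error;
   }
   /* ... read shared data here ... */
   pthread_rwlock_unlock(lockp);                  /* matches second rdlock */
   return pthread_rwlock_unlock(lockp);           /* matches first rdlock */
}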
Program 13.16 uses read-write locks to implement a thread-safe wrapper for the list object of Program 2.7. The listlib.c module should be included in this file, and its functions should be qualified with the static attribute. Program 13.16 includes an initialize_r function to initialize the read-write lock, since no static initialization is available. This function uses pthread_once to make sure that the read-write lock is initialized only one time.
Exercise 13.29
Compare Program 13.16 to the thread-safe implementation of Program 13.9 that uses mutex locks. What are the advantages/disadvantages of each?
Answer:
The mutex is a low-overhead synchronization mechanism. Since each of the functions in Program 13.9 holds the listlock only for a short period of time, Program 13.9 is relatively efficient. Because read-write locks have some overhead, their advantage comes when the actual read operations take a considerable amount of time (such as incurred by accessing a disk). In such a case, the strictly serial execution order would be inefficient.
Program 13.16 listlibrw_r.c
The list object of Program 2.7 synchronized with read-write locks.
#include <errno.h>
#include <pthread.h>
static pthread_rwlock_t listlock;
static int lockiniterror = 0;
static pthread_once_t lockisinitialized = PTHREAD_ONCE_INIT;
static void ilock(void) {
lockiniterror = pthread_rwlock_init(&listlock, NULL);
}
int initialize_r(void) { /* must be called at least once before using list */
if (pthread_once(&lockisinitialized, ilock))
lockiniterror = EINVAL;
return lockiniterror;
}
int accessdata_r(void) { /* get a nonnegative key if successful */
int error;
int errorkey = 0;
int key;
if (error = pthread_rwlock_wrlock(&listlock)) { /* no write lock, give up */
errno = error;
return -1;
}
key = accessdata();
if (key == -1) {
errorkey = errno;
pthread_rwlock_unlock(&listlock);
errno = errorkey;
return -1;
}
if (error = pthread_rwlock_unlock(&listlock)) {
errno = error;
return -1;
}
return key;
}
int adddata_r(data_t data) { /* allocate a node on list to hold data */
int error;
if (error = pthread_rwlock_wrlock(&listlock)) { /* no writer lock, give up */
errno = error;
return -1;
}
if (adddata(data) == -1) {
error = errno;
pthread_rwlock_unlock(&listlock);
errno = error;
return -1;
}
if (error = pthread_rwlock_unlock(&listlock)) {
errno = error;
return -1;
}
return 0;
}
int getdata_r(int key, data_t *datap) { /* retrieve node by key */
int error;
if (error = pthread_rwlock_rdlock(&listlock)) { /* no reader lock, give up */
errno = error;
return -1;
}
if (getdata(key, datap) == -1) {
error = errno;
pthread_rwlock_unlock(&listlock);
errno = error;
return -1;
}
if (error = pthread_rwlock_unlock(&listlock)) {
errno = error;
return -1;
}
return 0;
}
int freekey_r(int key) { /* free the key */
int error;
if (error = pthread_rwlock_wrlock(&listlock)) {
errno = error;
return -1;
}
if (freekey(key) == -1) {
error = errno;
pthread_rwlock_unlock(&listlock);
errno = error;
return -1;
}
if (error = pthread_rwlock_unlock(&listlock)) {
errno = error;
return -1;
}
return 0;
}
Exercise 13.30
The use of Program 13.16 requires a call to initialize_r at least once by some thread before any threads call other functions in this library. How could this be avoided?
Answer:
The initialize_r function can be given internal linkage (declared with the static attribute), and each of the other functions in the library can call it before accessing the lock. The pthread_once mechanism still guarantees that the lock is initialized only once.
13.5 Signal Handling and Threads
All threads in a process share the process signal handlers, but each thread has its own signal mask. The interaction of threads with signals involves several complications because threads can operate asynchronously with signals. Table 13.2 summarizes the three types of signals and their corresponding methods of delivery.
Table 13.2. Signal delivery in threads.
asynchronous    delivered to some thread that has it unblocked
synchronous     delivered to the thread that caused it
directed        delivered to the identified thread (pthread_kill)
Signals such as SIGFPE (floating-point exception) are synchronous to the thread that caused them (i.e., they are always generated at the same point in the thread's execution). Other signals are asynchronous because they are not generated at a predictable time nor are they associated with a particular thread. If several threads have an asynchronous signal unblocked, the thread runtime system selects one of them to handle the signal. Signals can also be directed to a particular thread with pthread_kill.
13.5.1 Directing a signal to a particular thread
The pthread_kill function requests that signal number sig be generated and delivered to the thread specified by thread.
SYNOPSIS
#include <signal.h>
#include <pthread.h>
int pthread_kill(pthread_t thread, int sig);
POSIX:THR
If successful, pthread_kill returns 0. If unsuccessful, pthread_kill returns a nonzero error code. In the latter case, no signal is sent. The following table lists the mandatory errors for pthread_kill.
EINVAL    sig is an invalid or unsupported signal number
ESRCH     no thread corresponds to the specified ID
Example 13.23
The following code segment causes a thread to kill itself and the entire process.
if (pthread_kill(pthread_self(), SIGKILL))
fprintf(stderr, "Failed to commit suicide\n");
Example 13.23 illustrates an important point regarding pthread_kill. Although pthread_kill delivers the signal to a particular thread, the action of handling it may affect the entire process. A common misconception is that pthread_kill always causes process termination, but this is not the case; pthread_kill just causes a signal to be generated for the specified thread. Example 13.23 terminates the process because SIGKILL cannot be caught, blocked or ignored. The same result occurs for any signal whose default action is to terminate the process, unless the process ignores, blocks or catches the signal. Table 8.1 lists the POSIX signals with their symbolic names and default actions.
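In contrast, the following sketch (not from the text) directs SIGUSR1 at a particular thread after installing a handler for it; the process does not terminate, and the handler runs in the target thread provided that thread has SIGUSR1 unblocked. The notifythread name is hypothetical, and the tid parameter is assumed to have been saved from an earlier pthread_create.
#include <errno.h>
#include <pthread.h>
#include <signal.h>
static volatile sig_atomic_t gotusr1 = 0;
/* ARGSUSED */
static void catchusr1(int signo) {    /* async-signal-safe: just set a flag */
   gotusr1 = 1;
}
int notifythread(pthread_t tid) {     /* catch SIGUSR1 and direct it at tid */
   struct sigaction act;
   act.sa_handler = catchusr1;
   act.sa_flags = 0;
   if ((sigemptyset(&act.sa_mask) == -1) || (sigaction(SIGUSR1, &act, NULL) == -1))
      return errno;
   return pthread_kill(tid, SIGUSR1);
}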
13.5.2 Masking signals for threads
While signal handlers are process-wide, each thread has its own signal mask. A thread can examine or set its signal mask with the pthread_sigmask function, which is a generalization of sigprocmask to threaded programs. The sigprocmask function should not be used when the process has multiple threads, but it can be called by the main thread before additional threads are created. Recall that the signal mask specifies which signals are to be blocked (not delivered). The how and set parameters specify the way the signal mask is to be modified, as discussed below. If the oset parameter is not NULL, the pthread_sigmask function sets *oset to the thread's previous signal mask.
SYNOPSIS
#include <pthread.h>
#include <signal.h>
int pthread_sigmask(int how, const sigset_t *restrict set,
sigset_t *restrict oset);
POSIX:THR
If successful, pthread_sigmask returns 0. If unsuccessful, pthread_sigmask returns a nonzero error code. The pthread_sigmask function returns EINVAL if how is not valid.
A how value of SIG_SETMASK causes the thread's signal mask to be replaced by set. That is, the thread now blocks all signals in set but does not block any others. A how value of SIG_BLOCK causes the additional signals in set to be blocked by the thread (added to the thread's current signal mask). A how value of SIG_UNBLOCK causes any of the signals in set that are currently being blocked to be removed from the thread's current signal mask (no longer be blocked).
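The following sketch (not from the text) shows a typical use of SIG_BLOCK: a worker thread adds SIGINT to its own signal mask so that SIGINT is delivered to some other thread that leaves it unblocked. The worker name is hypothetical.
#include <pthread.h>
#include <signal.h>
/* ARGSUSED */
static void *worker(void *arg) {      /* block SIGINT in this thread only */
   sigset_t blockset;
   if ((sigemptyset(&blockset) == -1) || (sigaddset(&blockset, SIGINT) == -1))
      return NULL;
   if (pthread_sigmask(SIG_BLOCK, &blockset, NULL))
      return NULL;
   /* ... do work; SIGINT will not be delivered to this thread ... */
   return NULL;
}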
13.5.3 Dedicating threads for signal handling
Signal handlers are process-wide and are installed with calls to sigaction as in single-threaded processes. The distinction between process-wide signal handlers and thread-specific signal masks is important in threaded programs.
Recall from Chapter 8 that when a signal is caught, the signal that caused the event is automatically blocked on entry to the signal handler. With a multithreaded application, nothing prevents another signal of the same type from being delivered to another thread that has the signal unblocked. It is possible to have multiple threads executing within the same signal handler.
A recommended strategy for dealing with signals in multithreaded processes is to dedicate particular threads to signal handling. The main thread blocks all signals before creating the threads. The signal mask is inherited from the creating thread, so all threads have the signal blocked. The thread dedicated to handling the signal then executes sigwait on that signal. (See Section 8.5.) Alternatively, the thread can use pthread_sigmask to unblock the signal. The advantage of using sigwait is that the thread is not restricted to async-signal-safe functions.
Program 13.14 is an implementation of a dedicated thread that uses sigwait to handle a particular signal. A program calls signalthreadinit to block the signo signal and to create a dedicated signalthread that waits for this signal. When the signal corresponding to signo becomes pending, sigwait returns and the signalthread calls setdone of Program 13.3 and returns. You can replace the setdone with any thread-safe function. Program 13.14 has some informative messages, which would normally be removed.
Notice that the implementation of signalthreadinit uses a thread attribute object to create signalthread with higher priority than the default value. The program was tested on a system that used preemptive priority scheduling. When the program executes on this system without first increasing signalthread's priority, it still works correctly, but sometimes the program takes several seconds to react to the signal after it is generated. If a round-robin scheduling policy were available, all the threads could have the same priority.
The dedicated signal-handling thread, signalthread, displays its priority to confirm that the priority is set correctly and then calls sigwait. No signal handler is needed since sigwait removes the signal from those pending. The signal is always blocked, so the default action for signalnum is never taken.
Program 13.15 modifies computethreadmain of Program 13.7 by using the SIGUSR1 signal to set the done flag for the computethread object of Program 13.6. The main program no longer sleeps a specified number of seconds before calling setdone. Instead, the delivery of a SIGUSR1 signal causes signalthread to call setdone.
Program 13.14 signalthread.c
A dedicated thread that sets a flag when a signal is received.
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include "doneflag.h"
#include "globalerror.h"
static int signalnum = 0;
/* ARGSUSED */
static void *signalthread(void *arg) { /* dedicated to handling signalnum */
int error;
sigset_t intmask;
struct sched_param param;
int policy;
int sig;
if (error = pthread_getschedparam(pthread_self(), &policy, &param)) {
seterror(error);
return NULL;
}
fprintf(stderr, "Signal thread entered with policy %d and priority %d\n",
policy, param.sched_priority);
if ((sigemptyset(&intmask) == -1) ||
(sigaddset(&intmask, signalnum) == -1) ||
(sigwait(&intmask, &sig) == -1))
seterror(errno);
else
seterror(setdone());
return NULL;
}
int signalthreadinit(int signo) {
int error;
pthread_attr_t highprio;
struct sched_param param;
int policy;
sigset_t set;
pthread_t sighandid;
signalnum = signo; /* block the signal */
if ((sigemptyset(&set) == -1) || (sigaddset(&set, signalnum) == -1) ||
(sigprocmask(SIG_BLOCK, &set, NULL) == -1))
return errno;
if ( (error = pthread_attr_init(&highprio)) || /* with higher priority */
(error = pthread_attr_getschedparam(&highprio, &param)) ||
(error = pthread_attr_getschedpolicy(&highprio, &policy)) )
return error;
if (param.sched_priority < sched_get_priority_max(policy)) {
param.sched_priority++;
if (error = pthread_attr_setschedparam(&highprio, &param))
return error;
} else
fprintf(stderr, "Warning, cannot increase priority of signal thread.\n");
if (error = pthread_create(&sighandid, &highprio, signalthread, NULL))
return error;
return 0;
}
Program 13.15 computethreadsig.c
A main program that uses signalthread with the SIGUSR1 signal to terminate the computethread computation of Program 13.6.
#include <math.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include "computethread.h"
#include "globalerror.h"
#include "sharedsum.h"
#include "signalthread.h"
int showresults(void);
int main(int argc, char *argv[]) {
int error;
int i;
int numthreads;
pthread_t *tids;
if (argc != 2) { /* pass number threads on command line */
fprintf(stderr, "Usage: %s numthreads\n", argv[0]);
return 1;
}
if (error = signalthreadinit(SIGUSR1)) { /* set up signal thread */
fprintf(stderr, "Failed to set up signal thread: %s\n", strerror(error));
return 1;
}
numthreads = atoi(argv[1]);
if ((tids = (pthread_t *)calloc(numthreads, sizeof(pthread_t))) == NULL) {
perror("Failed to allocate space for thread IDs");
return 1;
}
for (i = 0; i < numthreads; i++) /* create numthreads computethreads */
if (error = pthread_create(tids + i, NULL, computethread, NULL)) {
fprintf(stderr, "Failed to start thread %d: %s\n", i,
strerror(error));
return 1;
}
fprintf(stderr, "Send SIGUSR1(%d) signal to proc %ld to stop calculation\n",
SIGUSR1, (long)getpid());
for (i = 0; i < numthreads; i++) /* wait for computethreads to be done */
if (error = pthread_join(tids[i], NULL)) {
fprintf(stderr, "Failed to join thread %d: %s\n", i, strerror(error));
return 1;
}
if (showresults())
return 1;
return 0;
}
The modular design of the signalthread object makes the object easy to modify. Chapter 16 uses signalthread for some implementations of a bounded buffer.
Exercise 13.24
Run computethreadsig of Program 13.15 from one command window. Send the SIGUSR1 signal from another command window, using the kill shell command. What is its effect?
Answer:
The dedicated signal thread calls setdone when the signal is pending, and the threads terminate normally.
13.7 A strerror_r Implementation
Unfortunately, POSIX lists strerror as one of the few functions that are not thread-safe. This is often not a problem, since typically the main thread is the only thread that prints error messages. If you need to call strerror concurrently in a program, you must protect it with mutex locks. Neither perror nor strerror is async-signal safe. One way to solve both the thread-safety and async-signal-safety problems is to encapsulate the synchronization in a wrapper, as shown in Program 13.17.
The perror_r and strerror_r functions of Program 13.17 are both thread-safe and async-signal safe. They use a mutex to prevent concurrent access to the static buffer used by strerror. The perror function is protected by the same mutex to prevent concurrent execution of strerror and perror. All signals are blocked before the mutex is locked. If this were not done and a signal were caught while the mutex was locked, a call to one of these functions from inside the signal handler would deadlock.
13.8 Deadlocks and Other Pesky Problems
Programs that use synchronization constructs have the potential for deadlocks that may not be detected by implementations of the POSIX base standard. For example, suppose that a thread executes pthread_mutex_lock on a mutex that it already holds (from a previously successful pthread_mutex_lock). The POSIX base standard states that pthread_mutex_lock may fail and return EDEADLK under such circumstances, but the standard does not require the function to do so. POSIX takes the position that implementations of the base standard are not required to sacrifice efficiency to protect programmers from their own bad programming. Several extensions to POSIX allow more extensive error checking and deadlock detection.
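One such extension provides error-checking mutexes. The sketch below (not from the text) creates a mutex of type PTHREAD_MUTEX_ERRORCHECK, so that a thread relocking a mutex it already holds receives EDEADLK instead of deadlocking. The pthread_mutexattr_settype function and the PTHREAD_MUTEX_ERRORCHECK type are part of an extension and may not be available on every implementation; the make_errorcheck_mutex name is hypothetical.
#include <pthread.h>
int make_errorcheck_mutex(pthread_mutex_t *mutexp) { /* detect relocking errors */
   int error;
   pthread_mutexattr_t attr;
   if (error = pthread_mutexattr_init(&attr))
      return error;
   if (error = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK)) {
      pthread_mutexattr_destroy(&attr);
      return error;
   }
   error = pthread_mutex_init(mutexp, &attr);
   pthread_mutexattr_destroy(&attr);
   return error;   /* relocking *mutexp now fails with EDEADLK rather than deadlocking */
}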
Another type of problem arises when a thread that holds a lock encounters an error. You must take care to release the lock before returning from the thread, or other threads might be blocked.
Threads with priorities can also complicate matters. A famous example occurred in the Mars Pathfinder mission. The Pathfinder executed a "flawless" Martian landing on July 4, 1997, and began gathering and transmitting large quantities of scientific data to Earth [34]. A few days after landing, the spacecraft started experiencing total system resets, each of which delayed data collection by a day. Several accounts of the underlying causes and the resolution of the problem have appeared, starting with a keynote address at the IEEE Real-Time Systems Symposium on Dec. 3, 1997, by David Wilner, Chief Technical Officer of Wind River [61].
Program 13.17 strerror_r.c
Async-signal-safe, thread-safe versions of strerror and perror.
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
int strerror_r(int errnum, char *strerrbuf, size_t buflen) {
char *buf;
int error1;
int error2;
int error3;
sigset_t maskblock;
sigset_t maskold;
if ((sigfillset(&maskblock) == -1) ||
(sigprocmask(SIG_SETMASK, &maskblock, &maskold) == -1))
return errno;
if (error1 = pthread_mutex_lock(&lock)) {
(void)sigprocmask(SIG_SETMASK, &maskold, NULL);
return error1;
}
buf = strerror(errnum);
if (strlen(buf) >= buflen)
error1 = ERANGE;
else
(void)strcpy(strerrbuf, buf);
error2 = pthread_mutex_unlock(&lock);
error3 = sigprocmask(SIG_SETMASK, &maskold, NULL);
return error1 ? error1 : (error2 ? error2 : error3);
}
int perror_r(const char *s) {
int error1;
int error2;
sigset_t maskblock;
sigset_t maskold;
if ((sigfillset(&maskblock) == -1) ||
(sigprocmask(SIG_SETMASK, &maskblock, &maskold) == -1))
return errno;
if (error1 = pthread_mutex_lock(&lock)) {
(void)sigprocmask(SIG_SETMASK, &maskold, NULL);
return error1;
}
perror(s);
error1 = pthread_mutex_unlock(&lock);
error2 = sigprocmask(SIG_SETMASK, &maskold, NULL);
return error1 ? error1 : error2;
}
The Mars Pathfinder flaw was found to be a priority inversion on a mutex [105]. A thread whose job was gathering meteorological data ran periodically at low priority. This thread would acquire the mutex for the data bus to publish its data. A periodic high-priority information thread also acquired the mutex, and occasionally it would block, waiting for the low-priority thread to release the mutex. Each of these threads needed the mutex only for a short time, so on the surface there could be no problem. Unfortunately, a long-running, medium-priority communication thread occasionally preempted the low-priority thread while the low-priority thread held the mutex, causing the high-priority thread to be delayed for a long time.
A second aspect of the problem was the system reaction to the error. The system expected the periodic high-priority thread to regularly use the data bus. A watchdog timer thread would notice if the data bus was not being used, assume that a serious problem had occurred, and initiate a system reboot. The high-priority thread should have been blocked only for a short time when the low-priority thread held the mutex. In this case, the high-priority thread was blocked for a long time because the low-priority thread held the mutex and the long-running, medium-priority thread had preempted it.
A third aspect was the test and debugging of the code. The Mars Pathfinder system had debugging code that could be turned on to run real-time diagnostics. The software team used an identical setup in the lab to run in debug mode (since they didn't want to debug on Mars). After 18 hours, the laboratory version reproduced the problem, and the engineers were able to devise a patch. Glenn Reeves [93], leader of the Mars Pathfinder software team, was quoted as saying "We strongly believe in the 'test what you fly and fly what you test' philosophy." The same ideas apply here on Earth too. At a minimum, you should always think about instrumenting code with test and debugging functions that can be turned on or off by conditional compilation. When possible, allow debugging functions to be turned on dynamically at runtime.
A final aspect of this story is timing. In some ways, the Mars Pathfinder was a victim of its own success. The software team did extensive testing within the parameters of the mission. They actually saw the system reset problem once or twice during testing but did not track it down. The reset problem was exacerbated by high data rates that caused the medium-priority communication thread to run longer than expected. Prelaunch testing was limited to "best case" high data rates. In the words of Glenn Reeves, "We did not expect nor test the 'better than we could have ever imagined' case." Threaded programs should never rely on quirks of timing to work; they must work under all possible timings.
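The Pathfinder inversion was reportedly cured by enabling priority inheritance on the offending mutex. For POSIX threads, the following sketch (not from the text) creates such a mutex with pthread_mutexattr_setprotocol and PTHREAD_PRIO_INHERIT; this protocol is an option (priority inheritance) that not all implementations support, and the make_inherit_mutex name is hypothetical.
#include <pthread.h>
int make_inherit_mutex(pthread_mutex_t *mutexp) { /* use priority inheritance */
   int error;
   pthread_mutexattr_t attr;
   if (error = pthread_mutexattr_init(&attr))
      return error;
   if (error = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT)) {
      pthread_mutexattr_destroy(&attr);
      return error;
   }
   error = pthread_mutex_init(mutexp, &attr);
   pthread_mutexattr_destroy(&attr);
   return error;   /* a low-priority holder now inherits the priority of a blocked waiter */
}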
13.9 Exercise: Multiple Barriers
Reimplement the barrier of Program 13.13 so that it supports multiple barriers. One possible approach is to use an array or a linked list of barriers. Explore different designs with respect to synchronization. Is it better to use a single bmutex lock and bcond condition variable to synchronize all the barriers, or should each barrier get its own synchronization? Why?
13.10 Additional Reading
Most operating systems books spend some time on synchronization and use of standard synchronization mechanisms such as mutex locks, condition variables and read-write locks. The review article "Concepts and notations for concurrent programming," by Andrews and Schneider [3] gives an excellent overview of much of the classical work on synchronization. "Interrupts as threads" by Kleiman and Eykholt [63] discusses some interesting aspects of the interaction of threads and interrupts in the kernel. An extensive review of monitors can be found in "Monitor classification," by Buhr et al. [17]. The signal and wait operations of monitors are higher-level implementations of the mutex-conditional variable combination. The Solaris Multithreaded Programming Guide [109], while dealing primarily with Solaris threads, contains some interesting examples of synchronization. Finally, the article "Schedule-conscious synchronization" by Kontothanassis et al. [65] discusses implementation of mutex locks, read-write locks and barriers in a multiprocessor environment.