Programming in UNIX

The code is here

Introduction

Every operating system has its own interface for working with programs. The standard services it offers typically include executing a new program, opening a file, reading a file, allocating memory, getting the current time, and so on. This book describes the corresponding services provided by UNIX.

Describing the UNIX System in a strictly linear fashion, without any forward references to terms that haven't been described yet, is nearly impossible (and would probably be boring). This chapter provides a whirlwind tour of the UNIX System from a programmer's perspective. We'll give some brief descriptions and examples of terms and concepts that appear throughout the text. We describe these features in much more detail in later chapters. This chapter also provides an introduction and overview of the services provided by the UNIX System, for programmers new to this environment.

UNIX Architecture

In a strict sense, an operating system can be defined as the software that controls the hardware resources of the computer and provides an environment under which programs can run. Generally, we call this software the kernel, since it is relatively small and resides at the core of the environment. Figure 1.1 shows a diagram of the UNIX System architecture.

Figure 1.1. Architecture of the UNIX System

Logging In

Login Name

When we log in to a UNIX system, we enter our login name, followed by our password. The system then looks up our login name in its password file, usually the file /etc/passwd. If we look at our entry in the password file we see that it's composed of seven colon-separated fields: the login name, encrypted password, numeric user ID (205), numeric group ID (105), a comment field, home directory (/home/sar), and shell program (/bin/ksh).

    sar:x:205:105:Stephen Rago:/home/sar:/bin/ksh

All contemporary systems have moved the encrypted password to a different file. In Chapter 6, we'll look at these files and some functions to access them.
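
As a minimal sketch of the seven-field layout described above, the following function splits one password-file line at its colons. The name split_passwd_line and the constant PASSWD_FIELDS are hypothetical, introduced only for this illustration; real programs would normally use the library functions covered in Chapter 6, such as getpwnam, rather than parse the file themselves.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PASSWD_FIELDS 7     /* fields in one password-file entry */

/*
 * Split one password-file line into its colon-separated fields.
 * The line is modified in place: each ':' becomes a null byte.
 * Returns the number of fields found (7 for a well-formed entry).
 * split_passwd_line is a hypothetical helper, not a library function.
 */
int
split_passwd_line(char *line, char *fields[PASSWD_FIELDS])
{
    int   n = 0;
    char *p = line;

    assert(line != NULL && fields != NULL);
    line[strcspn(line, "\n")] = '\0';   /* drop a trailing newline */

    while (n < PASSWD_FIELDS) {
        fields[n++] = p;                /* field starts here */
        p = strchr(p, ':');
        if (p == NULL)
            break;                      /* that was the last field */
        *p++ = '\0';                    /* terminate it, move past ':' */
    }
    return n;
}
```

The sketch uses strchr rather than strtok because strtok would silently skip an empty field, such as an empty comment field.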

Shells

Once we log in, some system information messages are typically displayed, and then we can type commands to the shell program. (Some systems start a window management program when you log in, but you generally end up with a shell running in one of the windows.) A shell is a command-line interpreter that reads user input and executes commands. The user input to a shell is normally from the terminal (an interactive shell) or sometimes from a file (called a shell script). The common shells in use are summarized in Figure 1.2.

Figure 1.2. Common shells used on UNIX systems

    Name                Path       FreeBSD 5.2.1  Linux 2.4.22  Mac OS X 10.3  Solaris 9
    Bourne shell        /bin/sh    •              link to bash  link to bash   •
    Bourne-again shell  /bin/bash  optional       •             •              •
    C shell             /bin/csh   link to tcsh   link to tcsh  link to tcsh   •
    Korn shell          /bin/ksh                                               •
    TENEX C shell       /bin/tcsh  •              •             •              •

The system knows which shell to execute for us from the final field in our entry in the password file.

The Bourne shell, developed by Steve Bourne at Bell Labs, has been in use since Version 7 and is provided with almost every UNIX system in existence. The control-flow constructs of the Bourne shell are reminiscent of Algol 68.

The C shell, developed by Bill Joy at Berkeley, is provided with all the BSD releases. Additionally, the C shell was provided by AT&T with System V/386 Release 3.2 and is also in System V Release 4 (SVR4). (We'll have more to say about these different versions of the UNIX System in the next chapter.) The C shell was built on the 6th Edition shell, not the Bourne shell. Its control flow looks more like the C language, and it supports additional features that weren't provided by the Bourne shell: job control, a history mechanism, and command line editing.

The Korn shell is considered a successor to the Bourne shell and was first provided with SVR4. The Korn shell, developed by David Korn at Bell Labs, runs on most UNIX systems, but before SVR4 was usually an extra-cost add-on, so it is not as widespread as the other two shells. It is upward compatible with the Bourne shell and includes those features that made the C shell popular: job control, command line editing, and so on.

The Bourne-again shell is the GNU shell provided with all Linux systems. It was designed to be POSIX-conformant, while still remaining compatible with the Bourne shell. It supports features from both the C shell and the Korn shell.

The TENEX C shell is an enhanced version of the C shell. It borrows several features, such as command completion, from the TENEX operating system (developed in 1972 at Bolt Beranek and Newman). The TENEX C shell adds many features to the C shell and is often used as a replacement for the C shell.

Linux uses the Bourne-again shell for its default shell. In fact, /bin/sh is a link to /bin/bash. The default user shell in FreeBSD and Mac OS X is the TENEX C shell, but they use the Bourne shell for their administrative shell scripts because the C shell's programming language is notoriously difficult to use. Solaris, having its heritage in both BSD and System V, provides all the shells shown in Figure 1.2. Free ports of most of the shells are available on the Internet.

Throughout the text, we will use parenthetical notes such as this to describe historical notes and to compare different implementations of the UNIX System. Often the reason for a particular implementation technique becomes clear when the historical reasons are described.

Throughout this text, we'll show interactive shell examples to execute a program that we've developed. These examples use features common to the Bourne shell, the Korn shell, and the Bourne-again shell.

Files and Directories

File System

The UNIX file system is a hierarchical arrangement of directories and files. Everything starts in the directory called root whose name is the single character /.

A directory is a file that contains directory entries. Logically, we can think of each directory entry as containing a filename along with a structure of information describing the attributes of the file. The attributes of a file are such things as the type of file (regular file, directory), the size of the file, the owner of the file, permissions for the file (whether other users may access this file), and when the file was last modified. The stat and fstat functions return a structure of information containing all the attributes of a file. In Chapter 4, we'll examine all the attributes of a file in great detail.

We make a distinction between the logical view of a directory entry and the way it is actually stored on disk. Most implementations of UNIX file systems don't store attributes in the directory entries themselves, because of the difficulty of keeping them in synch when a file has multiple hard links. This will become clear when we discuss hard links in Chapter 4.

Filename

The names in a directory are called filenames. The only two characters that cannot appear in a filename are the slash character (/) and the null character. The slash separates the filenames that form a pathname (described next) and the null character terminates a pathname. Nevertheless, it's good practice to restrict the characters in a filename to a subset of the normal printing characters. (We restrict the characters because if we use some of the shell's special characters in the filename, we have to use the shell's quoting mechanism to reference the filename, and this can get complicated.)

Two filenames are automatically created whenever a new directory is created: . (called dot) and .. (called dot-dot). Dot refers to the current directory, and dot-dot refers to the parent directory. In the root directory, dot-dot is the same as dot.

The Research UNIX System and some older UNIX System V file systems restricted a filename to 14 characters. BSD versions extended this limit to 255 characters. Today, almost all commercial UNIX file systems support at least 255-character filenames.

Pathname

A sequence of one or more filenames, separated by slashes and optionally starting with a slash, forms a pathname. A pathname that begins with a slash is called an absolute pathname; otherwise, it's called a relative pathname. Relative pathnames refer to files relative to the current directory. The name for the root of the file system (/) is a special-case absolute pathname that has no filename component.

Example

Listing the names of all the files in a directory is not difficult. Figure 1.3 shows a bare-bones implementation of the ls(1) command.

The notation ls(1) is the normal way to reference a particular entry in the UNIX system manuals. It refers to the entry for ls in Section 1. The sections are normally numbered 1 through 8, and all the entries within each section are arranged alphabetically. Throughout this text, we assume that you have a copy of the manuals for your UNIX system.

Historically, UNIX systems lumped all eight sections together into what was called the UNIX Programmer's Manual. As the page count increased, the trend changed to distributing the sections among separate manuals: one for users, one for programmers, and one for system administrators, for example.

Some UNIX systems further divide the manual pages within a given section, using an uppercase letter. For example, all the standard input/output (I/O) functions in AT&T [1990e] are indicated as being in Section 3S, as in fopen(3S). Other systems have replaced the numeric sections with alphabetic ones, such as C for commands.

Today, most manuals are distributed in electronic form. If your manuals are online, the way to see the manual pages for the ls command would be something like

    man 1 ls

or

    man -s1 ls

Figure 1.3 is a program that just prints the name of every file in a directory, and nothing else. If the source file is named myls.c, we compile it into the default a.out executable file by

    cc myls.c

Historically, cc(1) is the C compiler. On systems with the GNU C compilation system, the C compiler is gcc(1). Here, cc is often linked to gcc.

Some sample output is

    $ ./a.out /dev
    .
    ..
    console
    tty
    mem
    kmem
    null
    mouse
    stdin
    stdout
    stderr
    zero
                        many more lines that aren't shown
    cdrom
    $ ./a.out /var/spool/cron
    can't open /var/spool/cron: Permission denied
    $ ./a.out /dev/tty
    can't open /dev/tty: Not a directory

Throughout this text, we'll show commands that we run and the resulting output in this fashion: Characters that we type are shown in this font, whereas output from programs is shown like this. If we need to add comments to this output, we'll show the comments in italics. The dollar sign that precedes our input is the prompt that is printed by the shell. We'll always show the shell prompt as a dollar sign.

Note that the directory listing is not in alphabetical order. The ls command sorts the names before printing them.

There are many details to consider in this 20-line program.

First, we include a header of our own: apue.h. We include this header in almost every program in this text. This header includes some standard system headers and defines numerous constants and function prototypes that we use throughout the examples in the text. A listing of this header is in Appendix B.
The declaration of the main function uses the style supported by the ISO C standard. (We'll have more to say about the ISO C standard in the next chapter.)
We take an argument from the command line, argv[1], as the name of the directory to list. In Chapter 7, we'll look at how the main function is called and how the command-line arguments and environment variables are accessible to the program.
Because the actual format of directory entries varies from one UNIX system to another, we use the functions opendir, readdir, and closedir to manipulate the directory.
The opendir function returns a pointer to a DIR structure, and we pass this pointer to the readdir function. We don't care what's in the DIR structure. We then call readdir in a loop, to read each directory entry. The readdir function returns a pointer to a dirent structure or, when it's finished with the directory, a null pointer. All we examine in the dirent structure is the name of each directory entry (d_name). Using this name, we could then call the stat function (Section 4.2) to determine all the attributes of the file.
We call two functions of our own to handle the errors: err_sys and err_quit. We can see from the preceding output that the err_sys function prints an informative message describing what type of error was encountered ("Permission denied" or "Not a directory"). These two error functions are shown and described in Appendix B. We also talk more about error handling in Section 1.7.
When the program is done, it calls the function exit with an argument of 0. The function exit terminates a program. By convention, an argument of 0 means OK, and an argument between 1 and 255 means that an error occurred. In Section 8.5, we show how any program, such as a shell or a program that we write, can obtain the exit status of a program that it executes.

Figure 1.3. List all the files in a directory

 #include "apue.h"
 #include <dirent.h>
 
 int
 main(int argc, char *argv[])
 {
     DIR             *dp;
     struct dirent   *dirp;
 
     if (argc != 2)
         err_quit("usage: ls directory_name");
 
     if ((dp = opendir(argv[1])) == NULL)
         err_sys("can't open %s", argv[1]);
     while ((dirp = readdir(dp)) != NULL)
         printf("%s\n", dirp->d_name);
 
     closedir(dp);
     exit(0);
 }

Working Directory

Every process has a working directory, sometimes called the current working directory. This is the directory from which all relative pathnames are interpreted. A process can change its working directory with the chdir function.

For example, the relative pathname doc/memo/joe refers to the file or directory joe, in the directory memo, in the directory doc, which must be a directory within the working directory. From looking just at this pathname, we know that both doc and memo have to be directories, but we can't tell whether joe is a file or a directory. The pathname /usr/lib/lint is an absolute pathname that refers to the file or directory lint in the directory lib, in the directory usr, which is in the root directory.
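
A short sketch of changing the working directory follows. It calls chdir and then getcwd, which reports the absolute pathname of the new working directory. The wrapper name enter_directory is an assumption made for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Change the working directory to dir and store the resulting
 * absolute pathname in buf. Returns 0 on success, -1 on error
 * (errno is set by chdir or getcwd).
 * enter_directory is a hypothetical helper name.
 */
int
enter_directory(const char *dir, char *buf, size_t size)
{
    assert(dir != NULL && buf != NULL);
    if (chdir(dir) < 0)
        return -1;
    if (getcwd(buf, size) == NULL)
        return -1;
    return 0;
}
```

After a successful call, a relative pathname such as doc/memo/joe is interpreted starting from the directory now stored in buf.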

Home Directory

When we log in, the working directory is set to our home directory. Our home directory is obtained from our entry in the password file (Section 1.3).

1.5. Input and Output

File Descriptors

File descriptors are normally small non-negative integers that the kernel uses to identify the files being accessed by a particular process. Whenever it opens an existing file or creates a new file, the kernel returns a file descriptor that we use when we want to read or write the file.

Standard Input, Standard Output, and Standard Error

By convention, all shells open three descriptors whenever a new program is run: standard input, standard output, and standard error. If nothing special is done, as in the simple command

    ls

then all three are connected to the terminal. Most shells provide a way to redirect any or all of these three descriptors to any file. For example,

    ls > file.list

executes the ls command with its standard output redirected to the file named file.list.
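
The following sketch shows one way a program could arrange such a redirection itself, using open and dup2. It is not taken from any particular shell; the helper name redirect_fd and the file mode 0644 are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Make descriptor fd refer to the file named path, creating or
 * truncating it, much as a shell might do for "> path" before
 * running a command. Returns 0 on success, -1 on error.
 * redirect_fd is a hypothetical helper name.
 */
int
redirect_fd(int fd, const char *path)
{
    int newfd;

    assert(fd >= 0 && path != NULL);
    newfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (newfd < 0)
        return -1;
    if (newfd != fd) {
        if (dup2(newfd, fd) < 0) {  /* fd now refers to the file */
            close(newfd);
            return -1;
        }
        close(newfd);               /* the extra descriptor is not needed */
    }
    return 0;
}
```

A child process could call redirect_fd(STDOUT_FILENO, "file.list") before executing ls to get the effect of the command line above.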

Unbuffered I/O

Unbuffered I/O is provided by the functions open, read, write, lseek, and close. These functions all work with file descriptors.

Example

If we're willing to read from the standard input and write to the standard output, then the program in Figure 1.4 copies any regular file on a UNIX system.

The <unistd.h> header, included by apue.h, and the two constants STDIN_FILENO and STDOUT_FILENO are part of the POSIX standard (about which we'll have a lot more to say in the next chapter). In this header are function prototypes for many of the UNIX system services, such as the read and write functions that we call.

The constants STDIN_FILENO and STDOUT_FILENO are defined in <unistd.h> and specify the file descriptors for standard input and standard output. These values are typically 0 and 1, respectively, but we'll use the new names for portability.

In Section 3.9, we'll examine the BUFFSIZE constant in detail, seeing how various values affect the efficiency of the program. Regardless of the value of this constant, however, this program still copies any regular file.

The read function returns the number of bytes that are read, and this value is used as the number of bytes to write. When the end of the input file is encountered, read returns 0 and the program stops. If a read error occurs, read returns -1. Most of the system functions return -1 when an error occurs.

If we compile the program into the standard name (a.out) and execute it as

    ./a.out > data

standard input is the terminal, standard output is redirected to the file data, and standard error is also the terminal. If this output file doesn't exist, the shell creates it by default. The program copies lines that we type to the standard output until we type the end-of-file character (usually Control-D).

If we run

    ./a.out < infile > outfile

then the file named infile will be copied to the file named outfile.

Figure 1.4. Copy standard input to standard output

 #include "apue.h"
 
 #define BUFFSIZE    4096
 
 int
 main(void)
 {
     int     n;
     char    buf[BUFFSIZE];
 
     while ((n = read(STDIN_FILENO, buf, BUFFSIZE)) > 0)
         if (write(STDOUT_FILENO, buf, n) != n)
             err_sys("write error");
 
     if (n < 0)
         err_sys("read error");
 
     exit(0);
 }

In Chapter 3, we describe the unbuffered I/O functions in more detail.

Standard I/O

The standard I/O functions provide a buffered interface to the unbuffered I/O functions. Using standard I/O prevents us from having to worry about choosing optimal buffer sizes, such as the BUFFSIZE constant in Figure 1.4. Another advantage of using the standard I/O functions is that they simplify dealing with lines of input (a common occurrence in UNIX applications). The fgets function, for example, reads an entire line. The read function, on the other hand, reads a specified number of bytes. As we shall see in Section 5.4, the standard I/O library provides functions that let us control the style of buffering used by the library.

The most common standard I/O function is printf. In programs that call printf, we'll always include <stdio.h> (normally by including apue.h), as this header contains the function prototypes for all the standard I/O functions.
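
As a small illustration of line-oriented input, the sketch below counts lines with fgets. The function name count_lines and the buffer size BUFLEN are assumptions made for this example. Because fgets returns a long line in several pieces, only a piece that ends in a newline is counted as completing a line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BUFLEN 4096     /* illustrative buffer size */

/*
 * Count the lines available on fp, reading with fgets.
 * A final line without a trailing newline is still counted.
 * count_lines is a hypothetical helper name.
 */
long
count_lines(FILE *fp)
{
    char buf[BUFLEN];
    long lines = 0;
    int  partial = 0;   /* nonzero while inside an unterminated line */

    assert(fp != NULL);
    while (fgets(buf, sizeof(buf), fp) != NULL) {
        size_t len = strlen(buf);

        if (len > 0 && buf[len - 1] == '\n') {
            lines++;        /* this piece completes a line */
            partial = 0;
        } else {
            partial = 1;    /* more of this line may follow */
        }
    }
    if (partial)
        lines++;            /* last line had no trailing newline */
    return lines;
}
```

Doing the same with read would require the program to scan each block of bytes for newline characters itself.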

Example

The program in Figure 1.5, which we'll examine in more detail in Section 5.8, is like the previous program that called read and write. This program copies standard input to standard output and can copy any regular file.

The function getc reads one character at a time, and this character is written by putc. After the last byte of input has been read, getc returns the constant EOF (defined in <stdio.h>). The standard I/O constants stdin and stdout are also defined in the <stdio.h> header and refer to the standard input and standard output.

Figure 1.5. Copy standard input to standard output, using standard I/O

 #include "apue.h"
 
 int
 main(void)
 {
     int     c;
 
     while ((c = getc(stdin)) != EOF)
         if (putc(c, stdout) == EOF)
             err_sys("output error");
 
     if (ferror(stdin))
         err_sys("input error");
 
     exit(0);
 }

1.6. Programs and Processes

Program

A program is an executable file residing on disk in a directory. A program is read into memory and is executed by the kernel as a result of one of the six exec functions. We'll cover these functions in Section 8.10.

Processes and Process ID

An executing instance of a program is called a process, a term used on almost every page of this text. Some operating systems use the term task to refer to a program that is being executed.

The UNIX System guarantees that every process has a unique numeric identifier called the process ID. The process ID is always a non-negative integer.

Example

The program in Figure 1.6 prints its process ID.

If we compile this program into the file a.out and execute it, we have

    $ ./a.out
    hello world from process ID 851
    $ ./a.out
    hello world from process ID 854

When this program runs, it calls the function getpid to obtain its process ID.

Figure 1.6. Print the process ID

 #include "apue.h"
 
 int
 main(void)
 {
     printf("hello world from process ID %ld\n", (long)getpid());
     exit(0);
 }

Process Control

There are three primary functions for process control: fork, exec, and waitpid. (The exec function has six variants, but we often refer to them collectively as simply the exec function.)

Example

The process control features of the UNIX System are demonstrated using a simple program (Figure 1.7) that reads commands from standard input and executes the commands. This is a bare-bones implementation of a shell-like program. There are several features to consider in this 30-line program.

We use the standard I/O function fgets to read one line at a time from the standard input. When we type the end-of-file character (which is often Control-D) as the first character of a line, fgets returns a null pointer, the loop stops, and the process terminates. In Chapter 18, we describe all the special terminal characters (end of file, backspace one character, erase entire line, and so on) and how to change them.
Because each line returned by fgets is terminated with a newline character, followed by a null byte, we use the standard C function strlen to calculate the length of the string, and then replace the newline with a null byte. We do this because the execlp function wants a null-terminated argument, not a newline-terminated argument.
We call fork to create a new process, which is a copy of the caller. We say that the caller is the parent and that the newly created process is the child. Then fork returns the non-negative process ID of the new child process to the parent, and returns 0 to the child. Because fork creates a new process, we say that it is called once (by the parent) but returns twice (in the parent and in the child).
In the child, we call execlp to execute the command that was read from the standard input. This replaces the child process with the new program file. The combination of a fork, followed by an exec, is what some operating systems call spawning a new process. In the UNIX System, the two parts are separated into individual functions. We'll have a lot more to say about these functions in Chapter 8.
Because the child calls execlp to execute the new program file, the parent wants to wait for the child to terminate. This is done by calling waitpid, specifying which process we want to wait for: the pid argument, which is the process ID of the child. The waitpid function also returns the termination status of the child (the status variable), but in this simple program, we don't do anything with this value. We could examine it to determine exactly how the child terminated.
The most fundamental limitation of this program is that we can't pass arguments to the command that we execute. We can't, for example, specify the name of a directory to list. We can execute ls only on the working directory. To allow arguments would require that we parse the input line, separating the arguments by some convention, probably spaces or tabs, and then pass each argument as a separate argument to the execlp function. Nevertheless, this program is still a useful demonstration of the process control functions of the UNIX System.
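
The parsing step mentioned in the last item could be sketched as follows. This is not part of the program in Figure 1.7. The helper name split_args is hypothetical, words are separated only by spaces, tabs, and newlines, and no quoting is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split line into whitespace-separated words, storing pointers to
 * them in argv and terminating the array with a null pointer, the
 * layout that execvp expects. The line is modified in place.
 * At most max - 1 words are stored. Returns the number of words.
 * split_args is a hypothetical helper; quoting is not handled.
 */
int
split_args(char *line, char *argv[], int max)
{
    int   argc = 0;
    char *word;

    assert(line != NULL && argv != NULL && max > 0);
    for (word = strtok(line, " \t\n");
         word != NULL && argc < max - 1;
         word = strtok(NULL, " \t\n"))
        argv[argc++] = word;
    argv[argc] = NULL;      /* exec functions need this terminator */
    return argc;
}
```

With the arguments split this way, the child could call execvp(argv[0], argv) instead of execlp, since execvp takes the argument list as an array.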

If we run this program, we get the following results. Note that our program has a different prompt (the percent sign) to distinguish it from the shell's prompt.

    $ ./a.out
    % date
    Sun Aug 1 03:04:47 EDT 2004            programmers work late
    % who
    sar     :0       Jul 26 22:54
    sar     pts/0    Jul 26 22:54 (:0)
    sar     pts/1    Jul 26 22:54 (:0)
    sar     pts/2    Jul 26 22:54 (:0)
    % pwd
    /home/sar/bk/apue/2e
    % ls
    Makefile
    a.out
    shell1.c
    % ^D                                   type the end-of-file character
    $                                      the regular shell prompt

Figure 1.7. Read commands from standard input and execute them

 #include "apue.h"
 #include <sys/wait.h>
 
 int
 main(void)
 {
     char    buf[MAXLINE];   /* from apue.h */
     pid_t   pid;
     int     status;
 
     printf("%% ");  /* print prompt (printf requires %% to print %) */
     while (fgets(buf, MAXLINE, stdin) != NULL) {
         if (buf[strlen(buf) - 1] == '\n')
             buf[strlen(buf) - 1] = 0; /* replace newline with null */
 
         if ((pid = fork()) < 0) {
             err_sys("fork error");
         } else if (pid == 0) {      /* child */
             execlp(buf, buf, (char *)0);
             err_ret("couldn't execute: %s", buf);
             exit(127);
         }
 
         /* parent */
         if ((pid = waitpid(pid, &status, 0)) < 0)
             err_sys("waitpid error");
         printf("%% ");
     }
     exit(0);
 }
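
The program above ignores the status value that waitpid stores. As a sketch of how it might be examined, the function below runs one command and decodes the status with the standard macros from <sys/wait.h>. The name run_command is hypothetical, and returning 128 plus the signal number is a shell-like convention chosen for this example only.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run cmd (no arguments) in a child process and wait for it.
 * Returns the child's exit status if it exited normally,
 * 128 plus the signal number if a signal terminated it,
 * or -1 if fork or waitpid failed.
 * run_command is a hypothetical helper name.
 */
int
run_command(const char *cmd)
{
    pid_t pid;
    int   status;

    assert(cmd != NULL);
    if ((pid = fork()) < 0)
        return -1;
    if (pid == 0) {                     /* child */
        execlp(cmd, cmd, (char *)0);
        _exit(127);                     /* reached only if exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)   /* parent */
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);     /* normal termination */
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);  /* killed by a signal */
    return -1;
}
```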

The notation ^D is used to indicate a control character. Control characters are special characters formed by holding down the control key (often labeled Control or Ctrl) on your keyboard and then pressing another key at the same time. Control-D, or ^D, is the default end-of-file character. We'll see many more control characters when we discuss terminal I/O in Chapter 18.

Threads and Thread IDs

Usually, a process has only one thread of control (one set of machine instructions executing at a time). Some problems are easier to solve when more than one thread of control can operate on different parts of the problem. Additionally, multiple threads of control can exploit the parallelism possible on multiprocessor systems.

All the threads within a process share the same address space, file descriptors, stacks, and process-related attributes. Each thread executes on its own stack, although any thread can access the stacks of other threads in the same process. Because they can access the same memory, the threads need to synchronize access to shared data among themselves to avoid inconsistencies.

As with processes, threads are identified by IDs. Thread IDs, however, are local to a process. A thread ID from one process has no meaning in another process. We use thread IDs to refer to specific threads as we manipulate the threads within a process.

Functions to control threads parallel those used to control processes. Because threads were added to the UNIX System long after the process model was established, however, the thread model and the process model have some complicated interactions, as we shall see in Chapter 12.

1.7. Error Handling

When an error occurs in one of the UNIX System functions, a negative value is often returned, and the integer errno is usually set to a value that gives additional information. For example, the open function returns either a non-negative file descriptor if all is OK or -1 if an error occurs. An error from open has about 15 possible errno values, such as file doesn't exist, permission problem, and so on. Some functions use a convention other than returning a negative value. For example, most functions that return a pointer to an object return a null pointer to indicate an error.

The file <errno.h> defines the symbol errno and constants for each value that errno can assume. Each of these constants begins with the character E. Also, the first page of Section 2 of the UNIX system manuals, named intro(2), usually lists all these error constants. For example, if errno is equal to the constant EACCES, this indicates a permission problem, such as insufficient permission to open the requested file.

On Linux, the error constants are listed in the errno(3) manual page.

POSIX and ISO C define errno as a symbol expanding into a modifiable lvalue of type integer. This can be either an integer that contains the error number or a function that returns a pointer to the error number. The historical definition is

    extern int errno;

But in an environment that supports threads, the process address space is shared among multiple threads, and each thread needs its own local copy of errno to prevent one thread from interfering with another. Linux, for example, supports multithreaded access to errno by defining it as

    extern int *__errno_location(void);
    #define errno  (*__errno_location())

There are two rules to be aware of with respect to errno. First, its value is never cleared by a routine if an error does not occur. Therefore, we should examine its value only when the return value from a function indicates that an error occurred. Second, the value of errno is never set to 0 by any of the functions, and none of the constants defined in <errno.h> has a value of 0.
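
The first rule can be illustrated with a short sketch: errno is consulted only after open has returned -1, and its value is saved at once because any later call may change it. The helper name open_error is an assumption made for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to open path for reading. Returns 0 if the open succeeds,
 * otherwise the errno value that open reported.
 * open_error is a hypothetical helper name.
 */
int
open_error(const char *path)
{
    int fd, saved;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0) {
        saved = errno;      /* look at errno only after a failure */
        return saved;
    }
    close(fd);              /* success: errno is not examined at all */
    return 0;
}
```

A stale errno value left over from an earlier failure does not affect the result, because the function never reads errno when open succeeds.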

Two functions are defined by the C standard to help with printing error messages.

    #include <string.h>

    char *strerror(int errnum);

Returns: pointer to message string

This function maps errnum, which is typically the errno value, into an error message string and returns a pointer to the string.

The perror function produces an error message on the standard error, based on the current value of errno, and returns.

    #include <stdio.h>

    void perror(const char *msg);

It outputs the string pointed to by msg, followed by a colon and a space, followed by the error message corresponding to the value of errno, followed by a newline.

Example

Figure 1.8 shows the use of these two error functions.

If this program is compiled into the file a.out, we have

    $ ./a.out
    EACCES: Permission denied
    ./a.out: No such file or directory

Note that we pass the name of the program (argv[0], whose value is ./a.out) as the argument to perror. This is a standard convention in the UNIX System. By doing this, if the program is executed as part of a pipeline, as in

    prog1 < inputfile | prog2 | prog3 > outputfile

we are able to tell which of the three programs generated a particular error message.

Figure 1.8. Demonstrate strerror and perror

 #include "apue.h"
 #include <errno.h>
 
 int
 main(int argc, char *argv[])
 {
     fprintf(stderr, "EACCES: %s\n", strerror(EACCES));
     errno = ENOENT;
     perror(argv[0]);
     exit(0);
 }

Instead of calling either strerror or perror directly, all the examples in this text use the error functions shown in Appendix B. The error functions in this appendix let us use the variable argument list facility of ISO C to handle error conditions with a single C statement.

Error Recovery

The errors defined in <errno.h> can be divided into two categories: fatal and nonfatal. A fatal error has no recovery action. The best we can do is print an error message on the user's screen or write an error message into a log file, and then exit. Nonfatal errors, on the other hand, can sometimes be dealt with more robustly. Most nonfatal errors are temporary in nature, such as with a resource shortage, and might not occur when there is less activity on the system.

Resource-related nonfatal errors include EAGAIN, ENFILE, ENOBUFS, ENOLCK, ENOSPC, ENOSR, EWOULDBLOCK, and sometimes ENOMEM. EBUSY can be treated as a nonfatal error when it indicates that a shared resource is in use. Sometimes, EINTR can be treated as a nonfatal error when it interrupts a slow system call (more on this in Section 10.5).

The typical recovery action for a resource-related nonfatal error is to delay a little and try again later. This technique can be applied in other circumstances. For example, if an error indicates that a network connection is no longer functioning, it might be possible for the application to delay a short time and then reestablish the connection. Some applications use an exponential backoff algorithm, waiting a longer period of time each iteration.
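
A delay-and-retry loop with exponential backoff might look like the sketch below. All names here (retry_with_backoff, flaky_op, and the parameters) are hypothetical. For simplicity only EAGAIN is treated as a temporary error; an application would choose its own set. The stand-in operation flaky_op exists only to make the example self-contained.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * Call op(arg) until it succeeds, fails with an error other than
 * EAGAIN, or max_tries attempts have been made. The delay starts
 * at first_delay_ms and doubles after each failed attempt.
 * Returns the last return value of op.
 */
int
retry_with_backoff(int (*op)(void *), void *arg,
                   int max_tries, long first_delay_ms)
{
    long delay_ms = first_delay_ms;
    int  rc = -1;
    int  i;

    assert(op != NULL && max_tries > 0 && first_delay_ms >= 0);
    for (i = 0; i < max_tries; i++) {
        rc = op(arg);
        if (rc >= 0 || errno != EAGAIN)
            break;                      /* success, or not retryable */
        if (i + 1 < max_tries) {
            struct timespec ts;

            ts.tv_sec = delay_ms / 1000;
            ts.tv_nsec = (delay_ms % 1000) * 1000000L;
            nanosleep(&ts, NULL);       /* wait before retrying */
            delay_ms *= 2;              /* back off exponentially */
        }
    }
    return rc;
}

/*
 * Stand-in operation for illustration: fails with EAGAIN until the
 * counter pointed to by arg reaches zero, then succeeds.
 */
int
flaky_op(void *arg)
{
    int *failures_left = arg;

    if (*failures_left > 0) {
        (*failures_left)--;
        errno = EAGAIN;
        return -1;
    }
    return 0;
}
```

Capping the number of attempts keeps a persistent shortage from stalling the program forever; the caller still has to decide what to do when the final attempt fails.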

Ultimately, it is up to the application developer to determine which errors are recoverable. If a reasonable strategy can be used to recover from an error, we can improve the robustness of our application by avoiding an abnormal exit.

1.8. User Identification

User ID

The user ID from our entry in the password file is a numeric value that identifies us to the system. This user ID is assigned by the system administrator when our login name is assigned, and we cannot change it. The user ID is normally assigned to be unique for every user. We'll see how the kernel uses the user ID to check whether we have the appropriate permissions to perform certain operations.

We call the user whose user ID is 0 either root or the superuser. The entry in the password file normally has a login name of root, and we refer to the special privileges of this user as superuser privileges. As we'll see in Chapter 4, if a process has superuser privileges, most file permission checks are bypassed. Some operating system functions are restricted to the superuser. The superuser has free rein over the system.

Client versions of Mac OS X ship with the superuser account disabled; server versions ship with the account already enabled. Instructions are available on Apple's Web site describing how to enable it. See http://docs.info.apple.com/article.html?artnum=106290.

Group ID

Our entry in the password file also specifies our numeric group ID. This too is assigned by the system administrator when our login name is assigned. Typically, the password file contains multiple entries that specify the same group ID. Groups are normally used to collect users together into projects or departments. This allows the sharing of resources, such as files, among members of the same group. We'll see in Section 4.5 that we can set the permissions on a file so that all members of a group can access the file, whereas others outside the group cannot.

There is also a group file that maps group names into numeric group IDs. The group file is usually /etc/group.

The use of numeric user IDs and numeric group IDs for permissions is historical. With every file on disk, the file system stores both the user ID and the group ID of a file's owner. Storing both of these values requires only four bytes, assuming that each is stored as a two-byte integer. If the full ASCII login name and group name were used instead, additional disk space would be required. In addition, comparing strings during permission checks is more expensive than comparing integers.

Users, however, work better with names than with numbers, so the password file maintains the mapping between login names and user IDs, and the group file provides the mapping between group names and group IDs. The ls -l command, for example, prints the login name of the owner of a file, using the password file to map the numeric user ID into the corresponding login name.
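
A program can perform the same mapping with the getpwuid and getgrgid functions, which search the password and group databases. The sketch below is only an illustration; the helper names login_name_of and group_name_of are ours.

```c
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the login name for uid, or NULL if there is no entry. */
const char *
login_name_of(uid_t uid)
{
    struct passwd *pw = getpwuid(uid);

    return pw != NULL ? pw->pw_name : NULL;
}

/* Return the group name for gid, or NULL if there is no entry. */
const char *
group_name_of(gid_t gid)
{
    struct group *gr = getgrgid(gid);

    return gr != NULL ? gr->gr_name : NULL;
}

/* Print our own IDs together with the names they map to. */
void
print_my_names(void)
{
    const char *user = login_name_of(getuid());
    const char *group = group_name_of(getgid());

    printf("uid = %ld (%s), gid = %ld (%s)\n",
        (long)getuid(), user != NULL ? user : "?",
        (long)getgid(), group != NULL ? group : "?");
}
```

Both library functions return pointers to static storage, so a caller that needs the names later should copy them.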

Early UNIX systems used 16-bit integers to represent user and group IDs. Contemporary UNIX systems use 32-bit integers.

Example

The program in Figure 1.9 prints the user ID and the group ID.

We call the functions getuid and getgid to return the user ID and the group ID. Running the program yields

    $ ./a.out
    uid = 205, gid = 105

Figure 1.9. Print user ID and group ID

 #include "apue.h"
 
 int
 main(void)
 {
     printf("uid = %ld, gid = %ld\n", (long)getuid(), (long)getgid());
     exit(0);
 }

Supplementary Group IDs

In addition to the group ID specified in the password file for a login name, most versions of the UNIX System allow a user to belong to additional groups. This started with 4.2BSD, which allowed a user to belong to up to 16 additional groups. These supplementary group IDs are obtained at login time by reading the file /etc/group and finding the first 16 entries that list the user as a member. As we shall see in the next chapter, POSIX requires that a system support at least eight supplementary groups per process, but most systems support at least 16.
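
A process can ask for its own supplementary group IDs with the getgroups function. The following is a small sketch; the function name print_supplementary_groups is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Print the supplementary group IDs of the calling process.
 * Returns the number of IDs, or -1 on error.
 */
int
print_supplementary_groups(void)
{
    int    i, n;
    gid_t *list;

    if ((n = getgroups(0, NULL)) < 0)    /* size 0: just ask how many */
        return -1;
    if (n == 0) {
        printf("no supplementary groups\n");
        return 0;
    }
    if ((list = malloc(n * sizeof(gid_t))) == NULL)
        return -1;
    if ((n = getgroups(n, list)) < 0) {  /* now fetch the IDs themselves */
        free(list);
        return -1;
    }
    for (i = 0; i < n; i++)
        printf("%ld%c", (long)list[i], i == n - 1 ? '\n' : ' ');
    free(list);
    return n;
}
```

Calling getgroups twice, first with a size of 0, avoids guessing how large the array has to be on a given system.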

1.9. Signals

Signals are a technique used to notify a process that some condition has occurred. For example, if a process divides by zero, the signal whose name is SIGFPE (floating-point exception) is sent to the process. The process has three choices for dealing with the signal.

Ignore the signal. This option isn't recommended for signals that denote a hardware exception, such as dividing by zero or referencing memory outside the address space of the process, as the results are undefined.
Let the default action occur. For a divide-by-zero condition, the default is to terminate the process.
Provide a function that is called when the signal occurs (this is called "catching" the signal). By providing a function of our own, we'll know when the signal occurs and we can handle it as we wish.

Many conditions generate signals. Two terminal keys, called the interrupt key (often the DELETE key or Control-C) and the quit key (often Control-backslash), are used to interrupt the currently running process. Another way to generate a signal is by calling the kill function. We can call this function from a process to send a signal to another process. Naturally, there are limitations: we have to be the owner of the other process (or the superuser) to be able to send it a signal.
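
As a small sketch of catching a signal generated with kill, the code below has a process send SIGUSR1 to itself. The names got_usr1, on_usr1, and send_and_catch are ours, and the handler does nothing more than set a flag.

```c
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

static volatile sig_atomic_t got_usr1;   /* set by the handler */

static void
on_usr1(int signo)
{
    (void)signo;
    got_usr1 = 1;                        /* keep handlers this simple */
}

/*
 * Catch SIGUSR1, send it to ourselves with kill, and report whether
 * the handler ran.  Returns 1 if it did, 0 if not, -1 on error.
 */
int
send_and_catch(void)
{
    got_usr1 = 0;
    if (signal(SIGUSR1, on_usr1) == SIG_ERR)
        return -1;
    if (kill(getpid(), SIGUSR1) < 0)     /* we own this process */
        return -1;
    return got_usr1;
}
```

Because the process owns itself, the permission check on kill succeeds; sending the same signal to a process owned by another user would fail unless we were the superuser.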

Example

Recall the bare-bones shell example (Figure 1.7). If we invoke this program and press the interrupt key, the process terminates because the default action for this signal, named SIGINT, is to terminate the process. The process hasn't told the kernel to do anything other than the default with this signal, so the process terminates.

To catch this signal, the program needs to call the signal function, specifying the name of the function to call when the SIGINT signal is generated. The function is named sig_int; when it's called, it just prints a message and a new prompt. Adding 11 lines to the program in Figure 1.7 gives us the version in Figure 1.10. (The 11 new lines are indicated with a plus sign at the beginning of the line.)

In Chapter 10, we'll take a long look at signals, as most nontrivial applications deal with them.

Figure 1.10. Read commands from standard input and execute them

   #include "apue.h"
   #include <sys/wait.h>
 
 + static void sig_int(int);       /* our signal-catching function */
 +
   int
   main(void)
   {
       char    buf[MAXLINE];    /* from apue.h */
       pid_t   pid;
       int     status;
 
 +     if (signal(SIGINT, sig_int) == SIG_ERR)
 +         err_sys("signal error");
 +
       printf("%% ");  /* print prompt (printf requires %% to print %) */
       while (fgets(buf, MAXLINE, stdin) != NULL) {
           if (buf[strlen(buf) - 1] == '\n')
               buf[strlen(buf) - 1] = 0; /* replace newline with null */
 
           if ((pid = fork()) < 0) {
               err_sys("fork error");
           } else if (pid == 0) {        /* child */
               execlp(buf, buf, (char *)0);
               err_ret("couldn't execute: %s", buf);
               exit(127);
           }
 
           /* parent */
           if ((pid = waitpid(pid, &status, 0)) < 0)
               err_sys("waitpid error");
           printf("%% ");
       }
       exit(0);
   }
 +
 + void
 + sig_int(int signo)
 + {
 +     printf("interrupt\n%% ");
 + }

1.10. Time Values

Historically, UNIX systems have maintained two different time values:

Calendar time. This value counts the number of seconds since the Epoch: 00:00:00 January 1, 1970, Coordinated Universal Time (UTC). (Older manuals refer to UTC as Greenwich Mean Time.) These time values are used to record the time when a file was last modified, for example.
The primitive system data type time_t holds these time values.
Process time. This is also called CPU time and measures the central processor resources used by a process. Process time is measured in clock ticks, which have historically been 50, 60, or 100 ticks per second.
The primitive system data type clock_t holds these time values. (We'll show how to obtain the number of clock ticks per second with the sysconf function in Section 2.5.4.)

When we measure the execution time of a process, as in Section 3.9, we'll see that the UNIX System maintains three values for a process:

Clock time
User CPU time
System CPU time

The clock time, sometimes called wall clock time, is the amount of time the process takes to run, and its value depends on the number of other processes being run on the system. Whenever we report the clock time, the measurements are made with no other activities on the system.

The user CPU time is the CPU time attributed to user instructions. The system CPU time is the CPU time attributed to the kernel when it executes on behalf of the process. For example, whenever a process executes a system service, such as read or write, the time spent within the kernel performing that system service is charged to the process. The sum of user CPU time and system CPU time is often called the CPU time.

It is easy to measure the clock time, user time, and system time of any process: simply execute the time(1) command, with the argument to the time command being the command we want to measure. For example:

    $ cd /usr/include
    $ time -p grep _POSIX_SOURCE */*.h > /dev/null
 
    real    0m0.81s
    user    0m0.11s
    sys     0m0.07s

The output format from the time command depends on the shell being used, because some shells don't run /usr/bin/time, but instead have a separate built-in function to measure the time it takes commands to run.

In Section 8.16, we'll see how to obtain these three times from a running process. The general topic of times and dates is covered in Section 6.10.
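
As a brief preview, and only as a sketch, a process can read its own user and system CPU times with the times function and convert clock ticks to seconds with sysconf. The function name print_cpu_times is hypothetical.

```c
#include <stdio.h>
#include <sys/times.h>
#include <unistd.h>

/*
 * Print the CPU time this process has used so far.
 * Returns 0 on success, -1 on error.
 */
int
print_cpu_times(void)
{
    struct tms t;
    long       ticks;

    if ((ticks = sysconf(_SC_CLK_TCK)) <= 0)   /* clock ticks per second */
        return -1;
    if (times(&t) == (clock_t)-1)
        return -1;
    printf("user %.2f\n", (double)t.tms_utime / ticks);
    printf("sys  %.2f\n", (double)t.tms_stime / ticks);
    return 0;
}
```

Clock time is not printed here: it has to be computed as the difference between two readings taken at the start and the end of the interval being measured.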

1.11. System Calls and Library Functions

All operating systems provide service points through which programs request services from the kernel. All implementations of the UNIX System provide a well-defined, limited number of entry points directly into the kernel called system calls (recall Figure 1.1). Version 7 of the Research UNIX System provided about 50 system calls, 4.4BSD provided about 110, and SVR4 had around 120. Linux has anywhere between 240 and 260 system calls, depending on the version. FreeBSD has around 320.

The system call interface has always been documented in Section 2 of the UNIX Programmer's Manual. Its definition is in the C language, regardless of the actual implementation technique used on any given system to invoke a system call. This differs from many older operating systems, which traditionally defined the kernel entry points in the assembler language of the machine.

The technique used on UNIX systems is for each system call to have a function of the same name in the standard C library. The user process calls this function, using the standard C calling sequence. This function then invokes the appropriate kernel service, using whatever technique is required on the system. For example, the function may put one or more of the C arguments into general registers and then execute some machine instruction that generates a software interrupt in the kernel. For our purposes, we can consider the system calls as being C functions.

Section 3 of the UNIX Programmer's Manual defines the general-purpose functions available to programmers. These functions aren't entry points into the kernel, although they may invoke one or more of the kernel's system calls. For example, the printf function may use the write system call to output a string, but the strcpy (copy a string) and atoi (convert ASCII to integer) functions don't involve the kernel at all.
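
The two routes to the same output can be sketched like this. The function names are ours; the first calls the write system call directly, and the second goes through the standard I/O library, which buffers the data before handing it to the kernel.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Section 2: call write directly.  Returns bytes written, or -1. */
ssize_t
say_with_write(const char *msg)
{
    return write(STDOUT_FILENO, msg, strlen(msg));
}

/* Section 3: use standard I/O.  Returns 0 on success, -1 on error. */
int
say_with_stdio(const char *msg)
{
    if (fputs(msg, stdout) == EOF)       /* may only fill a buffer */
        return -1;
    return fflush(stdout) == EOF ? -1 : 0;   /* pass buffered data on */
}
```

Note that strlen, like strcpy and atoi, is pure library code: it runs entirely in the process and never enters the kernel.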

From an implementor's point of view, the distinction between a system call and a library function is fundamental. But from a user's perspective, the difference is not as critical. From our perspective in this text, both system calls and library functions appear as normal C functions. Both exist to provide services for application programs. We should realize, however, that we can replace the library functions, if desired, whereas the system calls usually cannot be replaced.

Consider the memory allocation function malloc as an example. There are many ways to do memory allocation and its associated garbage collection (best fit, first fit, and so on). No single technique is optimal for all programs. The UNIX system call that handles memory allocation, sbrk(2), is not a general-purpose memory manager. It increases or decreases the address space of the process by a specified number of bytes. How that space is managed is up to the process. The memory allocation function, malloc(3), implements one particular type of allocation. If we don't like its operation, we can define our own malloc function, which will probably use the sbrk system call. In fact, numerous software packages implement their own memory allocation algorithms with the sbrk system call. Figure 1.11 shows the relationship between the application, the malloc function, and the sbrk system call.

Figure 1.11. Separation of `malloc` function and `sbrk` system call

Here we have a clean separation of duties: the system call in the kernel allocates an additional chunk of space on behalf of the process. The malloc library function manages this space from user level.

Another example to illustrate the difference between a system call and a library function is the interface the UNIX System provides to determine the current time and date. Some operating systems provide one system call to return the time and another to return the date. Any special handling, such as the switch to or from daylight saving time, is handled by the kernel or requires human intervention. The UNIX System, on the other hand, provides a single system call that returns the number of seconds since the Epoch: midnight, January 1, 1970, Coordinated Universal Time. Any interpretation of this value, such as converting it to a human-readable time and date using the local time zone, is left to the user process. The standard C library provides routines to handle most cases. These library routines handle such details as the various algorithms for daylight saving time.
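
A short sketch of this split: the kernel supplies only the count of seconds, and the library routines localtime and strftime turn it into a readable string. The function name format_local_time and the chosen format are ours.

```c
#include <stdio.h>
#include <time.h>

/*
 * Format a calendar time (seconds since the Epoch) as local time.
 * Returns the length of the string, or 0 on failure.
 */
size_t
format_local_time(time_t t, char *buf, size_t size)
{
    struct tm *tm = localtime(&t);       /* applies time zone and DST rules */

    if (tm == NULL)
        return 0;
    return strftime(buf, size, "%Y-%m-%d %H:%M:%S %Z", tm);
}
```

Passing the value returned by time(NULL) formats the current time. The same number of seconds yields different strings in different time zones, which is exactly the interpretation the kernel leaves to the user process.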

An application can call either a system call or a library routine. Also realize that many library routines invoke a system call. This is shown in Figure 1.12.

Figure 1.12. Difference between C library functions and system calls

Another difference between system calls and library functions is that system calls usually provide a minimal interface, whereas library functions often provide more elaborate functionality. We've seen this already in the difference between the sbrk system call and the malloc library function. We'll see this difference later, when we compare the unbuffered I/O functions (Chapter 3) and the standard I/O functions (Chapter 5).

The process control system calls (fork, exec, and wait) are usually invoked by the user's application code directly. (Recall the bare-bones shell in Figure 1.7.) But some library routines exist to simplify certain common cases: the system and popen library routines, for example. In Section 8.13, we'll show an implementation of the system function that invokes the basic process control system calls. We'll enhance this example in Section 10.18 to handle signals correctly.

To define the interface to the UNIX System that most programmers use, we have to describe both the system calls and some of the library functions. If we described only the sbrk system call, for example, we would skip the more programmer-friendly malloc library function that many applications use. In this text, we'll use the term function to refer to both system calls and library functions, except when the distinction is necessary.

1.12. Summary

This chapter has been a short tour of the UNIX System. We've described some of the fundamental terms that we'll encounter over and over again. We've seen numerous small examples of UNIX programs to give us a feel for what the remainder of the text talks about.

The next chapter is about standardization of the UNIX System and the effect of work in this area on current systems. Standards, particularly the ISO C standard and the POSIX.1 standard, will affect the rest of the text.

2.1. Introduction

Much work has gone into standardizing the UNIX programming environment and the C programming language. Although applications have always been quite portable across different versions of the UNIX operating system, the proliferation of versions and differences during the 1980s led many large users, such as the U.S. government, to call for standardization.

In this chapter we first look at the various standardization efforts that have been under way over the past two decades. We then discuss the effects of these UNIX programming standards on the operating system implementations that are described in this book. An important part of all the standardization efforts is the specification of various limits that each implementation must define, so we look at these limits and the various ways to determine their values.

2.2. UNIX Standardization

2.2.1. ISO C

In late 1989, ANSI Standard X3.159-1989 for the C programming language was approved. This standard has also been adopted as international standard ISO/IEC 9899:1990. ANSI is the American National Standards Institute, the U.S. member in the International Organization for Standardization (ISO). IEC stands for the International Electrotechnical Commission.

The C standard is now maintained and developed by the ISO/IEC international standardization working group for the C programming language, known as ISO/IEC JTC1/SC22/WG14, or WG14 for short. The intent of the ISO C standard is to provide portability of conforming C programs to a wide variety of operating systems, not only the UNIX System. This standard defines not only the syntax and semantics of the programming language but also a standard library [Chapter 7 of ISO 1999; Plauger 1992; Appendix B of Kernighan and Ritchie 1988]. This library is important because all contemporary UNIX systems, such as the ones described in this book, provide the library routines that are specified in the C standard.

In 1999, the ISO C standard was updated and approved as ISO/IEC 9899:1999, largely to improve support for applications that perform numerical processing. The changes don't affect the POSIX standards described in this book, except for the addition of the restrict keyword to some of the function prototypes. This keyword is used to tell the compiler which pointer references can be optimized, by indicating that the object to which the pointer refers is accessed in the function only via that pointer.
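
As an illustration of the keyword (the function add_arrays is a made-up example, not one of the POSIX prototypes), restrict on both parameters tells the compiler that the two arrays don't overlap:

```c
#include <stddef.h>

/*
 * Hypothetical example: add src[i] into dst[i] for n elements.
 * restrict promises the compiler that, inside this function, the
 * objects reached through dst and src are accessed only through
 * those pointers, so they cannot overlap.
 */
void
add_arrays(double *restrict dst, const double *restrict src, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] += src[i];
}
```

The promise is made by the caller, not checked by the compiler: passing overlapping arrays to such a function gives undefined behavior.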

As with most standards, there is a delay between the standard's approval and the modification of software to conform to it. As each vendor's compilation systems evolve, they add more support for the latest version of the ISO C standard.

A summary of the current level of conformance of gcc to the 1999 version of the ISO C standard is available at http://www.gnu.org/software/gcc/c99status.html.

The ISO C library can be divided into 24 areas, based on the headers defined by the standard. Figure 2.1 lists the headers defined by the C standard. The POSIX.1 standard includes these headers, as well as others. We also list which of these headers are supported by the four implementations (FreeBSD 5.2.1, Linux 2.4.22, Mac OS X 10.3, and Solaris 9) that are described later in this chapter.

Figure 2.1. Headers defined by the ISO C standard

| Header | FreeBSD 5.2.1 | Linux 2.4.22 | Mac OS X 10.3 | Solaris 9 | Description |
| --- | --- | --- | --- | --- | --- |
| <assert.h> | • | • | • | • | verify program assertion |
| <complex.h> | • | • | • |  | complex arithmetic support |
| <ctype.h> | • | • | • | • | character types |
| <errno.h> | • | • | • | • | error codes (Section 1.7) |
| <fenv.h> |  | • | • |  | floating-point environment |
| <float.h> | • | • | • | • | floating-point constants |
| <inttypes.h> | • | • | • | • | integer type format conversion |
| <iso646.h> | • | • | • | • | alternate relational operator macros |
| <limits.h> | • | • | • | • | implementation constants (Section 2.5) |
| <locale.h> | • | • | • | • | locale categories |
| <math.h> | • | • | • | • | mathematical constants |
| <setjmp.h> | • | • | • | • | nonlocal goto (Section 7.10) |
| <signal.h> | • | • | • | • | signals (Chapter 10) |
| <stdarg.h> | • | • | • | • | variable argument lists |
| <stdbool.h> | • | • | • | • | boolean type and values |
| <stddef.h> | • | • | • | • | standard definitions |
| <stdint.h> | • | • | • |  | integer types |
| <stdio.h> | • | • | • | • | standard I/O library (Chapter 5) |
| <stdlib.h> | • | • | • | • | utility functions |
| <string.h> | • | • | • | • | string operations |
| <tgmath.h> |  | • |  |  | type-generic math macros |
| <time.h> | • | • | • | • | time and date (Section 6.10) |
| <wchar.h> | • | • | • | • | extended multibyte and wide character support |
| <wctype.h> | • | • | • | • | wide character classification and mapping support |

The ISO C headers depend on which version of the C compiler is used with the operating system. When considering Figure 2.1, note that FreeBSD 5.2.1 ships with version 3.3.3 of gcc, Solaris 9 ships with both version 2.95.3 and version 3.2 of gcc, Mandrake 9.2 (Linux 2.4.22) ships with version 3.3.1 of gcc, and Mac OS X 10.3 ships with version 3.3 of gcc. Mac OS X also includes older versions of gcc.

2.2.2. IEEE POSIX

POSIX is a family of standards developed by the IEEE (Institute of Electrical and Electronics Engineers). POSIX stands for Portable Operating System Interface. It originally referred only to the IEEE Standard 1003.1-1988 (the operating system interface) but was later extended to include many of the standards and draft standards with the 1003 designation, including the shell and utilities (1003.2).

Of specific interest to this book is the 1003.1 operating system interface standard, whose goal is to promote the portability of applications among various UNIX System environments. This standard defines the services that must be provided by an operating system if it is to be "POSIX compliant," and has been adopted by most computer vendors. Although the 1003.1 standard is based on the UNIX operating system, the standard is not restricted to UNIX and UNIX-like systems. Indeed, some vendors supplying proprietary operating systems claim that these systems have been made POSIX compliant, while still leaving all their proprietary features in place.

Because the 1003.1 standard specifies an interface and not an implementation, no distinction is made between system calls and library functions. All the routines in the standard are called functions.

Standards are continually evolving, and the 1003.1 standard is no exception. The 1988 version of this standard, IEEE Standard 1003.1-1988, was modified and submitted to the International Organization for Standardization. No new interfaces or features were added, but the text was revised. The resulting document was published as IEEE Std 1003.1-1990 [IEEE 1990]. This is also the international standard ISO/IEC 9945-1:1990. This standard is commonly referred to as POSIX.1, which we'll use in this text.

The IEEE 1003.1 working group continued to make changes to the standard. In 1993, a revised version of the IEEE 1003.1 standard was published. It included the 1003.1-1990 standard and the 1003.1b-1993 real-time extensions standard. In 1996, the standard was again updated as international standard ISO/IEC 9945-1:1996. It included interfaces for multithreaded programming, called pthreads for POSIX threads. More real-time interfaces were added in 1999 with the publication of IEEE Standard 1003.1d-1999. A year later, IEEE Standard 1003.1j-2000 was published, including even more real-time interfaces, and IEEE Standard 1003.1q-2000 was published, adding event-tracing extensions to the standard.

The 2001 version of 1003.1 departed from the prior versions in that it combined several 1003.1 amendments, the 1003.2 standard, and portions of the Single UNIX Specification (SUS), Version 2 (more on this later). The resulting standard, IEEE Standard 1003.1-2001, includes the following other standards:

ISO/IEC 9945-1 (IEEE Standard 1003.1-1996), which includes
- IEEE Standard 1003.1-1990
- IEEE Standard 1003.1b-1993 (real-time extensions)
- IEEE Standard 1003.1c-1995 (pthreads)
- IEEE Standard 1003.1i-1995 (real-time technical corrigenda)
IEEE P1003.1a draft standard (system interface revision)
IEEE Standard 1003.1d-1999 (advanced real-time extensions)
IEEE Standard 1003.1j-2000 (more advanced real-time extensions)
IEEE Standard 1003.1q-2000 (tracing)
IEEE Standard 1003.2d-1994 (batch extensions)
IEEE P1003.2b draft standard (additional utilities)
Parts of IEEE Standard 1003.1g-2000 (protocol-independent interfaces)
ISO/IEC 9945-2 (IEEE Standard 1003.2-1993)
The Base Specifications of the Single UNIX Specification, version 2, which include
- System Interface Definitions, Issue 5
- Commands and Utilities, Issue 5
- System Interfaces and Headers, Issue 5
Open Group Technical Standard, Networking Services, Issue 5.2
ISO/IEC 9899:1999, Programming Languages - C

Figure 2.2, Figure 2.3, and Figure 2.4 summarize the required and optional headers as specified by POSIX.1. Because POSIX.1 includes the ISO C standard library functions, it also requires the headers listed in Figure 2.1. All four figures summarize which headers are included in the implementations discussed in this book.

Figure 2.2. Required headers defined by the POSIX standard

| Header | FreeBSD 5.2.1 | Linux 2.4.22 | Mac OS X 10.3 | Solaris 9 | Description |
| --- | --- | --- | --- | --- | --- |
| <dirent.h> | • | • | • | • | directory entries (Section 4.21) |
| <fcntl.h> | • | • | • | • | file control (Section 3.14) |
| <fnmatch.h> | • | • | • | • | filename-matching types |
| <glob.h> | • | • | • | • | pathname pattern-matching types |
| <grp.h> | • | • | • | • | group file (Section 6.4) |
| <netdb.h> | • | • | • | • | network database operations |
| <pwd.h> | • | • | • | • | password file (Section 6.2) |
| <regex.h> | • | • | • | • | regular expressions |
| <tar.h> | • | • | • | • | tar archive values |
| <termios.h> | • | • | • | • | terminal I/O (Chapter 18) |
| <unistd.h> | • | • | • | • | symbolic constants |
| <utime.h> | • | • | • | • | file times (Section 4.19) |
| <wordexp.h> | • | • |  | • | word-expansion types |
| <arpa/inet.h> | • | • | • | • | Internet definitions (Chapter 16) |
| <net/if.h> | • | • | • | • | socket local interfaces (Chapter 16) |
| <netinet/in.h> | • | • | • | • | Internet address family (Section 16.3) |
| <netinet/tcp.h> | • | • | • | • | Transmission Control Protocol definitions |
| <sys/mman.h> | • | • | • | • | memory management declarations |
| <sys/select.h> | • | • | • | • | select function (Section 14.5.1) |
| <sys/socket.h> | • | • | • | • | sockets interface (Chapter 16) |
| <sys/stat.h> | • | • | • | • | file status (Chapter 4) |
| <sys/times.h> | • | • | • | • | process times (Section 8.16) |
| <sys/types.h> | • | • | • | • | primitive system data types (Section 2.8) |
| <sys/un.h> | • | • | • | • | UNIX domain socket definitions (Section 17.3) |
| <sys/utsname.h> | • | • | • | • | system name (Section 6.9) |
| <sys/wait.h> | • | • | • | • | process control (Section 8.6) |

Figure 2.3. XSI extension headers defined by the POSIX standard

| Header | FreeBSD 5.2.1 | Linux 2.4.22 | Mac OS X 10.3 | Solaris 9 | Description |
| --- | --- | --- | --- | --- | --- |
| <cpio.h> | • | • |  | • | cpio archive values |
| <dlfcn.h> | • | • | • | • | dynamic linking |
| <fmtmsg.h> | • | • |  | • | message display structures |
| <ftw.h> |  | • |  | • | file tree walking (Section 4.21) |
| <iconv.h> |  | • | • | • | codeset conversion utility |
| <langinfo.h> | • | • | • | • | language information constants |
| <libgen.h> | • | • | • | • | definitions for pattern-matching function |
| <monetary.h> | • | • | • | • | monetary types |
| <ndbm.h> | • |  | • | • | database operations |
| <nl_types.h> | • | • | • | • | message catalogs |
| <poll.h> | • | • | • | • | poll function (Section 14.5.2) |
| <search.h> | • | • | • | • | search tables |
| <strings.h> | • | • | • | • | string operations |
| <syslog.h> | • | • | • | • | system error logging (Section 13.4) |
| <ucontext.h> | • | • | • | • | user context |
| <ulimit.h> | • | • | • | • | user limits |
| <utmpx.h> |  | • |  | • | user accounting database |
| <sys/ipc.h> | • | • | • | • | IPC (Section 15.6) |
| <sys/msg.h> | • | • |  | • | message queues (Section 15.7) |
| <sys/resource.h> | • | • | • | • | resource operations (Section 7.11) |
| <sys/sem.h> | • | • | • | • | semaphores (Section 15.8) |
| <sys/shm.h> | • | • | • | • | shared memory (Section 15.9) |
| <sys/statvfs.h> | • | • |  | • | file system information |
| <sys/time.h> | • | • | • | • | time types |
| <sys/timeb.h> | • | • | • | • | additional date and time definitions |
| <sys/uio.h> | • | • | • | • | vector I/O operations (Section 14.7) |

Figure 2.4. Optional headers defined by the POSIX standard

| Header | FreeBSD 5.2.1 | Linux 2.4.22 | Mac OS X 10.3 | Solaris 9 | Description |
| --- | --- | --- | --- | --- | --- |
| <aio.h> | • | • | • | • | asynchronous I/O |
| <mqueue.h> | • |  |  | • | message queues |
| <pthread.h> | • | • | • | • | threads (Chapters 11 and 12) |
| <sched.h> | • | • | • | • | execution scheduling |
| <semaphore.h> | • | • | • | • | semaphores |
| <spawn.h> |  | • |  |  | real-time spawn interface |
| <stropts.h> |  | • |  | • | XSI STREAMS interface (Section 14.4) |
| <trace.h> |  |  |  |  | event tracing |

In this text we describe the 2001 version of POSIX.1, which includes the functions specified in the ISO C standard. Its interfaces are divided into required ones and optional ones. The optional interfaces are further divided into 50 sections, based on functionality. The sections containing nonobsolete programming interfaces are summarized in Figure 2.5 with their respective option codes. Option codes are two- to three-character abbreviations that help identify the interfaces that belong to each functional area. The option codes highlight text on manual pages where interfaces depend on the support of a particular option. Many of the options deal with real-time extensions.

Figure 2.5. POSIX.1 optional interface groups and codes

| Code | SUS mandatory | Symbolic constant | Description |
| --- | --- | --- | --- |
| ADV |  | _POSIX_ADVISORY_INFO | advisory information (real-time) |
| AIO |  | _POSIX_ASYNCHRONOUS_IO | asynchronous input and output (real-time) |
| BAR |  | _POSIX_BARRIERS | barriers (real-time) |
| CPT |  | _POSIX_CPUTIME | process CPU time clocks (real-time) |
| CS |  | _POSIX_CLOCK_SELECTION | clock selection (real-time) |
| CX | • |  | extension to ISO C standard |
| FSC | • | _POSIX_FSYNC | file synchronization |
| IP6 |  | _POSIX_IPV6 | IPv6 interfaces |
| MF | • | _POSIX_MAPPED_FILES | memory-mapped files |
| ML |  | _POSIX_MEMLOCK | process memory locking (real-time) |
| MLR |  | _POSIX_MEMLOCK_RANGE | memory range locking (real-time) |
| MON |  | _POSIX_MONOTONIC_CLOCK | monotonic clock (real-time) |
| MPR | • | _POSIX_MEMORY_PROTECTION | memory protection |
| MSG |  | _POSIX_MESSAGE_PASSING | message passing (real-time) |
| MX |  |  | IEC 60559 floating-point option |
| PIO |  | _POSIX_PRIORITIZED_IO | prioritized input and output |
| PS |  | _POSIX_PRIORITY_SCHEDULING | process scheduling (real-time) |
| RS |  | _POSIX_RAW_SOCKETS | raw sockets |
| RTS |  | _POSIX_REALTIME_SIGNALS | real-time signals extension |
| SEM |  | _POSIX_SEMAPHORES | semaphores (real-time) |
| SHM |  | _POSIX_SHARED_MEMORY_OBJECTS | shared memory objects (real-time) |
| SIO |  | _POSIX_SYNCHRONIZED_IO | synchronized input and output (real-time) |
| SPI |  | _POSIX_SPIN_LOCKS | spin locks (real-time) |
| SPN |  | _POSIX_SPAWN | spawn (real-time) |
| SS |  | _POSIX_SPORADIC_SERVER | process sporadic server (real-time) |
| TCT |  | _POSIX_THREAD_CPUTIME | thread CPU time clocks (real-time) |
| TEF |  | _POSIX_TRACE_EVENT_FILTER | trace event filter |
| THR | • | _POSIX_THREADS | threads |
| TMO |  | _POSIX_TIMEOUTS | timeouts (real-time) |
| TMR |  | _POSIX_TIMERS | timers (real-time) |
| TPI |  | _POSIX_THREAD_PRIO_INHERIT | thread priority inheritance (real-time) |
| TPP |  | _POSIX_THREAD_PRIO_PROTECT | thread priority protection (real-time) |
| TPS |  | _POSIX_THREAD_PRIORITY_SCHEDULING | thread execution scheduling (real-time) |
| TRC |  | _POSIX_TRACE | trace |
| TRI |  | _POSIX_TRACE_INHERIT | trace inherit |
| TRL |  | _POSIX_TRACE_LOG | trace log |
| TSA | • | _POSIX_THREAD_ATTR_STACKADDR | thread stack address attribute |
| TSF | • | _POSIX_THREAD_SAFE_FUNCTIONS | thread-safe functions |
| TSH | • | _POSIX_THREAD_PROCESS_SHARED | thread process-shared synchronization |
| TSP |  | _POSIX_THREAD_SPORADIC_SERVER | thread sporadic server (real-time) |
| TSS | • | _POSIX_THREAD_ATTR_STACKSIZE | thread stack size attribute |
| TYM |  | _POSIX_TYPED_MEMORY_OBJECTS | typed memory objects (real-time) |
| XSI | • | _XOPEN_UNIX | X/Open extended interfaces |
| XSR |  | _XOPEN_STREAMS | XSI STREAMS |
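
A program can test the symbolic constants of Figure 2.5 to find out whether an option is available. The sketch below does this for the threads option (THR). It assumes the usual POSIX convention: a constant that is undefined or -1 means the option is not supported at compile time, a value greater than zero means it is always supported, and zero means we must ask at run time with sysconf. The function name threads_supported is ours.

```c
#include <unistd.h>

/*
 * Report whether the POSIX threads option (code THR) is available.
 * Returns 1 if it is supported, 0 if it is not.
 */
int
threads_supported(void)
{
#if defined(_POSIX_THREADS) && _POSIX_THREADS > 0
    return 1;                            /* always supported */
#elif defined(_POSIX_THREADS) && _POSIX_THREADS == 0
    return sysconf(_SC_THREADS) > 0;     /* ask the running system */
#else
    return 0;                            /* undefined or -1 */
#endif
}
```

The same pattern applies to the other options: substitute the symbolic constant from the figure and, where one exists, the matching sysconf name.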

POSIX.1 does not include the notion of a superuser. Instead, certain operations require "appropriate privileges," although POSIX.1 leaves the definition of this term up to the implementation. UNIX systems that conform to the Department of Defense security guidelines have many levels of security. In this text, however, we use the traditional terminology and refer to operations that require superuser privilege.

After almost twenty years of work, the standards are mature and stable. The POSIX.1 standard is maintained by an open working group known as the Austin Group (http://www.opengroup.org/austin). To ensure that they are still relevant, the standards need to be either updated or reaffirmed every so often.

2.2.3. The Single UNIX Specification

The Single UNIX Specification, a superset of the POSIX.1 standard, specifies additional interfaces that extend the functionality provided by the basic POSIX.1 specification. The complete set of system interfaces is called the X/Open System Interface (XSI). The _XOPEN_UNIX symbolic constant identifies interfaces that are part of the XSI extensions to the base POSIX.1 interfaces.

The XSI also defines which optional portions of POSIX.1 must be supported for an implementation to be deemed XSI conforming. These include file synchronization, memory-mapped files, memory protection, and thread interfaces, and are marked in Figure 2.5 as "SUS mandatory." Only XSI-conforming implementations can be called UNIX systems.

The Open Group owns the UNIX trademark and uses the Single UNIX Specification to define the interfaces an implementation must support to call itself a UNIX system. Implementations must file conformance statements, pass test suites that verify conformance, and license the right to use the UNIX trademark.

Some of the additional interfaces defined in the XSI are required, whereas others are optional. The interfaces are divided into option groups based on common functionality, as follows:

Encryption: denoted by the _XOPEN_CRYPT symbolic constant
Real-time: denoted by the _XOPEN_REALTIME symbolic constant
Advanced real-time
Real-time threads: denoted by the _XOPEN_REALTIME_THREADS symbolic constant
Advanced real-time threads
Tracing
XSI STREAMS: denoted by the _XOPEN_STREAMS symbolic constant
Legacy: denoted by the _XOPEN_LEGACY symbolic constant

The Single UNIX Specification (SUS) is a publication of The Open Group, which was formed in 1996 as a merger of X/Open and the Open Software Foundation (OSF), both industry consortia. X/Open used to publish the X/Open Portability Guide, which adopted specific standards and filled in the gaps where functionality was missing. The goal of these guides was to improve application portability past what was possible by merely conforming to published standards.

The first version of the Single UNIX Specification was published by X/Open in 1994. It was also known as "Spec 1170," because it contained roughly 1,170 interfaces. It grew out of the Common Open Software Environment (COSE) initiative, whose goal was to further improve application portability across all implementations of the UNIX operating system. The COSE group (Sun, IBM, HP, Novell/USL, and OSF) went further than endorsing standards. In addition, they investigated interfaces used by common commercial applications. The resulting 1,170 interfaces were selected from these applications, and also included the X/Open Common Application Environment (CAE), Issue 4 (known as "XPG4" as a historical reference to its predecessor, the X/Open Portability Guide), the System V Interface Definition (SVID), Edition 3, Level 1 interfaces, and the OSF Application Environment Specification (AES) Full Use interfaces.

The second version of the Single UNIX Specification was published by The Open Group in 1997. The new version added support for threads, real-time interfaces, 64-bit processing, large files, and enhanced multibyte character processing.

The third version of the Single UNIX Specification (SUSv3, for short) was published by The Open Group in 2001. The Base Specifications of SUSv3 are the same as the IEEE Standard 1003.1-2001 and are divided into four sections: Base Definitions, System Interfaces, Shell and Utilities, and Rationale. SUSv3 also includes X/Open Curses Issue 4, Version 2, but this specification is not part of POSIX.1.

In 2002, ISO approved this version as International Standard ISO/IEC 9945:2002. The Open Group updated the 1003.1 standard again in 2003 to include technical corrections, and ISO approved this as International Standard ISO/IEC 9945:2003. In April 2004, The Open Group published the Single UNIX Specification, Version 3, 2004 Edition. It included more technical corrections edited in with the main text of the standard.

2.2.4. FIPS

FIPS stands for Federal Information Processing Standard. It was published by the U.S. government, which used it for the procurement of computer systems. FIPS 151-1 (April 1989) was based on the IEEE Std. 1003.1-1988 and a draft of the ANSI C standard. This was followed by FIPS 151-2 (May 1993), which was based on the IEEE Standard 1003.1-1990. FIPS 151-2 required some features that POSIX.1 listed as optional. All these options have been included as mandatory in POSIX.1-2001.

The effect of the POSIX.1 FIPS was to require any vendor that wished to sell POSIX.1-compliant computer systems to the U.S. government to support some of the optional features of POSIX.1. The POSIX.1 FIPS has since been withdrawn, so we won't consider it further in this text.

2.3. UNIX System Implementations

The previous section described ISO C, IEEE POSIX, and the Single UNIX Specification, three standards created by independent organizations. Standards, however, are interface specifications. How do these standards relate to the real world? These standards are taken by vendors and turned into actual implementations. In this book, we are interested in both these standards and their implementation.

Section 1.1 of McKusick et al. [1996] gives a detailed history (and a nice picture) of the UNIX System family tree. Everything starts from the Sixth Edition (1976) and Seventh Edition (1979) of the UNIX Time-Sharing System on the PDP-11 (usually called Version 6 and Version 7). These were the first releases widely distributed outside of Bell Laboratories. Three branches of the tree evolved.

One at AT&T that led to System III and System V, the so-called commercial versions of the UNIX System.
One at the University of California at Berkeley that led to the 4.xBSD implementations.
The research version of the UNIX System, developed at the Computing Science Research Center of AT&T Bell Laboratories, that led to the UNIX Time-Sharing System 8th Edition, 9th Edition, and ended with the 10th Edition in 1990.

2.3.1. UNIX System V Release 4

UNIX System V Release 4 (SVR4) was a product of AT&T's UNIX System Laboratories (USL, formerly AT&T's UNIX Software Operation). SVR4 merged functionality from AT&T UNIX System V Release 3.2 (SVR3.2), the SunOS operating system from Sun Microsystems, the 4.3BSD release from the University of California, and the Xenix system from Microsoft into one coherent operating system. (Xenix was originally developed from Version 7, with many features later taken from System V.) The SVR4 source code was released in late 1989, with the first end-user copies becoming available during 1990. SVR4 conformed to both the POSIX 1003.1 standard and the X/Open Portability Guide, Issue 3 (XPG3).

AT&T also published the System V Interface Definition (SVID) [AT&T 1989]. Issue 3 of the SVID specified the functionality that an operating system must offer to qualify as a conforming implementation of UNIX System V Release 4. As with POSIX.1, the SVID specified an interface, not an implementation. No distinction was made in the SVID between system calls and library functions. The reference manual for an actual implementation of SVR4 must be consulted to see this distinction [AT&T 1990e].

2.3.2. 4.4BSD

The Berkeley Software Distribution (BSD) releases were produced and distributed by the Computer Systems Research Group (CSRG) at the University of California at Berkeley; 4.2BSD was released in 1983 and 4.3BSD in 1986. Both of these releases ran on the VAX minicomputer. The next release, 4.3BSD Tahoe in 1988, also ran on a particular minicomputer called the Tahoe. (The book by Leffler et al. [1989] describes the 4.3BSD Tahoe release.) This was followed in 1990 with the 4.3BSD Reno release; 4.3BSD Reno supported many of the POSIX.1 features.

The original BSD systems contained proprietary AT&T source code and were covered by AT&T licenses. To obtain the source code to the BSD system you had to have a UNIX source license from AT&T. This changed as more and more of the AT&T source code was replaced over the years with non-AT&T source code and as many of the new features added to the Berkeley system were derived from non-AT&T sources.

In 1989, Berkeley identified much of the non-AT&T source code in the 4.3BSD Tahoe release and made it publicly available as the BSD Networking Software, Release 1.0. This was followed in 1991 with Release 2.0 of the BSD Networking Software, which was derived from the 4.3BSD Reno release. The intent was that most, if not all, of the 4.4BSD system would be free of any AT&T license restrictions, thus making the source code available to all.

4.4BSD-Lite was intended to be the final release from the CSRG. Its introduction was delayed, however, because of legal battles with USL. Once the legal differences were resolved, 4.4BSD-Lite was released in 1994, fully unencumbered, so no UNIX source license was needed to receive it. The CSRG followed this with a bug-fix release in 1995. This release, 4.4BSD-Lite, release 2, was the final version of BSD from the CSRG. (This version of BSD is described in the book by McKusick et al. [1996].)

The UNIX system development done at Berkeley started with PDP-11s, then moved to the VAX minicomputer, and then to other so-called workstations. During the early 1990s, support was provided to Berkeley for the popular 80386-based personal computers, leading to what is called 386BSD. This was done by Bill Jolitz and was documented in a series of monthly articles in Dr. Dobb's Journal throughout 1991. Much of this code appears in the BSD Networking Software, Release 2.0.

2.3.3. FreeBSD

FreeBSD is based on the 4.4BSD-Lite operating system. The FreeBSD project was formed to carry on the BSD line after the Computer Systems Research Group at the University of California at Berkeley decided to end its work on the BSD versions of the UNIX operating system, and the 386BSD project seemed to be neglected for too long.

All software produced by the FreeBSD project is freely available in both binary and source forms. The FreeBSD 5.2.1 operating system was one of the four used to test the examples in this book.

Several other BSD-based free operating systems are available. The NetBSD project (http://www.netbsd.org) is similar to the FreeBSD project, with an emphasis on portability between hardware platforms. The OpenBSD project (http://www.openbsd.org) is similar to FreeBSD but with an emphasis on security.

2.3.4. Linux

Linux is an operating system that provides a rich UNIX programming environment, and is freely available under the GNU General Public License. The popularity of Linux is something of a phenomenon in the computer industry. Linux is distinguished by often being the first operating system to support new hardware.

Linux was created in 1991 by Linus Torvalds as a replacement for MINIX. A grass-roots effort then sprang up, whereby many developers across the world volunteered their time to use and enhance it.

The Mandrake 9.2 distribution of Linux was one of the operating systems used to test the examples in this book. That distribution uses the 2.4.22 version of the Linux operating system kernel.

2.3.5. Mac OS X

Mac OS X is based on entirely different technology than prior versions. The core operating system is called "Darwin," and is based on a combination of the Mach kernel (Accetta et al. [1986]) and the FreeBSD operating system. Darwin is managed as an open source project, similar to FreeBSD and Linux.

Mac OS X version 10.3 (Darwin 7.4.0) was used as one of the operating systems to test the examples in this book.

2.3.6. Solaris

Solaris is the version of the UNIX System developed by Sun Microsystems. It is based on System V Release 4, with more than ten years of enhancements from the engineers at Sun Microsystems. It is the only commercially successful SVR4 descendant, and is formally certified to be a UNIX system. (For more information on UNIX certification, see http://www.opengroup.org/certification/idx/unix.html.)

The Solaris 9 UNIX system was one of the operating systems used to test the examples in this book.

2.3.7. Other UNIX Systems

Other versions of the UNIX system that have been certified in the past include

AIX, IBM's version of the UNIX System
HP-UX, Hewlett-Packard's version of the UNIX System
IRIX, the UNIX System version shipped by Silicon Graphics
UnixWare, the UNIX System descended from SVR4 and currently sold by SCO

2.4. Relationship of Standards and Implementations

The standards that we've mentioned define a subset of any actual system. The focus of this book is on four real systems: FreeBSD 5.2.1, Linux 2.4.22, Mac OS X 10.3, and Solaris 9. Although only Solaris can call itself a UNIX system, all four provide a UNIX programming environment. Because all four are POSIX compliant to varying degrees, we will also concentrate on the features that are required by the POSIX.1 standard, noting any differences between POSIX and the actual implementations of these four systems. Those features and routines that are specific to only a particular implementation are clearly marked. As SUSv3 is a superset of POSIX.1, we'll also note any features that are part of SUSv3 but not part of POSIX.1.

Be aware that the implementations provide backward compatibility for features in earlier releases, such as SVR3.2 and 4.3BSD. For example, Solaris supports both the POSIX.1 specification for nonblocking I/O (O_NONBLOCK) and the traditional System V method (O_NDELAY). In this text, we'll use only the POSIX.1 feature, although we'll mention the nonstandard feature that it replaces. Similarly, both SVR3.2 and 4.3BSD provided reliable signals in a way that differs from the POSIX.1 standard. In Chapter 10 we describe only the POSIX.1 signal mechanism.

2.5. Limits

The implementations define many magic numbers and constants. Many of these have been hard coded into programs or were determined using ad hoc techniques. With the various standardization efforts that we've described, more portable methods are now provided to determine these magic numbers and implementation-defined limits, greatly aiding the portability of our software.

Two types of limits are needed:

Compile-time limits (e.g., what's the largest value of a short integer?)
Runtime limits (e.g., how many characters in a filename?)

Compile-time limits can be defined in headers that any program can include at compile time. But runtime limits require the process to call a function to obtain the value of the limit.

Additionally, some limits can be fixed on a given implementation (and could therefore be defined statically in a header) yet vary on another implementation and would require a runtime function call. An example of this type of limit is the maximum number of characters in a filename. Before SVR4, System V historically allowed only 14 characters in a filename, whereas BSD-derived systems increased this number to 255. Most UNIX System implementations these days support multiple file system types, and each type has its own limit. This is the case of a runtime limit that depends on where in the file system the file in question is located. A filename in the root file system, for example, could have a 14-character limit, whereas a filename in another file system could have a 255-character limit.

To solve these problems, three types of limits are provided:

Compile-time limits (headers)
Runtime limits that are not associated with a file or directory (the sysconf function)
Runtime limits that are associated with a file or a directory (the pathconf and fpathconf functions)

To further confuse things, if a particular runtime limit does not vary on a given system, it can be defined statically in a header. If it is not defined in a header, however, the application must call one of the three conf functions (which we describe shortly) to determine its value at runtime.

2.5.1. ISO C Limits

All the limits defined by ISO C are compile-time limits. Figure 2.6 shows the limits from the C standard that are defined in the file <limits.h>. These constants are always defined in the header and don't change in a given system. The third column shows the minimum acceptable values from the ISO C standard. This allows for a system with 16-bit integers using one's-complement arithmetic. The fourth column shows the values from a Linux system with 32-bit integers using two's-complement arithmetic. Note that none of the unsigned data types has a minimum value, as this value must be 0 for an unsigned data type. On a 64-bit system, the values for long integer maximums match the maximum values for long long integers.

Figure 2.6. Sizes of integral values from <limits.h>

| Name | Description | Minimum acceptable value | Typical value |
|---|---|---|---|
| CHAR_BIT | bits in a char | 8 | 8 |
| CHAR_MAX | max value of char | (see later) | 127 |
| CHAR_MIN | min value of char | (see later) | -128 |
| SCHAR_MAX | max value of signed char | 127 | 127 |
| SCHAR_MIN | min value of signed char | -127 | -128 |
| UCHAR_MAX | max value of unsigned char | 255 | 255 |
| INT_MAX | max value of int | 32,767 | 2,147,483,647 |
| INT_MIN | min value of int | -32,767 | -2,147,483,648 |
| UINT_MAX | max value of unsigned int | 65,535 | 4,294,967,295 |
| SHRT_MIN | min value of short | -32,767 | -32,768 |
| SHRT_MAX | max value of short | 32,767 | 32,767 |
| USHRT_MAX | max value of unsigned short | 65,535 | 65,535 |
| LONG_MAX | max value of long | 2,147,483,647 | 2,147,483,647 |
| LONG_MIN | min value of long | -2,147,483,647 | -2,147,483,648 |
| ULONG_MAX | max value of unsigned long | 4,294,967,295 | 4,294,967,295 |
| LLONG_MAX | max value of long long | 9,223,372,036,854,775,807 | 9,223,372,036,854,775,807 |
| LLONG_MIN | min value of long long | -9,223,372,036,854,775,807 | -9,223,372,036,854,775,808 |
| ULLONG_MAX | max value of unsigned long long | 18,446,744,073,709,551,615 | 18,446,744,073,709,551,615 |
| MB_LEN_MAX | max number of bytes in a multibyte character constant | 1 | 16 |

One difference that we will encounter is whether a system provides signed or unsigned character values. From the fourth column in Figure 2.6, we see that this particular system uses signed characters. We see that CHAR_MIN equals SCHAR_MIN and that CHAR_MAX equals SCHAR_MAX. If the system uses unsigned characters, we would have CHAR_MIN equal to 0 and CHAR_MAX equal to UCHAR_MAX.

The floating-point data types in the header <float.h> have a similar set of definitions. Anyone doing serious floating-point work should examine this file.

Another ISO C constant that we'll encounter is FOPEN_MAX, the minimum number of standard I/O streams that the implementation guarantees can be open at once. This value is in the <stdio.h> header, and its minimum value is 8. The POSIX.1 value STREAM_MAX, if defined, must have the same value as FOPEN_MAX.

ISO C also defines the constant TMP_MAX in <stdio.h>. It is the maximum number of unique filenames generated by the tmpnam function. We'll have more to say about this constant in Section 5.13.

In Figure 2.7, we show the values of FOPEN_MAX and TMP_MAX on the four platforms we discuss in this book.

Figure 2.7. ISO limits on various platforms

| Limit | FreeBSD 5.2.1 | Linux 2.4.22 | Mac OS X 10.3 | Solaris 9 |
|---|---|---|---|---|
| FOPEN_MAX | 20 | 16 | 20 | 20 |
| TMP_MAX | 308,915,776 | 238,328 | 308,915,776 | 17,576 |

ISO C also defines the constant FILENAME_MAX, but we avoid using it, because some operating system implementations historically have defined it to be too small to be of use.

2.5.2. POSIX Limits

POSIX.1 defines numerous constants that deal with implementation limits of the operating system. Unfortunately, this is one of the more confusing aspects of POSIX.1. Although POSIX.1 defines numerous limits and constants, we'll only concern ourselves with the ones that affect the base POSIX.1 interfaces. These limits and constants are divided into the following five categories:

Invariant minimum values: the 19 constants in Figure 2.8
Invariant value: SSIZE_MAX
Runtime increasable values: CHARCLASS_NAME_MAX, COLL_WEIGHTS_MAX, LINE_MAX, NGROUPS_MAX, and RE_DUP_MAX
Runtime invariant values, possibly indeterminate: ARG_MAX, CHILD_MAX, HOST_NAME_MAX, LOGIN_NAME_MAX, OPEN_MAX, PAGESIZE, RE_DUP_MAX, STREAM_MAX, SYMLOOP_MAX, TTY_NAME_MAX, and TZNAME_MAX
Pathname variable values, possibly indeterminate: FILESIZEBITS, LINK_MAX, MAX_CANON, MAX_INPUT, NAME_MAX, PATH_MAX, PIPE_BUF, and SYMLINK_MAX

Figure 2.8. POSIX.1 invariant minimum values from <limits.h>

| Name | Description: minimum acceptable value for | Value |
|---|---|---|
| _POSIX_ARG_MAX | length of arguments to exec functions | 4,096 |
| _POSIX_CHILD_MAX | number of child processes per real user ID | 25 |
| _POSIX_HOST_NAME_MAX | maximum length of a host name as returned by gethostname | 255 |
| _POSIX_LINK_MAX | number of links to a file | 8 |
| _POSIX_LOGIN_NAME_MAX | maximum length of a login name | 9 |
| _POSIX_MAX_CANON | number of bytes on a terminal's canonical input queue | 255 |
| _POSIX_MAX_INPUT | space available on a terminal's input queue | 255 |
| _POSIX_NAME_MAX | number of bytes in a filename, not including the terminating null | 14 |
| _POSIX_NGROUPS_MAX | number of simultaneous supplementary group IDs per process | 8 |
| _POSIX_OPEN_MAX | number of open files per process | 20 |
| _POSIX_PATH_MAX | number of bytes in a pathname, including the terminating null | 256 |
| _POSIX_PIPE_BUF | number of bytes that can be written atomically to a pipe | 512 |
| _POSIX_RE_DUP_MAX | number of repeated occurrences of a basic regular expression permitted by the regexec and regcomp functions when using the interval notation \{m,n\} | 255 |
| _POSIX_SSIZE_MAX | value that can be stored in ssize_t object | 32,767 |
| _POSIX_STREAM_MAX | number of standard I/O streams a process can have open at once | 8 |
| _POSIX_SYMLINK_MAX | number of bytes in a symbolic link | 255 |
| _POSIX_SYMLOOP_MAX | number of symbolic links that can be traversed during pathname resolution | 8 |
| _POSIX_TTY_NAME_MAX | length of a terminal device name, including the terminating null | 9 |
| _POSIX_TZNAME_MAX | number of bytes for the name of a time zone | 6 |

Of these 44 limits and constants, some may be defined in <limits.h>, and others may or may not be defined, depending on certain conditions. We describe the limits and constants that may or may not be defined in Section 2.5.4, when we describe the sysconf, pathconf, and fpathconf functions. The 19 invariant minimum values are shown in Figure 2.8.

These values are invariant; they do not change from one system to another. They specify the most restrictive values for these features. A conforming POSIX.1 implementation must provide values that are at least this large. This is why they are called minimums, although their names all contain MAX. Also, to ensure portability, a strictly-conforming application must not require a larger value. We describe what each of these constants refers to as we proceed through the text.

A strictly-conforming POSIX application is different from an application that is merely POSIX conforming. A POSIX-conforming application uses only interfaces defined in IEEE Standard 1003.1-2001. A strictly-conforming application is a POSIX-conforming application that does not rely on any undefined behavior, does not use any obsolescent interfaces, and does not require values of constants larger than the minimums shown in Figure 2.8.

Unfortunately, some of these invariant minimum values are too small to be of practical use. For example, most UNIX systems today provide far more than 20 open files per process. Also, the minimum limit of 256 for _POSIX_PATH_MAX is too small. Pathnames can exceed this limit. This means that we can't use the two constants _POSIX_OPEN_MAX and _POSIX_PATH_MAX as array sizes at compile time.

Each of the 19 invariant minimum values in Figure 2.8 has an associated implementation value whose name is formed by removing the _POSIX_ prefix from the name in Figure 2.8. The names without the leading _POSIX_ were intended to be the actual values that a given implementation supports. (These 19 implementation values are items 2 through 5 from our list earlier in this section: the invariant value, the runtime increasable values, the runtime invariant values, and the pathname variable values.) The problem is that not all of the 19 implementation values are guaranteed to be defined in the <limits.h> header.

For example, a particular value may not be included in the header if its actual value for a given process depends on the amount of memory on the system. If the values are not defined in the header, we can't use them as array bounds at compile time. So, POSIX.1 decided to provide three runtime functions for us to call (sysconf, pathconf, and fpathconf) to determine the actual implementation value at runtime. There is still a problem, however, because some of the values are defined by POSIX.1 as being possibly "indeterminate" (logically infinite). This means that the value has no practical upper bound. On Linux, for example, the number of iovec structures you can use with readv or writev is limited only by the amount of memory on the system. Thus, IOV_MAX is considered indeterminate on Linux. We'll return to this problem of indeterminate runtime limits in Section 2.5.5.

2.5.3. XSI Limits

The XSI also defines constants that deal with implementation limits. They include:

Invariant minimum values: the ten constants in Figure 2.9
Numerical limits: LONG_BIT and WORD_BIT
Runtime invariant values, possibly indeterminate: ATEXIT_MAX, IOV_MAX, and PAGE_SIZE

Figure 2.9. XSI invariant minimum values from <limits.h>

| Name | Description | Minimum acceptable value | Typical value |
|---|---|---|---|
| NL_ARGMAX | maximum value of digit in calls to printf and scanf | 9 | 9 |
| NL_LANGMAX | maximum number of bytes in LANG environment variable | 14 | 14 |
| NL_MSGMAX | maximum message number | 32,767 | 32,767 |
| NL_NMAX | maximum number of bytes in N-to-1 mapping characters | (none specified) | 1 |
| NL_SETMAX | maximum set number | 255 | 255 |
| NL_TEXTMAX | maximum number of bytes in a message string | _POSIX2_LINE_MAX | 2,048 |
| NZERO | default process priority | 20 | 20 |
| _XOPEN_IOV_MAX | maximum number of iovec structures that can be used with readv or writev | 16 | 16 |
| _XOPEN_NAME_MAX | number of bytes in a filename | 255 | 255 |
| _XOPEN_PATH_MAX | number of bytes in a pathname | 1,024 | 1,024 |

The invariant minimum values are listed in Figure 2.9. Many of these values deal with message catalogs. The last two illustrate the situation in which the POSIX.1 minimums were too small (presumably to allow for embedded POSIX.1 implementations), so the Single UNIX Specification added symbols with larger minimum values for XSI-conforming systems.

2.5.4. `sysconf`, `pathconf`, and `fpathconf` Functions

We've listed various minimum values that an implementation must support, but how do we find out the limits that a particular system actually supports? As we mentioned earlier, some of these limits might be available at compile time; others must be determined at runtime. We've also mentioned that some don't change in a given system, whereas others can change because they are associated with a file or directory. The runtime limits are obtained by calling one of the following three functions.

    #include <unistd.h>

    long sysconf(int name);
    long pathconf(const char *pathname, int name);
    long fpathconf(int filedes, int name);

All three return: corresponding value if OK, -1 on error (see later)

The difference between the last two functions is that one takes a pathname as its argument and the other takes a file descriptor argument.

Figure 2.10 lists the name arguments that sysconf uses to identify system limits. Constants beginning with _SC_ are used as arguments to sysconf to identify the runtime limit. Figure 2.11 lists the name arguments that are used by pathconf and fpathconf to identify system limits. Constants beginning with _PC_ are used as arguments to pathconf and fpathconf to identify the runtime limit.

Figure 2.10. Limits and name arguments to sysconf

| Name of limit | Description | name argument |
|---|---|---|
| ARG_MAX | maximum length, in bytes, of arguments to the exec functions | _SC_ARG_MAX |
| ATEXIT_MAX | maximum number of functions that can be registered with the atexit function | _SC_ATEXIT_MAX |
| CHILD_MAX | maximum number of processes per real user ID | _SC_CHILD_MAX |
| clock ticks/second | number of clock ticks per second | _SC_CLK_TCK |
| COLL_WEIGHTS_MAX | maximum number of weights that can be assigned to an entry of the LC_COLLATE order keyword in the locale definition file | _SC_COLL_WEIGHTS_MAX |
| HOST_NAME_MAX | maximum length of a host name as returned by gethostname | _SC_HOST_NAME_MAX |
| IOV_MAX | maximum number of iovec structures that can be used with readv or writev | _SC_IOV_MAX |
| LINE_MAX | maximum length of a utility's input line | _SC_LINE_MAX |
| LOGIN_NAME_MAX | maximum length of a login name | _SC_LOGIN_NAME_MAX |
| NGROUPS_MAX | maximum number of simultaneous supplementary process group IDs per process | _SC_NGROUPS_MAX |
| OPEN_MAX | maximum number of open files per process | _SC_OPEN_MAX |
| PAGESIZE | system memory page size, in bytes | _SC_PAGESIZE |
| PAGE_SIZE | system memory page size, in bytes | _SC_PAGE_SIZE |
| RE_DUP_MAX | number of repeated occurrences of a basic regular expression permitted by the regexec and regcomp functions when using the interval notation \{m,n\} | _SC_RE_DUP_MAX |
| STREAM_MAX | maximum number of standard I/O streams per process at any given time; if defined, it must have the same value as FOPEN_MAX | _SC_STREAM_MAX |
| SYMLOOP_MAX | number of symbolic links that can be traversed during pathname resolution | _SC_SYMLOOP_MAX |
| TTY_NAME_MAX | length of a terminal device name, including the terminating null | _SC_TTY_NAME_MAX |
| TZNAME_MAX | maximum number of bytes for the name of a time zone | _SC_TZNAME_MAX |

Figure 2.11. Limits and name arguments to pathconf and fpathconf

| Name of limit | Description | name argument |
|---|---|---|
| FILESIZEBITS | minimum number of bits needed to represent, as a signed integer value, the maximum size of a regular file allowed in the specified directory | _PC_FILESIZEBITS |
| LINK_MAX | maximum value of a file's link count | _PC_LINK_MAX |
| MAX_CANON | maximum number of bytes on a terminal's canonical input queue | _PC_MAX_CANON |
| MAX_INPUT | number of bytes for which space is available on terminal's input queue | _PC_MAX_INPUT |
| NAME_MAX | maximum number of bytes in a filename (does not include a null at end) | _PC_NAME_MAX |
| PATH_MAX | maximum number of bytes in a relative pathname, including the terminating null | _PC_PATH_MAX |
| PIPE_BUF | maximum number of bytes that can be written atomically to a pipe | _PC_PIPE_BUF |
| SYMLINK_MAX | number of bytes in a symbolic link | _PC_SYMLINK_MAX |

We need to look in more detail at the different return values from these three functions.

All three functions return -1 and set errno to EINVAL if the name isn't one of the appropriate constants. The third column in Figures 2.10 and 2.11 lists the limit constants we'll deal with throughout the rest of this book.
Some names can return either the value of the variable (a return value ≥ 0) or an indication that the value is indeterminate. An indeterminate value is indicated by returning -1 and not changing the value of errno.
The value returned for _SC_CLK_TCK is the number of clock ticks per second, for use with the return values from the times function (Section 8.16).

There are some restrictions for the pathname argument to pathconf and the filedes argument to fpathconf. If any of these restrictions isn't met, the results are undefined.

The referenced file for _PC_MAX_CANON and _PC_MAX_INPUT must be a terminal file.
The referenced file for _PC_LINK_MAX can be either a file or a directory. If the referenced file is a directory, the return value applies to the directory itself, not to the filename entries within the directory.
The referenced file for _PC_FILESIZEBITS and _PC_NAME_MAX must be a directory. The return value applies to filenames within the directory.
The referenced file for _PC_PATH_MAX must be a directory. The value returned is the maximum length of a relative pathname when the specified directory is the working directory. (Unfortunately, this isn't the real maximum length of an absolute pathname, which is what we want to know. We'll return to this problem in Section 2.5.5.)
The referenced file for _PC_PIPE_BUF must be a pipe, FIFO, or directory. In the first two cases (pipe or FIFO) the return value is the limit for the referenced pipe or FIFO. For the other case (a directory) the return value is the limit for any FIFO created in that directory.
The referenced file for _PC_SYMLINK_MAX must be a directory. The value returned is the maximum length of the string that a symbolic link in that directory can contain.

Example

The awk(1) program shown in Figure 2.12 builds a C program that prints the value of each pathconf and sysconf symbol.

The awk program reads two input files (pathconf.sym and sysconf.sym) that contain lists of the limit name and symbol, separated by tabs. Not all symbols are defined on every platform, so the awk program surrounds each call to pathconf and sysconf with the necessary #ifdef statements.

For example, the awk program transforms a line in the input file that looks like

    NAME_MAX      _PC_NAME_MAX

into the following C code:

 #ifdef NAME_MAX
      printf("NAME_MAX defined to be %d\n", NAME_MAX+0);
 #else
      printf("no symbol for NAME_MAX\n");
 #endif
 #ifdef _PC_NAME_MAX
      pr_pathconf("NAME_MAX =", argv[1], _PC_NAME_MAX);
 #else
      printf("no symbol for _PC_NAME_MAX\n");
 #endif

The program in Figure 2.13, generated by the awk program, prints all these limits, handling the case in which a limit is not defined.

Figure 2.14 summarizes results from Figure 2.13 for the four systems we discuss in this book. The entry "no symbol" means that the system doesn't provide a corresponding _SC or _PC symbol to query the value of the constant. Thus, the limit is undefined in this case. In contrast, the entry "unsupported" means that the symbol is defined by the system but unrecognized by the sysconf or pathconf functions. The entry "no limit" means that the system defines no limit for the constant, but this doesn't mean that the limit is infinite.

We'll see in Section 4.14 that UFS is the SVR4 implementation of the Berkeley fast file system. PCFS is the MS-DOS FAT file system implementation for Solaris.

Figure 2.12. Build C program to print all supported configuration limits

 BEGIN   {
     printf("#include \"apue.h\"\n")
     printf("#include <errno.h>\n")
     printf("#include <limits.h>\n")
     printf("\n")
     printf("static void pr_sysconf(char *, int);\n")
     printf("static void pr_pathconf(char *, char *, int);\n")
     printf("\n")
     printf("int\n")
     printf("main(int argc, char *argv[])\n")
     printf("{\n")
     printf("\tif (argc != 2)\n")
     printf("\t\terr_quit(\"usage: a.out <dirname>\");\n\n")
     FS="\t+"
     while ((getline < "sysconf.sym") > 0) {
         printf("#ifdef %s\n", $1)
         printf("\tprintf(\"%s defined to be %%d\\n\", %s+0);\n", $1, $1)
         printf("#else\n")
         printf("\tprintf(\"no symbol for %s\\n\");\n", $1)
         printf("#endif\n")
         printf("#ifdef %s\n", $2)
         printf("\tpr_sysconf(\"%s =\", %s);\n", $1, $2)
         printf("#else\n")
         printf("\tprintf(\"no symbol for %s\\n\");\n", $2)
         printf("#endif\n")
     }
     close("sysconf.sym")
     while ((getline < "pathconf.sym") > 0) {
         printf("#ifdef %s\n", $1)
         printf("\tprintf(\"%s defined to be %%d\\n\", %s+0);\n", $1, $1)
         printf("#else\n")
         printf("\tprintf(\"no symbol for %s\\n\");\n", $1)
         printf("#endif\n")
         printf("#ifdef %s\n", $2)
         printf("\tpr_pathconf(\"%s =\", argv[1], %s);\n", $1, $2)
         printf("#else\n")
         printf("\tprintf(\"no symbol for %s\\n\");\n", $2)
         printf("#endif\n")
     }
     close("pathconf.sym")
     exit
 }
 END {
     printf("\texit(0);\n")
     printf("}\n\n")
     printf("static void\n")
     printf("pr_sysconf(char *mesg, int name)\n")
     printf("{\n")
     printf("\tlong val;\n\n")
     printf("\tfputs(mesg, stdout);\n")
     printf("\terrno = 0;\n")
     printf("\tif ((val = sysconf(name)) < 0) {\n")
     printf("\t\tif (errno != 0) {\n")
     printf("\t\t\tif (errno == EINVAL)\n")
     printf("\t\t\t\tfputs(\" (not supported)\\n\", stdout);\n")
     printf("\t\t\telse\n")
     printf("\t\t\t\terr_sys(\"sysconf error\");\n")
     printf("\t\t} else {\n")
     printf("\t\t\tfputs(\" (no limit)\\n\", stdout);\n")
     printf("\t\t}\n")
     printf("\t} else {\n")
     printf("\t\tprintf(\" %%ld\\n\", val);\n")
     printf("\t}\n")
     printf("}\n\n")
     printf("static void\n")
     printf("pr_pathconf(char *mesg, char *path, int name)\n")
     printf("{\n")
     printf("\tlong val;\n")
     printf("\n")
     printf("\tfputs(mesg, stdout);\n")
     printf("\terrno = 0;\n")
     printf("\tif ((val = pathconf(path, name)) < 0) {\n")
     printf("\t\tif (errno != 0) {\n")
     printf("\t\t\tif (errno == EINVAL)\n")
     printf("\t\t\t\tfputs(\" (not supported)\\n\", stdout);\n")
     printf("\t\t\telse\n")
     printf("\t\t\t\terr_sys(\"pathconf error, path = %%s\", path);\n")
     printf("\t\t} else {\n")
     printf("\t\t\tfputs(\" (no limit)\\n\", stdout);\n")
     printf("\t\t}\n")
     printf("\t} else {\n")
     printf("\t\tprintf(\" %%ld\\n\", val);\n")
     printf("\t}\n")
     printf("}\n")
 }

Figure 2.13. Print all possible sysconf and pathconf values

 #include "apue.h"
 #include <errno.h>
 #include <limits.h>
 
 static void pr_sysconf(char *, int);
 static void pr_pathconf(char *, char *, int);
 
 int
 main(int argc, char *argv[])
 {
     if (argc != 2)
         err_quit("usage: a.out <dirname>");
 
 #ifdef ARG_MAX
     printf("ARG_MAX defined to be %d\n", ARG_MAX+0);
 #else
     printf("no symbol for ARG_MAX\n");
 #endif
 #ifdef _SC_ARG_MAX
     pr_sysconf("ARG_MAX =", _SC_ARG_MAX);
 #else
     printf("no symbol for _SC_ARG_MAX\n");
 #endif
 
 /* similar processing for all the rest of the sysconf symbols... */
 
 #ifdef MAX_CANON
     printf("MAX_CANON defined to be %d\n", MAX_CANON+0);
 #else
     printf("no symbol for MAX_CANON\n");
 #endif
 #ifdef _PC_MAX_CANON
     pr_pathconf("MAX_CANON =", argv[1], _PC_MAX_CANON);
 #else
     printf("no symbol for _PC_MAX_CANON\n");
 #endif
 
 /* similar processing for all the rest of the pathconf symbols... */
 
     exit(0);
 }
 
 static void
 pr_sysconf(char *mesg, int name)
 {
     long    val;
 
     fputs(mesg, stdout);
     errno = 0;
     if ((val = sysconf(name)) < 0) {
         if (errno != 0) {
             if (errno == EINVAL)
                 fputs(" (not supported)\n", stdout);
             else
                 err_sys("sysconf error");
         } else {
             fputs(" (no limit)\n", stdout);
         }
     } else {
         printf(" %ld\n", val);
     }
 }
 
 static void
 pr_pathconf(char *mesg, char *path, int name)
 {
     long    val;
 
     fputs(mesg, stdout);
     errno = 0;
     if ((val = pathconf(path, name)) < 0) {
         if (errno != 0) {
             if (errno == EINVAL)
                 fputs(" (not supported)\n", stdout);
             else
                 err_sys("pathconf error, path = %s", path);
         } else {
             fputs(" (no limit)\n", stdout);
         }
     } else {
         printf(" %ld\n", val);
     }
 }

Figure 2.14. Examples of configuration limits

    Limit               FreeBSD 5.2.1  Linux 2.4.22   Mac OS X 10.3  Solaris 9 UFS  Solaris 9 PCFS
    ARG_MAX             65,536         131,072        262,144        1,048,320      1,048,320
    ATEXIT_MAX          32             2,147,483,647  no symbol      no limit       no limit
    CHARCLASS_NAME_MAX  no symbol      2,048          no symbol      14             14
    CHILD_MAX           867            999            100            7,877          7,877
    clock ticks/second  128            100            100            100            100
    COLL_WEIGHTS_MAX    0              255            2              10             10
    FILESIZEBITS        unsupported    64             no symbol      41             unsupported
    HOST_NAME_MAX       255            unsupported    no symbol      no symbol      no symbol
    IOV_MAX             1,024          no limit       no symbol      16             16
    LINE_MAX            2,048          2,048          2,048          2,048          2,048
    LINK_MAX            32,767         32,000         32,767         32,767         1
    LOGIN_NAME_MAX      17             256            no symbol      9              9
    MAX_CANON           255            255            255            256            256
    MAX_INPUT           255            255            255            512            512
    NAME_MAX            255            255            765            255            8
    NGROUPS_MAX         16             32             16             16             16
    OPEN_MAX            1,735          1,024          256            256            256
    PAGESIZE            4,096          4,096          4,096          8,192          8,192
    PAGE_SIZE           4,096          4,096          no symbol      8,192          8,192
    PATH_MAX            1,024          4,096          1,024          1,024          1,024
    PIPE_BUF            512            4,096          512            5,120          5,120
    RE_DUP_MAX          255            32,767         255            255            255
    STREAM_MAX          1,735          16             20             256            256
    SYMLINK_MAX         unsupported    no limit       no symbol      no symbol      no symbol
    SYMLOOP_MAX         32             no limit       no symbol      no symbol      no symbol
    TTY_NAME_MAX        255            32             no symbol      128            128
    TZNAME_MAX          255            6              255            no limit       no limit

2.5.5. Indeterminate Runtime Limits

We mentioned that some of the limits can be indeterminate. The problem we encounter is that if these limits aren't defined in the <limits.h> header, we can't use them at compile time. But they might not be defined at runtime if their value is indeterminate! Let's look at two specific cases: allocating storage for a pathname and determining the number of file descriptors.

Pathnames

Many programs need to allocate storage for a pathname. Typically, the storage has been allocated at compile time, and various magic numbers (few of which are the correct value) have been used by different programs as the array size: 256, 512, 1024, or the standard I/O constant BUFSIZ. The 4.3BSD constant MAXPATHLEN in the header <sys/param.h> is the correct value, but many 4.3BSD applications didn't use it.

POSIX.1 tries to help with the PATH_MAX value, but if this value is indeterminate, we're still out of luck. Figure 2.15 shows a function that we'll use throughout this text to allocate storage dynamically for a pathname.

Figure 2.15. Dynamically allocate space for a pathname

 #include "apue.h"
 #include <errno.h>
 #include <limits.h>
 
 #ifdef  PATH_MAX
 static int  pathmax = PATH_MAX;
 #else
 static int  pathmax = 0;
 #endif
 
 #define SUSV3   200112L
 
 static long posix_version = 0;
 
 /* If PATH_MAX is indeterminate, no guarantee this is adequate */
 #define PATH_MAX_GUESS  1024
 
 char *
 path_alloc(int *sizep) /* also return allocated size, if nonnull */
 {
     char    *ptr;
     int size;
 
     if (posix_version == 0)
         posix_version = sysconf(_SC_VERSION);
 
     if (pathmax == 0) {     /* first time through */
         errno = 0;
         if ((pathmax = pathconf("/", _PC_PATH_MAX)) < 0) {
             if (errno == 0)
                 pathmax = PATH_MAX_GUESS; /* it's indeterminate */
             else
                 err_sys("pathconf error for _PC_PATH_MAX");
         } else {
             pathmax++;      /* add one since it's relative to root */
         }
     }
     if (posix_version < SUSV3)
         size = pathmax + 1;
     else
         size = pathmax;
 
     if ((ptr = malloc(size)) == NULL)
         err_sys("malloc error for pathname");
 
     if (sizep != NULL)
         *sizep = size;
     return(ptr);
 }

If the constant PATH_MAX is defined in <limits.h>, then we're all set. If it's not, we need to call pathconf. The value returned by pathconf is the maximum size of a relative pathname when the first argument is the working directory, so we specify the root as the first argument and add 1 to the result. If pathconf indicates that PATH_MAX is indeterminate, we have to punt and just guess a value.

Standards prior to SUSv3 were unclear as to whether or not PATH_MAX included a null byte at the end of the pathname. If the operating system implementation conforms to one of these prior versions, we need to add 1 to the amount of memory we allocate for a pathname, just to be on the safe side.

The correct way to handle the case of an indeterminate result depends on how the allocated space is being used. If we were allocating space for a call to getcwd, for example (to return the absolute pathname of the current working directory; see Section 4.22), and if the allocated space is too small, an error is returned and errno is set to ERANGE. We could then increase the allocated space by calling realloc (see Section 7.8 and Exercise 4.16) and try again. We could keep doing this until the call to getcwd succeeded.
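
The retry strategy just described can be sketched as follows. The function name getcwd_alloc and its initial-size parameter are hypothetical and are not part of this text's library; the sketch simply doubles the buffer each time getcwd fails with ERANGE.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the current working directory in
 * malloc'ed storage, doubling the buffer while getcwd reports ERANGE.
 * The caller frees the result. Returns NULL on any other error.
 */
char *
getcwd_alloc(size_t initial)
{
    size_t  size = initial;
    char    *buf = NULL;
    char    *tmp;

    assert(initial > 0);            /* a size of 0 makes getcwd fail with EINVAL */
    for ( ; ; ) {
        if ((tmp = realloc(buf, size)) == NULL) {
            free(buf);              /* realloc failed: old block is still ours */
            return(NULL);
        }
        buf = tmp;
        if (getcwd(buf, size) != NULL)
            return(buf);            /* success */
        if (errno != ERANGE) {      /* a real error, not "buffer too small" */
            free(buf);
            return(NULL);
        }
        size *= 2;                  /* too small: grow and try again */
    }
}
```

Starting from the size returned by path_alloc would usually make the first call succeed; starting from a deliberately small value, as in getcwd_alloc(1), forces the loop to grow the buffer.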

Maximum Number of Open Files

A common sequence of code in a daemon process (a process that runs in the background, not connected to a terminal) is one that closes all open files. Some programs have the following code sequence, assuming the constant NOFILE was defined in the <sys/param.h> header:

    #include  <sys/param.h>
 
    for (i = 0; i < NOFILE; i++)
        close(i);

Other programs use the constant _NFILE that some versions of <stdio.h> provide as the upper limit. Some hard code the upper limit as 20.

We would hope to use the POSIX.1 value OPEN_MAX to determine this value portably, but if the value is indeterminate, we still have a problem. If we wrote the following and if OPEN_MAX was indeterminate, the loop would never execute, since sysconf would return -1:

    #include  <unistd.h>
 
    for (i = 0; i < sysconf(_SC_OPEN_MAX); i++)
        close(i);

Our best option in this case is just to close all descriptors up to some arbitrary limit, say 256. As with our pathname example, this is not guaranteed to work for all cases, but it's the best we can do. We show this technique in Figure 2.16.

Figure 2.16. Determine the number of file descriptors

 #include "apue.h"
 #include <errno.h>
 #include <limits.h>
 
 #ifdef  OPEN_MAX
 static long openmax = OPEN_MAX;
 #else
 static long openmax = 0;
 #endif
 
 /*
  * If OPEN_MAX is indeterminate, we're not
  * guaranteed that this is adequate.
  */
 #define OPEN_MAX_GUESS 256
 
 long
 open_max(void)
 {
     if (openmax == 0) {      /* first time through */
         errno = 0;
         if ((openmax = sysconf(_SC_OPEN_MAX)) < 0) {
            if (errno == 0)
                openmax = OPEN_MAX_GUESS;    /* it's indeterminate */
            else
                err_sys("sysconf error for _SC_OPEN_MAX");
         }
     }
 
     return(openmax);
 }

We might be tempted to call close until we get an error return, but the error return from close (EBADF) doesn't distinguish between an invalid descriptor and a descriptor that wasn't open. If we tried this technique and descriptor 9 was not open but descriptor 10 was, we would stop on 9 and never close 10. The dup function (Section 3.12) does return a specific error when OPEN_MAX is exceeded, but duplicating a descriptor a couple of hundred times is an extreme way to determine this value.

Some implementations will return LONG_MAX for limit values that are effectively unlimited. Such is the case with the Linux limit for ATEXIT_MAX (see Figure 2.14). This isn't a good idea, because it can cause programs to behave badly.

For example, we can use the ulimit command built into the Bourne-again shell to change the maximum number of files our processes can have open at one time. This generally requires special (superuser) privileges if the limit is to be effectively unlimited. But once set to infinite, sysconf will report LONG_MAX as the limit for OPEN_MAX. A program that relies on this value as the upper bound of file descriptors to close as shown in Figure 2.16 will waste a lot of time trying to close 2,147,483,647 file descriptors, most of which aren't even in use.

Systems that support the XSI extensions in the Single UNIX Specification will provide the getrlimit(2) function (Section 7.11). It can be used to return the maximum number of descriptors that a process can have open. With it, we can detect that there is no configured upper bound to the number of open files our processes can open, so we can avoid this problem.
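
As a rough sketch of that approach, the function below reads the soft RLIMIT_NOFILE limit and falls back to a caller-supplied guess when the limit is unlimited or can't be read. The name open_max_rlimit and the guess parameter are assumptions made for this illustration; they are not part of Figure 2.16.

```c
#include <assert.h>
#include <limits.h>
#include <sys/resource.h>

/*
 * Hypothetical variant of open_max: use the soft RLIMIT_NOFILE limit.
 * If the limit is unlimited, can't be read, or doesn't fit in a long,
 * return the caller's guess instead.
 */
long
open_max_rlimit(long guess)
{
    struct rlimit   rl;

    assert(guess > 0);
    if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
        return(guess);              /* can't tell: use the guess */
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t)LONG_MAX)
        return(guess);              /* no configured upper bound */
    return((long)rl.rlim_cur);
}
```

A daemon could then loop from 0 up to open_max_rlimit(256) when closing descriptors, which avoids iterating over billions of descriptors when the limit has been set to infinite.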

The OPEN_MAX value is called runtime invariant by POSIX, meaning that its value should not change during the lifetime of a process. But on systems that support the XSI extensions, we can call the setrlimit(2) function (Section 7.11) to change this value for a running process. (This value can also be changed from the C shell with the limit command, and from the Bourne, Bourne-again, and Korn shells with the ulimit command.) If our system supports this functionality, we could change the function in Figure 2.16 to call sysconf every time it is called, not only the first time.

2.6. Options

We saw the list of POSIX.1 options in Figure 2.5 and discussed XSI option groups in Section 2.2.3. If we are to write portable applications that depend on any of these optionally-supported features, we need a portable way to determine whether an implementation supports a given option.

Just as with limits (Section 2.5), the Single UNIX Specification defines three ways to do this.

Compile-time options are defined in <unistd.h>.
Runtime options that are not associated with a file or a directory are identified with the sysconf function.
Runtime options that are associated with a file or a directory are discovered by calling either the pathconf or the fpathconf function.

The options include the symbols listed in the third column of Figure 2.5, as well as the symbols listed in Figures 2.17 and 2.18. If the symbolic constant is not defined, we must use sysconf, pathconf, or fpathconf to determine whether the option is supported. In this case, the name argument to the function is formed by replacing the _POSIX at the beginning of the symbol with _SC or _PC. For constants that begin with _XOPEN, the name argument is formed by prepending the string with _SC or _PC. For example, if the constant _POSIX_THREADS is undefined, we can call sysconf with the name argument set to _SC_THREADS to determine whether the platform supports the POSIX threads option. If the constant _XOPEN_UNIX is undefined, we can call sysconf with the name argument set to _SC_XOPEN_UNIX to determine whether the platform supports the XSI extensions.

Figure 2.17. Options and name arguments to sysconf

    Name of option              name argument               Description
    _POSIX_JOB_CONTROL          _SC_JOB_CONTROL             indicates whether the implementation supports job control
    _POSIX_READER_WRITER_LOCKS  _SC_READER_WRITER_LOCKS     indicates whether the implementation supports reader-writer locks
    _POSIX_SAVED_IDS            _SC_SAVED_IDS               indicates whether the implementation supports the saved set-user-ID and the saved set-group-ID
    _POSIX_SHELL                _SC_SHELL                   indicates whether the implementation supports the POSIX shell
    _POSIX_VERSION              _SC_VERSION                 indicates the POSIX.1 version
    _XOPEN_CRYPT                _SC_XOPEN_CRYPT             indicates whether the implementation supports the XSI encryption option group
    _XOPEN_LEGACY               _SC_XOPEN_LEGACY            indicates whether the implementation supports the XSI legacy option group
    _XOPEN_REALTIME             _SC_XOPEN_REALTIME          indicates whether the implementation supports the XSI real-time option group
    _XOPEN_REALTIME_THREADS     _SC_XOPEN_REALTIME_THREADS  indicates whether the implementation supports the XSI real-time threads option group
    _XOPEN_VERSION              _SC_XOPEN_VERSION           indicates the XSI version

Figure 2.18. Options and name arguments to pathconf and fpathconf

    Name of option           name argument         Description
    _POSIX_CHOWN_RESTRICTED  _PC_CHOWN_RESTRICTED  indicates whether use of chown is restricted
    _POSIX_NO_TRUNC          _PC_NO_TRUNC          indicates whether pathnames longer than NAME_MAX generate an error
    _POSIX_VDISABLE          _PC_VDISABLE          if defined, terminal special characters can be disabled with this value
    _POSIX_ASYNC_IO          _PC_ASYNC_IO          indicates whether asynchronous I/O can be used with the associated file
    _POSIX_PRIO_IO           _PC_PRIO_IO           indicates whether prioritized I/O can be used with the associated file
    _POSIX_SYNC_IO           _PC_SYNC_IO           indicates whether synchronized I/O can be used with the associated file

If the symbolic constant is defined by the platform, we have three possibilities.

If the symbolic constant is defined to have the value -1, then the corresponding option is unsupported by the platform.
If the symbolic constant is defined to be greater than zero, then the corresponding option is supported.
If the symbolic constant is defined to be equal to zero, then we must call sysconf, pathconf, or fpathconf to determine whether the option is supported.
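
The decision procedure for an option can be sketched in a few lines. Everything named here (THREADS_COMPILE_VALUE and option_supported) is hypothetical and introduced only for illustration. The sketch maps "symbol undefined" and "symbol defined as 0" to the same case, since both require a runtime query, and it uses the +0 trick from Figure 2.12 so that a symbol defined with no value doesn't break compilation.

```c
#include <unistd.h>

/*
 * Compile-time value of _POSIX_THREADS. 0 stands for "undefined, or
 * defined as 0": in both cases we have to ask at runtime.
 */
#ifdef _POSIX_THREADS
#define THREADS_COMPILE_VALUE   (_POSIX_THREADS + 0)    /* + 0 copes with an empty definition */
#else
#define THREADS_COMPILE_VALUE   0
#endif

/*
 * Hypothetical helper: return 1 if an option is supported, 0 if not.
 * compile_value is the option's symbolic constant (or 0, see above);
 * sc_name is the matching _SC name for sysconf.
 */
int
option_supported(long compile_value, int sc_name)
{
    if (compile_value > 0)
        return(1);                      /* supported, known at compile time */
    if (compile_value < 0)
        return(0);                      /* unsupported, known at compile time */
    return(sysconf(sc_name) > 0);       /* otherwise ask at runtime */
}
```

For POSIX threads, the call would be option_supported(THREADS_COMPILE_VALUE, _SC_THREADS). The same shape applies to pathconf and fpathconf options, with the extra pathname or descriptor argument.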

Figure 2.17 summarizes the options and their symbolic constants that can be used with sysconf, in addition to those listed in Figure 2.5.

The symbolic constants used with pathconf and fpathconf are summarized in Figure 2.18. As with the system limits, there are several points to note regarding how options are treated by sysconf, pathconf, and fpathconf.

The value returned for _SC_VERSION indicates the four-digit year and two-digit month of the standard. This value can be 198808L, 199009L, 199506L, or some other value for a later version of the standard. The value associated with Version 3 of the Single UNIX Specification is 200112L.
The value returned for _SC_XOPEN_VERSION indicates the version of the XSI that the system complies with. The value associated with Version 3 of the Single UNIX Specification is 600.
The values _SC_JOB_CONTROL, _SC_SAVED_IDS, and _PC_VDISABLE no longer represent optional features. As of Version 3 of the Single UNIX Specification, these features are now required, although these symbols are retained for backward compatibility.
_PC_CHOWN_RESTRICTED and _PC_NO_TRUNC return -1 without changing errno if the feature is not supported for the specified pathname or filedes.
The referenced file for _PC_CHOWN_RESTRICTED must be either a file or a directory. If it is a directory, the return value indicates whether this option applies to files within that directory.
The referenced file for _PC_NO_TRUNC must be a directory. The return value applies to filenames within the directory.
The referenced file for _PC_VDISABLE must be a terminal file.
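
The encoding of the _SC_VERSION value can be made concrete with a small sketch. The helper name split_posix_version is hypothetical; it only performs the arithmetic implied by the "four-digit year and two-digit month" description above.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper: split a POSIX version value such as 200112L
 * into its year (2001) and month (12) parts.
 */
void
split_posix_version(long version, int *yearp, int *monthp)
{
    assert(yearp != NULL && monthp != NULL);
    *yearp = (int)(version / 100);      /* leading four digits */
    *monthp = (int)(version % 100);     /* trailing two digits */
}
```

A program would typically pass the result of sysconf(_SC_VERSION) as the first argument, after checking that the call did not return -1.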

In Figure 2.19 we show several configuration options and their corresponding values on the four sample systems we discuss in this text. Note that several of the systems haven't yet caught up to the latest version of the Single UNIX Specification. For example, Mac OS X 10.3 supports POSIX threads but defines _POSIX_THREADS as

    #define _POSIX_THREADS

without specifying a value. To conform to Version 3 of the Single UNIX Specification, the symbol, if defined, should be set to -1, 0, or 200112.

Figure 2.19. Examples of configuration options

    Limit                    FreeBSD 5.2.1  Linux 2.4.22   Mac OS X 10.3  Solaris 9 UFS  Solaris 9 PCFS
    _POSIX_CHOWN_RESTRICTED  1              1              1              1              1
    _POSIX_JOB_CONTROL       1              1              1              1              1
    _POSIX_NO_TRUNC          1              1              1              1              unsupported
    _POSIX_SAVED_IDS         unsupported    1              unsupported    1              1
    _POSIX_THREADS           200112         200112         defined        1              1
    _POSIX_VDISABLE          255            0              255            0              0
    _POSIX_VERSION           200112         200112         198808         199506         199506
    _XOPEN_UNIX              unsupported    1              undefined      1              1
    _XOPEN_VERSION           unsupported    500            undefined      3              3

An entry is marked as "undefined" if the feature is not defined, i.e., the system doesn't define the symbolic constant or its corresponding _PC or _SC name. In contrast, the "defined" entry means that the symbolic constant is defined, but no value is specified, as in the preceding _POSIX_THREADS example. An entry is "unsupported" if the system defines the symbolic constant, but it has a value of -1, or it has a value of 0 but the corresponding sysconf or pathconf call returned -1.

Note that pathconf returns a value of -1 for _PC_NO_TRUNC when used with a file from a PCFS file system on Solaris. The PCFS file system supports the DOS format (for floppy disks), and DOS filenames are silently truncated to the 8.3 format limit that the DOS file system requires.

2.7. Feature Test Macros

The headers define numerous POSIX.1 and XSI symbols, as we've described. But most implementations can add their own definitions to these headers, in addition to the POSIX.1 and XSI definitions. If we want to compile a program so that it depends only on the POSIX definitions and doesn't use any implementation-defined limits, we need to define the constant _POSIX_C_SOURCE. All the POSIX.1 headers use this constant to exclude any implementation-defined definitions when _POSIX_C_SOURCE is defined.

Previous versions of the POSIX.1 standard defined the _POSIX_SOURCE constant. This has been superseded by the _POSIX_C_SOURCE constant in the 2001 version of POSIX.1.

The constants _POSIX_C_SOURCE and _XOPEN_SOURCE are called feature test macros. All feature test macros begin with an underscore. When used, they are typically defined in the cc command, as in

    cc -D_POSIX_C_SOURCE=200112 file.c

This causes the feature test macro to be defined before any header files are included by the C program. If we want to use only the POSIX.1 definitions, we can also set the first line of a source file to

    #define _POSIX_C_SOURCE 200112

To make the functionality of Version 3 of the Single UNIX Specification available to applications, we need to define the constant _XOPEN_SOURCE to be 600. This has the same effect as defining _POSIX_C_SOURCE to be 200112L as far as POSIX.1 functionality is concerned.

The Single UNIX Specification defines the c99 utility as the interface to the C compilation environment. With it we can compile a file as follows:

    c99 -D_XOPEN_SOURCE=600 file.c -o file

To enable the 1999 ISO C extensions in the gcc C compiler, we use the -std=c99 option, as in

    gcc -D_XOPEN_SOURCE=600 -std=c99 file.c -o file

Another feature test macro is __STDC__, which is automatically defined by the C compiler if the compiler conforms to the ISO C standard. This allows us to write C programs that compile under both ISO C compilers and non-ISO C compilers. For example, to take advantage of the ISO C prototype feature, if supported, a header could contain

    #ifdef __STDC__
    void  *myfunc(const char *, int);
    #else
    void  *myfunc();
    #endif

Although most C compilers these days support the ISO C standard, this use of the __STDC__ feature test macro can still be found in many header files.

2.8. Primitive System Data Types

Historically, certain C data types have been associated with certain UNIX system variables. For example, the major and minor device numbers have historically been stored in a 16-bit short integer, with 8 bits for the major device number and 8 bits for the minor device number. But many larger systems need more than 256 values for these device numbers, so a different technique is needed. (Indeed, Solaris uses 32 bits for the device number: 14 bits for the major and 18 bits for the minor.)

The header <sys/types.h> defines some implementation-dependent data types, called the primitive system data types. More of these data types are defined in other headers also. These data types are defined in the headers with the C typedef facility. Most end in _t. Figure 2.20 lists many of the primitive system data types that we'll encounter in this text.

Figure 2.20. Some common primitive system data types

    Type          Description
    caddr_t       core address (Section 14.9)
    clock_t       counter of clock ticks (process time) (Section 1.10)
    comp_t        compressed clock ticks (Section 8.14)
    dev_t         device numbers (major and minor) (Section 4.23)
    fd_set        file descriptor sets (Section 14.5.1)
    fpos_t        file position (Section 5.10)
    gid_t         numeric group IDs
    ino_t         i-node numbers (Section 4.14)
    mode_t        file type, file creation mode (Section 4.5)
    nlink_t       link counts for directory entries (Section 4.14)
    off_t         file sizes and offsets (signed) (lseek, Section 3.6)
    pid_t         process IDs and process group IDs (signed) (Sections 8.2 and 9.4)
    ptrdiff_t     result of subtracting two pointers (signed)
    rlim_t        resource limits (Section 7.11)
    sig_atomic_t  data type that can be accessed atomically (Section 10.15)
    sigset_t      signal set (Section 10.11)
    size_t        sizes of objects (such as strings) (unsigned) (Section 3.7)
    ssize_t       functions that return a count of bytes (signed) (read, write, Section 3.7)
    time_t        counter of seconds of calendar time (Section 1.10)
    uid_t         numeric user IDs
    wchar_t       can represent all distinct character codes

By defining these data types this way, we do not build into our programs implementation details that can change from one system to another. We describe what each of these data types is used for when we encounter them later in the text.
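
One practical consequence is that code can't assume a particular width when printing these types. A common convention, shown here as an assumption rather than something this section prescribes, is to cast the value to a type of known width first. The helper names format_offset and format_pid are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helpers: format an off_t or pid_t without assuming its
 * width. Both return what snprintf returns (the length of the text).
 */
int
format_offset(char *buf, size_t len, off_t off)
{
    assert(buf != NULL && len > 0);
    /* off_t is signed but its width varies; intmax_t holds any value */
    return(snprintf(buf, len, "%jd", (intmax_t)off));
}

int
format_pid(char *buf, size_t len, pid_t pid)
{
    assert(buf != NULL && len > 0);
    /* assumes pid_t is no wider than long, which POSIX.1 guarantees */
    return(snprintf(buf, len, "%ld", (long)pid));
}
```

The same cast-then-print pattern applies to the other signed types in Figure 2.20; unsigned types such as size_t have their own conversion (%zu in ISO C99).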

2.9. Conflicts Between Standards

All in all, these various standards fit together nicely. Our main concern is any differences between the ISO C standard and POSIX.1, since SUSv3 is a superset of POSIX.1. There are some differences.

ISO C defines the function clock to return the amount of CPU time used by a process. The value returned is a clock_t value. To convert this value to seconds, we divide it by CLOCKS_PER_SEC, which is defined in the <time.h> header. POSIX.1 defines the function times that returns both the CPU time (for the caller and all its terminated children) and the clock time. All these time values are clock_t values. The sysconf function is used to obtain the number of clock ticks per second for use with the return values from the times function. What we have is the same term, clock ticks per second, defined differently by ISO C and POSIX.1. Both standards also use the same data type (clock_t) to hold these different values. The difference can be seen in Solaris, where clock returns microseconds (hence CLOCKS_PER_SEC is 1 million), whereas sysconf returns the value 100 for clock ticks per second.
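
The two notions of clock ticks can be placed side by side in a short sketch. The function names cpu_seconds_iso and cpu_seconds_posix are hypothetical; each converts its clock_t values to seconds using the divisor that belongs to its own standard, and returns -1.0 if the underlying call fails.

```c
#include <sys/times.h>
#include <time.h>
#include <unistd.h>

/* CPU seconds per ISO C: clock() divided by CLOCKS_PER_SEC. */
double
cpu_seconds_iso(void)
{
    clock_t c = clock();

    if (c == (clock_t)-1)
        return(-1.0);
    return((double)c / CLOCKS_PER_SEC);
}

/* CPU seconds per POSIX.1: times() divided by sysconf(_SC_CLK_TCK). */
double
cpu_seconds_posix(void)
{
    struct tms  t;
    long        ticks = sysconf(_SC_CLK_TCK);

    if (ticks <= 0 || times(&t) == (clock_t)-1)
        return(-1.0);
    /* user plus system time of the caller only, not its children */
    return((double)(t.tms_utime + t.tms_stime) / ticks);
}
```

Mixing the two divisors, for example dividing a times value by CLOCKS_PER_SEC, gives a result that is wrong by the ratio of the two tick rates (a factor of 10,000 with the Solaris values quoted above).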

Another area of potential conflict is when the ISO C standard specifies a function, but doesn't specify it as strongly as POSIX.1 does. This is the case for functions that require a different implementation in a POSIX environment (with multiple processes) than in an ISO C environment (where very little can be assumed about the host operating system). Nevertheless, many POSIX-compliant systems implement the ISO C function, for compatibility. The signal function is an example. If we unknowingly use the signal function provided by Solaris (hoping to write portable code that can be run in ISO C environments and under older UNIX systems), it'll provide semantics different from the POSIX.1 sigaction function. We'll have more to say about the signal function in Chapter 10.

2.10. Summary

Much has happened over the past two decades with the standardization of the UNIX programming environment. We've described the dominant standards (ISO C, POSIX, and the Single UNIX Specification) and their effect on the four implementations that we'll examine in this text: FreeBSD, Linux, Mac OS X, and Solaris. These standards try to define certain parameters that can change with each implementation, but we've seen that these limits are imperfect. We'll encounter many of these limits and magic constants as we proceed through the text.

The bibliography specifies how one can obtain copies of the standards that we've discussed.

