Operating System Principles
BRINCH HANSEN, 1973
Introduction
The main goal of this book is to present the operating system as a program
that makes efficient use of a computer possible and defines rules of behavior for its users.
The subject of the book is the operating system, which is not radically different from other large programs.
The problems encountered in its design are the same as those that arise in the design of other
large programs, such as compilers.
The historical significance of operating systems is that they led to the discovery of new principles
of resource sharing, multiprogramming, and program construction.
The task of an operating system is to allocate resources among its users.
To solve this task, the designer must impose certain restrictions.
A single processor can carry out several different computations,
creating the illusion that they happen simultaneously.
The operating system constantly faces unpredictable situations and a large number of different requests,
and it must allocate resources among them in such a way
that each request receives a timely response.
Some components of a computer can operate simultaneously.
The difficulty in programming such concurrent processes
is that their errors are very hard to detect by testing.
It is desirable to catch such errors as early as compile time.
The book uses Pascal to explain its basic concepts and algorithms.
The book consists of 8 chapters:
Chapter 1 is an overview. It explains the purpose and principles of operating systems,
gives a historical view of their development,
and shows their technological limitations and possibilities.
Chapter 2 shows the role of abstraction and structure in attacking computational problems.
It gives examples of sequential programs and hierarchical program construction.
Chapter 3 is about concurrent processes, methods of synchronization,
semaphores, message queues, and deadlocks.
Chapter 4 is about processor scheduling and real-time characteristics.
Chapter 5 is about store management.
Chapter 6 examines the theory of scheduling on single-processor systems.
Chapter 7 is about protection and allocation of system resources.
It considers the interplay of process management,
store management, input/output, and preemptive scheduling.
Chapter 8 is a concrete example: a system based on the RC 4000.
Chapter 1 - AN OVERVIEW OF OPERATING SYSTEMS
This chapter describes the purpose and technological background of
operating systems. It stresses the similarities of all operating systems and
points out the advantages of special-purpose over general-purpose systems.
1.1. THE PURPOSE OF AN OPERATING SYSTEM
1.1.1. Resource Sharing
An operating system is a set of manual and automatic procedures that
enable a group of people to share a computer installation efficiently.
The key word in this definition is sharing: it means that people will
compete for the use of physical resources such as processor time, storage
space, and peripheral devices; but it also means that people can cooperate
by exchanging programs and data on the same installation. The sharing of a
computer installation is an economic necessity, and the purpose of an
operating system is to make the sharing tolerable.
An operating system must have a policy for choosing the order in which
competing users are served and for resolving conflicts of simultaneous
requests for the same resources; it must also have means of enforcing this
policy in spite of the presence of erroneous or malicious user programs.
Present computer installations can execute several user programs
simultaneously and allow users to retain data on backing storage for weeks
or months. The simultaneous presence of data and programs belonging to
different users requires that an operating system protect users against each
other.
Since users must pay for the cost of computing, an operating system
must also perform accounting of the usage of resources.
In early computer installations, operators carried out most of these
functions. The purpose of present operating systems is to carry out these
tasks automatically by means of the computer itself. But when all
these aspects of sharing are automated, it becomes quite difficult for the
installation management to find out what the computer is actually doing
and to modify the rules of sharing to improve performance. A good
operating system will assist management in this evaluation by collecting
measurements on the utilization of the equipment.
Most components of present computer installations are sequential in
nature: they can only execute operations or transfer data items one at a
time. But it is possible to have activities going on simultaneously in several
of these components. This influences the design of operating systems so
much that our subject can best be described as the management of shared
multiprogramming systems.
1.1.2. Virtual Machines
An operating system defines several languages in which the rules of
resource sharing and the requests for service can be described. One of these
languages is the job control language, which enables users to identify
themselves and describe the requirements of computational jobs: the types
and amounts of resources needed, and the names of programs and data files
used.
Another language is the virtual machine language: the set of machine
operations available to a user during program execution. To maintain
control of a computer installation and isolate users from each other, an
operating system must prevent user programs from executing certain
operations; otherwise, these programs could destroy procedures or data
inside the operating system or start input/output on peripheral devices
assigned to other users. So the set of machine operations available to users
is normally a subset of the original machine language.
But users must have some means of doing input/output. The operating
system enables them to do so by calling certain standard procedures that
handle the peripherals in a well-defined manner. To the user programs,
these standard procedures appear to be extensions of the machine language
available to them. The user has the illusion of working on a machine that
can execute programs written in this language. Because this machine is
partly simulated by program, it is called a virtual machine. So an operating
system makes a virtual machine available to each user and prevents these
machines from interfering destructively with each other. The simultaneous
presence of several users makes the virtual machines much slower than the
physical machine.
An operating system can make the programming language of the virtual
machine more attractive than that of the original machine. This can be
done by relieving the user of the burden of technological details such as the
physical identity of peripheral devices and minor differences in their
operation. This enables the user to concentrate on logical concepts such as
the names of data files and the transfer of data records to and from these
files. The virtual machine can also be made more attractive by error
correction techniques; these make the virtual machine appear more reliable
than the real one (for example, by automatic repetition of unsuccessful
input/output operations). In this way an operating system may succeed in
making a virtue out of a necessity.
Yet another language is the one used inside the operating system itself
to define the policy of sharing, the rules of protection, and so on. A certain
amount of bitter experience with present operating systems has clearly
shown that an operating system may turn out to be inefficient, unreliable,
or built on wrong assumptions just like any other large program. Operating
systems should be designed so that they are simple to understand, and easy
to use and modify. Even if an operating system works correctly, there is
still a need for experimenting with its policy towards users and for adapting
it to the requirements of a particular environment, so it is important not
only to give users an attractive programming language, but also to design
good programming tools to be used inside the operating system itself. But
since the operating system is imposed on everyone, it is extremely
important that the language used to implement it reflect the underlying
machine features in an efficient manner.
1.1.3. Operating Systems and User Programs
Operating systems are large programs developed and used by a changing
group of people. They are often modified considerably during their
lifetimes. Operating systems must necessarily impose certain restrictions on
all users. But this should not lead us to regard them as being radically
different from other programs--they are just complicated applications of
general programming techniques.
During the construction of operating systems over the past decade, new
methods of multiprogramming and resource sharing were discovered. We
now realize that these methods are equally useful in other programming
applications. Any large programming effort will be heavily influenced by
the characteristics and amounts of physical resources available, by the
possibility of executing smaller tasks simultaneously, and by the need for
sharing a set of data among such tasks.
It may be useful to distinguish between operating systems and user
computations because the former can enforce certain rules of behavior on
the latter. But it is important to understand that each level of programming
solves some aspect of resource allocation.
Let me give a few examples of the influence of resource sharing on the
design of standard programs and user programs.
Store allocation. One of the main reasons for dividing a compiler into
smaller parts (called passes) is to allocate storage efficiently. During a
compilation, the passes can be loaded one at a time from drum or disk into
a small internal store where they are executed.
Job scheduling. A data processing application for an industrial plant can
involve quite complicated rules for the sequence in which smaller tasks are
scheduled for execution. There may be a daily job which records details of
production; weekly and monthly jobs which compute wages; a yearly job
associated with the fiscal year; and several other jobs. Such long-term
scheduling of related jobs which share large data files is quite difficult to
control automatically. In contrast, most operating systems only worry
about the scheduling of independent jobs over time spans of a few minutes
or hours.
Multiprogramming. To control an industrial process, engineers must be
able to write programs that can carry out many tasks simultaneously, for
example, measure process variables continuously, report alarms to
operators, accumulate measurements of production, and print reports to
management.
Program protection. The ability to protect smaller components of a
large program against each other is essential in real-time applications (such
as banking and ticket reservation) where the service of reliable program
components must be continued while new components are being tested.
So the problems of resource sharing solved by operating systems repeat
themselves in user programs; or, to put it differently, every large
application of a computer includes a local operating system that
coordinates resource sharing among smaller tasks of that application. What
is normally called "the operating system" is just the one that coordinates
the sharing of an entire installation among users.
When you realize that resource sharing is not a unique characteristic of
operating systems, you may wonder whether the simulation of virtual
machines makes operating systems different from other programs. But alas,
a closer inspection shows that all programs simulate virtual machines.
Computer programs are designed to solve a class of problems such as
the editing of all possible texts, the compilation of all possible Algol
programs, the sorting of arbitrary sets of data, the computation of payrolls
for a varying number of employees, and so on. The user specifies a
particular case of the class of problems by means of a set of data, called the
input, and the program delivers as its result another set of data, called the
output.
One way of looking at this flexibility is to say that the input is a
sequence of instructions written in a certain language, and the function of
the program is to follow these instructions.
From this point of view, an editing program can execute other
programs written in an editing language consisting of instructions such as
search, delete, and insert textstring. And an Algol compiler can execute
programs written in the Algol 60 language. The computer itself can be
viewed as a physical implementation of a program called the instruction
execution cycle. This program can carry out other programs written in a
so-called machine language.
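To make this view concrete, here is a toy Pascal sketch of an "editing machine" that follows three instructions of an invented editing language: search, delete, and insert a textstring. The sample text and commands are made up for illustration; Pascal is used because it is the language the book adopts for its examples.

program EditingMachine;
{ A toy "editing machine": the program follows instructions written in a
  small editing language (search, delete, insert a textstring).
  The text and the commands below are invented for illustration. }
var
  text: string;
  p: integer;
begin
  text := 'operating sistems';
  writeln('text   : ', text);

  { search: locate the textstring 'sistems' }
  p := Pos('sistems', text);
  writeln('search : found at position ', p);

  { delete: remove 7 characters at that position }
  Delete(text, p, 7);
  writeln('delete : ', text);

  { insert: put the corrected textstring at the same position }
  Insert('systems', text, p);
  writeln('insert : ', text)
end.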
If we adopt the view that a computer is a device able to follow and
carry out descriptions of processes written in a formal language, then we
realize that each of these descriptions (or programs) in turn makes the
original computer appear to be another computer which interprets a
different language. In other words, an editing program makes the computer
behave like an editing machine, and an Algol compiler turns it into an Algol
60 machine. Using slightly different words, we can say that a program
executed on a physical machine makes that machine behave like a virtual
machine which can interpret a different programming language. And this
language is certainly more attractive for its purpose than the original
machine language; otherwise, there would be no reason to write the
program in the first place!
From these considerations it is hard to avoid the conclusion that
operating systems must be regarded merely as large application programs.
Their purpose is to manage resource sharing, and they are based on general
programming methods. The proper aim of education is to identify these
methods. But before we do that, I will briefly describe the technological
development of operating systems. This will give you a more concrete idea
of what typical operating systems do and what they have in common.
1.2. TECHNOLOGICAL BACKGROUND
1.2.1. Computer and Job Profiles
We now go back to the middle of the 1950's to trace the influence of
the technological development of computers on the structure of operating
systems.
When many users share a computer installation, queues of computations
submitted for execution are normally formed, and a decision has to
be made about the order in which they should be executed to obtain
acceptable overall service. This decision rule is called a scheduling
algorithm.
A computation requested by a user is called a job; it can involve the
execution of several programs in succession, such as editing followed by
compilation and execution of a program written in a high-level language. A
job can also require simultaneous execution of several programs cooperating
on the same task. One program may, for example, control the printing
of data, while another program computes more output.
In the following, I justify the need for automatic scheduling of jobs by
quite elementary considerations about a computer installation with the
following characteristics:
instruction execution time 2 μsec
internal store 32 K words
card reader 1,000 cards/min
line printer 1,000 lines/min
magnetic tape stations 80,000 char/sec
(1 μsec = 10⁻⁶ sec, and 1 K = 1024).
We will consider an environment in which the main problem is to
schedule a large number of small jobs whose response times are as short as
possible. (The response time of a job is the interval between the request for
its execution and the return of its results.) This assumption is justified for
universities and engineering laboratories where program development is the
main activity.
A number of people have described the typical job profile for this type
of environment (Rosin, 1965; Walter, 1967). We will assume that the
average job consists of a compilation and execution of a program written in
a high-level language. The source text, read from cards and listed on a
printer, is the major part of the input/output. More precisely, the average
job will be characterized by the following figures:
input time (300 cards) 0.3 min
output time (500 lines) 0.5 min
execution time 1 min
1.2.2. Batch-processing Systems
For the moment we will assume that magnetic tape is the only form of
backing store available. This has a profound influence on the possible forms
of scheduling. We also impose the technological restriction on the computer
that its mode of operation be strictly sequential. This means that: (1) it can
only execute one program at a time; and (2) after the start of an
input/output operation, program execution stops until the transfer of data
has been completed.
The simplest scheduling rule is the open shop where users each sign up
for a period of, say, 15 min and operate the machine themselves. For the
individual user this is an ideal form of service: it enables him to correct
minor programming errors on the spot and experiment with programs
during their execution. Unfortunately, such a system leads to prohibitive
costs of idle machinery: for an average job, the central processor will only
be working for one out of every 15 min; the rest of the time will be spent
waiting for the operator. The situation can be characterized by two simple
measures of average performance:
processor utilization = execution time/total time
throughput = number of jobs executed per time unit
For the open shop, processor utilization is only about 7 per cent with a
throughput of no more than 4 jobs per hour, each requiring only one
minute of execution time(!).
Idle processor time caused by manual intervention can be greatly
reduced by even the most primitive form of automatic scheduling. Figure
1.1 illustrates an installation in which users no longer can interact with
programs during execution. They submit their jobs to an operator who
stacks them in the card reader in their order of arrival. From the card
reader, jobs are input directly to the computer, listed on the printer, and
executed one by one. This scheduling is done by an operating system which
resides permanently in the internal store.
Under this form of scheduling an average job occupies the computer for
1.8 min (the sum of input/output and execution times). This means that
processor utilization has been improved to 55 percent with a corresponding
throughput of 33 jobs per hour.
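As a rough check of these figures, the following Pascal sketch (Pascal being the language the book uses for its examples) applies the two performance measures to the average job profile of Section 1.2.1; the 15-minute open-shop period and the 0.8 + 1.0 min job occupancy are the values assumed in the text.

program SchedulingFigures;
{ Recomputes the open-shop and simple automatic-scheduling figures
  from the average job profile: 1 min execution, 0.8 min input/output. }
const
  ExecTime = 1.0;   { min of processor time per average job }
  IoTime   = 0.8;   { min of card input plus printer output }
  OpenShop = 15.0;  { min signed up per user in the open shop }
var
  occupancy: real;
begin
  { open shop: the processor works 1 min out of every 15 min period }
  writeln('open shop utilization : ', ExecTime / OpenShop * 100.0:5:1, ' per cent');
  writeln('open shop throughput  : ', 60.0 / OpenShop:5:1, ' jobs/hour');

  { simple automatic scheduling: a job occupies the computer for
    its input/output time plus its execution time }
  occupancy := ExecTime + IoTime;
  writeln('automatic utilization : ', ExecTime / occupancy * 100.0:5:1, ' per cent');
  writeln('automatic throughput  : ', 60.0 / occupancy:5:1, ' jobs/hour')
end.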
But even this simple form of automatic scheduling creates new
problems: How do we protect the operating system against erroneous user
programs? How can we force user programs to return control to the
operating system when they are finished, or if they fail to terminate after a
period of time defined by the operating system? Early operating systems
offered no satisfactory solutions to these problems and were frequently
brought down by their jobs. We will ignore this problem at the moment and
return to it in the chapters on processor management and resource
protection.
This argument in favor of automatic scheduling has ignored the
processor time that is lost while the operator handles the peripheral
devices: inserting paper in the printer, mounting tapes for larger jobs, and
so forth. The argument has also ignored processor time that is wasted when
the operator makes a mistake. But these factors are ignored throughout the
chain of arguments and do not affect the trend towards better utilization of
the processor.
The main weakness is that the argument did not include an evaluation
of the amount of processor time used by the new component--the
operating system. The reason for this omission is that an operating system
carries out certain indispensable functions (input/output, scheduling, and
accounting) which previously had to be done elsewhere in the installation
by operators and users. The relevant factor here--the amount of processor
time lost by an inefficient implementation of the operating system--
unfortunately cannot be measured. But in any case, the figures given in the
following are not my estimates, but measurements of the actual
performance of some recent operating systems.
The bottleneck in the previous simple system is the slow input/output
devices; they keep the central processor waiting 45 per cent of the time
during an average job execution. So the next step is to use the fast tape
stations to implement a batch processing system as shown in Fig. 1.2. First,
a number of jobs are collected from users by an operator and copied from
cards to magnetic tape on a small, cheap computer. This tape is carried by
the operator to the main computer, which executes the batch of jobs one
by one, delivering their output to another tape. Finally, this output tape is
carried to the small computer and listed on the printer. Notice that
although jobs are executed in their order of arrival inside a batch, the
printed output of the first job is not available until the entire batch has
been executed.
During the execution of a batch on the main computer, the operator
uses the small computer to print the output of an earlier batch and input a
new batch on tape. In this way the main computer, as well as the card
reader and printer, is kept busy all the time. Input/output delays on the
main computer are negligible in this system, but another source of idle time
has appeared: the mounting and dismounting of tapes. This can only be
reduced by batching many jobs together on a single tape. But in doing so
we also increase the waiting time of users for the results of their jobs. This
dilemma between idle processor time and user response time can be
expressed by the following relation:
processor utilization = batch execution time / batch response time
where
batch response time = batch mounting time + batch execution time
This can also be rewritten as follows:
batch response time = batch mounting time / (1 - processor utilization)
Since there is a limit to the amount of idle processor time management
is prepared to accept, the net result is that response time for users is still
determined by the manual speed of operators! In the installation
considered a batch cycle typically proceeds as follows:
Delivery time of 50 jobs 30 min
Conversion of cards to tape 15 min
Mounting of tapes 5 min
Batch execution 50 min
Conversion of tape to printer 25 min
Manual separation of output 15 min
Total batch cycle 140 min
With a tape mounting time of 5 min per batch, utilization of the main
processor is now as high as 50/55 = 90 per cent, and throughput has
reached 55 jobs per hour. But at the same time, the shortest response time
for any job is 140 min. And this is obtained only if the job joins a batch
immediately after submission.
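The relations above can be checked against this batch cycle. The sketch below simply plugs in the figures from the table: a batch of 50 one-minute jobs, 5 min of tape mounting, 50 min of batch execution, and a complete cycle of 140 min.

program BatchCycle;
{ Applies the batch-processing relations to the figures in the text. }
const
  MountTime = 5.0;    { min of tape mounting per batch }
  ExecTime  = 50.0;   { min of batch execution }
  BatchSize = 50;     { jobs per batch }
  Cycle     = 140.0;  { min, complete batch cycle including conversion }
var
  utilization, response: real;
begin
  utilization := ExecTime / (MountTime + ExecTime);
  { the same relation solved for the response time }
  response := MountTime / (1.0 - utilization);
  writeln('processor utilization : ', utilization * 100.0:5:1, ' per cent');
  writeln('batch response time   : ', response:5:1, ' min');
  writeln('throughput            : ',
          BatchSize / (MountTime + ExecTime) * 60.0:5:1, ' jobs/hour');
  writeln('shortest job response : ', Cycle:5:1, ' min (the complete cycle)')
end.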
We have also ignored the problem of the large jobs: When jobs requiring
hours for execution are included in a batch, the jobs following will
experience much longer response times. Most users are only interested in
fast response during working hours. So an obvious remedy is to let the
operators sort jobs manually and schedule the shorter ones during the
daytime and the longer ones at night.
If the operator divides the jobs into three groups, the users might
typically expect response times of the following order:
1-min jobs: 2-3 hours
5-min jobs: 8-10 hours
other jobs: 1-7 days
We have followed the rationale behind the classical batch-processing
system of the late 1950's (Bratman, 1959). The main concern has been to
reduce idle processor time, unfortunately with a resultant increase in user
response time.
In this type of system the most complicated aspects of sharing are still
handled by operators, for example, the scheduling of simultaneous
input/output and program execution on two computers, and the
assignment of priorities to user jobs. For this reason I have defined an
operating system as a set of manual and automatic procedures that enable a
group of people to share a computer installation efficiently (Section 1.1.1).
1.2.3. Spooling Systems
It is illuminating to review the technological restrictions that dictated
the previous development towards batch processing. The first one was the
strict sequential nature of the computer which made it necessary to prevent
conversational interaction with running programs; the second limitation
was the sequential nature of the backing store (magnetic tapes) which
forced us to schedule large batches of jobs strictly in the order in which
they were input to the system.
The sequential restrictions on scheduling were made much less severe
(but were by no means removed) by technological developments in the
early 1960's. The most important improvement was the design of
autonomous peripheral devices which can carry out input/output operations
independently while the central processor continues to execute
programs.
The problem of synchronizing the central processor and the peripheral
devices after the completion of input/output operations was solved by the
interrupt concept. An interrupt is a timing signal set by a peripheral device
in a register connected to a central processor. It is examined by the central
processor after the execution of each instruction. When an interrupt occurs,
the central processor suspends the execution of its current program and
starts another program--the operating system. When the operating system
has responded properly to the device signal, it can either resume the
execution of the interrupted program or start a more urgent program (for
example, the one that was waiting for the input/output).
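A minimal sketch of this mechanism is given below; the "instructions", the interrupt register, and the moment at which a device sets it are all invented, and the operating system is reduced to a single procedure.

program InterruptCycle;
{ Sketch of an instruction execution cycle that examines an interrupt
  register after each instruction and, when it is set, starts the
  operating system. All details are invented for illustration. }
var
  interruptRegister: boolean;
  count: integer;

procedure ExecuteNextInstruction;
begin
  count := count + 1;
  if count = 3 then
    interruptRegister := true    { a peripheral device signals completion }
end;

procedure OperatingSystem;
begin
  writeln('interrupt after instruction ', count, ': operating system responds');
  interruptRegister := false     { resume the interrupted program }
end;

begin
  interruptRegister := false;
  count := 0;
  while count < 6 do
  begin
    ExecuteNextInstruction;
    if interruptRegister then    { examined after every instruction }
      OperatingSystem
  end
end.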
This technique made concurrent operation of a central processor and its
peripheral devices possible. The programming technique used to control
concurrent operation is called multiprogramming.
It was soon realized that the same technique could be used to simulate
concurrent execution of several user programs on a single processor. Each
program is allowed to execute for a certain period of time, say of the order
of 0.1-1 sec. At the end of this interval a timing device interrupts the
program and starts the operating system. This in turn selects another
program, which now runs until a timing interrupt makes the system switch
to a third program, and so forth.
This form of scheduling, in which a single resource (the central
processor) is shared by several users, one at a time in rapid succession, is
called multiplexing. Further improvements are made possible by enabling a
program to ask the operating system to switch to other programs while it
waits for input/output.
The possibility of more than one program being in a state of execution
at one time has considerable influence on the organization of storage. It is
no longer possible to predict in which part of the internal store a program
will be placed for execution. So there is no fixed correspondence at
compile time between the names used in a program to refer to data and the
addresses of their store locations during execution. This problem of
program relocation was first solved by means of a loading program, which
examined user programs before execution and modified addresses used in
them to correspond to the store locations actually used.
Later, program relocation was included in the logic of the central
processor: a base register was used to modify instruction addresses
automatically during execution by the start address of the storage area
assigned to a user. Part of the protection problem was solved by extending
this scheme with a limit register defining the size of the address space
available to a user; any attempt to refer to data or programs outside this
space would be trapped by the central processor and cause the operating
system to be activated.
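A small Pascal sketch of the base/limit scheme follows; the base and limit values are invented, and the Trap procedure merely stands for the activation of the operating system on a protection violation.

program BaseLimitDemo;
{ Relocation and protection with a base and a limit register: the user
  program issues addresses 0..limit-1; the hardware adds the base and
  traps any reference outside the assigned storage area. }
var
  base, limit: integer;    { set by the operating system }
  physical: integer;

procedure Trap(address: integer);
begin
  writeln('protection violation at virtual address ', address,
          ': operating system activated')
end;

function Translate(address: integer): integer;
begin
  if (address >= 0) and (address < limit) then
    Translate := base + address          { relocated physical address }
  else
    begin
      Trap(address);
      Translate := -1
    end
end;

begin
  base  := 4096;    { start address of the user's storage area }
  limit := 1024;    { size of the user's address space }
  physical := Translate(100);
  writeln('virtual  100 -> physical ', physical);
  physical := Translate(2000);           { outside the address space }
  writeln('virtual 2000 -> physical ', physical)
end.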
The protection offered by base and limit registers was, of course,
illusory as long as user programs could modify these registers. The
recognition of this flaw led to the design of central processors with two
states of execution: a privileged state, in which there are no restrictions on
the operations executed; and a user state, in which the execution of
operations controlling interruption, input/output, and store allocation is
forbidden and will be trapped if attempted. A transition to the privileged
state is caused by interrupts from peripherals and by protection violations
inside user programs. A transition to the user state is caused by execution
of a privileged operation.
It is now recognized that it is desirable to be able to distinguish in a
more flexible manner between many levels of protection (and not just
two). This early protection system is safe, but it assigns more responsibility
to the operating system than necessary. The operating system must, for
example, contain code which can start input/output on every type of
device because no other program is allowed to do that. But actually, all that
is needed in this case is a mechanism which ensures that a job only operate
devices assigned to it; whether a job handles its own devices correctly or
not is irrelevant to the operating system. So a centralized protection
scheme tends to increase the complexity of an operating system and make
it a bottleneck at run time. Nevertheless, this early protection scheme must
be recognized as an invaluable improvement: clearly, the more responsibility
management delegates to an operating system, the less they can tolerate
that it breaks down.
Another major innovation of this period was the construction of large
backing stores, disks and drums, which permit fast, direct access to data
and programs. This, in combination with multiprogramming, makes it
possible to build operating systems which handle a continuous stream of
input, computation, and output on a single computer. Figure 1.3 shows
the organization of such a spooling system. The central processor is
multiplexed between four programs: one controls input of cards to a queue
on the backing store; another selects user jobs from this input queue and
starts their execution one at a time; and a third one controls printing of
output from the backing store. These three programs form the operating
system. The fourth program held in the internal store is the current user
program which reads its data from the input queue and writes its results in
an output queue on the backing store.
The point of using the backing store as a buffer is that the input of a
job can be fed into the machine in advance of its execution; and its output
can be printed during the execution of later jobs. This eliminates the
manual overhead of tape mounting. At the same time, direct access to the
backing store makes it possible to schedule jobs in order of priority rather
than in order of arrival. The spooling technique was pioneered on the Atlas
computer at Manchester University (Kilburn, 1961).
A very successful operating system with input/output spooling,
called Exec II, was designed by Computer Sciences Corporation (Lynch,
1967 and 1971). It controlled a Univac 1107 computer with an instruction
execution time of 4 μsec. The backing store consisted of two or more fast
drums, each capable of transferring 10,000 characters during a single
revolution of 33 msec. The system typically processed 800 jobs per day,
each job requiring an average of 1.2 min. It was operated by the users
themselves: To run a job, a user simply placed his cards in a reader and
pushed a button. As a rule, the system could serve the users faster than
they could load cards. So a user could immediately remove his cards from
the reader and proceed to a printer where his output would appear shortly.
Less than 5 per cent of the jobs required magnetic tapes. The users who
needed tapes had to mount them in advance of program execution.
Fast response to student jobs was achieved by using the scheduling
algorithm shortest job next. Priorities were based on estimates of execution
time supplied by users, but jobs that exceeded their estimated time limits
were terminated by force.
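The rule can be sketched as follows. The four jobs and their time estimates are invented, and a real installation would of course keep its queue on the backing store rather than in an array; the points being illustrated are the selection of the smallest estimate and the forced termination of jobs that exceed it.

program ShortestJobNext;
{ Sketch of the scheduling rule described above: among the waiting jobs,
  always start the one with the smallest user-supplied time estimate, and
  terminate by force any job that exceeds its estimate. }
const
  JobCount = 4;
type
  Job = record
    name: string[8];
    estimate, actual: real;   { min }
  end;
var
  jobs: array[1..JobCount] of Job;
  started: array[1..JobCount] of boolean;
  i, next, k: integer;
  clock: real;
begin
  jobs[1].name := 'edit';    jobs[1].estimate := 0.5; jobs[1].actual := 0.4;
  jobs[2].name := 'compile'; jobs[2].estimate := 1.0; jobs[2].actual := 1.3;
  jobs[3].name := 'sort';    jobs[3].estimate := 3.0; jobs[3].actual := 2.5;
  jobs[4].name := 'payroll'; jobs[4].estimate := 8.0; jobs[4].actual := 7.0;
  for i := 1 to JobCount do started[i] := false;

  clock := 0.0;
  for k := 1 to JobCount do
  begin
    { select the waiting job with the smallest estimate }
    next := 0;
    for i := 1 to JobCount do
      if not started[i] then
        if (next = 0) or (jobs[i].estimate < jobs[next].estimate) then
          next := i;
    started[next] := true;
    if jobs[next].actual <= jobs[next].estimate then
    begin
      clock := clock + jobs[next].actual;
      writeln(jobs[next].name, ' completed at ', clock:5:1, ' min')
    end
    else
    begin
      clock := clock + jobs[next].estimate;
      writeln(jobs[next].name, ' terminated by force at ', clock:5:1, ' min')
    end
  end
end.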
Response times were so short that the user could observe an error,
repunch a few cards, and resubmit his job immediately. System
performance was measured in terms of the circulation time of jobs. This
was defined as the sum of the response time of a job after its submission
and the time required by the user to interpret the results, correct the cards,
and resubmit the job; or, to put it more directly, the circulation time was
the interval between two successive arrivals of the same job for execution.
About a third of all jobs had a circulation time of less than 5 min, and
90 per cent of all jobs were recirculated ones that had already been run one
or more times the same day. This is a remarkable achievement compared to
the earlier batch-processing system in which small jobs took a few hours to
complete! At the same time, the processor utilization in the Exec II system
was as high as 90 per cent.
The Exec II system has demonstrated that many users do not need a
direct, conversational interaction with programs during execution. Users
will often be quite satisfied with a non-interactive system which offers
them informal access, fast response, and minimal cost.
Non-interactive scheduling of small jobs with fast response is
particularly valuable for program testing. Program tests are usually short in
duration: After a few seconds the output becomes meaningless due to a
programming error. The main thing for the programmer is to get the initial
output as soon as possible and to be able to run another test after
correcting the errors shown by the previous test.
There are, however, also cases in which the possibility of interacting
with running programs is highly desirable. In the following section, I
describe operating systems which permit this.
1.2.4. Interactive Systems
To make direct conversation with running programs tolerable to human
beings, the computer must respond to requests within a few seconds. As an
experiment, try to ask a friend a series of simple questions and tell him to
wait ten seconds before answering each of them; I am sure you will agree
that this form of communication is not well-suited to the human
temperament.
A computer can only respond to many users in a few seconds when the
processing time of each request is very small. So the use of multiprogramming
for conversation is basically a means of giving fast response to
trivial requests; for example, in the editing of programs, in ticket
reservation systems, in teaching programs, and so forth. These are all
situations in which the pace is limited by human thinking. They involve
very moderate amounts of input/output data which can be handled by
low-speed terminals such as typewriters or displays.
In interactive systems in which the processor time per request is only a
few hundred milliseconds, scheduling cannot be based on reliable user
estimates of service time. This uncertainty forces the scheduler to allocate
processor time in small slices. The simplest rule is round-robin scheduling:
each job in turn is given a fixed amount of processor time called a time
slice; if a job is not completed at the end of its time slice, it is interrupted
and returned to the end of a queue to wait for another time slice. New jobs
are placed at the end of the queue. This policy guarantees fast response to
user requests that can be processed within a single time slice.
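Here is a minimal Pascal simulation of the round-robin rule just described; the three service times are invented, and the time slice is one arbitrary unit.

program RoundRobin;
{ Simulates round-robin scheduling: each job in turn receives a fixed
  time slice; an unfinished job returns to the end of the queue. }
const
  Slice = 1.0;                      { processor time per slice (arbitrary unit) }
  JobCount = 3;
var
  remaining: array[1..JobCount] of real;
  queue: array[1..100] of integer;  { simple circular queue of job numbers }
  head, tail, j: integer;
  clock: real;

procedure Enter(job: integer);
begin
  queue[tail] := job;
  tail := tail mod 100 + 1
end;

begin
  remaining[1] := 0.7;  remaining[2] := 2.5;  remaining[3] := 1.2;
  head := 1; tail := 1;
  for j := 1 to JobCount do Enter(j);

  clock := 0.0;
  while head <> tail do
  begin
    j := queue[head];
    head := head mod 100 + 1;
    if remaining[j] <= Slice then
    begin
      clock := clock + remaining[j];
      remaining[j] := 0.0;
      writeln('job ', j, ' completed at time ', clock:4:1)
    end
    else
    begin
      clock := clock + Slice;
      remaining[j] := remaining[j] - Slice;
      Enter(j)          { interrupted: back to the end of the queue }
    end
  end
end.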
Conversational access in this sense was first proposed by Strachey
(1959). The creative advantages of a closer interaction between man and
machine were pointed out a few years later by Licklider and Clark (1962).
The earliest operational systems were the CTSS system developed at
Massachusetts Institute of Technology and the SDC Q-32 system built by
the System Development Corporation. They are described in excellent
papers by Corbato (1962) and Schwartz (1964 and 1967).
Figure 1.4 illustrates the SDC Q-32 system in a simplified form, ignoring
certain details. An internal store of 65 K words is divided between a
resident operating system and a single active user job; the rest of the jobs
are kept on a drum with a capacity of 400 K words.
The average time slice is about 40 msec. At the end of this interval, the
active job is transferred to the drum and another job is loaded into the
internal store. This exchange of jobs between two levels of storage is called
swapping. It takes roughly another 40 msec. During the swapping, the
central processor is idle so it is never utilized more than 50 per cent of the
time.
During the daytime the system is normally accessed simultaneously by
about 25 user terminals. So a user can expect response to a simple request
(requiring a single time slice only) in 25*80 msec = 2 sec.
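Both figures follow directly from the time slice, the swap time, and the number of terminals. The small sketch below assumes exactly 40 msec of swapping per 40 msec slice and 25 active users.

program Q32Response;
{ Estimates the figures quoted for the SDC Q-32: a 40 msec time slice,
  roughly 40 msec of swapping per slice, and about 25 active terminals. }
const
  Slice = 40;     { msec of useful processor time per slice }
  Swap  = 40;     { msec of swapping per slice }
  Users = 25;
begin
  writeln('processor utilization          : ',
          Slice * 100 div (Slice + Swap), ' per cent');
  writeln('response to a one-slice request: ',
          Users * (Slice + Swap) / 1000.0:4:1, ' sec')
end.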
A small computer controls terminal input/output and ensures that users
can continue typing requests and receiving replies while their jobs are
waiting for more time. A disk of 4000 K words with an average access time
of 225 msec is used for semi-permanent storage of data files and programs.
The users communicate with the operating system in a simple job
control language with the following instructions:
LOGIN: The user identifies himself and begins using the system.
LOAD: The user requests the transfer of a program from disk to
drum.
START: The user starts the execution of a loaded program or
resumes the execution of a stopped program.
STOP: The user stops the execution of a program temporarily.
DIAL: The user communicates with other users or with system
operators.
LOGOUT: The user terminates his use of the system.
The system has been improved over the years: Processor utilization has
been increased from 50 to 80 per cent by the use of a more complicated
scheduling algorithm. Nevertheless, it is still true that this system and
similar ones are forced to spend processor time on unproductive transfers
of jobs between two levels of storage: Interactive systems achieve
guaranteed response to short requests at the price of decreased processor
utilization.
In the SDC Q-32 system with an average of 25 active users, each user is
slowed down by a factor of 50. Consequently, a one-minute job takes
about 50 min to complete compared to the few minutes required in the
Exec II system. So interactive scheduling only makes sense for trivial
requests; it is not a realistic method for computational jobs that run for
minutes and hours.
Later and more ambitious projects are the MULTICS system (Corbato,
1965) also developed at MIT, and the IBM system TSS-360 (Alexander,
1968). In these systems, the problem of store multiplexing among several
users is solved by a more refined method: Programs and data are transferred
between two levels of storage in smaller units, called pages, when they are
actually needed during the execution of jobs. The argument is that this is
less wasteful in terms of processor time than the crude method of swapping
entire programs. But experience has shown that this argument is not always
valid because the overhead of starting and completing transfers increases
when it is done in smaller portions. We will look at this problem in more
detail in the chapter on store management. The paging concept was
originally invented for the Atlas computer (Kilburn, 1962).
Another significant contribution of these systems is the use of large
disks for semi-permanent storage of data and programs. A major problem in
such a filing system is to ensure the integrity of data in spite of occasional
hardware failures and protect them against unauthorized usage. The
integrity problem can be solved by periodic copying of data from disk to
magnetic tape. These tapes enable an installation to restore data on the disk
after a hardware failure. This is a more complicated example of the design
goal mentioned in Section 1.1.2: An operating system should try to make
the virtual machine appear more reliable than the real one.
The protection problem can be solved by a password scheme which
enables users to identify themselves and by maintaining as part of the filing
system a directory describing the authority of users (Fraser, 1971).
Interactive systems can also be designed for real-time control of
industrial processes. The central problem in a real-time environment is that
the computer must be able to receive data as fast as they arrive; otherwise,
they will be lost. (In that sense, a conversational system also works under
real-time constraints: It must receive input as fast as users type it.)
Usually, a process control system consists of several programs which are
executed simultaneously. There may, for example, be a program which is
started every minute to measure various temperatures, pressures, and flows
and compare these against preset alarm limits. Each time an alarm
condition is detected, another program is started to report it to an
operator, and while this is being done, the scan for further alarms
continues. Still other programs may at the same time accumulate
measurements on the production and consumption of materials and energy.
And every few hours these data are probably printed by yet another
program as a report to plant management.
This concludes the overview of the technological background of shared
computer installations. I have tried through a series of simple arguments to
illustrate the influence of technological constraints on the service offered
by operating systems. Perhaps the most valid conclusion is this: In spite of
the ability of present computer installations to perform some operations
simultaneously, they remain basically sequential in nature; a central
processor can only execute one operation at a time, and a drum or disk can
only transfer one block of data at a time. There are computers and backing
stores which can carry out more than one operation at a time, but never to
the extent where the realistic designer of operating systems can afford to
forget completely about sequential resource constraints.
1.3. THE SIMILARITIES OF OPERATING SYSTEMS
The previous discussion may have left the impression that there are
basic differences between batch processing, spooling, and interactive
systems. This is certainly true as long as we are interested mainly in the
relation between the user service and the underlying technology. But to
gain a deeper insight into the nature of operating systems, we must look for
their similarities before we stress their differences.
To mention one example: All shared computer installations must
handle concurrent activities at some level. Even if a system only schedules
one job at a time, users can still make their requests simultaneously. This is
a real-time situation in which data (requests) must be received when they
arrive. The problem can, of course, be solved by the users themselves (by
forming a waiting line) and by the operators (by writing down requests on
paper); but the observation is important since our goal is to handle the
problems of sharing automatically.
It is also instructive to compare the batch-processing and spooling
systems. Both achieve high efficiency by means of a small number of
concurrent activities: In the batch processing system, independent
processors work together; in the spooling system, a single processor
switches among independent programs. Furthermore, both systems use
backing storage (tape and drum) as a buffer to compensate for speed
variations of the producers and consumers of data.
As another example, consider real-time systems for process control and
conversational programming. In these systems, concurrently executed
programs must be able to exchange data to cooperate on common tasks.
But again, this problem exists in all shared computer installations: In a
spooling system, user computations exchange data with concurrent
input/output processes; and in a batch processing system, another set of
concurrent processes exchanges data by means of tapes mounted by
operators.
As you see, all operating systems face a common set of problems. To
recognize these, we must reject the established classification of operating
systems (into batch processing, spooling, and interactive systems) which
stresses the dissimilarities of various forms of technology and user service.
This does not mean that the problems of adjusting an operating system to
the constraints of a particular environment should be ignored. But the
designer will solve them much more easily when he fully understands the
principles common to all operating systems.
1.4. DESIGN OBJECTIVES
The key to success in programming is to have a realistic, clearly-defined
goal and use the simplest possible methods to achieve it. Several operating
systems have failed because their designers started with very vague or
overambitious goals. In the following discussion, I will describe two quite
opposite views on the overall objectives of operating systems.
1.4.1. Special-purpose Systems
The operating systems described so far have one thing in common:
Each of them tries to use the available resources in the most simple and
efficient manner to give a restricted, but useful form of computational
service.
The Exec II spooling system, for example, strikes a very careful balance
between the requirements of fast response and efficient utilization. The
overhead of processor and store multiplexing is kept low by executing jobs
one at a time. But with only one job running, processor time is lost when a
job waits for input/output. So to reduce input/output delays, data are
buffered on a fast drum. But since a drum has a small capacity, it becomes
essential to keep the volume of buffered data small. This again depends on
the achievement of short response time for the following reason: the faster
the jobs are completed, the less time they occupy space within the system.
The keys to the success of the Exec II are that the designers: (1) were
aware of the sequential nature of the processor and the drum, and
deliberately kept the degree of multiprogramming low; and (2) took
advantage of their knowledge of the expected workload--a large number of
small programs executed without conversational interaction. In short, the
Exec II is a simple, very efficient system that serves a special purpose.
The SDC Q-32 serves a different special purpose: conversational
programming. It sacrifices 20 per cent of its processor time to give response
within a few seconds to 25 simultaneous users; its operating system
occupies only 16 K words of the internal store. This too is a simple,
successful system.
The two systems do not compete with each other. Each gives the users
a special service in a very efficient manner. But the spooling system is
useless for conversation, and so is the interactive system for serious
computation.
On the other hand, neither system would be practical in an
environment where many large programs run for hours each and where
operators mount a thousand tapes daily. This may be the situation in a
large atomic research center where physicists collect and process large
volumes of experimental data. An efficient solution to this problem
requires yet another operating system.
The design approach described here has been called design according to
performance specifications. Its success strongly suggests that efficient
sharing of a large installation requires a range of operating systems, each of
which provides a special service in the most efficient and simple manner.
Such an installation might, for example, use three different operating
systems to offer:
(1) conversational editing and preparation of jobs;
(2) non-interactive scheduling of small jobs with fast response; and
(3) non-interactive scheduling of large jobs.
These services can be offered on different computers or at different times
on the same computer. For the users, the main thing is that programs
written in high-level languages can be processed directly by all three
systems.
1.4.2. General-purpose Systems
An alternative method is to make a single operating system which offers
a variety of services on a whole range of computers. This approach has
often been taken by computer manufacturers.
The OS/360 for the IBM 360 computer family was based on this
philosophy. Mealy (1966) described it as follows:
"Because the basic structure of 0S/360 is equally applicable to
batchedojob and real-time applications, it may be viewed as one of the first
instances of a second-generation operating system. The new objective of
such a system is to accommodate an environment of diverse applications
and operating modes. Although not to be discounted in importance, various
other objectives are not new--they have been recognized to some degree in
prior systems. Foremost among these secondary objectives are:
Increased throughput
Lowered response time
Increased programmer productivity
Adaptability (of programs to changing resources)
Expandability
"A second-generation operating system must be geared to change and
diversity. System/360 itself can exist in an almost unlimited variety of
machine configurations."
Notice that performance (throughput and response) is considered to be
of secondary importance to functional scope.
An operating system that tries to be all things to all men naturally
becomes very large. OS/360 is more than an operating system--it is a library
of compilers, utility programs, and resource management programs. It
contains several million instructions. Nash gave the following figures for the
resource management components of OS/360 in 1966:
Data management 58.6 K statements
Scheduler 45.0
Supervisor 26.0
Utilities 53.0
Linkage editor 12.3
Testran 20.4
System generator 4.4
Total 219.7 K
(Nato report, 1968, page 67).
Because of its size, the OS/360 is also quite unreliable. To cite Hopkins:
"We face a fantastic problem in big systems. For instance, in OS/360 we
have about 1000 errors each release and this number seems to be
reasonably constant" (Nato report, 1969, page 20).
This is actually a very low percentage of errors considering the size of
the system.
This method has been called design according to functional specifications.
The results have been generally disappointing, and the reason is
simply this: Resource sharing is the main purpose of an operating system,
and resources are shared most efficiently when the designer takes full
advantage of his knowledge of the special characteristics of the resources
and the jobs using them. This advantage is immediately denied him by
requiring that an operating system must work in a much more general case.
This concludes the overview of operating systems. In the following
chapters, operating systems are studied at a detailed level in an attempt to
build a sound theoretical understanding of the general principles of
multiprogramming and resource sharing.
1.5. LITERATURE
This chapter owes much to a survey by Rosin (1969) of the
technological development of operating systems.
With the background presented, you will easily follow the arguments in
the excellent papers on the early operating systems mentioned: Atlas
(Kilburn, 1961; Morris, 1967), Exec II (Lynch, 1967 and 1971), CTSS
(Corbato, 1962), and SDC Q-32 (Schwartz, 1964 and 1967). I recommend
that you study these papers to become more familiar with the purpose of
operating systems before you proceed with the analysis of their
fundamentals. The Atlas, Exec II, and SDC systems are of special interest
because they were critically reevaluated after several years of actual use.
The paper by Fraser (1971) explains in some detail the practical
problems of maintaining the integrity of data in a disk filing system in spite
of occasional hardware malfunction and of protecting these data against
unauthorized usage.
CORBATO, F. J., MERWIN-DAGGETT, M., and DALEY, R. C., "An experimental
time-sharing system," Proc. AFIPS Fall Joint Computer Conf., pp. 335-44, May 1962.
FRASER, A. G., "The integrity of a disc based file system," International Seminar on
Operating System Techniques, Belfast, Northern Ireland, Aug.-Sept. 1971.
KILBURN, T., HOWARTH, D. J., PAYNE, R. B., and SUMNER, F. H., "The
Manchester University Atlas operating system. Part I: Internal organization,"
Computer Journal 4, 3, 222-25, Oct. 1961.
LYNCH, W. C. "Description of a high capacity fast turnaround university computing
center," Proc. ACM National Meeting, pp. 273-88, Aug. 1967.
LYNCH, W. C., "An operating system design for the computer utility environment,"
International Seminar on Operating System Techniques, Belfast, Northern Ireland,
Aug.-Sept. 1971.
MORRIS, D., SUMNER, F. H., and WYLD, M. T., "An appraisal of the Atlas
supervisor," Proc. ACM National Meeting, pp. 67.75, Aug. 1967.
ROSIN, R. F., "Supervisory and monitor systems," Computing Surveys 1, 1, pp. 15-32,
March 1969.
SCHWARTZ, J. I., COFFMAN, E. G., and WEISSMAN, C., "A general purpose
time-sharing system," Proc. AFIPS Spring Joint Computer Conf., pp. 397-411, April
1964.
SCHWARTZ, J. I. and WEISSMAN, C., "The SDC timesharing system revisited," Proc.
ACM National Meeting, pp. 263-71, Aug. 1967.
Chapter 2 - SEQUENTIAL PROCESSES
This chapter describes the role of abstraction and structure in problem
solving, and the nature of computations. It also summarizes the structuring
principles of data and sequential programs and gives an example of
hierarchal program construction.
2.1. INTRODUCTION
The starting point of a theory of operating systems must be a sequential
process--a succession of events that occur one at a time. This is the way our
machines work; this is the way we think. Present computers are built from
a small number of large sequential components: store modules which can
access one word at a time, arithmetic units which can perform one addition
at a time, and peripherals which can transfer one data block at a time.
Programs for these computers are written by human beings who master
complexity by dividing their tasks into smaller parts which can be analyzed
and solved one at a time.
This chapter is a summary of the basic concepts of sequential
programming. I assume you already have an intuitive understanding of
many of the problems from your own programming experience. We shall
begin by discussing the role of abstraction and structure in problem solving.
2.2. ABSTRACTION AND STRUCTURE
Human beings can think precisely only of simple problems. In our
efforts to understand complicated problems, we must concentrate at any
moment on a small number of properties that we believe are essential for
our present purpose and ignore all other aspects. Our partial descriptions of
the world are called abstractions or models.
One form of abstraction is the use of names as abbreviations for more
detailed explanations. This is the whole purpose of terminology. It enables
us to say "operating system" instead of "a set of manual and automatic
procedures that enable a group of people to share a computer installation
efficiently."
In programming, we use names to refer to variables. This abstraction
permits us to ignore their actual values. Names are also used to refer to
programs and procedures. We can, for example, speak of "the editing
program." And if we understand what editing is, then we can ignore, for
the moment, how it is done in detail.
Once a problem is understood in terms of a limited number of aspects,
we proceed to analyze each aspect separately in more detail. It is often
necessary to repeat this process so that the original problem is viewed as a
hierarchy of abstractions which are related as components within
components at several levels of detail.
In the previous example, when we have defined what an editing
program must do, we can proceed to construct it. In doing so, we will
discover the need for more elementary editing procedures which can
"search," "delete," and "insert" a textstring. And within these procedures,
we will probably write other procedures operating on single characters. So
we end up with several levels of procedures, one within the other.
As long as our main interest is the properties of a component as a
whole, it is considered a primitive component, but, when we proceed to
observe smaller, related components inside a larger one, the latter is
regarded as a structured component or system. When we wish to make a
distinction between a given component and the rest of the world, we refer
to the latter as the environment of that component.
The environment makes certain assumptions about the properties of a
component and vice versa: The editing program assumes that its input
consists of a text and some editing commands, and the user expects the
program to perform editing as defined in the manual. These assumptions
are called the connections between the component and its environment.
The set of connections between components at a given level of detail
defines the structure of the system at that level. The connections between
the editing program and its users are defined in the program manual. Inside
the editing program, the connections between the components "search,"
"delete," and "insert" are defined by what these procedures assume about
the properties and location of a textstring and by what operations they
perform on it.
Abstraction and recognition of structure are used in all intellectual
disciplines to present knowledge in forms that are easily understood and
remembered. Our concern is the systematic use of abstraction in the design
of operating systems. We will try to identify problems which occur in all
shared computer installations and define a set of useful components and
rules for connecting them into systems.
In the design of large computer programs, the following difficulties
must be taken for granted: (1) improved understanding of the problems
will change our goals in time; (2) technological innovations will eventually
change our tools; and (3) our intellectual limitations will often cause us to
make errors in the construction of large systems. These difficulties imply
that large systems will be modified during their entire existence by
designers and users, and it is essential that we build such systems with this
in mind. If it is difficult to understand a large system, it is also difficult to
predict the consequences of modifying it. So reliability is intimately related
to the simplicity of structure at all levels.
Figure 2.1(a) shows a complicated system S, consisting of n components
S1, S2, ..., Sn. In this system, each component depends directly on
the behavior of all other components. Suppose an average of p simple steps
of reasoning or testing are required to understand the relationship between
one component and another and to verify that they are properly
connected. Then the connection of a single component to its environment
can be verified in (n-1)p steps, and the complete system requires
n(n - 1)p steps.
This can be compared with the system shown in Fig. 2.1(b): There, the
connections between the n components are defined by a common set of
constraints chosen so that the number of steps q required to verify whether
a component satisfies them is independent of the number of components n.
The proof or test effort is now q steps per component and nq steps for the
whole system.
The argument is, of course, extreme, but it does drive home the
following point: If the intellectual effort required to understand and test a
system increases more than linearly with the size of the system, we shall
never be able to build reliable systems beyond a certain complexity. Our
only hope is to restrict ourselves to simple structures for which the effort
of verification is proportional to the number of components.
The importance of precise documentation of system structure can
hardly be overemphasized. Quite often, a group of designers intend to
adopt a simple structure, but they fail to state precisely what the
assumptions are at each level of programming, and instead rely on informal,
spoken agreements. Inevitably, the result is that each member of the group
adds complexity to the structure by making unnecessary or erroneous
assumptions about the behavior of components designed by his colleagues.
The importance of making assumptions explicit is especially felt when an
initial version of a system must be modified; perhaps by a different group
of people.
2.3. COMPUTATIONS
In Chapter 1 the word computation was used intuitively to refer to
program execution. In the following, this concept and its components data
and operations are defined explicitly.
2.3.1. Data and Operations
The exchange of facts or ideas among human beings by speech is based
on mutual agreement on the meaning of certain sounds and combinations
of sounds. Other conventions enable us to express the same ideas by means
of text and pictures, holes in punched cards, polarity of magnetized media,
and modulated electromagnetic waves. In short, we communicate by means
of physical phenomena chosen by us to represent certain aspects of our
world. These physical representations of our abstractions are called data,
and the meanings we assign to them are called their information.
Data are used to transmit information between human beings, to store
information for future use, and to derive new information by manipulating
the data according to certain rules. Our most important tool for the
manipulation of data is the digital computer.
A datum stored inside a computer can only assume a finite set of values
called its type. Primitive types are defined by enumeration of their values,
for example:
type boolean = (false, true)
or by definition of their range:
type integer = - 8388608..8388607
Structured types are defined in terms of primitive types, as is explained
later in this chapter.
The rules of data manipulation are called operations. An operation
maps a finite set of data, called its input, into a finite set of data, called its
output. Once initiated, an operation is executed to completion within a
finite time. These assumptions imply that the output of an operation is a
time-independent function of its input, or, to put it differently: An
operation always delivers the same output values when it is applied to a
given set of input values.
An operation can be defined by enumeration of its output values for all
possible combinations of its input values. This set of values must be finite
since an operation only involves a finite set of data with finite ranges. But
in practice, enumeration is only useful for extremely simple operations
such as the addition of two decimal digits. As soon as we extend this
method of definition to the addition of two decimal numbers of, say, 10
digits each, it requires enumeration of 10^20 triples (x, y, x + y)!
A more realistic method is to define an operation by a computational
rule involving a finite sequence of simpler operations. This is precisely the
way we define the addition of numbers. But like other abstractions,
computational rules are useful intellectual tools only as long as they remain
simple.
The most powerful method of defining an operation is by assertions
about the type of its variables and the relationships between their values
before and after the execution of the operation. These relationships are
expressed by statements of the following kind: If the assertion P is true
before initiation of the operation Q, then the assertion R will be true on its
completion. I will use the notation
"P" Q "R"
to define such relationships.
As an example, the effect of an operation sort which orders the n
elements of an integer array A in a non-decreasing sequence can be defined
as follows:
"A: array 1..n of integer"
sort(A);
"for all i, j: 1..n (i ~j implies A(i) ~ A(j))"
Formal assertions also have limitations: They tend to be as large as the
programs they refer to. This is evident from the simple examples that have
been published (Hoare, 1971a).
The whole purpose of abstraction is abbreviation. You can often help
the reader of your programs much more by a short, informal statement that
appeals to a common background of more rigorous definition. If your
reader knows what a Fibonacci number is, then why write
"F: array 0..n of integer & j: O. .n &
for all i: 0..j (F(i) = if i < 2 then i else F(i - 2) + F(i - 1))"
when the following will do
"F(0) to F(j) are the first j + 1 Fibonaeci numbers"
Definition by formal or informal assertion is an abstraction which
enables us to concentrate on what an operation does and ignore the details
of how it is carried out. The type concept is an abstraction which permits
us to ignore the actual values of variables and state that an operation has
the effect defined for all values of the given types.
2.3.2. Processes
Data and operations are the primitive components of computations.
More precisely, a computation is a finite set of operations applied to a
finite set of data in an attempt to solve a problem. If a computation solves
the given problem, it is also called an algorithm. But it is possible that a
computation is meaningless in the sense that it does not solve the problem
it was intended to solve.
The operations of a computation must be carried out in a certain order
of precedence to ensure that the results of some operations can be used by
others. The simplest possible precedence rule is the execution of operations
in strict sequential order, one at a time. This type of computation is called
a sequential process. It consists of a set of operations which are totally
ordered in time.
Figure 2.2(a) shows a precedence graph of the process performed by an
operating system that schedules user computations one at a time. Each
node represents an instance of one of the following operations: r (read user
request), x (execute user computation), and p (print user results). The
directed branches represent precedence of operations. In this case:
r1 precedes x1,
x1 precedes p1,
p1 precedes r2,
r2 precedes x2, ...
Most of our computational problems require only a partial ordering of
operations in time: Some operations must be carried out before others, but
some of them can also be carried out concurrently. This is illustrated in Fig.
2.2(b) by a precedence graph of a spooling system (see Chapter 1, Fig. 1.3).
Here the precedence rules are:
r1 precedes x1 and r2,
x1 precedes p1 and x2,
p1 precedes p2,
r2 precedes x2 and r3, ...
Partial ordering makes concurrent execution of some operations
possible. In Fig. 2.2(b), the executions of the following operations may
overlap each other in time:
r2 and x1,
r3 and x2 and p1, ...
In order to understand concurrent computations, it is often helpful to
try to partition them into a number of sequential processes which can be
analyzed separately. This decomposition can usually be made in several
ways, depending on what your purpose is.
If you wish to design a spooling system, then you will probably
partition the computation in Fig. 2.2(b) into three sequential processes of a
cyclical nature each in control of a physical resource:
reader process: r1; r2; r3; ...
scheduler process: x1; x2; x3; ...
printer process: p1; p2; p3; ...
These processes can proceed simultaneously with independent speeds,
except during short intervals when they must exchange data: The scheduler
process must receive user requests from the reader process, and the printer
process must be informed by the scheduler process of where user results are
stored. Processes which cooperate in this manner are called loosely
connected processes.
On the other hand, if you are a user, it makes more sense to recognize
the following processes in Fig. 2.2(b):
job 1: r1; x1; p1;
job 2: r2; x2; p2;
job 3: r3; x3; p3;
Both decompositions are useful for a particular purpose, but each of
them also obscures certain facts about the original computation. From the
first decomposition it is not evident that the reader, scheduler, and printer
processes execute a stream of jobs; the second decomposition hides the fact
that the jobs share the same reader, processor, and printer. A decomposition
of a computation into a set of processes is a partial description or an
abstraction of that computation. And how we choose our abstractions
depends on our present purpose.
The abstractions chosen above illustrate a general principle: In a
successful decomposition, the connections between components are much
weaker than the connections inside components. It is the loose connections
or infrequent interactions between the processes above which make it
possible for us to study them separately and consider their interactions
only at a few, well-defined points.
One of the recurrent themes of this book is process interaction. It is a
direct consequence of the sharing of a computer. Processes can interact for
various reasons: (1) because they exchange data, such as the reader,
scheduler, and printer processes; (2) because they share physical resources,
such as job 1, job 2, job 3, and so on; or (3) because interaction simplifies our
understanding and verification of the correctness of a computation--this is
a strong point in favor of sequential computations.
Concurrent computations permit better utilization of a computer
installation because timing constraints among physical components are
reduced to a minimum. But since the order of operations in time is not
completely specified, the output of a concurrent computation may be a
time-dependent function of its input unless special precautions are taken.
This makes it impossible to reproduce erroneous computations in order to
locate and correct observed errors. In contrast, the output of a sequential
process can always be reproduced when its input is known. This property
of sequential processes along with their extreme simplicity makes them
important components for the construction of concurrent computations.
The main obstacles to the utilization of concurrency in computer
installations are economy and human imagination. Sequential processes can
be carried out cheaply by repeated use of simple equipment; concurrent
computations require duplicated equipment.
Human beings find it very difficult to comprehend the combined effect
of activities which evolve simultaneously with independent speeds. Those
who have studied the history of nations in school, one by one--American
history, French history, and so on--recall how difficult it was to remember,
in connection with a crucial time of transition in one country, what
happened in other countries at the same time. The insight into our
historical background can be greatly improved by presenting history as a
sequence of stages and by discussing the situation in several countries at
each stage--but then the student finds it equally difficult to remember the
continuous history of a single nation!
It is hard to avoid the conclusion that we understand concurrent events
by looking at sequential subsets of them. This would mean that, even
though technological improvements may eventually make a high degree of
concurrency possible in our computations, we shall still attempt to
partition our problems conceptually into a moderate number of sequential
activities which can be programmed separately and then connected loosely
for concurrent execution.
In contrast, our understanding of a sequential process is independent of
its actual speed of execution. All that matters is that operations are carried
out one at a time with finite speed and that certain relations hold between
the data before and after each operation.
2.3.3. Computers and Programs
The idea of defining complicated computations rigorously implies the
use of a formal language to describe primitive data types and operations as
well as combinations of them. A formal description of a computation is
called a program, and the language in which it is expressed is called a
programming language.
Programs can be used to communicate algorithms among human beings,
but, in general, we write programs to solve problems on computers. We will
indeed define a computer installation as a physical system capable of
carrying out computations by interpreting programs.
Figure 2.3 shows a model of a computer installation. It consists of a
store and one or more processors. The store is a physical component in
which data and programs can be retained for future use. It is divided into a
finite set of primitive components called locations. Each location can store
any one of a finite set of data values.
A processor is a physical component which can carry out a sequential
process defined by a program. During its execution, a program is stored as a
sequence of data called instructions. An instruction consists of four
components defining an operation, its input and output, and a successor
instruction. Data used to identify store locations are called addresses.
The processors can work concurrently and share the common store.
Some of the processors are called terminals or peripheral devices; they are
dedicated to the transfer of data between the environment and the store.
Other processors are called central processors; they operate mainly on
stored data. For our purposes, the distinction between peripheral devices
and central processors is not fundamental; it merely reflects various degrees
of specialization.
The rest of this chapter is a discussion of the fundamental abstraction,
sequential processes. It summarizes methods of structuring data and
sequential programs, and serves as a presentation of the programming
language used throughout the text, a subset of the language Pascal, created
by Wirth. The algorithmic statements of Pascal are based on the principles
and notation of Algol 60. But the data structures of Pascal are much more
general than those of Algol 60.
Pascal permits hierarchal structuring of data and program, extensive
error checking at compile time, and production of efficient machine code
on present computers. It combines the clarity needed for teaching the
subject with the efficiency required for designing operating systems. If you
are familiar with Algol 60, you will find it quite natural to adopt Pascal as
your programming tool.
The following is a brief and very informal description of a subset of
Pascal with the emphasis on the language features that make it different
from Algol 60. Although my summary of Pascal is sufficient for
understanding the rest of the book, I recommend that you study the
official Pascal report, which is written in clear, informal prose (Wirth,
1971a).
I have taken a few minor liberties with the Pascal notation. They are
not mentioned explicitly because my subject is not rigorous language
definition, but operating system principles.
2.4. DATA STRUCTURES
2.4.1. Primitive Data Types
Constants are denoted by numbers or identifiers. A definition of the
form:
const a1 = c1, a2 = c2, ..., ak = ck;
introduces the identifiers a1, a2, ..., ak as synonyms of the constants c1,
c2, ..., ck, for example:
const e = 2.718281828;
Variables are introduced by declarations of the form:
var v1, v2, ..., vk: <data type>;
which associates the identifiers v1, v2, ..., vk with a data type.
A data type is the set of values which can be assumed by a variable. A
type can be defined either directly in the declaration of a variable or
separately in a type definition which associates an identifier T with the
type:
type T = <data type>;
var v1, v2, ..., vk: T;
A primitive type is a finite, ordered set of values. The primitive types:
boolean
integer
real
are predefined for a given computer.
Other primitive types can be defined by enumeration of a set of
successive values:
(a1, a2, ..., ak)
denoted by identifiers a1, a2, ..., ak, for example:
type name of month =
(January, February, March, April, May, June, July,
August, September, October, November, December);
A primitive type can also be defined as a range within another primitive
type:
cmin..cmax
where cmin and cmax are constants denoting the minimum and maximum
values in the range, for example:
type number of day = 1..31;
var payday: number of day;
var summer month: June..August;
The first example is a variable payday, which can assume the values 1 to 31
(a subrange of the standard type integer). The second example is a variable
summer month, which can assume the values June to August (a subrange of
the type name of month defined previously).
The set of values which can be assumed by a variable v of a primitive
type T can be generated by means of the standard functions:
min(T) max(T) succ(v) pred(v)
in ascending order:
v:= min(T);
while v ≠ max(T) do v:= succ(v);
or in descending order:
v:= max(T);
while v ≠ min(T) do v:= pred(v);
2.4.2. Structured Data Types
Structured types are defined in terms of primitive types or in terms of
other structured types using the connection rules of arrays and records.
The type definition:
array D of R
defines a data structure consisting of a fixed number of components of
type R. Each component of an array variable v is selected by an index
expression E of type D:
v(E)
Examples:
type table = array 1..20 of integer;
var A: table; i: 1..20;
... A(i) ...
var length of month:
array name of month of number of day;
... length of month(February) ...
The type definition:
record f1: T1; f2: T2; ... ; fk: Tk end
defines a data structure consisting of a fixed number of components of
types T1, T2, ..., Tk. The components are selected by identifiers f1,
f2, ..., fk. A component fj within a record variable v is denoted:
v.fj
Example:
type date = record
day: number of day;
month: name of month;
year: 0..2000;
end
var birthday: date;
... birthday.month ...
Good programmers do not confine themselves exclusively to the
structures defined by an available programming language. They invent
notations for abstractions that are ideally suited to their present purpose.
But they will choose abstractions which later can be represented efficiently
in terms of the standard features of their language.
I will occasionally do the same and postulate extensions to Pascal which
help me to stress essential concepts and, for the moment, ignore trivial
details.
As one example, I will assume that one can declare a variable s
consisting of a sequence of components of type T:
var s: sequence of T
Initially, the sequence is empty. The value of a variable t of type T can
be appended to or removed from the sequence s by means of the standard
procedures
put(t, s) get(t, s)
The components are removed in the order in which they are appended to
the sequence. In other words, a sequence is a first-in, first-out store.
The boolean function
empty(s)
defines whether or not the sequence s is empty.
The implementation of sequences by means of arrays will be explained
in Chapter 3. For a more detailed discussion of the representation of
various data structures see Knuth (1969).
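Although the representation of sequences is deferred to Chapter 3, the
following sketch may help fix the idea. It is my own illustration, not the
book's: the sequence is kept in a ring buffer of at most max components
(T is fixed as integer here); initialization of head, tail, and length to
zero is assumed, and overflow is ignored.
const max = 100;   "an assumed capacity"
type T = integer;  "stands for any component type"
     sequence = record
       contents: array 1..max of T;
       head, tail, length: 0..max;
     end
procedure put(t: T; var s: sequence);
begin
  s.tail:= s.tail mod max + 1;
  s.contents(s.tail):= t;
  s.length:= s.length + 1;
end
procedure get(var t: T; var s: sequence);
begin
  s.head:= s.head mod max + 1;
  t:= s.contents(s.head);
  s.length:= s.length - 1;
end
function empty(s: sequence): boolean;
begin
  empty:= s.length = 0
end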
2.5. PROGRAM STRUCTURES
2.5.1. Primitive Statements
Operations and combinations of them are described by statements. The
primitive statements are exit statements, assignment statements, and
procedure statements.
The exit statement
exit L
is a restricted form of a go to statement. It causes a jump to the end of a
compound statement labeled L:
label L begin ... exit L; ... end
This use of jumps is an efficient way of leaving a compound statement
in the exceptional cases when a solution to a problem is found earlier than
expected, or when no solution exists. But in contrast to the unstructured
go to statement, the exit statement simplifies the verification of program
correctness.
Suppose an assertion P holds before a compound statement Q is
executed. There are now two cases to consider: Either Q is executed to
completion, in which case an assertion R is known to hold; or an exit is
made from Q when an exceptional case S holds. So the effect of statement
Q is the following:
"P" Q "R or S"
The assignment statement
v:= E
assigns the value of an expression E to a variable v. The expression must be
of the same type as v. Expressions consist of operators and functions
applied to constants, variables, and other expressions. The operators are:
arithmetic: + - * / mod
relational: = ≠ < > ≤ ≥
boolean: & or not
The function designator
F(a1, a2, ..., ak)
causes the evaluation of a function F with the actual parameters a1,
a2, ..., ak.
The procedure statement
P(a1, a2, ..., ak)
causes the execution of a procedure P with the actual parameters a1,
a2, ..., ak.
The actual parameters of functions and procedures can be variables and
expressions. Expressions used as parameters are evaluated before a function
or procedure is executed.
Structured statements are formed by connecting primitive statements
or other structured statements according to the following rules:
(1) Concatenation of statements S1, S2, ..., Sn into a compound
statement:
label L begin S1; S2; ...; Sn end
begin S1; S2; ...; Sn end
(See Fig. 2.4).
(2) Selection of one of a set of statements by means of a boolean
expression B:
if B then S1 else S2
if B then S
or by means of an expression E of a primitive type T:
type T = (c1, c2, ..., cn);
...
case E of
c1: S1; c2: S2; ... cn: Sn;
end
If E = cj then statement Sj is executed (See Fig. 2.5).
(3) Repetition of a statement S while a boolean expression B remains true:
while B do S
or repetition of a sequence of statements S1, S2, ..., Sn until a boolean
expression B becomes true:
repeat S1; S2; ...; Sn until B
(See Fig. 2.6).
Another possibility is to repeat a statement S with a succession of
primitive values assigned to a control variable v:
for v:= Emin to Emax do S
(4) Recursion of a procedure P which calls itself:
procedure P( ... );
begin ... P; ... end
(See Fig. 2.7.)
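A minimal instance of rule (4), again an example of my own rather than the
text's, is the factorial function, which calls itself with a smaller argument:
function factorial(n: integer): integer;
begin
  if n = 0 then factorial:= 1
  else factorial:= n * factorial(n - 1)
end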
Notice that the analytical effort required to understand the effect of
these structures is proportional to the number of component statements.
For example, in the analysis of an if statement, we must first prove that a
certain assertion R will be true after execution of statement S1, provided
assertions B and P hold before S1 is initiated, and similarly for S2:
"B & P" S1 "R"    "not B & P" S2 "R"
From this we infer that
"P" ifB then S1 else $2 "R"
The repetition statements are understood by mathematical induction.
From
"B & P" S "P"
we infer that
"P" while B do S "not B & P"
The aim is to find an invariant P--an assertion which is true before the
iteration is started and remains true after each execution of the statement S.
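As a small illustration of my own (not the book's), consider summing the
integers 1 to n. The invariant P is the assertion sum = 1 + 2 + ... + i & i ≤ n:
const n = 100;   "an assumed upper bound"
var i, sum: integer;
...
i:= 0; sum:= 0;
"P: sum = 1 + 2 + ... + i  &  i ≤ n"
while i ≠ n do
begin
  i:= i + 1;
  sum:= sum + i;
  "P holds again at this point"
end;
"not B & P: i = n  &  sum = 1 + 2 + ... + n"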
For records, the following structured statement can be used:
with v do S
It enables the statement S to refer to the components of a record variable v
by their identifiers f1, f2, ..., fk without qualifying them with the record
identifier v, for example:
with birthday do
begin day:= 19; month:= April; year:= 1938 end
Finally, we have the important abstraction of assigning a name to a
sequence of statements S1, S2, ..., Sn by means of procedure and
function declarations of the form:
procedure P(p1; p2; ...; pk);
< local declarations >
begin S1; S2; ...; Sn end
function F(p1; p2; ...; pk): <result type>;
< local declarations >
begin S1; S2; ...; Sn end
where p1, p2, ..., pk are declarations of formal parameters. The declaration
pj of a constant or variable parameter vj of type Tj has the form:
const vj: Tj var vj: Tj
The preceding specifier can be omitted for constant parameters.
A function F computes a value that must be of a primitive type. At
least one of its statements S1, S2, ..., Sn must assign a result value to the
function identifier F
F:= E
where E is an expression of the result type.
The function statements must not assign values to actual parameters
and non-local variables. This rule simplifies program verification as will be
explained in Chapter 3.
The declarations of identifiers which are local to a procedure or
function are written before the begin symbol of the statement part. They
are written in the following order:
const
type
var
A program consists of a declaration part and a compound statement.
Finally, I should mention that comments are enclosed in quotes:
"This is a comment"
This concludes the presentation of the Pascal subset.
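As a final illustration, here is a complete program in the subset; it is a
small example of my own, not taken from the book. The declaration part reuses
the month types of Section 2.4; the compound statement tabulates the lengths
of the months of a non-leap year and adds them up:
type name of month =
  (January, February, March, April, May, June, July,
   August, September, October, November, December);
     number of day = 1..31;
var length of month: array name of month of number of day;
    m: name of month; days in year: integer;
begin
  for m:= January to December do length of month(m):= 31;
  length of month(February):= 28;
  length of month(April):= 30;
  length of month(June):= 30;
  length of month(September):= 30;
  length of month(November):= 30;
  days in year:= 0;
  for m:= January to December do
    days in year:= days in year + length of month(m);
  "days in year = 365"
end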
2.6. PROGRAM CONSTRUCTION
We design programs the same way we solve other complex problems: by
step-wise analysis and refinement. In this section, I give an example of
hierarchal program construction.
2.6.1. The Banker's Algorithm
The example chosen is a resource sharing problem first described and
solved by Dijkstra (1965). Although the problem involves concurrent
processes, it is used here as an example of sequential programming.
An operating system shares a set of resources among a number of
concurrent processes. The resources are equivalent in the sense that when a
process makes a request for one of them, it is irrelevant which one is
chosen. Examples of equivalent resources are peripheral devices of the same
type and store pages of equal size.
When a resource has been allocated to a process, it is occupied until the
process releases it again. When concurrent processes share resources in this
manner, there is a danger that they may end up in a deadlock, a state in
which two or more processes are waiting indefinitely for an event that will
never happen.
Suppose we have 5 units of a certain resource, and we are in a state
where 2 units are allocated to a process P and 1 unit to another process Q.
But both processes need two more units to run to completion. If we are
lucky, one of them, say Q, will acquire the last two units, run to
completion, and release all three of its units in time to satisfy further
requests from P. But it is also possible that P and Q both will acquire one of
the last two units and then (since there are no more) will decide to wait
until another unit becomes available. Now they are deadlocked: P cannot
continue until Q releases a unit; Q cannot continue until P releases a unit;
and each of them expects the other to resolve the conflict.
The deadlock could have been prevented by allocating all units needed
by P (or Q) at the same time rather than one by one. This policy would
have forced P and Q to run at different times, and as we have seen in
Chapter 1, this is often the most efficient way of using the resources. But
for the moment, we will try to solve the problem without this restriction.
Let me define the problem more precisely in Dijkstra's terminology: A
banker wishes to share a fixed capital of florins among a fixed number of
customers. Each customer specifies in advance his maximum need for
florins. The banker will accept a customer if his need does not exceed the
capital.
During a customer's transactions, he can only borrow or return florins
one by one. It may sometimes be necessary for a customer to wait before
he can borrow another florin, but the banker guarantees that the waiting
time will always be finite. The current loan of a customer can never exceed
his maximum need.
If the banker is able to satisfy the maximum need of a customer, then
the customer guarantees that he will complete his transactions and repay
his loan within a finite time.
The current situation is safe if it is possible for the banker to enable all
his present customers to complete their transactions within a finite time;
otherwise, it is unsafe.
We wish to find an algorithm which can determine whether the banker's
current situation is safe or unsafe. If the banker has such an algorithm, he
can use it in a safe situation to decide whether a customer who wants to
borrow another florin should be given one immediately or told to wait. The
banker makes this decision by pretending to grant the florin and then
observing whether this leads to a safe situation or not.
The situation of a customer is characterized by his current loan and his
further claim where
claim = need - loan
The situation of the banker is characterized by his original capital and
his current amount of cash where
cash = capital - sum of loans
The algorithm that determines whether the overall situation is safe or
not is quite simple. Let me illustrate it by an example: Fig. 2.8 shows a
situation in which three customers, P, Q, and R, share a capital of 10
florins. Their combined need is 20 florins. In the current situation, Fig.
2.8(a), customer Q has a loan of 2 florins and a claim of 1 florin; this is
denoted 2 (1). For P and R the loans and claims are 4 (4) and 2 (7)
respectively. So the available cash C at the moment is 10 - 4 - 2 - 2 = 2.
The algorithm examines the customers one by one, looking for one who
has a claim not exceeding the cash. In Fig. 2.8(a), customer Q has a claim
of 1. Since the cash is 2, customer Q will be able in this situation to
complete his transactions and return his current loan of 2 florins to the
banker.
After the departure of customer Q, the situation will be the one shown
in Fig. 2.8(b). The algorithm now scans the remaining customers and
compares their claims with the increased cash of 4 florins. It is now possible
to satisfy customer P completely.
This leads to the situation in Fig. 2.8(c) in which customer R can
complete his transactions. So finally, in Fig. 2.8(d) the banker has regained
his capital of 10 florins. Consequently, the original state Fig. 2.8(a) was
safe.
It is possible to go from the safe state in Fig. 2.8(a) to an unsafe
situation, such as the one in Fig. 2.9(a). Here, the banker has granted a
request from customer R for another florin. In this new situation, customer
Q can be satisfied. But this leads us to the situation in Fig. 2.9(b) in which
we are stuck: neither P nor R can complete their transactions.
If the banker's algorithm finds that a situation is unsafe, this does not
necessarily mean that a deadlock will occur--only that it might occur. But
if the situation is safe, it is always possible to prevent a deadlock. Notice
that the banker prevents deadlocks in the same way as we originally
proposed: by serving the customers one at a time; but the banker only does
so in situations where it is strictly necessary.
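Before turning to the general case, it may help to see the safety test for a
single currency written out in full. The following sketch is mine, not the
book's; the customers of Fig. 2.8(a) are numbered so that P = 1, Q = 2, and
R = 3:
const n = 3;   "number of customers"
type B = 1..n;
var claim, loan: array B of integer; ok: boolean;
function safe(capital: integer): boolean;
var cash: integer; customer: B;
    completed: array B of boolean; progress: boolean;
begin
  cash:= capital;
  for customer:= 1 to n do
  begin
    cash:= cash - loan(customer);
    completed(customer):= false;
  end;
  repeat
    progress:= false;
    for customer:= 1 to n do
      if not completed(customer) then
      if claim(customer) ≤ cash then
      begin
        "pretend that this customer completes his transactions"
        cash:= cash + loan(customer);
        completed(customer):= true;
        progress:= true;
      end;
  until not progress;
  safe:= cash = capital;
end
begin
  loan(1):= 4; claim(1):= 4;   "P in Fig. 2.8(a)"
  loan(2):= 2; claim(2):= 1;   "Q"
  loan(3):= 2; claim(3):= 7;   "R"
  ok:= safe(10);
  "ok = true: the situation of Fig. 2.8(a) is safe"
  loan(3):= 3; claim(3):= 6;   "grant R another florin"
  ok:= safe(10);
  "ok = false: the situation of Fig. 2.9(a) is unsafe"
end
With several currencies the test is the same in outline; a claim must then
not exceed the cash of any currency, which is what the function completion
possible checks in the general algorithm below.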
We will now program the banker's algorithm for the general case
considered by Habermann (1969), in which the banker's capital consists of
several currencies: florins, dollars, pounds, and so on.
2.6.2. A Hierarchal Solution
The first version of the banker's algorithm is trivial:
type S = ?
function safe(current state: S): boolean;
It consists of a boolean function safe with a parameter defining the current
state. The details of the function and the type of its parameter are as yet
unknown.
The first refinement (Algorithm 2.1) expresses in some detail what the
ALGORITHM 2.1 The Banker's Algorithm
type S = ?
function safe(current state: S): boolean;
var state: S;
begin
state:= current state;
complete transactions(state);
safe:= all transactions completed(state);
end
function safe does: It simulates the completion of customer transactions as
far as possible. If all transactions can be completed, the current state is safe.
In the second refinement (Algorithm 2.2), the state is decomposed
into: (1) an array defining the claim and loan of each customer and
whether or not that customer's transactions have been completed; and (2)
two components defining the capital and cash of the banker. The exact
representation of currencies (type C) is still undefined.
The procedure complete transactions is now defined in some detail. It
examines the transactions of one customer at a time: If they have not
already been completed and completion is possible, the procedure simulates
the return of the customer's loan to the banker. This continues until no
more transactions can be completed.
ALGORITHM 2.2 The Banker's Algorithm (cont.)
type S = record
transactions: array B of
record
claim, loan: C;
completed: boolean;
end
capital, cash: C;
end
B = 1..number of customers;
C = ?
procedure complete transactions(var state: S);
var customer: B; progress: boolean;
begin
with state do
repeat
progress:= false;
for every customer do
with transactions(customer) do
if not completed then
if completion possible(claim, cash) then
begin
return loan(loan, cash);
completed:= true;
progress:= true;
end
until not progress;
end
The statement
for every customer do...
is equivalent to
for customer:= min(B) to max(B) do...
Algorithm 2.3 shows that the current state is safe if the banker
eventually can get his original capital back.
ALGORITHM 2.3 The Banker's Algorithm (cont.)
function all transactions completed(state: S): boolean;
begin
with state do
all transactions completed := capital = cash;
end
In the third refinement (Algorithm 2.4), we define the representation
of a set of currencies as an array of integers and write the details of the
function completion possible, which shows that the transactions of a single
customer can be completed if his claim of each currency does not exceed
the available cash.
ALGORITHM 2.4 The Banker's Algorithm (cont.)
type C = array D of integer;
D = 1..number of currencies;
function completion possible(claim, cash: C): boolean;
var currency: D;
label no
begin
for every currency do
if claim(currency) > cash(currency) then
begin
completion possible:= false;
exit no;
end
completion possible:= true;
end
Algorithm 2.5 shows the final details of the banker's algorithm.
ALGORITHM 2.5 The Banker's Algorithm (cont.)
procedure return loan(var loan, cash: C);
var currency: D;
begin
for every currency do
cash(currency):= cash(currency) + loan(currency);
end
To be honest, I do not construct my first version of any program in the
orderly manner described here. Nor do mathematicians construct proofs in
the way in which they present them in textbooks. We must all experiment
with a problem by trial and error until we understand it intuitively and
have rejected one or more incorrect solutions. But it is important that the
final result be so well-structured that it can be described in a step-wise
hierarchal manner. This greatly simplifies the effort required for other
people to understand the solution.
2.6.3. Conclusion
I have constructed a non-trivial program (Algorithm 2.6) step by step in
a hierarchal manner. At each level of programming, the problem is
described in terms of a small number of variables and operations on these
variables.
ALGORITHM 2.6 The Complete Banker's Algorithm
type S = record
transactions: array B of
record
claim, loan: C;
completed: boolean;
end
capital, cash: C;
end
B = 1..number of customers;
C = array D of integer;
D = 1..number of currencies;
function safe(current state: S): boolean;
var state: S;
procedure complete transactions(var state: S);
var customer: B; progress: boolean;
function completion possible(claim, cash: C): boolean;
var currency: D;
label no
begin
for every currency do
if claim(currency) > cash(currency) then
begin
completion possible:= false;
exit no;
end
completion possible := true;
end
procedure return loan(var loan, cash: C);
var currency: D;
begin
for every currency do
cash(currency):= cash(currency) + loan(currency);
end
begin
with state do
repeat
progress:= false;
for every customer do
with transactions(customer) do
if not completed then
if completion possible(claim, cash) then
begin
return loan(loan, cash);
completed:= true;
progress:= true;
end
until not progress;
end
function all transactions completed(state: S): boolean;
begin
with state do
all transactions completed:= capital = cash;
end
begin
state:= current state;
complete transactions(state);
safe:= all transactions completed(state);
end
At the most abstract level, the program consists of a single variable,
current state, and a single operation, safe. If this operation had been
available as a machine instruction, our problem would have been solved.
Since this was not the case, we wrote another program (Algorithm 2.1),
which can solve the problem on a simpler machine using the operations
complete transactions and all transactions completed.
This program in turn was rewritten for a still simpler machine
(Algorithms 2.2 and 2.3). The refinement of previous solutions was
repeated until the level of detail required by the available machine was
reached.
So the design of a program involves the construction of a series of
programming layers, which gradually transform the data structures and
operations of an ideal, non-existing machine into those of an existing
machine. The non-existing machines, which are simulated by program, are
called virtual machines to distinguish them from the physical machine.
It was mentioned in Section 1.1.3 that every program simulates a
virtual machine that is more ideal than an existing machine for a particular
purpose: A machine that can execute the banker's algorithm is, of course,
ideally suited to the banker's purpose. We now see that the construction of
a large program involves the simulation of a hierarchy of virtual machines.
Figure 2.10 illustrates the virtual and physical machines on which the
banker's algorithm is executed.
At each level of programming, some operations are accepted as
primitives in the sense that it is known what they do as a whole, but the
details of how it is done are unknown and irrelevant at that level.
Consequently, it is only meaningful at each level to describe the effect of
the program at discrete points in time before and after the execution of
each primitive. At these points, the state of the sequential process is
defined by assertions about the relationships between its variables, for
example:
"all transactions completed ≡ capital = cash"
Since each primitive operation causes a transition from one state to
another, a sequential process can also be defined as a succession of states in
time.
As we proceed from detailed to more abstract levels of programming,
some concepts become irrelevant and can be ignored. In Algorithm 2.2, we
must consider assertions about the local variable, progress, as representing
distinct states during the execution of the partial algorithm. But at the level
of programming where Algorithm 2.2 is accepted as a primitive, complete
transactions, the intermediate states necessary to implement it are completely
irrelevant.
So a state is a partial description of a computation just like the
concepts sequential process and operation. A precise definition of these
abstractions depends on the level of detail desired by the observer. A user
may recognize many intermediate states in his computation, but for the
operating system in control of its execution, the computation has only a
few relevant states, such as "waiting" or "running."
In order to test the correctness of a program, we must run it through all
its relevant states at least once by supplying it with appropriate input and
observing its output. I remarked in Section 2.3.1 that the definition of
operations by enumeration of all possible data values is impractical except
in extremely simple cases. The same argument leads to the conclusion that
exhaustive testing of operations for all possible input values is out of the
question.
If we were to test the addition of two decimal numbers of 10 digits
each exhaustively, it would require 10^20 executions of a program loop of,
say 10 μsec, or, all in all, 3 * 10^7 years. The only way to reduce this time
is to use our knowledge of the internal structure of the adder. If we know
that it consists of 10 identical components, each capable of adding two
digits and a carry, we also know that it is sufficient to test each component
separately with 10 * 10 * 2 combinations of input digits. This insight
immediately reduces the number of test cases to 2000 and brings the total
test time down to only 20 msec.
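A sketch of such a component test follows, assuming that the component under
test is available as a procedure add digit which delivers a sum digit and a
carry, and that a procedure report error records any discrepancy (both names
and their parameter lists are my own invention, not part of the adder):
procedure test one component;
var x, y: 0..9; carry in, carry out: 0..1;
    sum: 0..9; expected: integer;
begin
  for x:= 0 to 9 do
  for y:= 0 to 9 do
  for carry in:= 0 to 1 do
  begin
    add digit(x, y, carry in, sum, carry out);   "the component under test"
    expected:= x + y + carry in;
    if sum + 10 * carry out ≠ expected then
      report error(x, y, carry in);
  end
  "10 * 10 * 2 = 200 cases for each of the 10 components"
end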
Returning to the problem of testing the correctness of a program such
as the banker's algorithm, we must accept that such a test is impossible at a
level where it is only understood as a primitive: safe. We would have to
exhaust all combinations of various currencies and customers in every
possible state! But if we take advantage of the layered structure of our
programs (see Fig. 2.10), we can start at the machine level and demonstrate
once and for all that the machine instructions work correctly. This proof
can then be appealed to at higher programming levels independent of the
actual data values involved. At the next level of programming, it is proved
once and for all that the compiler transforms Pascal statements correctly
into machine instructions. And when we reach the levels at which the
banker's algorithm is programmed, it is again possible to test each level
separately, starting with Algorithm 2.5 and working towards Algorithm 2.1.
In this chapter, I have stressed the need for simplicity in programming.
I cannot accept the viewpoint that the construction of programs with a
pleasant structure is an academic exercise that is irrelevant or impractical to
use in real life. Simplicity of structure is not just an aesthetic pursuit--it is
the key to survival in programming! Large systems can only be fully
understood and tested if they can be studied in small, simple parts at many
levels of detail.
2.7. LITERATURE
This chapter has briefly summarized concepts which are recognized and
understood, at least intuitively, by most programmers.
The role of hierarchal structure in biological, physical, social, and
conceptual systems is discussed with deep insight by Simon (1962).
In the book by Minsky (1967) you will find an excellent and simple
presentation of the essential aspects of sequential machines and algorithms.
Horning and Randell (1972) have analyzed the concept "sequential
process" from a more formal point of view.
Hopefully, this book will make you appreciate the Pascal language. It is
defined concisely in the report by Wirth (1971a).
A subject which has only been very superficially mentioned here is
correctness proofs of algorithms. It was suggested independently by Naur
and Floyd and further developed by Hoare (1969).
The practice of designing programs as a sequence of clearly separated
layers is due to Dijkstra (1971a). He has successfully used it for the
construction of an entire operating system (1968). Eventually, this
constructive approach to programming may change the field from a
hazardous application of clever tricks into a mature engineering discipline.
DIJKSTRA, E. W., "The structure of THE multiprogramming system," Comm. ACM
11, 5, pp. 341-46, May 1968.
DIJKSTRA, E. W., A short introduction to the art of programming, Technological
University, Eindhoven, The Netherlands, Aug. 1971a.
HOARE, C. A. R., "An axiomatic basis for computer programming," Comm. ACM 12,
10, pp. 576-83, Oct. 1969.
HORNING, J. J. and RANDELL, B., "Process structuring," University of Newcastle
upon Tyne, England, 1972.
MINSKY, M. L., Computation: finite and infinite machines, Prentice-Hall Inc.,
Englewood Cliffs, New Jersey, 1967.
SIMON, H. A., "The architecture of complexity," Proc. American Philosophical Society
106, 6, pp. 468-82, 1962.
WIRTH, N., "The programming language Pascal," Acta Informatica 1, 1, pp. 35-63,
1971a.
Chapter 3  CONCURRENT PROCESSES
This chapter is a study of concurrent processes. It emphasizes the role
of reproducible behavior in program verification and compares various
methods of process synchronization: critical regions, semaphores, message
buffers, and event queues. It concludes with an analysis of the prevention
of deadlocks by hierarchal ordering of process interactions.
3.1. CONCURRENCY
The process concept was introduced in the previous chapter (see
Section 2.3.2). In the following, I will summarize the basic properties of
sequential and concurrent processes, and introduce a language notation for
the latter.
3.1.1. Definition
A process is a sequence of operations carried out one at a time. The
precise definition of an operation depends on the level of detail at which
the process is described. For some purposes, you may regard a process as a
single operation A, as shown on top of Fig. 3.1. For other purposes, it may be
more convenient to look upon the same process as a sequence of simpler
operations B, C, D, and E. And when you examine it in still more detail,
previously recognized operations can be partitioned into still simpler ones:
F, G, H, and so on.
If you compare this picture with the hierarchy of machines on which
the banker's algorithm is executed (see Fig. 2.10), you will see that as we
proceed to more detailed levels of programming, a process is described in
terms of increasingly simpler operations which are carried out in increasingly
smaller grains of time.
At all levels of programming, we assume that when an operation is
initiated, it terminates within a finite time and delivers output which is a
time-independent function of its input (see Section 2.3.1).
If the variables of a process are inaccessible to other processes, it is easy
to show by induction that the final output of a sequence of operations will
be a time-independent function of the initial input. In this case, a process
can be regarded as a single operation, provided that it terminates.
But if one process can change the variables of another process, the
output of the latter may depend on the relative speed of the processes. In
this case, a process cannot be regarded as a single operation. I will describe
the problem of multiprogramming systems with time-dependent behavior
later and proceed here to describe the basic properties of concurrent
processes.
Processes are concurrent if their executions overlap in time. Figure 3.2
shows three concurrent processes, P, Q, and R.
Whether the individual operations of concurrent processes are
overlapped or interleaved in time, or both (as shown in Fig. 3.3), is
irrelevant. Whenever the first operation of one process is started before the
last operation of another process is completed, the two processes are
concurrent.
In an installation where several processors work simultaneously, the
machine instructions of concurrent processes can overlap in time. But if
one processor is multiplexed among concurrent processes, the machine
instructions of these processes can only be interleaved in time. The logical
problems turn out to be the same in both cases; they are caused by our
ignorance of the relative speeds of concurrent processes.
3.1.2. Concurrent Statements
The language notation
cobegin S1; S2; ...; Sn coend
indicates that the statements S1, S2, ..., Sn can be executed
concurrently. It was first proposed by Dijkstra (1965).
To define the effect of a concurrent statement, we must take into
account the statements S0 and Sn+1, which precede and follow it in a
given program:
S0;
cobegin S1; S2; ...; Sn coend
Sn+1;
This piece of program can be represented by the precedence graph shown in
Fig. 3.4. The desired effect is to execute S0 first, and then execute S1,
S2, ..., Sn concurrently; when all the statements S1, S2, ..., Sn have
been terminated, the following statement Sn+1 is executed.
Concurrent statements can be arbitrarily nested, for example:
cobegin
S1;
begin
S2;
cobegin S3; S4 coend
S5;
end
S6;
coend
This corresponds to the precedence graph shown in Fig. 3.5.
3.1.3. An Example: Copying
The use of the concurrent statement is illustrated by Algorithm 3.1,
which copies records from one sequence to another.
ALGORITHM 3.1 Copying of a Sequence of Records
procedure copy(var f, g: sequence of T);
var s, t: T; completed: boolean;
begin
if not empty(f) then
begin
completed:= false;
get(s, f);
repeat
t:= s;
cobegin
put(t, g);
if empty(f) then completed:= true
else get(s, f);
coend
until completed;
end
end
The variables used are two sequences, f and g, of records of type T; two
buffers, s and t, each holding one record; and a boolean, indicating whether
or not the copying has been completed.
The algorithm gets a record from the input sequence, copies it from one
buffer to another, puts it on the output sequence, and, at the same time,
gets the next record from the input sequence. The copying, output, and
input are repeated until the input sequence is empty.
Figure 3.6 shows a precedence graph of this computation using the
operations: g (get record), c (copy record), and p (put record).
3.2. FUNCTIONAL SYSTEMS
In the following, we consider more precisely what time-independent or
functional behavior means, and under what circumstances multiprogramming
systems have this property. As an introduction to this topic, we
will first examine our ability to verify the correctness of programs.
3.2.1. Program Verification
The ideal program is one which is known with absolute certainty to be
correct. It has been argued that this can be achieved by rigorous proofs
(Hoare, 1969 and 1971a). I believe that the use of proof techniques
contributes to the correctness of programs by forcing programmers to
express solutions to problems in two different ways: by an algorithm and
by a proof. But it must be remembered that a proof is merely another
formal statement of the same size as the program it refers to, and as such it
is also subject to human errors. This means that some other form of
program verification is still needed, at least for large programs.
The next best thing to absolute correctness is immediate detection of
errors when they occur. This can be done either at compile time or at run
time. In either case, we rely on a certain amount of redundancy in our
programs, which makes it possible to check automatically whether
operations are consistent with the types of their variables and whether they
preserve certain relations among those variables.
Error detection at compile time is possible only by restricting the
language constructions; error detection at run time is possible only by
executing redundant statements that check the effect of other statements.
In practice, both methods are used, but there are limits to how far one can
go in each direction: At some point, severe language restrictions and
excessive run-time checking will make the system useless for practical
purposes.
This leaves a class of errors that is caught neither at compile time nor at
run time. It seems fair to say that these errors can only be located and
corrected if programs have functional behavior which enables the designer
to reproduce errors under controlled circumstances.
One can subject a sequential program to a fairly systematic initial
testing by examining its internal structure and defining a sequence of input
data which will cause it to execute all statements at least once.
Dijkstra once remarked that "program testing can be used to show the
presence of errors, but never their absence" (Nato report, 1969). Even a
systematically tested program may contain some undetected errors after it
has been released for normal use. If it has been subject to intensive initial
testing, the remaining errors are usually quite subtle. Often, the designer
must repeat the erroneous computation many times and print successive
values of various internal variables to find out in which part of the program
the error was made.
So program testing is fundamentally based on the ability to reproduce
computations. The difficulty is that we must reproduce them under varying
external circumstances. In a multiprogramming system, several user
computations may be in progress simultaneously. These computations are
started at unpredictable times at the request of users. So, strictly speaking,
the environment of a program is unique during each execution. Even when
computations are scheduled one at a time, a programming error is frequently
observed by a user on one installation and corrected by a designer
on another installation with a different configuration.
This has the following consequences: (1) an operating system must
protect the data and physical resources of each computation against
unintended interference by other computations; and (2) the results of each
computation must be independent of the speed at which the computation
is carried out.
The protection problem is discussed in a later chapter. Here we are
concerned with the assumption of speed-independence. This assumption is
necessary because a computation has no influence on the rate at which it
proceeds. That rate depends on the presence of other computations and the
possibly dynamic policy of scheduling. It is also a highly desirable
assumption to make, considering our difficulty in understanding concurrent
processes in terms of their absolute speeds.
The next section illustrates the consequences of time-dependent
behavior of erroneous programs.
3.2.2. Time-dependent Errors
Consider again Algorithm 3.1, which copies records from a sequence f
to another sequence g. Initially, f contains a sequence of records 1, 2, . . . ,
m, while g is empty. We will denote this as follows:
f = (1, 2, ..., m)    g = ( )
When the copying has been completed, g contains all the records in
their original order while f is empty:
f = ( )    g = (1, 2, ..., m)
The repetition statement in Algorithm 3.1 contains three component
statements, which will be called COPY, PUT, and GET:
COPY ≡ t:= s
PUT ≡ put(t, g)
GET ≡ if empty(f) then completed:= true
      else get(s, f)
They are used as follows:
repeat
COPY cobegin PUT; GET coend
until completed;
Now suppose the programmer expresses the repetition by mistake as
follows:
repeat
cobegin COPY; PUT; GET coend
until completed;
The copying, output, and input of a single record can now be executed
concurrently.
To simplify the argument, we will only consider cases in which the
statements COPY, PUT, and GET can be arbitrarily interleaved in time, but
we will ignore the possibility that they can overlap in time.
Suppose the first record is copied correctly from sequence f to g. The
computation will then be in the following state after the first execution of
the repetition statement:
s = 2 & f = (3, ..., m) & t = 1 & g = (1)
The following table shows the possible execution sequences of COPY,
PUT, and GET during the second execution of the repetition statement. It
also shows the resulting output sequence g after each execution sequence.
COPY; PUT; GET leads to g = (1, 2)
COPY; GET; PUT leads to g = (1, 2)
PUT; COPY; GET leads to g = (1, 1)
PUT; GET; COPY leads to g = (1, 1)
GET; COPY; PUT leads to g = (1, 3)
GET; PUT; COPY leads to g = (1, 1)
One of these sequences is shown below in detail:
"s = 2 & f = (3, . . . , m) & t = 1 & g = (1)"
GET;
"s = 3 & f = (4, . . . , m) & t = 1 & g = (1)"
COPY;
"s = 3 & f = (4, . . . , m) & t = 3 & g = (1)"
PUT;
"s = 3 & f = (4, . . . , m) & t = 3 & g = (1, 3)"
The erroneous concurrent statement can be executed in six different
ways with three possible results: (1) if copying is completed before input
and output are initiated, the correct record will be output; (2) if output is
completed before copying is initiated, the previous record will again be
output; and (3) if input is completed before copying is initiated and this in
turn completed before output is initiated, the next record will instead be
output.
This is just for a single record of the output sequence. If we copy a
sequence of 10000 records, the program can give of the order of 3^10000
different results! It is therefore extremely unlikely that the programmer
will ever observe the same result twice.
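To make the enumeration above concrete, the following sketch (mine, not from the book; the tail values 3, 4, 5 of f and all Python names are illustrative choices) replays the six possible orderings of COPY, PUT, and GET from the state s = 2, f = (3, . . . , m), t = 1, g = (1) and prints the resulting output sequence g for each:

from itertools import permutations

def run(order):
    # State after the first (correct) iteration: s = 2, f = (3, ...), t = 1, g = (1).
    state = {"s": 2, "f": [3, 4, 5], "t": 1, "g": [1], "completed": False}

    def COPY(st): st["t"] = st["s"]            # t := s
    def PUT(st):  st["g"].append(st["t"])      # put(t, g)
    def GET(st):                               # if empty(f) then completed := true else get(s, f)
        if not st["f"]:
            st["completed"] = True
        else:
            st["s"] = st["f"].pop(0)

    steps = {"COPY": COPY, "PUT": PUT, "GET": GET}
    for name in order:
        steps[name](state)
    return tuple(state["g"])

for order in permutations(("COPY", "PUT", "GET")):
    print("; ".join(order), "leads to g =", run(order))

Running it reproduces the table: two orderings give the correct (1, 2), three give (1, 1), and one gives (1, 3).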
If we consider the general case in which concurrent operations overlap
in time, we are unable even to enumerate the possible results of a
programming error without knowing in detail how the machine reacts to
attempts to perform more than one operation simultaneously on the same
variable.
The actual sequence of operations in time will depend on the presence
of other (unrelated) computations and the scheduling policy used by the
operating system to share the available processors among them. The
programmer is, of course, unaware of the precise combination of external
events that caused his program to fail and is unable to repeat it under
controlled circumstances. When he repeats the program execution with the
same input, it will sometimes produce correct results, sometimes different
erroneous results.
The programmer's only hope of locating the error is to study the
program text. This can be very frustrating (if not impossible) if the text
consists of thousands of lines and one has no clues about where to look for
the error.
It can be argued that such errors are perfectly reproducible in the
sense that if we had observed and recorded the behavior of all
computations and physical resources continuously during the execution
of a given computation, it would also have been possible to reproduce
the complete behavior of the computer installation. But this is a purely
hypothetical possibility. Human beings are unable to design programs
which take into account all simultaneous events occurring in a large
computer installation. We must limit ourselves to programs which can be
verified independently of other programs. So, in practice, events causing
time-dependent results are not observed and recorded, and must therefore
be considered irreproducible.
Concurrent programming is far more hazardous than sequential
programming unless we ensure that the results of our computations are
reproducible in spite of errors. In the example studied here, this can
easily be checked at compile time, as I will describe in the next section.
3.2.3. Disjoint Processes
The two concurrent processes in Algorithm 3.1
cobegin
put(t, g);
if empty(f) then completed := true
else get(s, f);
coend
are completely independent processes which operate on disjoint sets of
variables (t, g) and (s, f, completed).
Concurrent processes which operate on disjoint sets of variables are
called disjoint or non-interacting processes.
In the erroneous version of Algorithm 3.1
cobegin
t:=s;
put(t, g);
if empty(f) then completed:= true
else get(s,f);
coend
the processes are not disjoint: The output process refers to a variable t
changed by the copying process, and the copying process refers to a
variable s changed by the input process.
When a process refers to a variable changed by another process, it is
inevitable that the result of the former process will depend on the time at
which the latter process makes an assignment to this variable.
These time-dependent errors can be caught at compile time if the
following restriction on concurrent statements is made: The notation
cobegin S1; S2; . . . ; Sn coend
indicates that statements S1, S2, . . . , Sn define disjoint processes which
can be executed concurrently. This means that a variable vi changed by a
statement Si cannot be referenced by another statement Sj (where j ≠ i).
In other words, we insist that a variable subject to change by a process
must be strictly private to that process; but disjoint processes can refer
to common variables not changed by any of them.
To enable the compiler to check the disjointness of processes, the
language must have the following property: It must be possible to
determine the identity of a statement's constant and variable parameters by
inspecting the statement.
In Pascal this is certainly possible for assignment statements and
expressions involving variables of primitive types and record types only.
Components of arrays are selected by indices determined at run time
only. So, at compile time it is necessary to require that an entire array be
private to a single process within a concurrent statement.
Functions present no problems in Pascal since they cannot perform
assignment to non-local variables. But with procedures, it is necessary to
observe a certain discipline in the use of parameters.
The language notation must distinguish between constant and variable
parameters. Pascal already does this: A comparison of a procedure
statement
P(a, b)
with the corresponding procedure declaration
procedure P(const c: C; var v: V)
immediately shows that the procedure statement will leave variable a
unchanged, but may change variable b.
To make a clear distinction between constant and variable parameters,
we will not permit a variable to occur both as a constant and as a variable
parameter in the same procedure statement.
Without this rule one cannot make simple assertions about the effect of
procedures. Consider, for example, the following procedure:
procedure P(const c: integer; var v: integer);
begin v := c + 1 end
If we call this procedure as follows:
P(a, b)
where a and b are integer variables, we can make the assertion that upon
return from the procedure, b = a + 1. But this assertion leads to a
contradiction in the following case:
P(a, a)
namely a = a + 1.
Another contradiction occurs if the same variable is used twice as a
variable parameter in a procedure statement. For example, if a program
contains the following procedure:
procedure P(var v, w: integer);
begin v := 1; w := 2 end
then the call
P(a, b)
leads to the result a = 1 & b = 2. But when the same assertion is used for
the call
P(a, a)
it leads to the contradiction a = 1 & a = 2.
These contradictions can be avoided by obeying the following rule: All
variables used as variable parameters in a procedure statement must be
distinct and cannot occur as constant parameters in the same statement.
A Pascal procedure can also have side effects: It can change non-local
variables directly, as is shown in the following example:
var v: T;
procedure P;
begin . . . v:= E . . . end
or call other procedures which have side effects. So, it is not possible to
identify the variables involved by simple inspection of a procedure
statement and the corresponding procedure declaration.
One possibility is to specify all global variables referred to within a
procedure P (and within other procedures called by P) explicitly in the
procedure declaration:
var v: T;
procedure P;
global v;
begin . . . v:= E . . . end
A more radical solution is to forbid side effects and check that a
procedure only refers to non-local variables used as parameters.
A final difficulty is the exit statement. Consider the following program:
label L
begin . . . label M
begin . . .
cobegin
. . . exit L;
. . . exit M;
coend
end
end
Here we have two processes trying to exit different compound statements
labeled L and M at the same time. But this is meaningless: First, we have
defined that a concurrent statement is terminated only when all its
processes are terminated; so a single process cannot terminate it by an exit
statement; second, it is possible for the program to continue as a purely
sequential process after the concurrent statement (this is indeed the case in
the above example), so the desire to continue the process simultaneously at
two different points is a contradiction.
We will therefore also forbid jumps out of concurrent statements.
The rules presented here are due to Hoare (1971b). They enable a
compiler to identify the constant and variable parameters of every
statement by simple inspection and check the disjointness of concurrent
processes. This property also simplifies the analysis of programs by people
and should therefore be regarded as a helpful guideline for program
structuring--not as a severe language restriction.
When these rules are obeyed, the axiomatic properties of concurrent
statements become very simple. Suppose statements S1, S2, . . . , Sn
define disjoint processes, and each Si is an operation that makes a result Ri
true if a predicate Pi holds before its execution. In other words, it is known
that the individual statements have the following effects:
"P1" S1 "R1"
"P2" S2 "R2"
 . . .
"Pn" Sn "Rn"
(I assume that statements and assertions made about them only refer to
variables which are accessible to the statements according to the rule of
disjointness).
The concurrent statement
cobegin S1; S2; . . . ; Sn coend
can then be regarded as a single operation S with the following effect
"P" S "R"
where
P ≡ P1 & P2 & . . . & Pn
R ≡ R1 & R2 & . . . & Rn
As Hoare (1971b) puts it: "Each Si makes its contribution to the
common goal." The previous result can also be stated as follows:
Disjointness is a sufficient condition for time-independent behavior of
concurrent processes.
The usefulness of disjoint processes is, of course, limited. In some cases,
concurrent processes must be able to access and change common variables.
I will introduce language constructs suitable for process interactions of this
kind later in this chapter.
In the following we will derive a sufficient condition for time-independent
behavior of concurrent interacting processes. To do this, it is
first necessary to formulate the requirement of time-independence in a
slightly different way, using the concept of the history of a computation.
3.2.4. The History Concept*
Consider a program which is connected to its environment by a single
input variable x and a single output variable y. During the execution of the
program, the input variable x assumes a sequence of values determined by
the environment:
x := a0; x := a1; . . . ; x := am;
and the output variable y assumes another sequence of values determined
by the program and its input:
y := b0; y := b1; . . . ; y := bn;
Together, the input sequence X = (a0, a1, . . . , am) and the output
sequence Y = (b0, b1, . . . , bn) define the history of a computation. For
analytical purposes, the history can be represented by an array in which
each row defines the sequence of values assumed by a given variable, as
shown in Fig. 3.7.
A history is simply a listing of the sequences of input and output values
observed during a computation. It is a time-independent representation that
says nothing about the precise time at which these values were assigned to
the variables.
When a computation involves several variables, the history array
contains a row for each of them. We can divide the complete history into
the input history and the output history. Sometimes we will include the
history of internal variables in the output history.
A program is functional if the output history Y of its execution always
is a time-independent function f of its input history X:
Y = f(x)
A functional program produces identical output histories every time it is
executed with the same input history, independent of its speed of
execution.
As an example, let us again look at Algorithm 3.1 in which an input
sequence
f = (1, 2, . . . , m)
is copied to an (initially empty) output sequence g.
Figure 3.8 shows the sequence of assignments made initially during the
execution of this algorithm.
From these values (and the ones assigned during the rest of the
computation), we derive the history shown in Fig. 3.9.
The input history defines the sequence of assignments to the variable s;
the output history comprises the variables completed and t.
Notice that the rows in a history array are not necessarily of the same
length because the number of assignments may be different for each
variable.
Notice also that the elements in a given column of a history array do
not necessarily define values assigned at the same time. A comparison of
Figs. 3.8 and 3.9 immediately shows that all values in the leftmost column
of the history array were assigned at different times. When a program
contains concurrent statements, the sequence in which assignments to
disjoint variables are made depends on the relative rates of the processes.
So, in general there are many different sequences in which a given
history can be produced in time. The functional requirement only says that
when the sequence of values is known for each input variable, it must be
possible to predict the sequence of values for each output variable,
independent of the rate at which they will be produced.
In some cases, we are still interested in observing actual rates of
progress. For this purpose, time is regarded as a discrete variable which is
increased by one each time an assignment to at least one of the variables
considered has been completed.
These instants of time are indicated explicitly in Fig. 3.8. In a history
array, an instant of time is defined by a line which crosses each row exactly
once. Such a time slice defines the extent of the history at a particular
instant.
Figure 3.10 shows the previous history with two successive time slices tl
and t2, which define the extent of the history after the first assignments to
the variables completed and s.
A history H1 is called an initial part of another history H2 if H1 is
contained in H2. More precisely, H1 and H2 must have the same number of
rows, and each row in H1 must be equal to an initial part of the
corresponding row in H2; but some rows in H1 may be shorter than the
corresponding rows in H2. This relationship is denoted
H1 ≤ H2
During a computation, every time slice t defines an initial part H(t) of
the final history H.
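The following sketch (mine, not from the book) shows one way to represent a history as a mapping from variable names to their value sequences, together with the initial-part relation H1 ≤ H2; the concrete values are only illustrative stand-ins for the time slices of Figs. 3.9 and 3.10:

def is_initial_part(h1, h2):
    # H1 <= H2: same variables, and each row of H1 is a prefix of the
    # corresponding row of H2.
    return (h1.keys() == h2.keys() and
            all(h2[v][:len(h1[v])] == h1[v] for v in h1))

# Histories at two successive time slices t1 and t2 (values illustrative only).
H_t1 = {"s": [1],    "completed": [False],        "t": []}
H_t2 = {"s": [1, 2], "completed": [False, False], "t": [1]}

assert is_initial_part(H_t1, H_t2)        # H(t1) <= H(t2)
assert not is_initial_part(H_t2, H_t1)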
We will use the history concept to prove an important theorem by Patil
(1970).
3.2.5. A Closure Property
Consider again a program which communicates with its environment by
means of a set of distinct input and output variables. We will observe the
execution of the program in two cases: In the first case, the environment
supplies an input history X, and the program responds by returning an
output history Y:
X -> Y
In the second case, the observed input/output relationship is:
X' -> Y'
So far, nothing has been said about whether the program is functional.
If we repeat the execution with the input history X', the program may
possibly produce a different output history Y'' ≠ Y'.
But suppose we only consider programs which satisfy the following
requirement: If during its execution, a program is supplied with two
different input histories X and X', where X is contained in X', then the
same relationship will hold between the corresponding output histories Y
and Y':
X ≤ X' implies Y ≤ Y'
This is called the consistency requirement. It is a sufficient condition
for functional behavior. If a consistent program is executed twice with the
same input, it delivers the same output in both cases because
X = X' implies X ≤ X' & X' ≤ X
implies Y ≤ Y' & Y' ≤ Y
implies Y = Y'
Intuitively, the consistency requirement means that it is possible, at
every instant in time, to predict a portion Y of the final output from the
observed input X. If a program in execution is supplied with more input
X' ≥ X, the additional input cannot affect the output Y predicted earlier, but
can only extend it to Y' ≥ Y.
X and X' refer to the input produced by the environment. This is not
necessarily the same as the input consumed by the program. The
environment may assign new values to input variables, due to erroneous
synchronization, before previous values have been consumed by the
program. In that case, it is impossible to predict the final output from the
observed input. So the program is not consistent.
The consistency requirement is also violated if concurrent processes can
enter a deadlock state in which they are unable to respond to further input
under circumstances which depend on their speed of execution (see Section
2.6.1).
We now introduce an additional requirement of consistent programs:
Since the output is a function of the input, a program in execution cannot
produce an output value until the corresponding input value is present. This
cause-and-effect relationship can be expressed as follows: Suppose we
observe an input history at successive instants in time
X(0), X(1), . . . , X(t), . . . , X
where t indicates the earliest time at which an initial part X(t) of the input
is available. Let the final input and output histories be X and Y,
respectively. Now it is clear that an initial input history X(t) must be
contained in the final input history:
X(t) ≤ X
Because the program is consistent, it is possible to predict part of the
output history from the initial input history X(t). But in a physical system
it takes a finite time to produce this output; so the earliest moment at
which some or all of this output can be available is the next instant of time
t + 1 where the output history is Y(t + 1). The output Y(t + 1), which is
predicted from an initial part of the input, must itself be an initial part of
the final output Y. So we have
Y(t + 1) ≤ Y
If we combine these two physical conditions, we get the so-called
dependency requirement:
X(t) ≤ X implies Y(t + 1) ≤ Y
In the following, we will consider programs for which the consistency
and dependency requirements hold unconditionally. They are called
unconditionally functional programs. We will prove the important closure
property, which states that any interconnection of a finite number of
unconditionally functional programs is unconditionally functional as a
whole.
A system S consisting of a finite number of components S1, S2, . . . ,
Sn can always be partitioned successively into smaller systems consisting of
two components each:
S = (S1, S2')
S2' = (S2, S3')
 . . .
Sn-1' = (Sn-1, Sn)
It is therefore sufficient to show that the interconnection of two
unconditionally functional programs, S1 and S2, also is an unconditionally
functional program S. The general theorem follows by induction.
The input and output histories of S are called X and Y, as shown in Fig.
3.11. X in turn consists of two separate input histories, X1 and X2 for S1
and S2, respectively. Similarly, Y consists of two separate output histories,
Y1 and Y2 for S1 and S2, respectively. The histories of internal output
produced by S1 for S2, and vice versa, are called J and K.
We will study the combined system S = (S1, S2) when it is supplied
with two different input histories X and X' where
X ≤ X'
and delivers the output histories J, K, Y and J', K', Y'.
Consider first subsystem S1 and observe its input history X, K and the
corresponding output history J as shown in Fig. 3.12. (The output history
Y1 is ignored for the moment.)
Suppose the following holds at time t:
X(t) ≤ X' & K(t) ≤ K'
This means that at time t during the first execution, the initial input history
to S1 is contained in its final input history of the second execution.
Now since S1 is unconditionally functional, it satisfies the dependency
requirement; so we have
X(t) ≤ X' & K(t) ≤ K' implies J(t + 1) ≤ J'
By applying similar reasoning to subsystem S2, we find that
X(t) ≤ X' & J(t) ≤ J' implies K(t + 1) ≤ K'
These results can be combined into the following:
X(t) ≤ X' & J(t) ≤ J' & K(t) ≤ K'
implies J(t + 1) ≤ J' & K(t + 1) ≤ K'
We already assumed that X ≤ X', so X(t) ≤ X' holds at any instant of
time. Consequently, if the initial histories, J(t) and K(t), at time t are
contained in the final histories, J' and K', then this is also the case at time
t + 1. This is trivially true at time t = 0 where J(t) and K(t) are empty. So it
holds throughout the execution that
X ≤ X' implies J ≤ J' & K ≤ K'
But since the subsystems satisfy the consistency requirement, this in turn
means that
X ≤ X' implies Y ≤ Y'
So the combined system S also satisfies the consistency requirement. And,
for physical reasons, it also satisfies the dependency requirement. (This can
be proved formally as well.)
The closure property enables a designer to verify that a large program is
functional by a step-wise analysis of smaller program components. The
functional behavior is ensured by local conditions (the consistency and
dependency requirements) which only constrain the relationship between a
program component and its immediate environment (represented by
input/output variables).
In actual systems, the consistency requirement seldom holds unconditionally.
The functional behavior may depend on certain additional
requirements; for example, that the input is of a certain type and is not
delivered more rapidly than the system is able to consume it. A system
which satisfies the consistency and dependency requirements, provided that
additional constraints hold, is called conditionally functional. For such
systems, the closure property holds when the additional constraints hold.
I will conclude the present discussion with some examples of useful
systems in which time-dependent behavior is inherent and desired.
3.2.6. Non-functional Systems
I have stressed the importance of designing multiprogramming systems
which are functional in behavior. But it must be admitted that some
computational tasks are inherently time-dependent.
Consider, for example, priority scheduling according to the rule
shortest job next. This involves decisions based on the particular set of
computations which users have requested at a given time. If a scheduling
decision is postponed a few seconds, the outcome may well be different
because new and more urgent requests can be made in the meantime by
users. So priority scheduling depends not only on the order in which users
input their requests, but also on their relative occurrence in time.
In the spooling system shown in Fig. 1.3 there is a time-dependent
relationship between the stream of jobs input and the stream of printed
output. But the spooling system has one important property which makes
the time-dependent behavior tolerable: It maintains a functional relationship
between the input and output of each job.
The user computations are disjoint; they could be carried out on
different machines. But, for economic reasons, they share the same
machine. And this can only be done efficiently by introducing a certain
amount of time-dependent behavior on a microscopic time scale. However,
on a macroscopic time scale, the operating system hides the time-dependency
from the individual user and offers him a virtual machine with
perfectly reproducible behavior.
Figure 3.13 shows another example of the same principle: (a) shows
two disjoint processes consisting of the operations P, Q, R and S, Q, T; (b)
shows an equivalent system in which we have taken advantage of the
occurrence of a common operation Q in both processes. The processes now
share a single instance of operation Q, which can receive input from
operations P and S and deliver output to operations R and T.
The equivalence of the two systems in Fig. 3.13(a) and (b) is based on
the assumption that requests received by Q from P have no effect on the
response of Q to requests received from S, and vice versa, and that Q
responds to any request within a finite time.
An example of a set of common operations shared by all user
computations is a filing system, which permits users to create and
manipulate data and programs on backing storage and maintain them over a
certain period of time.
The system in Fig. 3.13(b) is non-functional if we observe the input and
output of the common operation Q because this is a time-dependent
merging of data values belonging to independent processes. But the system
is still functional if we only observe the relationship between the output
from P and the input to R (or that between the output from S and the
input to T).
So we come to the conclusion that functional behavior is an abstraction
just like the concepts operation, process, and state (see Section 2.6.3).
Whether or not a system is considered functional depends on the level of
description chosen by the observer.
3.3. MUTUAL EXCLUSION
Much has been said about the role of disjoint processes and functional
behavior in multiprogramming systems. We must now deal with interacting
processes--concurrent processes which have access to common variables.
Common variables are used to represent the state of physical resources
shared by competing processes. They are also used to communicate data
between cooperating processes. In general, we will say that all common
variables represent shared objects called resources. Interacting processes can
therefore also be defined as processes which share resources.
To share resources, concurrent processes must be synchronized.
Synchronization is a general term for any constraint on the ordering of
operations in time. We have already met synchronizing rules which specify
a precedence or priority of operations by constraints of the form
"operation A must be executed before operation B" and "an operation of
priority P must only be executed when all operations of higher priority
have been executed."
In this section, we will discuss process interactions in which the
constraint is of the form "operations A and B must never be executed at
the same time." This synchronizing rule specifies mutual exclusion of
operations in time.
3.3.1. Resource Sharing
We begin with an example in which two concurrent processes, P and Q,
share a single physical resource R. P and Q are cyclical processes which
every now and then wish to use R for a finite period of time, but they
cannot use it at the same time.
An example of a resource that can be accessed only by one process at a
time is a magnetic tape station. In this case, the requirement of mutual
exclusion is dictated by the physical characteristics of the resource: Since
we can only mount one tape at a time on a magnetic tape station and access
its data strictly sequentially, it is impractical to let several processes use the
tape station at the same time. But, as we shall see, there is also a deeper
logical reason for the requirement of mutual exclusion of certain operations
in time.
We will follow a chain of arguments made by Dijkstra (1965) to
illuminate the nature of the mutual exclusion problem.
Let us first try the following approach: The resource R is represented
by a boolean, indicating whether it is free or occupied at the moment. Each
process goes through a cycle in which it first reserves the resource, then
uses it, and finally releases it again. Before reserving the resource, a process
waits in a loop until the resource is free. This leads to the following
program:
var free: boolean;
begin
  free := true;
  cobegin
    "P" repeat
      repeat until free;
      free := false;
      use resource;
      free := true;
      P passive;
    forever
    "Q" repeat
      repeat until free;
      free := false;
      use resource;
      free := true;
      Q passive;
    forever
  coend
end
This program clearly violates the rules of concurrent statements laid
down in Section 3.2.3: Both processes change and refer to the variable free.
But let us ignore this for the moment.
Initially, free is true. It is therefore possible for P and Q to refer to free
at the same time and find it true. Then, they will both in good faith assign
the value false to free and start using the resource. So this program does not
satisfy the condition that the resource can be used by at most one process
at a time.
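The failure is easy to provoke on a modern machine. In the sketch below (mine, not from the book), two Python threads play the roles of P and Q; because the test of free and the assignment free := false are separate steps, both threads regularly find free true and use the resource at the same time. The sleeps merely widen the window; they are not essential to the error:

import threading, time

free = True
inside = 0                      # processes currently "using" the resource
worst_overlap = 0

def process():
    global free, inside, worst_overlap
    while not free:             # repeat until free
        pass
    time.sleep(0.001)           # widen the gap between the test and the claim
    free = False                # claim the resource
    inside += 1
    worst_overlap = max(worst_overlap, inside)
    time.sleep(0.01)            # use resource
    inside -= 1
    free = True

threads = [threading.Thread(target=process) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print(worst_overlap)            # frequently 2: both processes used the resource at once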
Well, then we must try something else. In the next version, the boolean
free is replaced by another boolean, Pturn, which is true when it is P's turn
to use the resource and false when it is Q's turn:
var Pturn: boolean;
begin
  Pturn := true;
  cobegin
    "P" repeat
      repeat until Pturn;
      use resource;
      Pturn := false;
      P passive;
    forever
    "Q" repeat
      repeat until not Pturn;
      use resource;
      Pturn := true;
      Q passive;
    forever
  coend
end
There are only two cases to consider: either Pturn is true or it is false.
When Pturn is true, only process P can start using the resource and release it
again. And when Pturn is false, only process Q can start using the resource
and release it again. So mutual exclusion of P and Q is indeed achieved. But
the solution is far too restrictive: It forces the processes to share the
resource in a strictly alternating sequence:
P, Q, P, Q, . . .
If one of the processes, say P, is stopped in its passive state after having
released the resource, the other process Q will also be stopped when it has
released the resource (by means of the assignment Pturn:= true) and then
tries to reserve it again.
This is quite unacceptable in an installation where P and Q could be
two independent user computations. We must require that a solution to the
problem make no assumptions about the relative progress of the processes
when they are not using the resource.
Another highly undesirable effect of the previous solution is the waste
of processing time in unproductive waiting loops. This is called the busy
form of waiting.
Our third attempt uses two booleans to indicate whose turn it is:
var Pturn, Qturn: boolean;
begin
  Pturn := false; Qturn := false;
  cobegin
    "P" repeat
      Pturn := true;
      repeat until not Qturn;
      use resource;
      Pturn := false;
      P passive;
    forever
    "Q" repeat
      Qturn := true;
      repeat until not Pturn;
      use resource;
      Qturn := false;
      Q passive;
    forever
  coend
end
Assignment to variable Pturn is only made by process P. When P does
not use the resource, Pturn is false, so
not Pturn implies P passive
The two processes are completely symmetrical, so the assertion
not Pturn implies P passive
not Qturn implies Q passive
must be invariant.
Process P waits until Qturn is false before it uses the resource, so
P active implies not Qturn
implies Q passive
Similar reasoning about process Q leads to the conclusion that
P active implies Q passive
& Q active implies P passive
is invariant.
Mutual exclusion is therefore guaranteed, and it is not difficult to see
that stopping one process in its passive state does not influence the progress
of the other process. But it is a highly dangerous solution: The processes
can make the assignments
Pturn := true        Qturn := true
at the same time and then remain indefinitely in their waiting loops waiting
for one of the booleans to become false. In short, the solution can lead to a
deadlock.
We learn from our mistakes. The lesson of this one is that when
competing processes request the resource at the same time, the decision to
grant it to one of them must be made within a finite time.
It is possible to continue along these lines and find a correct solution.
This was first done by Dekker and described by Dijkstra (1965). Dijkstra
solved the general case, in which n processes share a single resource. The
solution turns out to be far too complicated and inefficient to be of
practical value.
I have described these preliminary, incorrect solutions to make you
appreciate the subtlety of the mutual exclusion problem and discover a set
of criteria that a realistic solution must satisfy. Let me repeat these criteria:
(1) The resource in question can be used by one process at a time at
most.
(2) When the resource is requested simultaneously by several processes,
it must be granted to one of them within a finite time.
(3) When a process acquires the resource, the process must release it
again within a finite time. Apart from this, no assumption is made about
the relative speeds of the processes; processes may even be stopped when
they are not using the resource.
(4) A process should not consume processing time when it is waiting to
acquire the resource.
For the moment, we will leave the problem there and consider another
instance of it in which interacting processes are cooperating rather than
competing.
3.3.2. Data Sharing
The second example of process interaction is taken from an actual
system that supervises an industrial plant. Among other things, this
real-time system measures the consumption of energy and the production
of materials in the plant continuously and reports it to management every
few hours.
Consumption and production are measured in discrete units (kilowatt-hours
and product units) and recorded in pulse registers connected to a
computer. The computer is multiplexed among a number of concurrent
processes. A cyclical process P inputs and resets the pulse registers every
few seconds and adds their values (0 or 1) to a corresponding number of
integer variables. Another cyclical process Q outputs and resets the integer
counters every few hours. Operators can change the frequency of execution
of P and Q independently. The processes therefore do not know their
relative rates.
Let us assume that the periodic scheduling of processes and the input/
output are handled correctly by standard procedures. No generality is lost
by considering only one pulse counter v. As a first attempt, I suggest the
following solution:
var v: integer;
begin
  v := 0;
  cobegin
    "P" repeat
      delay(1);
      v := v + input;
    forever
    "Q" repeat
      delay(18000);
      output(v);
      v := 0;
    forever
  coend
end
This program too violates the rules of concurrent statements because
process Q refers to a variable v changed by process P. But again we will
ignore this violation for the moment since it expresses the desired
interaction between the processes.
The intention is that a value of v output at time t should define the
number of pulses accumulated over the past 18,000 seconds, or 5 hours,
prior to time t.
Now, in most computers, statements such as
v := v + input        output(v)
are executed as a sequence of instructions which use registers, say r and s,
to hold intermediate results:
r := input;        s := v;
r := r + v;        output(s);
v := r;
A concurrent statement can therefore be executed as a sequence of
instructions interleaved in time. The operating system will ensure that the
instructions of each process are executed in the right order, but apart from
this, the system may interleave them in many different ways. In the system
considered here, the operating system switches from one process to another
every 20 msec to maintain the illusion that they are executed simultaneously.
The actual interleaving of instructions in time is therefore a
time-dependent function of other concurrent processes not considered here.
One possibility is that the instructions of P and Q are interleaved as
follows:
"v = pulse count"
P: r := 1;
Q: s := v; output(s);
P: r := r + v; v := r;
Q: s := 0; v := s;
"v = 0 & output = pulse count"
Another possibility is the following:
"v = pulse count"
P: r := 1; r := r + v;
Q: s := v; output(s);
Q: s := 0; v := s;
P: v := r;
"v = pulse count + 1 & output = pulse count"
In the first case, the program measures an input pulse (r := 1), but fails
to accumulate it (v = 0); in the second case, the program outputs the
current pulse count correctly, but proceeds to include it in the count over
the next five hours. The result should have been
"v = 1 & output = pulse count"
or
"v = 0 & output = pulse count + 1"
depending on when the last pulse was measured.
The penalty of breaking the laws of concurrent statements is
time-dependent erroneous behavior. The example shown here illustrates the
remark made in Section 3.1.1 about concurrent processes executed
simultaneously on several processors or on a single, multiplexed processor:
"The logical problems turn out to be the same in both cases; they are
caused by our ignorance of the relative speeds of concurrent processes."
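The same interference can be reproduced directly. In the sketch below (mine, not from the book), P accumulates pulses in v and Q reports and resets it, with the register steps written out as separate assignments; the call to sys.setswitchinterval only makes thread switching frequent enough for the race to show up quickly:

import sys, threading

sys.setswitchinterval(1e-6)     # switch threads very often to expose the race

v = 0
emitted = 0                     # pulses actually produced by P
reported = 0                    # pulses reported by Q

def P(pulses):
    global v, emitted
    for _ in range(pulses):
        r = 1                   # r := input
        r = r + v               # r := r + v
        v = r                   # v := r
        emitted += 1

def Q(rounds):
    global v, reported
    for _ in range(rounds):
        s = v                   # s := v
        reported += s           # output(s)
        s = 0
        v = s                   # v := s

threads = [threading.Thread(target=P, args=(200_000,)),
           threading.Thread(target=Q, args=(200_000,))]
for t in threads: t.start()
for t in threads: t.join()
print(emitted, reported + v)    # usually differ: pulses lost or counted twice

If the two statements were executed as critical regions, the final check reported + v = emitted would always hold; as written, the two totals usually disagree.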
It should be evident by now that our present programming tools are
hopelessly inadequate for controlling access to physical resources and data
shared by concurrent processes. A new tool is needed, and like our previous
tools it should be simple to understand, efficient to implement, and safe to
use.
3.3.3. Critical Regions
The erroneous behavior in the pulse counting program is caused by the
interleaving of concurrent statements in time. If we were sure that only one
process at a time could operate on the common variable v, the previous
solution would work.
So let us postulate a language construction which has precisely this
effect. I will use the notation
var v: shared T
to declare a common variable v of type T.
Concurrent processes can only refer to and change common variables
within structured statements called critical regions. A critical region is
defined by the notation
region v do S
which associates a statement S with a common variable v. This notation
enables a compiler to check that common variables are used only inside
critical regions.
Critical regions referring to the same variable v exclude one another in
time. More precisely, we make the following assumptions about critical
regions on a variable v:
(1) When a process wishes to enter a critical region, it will be enabled
to do so within a finite time.
(2) At most, one process at a time can be inside a critical region.
(3) A process remains inside a critical region for a finite time only.
Criterion 3 says that all statements operating on a common variable
must terminate.
Criterion 2 expresses the requirement of mutual exclusion in time of
these statements.
Criteria 1 and 2 put the following constraints on the scheduling of
critical regions associated with a given variable: (a) when no processes are
inside critical regions, a process can enter a critical region immediately; (b)
when a process is inside a critical region, other processes trying to enter
critical regions will be delayed; and (c) when a process leaves a critical
region while other processes are trying to enter critical regions, one of the
processes will be enabled to proceed inside its region; (d) these decisions
must be made within finite periods of time.
Finally, (e) the priority rule used to select a delayed process must be
fair: It must not delay a process indefinitely in favor of more urgent
processes. An example of a fair scheduling policy is first-come, first-served,
which selects processes in their order of arrival; an unfair policy is
last-come, first-served, which favors the latest request.
Although we require that scheduling be fair, no assumptions are made
about the specific order in which processes are scheduled. This depends
entirely on the underlying implementation. As Dijkstra (1971b) puts it: "In
the long run a number of identical processes will proceed at the same
macroscopic speed. But we don't tell how long this run is." (See also
Dijkstra, 1972.)
Critical regions can be implemented in various ways. We are going to
consider some of them later.
Critical regions referring to different variables can be executed
simultaneously, for example:
var v: shared V; w: shared W;
cobegin
region v do P;
region w do Q;
coend
A process can also enter nested critical regions:
region v do
begin . . .
region w do . . . ;
. . .
end
In doing so, one must be aware of the danger of deadlock in such
constructions as:
cobegin
"P" region v do region w do . . . ;
"Q" region w do region v do . . . ;
coend
It is possible that process P enters its region v at the same time as
process Q enters its region w. When process P tries to enter its region w, it
will be delayed because Q is already inside its region w. And process Q will
be delayed trying to enter its region v because P is inside its region v. This is
a case in which correctness criterion 3 is violated: The processes do not
leave their critical regions after a finite time. A method of avoiding this
problem will be described later.
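The deadlock is easy to demonstrate. In the sketch below (mine, not from the book), two Python locks stand in for the shared variables v and w, and the two threads enter their nested regions in opposite order; the sleeps make the overlap likely, and the timeout is there only so the demonstration terminates instead of hanging:

import threading, time

v = threading.Lock()
w = threading.Lock()

def P():
    with v:                              # region v do
        time.sleep(0.1)                  # make the overlap likely
        if w.acquire(timeout=1):         # region w do ...
            w.release()
        else:
            print("P holds v and cannot get w: deadlock")

def Q():
    with w:                              # region w do
        time.sleep(0.1)
        if v.acquire(timeout=1):         # region v do ...
            v.release()
        else:
            print("Q holds w and cannot get v: deadlock")

threads = [threading.Thread(target=P), threading.Thread(target=Q)]
for t in threads: t.start()
for t in threads: t.join()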
The concept critical region is due to Dijkstra (1965). The language
constructs which associate critical regions with a common variable are my
own (Brinch Hansen, 1972b). Similar constructs have been proposed
independently by Hoare (1971b).
With this new tool, common variables and critical regions, the solutions
to our previous problems become trivial. In the first problem, our only
concern was to guarantee mutual exclusion of operations on a physical
resource performed by concurrent processes. Algorithm 3.2 solves the
problem in the general case of n processes.
ALGORITHM 3.2 A Resource R Shared by n Concurrent Processes
var R: shared boolean;
cobegin
  "P1" repeat
    region R do use resource;
    P1 passive;
  forever
   . . .
  "Pn" repeat
    region R do use resource;
    Pn passive;
  forever
coend
The pulse counting problem is solved by Algorithm 3.3.
ALGORITHM 3.3 A Variable v Shared by Two Concurrent Processes
var v: shared integer;
begin
  v := 0;
  cobegin
    "P" repeat
      delay(1);
      region v do v := v + input;
    forever
    "Q" repeat
      delay(18000);
      region v do begin
        output(v);
        v := 0;
      end
    forever
  coend
end
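For readers who want to experiment, the sketch below (mine, not from the book) approximates Algorithm 3.3 with a Python lock playing the role of the critical regions on v; delay and input are replaced by simplified stand-ins of my own. Because both accesses to v exclude each other in time, the final bookkeeping check always succeeds, in contrast to the unprotected version shown earlier:

import random, threading, time

v = 0
v_region = threading.Lock()          # stands in for "region v do ..."
generated = 0                        # total pulses produced (written by P only)
reported = 0                         # total pulses reported (written by Q only)

def P():
    global v, generated
    for _ in range(50_000):
        pulse = random.randint(0, 1)         # "input"
        with v_region:                       # region v do v := v + input
            v = v + pulse
        generated += pulse                   # P's private bookkeeping

def Q():
    global v, reported
    for _ in range(200):
        time.sleep(0.001)                    # delay(18000), shortened
        with v_region:                       # region v do begin output(v); v := 0 end
            reported += v
            v = 0

threads = [threading.Thread(target=P), threading.Thread(target=Q)]
for t in threads: t.start()
for t in threads: t.join()
print(reported + v == generated)             # True: nothing lost, nothing counted twice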
3.3.4. Conclusion
Critical regions provide an elegant solution to the resource sharing
problem. But how fundamental is this concept? After all, we succeeded in
one instance in solving the problem without introducing critical regions in
our language. I am referring to the algorithm in Section 3.3.1, which
permits a resource to alternate between two processes: P, Q, P, Q, . . . .
This solution may not be ideal, but it does ensure mutual exclusion without
using critical regions--or does it?
In the analysis of its effect, I made the innocent statement: "There are
only two cases to consider: either Pturn is true or it is false." These are the
only values that a boolean variable can assume from a formal point of view.
But suppose the computer installation consists of several processors
connected to a single store as shown in Fig. 2.3. Then, it is quite possible
that the operations of processes P and Q can overlap in time. We must now
carefully examine what happens in this physical system when P and Q try
to operate simultaneously on the variable Pturn.
Suppose process P initiates the assignment
Pturn := false
This storage operation takes a finite time. We know that Pturn is true
before the operation is started and that Pturn will be false when the
operation is completed. But there is a transitional period, called the
instruction execution time, during which the physical state of the storage
cell is unknown. It is changing in a time-dependent manner, determined by
the characteristics of the electronic circuits and the storage medium used.
If process Q is allowed to refer to the same storage location while the
assignment is in progress, the resulting value (or more precisely, the bit
combination used to represent the value) is unpredictable. In other words,
process Q is performing an undefined operation.
The hardware designer solves this problem by connecting the
concurrent processors to a sequential switching circuit called an arbiter.
The arbiter ensures that, at most, one processor at a time can access a given
storage location. It is a hardware implementation of critical regions.
So our reasoning on the effect of the algorithm in Section 3.3.1 was
based on the implicit assumption that load and store operations on a given
variable exclude each other in time. The same assumption underlies
Dekker's solution to the resource sharing problem.
The load and store operations are proper critical regions at the short
grains of time controlled by hardware. But at higher levels of programming,
we must guarantee mutual exclusion in larger grains of time for arbitrary
statements. The language construct
region v do S
is introduced for this purpose.
Mutual exclusion is indeed one of the most fundamental concepts of
programming. Whenever we make assertions about the effect of any
operation on a certain variable, for example:
"A: array 1..n of integer"
sort(A);
"for all i, j: 1..n (i ≤ j implies A(i) ≤ A(j))"
it is tacitly understood that this is the only operation performed at that
time on this variable. In the above example, it would indeed have been a
coincidence if the array A had been sorted if other processes were happily
modifying it simultaneously.
In sequential computations, mutual exclusion is automatically achieved
because the operations are applied one at a time. But we must explicitly
indicate the need for mutual exclusion in concurrent computations.
It is impossible to make meaningful statements about the effect of
concurrent computations unless operations on common variables exclude
one another in time. So in the end, our understanding of concurrent
processes is based on our ability to execute their interactions strictly
sequentially. Only disjoint processes can proceed at the same time.
But there is one important difference between sequential processes and
critical regions: The statements S1, S2, . . . , Sn of a sequential process are
totally ordered in time. We can therefore make assertions about its progress
from an initial predicate P towards a final result R:
"P" S1 "R1" S2 "R2" . . . "Rn-1" Sn "R"
In contrast, nothing is specified about the ordering of critical regions in
time--they can be arbitrarily interleaved. The idea of progressing towards a
final result is therefore meaningless. All we can expect is that each critical
region leave certain relationships among the components of a shared
variable v unchanged. These relationships can be defined by an assertion I
about v which must be true after initialization of v and before and after
each subsequent critical region. Such an assertion is called an invariant.
When a process enters a critical region to execute a statement S, a
predicate P holds for the variables accessible to the process outside the
critical region, and an invariant I holds for the shared variable accessible
inside the critical region. After the completion of statement S, a result R
holds for the former variables and invariant I has been maintained.
So a critical region has the following axiomatic property:
"P"
region v do "P & I" S "R & I";
"R"
The interactions controlled by critical regions are rather indirect--each
process can ignore the existence and function of other processes as long as
they exclude each other in time and maintain invariants for common
variables.
We will now study more direct interactions between processes
cooperating on common tasks. Such processes are well aware of each
other's existence and purpose: Each of them depends directly on data
produced by other members of the community.
3.4.1. Process Communication
To cooperate on common tasks, processes must be able to exchange
data. Figure 3.14 shows the situation we will study. A process P produces
and sends a sequence of data to another process C, which receives and
consumes them. The data are transmitted between the processes in discrete
portions called messages. They are regarded as output by P and as input by
C.
Since either process can proceed at a rate independent of the other's, it
is possible that the sender may produce a message at a time when the
receiver is not yet ready to consume it (it may still be processing an earlier
message). To avoid delaying the sender in this situation, we introduce a
temporary storage area in which the sender can place one or more messages
until the receiver is ready to consume them. This storage area is called a
buffer and designated B; its function is to smooth speed variations between
the processes and occasionally permit the sender to be ahead of the
receiver.
The communication must be subject to two resource constraints: (1)
the sender cannot exceed the finite capacity of the buffer; and (2) the
receiver cannot consume messages faster than they are produced.
These constraints are satisfied by the following synchronizing rule: If
the sender tries to put a message in a full buffer, the sender will be delayed
until the receiver has taken another message from the buffer; if the receiver
tries to take a message from an empty buffer, the receiver will be delayed
until the sender has put another message into the buffer.
We also assume that all messages are intended to be received exactly as
they were sent--in the same order and with their original content. Messages
must not be lost, changed, or permuted within the buffer. These
requirements can be stated more precisely with the aid of Fig. 3.15.
The sequence S of messages sent by a producer P and the corresponding
sequence R of messages received by a consumer C are shown conceptually
as infinite arrays:
var S, R: array 0..∞ of message
Two index variables s and r
var s, r: 0..∞
initialized to zero, define the number of messages sent and received at any
instant in time:
s > 0 implies S(1) to S(s) have been sent in that order
r > 0 implies R(1) to R(r) have been received in that order
Our requirements are the following:
(1) The number of messages received cannot exceed the number of
messages sent
0 ≤ r ≤ s
(2) The number of messages sent, but not yet received, cannot exceed
the buffer capacity max
0 ≤ s - r ≤ max
(3) Messages must be received exactly as they are sent
for i: 1..r (R(i) = S(i))
So, all in all, we have the following communication invariant:
0 ≤ r ≤ s ≤ r + max &
for i: 1..r (R(i) = S(i))
I suggest the following notation for a language construct which satisfies
this requirement:
var B: buffer max of T
It declares a message buffer B, which can transmit messages of type T
between concurrent processes. The buffer capacity is defined by an integer
constant, max.
Messages are sent and received by means of the following standard
procedures:
send(M, B) receive(M, B)
where M is a variable of type T.
Since send and receive refer to a common variable B, we require that
these operations exclude each other in time. They are critical regions with
respect to the buffer used.
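A close approximation of this buffer construct is a bounded blocking queue. The sketch below (mine, not from the book) uses Python's queue.Queue, whose put and get operations already exclude each other in time and delay the caller on a full or empty buffer; receive returns the message instead of assigning to a variable parameter, and the capacity and message values are arbitrary:

import queue, threading

MAX = 8                                   # buffer capacity "max"
B = queue.Queue(maxsize=MAX)

def send(M, B):                           # delayed while the buffer is full
    B.put(M)

def receive(B):                           # delayed while the buffer is empty
    return B.get()

def producer():
    for M in range(1, 101):
        send(M, B)

def consumer():
    received = [receive(B) for _ in range(100)]
    print(received == list(range(1, 101)))   # True: same order, same content

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads: t.start()
for t in threads: t.join()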
It is worth pointing out that with the synchronizing rules defined, a
sender and a receiver cannot be deadlocked with respect to a single message
buffer. A sender can only be delayed when a buffer is full
s = r + max
and a receiver can only be delayed when a buffer is empty
s = r
A situation in which they are both delayed with respect to the same buffer
is therefore impossible if the buffer has a non-zero capacity max > 0.
It is also interesting to notice that a message buffer (if properly used) is
unconditionally functional in the sense defined in Section 3.2.5.
If we observe a message sequence X going into a buffer, we expect that
a message sequence Y = X will eventually come out of it. And if we observe
two input sequences, X and X', where the former is contained in the latter,
then we have trivially that
X ≤ X' implies Y ≤ Y'
So a message buffer satisfies the consistency requirement if all messages
sent are eventually received. And since send and receive cannot take place
at the same time, it takes a finite time for a message to pass through the
buffer. So the dependency requirement is also satisfied.
From the closure property we conclude that a set of unconditionally
functional processes connected only by message buffers is unconditionally
functional as a whole, provided there are only one sender and one receiver
for each buffer, and all messages sent are eventually received.
It is quite possible, with the synchronizing rules defined here, to
connect several senders and receivers to a single buffer, as shown in Fig.
3.16. But when this is done, the input to the buffer is a time-dependent
merging of messages from m different processes, and the output is a
time-dependent splitting of messages to n different receivers. In that case,
there is no functional relationship between the output of each producer
and the input of each consumer.
The communication primitives, send and receive, as proposed here will
transmit messages by value; that is, by copying them first from a sender
variable M into a buffer B, and then from B into a receiver variable M'. This
is only practical for small messages.
For larger messages, the data structures and primitives must be defined
such that messages are transmitted by reference, that is, by copying an
address from one process to another. But in order to satisfy the
communication invariant, such a language construct must ensure that at
most one process at a time can refer to a message variable.
In the following, I will consider the simplest possible form of process
communication--the exchange of timing signals.
3.4.2. Semaphores
In some cases, a process is only interested in receiving a timing signal
from another process when a certain event has occurred. But apart from
that, no other exchange of data is required.
An example is Algorithm 3.3 where each process wishes to be delayed
until a certain interval of time has elapsed.
This can be regarded as a special case of process communication in
which an empty message is sent each time a certain event occurs. Since the
messages are empty (and therefore indistinguishable), it is sufficient to
count them. The buffer is therefore reduced to a single, non-negative
integer, which defines the number of signals sent, but not yet received.
Such a synchronizing variable is called a semaphore. It was first
proposed and investigated by Scholten and Dijkstra. A semaphore variable v
will be declared as follows:
var v: semaphore
The corresponding send and receive primitives will be called
signal(v) wait(v)
(Dijkstra originally called them V and P.)
Since a semaphore is a common variable for its senders and receivers,
we must require that signal and wait operations on the same semaphore
exclude each other in time. They are critical regions with respect to the
semaphore.
Following Habermann (1972), a semaphore v will be characterized by
two integer components:
s(v) the number of signals sent
r(v) the number of signals received
Initially, s(v) = r(v) = 0.
The communication invariant defined in the previous section now
becomes:
0 ≤ r(v) ≤ s(v) ≤ r(v) + max(integer)
It means that: (1) a process can only send or receive some or no signals;
(2) signals cannot be received faster than they are sent; and (3) a semaphore
can only count to the upper limit of integers set by a given machine.
It is useful to permit an initial assignment to a semaphore before
process communication begins. So we will introduce a third semaphore
component:
c(v) the number of initial signals
and permit an assignment:
v := c
to be made outside concurrent statements. But within concurrent statements,
the only operations permitted on a semaphore v are still signal and
wait.
The semaphore invariant now becomes
0 ≤ r(v) ≤ s(v) + c(v) ≤ r(v) + max(integer)
In most computers, the range of integers is much larger than the
number of unconsumed signals that can occur in practice. For example, in a
computer with 24-bit words, max(integer) = 8388607. We will therefore
ignore this constraint in the following and assume that an implementation
treats semaphore overflow as a programming error.
The synchronizing rules of semaphores are the following:
(1) If the operation wait (v) is started at a time when r(v) < s(v) + c(v),
then r(v) is increased by one and the receiver continues; but if r(v) = s(v) +
c(v), then the receiver is delayed in a process queue associated with the
semaphore v.
(2) The operation signal (v) increases s(v) by one; if one or more
processes are waiting in the queue associated with the semaphore v, then
one of them is selected and enabled to continue, and r(v) is increased by
one. The scheduling of waiting processes must be fair (see Section 3.3.3).
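These rules correspond closely to a counting semaphore in a modern thread library. The sketch below (mine, not from the book) expresses wait and signal with Python's threading.Semaphore; the initial count plays the role of c(v), and the example sends a single timing signal:

import threading

v = threading.Semaphore(0)                 # c(v) = 0 initial signals

def wait(v):                               # Dijkstra's P operation
    v.acquire()

def signal(v):                             # Dijkstra's V operation
    v.release()

def receiver():
    wait(v)                                # delayed until a signal has been sent
    print("timing signal received")

t = threading.Thread(target=receiver)
t.start()
signal(v)                                  # send one timing signal
t.join()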
A critical region
var R: shared T;
region R do S;
can be implemented by associating a semaphore mutex, initialized to one,
with the shared variable R and then surrounding the statement S by a pair
of wait and signal operations as shown in Algorithm 3.4.
ALGORITHM 3.4 Mutual Exclusion Implemented With Semaphores
var R: record content: T; mutex: semaphore end
begin
  with R do mutex := 1;
  cobegin
    with R do
      begin wait(mutex); S1; signal(mutex) end
     . . .
    with R do
      begin wait(mutex); Sn; signal(mutex) end
     . . .
  coend
end
To see that this implementation is correct, observe the following: Since
the processes always execute a wait operation before a signal operation on
mutex, we have
0 ≤ s(mutex) ≤ r(mutex)
When this is combined with the semaphore invariant, the result is
0 ≤ s(mutex) ≤ r(mutex) ≤ s(mutex) + 1
From the structure of Algorithm 3.4, it is evident that the number of
processes n that are inside their critical regions at a given time are those
which have passed a wait operation at the beginning of these regions, but
have not as yet passed a corresponding signal operation at the end of the
regions. So
n(mutex) = r(mutex) - s(mutex)
This in turn means that
0 ≤ n(mutex) ≤ 1
In other words, one process at most can be inside its critical region at
any time. When no processes are inside their critical regions, we have
n(mutex) = 0
or
r(mutex) = s(mutex)
Since r(mutex) < s(mutex) + 1, in this situation another process can
complete a wait operation and enter its critical region immediately.
So the correctness criteria for critical regions are satisfied provided each
of the statements S1 to Sn terminates.
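The same scheme can be exercised directly with a library semaphore. The sketch below, assuming Python's threading module, mirrors Algorithm 3.4: the dictionary R stands in for the shared record, each thread plays the role of one of the statements S1 to Sn, and acquire and release correspond to wait and signal on mutex.

import threading

R = {"content": 0}                 # the shared record
mutex = threading.Semaphore(1)     # mutex := 1

def critical(i):
    mutex.acquire()                # wait(mutex)
    R["content"] += i              # Si: some operation on the shared variable
    mutex.release()                # signal(mutex)

threads = [threading.Thread(target=critical, args=(i,)) for i in range(1, 5)]
for t in threads: t.start()
for t in threads: t.join()
print(R["content"])                # always 10: at most one region at a time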
It is an amusing paradox of critical regions that, in order to implement
one, we must appeal to the existence of simpler critical regions (namely,
wait and signal). In the next chapter, which explains an implementation of
wait and signal, I shall appeal to the existence of a storage arbiter--a
hardware implementation of still simpler critical regions, and so on, ad
infinitum. The buck ends at the atomic level, where nuclear states are
known to be discrete and mutually exclusive.
At this stage, it is tempting to conclude that there is no need for
extending a programming language with a construct for critical regions and
common variables. The problem can evidently be solved by wait and signal
operations on semaphores.
This conclusion is indeed valid in a world where programs are known to
be correct with absolute certainty. But in practice it is untenable. The
purpose of the language construct
var v: shared T;
region v do S;
is to enable a compiler to distinguish between disjoint and interacting
processes, and check that common variables are used only within critical
regions.
If we replace this structured notation with semaphores, this will have
grave consequences:
(1) Since a semaphore can be used to solve arbitrary synchronizing
problems, a compiler cannot conclude that a pair of wait and signal
operations on a given semaphore initialized to one delimits a critical region,
nor that a missing member of such a pair is an error. A compiler will also be
unaware of the correspondence between a semaphore and the common
variable it protects. In short, a compiler cannot give the programmer any
assistance whatsoever in establishing critical regions correctly.
(2) Since a compiler is unable to recognize critical regions, it cannot
make the distinction between critical regions and disjoint processes.
Consequently, it must permit the use of common variables everywhere. So
a compiler can no longer give the programmer any assistance in avoiding
time-dependent errors in supposedly disjoint processes.
The horrors that this leads to have already been demonstrated (see
Section 3.2.2).
As an example of the first problem, consider the programmer who by
mistake writes the following:
"Left"                  "Right"
wait(mutex);            signal(mutex);
S;                      S;
wait(mutex);            wait(mutex);
In the left example, the program will be deadlocked at the end of the
"critical" region; in the right example, the program will sometimes permit 3
processes to be inside a "critical" region simultaneously.
ALGORITHM 3.5 Periodic Scheduling of Concurrent Processes
var schedule: array 1..n of
      record deadline, interval: integer end
    start: array 1..n of semaphore;
    timer: semaphore;
    task: 1..n; t: integer;
begin
  timer := 0; initialize(schedule, start);
  cobegin
    repeat "hardware timer"
      t := interval desired;
      while t > 0 do t := t - 1;
      signal(timer);
    forever
    repeat "scheduler"
      wait(timer);
      for every task do
      with schedule(task) do
      begin
        deadline := deadline - 1;
        if deadline ≤ 0 then
        begin
          deadline := interval;
          signal(start(task));
        end
      end
    forever
    repeat "task i"
      wait(start(i));
      perform task;
    forever
  coend
end
The advantage of language constructs is that their correctness can be
established once and for all when the compiler is tested. The alternative is
to test each critical region separately in all user programs!
It is reasonable to use semaphores in a compiler to implement critical
regions. But at higher levels of programming, the main applicability of
semaphores is in situations in which processes exchange only timing signals.
This is the concept that semaphores represent in a direct, natural manner.
As an example, let us again consider the problem of scheduling a
number of processes periodically in a real-time system. In Algorithm 3.3,
the periodic scheduling was handled by a standard procedure that delays a
process until a certain interval of time has elapsed. We now want to show
how this delay can be implemented in terms of semaphores.
For each task process, we introduce a record defining the time interval
between two successive executions (called the interval), the time interval
until its next execution (called the deadline), and a semaphore on which
the process can wait until its next execution starts.
The time schedule of all tasks is scanned by a central scheduling process
every second: In each cycle, the scheduler decreases all deadlines by one
and sends a start signal to the relevant tasks.
The scheduling process in turn receives a signal from a hardware timer
every second.
This scheme is shown in Algorithm 3.5.
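A runnable approximation of this scheme, assuming Python threads, is sketched below. The call time.sleep(1) stands in for the hardware timer's busy wait, and the number of ticks, the number of tasks, and their intervals are arbitrary values chosen for the illustration.

import threading, time

schedule = [{"deadline": 2, "interval": 2},    # task 0 runs every 2 seconds
            {"deadline": 3, "interval": 3}]    # task 1 runs every 3 seconds
start = [threading.Semaphore(0) for _ in schedule]
timer = threading.Semaphore(0)
TICKS = 10                                     # run for a bounded time only

def hardware_timer():
    for _ in range(TICKS):
        time.sleep(1)                          # "t := interval desired; while t > 0 do ..."
        timer.release()                        # signal(timer)

def scheduler():
    for _ in range(TICKS):
        timer.acquire()                        # wait(timer)
        for i, entry in enumerate(schedule):
            entry["deadline"] -= 1
            if entry["deadline"] <= 0:
                entry["deadline"] = entry["interval"]
                start[i].release()             # signal(start(task))

def task(i):
    while True:
        start[i].acquire()                     # wait(start(i))
        print("task", i, "runs")               # perform task

threading.Thread(target=hardware_timer).start()
threading.Thread(target=scheduler).start()
for i in range(len(schedule)):
    threading.Thread(target=task, args=(i,), daemon=True).start()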
3.4.3. Conditional Critical Regions
The synchronizing tools introduced so far:
critical regions
message buffers
semaphores
are simple and efficient tools for delaying a process until a special condition
holds:
mutual exclusion
message available (or buffer element available)
signal available
We will now study a more general synchronizing tool which enables a
process to wait until an arbitrary condition holds.
For this purpose I propose a synchronizing primitive await, which
delays a process until the components of a shared variable v satisfy a
condition B:
var v: shared T
region v do
begin . . . await B; . . . end
The await primitive must be textually enclosed by a critical region
associated with the variable v. If critical regions are nested, the
synchronizing condition B is associated with the innermost enclosing
region.
The shared variable v can be of an arbitrary type T. The synchronizing
condition B is an arbitrary boolean expression which can refer to
components of v.
The await primitive can for example be used to implement conditional
critical regions of the form proposed by Hoare (1971b):
    "Consumer"                   "Producer"
    region v do                  region v do S2
    begin await B; S1 end
Two processes, called the consumer and the producer, cooperate on a
common task. The consumer wishes to enter a critical region and operate
on a shared variable v by a statement $1 when a certain relationship B holds
among the components of v. The producer enters a critical region
unconditionally and changes v by a statement $2 to make B hold.
I will use this example and Fig. 3.17 to explain the implementation of
critical regions and the await primitive. When a process such as the
consumer above wishes to enter a critical region, it enters a main queue, Qv,
associated with the shared variable v. From this queue, the processes enter
their critical regions one at a time to ensure mutual exclusion of operations
on v.
After entering its critical region, the consumer inspects the shared
variable v to determine whether it satisfies the condition B: In that case,
the consumer completes its critical region by executing the statement S1;
otherwise, the process leaves its critical region temporarily and joins an
event queue, Qe, associated with the shared variable.
Other processes can now enter their critical regions through the main
queue, Qv. These processes may either complete their critical regions
unconditionally or decide to await the holding of an arbitrary condition on
variable v. If their conditions are not satisfied, they all enter the same event
queue, Qe.
When another process such as the previous producer changes the shared
variable v by a statement $2 inside a critical region, it is possible that one or
more of the conditions expected by processes in the event queue, Qe, will
be satisfied. Consequently, when a critical region has been successfully
completed, all processes in the event queue, Qe, are transferred to the main
queue, Qv, to permit them to reenter their critical regions and inspect the
shared variable v again.
It is possible that a consumer will be transferred in vain between the
main queue and the event queue several times before its condition B holds.
But this can only occur as frequently as producers change the shared
variable. This controlled amount of busy waiting is the price we pay for the
conceptual simplicity achieved by using arbitrary boolean expressions as
synchronizing conditions.
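The following sketch, assuming Python's threading.Condition, shows one way an implementation could realize this scheme: the lock inside the condition variable plays the role of the main queue Qv, its waiting set plays the role of the event queue Qe, and notify_all performs the transfer of all delayed processes back to the main queue after a region has been completed. The class Shared and its method region are names invented for the example.

import threading

class Shared:
    def __init__(self, **components):
        self.__dict__.update(components)       # the components of the shared variable v
        self._cond = threading.Condition()     # main queue Qv and event queue Qe

    def region(self, body, await_=None):
        with self._cond:                       # enter the critical region through Qv
            if await_ is not None:
                while not await_(self):        # await B
                    self._cond.wait()          # leave the region temporarily, join Qe
            body(self)                         # the statement S1 (or S2)
            self._cond.notify_all()            # move every process in Qe back to Qv

v = Shared(ready=False)
def S1(s): print("consumer proceeds")          # executed only when B holds
def S2(s): s.ready = True                      # makes the condition B hold
threading.Thread(target=lambda: v.region(S1, await_=lambda s: s.ready)).start()
v.region(S2)                                   # the producer enters unconditionally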
In the case of simple critical regions, we expect that all operations on a
shared variable maintain an invariant I (see Section 3.3.4). Within
conditional critical regions, we must also require that the desired invariant
is satisfied before an await statement is executed. When the waiting cycle of
a consumer terminates, the assertion B & I holds.
In the following, we will solve two synchronizing problems using first
critical regions and semaphores, and then conditional critical regions. This
will enable us to make a comparison of these synchronizing concepts.
3.4.4. An Example: Message Buffers
We will implement the message buffer defined in Section 3.4.1 and
shown in Fig. 3.18.
The buffer consists of a finite number of identical elements arranged in
a circle. The circle consists of a sequence of empty elements that can be
filled by a producer and a sequence of full elements that can be emptied by
a consumer. The producer and consumer refer to empty and full elements
by means of two pointers, p and c. During a computation, both pointers
move clockwise around the circle without overtaking each other.
The sequence of messages received
R(1),R(2), . . . ,R(r)
must be contained in the sequence of messages sent
S(1),S(2), . . . ,S(s)
Let the buffer and its pointers be declared as follows:
buffer: array 0..max-1 of T;
p, c: 0..max-1;
The desired effect is achieved if the following rules are observed:
(1) During a computation, pointers p and c must enumerate the same
sequence of buffer elements:
buffer(0) := S(1);                  R(1) := buffer(0);
buffer(1) := S(2);                  R(2) := buffer(1);
. . .                               . . .
buffer(p - 1 mod max) := S(s);      R(r) := buffer(c - 1 mod max);
(2) The receiver must not empty an element until it has been sent:
0 ≤ r < s
(3) The sender must not fill an element until it has been received:
0 ≤ s - r < max
The solution that uses conditional critical regions is Algorithm 3.6.
Initially, pointers p and c are equal and they are moved by the same
function p + 1 mod max and c + 1 mod max. So, the pointers enumerate
the same sequence of buffer elements.
ALGORITHM 3.6 Process Communication With Conditional Critical Regions
type B = shared record
  buffer: array 0..max-1 of T;
  p, c: 0..max-1;
  full: 0..max;
end
"Initially p = c = full = 0"
procedure send(m: T; var b: B);
region b do
begin
  await full < max;
  buffer(p) := m;
  p := (p + 1) mod max;
  full := full + 1;
end
procedure receive(var m: T; b: B);
region b do
begin
  await full > 0;
  m := buffer(c);
  c := (c + 1) mod max;
  full := full - 1;
end
The variable full is initially zero. It is increased by one after each send
and decreased by one after each receive. So
full = s - r
Receiving is only done when full > 0 or s > r; sending is only done
when full < max or s - r < max. The solution is therefore correct.
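To make the correspondence concrete, here is a sketch of Algorithm 3.6 in Python, with threading.Condition standing in for the conditional critical region. The constant MAX, the class name Buffer, and the use of a list for the buffer elements are arbitrary choices for the example; full counts s - r exactly as in the text.

import threading

MAX = 8

class Buffer:
    def __init__(self):
        self.slots = [None] * MAX
        self.p = self.c = self.full = 0        # "Initially p = c = full = 0"
        self._region = threading.Condition()   # the shared variable b of type B

    def send(self, m):
        with self._region:
            while not (self.full < MAX):       # await full < max
                self._region.wait()
            self.slots[self.p] = m
            self.p = (self.p + 1) % MAX
            self.full += 1
            self._region.notify_all()          # let waiting receivers reinspect b

    def receive(self):
        with self._region:
            while not (self.full > 0):         # await full > 0
                self._region.wait()
            m = self.slots[self.c]
            self.c = (self.c + 1) % MAX
            self.full -= 1
            self._region.notify_all()          # let waiting senders reinspect b
            return m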
Now, let us solve the same problem with simple critical regions and
semaphores: We will use the pointers p and c exactly as before--so
condition 1 is still satisfied.
The delay of the receiver will be controlled by a semaphore full in the
following way:
initially:       full := 0
before receive:  wait(full)
after send:      signal(full)
When the semaphore is used in this way, we evidently have
0 ≤ r ≤ number of waits(full)  &
0 ≤ number of signals(full) ≤ s
The semaphore invariant defined in Section 3.4.2 ensures that, when
the initial number of full signals has been consumed by wait operations,
further wait operations cannot be completed faster than the corresponding
signal operations:
number of waits(full) ≤ number of signals(full) + 0
Immediately before a wait operation is completed, the stronger condition
number of waits(full) < number of signals(full) + 0
holds.
From this we conclude that before a message is received, the condition
0 ≤ r < s
holds.
We have chosen to represent the condition r < s by the semaphore full.
But since timing signals are merely indistinguishable boolean events, we
cannot use the same semaphore to represent the other condition, s - r < max,
which controls sending.
So we must introduce another semaphore, empty, and use it as follows:
initially: empty := max
before send: wait(empty)
after receive: signal(empty)
By similar reasoning, we can show that before sending the following
holds:
0 ≤ s ≤ number of waits(empty)  &
0 ≤ number of signals(empty) ≤ r  &
number of waits(empty) < number of signals(empty) + max
or
0 ≤ s < r + max
The complete solution is Algorithm 3.7.
ALGORITHM 3.7 Process Communication With Semaphores
type B = record
  v: shared record
    buffer: array 0..max-1 of T;
    p, c: 0..max-1;
  end
  full, empty: semaphore;
end
"Initially p = c = full = 0 & empty = max"
procedure send(m: T; var b: B);
begin
  with b do
  begin
    wait(empty);
    region v do
    begin
      buffer(p) := m;
      p := (p + 1) mod max;
    end
    signal(full);
  end
end
procedure receive(var m: T; b: B);
begin
  with b do
  begin
    wait(full);
    region v do
    begin
      m := buffer(c);
      c := (c + 1) mod max;
    end
    signal(empty);
  end
end
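For comparison, here is a corresponding sketch of Algorithm 3.7 in Python. Two library semaphores carry the timing signals full and empty, and a plain lock implements the simple critical region v around the buffer and its pointers; MAX is again an arbitrary choice for the example.

import threading

MAX = 8

class Buffer:
    def __init__(self):
        self.slots = [None] * MAX
        self.p = self.c = 0
        self.v = threading.Lock()              # region v do ...
        self.full = threading.Semaphore(0)     # initially full = 0
        self.empty = threading.Semaphore(MAX)  # initially empty = max

    def send(self, m):
        self.empty.acquire()                   # wait(empty)
        with self.v:
            self.slots[self.p] = m
            self.p = (self.p + 1) % MAX
        self.full.release()                    # signal(full)

    def receive(self):
        self.full.acquire()                    # wait(full)
        with self.v:
            m = self.slots[self.c]
            self.c = (self.c + 1) % MAX
        self.empty.release()                   # signal(empty)
        return m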
This exercise has already given a good indication of the difference
between conditional critical regions and semaphores:
(1) A semaphore can only transmit indistinguishable timing signals; it
can therefore only be used to count the number of times a specific event
has occurred in one process without being detected by another process.
Consequently, it is necessary to associate a semaphore with each
synchronizing condition.
The first algorithm with conditional critical regions uses two synchronizing
conditions
full > 0        full < max
but only one variable full.
In the second algorithm, these conditions are represented by two
semaphore variables
full empty
(2) Notice also that within conditional critical regions, the program
text shows directly the condition under which a region is executed. But in
the second algorithm, the association between a semaphore and a
synchronizing condition exists only in the mind of the programmer. He
cannot deduce from the statement
wait(empty)
that full < max before sending without examining the rest of the program
and discovering that full < max when
signal(empty)
is executed after receiving! When semaphores are used to represent general
scheduling conditions, these conditions can only be deduced from the
program text in the most indirect manner.
One can also explain the difficulty with semaphores in the following
way: If a process decided to wait on a semaphore inside a critical region, it
would be impossible for another process to enter its critical region and
wake up the former process. This is the reason that the wait operations in
Algorithm 3.7 are executed outside the critical regions.
However, it is dangerous to separate a synchronizing condition from a
successive critical region. The danger is that another process may enter its
critical region first and make the condition false again. If we split critical
regions in this way, we must introduce additional variables to represent
intermediate states of the form "I expect condition B to hold when I
enter my critical region." So, while a single variable full is sufficient in
Algorithm 3.6, we need two variables, full and empty, in Algorithm 3.7.
The elegance of Algorithm 3.6 makes it tempting to suggest that the
language constructs for message buffers suggested in Section 3.4.1 can be
replaced by conditional critical regions and common variables. But I wish
to point out that the arguments made earlier in favor of the language
construct for simple critical regions can also be made for message buffers.
In Section 3.4.1, we found that a system consisting of processes
connected only by buffers can be made functional as a whole. But this is
only true if the send and receive operations are implemented correctly and
if they are the only operations on the buffers. A compiler is unable to
recognize the data structure B in Algorithm 3.6 as a message buffer and
check that it is used correctly. So when message buffers are used
frequently, it may well be worth including them as a primitive concept in a
programming language.
We now proceed to the next problem, which is due to Courtois,
Heymans, and Parnas (1971).