Autoconf, Automake, and Libtool
9. A Small GNU Autotools Project
This chapter introduces a small--but real--worked example, to
illustrate some of the features, and highlight some of the pitfalls, of
the GNU Autotools discussed so far. All of the source can be downloaded
from the book's web
page (5).
The text is peppered with my own pet ideas, accumulated over several
years of working with the GNU Autotools, and you should be able to easily
apply these to your own projects. I will begin by describing some of
the choices and problems I encountered during the early stages of the
development of this project. Then, by way of illustration of the issues
covered, I will move on to showing you a general infrastructure that I use as
the basis for all of my own projects, followed by the specifics of the
implementation of a portable command line shell library. This chapter
then finishes with a sample shell application that uses that library.
9.1.1 Project Directory Structure
Before starting to write code for any project, you need to decide on
the directory structure you will use to organise the code. I like to
build each component of a project in its own subdirectory, and to keep
the configuration sources separate from the source code. The great
majority of GNU projects I have seen use a similar method, so
adopting it yourself will likely make your project more familiar to your
developers by association.
The top level directory is used for configuration files, such as
`configure' and `aclocal.m4', and for a few other sundry
files, `README' and a copy of the project license for example.
Any significant libraries will have a subdirectory of their own,
containing all of the sources and headers for that library along with a
`Makefile.am' and anything else that is specific to just that
library. Libraries that form a small group of like components, a set of
pluggable application modules for example, are kept together in a single
directory.
The sources and headers for the project's main application will be
stored in yet another subdirectory, traditionally named `src'. There
are other conventional directories your developers might expect too: A
`doc' directory for project documentation; and a `test'
directory for the project self test suite.
To keep the project top-level directory as uncluttered as possible, as I
like to do, you can take advantage of Autoconf's
`AC_CONFIG_AUX_DIR' by creating another directory, say
`config', which will be used to store many of the GNU Autotools
intermediate files, such as install-sh. I always store all
project specific Autoconf M4 macros in this same subdirectory.
So, this is what you should start with:
|
$ pwd
~/mypackage
$ ls -F
Makefile.am config/ configure.in lib/ test/
README configure* doc/ src/
|
9.1.2 C Header Files
There is a small amount of boiler-plate that should be added to all
header files, not least of which is a small amount of code to prevent
the contents of the header from being scanned multiple times. This is
achieved by enclosing the entire file in a preprocessor conditional
which evaluates to false after the first time it has been seen by the
preprocessor. Traditionally, the macro used is in all upper case, and
named after the installation path without the installation prefix.
Imagine a header that will be installed to
`/usr/local/include/sys/foo.h', for example. The preprocessor
code would be as follows:
|
#ifndef SYS_FOO_H
#define SYS_FOO_H 1
...
#endif /* !SYS_FOO_H */
|
Apart from comments, the entire content of the rest of this header file
must be between these few lines. It is worth mentioning that inside the
enclosing #ifndef, the macro SYS_FOO_H must be defined
before any other files are #included. It is a common mistake to
not define that macro until the end of the file, but mutual dependency
cycles are only stalled if the guard macro is defined before the
#include which starts that cycle (6).
If a header is designed to be installed, it must #include other
installed project headers from the local tree using angle-brackets.
There are some implications to working like this:
-
You must be careful that the names of header file directories in the
source tree match the names of the directories in the install tree. For
example, when I plan to install the aforementioned `foo.h' to
`/usr/local/include/project/foo.h', from which it will be included
using `#include <project/foo.h>', then in order for the same
include line to work in the source tree, I must name the source
directory it is installed from `project' too, or other headers which
use it will not be able to find it until after it has been installed.
-
When you come to developing the next version of a project laid out in
this way, you must be careful about finding the correct header.
Automake takes care of that for you by using `-I' options that
force the compiler to look for uninstalled headers in the current source
directory before searching the system directories for installed headers
of the same name.
-
You don't have to install all of your headers to `/usr/include' --
you can use subdirectories. And all without having to rewrite the
headers at install time.
9.1.3 C++ Compilers
In order for a C++ program to use a library compiled with a C compiler,
it is necessary for any symbols exported from the C library to be
declared between `extern "C" {' and `}'. This code is
important, because a C++ compiler mangles (7) all variable and function names, whereas
a C compiler does not. On the other hand, a C compiler will not
understand these lines, so you must be careful to make them invisible
to the C compiler.
Sometimes you will see this method used, written out in long hand in
every installed header file, like this:
|
#ifdef __cplusplus
extern "C" {
#endif
...
#ifdef __cplusplus
}
#endif
|
But that is a lot of unnecessary typing if you have a few dozen headers
in your project. Also the additional braces tend to confuse text
editors, such as emacs, which do automatic source indentation based on
brace characters.
Far better, then, to declare them as macros in a common header file, and
use the macros in your headers:
|
#ifdef __cplusplus
# define BEGIN_C_DECLS extern "C" {
# define END_C_DECLS }
#else /* !__cplusplus */
# define BEGIN_C_DECLS
# define END_C_DECLS
#endif /* __cplusplus */
|
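As a minimal sketch of how a header might then use these macros, the block below repeats the two definitions so that it compiles on its own, and wraps a single declaration in them. The function sketch_count_char is a hypothetical example of mine, not part of the project described in this chapter.

```c
#include <assert.h>
#include <stddef.h>

/* Repeated from the text so that the sketch stands alone. */
#ifdef __cplusplus
#  define BEGIN_C_DECLS extern "C" {
#  define END_C_DECLS   }
#else /* !__cplusplus */
#  define BEGIN_C_DECLS
#  define END_C_DECLS
#endif /* __cplusplus */

/* What an installed header would contain: every exported declaration
   sits between the two macros, and no literal braces appear. */
BEGIN_C_DECLS
extern size_t sketch_count_char (const char *string, int ch);
END_C_DECLS

/* Hypothetical library function: count occurrences of CH in STRING. */
size_t
sketch_count_char (const char *string, int ch)
{
  size_t count = 0;

  assert (string != NULL);
  for (; *string != '\0'; ++string)
    if (*string == (char) ch)
      ++count;
  return count;
}
```

A C compiler sees two empty macros here, while a C++ compiler sees the `extern "C"' block, so the same header serves both.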
I have seen several projects that name such macros with a leading
underscore -- `_BEGIN_C_DECLS'. Any symbol with a leading
underscore is reserved for use by the compiler implementation, so you
shouldn't name any symbols of your own in this way. By way of
example, I recently ported the
Small(8) language
compiler to Unix, and almost all of the work was writing a Perl script
to rename huge numbers of symbols in the compiler's reserved namespace
to something more sensible so that GCC could even parse the
sources. Small was originally developed on Windows, and the author had
used a lot of symbols with a leading underscore. Although his symbol
names didn't clash with his own compiler, in some cases they were the
same as symbols used by GCC.
9.1.4 Function Definitions
As a stylistic convention, the return types for all function definitions
should be on a separate line. The main reason for this is that it makes
it very easy to find the functions in a source file, by looking for
a single identifier at the start of a line followed by an open
parenthesis:
|
$ egrep '^[_a-zA-Z][_a-zA-Z0-9]*[ \t]*\(' error.c
set_program_name (const char *path)
error (int exit_status, const char *mode, const char *message)
sic_warning (const char *message)
sic_error (const char *message)
sic_fatal (const char *message)
|
There are emacs lisp functions and various code analysis tools, such as
ansi2knr (see section 9.1.6 K&R Compilers), which rely on this
formatting convention, too. Even if you don't use those tools yourself,
your fellow developers might like to, so it is a good convention to
adopt.
9.1.5 Fallback Function Implementations
Due to the huge number of Unix varieties in common use today, many of
the C library functions that you take for granted on your preferred
development platform are very likely missing from some of the
architectures you would like your code to compile on. Fundamentally
there are two ways to cope with this:
-
Use only the few library calls that are available everywhere. In
reality this is not actually possible because there are two lowest
common denominators with mutually exclusive APIs, one rooted in
BSD Unix (`bcopy', `rindex') and the other in
SYSV Unix (`memcpy', `strrchr'). The only way to deal
with this is to define one API in terms of the other using the
preprocessor. The newer POSIX standard deprecates many of the
BSD originated calls (with exceptions such as the
BSD socket API). Even on non-POSIX platforms, there
has been so much cross pollination that often both varieties of a given
call may be provided, however you would be wise to write your code
using POSIX endorsed calls, and where they are missing, define them
in terms of whatever the host platform provides.
This approach requires a lot of knowledge about various system libraries
and standards documents, and can leave you with reams of preprocessor
code to handle the differences between APIs. You will also need
to perform a lot of checking in `configure.in' to figure out which
calls are available. For example, to allow the rest of your code to use
the `strcpy' call with impunity, you would need the following code
in `configure.in':
|
AC_CHECK_FUNCS(strcpy bcopy)
|
And the following preprocessor code in a header file that is seen by
every source file:
|
#if !HAVE_STRCPY
# if HAVE_BCOPY
# define strcpy(dest, src) bcopy (src, dest, 1 + strlen (src))
# else /* !HAVE_BCOPY */
error no strcpy or bcopy
# endif /* HAVE_BCOPY */
#endif /* HAVE_STRCPY */
|
-
Alternatively you could provide your own fallback implementations of
function calls you know are missing on some platforms. In practice you
don't need to be as knowledgeable about problematic functions when using
this approach. You can look in GNU libiberty (9) or François
Pinard's libit project(10) to see for which
functions other GNU developers have needed to implement fallback
code. The libit project is especially useful in this respect as it
comprises canonical versions of fallback functions, and suitable
Autoconf macros assembled from across the entire GNU project. I
won't give an example of setting up your package to use this approach,
since that is how I have chosen to structure the project described in
this chapter.
Rather than writing code to the lowest common denominator of system
libraries, I am a strong advocate of the latter school of thought in the
majority of cases. As with all things it pays to take a pragmatic
approach; don't be afraid of the middle ground -- weigh the options on
a case by case basis.
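To make the second approach a little more concrete, here is a minimal sketch of what a fallback implementation can look like. It assumes that `configure.in' contains a check such as AC_CHECK_FUNCS(strrchr), so that HAVE_STRRCHR is defined in `config.h' when the host library provides the function. The name sketch_strrchr is hypothetical and is used only to avoid clashing with the system declaration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: HAVE_STRRCHR would come from config.h.  It is left
   undefined here, so the fallback branch is the one compiled. */
#if HAVE_STRRCHR
#  include <string.h>
#  define sketch_strrchr strrchr
#else /* !HAVE_STRRCHR */
extern char *sketch_strrchr (const char *string, int ch);

/* Return a pointer to the last occurrence of CH in STRING, or NULL
   if there is none.  The terminating NUL counts as part of STRING. */
char *
sketch_strrchr (const char *string, int ch)
{
  const char *last = NULL;

  assert (string != NULL);
  do
    {
      if (*string == (char) ch)
        last = string;
    }
  while (*string++ != '\0');

  return (char *) last;
}
#endif /* HAVE_STRRCHR */
```

The rest of the code calls one name regardless of the platform, and the preprocessor conditionals stay confined to a single place.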
9.1.6 K&R Compilers
K&R C is the name now used to describe the original C language specified
by Brian Kernighan and Dennis Ritchie (hence, `K&R'). I have
yet to see a C compiler that doesn't support code written in the K&R
style, yet it has fallen very much into disuse in favor of the newer
ANSI C standard. Although it is increasingly common for vendors to
unbundle their ANSI C compiler, the GCC
project (11) is available for all of the architectures I have ever
used.
There are four differences between the two C standards:
-
ANSI C expects full type specification in function prototypes, such
as you might supply in a library header file:
|
extern int functionname (const char *parameter1, size_t parameter2);
|
The nearest equivalent in K&R style C is a forward declaration, which
allows you to use a function before its corresponding definition:
|
extern int functionname ();
|
As you can imagine, K&R has very bad type safety, and does not perform
any checks that only function arguments of the correct type are used.
-
The function headers of each function definition are written
differently. Where you might see the following written in ANSI C:
|
int
functionname (const char *parameter1, size_t parameter2)
{
...
}
|
K&R expects the parameter type declarations separately, like this:
|
int
functionname (parameter1, parameter2)
const char *parameter1;
size_t parameter2;
{
...
}
|
-
There is no concept of an untyped pointer in K&R C. Where you might be
used to seeing `void *' pointers in ANSI code, you are forced
to overload the meaning of `char *' for K&R compilers.
-
Variadic functions are handled with a different API in K&R C,
imported with `#include <varargs.h>'. A K&R variadic function
definition looks like this:
|
int
functionname (va_alist)
va_dcl
{
va_list ap;
char *arg;
va_start (ap);
...
arg = va_arg (ap, char *);
...
va_end (ap);
return arg ? strlen (arg) : 0;
}
|
ANSI C provides a similar API, imported with `#include
<stdarg.h>', though it cannot express a variadic function with no named
arguments such as the one above. In practice, this isn't a problem
since you always need at least one parameter, either to specify the
total number of arguments somehow, or else to mark the end of the
argument list. An ANSI variadic function definition looks like
this:
|
int
functionname (char *format, ...)
{
va_list ap;
char *arg;
va_start (ap, format);
...
arg = va_arg (ap, char *);
...
va_end (ap);
return format ? strlen (format) : 0;
}
|
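The elided parts of the ANSI definition above can be filled in many ways. The following is only a sketch of one of them, using the first named parameter to carry the number of arguments that follow. The function sketch_total_length is hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Add up the lengths of COUNT strings passed after COUNT itself.
   The named parameter is what tells the function where to stop. */
size_t
sketch_total_length (int count, ...)
{
  va_list ap;
  size_t total = 0;
  int i;

  va_start (ap, count);
  for (i = 0; i < count; ++i)
    {
      const char *arg = va_arg (ap, const char *);

      assert (arg != NULL);
      total += strlen (arg);
    }
  va_end (ap);

  return total;
}
```

A call such as sketch_total_length (2, "ab", "cde") walks exactly two arguments. Nothing checks that the caller really supplied that many, which is the usual weakness of variadic interfaces.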
Except in very rare cases where you are writing a low level project
(GCC for example), you probably don't need to worry about K&R
compilers too much. However, supporting them can be very easy, and if
you are so inclined, can be handled either by employing the
ansi2knr program supplied with Automake, or by careful use of
the preprocessor.
Using ansi2knr in your project is described in some detail in
section `Automatic de-ANSI-fication' in The Automake Manual, but
boils down to the following:
-
Add the AM_C_PROTOTYPES macro to your `configure.in' file.
-
Rewrite the contents of `LIBOBJS' and/or `LTLIBOBJS' in
the following fashion:
|
# This is necessary so that .o files in LIBOBJS are also built via
# the ANSI2KNR-filtering rules.
Xsed='sed -e "s/^X//"'
LIBOBJS=`echo X"$LIBOBJS"|\
[$Xsed -e 's/\.[^.]* /.\$U& /g;s/\.[^.]*$/.\$U&/']`
|
Personally, I dislike this method, since every source file is filtered
and rewritten with ANSI function prototypes and declarations
converted to K&R style, adding a fair overhead in additional files in
your build tree, and in compilation time. This would be reasonable were
the abstraction sufficient to allow you to forget about K&R entirely,
but ansi2knr is a simple program: it does not address any of
the other differences between compilers that I raised above, and it
cannot handle macros in your function prototypes or definitions. If you
decide to use ansi2knr in your project, you must make the
decision before you write any code, and be aware of its limitations as
you develop.
For my own projects, I prefer to use a set of preprocessor macros along
with a few stylistic conventions so that all of the differences between
K&R and ANSI compilers are actually addressed, and so that the
unfortunate few who have no access to an ANSI compiler (and who
cannot use GCC for some reason) needn't suffer the overheads of
ansi2knr .
The four differences in style listed at the beginning of this subsection
are addressed as follows:
-
The function prototype argument lists are declared inside a PARAMS
macro invocation so that K&R compilers will still be able to compile the
source tree. PARAMS removes ANSI argument lists from
function prototypes for K&R compilers. Some developers
continue to use __P for this purpose, but strictly speaking,
macros starting with `_' (and especially `__') are reserved
for the compiler and the system headers, so using `PARAMS', as
follows, is safer:
|
#if __STDC__
# ifndef NOPROTOS
# define PARAMS(args) args
# endif
#endif
#ifndef PARAMS
# define PARAMS(args) ()
#endif
|
This macro is then used for all function declarations like this:
|
extern int functionname PARAMS((const char *parameter));
|
-
With the PARAMS macro used for all function declarations,
ANSI compilers are given all the type information they require to
do full compile time type checking. The function definitions
proper must then be declared in K&R style so that K&R compilers don't
choke on ANSI syntax. There is a small amount of overhead in
writing code this way, however: The ANSI compile time type
checking can only work in conjunction with K&R function definitions if
it first sees an ANSI function prototype. This forces you to
develop the good habit of prototyping every single function in
your project. Even the static ones.
-
The easiest way to work around the lack of void * pointers is to
define a new type that is conditionally set to void * for
ANSI compilers, or char * for K&R compilers. You
should add the following to a common header file:
|
#if __STDC__
typedef void *void_ptr;
#else /* !__STDC__ */
typedef char *void_ptr;
#endif /* __STDC__ */
|
-
The difference between the two variadic function APIs poses a
stickier problem, and the solution is ugly. But it does work.
First you must check for the headers in `configure.in':
|
AC_CHECK_HEADERS(stdarg.h varargs.h, break)
|
Having done this, add the following code to a common header file:
|
#if HAVE_STDARG_H
# include <stdarg.h>
# define VA_START(a, f) va_start(a, f)
#else
# if HAVE_VARARGS_H
# include <varargs.h>
# define VA_START(a, f) va_start(a)
# endif
#endif
#ifndef VA_START
error no variadic api
#endif
|
You must now supply each variadic function with both a K&R and an
ANSI definition, like this:
|
int
#if HAVE_STDARG_H
functionname (const char *format, ...)
#else
functionname (format, va_alist)
const char *format;
va_dcl
#endif
{
va_list ap;
char *arg;
VA_START (ap, format);
...
arg = va_arg (ap, char *);
...
va_end (ap);
return arg ? strlen (arg) : 0;
}
|
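As a small sketch of the first and third conventions working together, the block below declares a hypothetical function, sketch_swap, through PARAMS and passes untyped memory as void_ptr. The definition is deliberately written in ANSI style so that the sketch compiles with current compilers, since the most recent C standard no longer accepts K&R style definitions. In a tree that really targets K&R compilers the definition would be written in K&R style, as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Repeated from the text so that the sketch stands alone. */
#if __STDC__
#  ifndef NOPROTOS
#    define PARAMS(args) args
#  endif
#endif
#ifndef PARAMS
#  define PARAMS(args) ()
#endif

#if __STDC__
typedef void *void_ptr;
#else /* !__STDC__ */
typedef char *void_ptr;
#endif /* __STDC__ */

/* Note the doubled parentheses: the whole argument list is a single
   macro argument, which a K&R compiler would see replaced by ().  */
extern void sketch_swap PARAMS((void_ptr a, void_ptr b, size_t size));

/* Hypothetical helper: exchange SIZE bytes between A and B. */
void
sketch_swap (void_ptr a, void_ptr b, size_t size)
{
  char *pa = (char *) a;
  char *pb = (char *) b;

  assert (a != NULL && b != NULL);
  while (size-- > 0)
    {
      char tmp = *pa;
      *pa++ = *pb;
      *pb++ = tmp;
    }
}
```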
9.2 A Simple Shell Builders Library
An application which most developers try their hand at sooner or later
is a Unix shell. There is a lot of functionality common to all
traditional command line shells, which I thought I would push into a
portable library to get you over the first hurdle when that moment is
upon you. Before elaborating on any of this I need to name the
project. I've called it sic, from the Latin "so it is",
because like all good project names it is somewhat pretentious and it
lends itself to the recursive acronym "sic is cumulative".
The gory detail of the minutiae of the source is beyond the scope of
this book, but to convey a feel for the need for Sic, some of the
goals which influenced the design follow:
-
Sic must be very small so that, in addition to being used as the basis
for a full blown shell, it can be linked (unadorned) into an application
and used for trivial tasks, such as reading startup configuration.
-
It must not be tied to a particular syntax or set of reserved words. If
you use it to read your startup configuration, I don't want to force you
to use my syntax and commands.
-
The boundary between the library (`libsic') and the application
must be well defined. Sic will take strings of characters as input, and
internally parse and evaluate them according to registered commands and
syntax, returning results or diagnostics as appropriate.
-
It must be extremely portable -- that is what I am trying to illustrate
here, after all.
9.2.1 Portability Infrastructure
As I explained in 9.1.1 Project Directory Structure, I'll first create
the project directories, a top-level directory and a subdirectory to put
the library sources into. I want to install the library header files
to `/usr/local/include/sic', so the library subdirectory must be
named appropriately. See section 9.1.2 C Header Files.
|
$ mkdir sic
$ mkdir sic/sic
$ cd sic/sic
|
I will describe the files I add in this section in more detail than the
project specific sources, because they comprise an infrastructure that I
use relatively unchanged for all of my GNU Autotools projects. You could
keep an archive of these files, and use them as a starting point
each time you begin a new project of your own.
9.2.1.1 Error Management
A good place to start with any project design is the error management
facility. In Sic I will use a simple group of functions to display
simple error messages. Here is `sic/error.h':
|
#ifndef SIC_ERROR_H
#define SIC_ERROR_H 1
#include <sic/common.h>
BEGIN_C_DECLS
extern const char *program_name;
extern void set_program_name (const char *argv0);
extern void sic_warning (const char *message);
extern void sic_error (const char *message);
extern void sic_fatal (const char *message);
END_C_DECLS
#endif /* !SIC_ERROR_H */
|
This header file follows the principles set out in 9.1.2 C Header Files.
I am storing the program_name variable in the library that uses
it, so that I can be sure that the library will build on architectures
that don't allow undefined symbols in libraries (12).
Keeping those preprocessor macro definitions designed to aid code
portability together (in a single file), is a good way to maintain the
readability of the rest of the code. For this project I will put that
code in `common.h':
|
#ifndef SIC_COMMON_H
#define SIC_COMMON_H 1
#if HAVE_CONFIG_H
# include <sic/config.h>
#endif
#include <stdio.h>
#include <sys/types.h>
#if STDC_HEADERS
# include <stdlib.h>
# include <string.h>
#elif HAVE_STRINGS_H
# include <strings.h>
#endif /*STDC_HEADERS*/
#if HAVE_UNISTD_H
# include <unistd.h>
#endif
#if HAVE_ERRNO_H
# include <errno.h>
#endif /*HAVE_ERRNO_H*/
#ifndef errno
/* Some systems #define this! */
extern int errno;
#endif
#endif /* !SIC_COMMON_H */
|
You may recognise some snippets of code from the Autoconf manual here --
in particular the inclusion of the project `config.h', which will
be generated shortly. Notice that I have been careful to conditionally
include any headers which are not guaranteed to exist on every
architecture. The rule of thumb here is that only `stdio.h' is
ubiquitous (though I have never heard of a machine that has no
`sys/types.h'). You can find more details of some of these in
section `Existing Tests' in The GNU Autoconf Manual.
Here is a little more code from `common.h':
|
#ifndef EXIT_SUCCESS
# define EXIT_SUCCESS 0
# define EXIT_FAILURE 1
#endif
|
The implementation of the error handling functions goes in
`error.c' and is very straightforward:
|
#if HAVE_CONFIG_H
# include <sic/config.h>
#endif
#include "common.h"
#include "error.h"
static void error (int exit_status, const char *mode,
const char *message);
static void
error (int exit_status, const char *mode, const char *message)
{
fprintf (stderr, "%s: %s: %s.\n", program_name, mode, message);
if (exit_status >= 0)
exit (exit_status);
}
void
sic_warning (const char *message)
{
error (-1, "warning", message);
}
void
sic_error (const char *message)
{
error (-1, "ERROR", message);
}
void
sic_fatal (const char *message)
{
error (EXIT_FAILURE, "FATAL", message);
}
|
I also need a definition of program_name;
set_program_name copies the filename component of path into
the exported data, program_name. The xstrdup function
just calls strdup, but aborts if there is not enough
memory to make the copy:
|
const char *program_name = NULL;
void
set_program_name (const char *path)
{
if (!program_name)
program_name = xstrdup (basename (path));
}
|
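The basename call above is not defined anywhere in this section, and the system version is not available everywhere (POSIX also permits it to modify its argument). Purely as an illustration of what extracting "the filename component" means, here is a sketch of a hypothetical helper, sketch_basename, which assumes `/' is the only directory separator:

```c
#include <assert.h>
#include <string.h>

/* Return a pointer to the part of PATH after the last '/', or PATH
   itself when it contains no '/'.  The result points into PATH. */
const char *
sketch_basename (const char *path)
{
  const char *slash;

  assert (path != NULL);
  slash = strrchr (path, '/');
  return slash ? slash + 1 : path;
}
```

A path ending in `/' yields an empty string with this sketch. A more careful version would have to decide what to do in that case.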
9.2.1.2 Memory Management
A useful idiom common to many GNU projects is to wrap the memory
management functions to localise out of memory handling, naming
them with an `x' prefix. By doing this, the rest of the project is
relieved of having to remember to check for `NULL' returns from the
various memory functions. These wrappers use the error API
to report memory exhaustion and abort the program. I have placed the
implementation code in `xmalloc.c':
|
#if HAVE_CONFIG_H
# include <sic/config.h>
#endif
#include "common.h"
#include "error.h"
void *
xmalloc (size_t num)
{
void *new = malloc (num);
if (!new)
sic_fatal ("Memory exhausted");
return new;
}
void *
xrealloc (void *p, size_t num)
{
void *new;
if (!p)
return xmalloc (num);
new = realloc (p, num);
if (!new)
sic_fatal ("Memory exhausted");
return new;
}
void *
xcalloc (size_t num, size_t size)
{
void *new = xmalloc (num * size);
bzero (new, num * size);
return new;
}
|
Notice in the code above that xcalloc is implemented in terms of
xmalloc, since calloc itself is not available in some
older C libraries.
Rather than create a separate `xmalloc.h' file, which would need to
be #included from almost everywhere else, the logical place to
declare these functions is in `common.h', since the wrappers will
be called from almost everywhere else in the code:
|
#ifdef __cplusplus
# define BEGIN_C_DECLS extern "C" {
# define END_C_DECLS }
#else
# define BEGIN_C_DECLS
# define END_C_DECLS
#endif
#define XCALLOC(type, num) \
((type *) xcalloc ((num), sizeof(type)))
#define XMALLOC(type, num) \
((type *) xmalloc ((num) * sizeof(type)))
#define XREALLOC(type, p, num) \
((type *) xrealloc ((p), (num) * sizeof(type)))
#define XFREE(stale) do { \
if (stale) { free (stale); stale = 0; } \
} while (0)
BEGIN_C_DECLS
extern void *xcalloc (size_t num, size_t size);
extern void *xmalloc (size_t num);
extern void *xrealloc (void *p, size_t num);
extern char *xstrdup (const char *string);
extern char *xstrerror (int errnum);
END_C_DECLS
|
By using the macros defined here, allocating and freeing heap memory is
reduced from:
|
char **argv = (char **) xmalloc (sizeof (char *) * 3);
do_stuff (argv);
if (argv)
free (argv);
|
to the simpler and more readable:
|
char **argv = XMALLOC (char *, 3);
do_stuff (argv);
XFREE (argv);
|
In the same spirit, I have borrowed `xstrdup.c' and
`xstrerror.c' from the GNU project's libiberty. See section 9.1.5 Fallback Function Implementations.
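The libiberty sources are not reproduced here. To show the shape of such a wrapper, the following is a sketch written from scratch, not the libiberty code. It assumes that printing a message and exiting is an acceptable response to memory exhaustion, and the name sketch_xstrdup is hypothetical. It builds on malloc because strdup itself is not guaranteed by ANSI C.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated copy of STRING.  Never returns NULL:
   if memory runs out, report the problem and exit. */
char *
sketch_xstrdup (const char *string)
{
  size_t size;
  char *copy;

  assert (string != NULL);
  size = strlen (string) + 1;   /* room for the terminating NUL */
  copy = (char *) malloc (size);
  if (!copy)
    {
      fprintf (stderr, "sketch: FATAL: Memory exhausted.\n");
      exit (EXIT_FAILURE);
    }
  memcpy (copy, string, size);
  return copy;
}
```

Callers own the returned memory and release it with free, or with the XFREE macro shown earlier.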
9.2.1.3 Generalised List Data Type
In many C programs you will see various implementations and
re-implementations of lists and stacks, each tied to its own particular
project. It is surprisingly simple to write a catch-all implementation,
as I have done here with a generalised list operation API in
`list.h':
|
#ifndef SIC_LIST_H
#define SIC_LIST_H 1
#include <sic/common.h>
BEGIN_C_DECLS
typedef struct list {
struct list *next; /* chain forward pointer*/
void *userdata; /* in case you want to use raw Lists */
} List;
extern List *list_new (void *userdata);
extern List *list_cons (List *head, List *tail);
extern List *list_tail (List *head);
extern size_t list_length (List *head);
END_C_DECLS
#endif /* !SIC_LIST_H */
|
The trick is to ensure that any structures you want to chain together
have their forward pointer in the first field. Having done that, the
generic functions declared above can be used to manipulate any such
chain by casting it to List * and back again as necessary.
For example:
|
struct foo {
struct foo *next;
char *bar;
struct baz *qux;
...
};
...
struct foo *foo_list = NULL;
foo_list = (struct foo *) list_cons ((List *) new_foo (),
(List *) foo_list);
...
|
The implementation of the list manipulation functions is in
`list.c':
|
#include "list.h"
List *
list_new (void *userdata)
{
List *new = XMALLOC (List, 1);
new->next = NULL;
new->userdata = userdata;
return new;
}
List *
list_cons (List *head, List *tail)
{
head->next = tail;
return head;
}
List *
list_tail (List *head)
{
return head->next;
}
size_t
list_length (List *head)
{
size_t n;
for (n = 0; head; ++n)
head = head->next;
return n;
}
|
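To see the casting trick in action, here is a small self-contained sketch. It repeats the List type and two of the functions from above so that it compiles alone, and chains together three instances of a hypothetical struct sketch_word whose first field is the forward pointer. Note that the technique relies on the client structure and List sharing the same layout for that first field.

```c
#include <assert.h>
#include <stddef.h>

/* Repeated from list.h and list.c so that the sketch stands alone. */
typedef struct list {
  struct list *next;
  void *userdata;
} List;

static List *
list_cons (List *head, List *tail)
{
  head->next = tail;
  return head;
}

static size_t
list_length (List *head)
{
  size_t n;

  for (n = 0; head; ++n)
    head = head->next;
  return n;
}

/* Hypothetical client structure: forward pointer in the first field. */
struct sketch_word {
  struct sketch_word *next;
  const char *text;
};

/* Chain three words together and report the length of the chain. */
size_t
sketch_chain_three (void)
{
  struct sketch_word a, b, c;
  List *chain = NULL;

  a.text = "sic";
  b.text = "is";
  c.text = "cumulative";

  /* Build back to front: each cons pushes a new head on the chain. */
  chain = list_cons ((List *) &c, chain);
  chain = list_cons ((List *) &b, chain);
  chain = list_cons ((List *) &a, chain);

  assert (chain == (List *) &a);
  return list_length (chain);
}
```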
9.2.2.1 `sic.c' & `sic.h'
Here are the functions for creating and managing sic parsers.
|
#ifndef SIC_SIC_H
#define SIC_SIC_H 1
#include <sic/common.h>
#include <sic/error.h>
#include <sic/list.h>
#include <sic/syntax.h>
typedef struct sic {
char *result; /* result string */
size_t len; /* bytes used by result field */
size_t lim; /* bytes allocated to result field */
struct builtintab *builtins; /* tables of builtin functions */
SyntaxTable **syntax; /* dispatch table for syntax of input */
List *syntax_init; /* stack of syntax state initialisers */
List *syntax_finish; /* stack of syntax state finalizers */
SicState *state; /* state data from syntax extensions */
} Sic;
#endif /* !SIC_SIC_H */
|
9.2.2.2 `builtin.c' & `builtin.h'
Here are the functions for managing tables of builtin commands in each
Sic structure:
|
typedef int (*builtin_handler) (Sic *sic,
int argc, char *const argv[]);
typedef struct {
const char *name;
builtin_handler func;
int min, max;
} Builtin;
typedef struct builtintab BuiltinTab;
extern Builtin *builtin_find (Sic *sic, const char *name);
extern int builtin_install (Sic *sic, Builtin *table);
extern int builtin_remove (Sic *sic, Builtin *table);
|
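The implementation behind these declarations is not shown in this section. As a rough sketch of the kind of lookup builtin_find might perform, the block below searches a table by name. It makes several assumptions of its own: the handler type drops the Sic argument, tables end with an entry whose name is NULL, and the min and max values are mere placeholders because their interpretation is not given here. All of the sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for builtin_handler and Builtin. */
typedef int (*sketch_handler) (int argc, char *const argv[]);

typedef struct {
  const char *name;
  sketch_handler func;
  int min, max;                 /* placeholders in this sketch */
} SketchBuiltin;

/* A trivial handler that reports how many arguments it was given. */
static int
sketch_echo (int argc, char *const argv[])
{
  (void) argv;
  return argc;
}

/* Assumption: a NULL name marks the end of a table. */
static SketchBuiltin sketch_table[] = {
  { "echo", sketch_echo, 0, 0 },
  { NULL, NULL, 0, 0 }
};

/* Return the entry in TABLE called NAME, or NULL if there is none. */
SketchBuiltin *
sketch_builtin_find (SketchBuiltin *table, const char *name)
{
  assert (table != NULL && name != NULL);
  for (; table->name != NULL; ++table)
    if (strcmp (table->name, name) == 0)
      return table;
  return NULL;
}
```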
9.2.2.3 `eval.c' & `eval.h'
Having created a Sic parser, and populated it with some
Builtin handlers, a user of this library must tokenize and
evaluate its input stream. These files define a structure for storing
tokenized strings (Tokens), and functions for converting
char * strings both to and from this structure type:
|
#ifndef SIC_EVAL_H
#define SIC_EVAL_H 1
#include <sic/common.h>
#include <sic/sic.h>
BEGIN_C_DECLS
typedef struct {
int argc; /* number of elements in ARGV */
char **argv; /* array of pointers to elements */
size_t lim; /* number of bytes allocated */
} Tokens;
extern int eval (Sic *sic, Tokens *tokens);
extern int untokenize (Sic *sic, char **pcommand, Tokens *tokens);
extern int tokenize (Sic *sic, Tokens **ptokens, char **pcommand);
END_C_DECLS
#endif /* !SIC_EVAL_H */
|
These files also define the eval function, which examines a
Tokens structure in the context of the given Sic parser,
dispatching the argv array to a relevant Builtin handler,
also written by the library user.
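The tokenize implementation itself is not listed here. To give a feel for what filling an argc and argv pair from a string involves, the following sketch splits a writable buffer on whitespace in place. It is not the library's code: sketch_split_words is a hypothetical function, it does no allocation, and it ignores the Sic parser and any registered syntax.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Split COMMAND in place into whitespace-delimited words, storing a
   pointer to each word in ARGV.  At most MAX words are stored, and
   the number of words found is returned.  COMMAND is modified: the
   whitespace character after each stored word is overwritten with
   a NUL. */
int
sketch_split_words (char *command, char **argv, int max)
{
  int argc = 0;
  char *p = command;

  assert (command != NULL && argv != NULL);

  while (*p != '\0' && argc < max)
    {
      /* Skip leading whitespace. */
      while (isspace ((unsigned char) *p))
        ++p;
      if (*p == '\0')
        break;

      /* Record the start of the word, then find its end. */
      argv[argc++] = p;
      while (*p != '\0' && !isspace ((unsigned char) *p))
        ++p;
      if (*p != '\0')
        *p++ = '\0';
    }

  return argc;
}
```

Called on a buffer holding "  echo hello   world " with room for four words, this stores three pointers, one to each word.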
9.2.2.4 `syntax.c' & `syntax.h'
When tokenize splits a char * string into parts, by
default it breaks the string into words delimited by whitespace. These
files define the interface for changing this default behaviour, by
registering callback functions which the parser will run when it meets
an `interesting' symbol in the input stream. Here are the
declarations from `syntax.h':
|
BEGIN_C_DECLS
typedef int SyntaxHandler (struct sic *sic, BufferIn *in,
BufferOut *out);
typedef struct syntax {
SyntaxHandler *handler;
char *ch;
} Syntax;
extern int syntax_install (struct sic *sic, Syntax *table);
extern SyntaxHandler *syntax_handler (struct sic *sic, int ch);
END_C_DECLS
|
A SyntaxHandler is a function called by tokenize as it
consumes its input to create a Tokens structure; the two
functions associate a table of such handlers with a given Sic
parser, and find the particular handler for a given character in that
Sic parser, respectively.
9.2.3 Beginnings of a `configure.in'
Now that I have some code, I can run autoscan to generate a
preliminary `configure.in'. autoscan will examine all of
the sources in the current directory tree looking for common points of
non-portability, adding macros suitable for detecting the discovered
problems. autoscan generates the following in
`configure.scan':
|
# Process this file with autoconf to produce a configure script.
AC_INIT(sic/eval.h)
# Checks for programs.
# Checks for libraries.
# Checks for header files.
AC_HEADER_STDC
AC_CHECK_HEADERS(strings.h unistd.h)
# Checks for typedefs, structures, and compiler characteristics.
AC_C_CONST
AC_TYPE_SIZE_T
# Checks for library functions.
AC_FUNC_VPRINTF
AC_CHECK_FUNCS(strerror)
AC_OUTPUT()
|
Since the generated `configure.scan' does not overwrite your
project's `configure.in', it is a good idea to run
autoscan periodically even in established project source
trees, and compare the two files. Sometimes autoscan will
find some portability issue you have overlooked, or weren't aware of.
Looking through the documentation for the macros in this
`configure.scan', AC_C_CONST and AC_TYPE_SIZE_T will
take care of themselves (provided I ensure that `config.h' is
included into every source file), and AC_HEADER_STDC and
AC_CHECK_HEADERS(unistd.h) are already taken care of in
`common.h'.
autoscan is no silver bullet! Even here in this
simple example, I need to manually add macros to check for the presence
of `errno.h':
|
AC_CHECK_HEADERS(errno.h strings.h unistd.h)
|
I also need to manually add the Autoconf macro for generating
`config.h'; a macro to initialise automake support; and a
macro to check for the presence of ranlib . These should go
close to the start of `configure.in':
|
...
AC_CONFIG_HEADER(config.h)
AM_INIT_AUTOMAKE(sic, 0.5)
AC_PROG_CC
AC_PROG_RANLIB
...
|
An interesting macro suggested by autoscan is
AC_CHECK_FUNCS(strerror) . This tells me that I need to provide a
replacement implementation of strerror for the benefit of
architectures which don't have it in their system libraries. This is
resolved by providing a file with a fallback implementation for the
named function, and creating a library from it and any others that
`configure' discovers to be lacking from the system library on the
target host.
You will recall that `configure' is the shell script the end user
of this package will run on their machine to test that it has all the
features the package wants to use. The library that is created will
allow the rest of the project to be written in the knowledge that any
functions required by the project but missing from the installer's system
libraries will be available nonetheless. GNU `libiberty'
comes to the rescue again -- it already has an implementation of
`strerror.c' that I was able to use with a little modification.
Being able to supply a simple implementation of strerror , as the
`strerror.c' file from `libiberty' does, relies on there being
a well defined sys_errlist variable. It is a fair bet, however, that if
the target host has no strerror implementation, the system
sys_errlist will be broken or missing too. I need to write a
configure macro to check whether the system defines sys_errlist ,
and tailor the code in `strerror.c' to use this knowledge.
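As a rough illustration of what tailoring the code could look like, the sketch below consults sys_errlist only when HAVE_SYS_ERRLIST is defined, and otherwise falls back to a generic message. This is not the `libiberty' code: the function name sketch_strerror is hypothetical, and the declarations of sys_errlist and sys_nerr are assumptions based on how these variables were historically declared.

```c
#include <assert.h>
#include <stdio.h>

#if HAVE_SYS_ERRLIST
/* Assumed historical declarations; check the target system's headers.  */
extern char *sys_errlist[];
extern int sys_nerr;
#endif

/* Hypothetical name, chosen so the sketch does not clash with the
   C library's own strerror.  */
const char *
sketch_strerror (int errnum)
{
  static char buf[32];   /* "Error " plus any int fits comfortably */
  int n;

#if HAVE_SYS_ERRLIST
  /* Use the system table only for error numbers it actually covers.  */
  if (errnum >= 0 && errnum < sys_nerr && sys_errlist[errnum] != NULL)
    return sys_errlist[errnum];
#endif

  /* No usable table entry: synthesise a message instead.  */
  n = sprintf (buf, "Error %d", errnum);
  assert (n > 0 && n < (int) sizeof buf);
  return buf;
}
```

When HAVE_SYS_ERRLIST is not defined the system table is never referenced, so the file still links on hosts where the variable is missing.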
To avoid clutter in the top-level directory, I am a great believer in
keeping as many of the configuration files as possible in their own
sub-directory. First of all, I will create a new directory called
`config' inside the top-level directory, and put
`sys_errlist.m4' inside it:
|
AC_DEFUN(SIC_VAR_SYS_ERRLIST,
[AC_CACHE_CHECK([for sys_errlist],
sic_cv_var_sys_errlist,
[AC_TRY_LINK([int *p;], [extern int sys_errlist; p = &sys_errlist;],
            sic_cv_var_sys_errlist=yes, sic_cv_var_sys_errlist=no)])
if test x"$sic_cv_var_sys_errlist" = xyes; then
  AC_DEFINE(HAVE_SYS_ERRLIST, 1,
    [Define if your system libraries have a sys_errlist variable.])
fi])
|
I must then add a call to this new macro, SIC_VAR_SYS_ERRLIST , in the
`configure.in' file, being careful to put it in the right place:
somewhere between the checks for typedefs and structures and the checks
for library functions, according to the comments in `configure.scan'.
GNU Autotools can also be set to store most of their files in a
subdirectory, by calling the AC_CONFIG_AUX_DIR macro near the top
of `configure.in', preferably right after AC_INIT :
|
AC_INIT(sic/eval.c)
AC_CONFIG_AUX_DIR(config)
AM_CONFIG_HEADER(config.h)
...
|
Having made this change, many of the files added by running
autoconf and automake --add-missing will be put in
the aux_dir.
The source tree now looks like this:
|
sic/
  +-- configure.scan
  +-- config/
  |     +-- sys_errlist.m4
  +-- replace/
  |     +-- strerror.c
  +-- sic/
        +-- builtin.c
        +-- builtin.h
        +-- common.h
        +-- error.c
        +-- error.h
        +-- eval.c
        +-- eval.h
        +-- list.c
        +-- list.h
        +-- sic.c
        +-- sic.h
        +-- syntax.c
        +-- syntax.h
        +-- xmalloc.c
        +-- xstrdup.c
        +-- xstrerror.c
|
In order to correctly utilise the fallback implementation,
AC_CHECK_FUNCS(strerror) needs to be removed and strerror
added to AC_REPLACE_FUNCS :
|
# Checks for library functions.
AC_REPLACE_FUNCS(strerror)
|
This will be clearer if you look at the `Makefile.am' for the
`replace' subdirectory:
|
## Makefile.am -- Process this file with automake to produce Makefile.in
INCLUDES = -I$(top_builddir) -I$(top_srcdir)
noinst_LIBRARIES = libreplace.a
libreplace_a_SOURCES =
libreplace_a_LIBADD = @LIBOBJS@
|
The code tells automake that I want to build a library for use
within the build tree (i.e. not installed -- `noinst'), and that
has no source files by default. The clever part here is that when
someone comes to install Sic, they will run configure which
will test for strerror , and add `strerror.o' to
LIBOBJS if the target host environment is missing its own
implementation. Now, when `configure' creates
`replace/Makefile' (as I asked it to with AC_OUTPUT ),
`@LIBOBJS@' is replaced by the list of objects required on the
installer's machine.
Having done all this at configure time, when my user runs
make , the files required to replace functions missing
from their target machine will be added to `libreplace.a'.
Unfortunately this is not quite enough to start building the project.
First I need to add a top-level `Makefile.am' from which to
ultimately create a top-level `Makefile' that will descend into
the various subdirectories of the project:
|
## Makefile.am -- Process this file with automake to produce Makefile.in
SUBDIRS = replace sic
|
And `configure.in' must be told where it can find instances of
Makefile.in :
|
AC_OUTPUT(Makefile replace/Makefile sic/Makefile)
|
I have written a bootstrap script for Sic (for details see
8. Bootstrapping):
|
#! /bin/sh
set -x
aclocal -I config
autoheader
automake --foreign --add-missing --copy
autoconf
|
The `--foreign' option to automake tells it to relax
the GNU standards for various files that should be present in a
GNU distribution. Using this option saves me from having to create
empty files as we did in 5. A Minimal GNU Autotools Project.
Right. Let's build the library! First, I'll run bootstrap :
|
$ ./bootstrap
+ aclocal -I config
+ autoheader
+ automake --foreign --add-missing --copy
automake: configure.in: installing config/install-sh
automake: configure.in: installing config/mkinstalldirs
automake: configure.in: installing config/missing
+ autoconf
|
The project is now in the same state that an end-user would see, having
unpacked a distribution tarball. What follows is what an end user might
expect to see when building from that tarball:
|
$ ./configure
creating cache ./config.cache
checking for a BSD compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking whether make sets ${MAKE}... yes
checking for working aclocal... found
checking for working autoconf... found
checking for working automake... found
checking for working autoheader... found
checking for working makeinfo... found
checking for gcc... gcc
checking whether the C compiler (gcc ) works... yes
checking whether the C compiler (gcc ) is a cross-compiler... no
checking whether we are using GNU C... yes
checking whether gcc accepts -g... yes
checking for ranlib... ranlib
checking how to run the C preprocessor... gcc -E
checking for ANSI C header files... yes
checking for unistd.h... yes
checking for errno.h... yes
checking for string.h... yes
checking for working const... yes
checking for size_t... yes
checking for strerror... yes
updating cache ./config.cache
creating ./config.status
creating Makefile
creating replace/Makefile
creating sic/Makefile
creating config.h
|
Compare this output with the contents of `configure.in', and notice
how each macro is ultimately responsible for one or more consecutive
tests (via the Bourne shell code generated in `configure'). Now
that the `Makefile's have been successfully created, it is safe to
call make to perform the actual compilation:
|
$ make
make all-recursive
make[1]: Entering directory `/tmp/sic'
Making all in replace
make[2]: Entering directory `/tmp/sic/replace'
rm -f libreplace.a
ar cru libreplace.a
ranlib libreplace.a
make[2]: Leaving directory `/tmp/sic/replace'
Making all in sic
make[2]: Entering directory `/tmp/sic/sic'
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c builtin.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c error.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c eval.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c list.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c sic.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c syntax.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c xmalloc.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c xstrdup.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -c xstrerror.c
rm -f libsic.a
ar cru libsic.a builtin.o error.o eval.o list.o sic.o syntax.o xmalloc.o
xstrdup.o xstrerror.o
ranlib libsic.a
make[2]: Leaving directory `/tmp/sic/sic'
make[1]: Leaving directory `/tmp/sic'
|
On this machine, as you can see from the output of configure
above, I have no need of the fallback implementation of strerror ,
so `libreplace.a' is empty. On another machine this might not be
the case. In any event, I now have a compiled `libsic.a' -- so
far, so good.
9.3 A Sample Shell Application
What I need now is a program that uses `libsic.a', if only to give
me confidence that it is working. In this section, I will write a
simple shell which uses the library. But first, I'll create a directory
to put it in:
|
$ mkdir src
$ ls -F
COPYING Makefile.am aclocal.m4 configure* config/ sic/
INSTALL Makefile.in bootstrap* configure.in replace/ src/
$ cd src
|
In order to put this shell together, we need to provide just a few
things for integration with `libsic.a'...
9.3.1 `sic_repl.c'
In `sic_repl.c'(13) there is a loop for
reading strings typed by the user, evaluating them and printing the
results. GNU readline is ideally suited to this, but it is not
always available -- or sometimes people simply may not wish to use it.
With the help of GNU Autotools, it is very easy to cater for building with
and without GNU readline. `sic_repl.c' uses this function to
read lines of input from the user:
|
static char *
getline (FILE *in, const char *prompt)
{
  static char *buf = NULL;      /* Always allocated and freed
                                   from inside this function.  */
  XFREE (buf);

  buf = (char *) readline ((char *) prompt);

#ifdef HAVE_ADD_HISTORY
  if (buf && *buf)
    add_history (buf);
#endif

  return buf;
}
|
To make this work, I must write an Autoconf macro which adds an option
to `configure', so that when the package is installed, it will use
the readline library if `--with-readline' is used:
|
AC_DEFUN(SIC_WITH_READLINE,
[AC_ARG_WITH(readline,
[  --with-readline         compile with the system readline library],
[if test x"${withval-no}" != xno; then
  sic_save_LIBS=$LIBS
  AC_CHECK_LIB(readline, readline)
  if test x"${ac_cv_lib_readline_readline}" = xno; then
    AC_MSG_ERROR(libreadline not found)
  fi
  LIBS=$sic_save_LIBS
fi])
AM_CONDITIONAL(WITH_READLINE, test x"${with_readline-no}" != xno)
])
|
Having put this macro in the file `config/readline.m4', I must also
call the new macro (SIC_WITH_READLINE ) from `configure.in'.
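For a build configured without `--with-readline', something has to stand in for the readline function. The following is a minimal sketch of one way to do that with fgets . The name sketch_readline and its two-argument signature are hypothetical, and this is not necessarily how the Sic sources handle the case. Like readline , it returns freshly allocated storage, with the trailing newline removed, that the caller must free.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for readline(): print PROMPT, then return the
   next line of IN in malloc'd storage without its trailing newline,
   or NULL at end of input.  The caller frees the result.  */
char *
sketch_readline (FILE *in, const char *prompt)
{
  size_t size = 64, len = 0;
  char *buf;

  assert (in != NULL);

  if (prompt != NULL)
    {
      fputs (prompt, stdout);
      fflush (stdout);
    }

  buf = malloc (size);
  if (buf == NULL)
    return NULL;

  while (fgets (buf + len, (int) (size - len), in) != NULL)
    {
      char *bigger;

      len += strlen (buf + len);
      if (len > 0 && buf[len - 1] == '\n')
        {
          buf[len - 1] = '\0';
          return buf;
        }

      /* No newline yet: the line is longer than the buffer, so grow it.  */
      size *= 2;
      bigger = realloc (buf, size);
      if (bigger == NULL)
        {
          free (buf);
          return NULL;
        }
      buf = bigger;
    }

  if (len > 0)
    return buf;     /* final line with no terminating newline */

  free (buf);
  return NULL;
}
```

Because the result is heap-allocated and owned by the caller, a wrapper such as getline above could release it with XFREE exactly as it does for a string returned by readline .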
9.3.2 `sic_syntax.c'
The syntax of the commands in the shell I am writing is defined by a set
of syntax handlers which are loaded into `libsic' at startup. I
can get the C preprocessor to do most of the repetitive code for me, and
just fill in the function bodies:
|
#if HAVE_CONFIG_H
#  include <sic/config.h>
#endif

#include "sic.h"

/* List of builtin syntax. */
#define syntax_functions                \
        SYNTAX(escape,  "\\")           \
        SYNTAX(space,   " \f\n\r\t\v")  \
        SYNTAX(comment, "#")            \
        SYNTAX(string,  "\"")           \
        SYNTAX(endcmd,  ";")            \
        SYNTAX(endstr,  "")

/* Prototype Generator. */
#define SIC_SYNTAX(name)                \
        int name (Sic *sic, BufferIn *in, BufferOut *out)

#define SYNTAX(name, string)            \
        extern SIC_SYNTAX (CONC (syntax_, name));
syntax_functions
#undef SYNTAX

/* Syntax handler mappings. */
Syntax syntax_table[] = {

#define SYNTAX(name, string)            \
        { CONC (syntax_, name), string },
  syntax_functions
#undef SYNTAX

  { NULL, NULL }
};
|
This code writes the prototypes for the syntax handler functions, and
creates a table which associates each with one or more characters that
might occur in the input stream. The advantage of writing the code this
way is that when I want to add a new syntax handler later, it is a simple
matter of adding a new row to the syntax_functions macro, and
writing the function itself.
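The same technique can be tried in isolation. The sketch below is a cut-down, self-contained version of the pattern with two rows instead of six. Every name in it (XM_CONC, xm_functions, XmHandler, XmEntry, xm_table, xm_lookup) is made up for the illustration, and XM_CONC is only assumed to behave like the CONC macro used above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Token pasting helper; CONC is assumed to work along these lines.  */
#define XM_CONC(a, b) a##b

/* One row per handler: add a row here and write one function.  */
#define xm_functions            \
        XM (space,   " \t")     \
        XM (comment, "#")

typedef int XmHandler (int ch);

/* First expansion: a prototype for each row.  */
#define XM(name, string)        \
        XmHandler XM_CONC (xm_, name);
xm_functions
#undef XM

typedef struct {
  XmHandler *handler;
  const char *chars;
} XmEntry;

/* Second expansion: a table entry for each row.  */
XmEntry xm_table[] = {
#define XM(name, string)        \
        { XM_CONC (xm_, name), string },
  xm_functions
#undef XM
  { NULL, NULL }
};

/* The hand-written function bodies.  */
int xm_space (int ch)   { (void) ch; return 1; }
int xm_comment (int ch) { (void) ch; return 2; }

/* Return the handler whose character set contains CH, or NULL.  */
XmHandler *
xm_lookup (int ch)
{
  const XmEntry *entry;

  for (entry = xm_table; entry->handler != NULL; ++entry)
    {
      assert (entry->chars != NULL);   /* every real row has a string */
      if (ch != '\0' && strchr (entry->chars, ch) != NULL)
        return entry->handler;
    }
  return NULL;
}
```

Since the row list is written once and expanded twice, the prototypes and the table cannot drift out of step with each other.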
9.3.3 `sic_builtin.c'
In addition to the syntax handlers I have just added to the Sic shell,
the language of this shell is also defined by the builtin commands it
provides. The infrastructure for this file is built from a table of
functions which is fed into various C preprocessor macros, just as I did
for the syntax handlers.
One builtin handler function has special status: builtin_unknown .
This is the builtin that is called if the Sic library cannot find a
suitable builtin function to handle the current input command. At first
this doesn't sound especially important -- but it is the key to any
shell implementation. When there is no builtin handler for the command,
the shell will search the user's command path, `$PATH', to find a
suitable executable. And this is the job of builtin_unknown :
|
int
builtin_unknown (Sic *sic, int argc, char *const argv[])
{
  char *path = path_find (argv[0]);
  int status = SIC_ERROR;

  if (!path)
    {
      sic_result_append (sic, "command \"");
      sic_result_append (sic, argv[0]);
      sic_result_append (sic, "\" not found");
    }
  else if (path_execute (sic, path, argv) != SIC_OKAY)
    {
      sic_result_append (sic, "command \"");
      sic_result_append (sic, argv[0]);
      sic_result_append (sic, "\" failed: ");
      sic_result_append (sic, strerror (errno));
    }
  else
    status = SIC_OKAY;

  return status;
}

static char *
path_find (const char *command)
{
  char *path = xstrdup (command);

  if (*command == '/')
    {
      /* An absolute path is used as given. */
      if (access (command, X_OK) < 0)
        goto notfound;
    }
  else
    {
      char *PATH = getenv ("PATH");
      char *pbeg, *pend;
      size_t len;

      /* getenv returns NULL when PATH is unset. */
      if (PATH == NULL)
        goto notfound;

      for (pbeg = PATH; *pbeg != '\0'; pbeg = pend)
        {
          /* Skip separators, then measure the next directory name. */
          pbeg += strspn (pbeg, ":");
          len = strcspn (pbeg, ":");
          pend = pbeg + len;
          path = XREALLOC (char, path, 2 + len + strlen(command));
          *path = '\0';
          strncat (path, pbeg, len);
          /* Check len so that an empty element never reads path[-1]. */
          if (len > 0 && path[len - 1] != '/') strcat (path, "/");
          strcat (path, command);

          if (access (path, X_OK) == 0)
            break;
        }

      if (*pbeg == '\0')
        goto notfound;
    }

  return path;

 notfound:
  XFREE (path);
  return NULL;
}
|
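The path_execute function called from builtin_unknown is not shown here. As a rough idea of what running the located executable involves on a POSIX host, the following sketch forks a child, replaces it with the command, and waits for it to finish. The name sketch_path_execute , its signature (which omits the Sic parser argument) and its return convention are all assumptions; the real function reports SIC_OKAY or an error instead.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical sketch: run PATH with ARGV in a child process and
   return the child's exit status, or -1 if it could not be run to
   completion.  */
int
sketch_path_execute (const char *path, char *const argv[])
{
  pid_t pid;
  int status;

  assert (path != NULL && argv != NULL);

  pid = fork ();
  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      execv (path, argv);
      _exit (127);          /* only reached if execv failed */
    }

  if (waitpid (pid, &status, 0) < 0)
    return -1;

  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

The child uses _exit rather than exit so that, if execv fails, buffered output inherited from the parent is not flushed a second time.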
Running `autoscan' again at this point adds
AC_CHECK_FUNCS(strcspn strspn) to `configure.scan'. This
tells me that these functions are not truly portable. As before I
provide fallback implementations for these functions in case they are
missing from the target host -- and as it turns out, they are easy to
write:
|
/* strcspn.c -- implement strcspn() for architectures without it */

#if HAVE_CONFIG_H
#  include <sic/config.h>
#endif

#include <sys/types.h>

#if STDC_HEADERS
#  include <string.h>
#elif HAVE_STRINGS_H
#  include <strings.h>
#endif

#if !HAVE_STRCHR
#  ifndef strchr
#    define strchr index
#  endif
#endif

size_t
strcspn (const char *string, const char *reject)
{
  size_t count = 0;
  while (strchr (reject, *string) == 0)
    ++count, ++string;

  return count;
}
|
There is no need to add any code to `Makefile.am', because the
configure script will automatically add the names of the
missing function sources to `@LIBOBJS@'.
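The companion fallback for strspn is not reproduced in this section. A minimal sketch of how it could be written follows; it is given the hypothetical name sketch_strspn so that it cannot clash with the C library, and it is built on strchr , which is a choice made for this sketch and may differ from the Sic source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name, so the sketch cannot clash with the C library.
   Returns the length of the leading part of STRING that consists
   only of characters found in ACCEPT.  */
size_t
sketch_strspn (const char *string, const char *accept)
{
  size_t count = 0;

  assert (string != NULL && accept != NULL);

  /* Unlike the strcspn loop, the terminator must be tested explicitly:
     strchr (accept, '\0') finds ACCEPT's own terminating NUL.  */
  while (*string != '\0' && strchr (accept, *string) != NULL)
    ++count, ++string;

  return count;
}
```

This is the behaviour path_find relies on when it skips over a run of `:' separators in `$PATH'.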
The `strcspn.c' implementation above uses the Autoconf generated
`config.h' to get information about the availability of headers and
type definitions. It is interesting that autoscan reports
that strchr and strrchr , which are used in the fallback
implementations of strcspn and strspn respectively, are
themselves not portable! Luckily, the Autoconf manual tells me exactly
how to deal with this: by adding some code to my `common.h'
(paraphrased from the literal code in the manual):
|
#if !STDC_HEADERS
#  if !HAVE_STRCHR
#    define strchr index
#    define strrchr rindex
#  endif
#endif
|
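And another macro in `configure.in', so that `configure' checks for
both functions and defines HAVE_STRCHR where appropriate:
AC_CHECK_FUNCS(strchr strrchr) .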
9.3.4 `sic.c' & `sic.h'
Since the application binary has no installed header files, there is
little point in maintaining a corresponding header file for every
source. Instead, all of the structures shared by these files, and the
non-static functions in these files, are declared in `sic.h':
|
#ifndef SIC_H
#define SIC_H 1

#include <sic/common.h>
#include <sic/sic.h>
#include <sic/builtin.h>

BEGIN_C_DECLS

extern Syntax syntax_table[];
extern Builtin builtin_table[];

extern int evalstream    (Sic *sic, FILE *stream);
extern int evalline      (Sic *sic, char **pline);
extern int source        (Sic *sic, const char *path);
extern int syntax_init   (Sic *sic);
extern int syntax_finish (Sic *sic, BufferIn *in, BufferOut *out);

END_C_DECLS

#endif /* !SIC_H */
|
To hold together everything you have seen so far, the main
function creates a Sic parser and initialises it by adding syntax
handler functions and builtin functions from the two tables defined
earlier, before handing control to evalstream which will
eventually exit when the input stream is exhausted.
|
int
main (int argc, char * const argv[])
{
  int result = EXIT_SUCCESS;
  Sic *sic = sic_new ();

  /* initialise the system */
  if (sic_init (sic) != SIC_OKAY)
      sic_fatal ("sic initialisation failed");
  signal (SIGINT, SIG_IGN);
  setbuf (stdout, NULL);

  /* initial symbols */
  sicstate_set (sic, "PS1", "] ", NULL);
  sicstate_set (sic, "PS2", "- ", NULL);

  /* evaluate the input stream */
  evalstream (sic, stdin);

  exit (result);
}
|
Now, the shell can be built and used:
|
$ bootstrap
...
$ ./configure --with-readline
...
$ make
...
make[2]: Entering directory `/tmp/sic/src'
gcc -DHAVE_CONFIG_H -I. -I.. -I../sic -I.. -I../sic -g -c sic.c
gcc -DHAVE_CONFIG_H -I. -I.. -I../sic -I.. -I../sic -g -c sic_builtin.c
gcc -DHAVE_CONFIG_H -I. -I.. -I../sic -I.. -I../sic -g -c sic_repl.c
gcc -DHAVE_CONFIG_H -I. -I.. -I../sic -I.. -I../sic -g -c sic_syntax.c
gcc -g -O2 -o sic sic.o sic_builtin.o sic_repl.o sic_syntax.o \
../sic/libsic.a ../replace/libreplace.a -lreadline
make[2]: Leaving directory `/tmp/sic/src'
...
$ ./src/sic
] pwd
/tmp/sic
] ls -F
Makefile aclocal.m4 config.cache configure* sic/
Makefile.am bootstrap* config.log configure.in src/
Makefile.in config/ config.status* replace/
] exit
$
|
This chapter has developed a solid foundation of code, which I will
return to in 12. A Large GNU Autotools Project, when Libtool will join
the fray. The chapters leading up to that explain what Libtool is for,
how to use it and integrate it into your own projects, and the
advantages it offers over building shared libraries with Automake (or
even just Make) alone.
|