
Autoconf, Automake, and Libtool


17. Dynamic Loading

An increasingly popular way of adding functionality to a project is to give a program the ability to dynamically load plugins, or modules. By doing this your users can extend your project in new ways, which even you perhaps hadn't envisioned. Dynamic Loading, then, is the process of loading compiled objects into a running program and executing some or all of the code from the loaded objects in the same context as the main executable.

This chapter begins with a discussion of the mechanics of dynamic modules and how they are used, and ends with example code for very simple module loading on GNU/Linux, along with the example code for a complementary dynamically loadable module. Once you have read this chapter and understand the principles of dynamic loading, the next chapter will explain how to use GNU Autotools to write portable dynamic module loading code and address some of the shortcomings of native dynamic loading APIs.

17.1 Dynamic Modules

In order to dynamically load some code into your executable, that code must be compiled in some special but architecture-dependent fashion. Depending on the compiler you use and the platform you are compiling for, there are different conventions you must observe, both in the code for the module and in the combination of compiler options you select, if the resulting objects are to be suitable for use in a dynamic module. For the rest of this chapter I will concentrate on the conventions used when compiling dynamic modules with GCC on GNU/Linux, which, although peculiar to this particular combination of compiler and host architecture, are typical of the sorts of conventions you would need to observe on other architectures or with a different compiler.

With GCC on GNU/Linux, you must compile each of the source files with `-fPIC'(38), and then link the resulting objects into a loadable module with gcc's `-shared' option:

 
$ gcc -fPIC -c foo.c
 $ gcc -fPIC -c bar.c
 $ gcc -shared -o baz.so foo.o bar.o
 

This is pretty similar to how you might go about linking a shared library, except that the `baz.so' module will never be linked with a `-lbaz' option, so the `lib' prefix isn't necessary. In fact, it would probably be confusing if you used the prefix. Similarly, there is no constraint to use any particular filename suffix, but it is sensible to use the target's native shared library suffix (GNU/Linux uses `.so') to make it obvious that the compiled file is some sort of shared object, and not a normal executable.

Apart from that, the only difference between a shared library built for linking at compile-time and a dynamic module built for loading at run-time is that the module must provide known entry points for the main executable to call. That is, when writing code destined for a dynamic module, you must provide functions or variables with known names and semantics that the main executable can use to access the functionality of the module. This is different to the function and variable names in a regular library, which are already known when you write the client code, since the libraries are always written before the code that uses them; a runtime module loading system must, by definition, be able to cope with modules that are written after the code that uses those modules.

17.2 Module Access Functions

In order to access the functionality of dynamic modules, different architectures provide various APIs to bring the code from the module into the address space of the loading program, and to access the symbols exported by that module.

GNU/Linux uses the dynamic module API introduced by Sun's Solaris operating system, and widely adopted (and adapted!) by the majority of modern Unices (39). The interface consists of four functions. In practice, you really ought not to use these functions directly, since you would be locking your project into this single API, and the class of machines that supports it. The descriptions that follow are deliberately simplified so that they can serve as a comparison with the fully portable libltdl API described in 18. Using GNU libltdl. The minutiae are not discussed, because therein lie the implementation peculiarities that spoil the portability of this API. As they stand, these descriptions give a good overview of how the functions work at a high level, and are broadly applicable to the various implementations in use. If you are curious, the details of your machine's particular dynamic loading API will be available in its system manual pages.

Function: void * dlopen (const char *filename, int flag)
This function brings the code from a named module into the address space of the running program that calls it, and returns a handle which is used by the other API functions, or `NULL' if the module cannot be loaded. If filename contains no `/' character, GNU/Linux will search for it in directories named in the `LD_LIBRARY_PATH' environment variable, and then in the standard library directories before giving up.

The flag argument is made by `OR'ing together various flag bits defined in the system headers. On GNU/Linux, these flags are defined in `dlfcn.h':

`RTLD_LAZY'
Resolve undefined symbols when they are first used.

`RTLD_NOW'
Resolve all undefined symbols when the module is loaded. If any symbol cannot be resolved, dlopen will fail and return `NULL'.

`RTLD_GLOBAL'
All of the global symbols in the loaded module will be available to resolve undefined symbols in subsequently loaded modules.

Function: void * dlsym (void *handle, const char *name)
Returns the address of the named symbol in the module which returned handle when it was dlopened. You must cast the returned address to a known type before using it.

Function: int dlclose (void *handle)
When you are finished with a particular module, it can be removed from memory using this function.

Function: const char * dlerror (void)
If any of the other three API calls fails, this function returns a string which describes the last error that occurred.

In order to use these functions on GNU/Linux, you must #include <dlfcn.h> for the function prototypes, and link with `-ldl' to provide the API implementation. Other Unices use `-ldld' or provide the implementation of the API inside the standard C library.
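To see how the four calls fit together, here is a minimal sketch rather than a definitive implementation. The helper name `call_entry' and the `entrypoint' type are hypothetical, chosen only for this illustration. The sketch calls dlerror once before dlsym to discard any stale message, so that a non-NULL result afterwards can only refer to the lookup itself.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical type for a module function taking one string. */
typedef int entrypoint (const char *argument);

/* Hypothetical helper: load the module at PATH, look up the function
   SYMBOL, call it with ARGUMENT, and unload the module again.
   Returns the function's result, or -1 if an API call fails (so a
   function that itself returns -1 cannot be told apart here). */
int
call_entry (const char *path, const char *symbol, const char *argument)
{
  void *handle;
  entrypoint *fn;
  const char *error;
  int result;

  /* Resolve every symbol at load time, and make the module's global
     symbols available to modules loaded later. */
  handle = dlopen (path, RTLD_NOW | RTLD_GLOBAL);
  if (!handle)
    {
      fprintf (stderr, "dlopen: %s\n", dlerror ());
      return -1;
    }

  dlerror ();                   /* discard any stale error message */
  fn = (entrypoint *) dlsym (handle, symbol);
  error = dlerror ();
  if (error || !fn)
    {
      fprintf (stderr, "dlsym: %s\n", error ? error : "symbol is NULL");
      dlclose (handle);
      return -1;
    }

  result = fn (argument);
  dlclose (handle);
  return result;
}
```

A caller might write `call_entry ("/usr/lib/foo/module.so", "run", "World")', where both the path and the symbol name are placeholders.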

17.3 Finding a Module

When you are writing a program that will load dynamic modules, a major stumbling block is writing the code to find the modules you wish to load. If you are worried about portability (which you must be, or you wouldn't be reading this book!), you can't rely on the default search algorithm of the vendor dlopen function, since it varies from implementation to implementation. You can't even rely on the name of the module, since the module suffix will vary according to the conventions of the target host (though you could insist on a particular suffix for modules you are willing to load).

Unfortunately, this means that you will need to implement your own searching algorithm and always use an absolute pathname when you call dlopen. A widely adopted mechanism is to look for each module in directories listed in an environment variable specific to your application, allowing your users to inform the application of the location of any modules they have written. If a suitable module is not yet found, the application would then default to looking in a list of standard locations -- say, in a subdirectory of the user's home directory, and finally a subdirectory of the application installation tree. For application `foo', you might use `/usr/lib/foo/module.so' -- that is, `$(pkglibdir)/module.so' if you are using Automake.
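The directory search just described might be sketched as follows. This is only an illustration under stated assumptions: the function names `search_dirs' and `find_module' and the environment variable `FOO_MODULE_PATH' are invented for the hypothetical application `foo', and the home directory step is left out to keep the example short.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Look for NAME in each directory of the colon-separated list DIRS.
   On success, write the full path to OUT and return 0.  Return -1 if
   no directory holds a readable entry of that name; the contents of
   OUT are then unspecified. */
int
search_dirs (const char *dirs, const char *name, char *out, size_t outsize)
{
  while (dirs && *dirs)
    {
      size_t len = strcspn (dirs, ":");

      /* Skip empty entries and paths that would not fit in OUT. */
      if (len > 0)
        {
          int n = snprintf (out, outsize, "%.*s/%s", (int) len, dirs, name);
          if (n >= 0 && (size_t) n < outsize && access (out, R_OK) == 0)
            return 0;
        }

      dirs += len;
      if (*dirs == ':')
        ++dirs;
    }
  return -1;
}

/* Search $FOO_MODULE_PATH first, then fall back to a standard
   location.  Both names are assumptions made for this example. */
int
find_module (const char *name, char *out, size_t outsize)
{
  if (search_dirs (getenv ("FOO_MODULE_PATH"), name, out, outsize) == 0)
    return 0;
  return search_dirs ("/usr/lib/foo", name, out, outsize);
}
```

The path written to `out' is absolute whenever the directories in the list are absolute, so it can be passed straight to dlopen without involving the vendor's own search algorithm.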

This algorithm can be further improved:

  • Try different module suffixes for the named module in every directory in the search path, which avoids locking your code into the subset of machines that use an otherwise hardcoded module suffix. With this in place you could ask the module loader for module `foomodule', and if it was not found in the first search directory, the module loader could try `foomodule.so', `foomodule.sl' and `foomodule.dll' before moving on to the next directory.

  • You might also provide command line options to your application which will preload modules before starting the program proper or to modify the module search path. For example, GNU M4, version 1.5, will have the following dynamic loading options:

 
$ m4 --help
 Usage: m4 [OPTION]... [FILE]...
 ...
 Dynamic loading features:
   -M, --module-directory=DIRECTORY  add DIRECTORY to the search path
   -m, --load-module=MODULE          load dynamic MODULE from M4MODPATH
 ...
 Report bugs to <bug-m4@gnu.org>.
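
The suffix trial from the first improvement might look like the following sketch. The function name `find_with_suffix' is hypothetical, and the suffix list is simply the one used in the `foomodule' example; a real loader would probably want it to be configurable.

```c
#include <stdio.h>
#include <unistd.h>

/* Suffixes to try, in order.  The empty string tries the name as given. */
static const char *const suffixes[] = { "", ".so", ".sl", ".dll" };

/* Try DIR/NAME with each suffix in turn.  On success, write the first
   readable candidate to OUT and return 0; otherwise return -1, so
   that the caller can move on to the next directory. */
int
find_with_suffix (const char *dir, const char *name,
                  char *out, size_t outsize)
{
  size_t i;

  for (i = 0; i < sizeof suffixes / sizeof suffixes[0]; ++i)
    {
      int n = snprintf (out, outsize, "%s/%s%s", dir, name, suffixes[i]);

      /* Ignore candidates that would not fit in OUT. */
      if (n >= 0 && (size_t) n < outsize && access (out, R_OK) == 0)
        return 0;
    }
  return -1;
}
```

Calling this once per directory in the search path gives the behaviour described in the first improvement: every suffix is exhausted in one directory before the next directory is considered.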
 

17.4 A Simple GNU/Linux Module Loader

Be aware that when your users write dynamic modules for your application, they are subject to the interface you design. It is very important to design a dynamic module interface that is clean and functional before other people start to write modules for your code. If you ever need to change the interface, your users will need to rewrite their modules. Of course you can carefully change the interface to retain backwards compatibility to save your users the trouble of rewriting their modules, but that is no substitute for designing a good interface from the outset. If you do get it wrong, and subsequently discover that the design you implemented is misconceived (this is the voice of experience speaking!), you will be left with a difficult choice: do you try to tweak the broken API so that it does work while retaining backwards compatibility, and accept the maintenance and performance penalty that brings? Or do you start again with a fresh design born of the experience gained last time, and rewrite all of the modules you have so far?

If other applications have module requirements similar to yours, it is worth writing a loader that uses the same interface and semantics. That way, you will (hopefully) be building from a known good API design, and you will have access to all the modules for that other application too, and vice versa.

For the sake of clarity, I have sidestepped any issues of API design for the following example, by choosing this minimal interface:

Function: int run (const char *argument)
When the module is successfully loaded, the loader looks up a function with this prototype and calls it with the argument given on the command line. If this entry point is found and called, but returns `-1', an error message is displayed by the calling program.

Here's a simplistic but complete dynamic module loading application you can build for this interface with the GNU/Linux dynamic loading API:

 
#include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
 #ifndef EXIT_FAILURE
 #  define EXIT_FAILURE        1
 #  define EXIT_SUCCESS        0
 #endif
 
 #include <limits.h>
 #ifndef PATH_MAX
 #  define PATH_MAX 255
 #endif
 
 #include <dlfcn.h>
 /* This is missing from very old Linux libc. */
 #ifndef RTLD_NOW
 #  define RTLD_NOW 2
 #endif
 
 typedef int entrypoint (const char *argument);
 
 /* Save and return a copy of the dlerror() error  message,
    since the next API call may overwrite the original. */
 static char *dlerrordup (char *errormsg);
 
 int
 main (int argc, const char *argv[])
 {
   char modulepath[1+ PATH_MAX];
   char *errormsg = NULL;
   void *module = NULL;
   entrypoint *run = NULL;
   int errors = 0;
 
   if (argc != 3)
     {
       fprintf (stderr, "USAGE: main MODULENAME ARGUMENT\n");
       exit (EXIT_FAILURE);
     }
 
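   /* Build an absolute module path from the current directory,
      leaving room for the "/" separator and a ".so" suffix. */
   if (!getcwd (modulepath, PATH_MAX)
       || strlen (modulepath) + strlen (argv[1]) + 4 > PATH_MAX)
     {
       fprintf (stderr, "%s: module path too long.\n", argv[0]);
       exit (EXIT_FAILURE);
     }
   strcat (modulepath, "/");
   strcat (modulepath, argv[1]);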
   
   /* Load the module. */
   module = dlopen (modulepath, RTLD_NOW);
   if (!module)
     {
       strcat (modulepath, ".so");
       module = dlopen (modulepath, RTLD_NOW);
     }
   if (!module)
     errors = 1;
 
   /* Find the entry point. */
   if (!errors)
     {
       run = dlsym (module, "run");
       /* In principle, run might legitimately be NULL, so
          I don't use run == NULL as an error indicator. */
       errormsg = dlerrordup (errormsg);
 
       if (errormsg != NULL)
         {
           /* The lookup failed: unload the module, and forget the
              handle so that it is neither used nor closed again. */
           errors = dlclose (module);
           module = NULL;
         }
     }
 
   /* Call the entry point function, provided the lookup succeeded. */
   if (!errors && !errormsg)
     {
       int result = (*run) (argv[2]);
       if (result < 0)
         errormsg = strdup ("module entry point execution failed");
       else
         printf ("\t=> %d\n", result);
     }
 
   /* Unload the module, now that we are done with it. */
   if (!errors && module)
     errors = dlclose (module);
 
   if (errors)
     {
       /* Diagnose the encountered error. */
       errormsg = dlerrordup (errormsg);
 
       if (!errormsg)
         {
           fprintf (stderr, "%s: dlerror() failed.\n", argv[0]);
           return EXIT_FAILURE;
         }
     }
   
   if (errormsg)
     {
       fprintf (stderr, "%s: %s.\n", argv[0], errormsg);
       free (errormsg);
       return EXIT_FAILURE;
     }
   
   return EXIT_SUCCESS;
 }
 
 /* Be careful to save a copy of the error message,
    since the next API call may overwrite the original. */
 static char *
 dlerrordup (char *errormsg)
 {
   char *error = (char *) dlerror ();
   if (error && !errormsg)
     errormsg = strdup (error);
   return errormsg;
 }
 
 

You would compile this on a GNU/Linux machine like so:

 
$ gcc -o simple-loader simple-loader.c -ldl
 

However, despite making reasonable effort with this loader, and ignoring features which could easily be added, it still has some seemingly insoluble problems:

  1. It will fail if the user's platform doesn't have the dlopen API. This also includes platforms which have no shared libraries.

  2. It relies on the implementation to provide a working self-opening mechanism. `dlopen (NULL, RTLD_NOW)' is very often unimplemented, or buggy, and without that, it is impossible to access the symbols of the main program through the `dlsym' mechanism.

  3. It is quite difficult to figure out at compile time whether the target host needs `libdl.so' to be linked.
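The self-opening problem in item 2 can at least be probed at run time. The following is a rough sketch, with the hypothetical function name `can_self_open', that reports whether `dlopen (NULL, RTLD_NOW)' yields a usable handle and whether a given symbol can be found through it. Note that with GCC on GNU/Linux the executable's own symbols are normally only visible this way if it was linked with `-rdynamic'.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Return 1 if the running program can be opened as though it were a
   module and SYMBOL can be found through the resulting handle;
   return 0 if either step fails. */
int
can_self_open (const char *symbol)
{
  void *self = dlopen (NULL, RTLD_NOW);
  int found;

  if (!self)
    return 0;                   /* self-opening is unavailable */

  found = dlsym (self, symbol) != NULL;
  dlclose (self);
  return found;
}
```

A probe like this only tells you about the machine it runs on, which is part of why the problem is awkward to solve portably.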

17.5 A Simple GNU/Linux Dynamic Module

As an appetiser for working with dynamic loadable modules, here is a minimal module written for the interface used by the loader in the previous section:

 
#include <stdio.h>
 
 int
 run (const char *argument)
 {
   printf ("Hello, %s!\n", argument);
   return 0;
 }
 
 

Again, to compile on a GNU/Linux machine:

 
$ gcc -fPIC -c simple-module.c
 $ gcc -shared -o simple-module.so simple-module.o
 

Having compiled both loader and module, a test run looks like this:

 
$ ./simple-loader simple-module World
 Hello, World!
         => 0
 

If you have a GNU/Linux system, you should experiment with the simple examples from this chapter to get a feel for the relationship between a dynamic module loader and its modules -- tweak the interface a little; try writing another simple module. If you have a machine with a different dynamic loading API, try porting these examples to that machine to get a feel for the kinds of problems you would encounter if you wanted a module system that would work with both APIs.

The next chapter will do just that, and develop these examples into a fully portable module loading system with the aid of GNU Autotools. In 20.1 A Module Loading Subsystem, I will add a more realistic module loader into the Sic project last discussed in 12. A Large GNU Autotools Project.
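As a starting point for the experiments suggested above, here is one possible second module for the same `run' interface. It is only a sketch: what the module does (reporting the length of its argument) is an arbitrary choice made for this illustration, and the file name `length-module.c' used afterwards is likewise hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* A second module for the loader's interface: report the length of
   ARGUMENT and return it.  A missing or empty argument is treated as
   an error and signalled with -1, which the loader reports. */
int
run (const char *argument)
{
  size_t length;

  if (!argument || !*argument)
    return -1;

  length = strlen (argument);
  printf ("`%s' has %lu characters\n", argument, (unsigned long) length);
  return (int) length;
}
```

Saved as `length-module.c', it can be compiled with the same two gcc commands used for `simple-module.c' and loaded with `./simple-loader length-module World'. Because `run' returns the length rather than zero, the loader's `=>' line then shows a non-zero result.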
