2.2 Installing mod_perl on Unix Platforms
Now let's go over the installation again, this time
with each step explained in detail and with some troubleshooting
advice. If the build worked and you are in a hurry to boot your new
httpd, you may skip to Section 2.3.
Before installing Apache and mod_perl, you usually have to become
root so that the files can be installed in a
protected area. Users without root
access can still install all files under their home directories,
however. The usual approach is to build Apache in an unprivileged
location and to use root access only for the installation step itself.
We will talk about the nuances of this approach in Chapter 3.
2.2.1 Obtaining and Unpacking the Source Code
The first step is to obtain the source code
distributions of Apache and mod_perl. These distributions can
be retrieved from http://www.apache.org/dist/httpd/ and
http://perl.apache.org/dist/ and
are also available from mirror sites. Even if you have the Apache
server running on your machine, you'll need its
source distribution to rebuild it from scratch with mod_perl.
The source distributions of Apache and mod_perl should be
downloaded into a directory of your
choice. For the sake of consistency, we assume throughout the book
that all builds are being done in the
/home/stas/src directory. Just remember to
substitute /home/stas/src in the examples with
the actual path being used.
The next step is to move to the directory containing the source
archives:
panic% cd /home/stas/src
Uncompress and untar both sources. GNU
tar allows this using a single command
per file:
panic% tar -zvxf apache_1.3.xx.tar.gz
panic% tar -zvxf mod_perl-1.xx.tar.gz
For non-GNU tars, you may need to do this with
two steps (which you can combine via a pipe):
panic% gzip -dc apache_1.3.xx.tar.gz | tar -xvf -
panic% gzip -dc mod_perl-1.xx.tar.gz | tar -xvf -
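If you are not sure which tar a machine has, the two-step form can be wrapped in a small helper. This is only a sketch: the function name unpack is our own invention, and it omits the -v flag so that it runs quietly.

```shell
# unpack ARCHIVE: extract a gzip-compressed tarball into the current
# directory. Uses the gzip | tar pipeline, which works with GNU and
# non-GNU tar alike. (Hypothetical helper, not part of either distribution.)
unpack() {
    gzip -dc "$1" | tar -xf -
}
```

With this defined, unpack apache_1.3.xx.tar.gz followed by unpack mod_perl-1.xx.tar.gz would do the same work as the commands above.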
Linux distributions supply tar and
gzip and install them
by default. If your machine doesn't have these
utilities already installed, you can get tar and
gzip from
http://www.gnu.org/, among other
sources. The GNU versions are available for every platform that
Apache supports.
2.2.2 Building mod_perl
Move into the
/home/stas/src/mod_perl-1.xx/ source
distribution directory:
panic% cd mod_perl-1.xx
The next step is to create the Makefile. This is
no different in principle from the creation of the
Makefile for any other Perl module.
panic% perl Makefile.PL APACHE_SRC=../apache_1.3.xx/src \
DO_HTTPD=1 USE_APACI=1 EVERYTHING=1
mod_perl accepts a variety of parameters. The options specified above
will enable almost every feature that mod_perl offers. There are many
other options for fine-tuning mod_perl to suit particular
circumstances; these are explained in detail in Chapter 3.
Running Makefile.PL will cause Perl to check for
prerequisites and identify any required software packages that are
missing. If it reports missing Perl packages, they will have to be
installed before proceeding. Perl modules are available from CPAN
(http://cpan.org/) and can easily be downloaded and installed.
An advantage of installing mod_perl with the help of the
CPAN.pm module is that all the missing modules
will be installed with the Bundle::Apache bundle:
panic% perl -MCPAN -e 'install("Bundle::Apache")'
We will talk in depth about using CPAN.pm in Chapter 3.
Running Makefile.PL also transparently executes
the ./configure script from
Apache's source distribution directory, which
prepares the Apache build configuration files. If parameters must be
passed to Apache's ./configure
script, they can be passed as options to
Makefile.PL. Chapter 3 covers
all this in detail.
The httpd executable can now be built by using the make utility (note
that the current working directory is still
/home/stas/src/mod_perl-1.xx/):
panic% make
This command prepares the mod_perl extension files, installs them in
the Apache source tree, and builds the httpd
executable (the web server itself) by compiling all the required
files. Upon completion of the make process, the
working directory is restored to
/home/stas/src/mod_perl-1.xx/.
Running make test will execute various mod_perl tests on
the newly built httpd executable:
panic% make test
This command starts the server on a nonstandard port (8529) and tests
whether all parts of the built server function correctly. The process
will report anything that does not work properly.
2.2.3 Installing mod_perl
Running make install completes the installation process by installing
all the Perl files required for mod_perl to run. It also installs the
mod_perl documentation (manpages). Typically, you need to be
root to have permission to do this, but another
user account can be used if the appropriate options are set on the
perl Makefile.PL command line (see Chapter 3). To become root, use the
su command.
panic% su
panic# make install
If you have the proper permissions, you can also chain all three
make commands into a single command line:
panic# make && make test && make install
The single-line version simplifies the installation, since there is
no need to wait for each command to complete before starting the next
one. Of course, if you need to become root in
order to run make install,
you'll either need to run make
install as a separate command or become
root before running the single-line version.
If you choose the all-in-one approach and any of the
make commands fail, execution will stop at that
point. For example, if make alone fails, then
make test and make install
will not be attempted. Similarly, if make test
fails, then make install will not be attempted.
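The short-circuit behavior of && can be tried safely without touching a real build. In the sketch below, fake_step and chain are hypothetical stand-ins for the three make commands; each step simply prints its name and returns whatever exit code you give it.

```shell
# Hypothetical stand-in for one build step: prints its name, then
# succeeds or fails according to the exit code passed as $2.
fake_step() {
    echo "running: $1"
    return "$2"
}

# chain RC1 RC2 RC3: mimics "make && make test && make install",
# where each RCn is the exit code the corresponding step should return.
chain() {
    fake_step "make" "$1" &&
    fake_step "make test" "$2" &&
    fake_step "make install" "$3"
}
```

Calling chain 0 1 0 simulates a failing make test: the first two steps are reported, the install step is never reached, and the chain as a whole returns a nonzero status.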
Finally, change to the Apache source distribution directory and run
make install to create the Apache directory tree
and install Apache's header files
(*.h), default configuration files
(*.conf), the httpd
executable, and a few other programs:
panic# cd ../apache_1.3.xx
panic# make install
Note that, as with a plain Apache installation, any configuration
files left from a previous installation will not be overwritten by
this process. Although backing up is never unwise,
it's not actually necessary to back up the
previously working configuration files before the installation.
At the end of the make install process, the
installation program will list the path to the
apachectl utility, which you can use to start and
stop the server, and the path to the installed configuration files.
It is important to write down these pathnames, as they will be needed
frequently when maintaining and configuring Apache. On our machines,
these two important paths are:
/usr/local/apache/bin/apachectl
/usr/local/apache/conf/httpd.conf
The mod_perl Apache server is now built and installed. All that needs
to be done before it can be run is to edit the configuration file
httpd.conf and write a test script.
2.3 Configuring and Starting the mod_perl Server
Once you have mod_perl installed, you need to
configure the server and test it.
The first thing to do is ensure that Apache was built correctly and
that it can serve plain HTML files. This helps to minimize the number
of possible problem areas: once you have confirmed that Apache can
serve plain HTML files, you know that any problems with mod_perl are
related to mod_perl itself.
Apache should be configured just as you would configure it without
mod_perl. Use the defaults as suggested, customizing only when
necessary. Values that will probably need to be customized are
ServerName, Port,
User, Group,
ServerAdmin, DocumentRoot, and
a few others. There are helpful hints preceding each directive in the
configuration files themselves, with further information in
Apache's documentation. Follow the advice in the
files and documentation if in doubt.
When the configuration file has been edited, start the server. One of
the ways to start and stop the server is to use the
apachectl utility. To start the
server with apachectl, type:
panic# /usr/local/apache/bin/apachectl start
To stop the server, type:
panic# /usr/local/apache/bin/apachectl stop
Note that if the server will listen on port 80 or
another privileged port, the user executing
apachectl must be root.
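On Unix systems the privileged ports are conventionally those below 1024. A start script could check this before deciding whether root is required; the helper name needs_root below is our own, offered only as a sketch.

```shell
# needs_root PORT: succeeds if PORT is in the privileged range (below
# 1024), in which case the server must be started by root.
# (Hypothetical helper for illustration.)
needs_root() {
    [ "$1" -lt 1024 ]
}
```

For example, needs_root 80 succeeds, whereas needs_root 8080 fails, so a server configured with Port 8080 can be started from an ordinary account.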
After the server has started, check in the
error_log file
(/usr/local/apache/logs/error_log, by default)
to see if the server has indeed started. Do not rely on the
apachectl status reports. The
error_log should contain something like the
following:
[Thu Jun 22 17:14:07 2000] [notice] Apache/1.3.12 (Unix)
mod_perl/1.24 configured -- resuming normal operations
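This check is easy to script. The sketch below looks only at the last few lines of the log, so that a message left over from an earlier run is less likely to be mistaken for a fresh start. The function name server_started and the choice of five lines are assumptions of ours.

```shell
# server_started LOGFILE: succeeds if one of the last five lines of
# LOGFILE contains the "resuming normal operations" notice.
# (Hypothetical helper; pass the path given by your ErrorLog directive.)
server_started() {
    tail -n 5 "$1" 2>/dev/null | grep -q 'resuming normal operations'
}
```

A typical use would be server_started /usr/local/apache/logs/error_log && echo "server is up".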
Now point your browser to
http://localhost/ or
http://example.com/, as configured with the
ServerName directive. If the
Port directive has been set with a value other
than 80, add this port number to the end of the
server name. For example, if the port is 8080, test the server with
http://localhost:8080/ or
http://example.com:8080/. The
"It Worked!" page, which is an
index.html file that is installed automatically
when running make install in the Apache source
tree, should appear in the browser. If this page does not appear,
something went wrong and the contents of the
logs/error_log file should be checked. The path
to the error log file is specified by the ErrorLog
directive in httpd.conf. (It is usually
specified relative to the ServerRoot, so a value
of logs/error_log usually means
/usr/local/apache/logs/error_log if Apache is
installed into /usr/local/apache.)
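The relative-path rule just described can be written down in a few lines. This is a sketch only: resolve_error_log is a name we made up, and it handles plain file paths, not the other forms the ErrorLog directive accepts.

```shell
# resolve_error_log SERVER_ROOT ERROR_LOG: print the full path of the
# error log. An absolute ERROR_LOG is used as is; a relative one is
# taken relative to SERVER_ROOT. (Hypothetical helper; plain file paths
# only.)
resolve_error_log() {
    case $2 in
        /*) printf '%s\n' "$2" ;;
        *)  printf '%s/%s\n' "${1%/}" "$2" ;;
    esac
}
```

For example, resolve_error_log /usr/local/apache logs/error_log prints /usr/local/apache/logs/error_log.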
If everything works as expected, shut down the server, open
httpd.conf with a text editor, and scroll to the
end of the file. The mod_perl configuration directives are
conventionally added to the end of httpd.conf.
It is possible to place mod_perl's configuration
directives anywhere in httpd.conf, but adding
them at the end seems to work best in practice.
Assuming that all the scripts that should be executed by the
mod_perl-enabled server are located in the
/home/stas/modperl directory, add the following
configuration directives:
Alias /perl/ /home/stas/modperl/
PerlModule Apache::Registry
<Location /perl/>
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options +ExecCGI
    PerlSendHeader On
    Allow from all
</Location>
Save the modified file.
This configuration causes every URI starting with
/perl/ to be handled by mod_perl, using the
handler provided by the Perl module
Apache::Registry.
2.5 Preparing the Scripts Directory
Now you have to select a directory where all the mod_perl scripts and
modules will be placed. We usually create a directory called modperl
under our home directory for this purpose (e.g., /home/stas/modperl),
but it is also common to create a directory called perl under your
Apache server root, such as /usr/local/apache/perl.
First create this directory if it doesn't yet exist:
panic% mkdir /home/stas/modperl
Next, set the file permissions. Remember that when scripts are executed
from a shell, they run with the permissions of the user's account. When
the scripts are run by Apache, however, they run under the server's
account, so the server needs to be able to read and execute them.
Usually, you want to have read, write, and execute access for yourself,
but only read and execute permissions for the server. Apache runs under
an account specified by the
User directive, typically
nobody. You can modify the
User directive to run the server under your
username, for example:
User stas
Since the permissions on all files and directories should usually be
rwx------, set the directory permissions to:
panic% chmod 0700 /home/stas/modperl
Now no one but you and the server can access the files in this
directory. You should set the same permissions for all the files you
place under this directory.
If the server is running under the nobody
account, you have to set the permissions to
rwxr-xr-x or 0755 for your
files and directories. This is insecure, because other users on the
same machine can read your files.
panic# chmod 0755 /home/stas/modperl
If you aren't running the server with your username,
you have to set these permissions for all the files created under
this directory so Apache can read and execute them.
In the following examples, we assume that you run the server under
your username, and hence we set the scripts'
permissions to 0700.
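The choice between the two modes depends only on whether the server runs under your own account. A sketch of that decision follows; scripts_mode is a hypothetical name, and the two arguments are the value of the User directive and your own username.

```shell
# scripts_mode SERVER_USER MY_USER: print the mode to use for the
# scripts directory and the files in it. 0700 is enough when the server
# runs as you; otherwise the server account needs read and execute
# access, hence 0755. (Hypothetical helper for illustration.)
scripts_mode() {
    if [ "$1" = "$2" ]; then
        echo 0700
    else
        echo 0755
    fi
}
```

It could be combined with chmod, for instance chmod "$(scripts_mode stas stas)" /home/stas/modperl.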
2.6 A Sample Apache::Registry Script
One of mod_perl's benefits
is that it can run existing CGI scripts written in Perl that were
previously used under mod_cgi (the standard Apache CGI handler).
Indeed, mod_perl can be used for running CGI scripts without taking
advantage of any of mod_perl's special features,
while getting the benefit of the potentially huge performance boost.
Example 2-1 gives an example of a very simple
CGI-style mod_perl script.
Example 2-1. mod_perl_rules1.pl
print "Content-type: text/plain\n\n";
print "mod_perl rules!\n";
Save this script in the
/home/stas/modperl/mod_perl_rules1.pl file.
Notice that the #! line (colloquially known as the
shebang line) is not needed with mod_perl,
although having one causes no problems, as can be seen in Example 2-2.
Example 2-2. mod_perl_rules1.pl with shebang line
#!/usr/bin/perl
print "Content-type: text/plain\n\n";
print "mod_perl rules!\n";
Now make the script executable and readable by the server, as
explained in the previous section:
panic% chmod 0700 /home/stas/modperl/mod_perl_rules1.pl
The mod_perl_rules1.pl script can be tested from
the command line, since it is essentially a regular Perl script:
panic% perl /home/stas/modperl/mod_perl_rules1.pl
This should produce the following output:
Content-type: text/plain

mod_perl rules!
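The two things to look for are the Content-type header on the first line and the empty line that separates it from the body. If you would rather not check by eye, a small filter can do it. The name check_cgi_output is ours, and the test is deliberately minimal: it inspects only the first two lines.

```shell
# check_cgi_output: read a script's output on stdin and succeed only if
# the first line is a Content-type header and the second line is empty.
# (Hypothetical helper; it does not validate any further headers.)
check_cgi_output() {
    IFS= read -r header || return 1
    IFS= read -r blank || return 1
    case $header in
        [Cc]ontent-[Tt]ype:*) ;;
        *) return 1 ;;
    esac
    [ -z "$blank" ]
}
```

It would be used as perl /home/stas/modperl/mod_perl_rules1.pl | check_cgi_output && echo "header looks fine".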
Make sure the server is running and issue the following request using a
browser:
http://localhost/perl/mod_perl_rules1.pl
If the port being used is not 80 (e.g., 8080), the port number should
be included in the URL:
http://localhost:8080/perl/mod_perl_rules1.pl
Also, the localhost approach will work only if the
browser is running on the same machine as the server. If not, use the
real server name for this test. For example:
http://example.com/perl/mod_perl_rules1.pl
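The rules for composing the test URL (server name, optional port, path) can be summarized in a short helper. This is a sketch under our own naming; test_url is not part of Apache or mod_perl.

```shell
# test_url HOST PORT PATH: print the URL to request. The port is left
# out when it is 80, the default for HTTP.
# (Hypothetical helper for illustration.)
test_url() {
    if [ "$2" = 80 ]; then
        printf 'http://%s%s\n' "$1" "$3"
    else
        printf 'http://%s:%s%s\n' "$1" "$2" "$3"
    fi
}
```

For example, test_url localhost 8080 /perl/mod_perl_rules1.pl prints http://localhost:8080/perl/mod_perl_rules1.pl.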
The page rendered should be similar to the one in Figure 2-1.
If you see it, congratulations! You have a working mod_perl server.
If something went wrong, go through the installation process again,
making sure that none of the steps are missed and that each is
completed successfully. You might also look at the
error_log file for error messages. If this does
not solve the problem, Chapter 3 will attempt to
salvage the situation.
Jumping a little bit ahead, Example 2-3
shows the same CGI script written with the
mod_perl API.
Example 2-3. mod_perl_rules2.pl
my $r = Apache->request;
$r->send_http_header('text/plain');
$r->print("mod_perl rules!\n");
The mod_perl API needs a request object, $r, to
communicate with Apache. The script retrieves this object and uses it
to send the HTTP header and print the irrefutable fact about
mod_perl's coolness.
This script generates the same output as the previous one.
As you can see, it's not much harder to write your
code using the mod_perl API. You need to learn the API, but the
concepts are the same. As we will show in the following chapters,
usually you will want to use the mod_perl API for better performance
or when you need functionality that CGI
doesn't provide.
2.6.1 Porting Existing CGI Scripts to mod_perl
Now it's time to move any existing CGI scripts from the
/somewhere/cgi-bin directory to
/home/stas/modperl. Once moved, they should run
much faster when requested from the newly configured base URL
(/perl/). For example, a CGI script called
test.pl that was previously accessed as
/cgi-bin/test.pl can now be accessed as
/perl/test.pl under mod_perl and the
Apache::Registry module.
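If links to the old location are scattered through many pages, the URI change is mechanical enough to script. The sketch below assumes the old scripts lived under /cgi-bin/ and the new base is /perl/, as in this chapter; the function name to_perl_uri is our own.

```shell
# to_perl_uri URI: rewrite a /cgi-bin/ URI to its /perl/ equivalent and
# leave any other URI unchanged. (Hypothetical helper for illustration.)
to_perl_uri() {
    case $1 in
        /cgi-bin/*) printf '/perl/%s\n' "${1#/cgi-bin/}" ;;
        *)          printf '%s\n' "$1" ;;
    esac
}
```

For example, to_perl_uri /cgi-bin/test.pl prints /perl/test.pl.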
Some of the scripts might not work immediately and may require some
minor tweaking or even a partial rewrite to work properly with
mod_perl. We will talk in depth about these issues in Chapter 6. Most
scripts that have been written with care and developed with warnings
enabled and the strict pragma will probably work without any
modifications at all.
A quick solution that avoids most rewriting or editing of existing
scripts that do not run properly under Apache::Registry is to run them
under Apache::PerlRun. This can be achieved by simply replacing
Apache::Registry with Apache::PerlRun in httpd.conf. Put the following
configuration directives in httpd.conf in place of the earlier ones and
restart the server:
Alias /perl/ /home/stas/modperl/
PerlModule Apache::PerlRun
<Location /perl/>
    SetHandler perl-script
    PerlHandler Apache::PerlRun
    Options ExecCGI
    PerlSendHeader On
    Allow from all
</Location>
Almost every script should now run without problems; the few
exceptions will almost certainly be due to the few minor limitations
that mod_perl or its handlers have, but these are all solvable and
covered in Chapter 6.
As we saw in Chapter 1,
Apache::PerlRun is usually useful while
transitioning scripts to run properly under
Apache::Registry. However, we
don't recommend using
Apache::PerlRun in the long term; although it is
significantly faster than mod_cgi, it's still not as
fast as Apache::Registry and mod_perl handlers.
2.7 A Simple mod_perl Content Handler
As we mentioned
in the beginning of this chapter, mod_perl lets you run both scripts
and handlers. The previous example showed a script, which is probably
the most familiar approach to web programming, but the more advanced
use of mod_perl involves writing handlers. Have no fear; writing
handlers is almost as easy as writing scripts and offers a level of
access to Apache's internals that is simply not
possible with conventional CGI scripts.
To create a mod_perl handler module, all that is necessary is to wrap
the code that would have been the body of a script into a
handler subroutine, add a statement to return the
status to the server when the subroutine has successfully completed,
and add a package declaration at the top of the code.
Just as with scripts, the familiar CGI API may be used. Example 2-4 shows an example.
Example 2-4. ModPerl/Rules1.pm
package ModPerl::Rules1;
use Apache::Constants qw(:common);
sub handler {
    print "Content-type: text/plain\n\n";
    print "mod_perl rules!\n";
    return OK; # We must return a status to mod_perl
}
1; # This is a perl module so we must return true to perl
Alternatively, the mod_perl API can be used. This API provides almost
complete access to the Apache core. In the simple example used here,
either approach is fine, but when lower-level access to Apache is
required, the mod_perl API shown in Example 2-5 must
be used.
Example 2-5. ModPerl/Rules2.pm
package ModPerl::Rules2;
use Apache::Constants qw(:common);
sub handler {
    my $r = shift;
    $r->send_http_header('text/plain');
    $r->print("mod_perl rules!\n");
    return OK; # We must return a status to mod_perl
}
1; # This is a perl module so we must return true to perl
Create a directory called ModPerl under one of
the directories in @INC (e.g., under
/usr/lib/perl5/site_perl/5.6.1), and put
Rules1.pm and Rules2.pm
into it. (Note that you will need root access in
order to do this.) The files should include the code from the above
examples. To find out what the @INC directories
are, execute:
panic% perl -le 'print join "\n", @INC'
On our machine it reports:
/usr/lib/perl5/5.6.1/i386-linux
/usr/lib/perl5/5.6.1
/usr/lib/perl5/site_perl/5.6.1/i386-linux
/usr/lib/perl5/site_perl/5.6.1
/usr/lib/perl5/site_perl
.
Therefore, on our machine, we might place the files in the directory
/usr/lib/perl5/site_perl/5.6.1/ModPerl. By
default, when you work as root, the files are
created with permissions allowing everybody to read them, so here we
don't have to adjust the file permissions (the
server only needs to be able to read those).
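Perl finds a module by turning each :: in the package name into a directory separator and appending .pm, then looking for that relative path under each @INC directory. The sketch below performs the same translation so you can see where a file belongs; module_file is a hypothetical helper of ours.

```shell
# module_file INC_DIR PACKAGE: print the path under INC_DIR at which
# Perl would look for PACKAGE (each "::" becomes "/", plus ".pm").
# (Hypothetical helper for illustration.)
module_file() {
    printf '%s/%s.pm\n' "${1%/}" "$(printf '%s' "$2" | sed 's|::|/|g')"
}
```

For example, module_file /usr/lib/perl5/site_perl/5.6.1 ModPerl::Rules1 prints /usr/lib/perl5/site_perl/5.6.1/ModPerl/Rules1.pm.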
Now add the following snippet to
/usr/local/apache/conf/httpd.conf, to configure
mod_perl to execute the ModPerl::Rules1::handler
subroutine whenever a request to mod_perl_rules1
is made:
PerlModule ModPerl::Rules1
<Location /mod_perl_rules1>
    SetHandler perl-script
    PerlHandler ModPerl::Rules1
    PerlSendHeader On
</Location>
Now issue a request to:
http://localhost/mod_perl_rules1
and, just as with the mod_perl_rules.pl scripts,
the following should be rendered as a response:
mod_perl rules!
Don't forget to include the port number if not using
port 80 (e.g.,
http://localhost:8080/mod_perl_rules1); from now
on, we will assume you know this.
To test the second module, ModPerl::Rules2, add a similar
configuration, while replacing all 1s with 2s:
PerlModule ModPerl::Rules2
<Location /mod_perl_rules2>
    SetHandler perl-script
    PerlHandler ModPerl::Rules2
</Location>
In Chapter 4 we will explain why the
PerlSendHeader directive is not needed for this
particular module.
To test, use the URI:
http://localhost/mod_perl_rules2
You should see the same response from the server that we saw when
issuing a request for the former mod_perl handler.
Chapter 3. Installing mod_perl
In Chapter 2, we presented a basic mod_perl
installation. In this chapter, we will talk about various ways in
which mod_perl can be installed (using a variety of installation
parameters), as well as prepackaged binary installations, and more.
Chapter 2 showed you the following
commands to
build and install a basic mod_perl-enabled Apache server on almost
any standard flavor of Unix.
First, download
http://www.apache.org/dist/httpd/apache_1.3.xx.tar.gz
and http://perl.apache.org/dist/mod_perl-1.xx.tar.gz.
Then, issue the following commands:
panic% cd /home/stas/src
panic% tar xzvf apache_1.3.xx.tar.gz
panic% tar xzvf mod_perl-1.xx.tar.gz
panic% cd mod_perl-1.xx
panic% perl Makefile.PL APACHE_SRC=../apache_1.3.xx/src \
DO_HTTPD=1 USE_APACI=1 EVERYTHING=1
panic% make && make test
panic# make install
panic# cd ../apache_1.3.xx
panic# make install
As usual, replace 1.xx and
1.3.xx with the real version numbers of mod_perl
and Apache, respectively.
You can then add a few configuration lines to
httpd.conf (the Apache configuration file),
start the server, and enjoy mod_perl. This should work just fine.
Why, then, are you now reading a 50-page chapter on installing
mod_perl?
You're reading this chapter for the same reason you
bought this book. Sure, the instructions above will get you a working
version of mod_perl. But the average reader of this book
won't want to stop there. If you're
using mod_perl, it's because you want to improve the
performance of your web server. And when you're
concerned with performance, you're always looking
for ways to eke a little bit more out of your server. In essence,
that's what this book is about: getting the most out
of your mod_perl-enabled Apache server. And it all starts at the
beginning, with the installation of the software.
In the basic mod_perl installation, the
parameter
EVERYTHING=1 enables a lot of options for you,
whether you actually need them or not. You may want to enable only
the required options, to squeeze even more juice out of mod_perl. You
may want to build mod_perl as a loadable object instead of compiling
it into Apache, so that it can be upgraded without rebuilding Apache
itself. You may also want to install other Apache components, such as
PHP or mod_ssl, alongside mod_perl.
To accomplish any of these tasks, you will need to understand various
techniques for mod_perl configuration and building. You need to know
what configuration parameters are available to you and when and how
to use them.
As with Perl, in mod_perl simple things are simple. But when you need
to accomplish more complicated tasks, you may have to invest some
time to gain a deeper understanding of the process. In this chapter,
we will take the following route. We'll start with a
detailed explanation of the four stages of the mod_perl installation
process, then continue on with the different paths each installation
might take according to your goal, followed by a few copy-and-paste
real-world installation scenarios. Toward the end of the chapter we
will show you various approaches that might make the installation
easier, by automating most of the steps. Finally,
we'll cover some of the general issues that new
users might stumble on while installing mod_perl.
3.1 Configuring the Source
Before building and installing mod_perl you
will have to configure it, as you would configure any other Perl
module:
panic% perl Makefile.PL [parameters]
Make sure you have Perl installed! Use the latest stable
version, if possible. To determine
your version of Perl, run the following command on the command line:
panic% perl -v
You will need at least Perl Version 5.004. If you
don't have it, install it. Follow the instructions
in the distribution's INSTALL
file. The only thing to watch for is that during the configuration
stage (while running ./Configure) you make sure
you can dynamically load Perl module extensions. That is, answer
YES to the following question:
Do you wish to use dynamic loading? [y]
In this section, we will explain each of the parameters accepted by
the Makefile.PL file for mod_perl. First, however, let's talk about how
the mod_perl configuration dovetails with Apache's configuration. The
source configuration mechanism in Apache 1.3 provides four major
features (which of course are available to mod_perl):
Apache
modules can use per-module
configuration scripts to link themselves into
the Apache configuration process. This feature lets you automatically
adjust the configuration and build parameters from the Apache module
sources. It is triggered by
ConfigStart/ConfigEnd sections inside
modulename.module files (e.g., see the file
libperl.module in the mod_perl distribution).
The APache AutoConf-style Interface (APACI) is the top-level configure
script from Apache 1.3; it provides a GNU Autoconf-style interface to
the Apache configuration process. APACI is useful for configuring the
source tree without manually editing any src/Configuration files. Any
parameterization can be done via command-line options to the configure
script. Internally, this is just a nifty wrapper over the old
src/Configure script. Since Apache 1.3, APACI is the best way to
install mod_perl as cleanly as possible. However, the complete Apache
1.3 source configuration mechanism is available only under Unix at this
writing; it doesn't work on Win32.
Dynamic shared object (DSO) support is one of the most interesting
features in Apache 1.3. It allows Apache modules to be built as
so-called DSOs (usually named modulename.so), which can be loaded via
the LoadModule directive in Apache's httpd.conf file. The benefit is
that the modules become part of the httpd executable only on demand;
they aren't loaded into the address space of the httpd executable until
the user asks for them to be. The benefits of DSO support are most
evident in relation to memory consumption and added flexibility (in
that you won't have to recompile your httpd each time you want to add,
remove, or upgrade a module). The DSO mechanism is provided by Apache's
mod_so module, which needs to be compiled into the httpd binary with:
panic% ./configure --enable-module=so
The usage of any --enable-shared option automatically implies an
--enable-module=so option, because the bootstrapping module mod_so is
always needed for DSO support. So if, for example, you want the module
mod_dir to be built as a DSO, you can write:
panic% ./configure --enable-shared=dir
and the DSO support will be added automatically.
The APache eXtension Support tool (APXS) is a tool from Apache 1.3 that
can be used to build an Apache module as a DSO even outside the Apache
source tree. APXS is to Apache what MakeMaker and XS are to Perl. It
knows the platform-dependent build parameters for making DSO files and
provides an easy way to run the build commands with them.
As of Apache 1.3, the configuration system
supports two optional features for taking advantage of the modular
DSO approach: compilation of the Apache core program into a DSO
library for shared usage, and compilation of the Apache modules into
DSO files for explicit loading at runtime.
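When several modules are to be built as DSOs, the --enable-shared option described above is simply repeated once per module. The sketch below assembles such an argument list; dso_flags is a name of our own, and the module names in the usage note are placeholders.

```shell
# dso_flags MODULE...: print one --enable-shared=MODULE option per
# argument, separated by spaces. No --enable-module=so is added, since
# any --enable-shared option already implies it.
# (Hypothetical helper for illustration.)
dso_flags() {
    flags=
    for mod in "$@"; do
        flags="$flags${flags:+ }--enable-shared=$mod"
    done
    printf '%s\n' "$flags"
}
```

For example, dso_flags dir cgi prints --enable-shared=dir --enable-shared=cgi, which could then be appended to a ./configure command line.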
Should you build mod_perl as a DSO? Let's study the
pros and cons of this installation method, so you can decide for
yourself.
Pros:
The server package is
more flexible because the actual
server executable can be assembled at runtime via
LoadModule configuration commands in
httpd.conf instead of via
AddModule commands in the
Configuration file at build time. This allows
you to run different server instances (e.g., standard and SSL
servers, or servers with and without mod_perl) with only one Apache
installation; the only thing you need is different configuration
files (or, by judicious use of IfDefine, different
startup scripts).
The server package can easily be extended with third-party modules even
after installation. This is especially helpful for vendor package
maintainers who can create an Apache core package and additional
packages containing extensions such as PHP, mod_perl, mod_fastcgi,
etc.
DSO support allows easier Apache
module prototyping, because with the
DSO/APXS pair you can work outside the Apache
source tree and need only an apxs -i command
followed by an apachectl restart to bring a new
version of your currently developed module into the running Apache
server.
Cons:
The DSO mechanism cannot be used on every platform,
because not all operating systems support shared libraries.
The server starts up
approximately 20% slower because of the
overhead of the symbol-resolving the Unix loader now has to do.
The server runs approximately 5% slower on some platforms, because
position-independent code (PIC) sometimes needs complicated assembler
tricks for relative addressing, which are not necessarily as fast as
those for absolute addressing.
Because DSO modules cannot be linked against other DSO-based
libraries (ld -lfoo) on all platforms (for
instance, a.out-based platforms usually
don't provide this functionality, while ELF-based
platforms do), you cannot use the DSO mechanism for all types of
modules. In other words, modules compiled as DSO files are restricted
to use symbols only from the Apache core, from the C library
(libc) and from any other dynamic or static
libraries used by the Apache core, or from static library archives
(libfoo.a) containing position-independent code.
The only way you can use other code is to either make sure the Apache
core itself already contains a reference to it, load the code
yourself via dlopen( ), or enable the
SHARED_CHAIN rule while building Apache (if your
platform supports linking DSO files against DSO libraries). This,
however, won't be of much significance to you if
you're writing modules only in Perl.
Under some platforms (e.g., many SVR4 systems), there is no way to
force the linker to export all global symbols for use in DSOs when
linking the Apache httpd executable program. But
without the visibility of the Apache core symbols, no standard Apache
module could be used as a DSO. The only workaround here is to use the
SHARED_CORE feature, because in this way the
global symbols are forced to be exported. As a consequence, the
Apache src/Configure script automatically
enforces SHARED_CORE on these platforms when DSO
features are used in the Configuration file or
on the configure command line.
Together, these four features provide a way to integrate mod_perl
into Apache in a very clean and smooth way. No patching of the Apache
source tree is usually required, and for APXS
support, not even the Apache source tree is needed.
To benefit from the above features, a hybrid build environment was
created for the Apache side of mod_perl. See Section 3.5, later in this chapter, for details.
Once the overview of the four building steps is complete, we will
return to each of the above configuration mechanisms when describing
different installation passes.
3.1.1 Controlling the Build Process
The configuration stage of the build is performed by the
command perl Makefile.PL, which accepts various
parameters. This section covers all of the configuration parameters,
grouped by their functionality.
Of course, you should keep in mind that these options are cumulative.
We display only one or two options being used at once, but you should
use the ones you want to enable all at once, in one call
to perl Makefile.PL.
- APACHE_SRC, DO_HTTPD, NO_HTTPD, PREP_HTTPD
-
These four parameters are tightly interconnected, as they control the
way in which the Apache source is handled.
Typically, when you want mod_perl to be compiled statically with
Apache without adding any extra components, you specify the location
of the Apache source tree using the APACHE_SRC
parameter and use the DO_HTTPD=1 parameter to tell
the installation script to build the httpd
executable:
panic% perl Makefile.PL APACHE_SRC=../apache_1.3.xx/src DO_HTTPD=1
If no APACHE_SRC is specified,
Makefile.PL makes an intelligent guess by
looking at the directories at the same level as the mod_perl sources
and suggesting a directory with the highest version of Apache found
there.
By default, the configuration process will ask you to confirm whether
the location of the source tree is correct before continuing. If you
use DO_HTTPD=1 or NO_HTTPD=1,
the first Apache source tree found or the one you specified will be
used for the rest of the build process.
If you don't use DO_HTTPD=1, you
will be prompted by the following question:
Shall I build httpd in ../apache_1.3.xx/src for you?
Note that if you set DO_HTTPD=1 but do not use
APACHE_SRC=../apache_1.3.xx/src, the first Apache
source tree found will be used to configure and build against.
Therefore, you should always use an explicit
APACHE_SRC parameter, to avoid confusion.
If you don't want to build the
httpd in the Apache source tree because you
might need to add extra third-party modules, you should use
NO_HTTPD=1 instead of
DO_HTTPD=1. This option will install all the files
that are needed to build mod_perl in the Apache source tree, but it
will not build httpd itself.
PREP_HTTPD=1 is similar to
NO_HTTPD=1, but if you set this parameter you will
be asked to confirm the location of the Apache source directory even
if you have specified the APACHE_SRC parameter.
If you choose not to build the binary, you will have to do that
manually. Building an httpd binary is covered in
an upcoming section. In any case, you will need to run make
install in the mod_perl source tree so the Perl side of
mod_perl will be installed. Note that mod_perl's
make test won't work until you
have built the server.
- APACHE_HEADER_INSTALL
-
When Apache and mod_perl are installed, you may need
to build other Perl modules that use Apache C functions, such as
HTML::Embperl or Apache::Peek.
These modules usually will fail to build if Apache header files
aren't installed in the Perl tree. By default, the
Apache source header files are installed into the
$Config{sitearchexp}/auto/Apache/include
directory. If you don't want
or need these headers to be installed, you can change this behavior
by using the APACHE_HEADER_INSTALL=0 parameter.
- USE_APACI
-
The USE_APACI parameter tells mod_perl to
configure Apache using the flexible APACI. The alternative is the older
system, which required a file named
src/Configuration to be edited manually. To
enable APACI, use:
panic% perl Makefile.PL USE_APACI=1
- APACI_ARGS
-
When you use the USE_APACI=1 parameter, you can
tell Makefile.PL to pass any arguments you want
to the Apache ./configure utility. For example:
panic% perl Makefile.PL USE_APACI=1 \
APACI_ARGS='--sbindir=/home/httpd/httpd_perl/sbin, \
--sysconfdir=/home/httpd/httpd_perl/etc'
Note that the APACI_ARGS argument must be passed
as a single long line if you work with a C-style shell (such as
csh or tcsh), as those
shells seem to corrupt multi-lined values enclosed inside single
quotes.
Of course, if you want the default Apache directory layout but a
different root directory
(/home/httpd/httpd_perl/, in our case), the
following is the simplest way to do so:
panic% perl Makefile.PL USE_APACI=1 \
APACI_ARGS='--prefix=/home/httpd/httpd_perl'
- ADD_MODULE
-
This parameter enables building of built-in Apache
modules. For example,
to enable the mod_rewrite and mod_proxy modules, you can do the
following:
panic% perl Makefile.PL ADD_MODULE=proxy,rewrite
If you are already using APACI_ARGS, you can add
the usual Apache ./configure directives as
follows:
panic% perl Makefile.PL USE_APACI=1 \
APACI_ARGS='--enable-module=proxy --enable-module=rewrite'
- APACHE_PREFIX
-
As an alternative to:
APACI_ARGS='--prefix=/home/httpd/httpd_perl'
you can use the APACHE_PREFIX parameter. When
USE_APACI is enabled, this attribute specifies the
same prefix option.
Additionally, the APACHE_PREFIX option
automatically executes make install in the
Apache source directory, which makes the following commands:
panic% perl Makefile.PL APACHE_SRC=../apache_1.3.xx/src \
DO_HTTPD=1 USE_APACI=1 EVERYTHING=1 \
APACI_ARGS='--prefix=/home/httpd/httpd_perl'
panic% make && make test
panic# make install
panic# cd ../apache_1.3.xx
panic# make install
equivalent to these commands:
panic% perl Makefile.PL APACHE_SRC=../apache_1.3.xx/src \
DO_HTTPD=1 USE_APACI=1 EVERYTHING=1 \
APACHE_PREFIX=/home/httpd/httpd_perl
panic% make && make test
panic# make install
- PERL_STATIC_EXTS
-
Normally, if a C code extension is statically linked with Perl, it is
listed in Config.pm's
$Config{static_exts}, in which case mod_perl
will also statically link this extension with
httpd. However, if an extension is statically
linked with Perl after it is installed, it will not be listed in
Config.pm. You can either edit
Config.pm and add these extensions, or configure
mod_perl like this:
panic% perl Makefile.PL "PERL_STATIC_EXTS=DBI DBD::Oracle"
- DYNAMIC
-
This option tells mod_perl to build the Apache::*
API extensions as shared libraries. The default is to link these
modules statically with the httpd executable.
Building them as shared libraries can save some memory if you use
these API features only occasionally. To enable this option, use:
panic% perl Makefile.PL DYNAMIC=1
- USE_APXS
-
If this option is enabled, mod_perl will be built using
the APXS tool. This
tool is used to build C API modules in a way that is independent of
the Apache source tree. mod_perl will look for the
apxs executable in the location specified by
WITH_APXS; otherwise, it will check the
bin and sbin directories
relative to APACHE_PREFIX. To enable this option,
use:
panic% perl Makefile.PL USE_APXS=1
- WITH_APXS
-
This attribute tells mod_perl the location of the
apxs executable. This is necessary if the binary
cannot be found in the command path or in the location specified by
APACHE_PREFIX. For example:
panic% perl Makefile.PL USE_APXS=1 WITH_APXS=/home/httpd/bin/apxs
- USE_DSO
-
This option tells mod_perl to build
itself as a DSO. Although this reduces the apparent size of the
httpd executable on disk, it
doesn't actually reduce the memory consumed by each
httpd process. This is recommended only if you
are going to be using the mod_perl API only occasionally, or if you
wish to experiment with its features before you start using it in a
production environment. To enable this option, use:
panic% perl Makefile.PL USE_DSO=1
- SSL_BASE
-
When building against a mod_ssl-enabled server, this option will tell
Apache where to look for the SSL include and
lib subdirectories. For example:
panic% perl Makefile.PL SSL_BASE=/usr/share/ssl
- PERL_DESTRUCT_LEVEL={1,2}
-
When the Perl interpreter shuts down, this level enables additional
checks during server shutdown to make sure the
interpreter
has done proper bookkeeping. The default is 0. A
value of 1 enables full destruction, and
2 enables full destruction with checks. This value
can also be changed at runtime by setting the environment variable
PERL_DESTRUCT_LEVEL. We will revisit this
parameter in Chapter 5.
- PERL_TRACE
-
To enable mod_perl debug tracing, configure mod_perl with
the PERL_TRACE option:
panic% perl Makefile.PL PERL_TRACE=1
To see the diagnostics, you will also need to set the
MOD_PERL_TRACE environment variable at runtime.
We will use mod_perl configured with this parameter enabled to show a
few debugging techniques in Chapter 21.
- PERL_DEBUG
-
This option builds mod_perl and the Apache server with C source code
debugging enabled (the -g switch). It also
enables PERL_TRACE, sets
PERL_DESTRUCT_LEVEL to 2, and
links against the debuggable libperld Perl
interpreter if one has been installed. You will be able to debug the
Apache executable and each of its modules with a source-level
debugger, such as the GNU debugger gdb. To
enable this option, use:
panic% perl Makefile.PL PERL_DEBUG=1
We will discuss this option in Chapter 21, as it is
extremely useful to track down bugs or report problems.
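Because the options are cumulative, a long configuration is easier to review when it is assembled in one place. The following is a minimal sketch, not part of mod_perl: the helper name makepl_command is hypothetical, and it only prints the combined call instead of running it. Options whose values contain spaces (such as PERL_STATIC_EXTS) would need to be quoted again before the printed line is pasted into a shell.

```shell
#!/bin/sh
# Hypothetical helper (not part of mod_perl): print one perl Makefile.PL
# call that carries every option passed as an argument.
makepl_command() {
    cmd='perl Makefile.PL'
    for opt in "$@"; do
        cmd="$cmd $opt"
    done
    printf '%s\n' "$cmd"
}

# Dry run: show the single combined call rather than executing it.
makepl_command APACHE_SRC=../apache_1.3.xx/src DO_HTTPD=1 USE_APACI=1 \
    PERL_TRACE=1 PERL_DESTRUCT_LEVEL=2
```

Printing the command first makes it easy to check that every option you want ends up in the same call before you run it.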
3.1.2 Activating Callback Hooks
A callback hook (also known simply as a
callback) is a reference to a subroutine. In
Perl, we create subroutine references with
the following syntax:
$callback = \&subroutine;
In this example, $callback contains a reference to
the subroutine called subroutine. Another way to
create a callback is to use an anonymous subroutine:
$callback = sub { 'some code' };
Here, $callback contains a reference to the
anonymous subroutine. Callbacks are used when we want some action
(subroutine call) to occur when some event takes place. Since we
don't know exactly when the event will take place,
we give the event
handler
a reference to the subroutine we want to be executed. The handler
will call our subroutine at the right time, effectively
calling back that subroutine.
By default, most of the callback hooks except for
PerlHandler,
PerlChildInitHandler,
PerlChildExitHandler,
PerlConnectionApi, and
PerlServerApi are turned off. You may enable them
via options to Makefile.PL.
Here is the list of available hooks and the
parameters that enable them. The
Apache request processing phases were explained in Chapter 1.
Directive/Hook              Configuration Option
--------------------------------------------------------
PerlPostReadRequestHandler  PERL_POST_READ_REQUEST
PerlTransHandler            PERL_TRANS
PerlInitHandler             PERL_INIT
PerlHeaderParserHandler     PERL_HEADER_PARSER
PerlAuthenHandler           PERL_AUTHEN
PerlAuthzHandler            PERL_AUTHZ
PerlAccessHandler           PERL_ACCESS
PerlTypeHandler             PERL_TYPE
PerlFixupHandler            PERL_FIXUP
PerlHandler                 PERL_HANDLER
PerlLogHandler              PERL_LOG
PerlCleanupHandler          PERL_CLEANUP
PerlChildInitHandler        PERL_CHILD_INIT
PerlChildExitHandler        PERL_CHILD_EXIT
PerlDispatchHandler         PERL_DISPATCH
As with any parameters that are either defined or not, use
OPTION_FOO=1 to enable them (e.g.,
PERL_AUTHEN=1).
To enable all callback hooks, use:
ALL_HOOKS=1
There are a few more hooks that won't be enabled by
default, because they are experimental.
If you are using:
panic% perl Makefile.PL EVERYTHING=1 ...
it already includes the ALL_HOOKS=1 option.
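The directive names and option names listed in this section follow a regular pattern, which can be sketched as a small shell function. This is only an illustration derived from the listed pairs: the function name hook_option is made up, and the experimental hooks that are not listed may not follow the same pattern.

```shell
#!/bin/sh
# Sketch: derive the Makefile.PL option name from a handler directive name,
# following the pattern of the pairs listed in this section.
hook_option() {
    name=${1#Perl}                  # PerlAuthenHandler -> AuthenHandler
    case $name in
        Handler) ;;                 # PerlHandler keeps its only word
        *) name=${name%Handler} ;;  # AuthenHandler -> Authen
    esac
    printf '%s\n' "$name" |
        sed 's/\([a-z]\)\([A-Z]\)/\1_\2/g' |  # ChildInit -> Child_Init
        tr '[:lower:]' '[:upper:]' |
        sed 's/^/PERL_/'
}

hook_option PerlChildInitHandler
```

Append =1 to the printed name to get the argument for perl Makefile.PL.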
3.1.3 Activating Standard API Features
The following options enable
various standard features
of the mod_perl API. While not absolutely needed,
they're very handy and there's
little penalty in including them. Unless specified otherwise, these
options are all disabled by default. The
EVERYTHING=1 or DYNAMIC=1
options will enable them en masse. If in doubt, include these.
- PERL_FILE_API=1
-
Enables the Apache::File class, which helps with
the handling of files under mod_perl.
- PERL_TABLE_API=1
-
Enables the Apache::Table class, which provides
tied access to the Apache Table structure (used for HTTP headers,
among others).
- PERL_LOG_API=1
-
Enables the Apache::Log class. This class allows
you to access Apache's more advanced logging
features.
- PERL_URI_API=1
-
Enables the Apache::URI class, which deals with
the parsing of URIs in a similar way to the Perl
URI::URL module, but much faster.
- PERL_UTIL_API=1
-
Enables the Apache::Util class, allowing you to
use various functions such as HTML escaping or date parsing, but
implemented in C.
- PERL_CONNECTION_API=1
-
Enables the Apache::Connection class. This class
is enabled by default. Set the option to 0 to
disable it.
- PERL_SERVER_API=1
-
Enables the Apache::Server class. This class is
enabled by default. Set the option to 0 to disable
it.
Please refer to Lincoln Stein and Doug MacEachern's
Writing Apache Modules with Perl and C
(O'Reilly) for more information about the Apache
API.
3.1.4 Enabling Extra Features
mod_perl
comes with a number of other
features. Most of them are disabled by default. This is the list of
features and options to enable them:
<Perl> sections give you a way to configure
Apache using Perl code in the httpd.conf file
itself. See Chapter 4 for more information.
panic% perl Makefile.PL PERL_SECTIONS=1 ...
With the PERL_SSI option, the mod_include module can be
extended to include a #perl directive.
panic% perl Makefile.PL PERL_SSI=1
By enabling PERL_SSI, a new
#perl element is added to the standard mod_include
functionality. This element allows server-side includes to call Perl
subroutines directly. This feature works only when mod_perl is not
built as a DSO (i.e., when it's built statically).
If you develop an Apache module in Perl and you want to create custom
configuration directives to be recognized in
httpd.conf, you need to use
Apache::ModuleConfig and
Apache::CmdParms. For these modules to work, you
will need to enable this option:
panic% perl Makefile.PL PERL_DIRECTIVE_HANDLERS=1
The stacked handlers feature explained in Chapter 4 requires this
parameter to be enabled:
panic% perl Makefile.PL PERL_STACKED_HANDLERS=1
The method handlers feature discussed in Chapter 4
requires this parameter to be enabled:
panic% perl Makefile.PL PERL_METHOD_HANDLERS=1
To enable all phase callback handlers, all API modules, and all
miscellaneous features, use the
"catch-all" option we used when we
first compiled mod_perl:
panic% perl Makefile.PL EVERYTHING=1
3.1.5 Reusing Configuration Parameters
When you have to
upgrade the server,
it's sometimes hard to remember what parameters you
used in the previous mod_perl build. So it's a good
idea to save them in a file.
One way to save parameters is to create
a file (e.g.,
~/.mod_perl_build_options) with the following
contents:
APACHE_SRC=../apache_1.3.xx/src DO_HTTPD=1 USE_APACI=1
EVERYTHING=1
Then build the server with the following command:
panic% perl Makefile.PL `cat ~/.mod_perl_build_options`
panic% make && make test
panic# make install
But mod_perl has a standard method to perform this trick. If a file
named
makepl_args.mod_perl is found in the same
directory as the mod_perl build location, it will
be read in by Makefile.PL. Parameters supplied
at the command line will override the parameters given in this file.
The makepl_args.mod_perl file can also be
located in your home directory or in the ../
directory relative to the mod_perl distribution directory. The
filename can also start with a dot
(.makepl_args.mod_perl), so you can keep
it
nicely hidden along with the rest of the dot files in your home
directory. So, Makefile.PL will look for the
following files (in this order), using the first one it comes across:
./makepl_args.mod_perl
../makepl_args.mod_perl
./.makepl_args.mod_perl
../.makepl_args.mod_perl
$ENV{HOME}/.makepl_args.mod_perl
For example:
panic% ls -1 /home/stas/src
apache_1.3.xx/
makepl_args.mod_perl
mod_perl-1.xx/
panic% cat makepl_args.mod_perl
APACHE_SRC=../apache_1.3.xx/src
DO_HTTPD=1
USE_APACI=1
EVERYTHING=1
panic% cd mod_perl-1.xx
panic% perl Makefile.PL
panic% make && make test
panic# make install
Now the parameters from the makepl_args.mod_perl
file will be used automatically, as if they were entered directly.
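The lookup order described above can be sketched in shell. This is an illustration of the documented order, not the code Makefile.PL actually runs; the function name find_makepl_args is made up, and the second argument stands in for $ENV{HOME}.

```shell
#!/bin/sh
# Sketch of the documented lookup order; this is not mod_perl's own code.
# $1 is the mod_perl source directory, $2 is the home directory.
find_makepl_args() {
    src_dir=$1
    home_dir=$2
    for candidate in \
        "$src_dir/makepl_args.mod_perl" \
        "$src_dir/../makepl_args.mod_perl" \
        "$src_dir/.makepl_args.mod_perl" \
        "$src_dir/../.makepl_args.mod_perl" \
        "$home_dir/.makepl_args.mod_perl"
    do
        # The first file that exists wins; later ones are never consulted.
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

For instance, running find_makepl_args /home/stas/src/mod_perl-1.xx "$HOME" shows which file would be picked up, which helps when an old file in your home directory is silently overridden by one next to the sources.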
In the sample makepl_args.mod_perl file in the
eg/ directory of the mod_perl distribution
package, you might find a few options enabling some experimental
features for you to play with, too!
If you are faced with a compiled Apache and no trace of the
parameters used to build it, you can usually still find them if
make clean was not run on the sources. You will
find the Apache-specific parameters in
apache_1.3.xx/config.status and the mod_perl
parameters in
mod_perl-1.xx/apaci/mod_perl.config.
3.1.6 Discovering Whether a Feature Was Enabled
mod_perl Version 1.25
introduced
Apache::MyConfig, which provides access to the various
hooks and features set when mod_perl was built. This circumvents the
need to set up a live server just to find out if a certain callback
hook is available.
To see whether some feature was built in or not, check the
%Apache::MyConfig::Setup hash. For example,
suppose we install mod_perl with the following options:
panic% perl Makefile.PL EVERYTHING=1
but the next day we can't remember which callback
hooks were enabled. We want to know whether the
PERL_LOG callback hook is available. One of the
ways to find an answer is to run the following code:
panic% perl -MApache::MyConfig -e 'print $Apache::MyConfig::Setup{PERL_LOG}'
If it prints 1, that means the
PERL_LOG callback hook is enabled (which it should
be, as EVERYTHING=1 enables them all).
Another approach is to configure
Apache::Status (see Chapter 9) and
request http://localhost/perl-status?hooks to check
for enabled hooks.
If you want to check for the existence of
various hooks within your handlers, you
can use the script shown in Example 3-1.
Example 3-1. test_hooks.pl
use mod_perl_hooks;
for my $hook (mod_perl::hooks( )) {
    if (mod_perl::hook($hook)) {
        print "$hook is enabled\n";
    }
    else {
        print "$hook is not enabled\n";
    }
}
You can also try to look at the symbols inside the
httpd executable with the help of
nm(1) or a similar utility. For example, if you
want to see whether you enabled PERL_LOG=1 while
building mod_perl, you can search for a symbol with the same name but
in lowercase:
panic% nm httpd | grep perl_log
08071724 T perl_logger
This shows that PERL_LOG=1 was enabled. But this
approach will work only if you have an unstripped
httpd binary. By default, make
install strips the binary before installing it, thus
removing the symbol names to save space. Use the
--without-execstrip ./configure option to
prevent stripping during the make install phase.
Yet another approach that will work in most cases is to try to use
the feature in question. If it wasn't configured,
Apache will give an error message.
3.1.7 Using an Alternative Configuration File
By default,
mod_perl provides its own
copy of the Configuration file to
Apache's configure utility. If
you want to pass it your own version, do this:
panic% perl Makefile.PL CONFIG=Configuration.custom
where
Configuration.custom is the pathname of the file
relative to the Apache source tree you build
against.
3.1.8 perl Makefile.PL Troubleshooting
During the configuration (perl
Makefile.PL) stage, you may encounter some of
these problems. To help you avoid them, let's study
them, find out why they happened, and discuss how to fix them.
3.1.8.1 A test compilation with your Makefile configuration failed...
When you see the
following error during the
perl Makefile.PL stage:
** A test compilation with your Makefile configuration
** failed. This is most likely because your C compiler
** is not ANSI. Apache requires an ANSI C Compiler, such
** as gcc. The above error message from your compiler
** will also provide a clue.
Aborting!
it's possible that you have a problem with a
compiler. It may be improperly installed or not installed at all.
Sometimes the reason is that your Perl executable was built on a
different machine, and the software installed on your machine is not
the same. Generally this happens when you install prebuilt packages,
such as rpm or deb. You may
find that the dependencies weren't properly defined
in the Perl binary package and you were allowed to install it even
though some essential packages were not installed.
The most frequent pitfall is a missing gdbm
library (see the next section).
But why guess, when we can actually see the real error message and
understand what the real problem is? To get a real error message,
edit the Apache src/Configure script. Around
line 2140, you should see a line like this:
if ./helpers/TestCompile sanity; then
Add the -v option, as follows:
if ./helpers/TestCompile -v sanity; then
and try again. Now you should get a useful error message.
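If you prefer not to edit the script by hand, the same change can be sketched with sed. The function name verbose_test_compile is made up for this illustration, and the sketch assumes the line reads exactly as shown above; it writes the modified script to standard output and leaves the original file alone.

```shell
#!/bin/sh
# Sketch: add -v to the TestCompile sanity check in Apache's src/Configure.
# The edited script goes to standard output; the input file is not changed.
verbose_test_compile() {
    sed 's|\./helpers/TestCompile sanity;|./helpers/TestCompile -v sanity;|' "$1"
}
```

One possible way to apply it is to keep a backup first (cp src/Configure src/Configure.orig) and then redirect the output of verbose_test_compile src/Configure.orig into src/Configure.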
3.1.8.2 Missing or misconfigured libgdbm.so
On some Red Hat Linux systems, you might
encounter a problem during the perl Makefile.PL
stage, when Perl was installed from an rpm
package built with the gdbm
library, but libgdbm
isn't actually installed. If this happens to you,
make sure you install it before proceeding with the build process.
You can check how Perl was built by running the perl
-V command:
panic% perl -V | grep libs
You should see output similar to this:
libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lc -lposix -lcrypt
Sometimes the problem is even more obscure: you do have
libgdbm installed, but it's not
installed properly. Do this:
panic% ls /usr/lib/libgdbm.so*
If you get at least three lines, like we do:
lrwxrwxrwx /usr/lib/libgdbm.so -> libgdbm.so.2.0.0
lrwxrwxrwx /usr/lib/libgdbm.so.2 -> libgdbm.so.2.0.0
-rw-r--r-- /usr/lib/libgdbm.so.2.0.0
you are all set. On some installations, the
libgdbm.so symbolic link is missing, so you get
only:
lrwxrwxrwx /usr/lib/libgdbm.so.2 -> libgdbm.so.2.0.0
-rw-r--r-- /usr/lib/libgdbm.so.2.0.0
To fix this problem, add the missing symbolic link:
panic% cd /usr/lib
panic% ln -s libgdbm.so.2.0.0 libgdbm.so
Now you should be able to build mod_perl without any problems.
Note that you might need to prepare this symbolic link as well:
lrwxrwxrwx /usr/lib/libgdbm.so.2 -> libgdbm.so.2.0.0
with the command:
panic% ln -s libgdbm.so.2.0.0 libgdbm.so.2
Of course, if a new version of the libgdbm library
was released between the moment we wrote this sentence and the moment
you're reading it, you will have to adjust the
version numbers. We didn't use the usual
xx.xx version replacement here, to make it
easier to understand how the symbolic links should be set.
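The symbolic link check can be sketched as a shell function that works for any version number. The function name missing_gdbm_links is made up, and the sketch assumes the real library file is named libgdbm.so.X.Y.Z; it only prints the ln commands that appear to be needed and does not create anything.

```shell
#!/bin/sh
# Sketch: print the ln commands for libgdbm symbolic links that are missing
# from a library directory. Nothing is created or modified.
missing_gdbm_links() {
    lib_dir=$1
    for real in "$lib_dir"/libgdbm.so.*.*.*; do
        [ -f "$real" ] || continue      # no library file matched the pattern
        file=${real##*/}                # e.g., libgdbm.so.2.0.0
        version=${file#libgdbm.so.}     # e.g., 2.0.0
        major=${version%%.*}            # e.g., 2
        for link in libgdbm.so "libgdbm.so.$major"; do
            [ -e "$lib_dir/$link" ] || printf 'ln -s %s %s\n' "$file" "$link"
        done
    done
}
```

Running missing_gdbm_links /usr/lib prints nothing when both links are present. If several library versions are installed side by side, review the suggestions by hand, since the sketch may then suggest the libgdbm.so link more than once.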
If you need to have the dbm library linked in, you should know
that both the gdbm and db
libraries offer ndbm emulation, which is the
interface that Apache actually uses. So when you build mod_perl, you
end up using whichever library was linked first by the Perl
compilation. If you build Apache without mod_perl, you end up with
whatever appears to be your ndbm library, which
will vary between systems, and especially Linux distributions. So you
may have to work a bit to get both Apache and Perl to use the same
library, and you are likely to have trouble copying the
dbm file from one system to another or even
using it after an upgrade.
3.1.8.3 Undefined reference to `PL_perl_destruct_level'
When manually building mod_perl using the shared library:
panic% cd mod_perl-1.xx
panic% perl Makefile.PL PREP_HTTPD=1
panic% make && make test
panic# make install
panic% cd ../apache_1.3.xx
panic% ./configure --with-layout=RedHat --target=perlhttpd \
    --activate-module=src/modules/perl/libperl.a
you might see the following output:
gcc -c -I./os/unix -I./include -DLINUX=2 -DTARGET=\"perlhttpd\"
-DUSE_HSREGEX -DUSE_EXPAT -I./lib/expat-lite `./apaci` buildmark.c
gcc -DLINUX=2 -DTARGET=\"perlhttpd\" -DUSE_HSREGEX -DUSE_EXPAT
-I./lib/expat-lite `./apaci` \
-o perlhttpd buildmark.o modules.o modules/perl/libperl.a
modules/standard/libstandard.a main/libmain.a ./os/unix/libos.a ap/libap.a
regex/libregex.a lib/expat-lite/libexpat.a -lm -lcrypt
modules/perl/libperl.a(mod_perl.o): In function `perl_shutdown':
mod_perl.o(.text+0xf8): undefined reference to `PL_perl_destruct_level'
mod_perl.o(.text+0x102): undefined reference to `PL_perl_destruct_level'
mod_perl.o(.text+0x10c): undefined reference to `PL_perl_destruct_level'
mod_perl.o(.text+0x13b): undefined reference to `Perl_av_undef'
[more errors snipped]
This happens when Perl was built statically linked, with no shared
libperl.a. Build a dynamically linked Perl (with
libperl.a) and the problem will
disappear.
3.2 Building mod_perl (make)
After completing
the configuration, it's time to build the server by
simply calling:
panic% make
The make program first compiles the source files
and creates a mod_perl library file. Then, depending on your
configuration, this library is either linked with
httpd (statically) or not linked at all,
allowing you to dynamically load it at runtime.
You should avoid putting the mod_perl source directory inside the
Apache source directory, as this confuses the build process. The best
choice is to put both source directories under the same parent
directory.
3.2.1 What Compiler Should Be Used to Build mod_perl?
All Perl
modules that use C extensions must be compiled
using the compiler with which your copy of Perl was built.
When you run perl Makefile.PL, a
Makefile is created. This
Makefile includes the same compilation options
that were used to build Perl itself. They are stored in the
Config.pm module and can be displayed with the
perl -V command. All these options are reapplied
when compiling Perl modules.
If you use a different compiler to build Perl extensions, chances are
that the options this compiler uses won't be the
same, or they might be interpreted in a completely different way. So
the code may not compile, may dump core, or may behave in unexpected
ways.
Since Perl, Apache, and third-party modules all work together under
mod_perl, it's essential to use the same compiler
while building each of the components.
If you compile a non-Perl component separately, you should make sure
to use both the same compiler and the same options used to build
Perl. You can find much of this information by running perl
-V.
3.2.2 make Troubleshooting
The following errors are the ones that frequently
occur during the make process when building
mod_perl.
3.2.2.1 Undefined reference to `Perl_newAV'
This and similar error messages may show up during the
make process. Generally it happens when you have
a broken Perl installation. If it's installed from a
broken rpm or another precompiled binary
package, build Perl from source or use another properly built binary
package. Run perl -V to learn what version of
Perl you are using and other important details.
3.2.2.2 Unrecognized format specifier for...
This error is usually caused by problems with
some versions of the SFIO library. Try to use the
latest version to get around this problem or, if you
don't really need SFIO, rebuild
Perl without this library.
3.3 Testing the Server (make test)
After building the
server, it's a
good idea to test it thoroughly by calling:
panic% make test
Fortunately, mod_perl comes with a big collection of tests, which
attempt to exercise all the features you asked for at the
configuration stage. If any of the tests fails, the make
test step will fail.
Running make test will start the freshly built
httpd on port 8529 (an unprivileged port),
running under the UID (user ID) and GID (group ID) of the
perl Makefile.PL process. The
httpd will be terminated when the tests are
finished.
To change the default port (8529) used for the tests, do this:
panic% perl Makefile.PL PORT=xxxx
Each file in the testing suite generally includes more than one test,
but when you do the testing, the program will report only how many
tests were passed and the total number of tests defined in the test
file. To learn which ones failed, run the tests in verbose mode by
using the
TEST_VERBOSE parameter:
panic% make test TEST_VERBOSE=1
As of mod_perl v1.23, you can use the environment variables
APACHE_USER and APACHE_GROUP to
override the default User and
Group settings in the
httpd.conf file used for make
test. These two variables should be set before the
Makefile is created to take effect during the
testing stage. For example, if you want to set them to
httpd, you can do the following in the
Bourne-style shell:
panic% export APACHE_USER=httpd
panic% export APACHE_GROUP=httpd
panic% perl Makefile.PL ...
3.3.1 Manual Testing
Tests
are
invoked by running the ./TEST script located in
the ./t directory. Use the
-v option for verbose tests. You might run an
individual test like this:
panic% perl t/TEST -v modules/file.t
or all tests in a test subdirectory:
panic% perl t/TEST modules
The TEST script starts the server before the
test is executed. If for some reason it fails to start, use
make start_httpd to start it manually:
panic% make start_httpd
To shut down Apache when the testing is complete, use make
kill_httpd:
panic% make kill_httpd
3.3.2 make test Troubleshooting
The following sections cover problems that you
may encounter during the testing stage.
3.3.2.1 make test fails
make test requires httpd to have been built already,
so if you specified NO_HTTPD=1 during the
perl Makefile.PL stage, you'll
have to build httpd independently before running
make test. Go to the Apache source tree and run
make, then return to the mod_perl source tree
and continue with the server testing.
If you get an error like this:
still waiting for server to warm up...............not ok
you may want to examine the t/logs/error_log
file, where all the make test-stage errors are
logged. If you still cannot find the problem or this file is
completely empty, you may want to run the test with
strace (or truss) in the
following way (assuming that you are located in the root directory
of the mod_perl source tree):
panic% make start_httpd
panic% strace -f -s1024 -o strace.out -p `cat t/logs/httpd.pid` &
panic% make run_tests
panic% make kill_httpd
where the strace -f option
tells strace to trace child processes as they
are created, -s1024 allows trace strings of a
maximum of 1024 characters to be printed (it's 32 by
default), -o gives the name of the file to which
the output should be written, -p supplies the
PID of the parent process, and & puts the
job in the background.
When the tests are complete, you can examine the generated
strace.out file and hopefully find the problem.
We talk about creating and analyzing trace outputs in Chapter 21.
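A quick first pass over a large trace is to count the system calls that failed. The following is a minimal sketch: the function name strace_error_summary is made up, and it relies on strace marking a failed call with "= -1" followed by the error name.

```shell
#!/bin/sh
# Sketch: count failed system calls per error code in an strace output file.
# Relies on strace reporting failures as "= -1 ENAME (description)".
strace_error_summary() {
    sed -n 's/.* = -1 \(E[A-Z0-9]*\) .*/\1/p' "$1" |
        sort | uniq -c | sort -rn
}
```

Running strace_error_summary strace.out lists the most frequent error codes first. Many failures are harmless (for example, files that are probed but are not expected to exist), so treat the summary as a pointer to the lines worth reading rather than as a diagnosis.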
3.3.2.2 mod_perl.c is incompatible with this version of Apache
If you had a
stale Apache header layout in one of
the include paths during the build process, you
will see the message "mod_perl.c is incompatible
with this version of Apache" when you try to execute
httpd. Find the file
ap_mmn.h using find,
locate, or another utility. Delete this file and
rebuild Apache. The Red Hat Linux distribution usually installs it in
/usr/local/include.
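The search for stale headers can be sketched with find. The function name find_ap_mmn is made up, and the directories to search are an assumption that you should adjust to the include paths your compiler really uses.

```shell
#!/bin/sh
# Sketch: list every copy of ap_mmn.h below the given directories so that
# stale ones can be reviewed before Apache is rebuilt.
find_ap_mmn() {
    find "$@" -type f -name ap_mmn.h 2>/dev/null
}
```

For example, find_ap_mmn /usr/include /usr/local/include prints one path per copy found. A copy inside the Apache source tree you are building against is expected; copies elsewhere are the candidates to examine.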
Before installing mod_perl-enabled Apache from scratch,
it's a good idea to remove all the pre-installed
Apache modules, and thus save the trouble of looking for files that
mess up the build process. For example, to remove the precompiled
Apache installed as a Red Hat Package Manager (RPM) package, as
root you should do:
panic# rpm -e apache
There may be other RPM packages that depend on the Apache RPM
package. You will be notified about any other dependent packages, and
you can decide whether to delete them, too. You can always supply the
--nodeps option to tell the RPM manager to
ignore the dependencies.
apt users would do this instead:
panic# apt-get remove apache
3.3.2.3 make test......skipping test on this platform
make test may report
some tests as
skipped. They are skipped because you are
missing the modules that are needed for these tests to pass. You
might want to peek at the contents of each test; you will find them
all in the ./t directory. It's
possible that you don't need any of the missing
modules to get your work done, in which case you
shouldn't worry that the tests are skipped.
If you want to make sure that all tests pass, you will need to figure
out what modules are missing from your installation. For example, if
you see:
modules/cookie......skipping test on this platform
you may want to install the Apache::Cookie module.
If you see:
modules/request.....skipping test on this platform
Apache::Request is missing. If you
see:
modules/psections...skipping test on this platform
Devel::Symdump and Data::Dumper
are needed.
Chances are that all of these will be installed if you use
CPAN.pm to install
Bundle::Apache. We talk about CPAN installations
later in this chapter.
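To collect the names of all skipped tests in one pass, the test output can be filtered. This is a minimal sketch that assumes the skip message is worded exactly as in the examples above; the function name skipped_tests is made up.

```shell
#!/bin/sh
# Sketch: print the names of tests that "make test" reported as skipped.
# Reads the test output on standard input.
skipped_tests() {
    sed -n 's/^\([^.]*\)\.\.*skipping test on this platform.*$/\1/p'
}
```

A possible invocation is make test 2>&1 | skipped_tests, which prints one test name per line, such as modules/cookie.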
3.3.2.4 make test fails due to misconfigured localhost entry
The make test suite uses
localhost to run the tests that require a
network. Make sure you have this entry in
/etc/hosts:
127.0.0.1 localhost.localdomain localhost
Also make sure you have the loopback device lo
configured. If you aren't sure, run:
panic% /sbin/ifconfig lo
This will tell you whether the loopback device is configured.
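The /etc/hosts check can also be scripted. The following sketch uses a made-up function name, has_localhost_entry, and looks only at the 127.0.0.1 address; it does not check the loopback device itself.

```shell
#!/bin/sh
# Sketch: succeed if a hosts file maps 127.0.0.1 to the name "localhost".
has_localhost_entry() {
    awk '
        { sub(/#.*/, "") }                  # ignore comments
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++)
                if ($i == "localhost") found = 1
        }
        END { exit !found }
    ' "$1"
}
```

For example, has_localhost_entry /etc/hosts || echo "localhost entry missing" prints a warning only when the entry is absent or commented out.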
3.4 Installation (make install)
After
testing the server, the last step is
to install it. First install all the Perl files (usually as
root):
panic# make install
Then go to the Apache source tree and complete the Apache
installation (installing the configuration files,
httpd, and utilities):
panic# cd ../apache_1.3.xx
panic# make install
Of course, if you have used the APACHE_PREFIX
option as explained earlier in this chapter, you can skip this step.
Now the installation should be considered complete. You may now
configure your server and start using it.
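Before configuring the server, you may want to confirm that the installed binary really includes mod_perl. For a static build, httpd -l lists the compiled-in modules, and mod_perl.c should be among them. A small sketch (the function name lists_mod_perl is ours; a DSO build will not show mod_perl.c here):

```shell
# Succeed if the module list on stdin (as printed by `httpd -l`)
# contains a line naming mod_perl.c.
lists_mod_perl() {
    grep -q '^[[:space:]]*mod_perl\.c[[:space:]]*$'
}
```

Assuming the default installation prefix, you could run /usr/local/apache/bin/httpd -l | lists_mod_perl && echo "mod_perl is compiled in".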
3.4.1 Manually Building a mod_perl-Enabled Apache
If you want to build
httpd separately from mod_perl, you should use
the NO_HTTPD=1 option during the perl
Makefile.PL (mod_perl build) stage. Then you will have to
configure various things by hand and proceed to build Apache. You
shouldn't run perl Makefile.PL
before following the steps described in this section.
If you choose to manually build mod_perl, there are three things you
may need to set up before the build stage:
- mod_perl's Makefile
When perl
Makefile.PL is executed,
$APACHE_SRC/modules/perl/Makefile may need to be
modified to enable various options (e.g.,
ALL_HOOKS=1).
Optionally, instead of tweaking the options during the perl
Makefile.PL stage, you can edit
mod_perl-1.xx/src/modules/perl/Makefile before
running perl Makefile.PL.
- Configuration
Add the following to
apache_1.3.xx/src/Configuration:
AddModule modules/perl/libperl.a
We suggest you add this entry at the end of the
Configuration file if you want your callback
hooks to have precedence over core handlers.
Add the following to EXTRA_LIBS:
EXTRA_LIBS=`perl -MExtUtils::Embed -e ldopts`
Add the following to EXTRA_CFLAGS:
EXTRA_CFLAGS=`perl -MExtUtils::Embed -e ccopts`
- mod_perl source files
Return to the mod_perl directory and
copy the mod_perl source files into the Apache build directory:
panic% cp -r src/modules/perl apache_1.3.xx/src/modules/
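If you repeat the manual build often, the Configuration edit from the list above can be scripted so that the AddModule line is appended only once. This is a sketch under that assumption; the function name add_perl_module_line is hypothetical, and the EXTRA_LIBS and EXTRA_CFLAGS settings still have to be made by hand.

```shell
# Append the mod_perl AddModule line to an Apache Configuration file,
# unless an identical line is already present.
add_perl_module_line() {
    config=$1
    line='AddModule modules/perl/libperl.a'
    grep -qxF "$line" "$config" || printf '%s\n' "$line" >> "$config"
}
```

Appending at the end of the file matches the suggestion above to place the entry last. You would call it as add_perl_module_line apache_1.3.xx/src/Configuration.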
When you are done with the configuration parts, run:
panic% perl Makefile.PL NO_HTTPD=1 DYNAMIC=1 EVERYTHING=1 \
APACHE_SRC=../apache_1.3.xx/src
DYNAMIC=1 enables a build of the shared mod_perl
library. Add other options if required.
panic# make install
Now you may proceed with the plain Apache build process. Note that in
order for your changes to the
apache_1.3.xx/src/Configuration file to take
effect, you must run apache_1.3.xx/src/Configure
instead of the default apache_1.3.xx/configure
script:
panic% cd ../apache_1.3.xx/src
panic% ./Configure
panic% make
panic# make install
3.5 Installation Scenarios for Standalone mod_perl
When building mod_perl, the mod_perl C source files
that have to be compiled into the httpd
executable usually are copied to the subdirectory
src/modules/perl/ in the Apache source tree. In
the past, to integrate this subtree into the Apache build process, a
lot of adjustments were done by mod_perl's
Makefile.PL. Makefile.PL
was also responsible for the Apache build process.
This approach is problematic in several ways. It is very restrictive
and not very clean, because it assumes that mod_perl is the only
third-party module that has to be integrated into Apache.
A new hybrid build environment was therefore created for the Apache
side of mod_perl, to avoid these problems. It prepares only the
src/modules/perl/ subtree inside the Apache
source tree, without adjusting or editing anything else. This way, no
conflicts can occur. Instead, mod_perl is activated later (via APACI
calls when the Apache source tree is configured), and then it
configures itself.
There are various ways to build Apache with the new hybrid build
environment (using USE_APACI=1):
Build Apache and mod_perl together, using the default configuration.
Build Apache and mod_perl separately, allowing you to plug in other
third-party Apache modules as needed.
Build mod_perl as a DSO inside the Apache source tree using APACI.
Build mod_perl as a DSO outside the Apache source tree with APXS.
3.5.1 The All-in-One Way
If your goal is just to build and install Apache with mod_perl
out of their source trees, and you have no interest in further
adjusting or enhancing Apache, proceed as we described in Chapter 2:
panic% tar xzvf apache_1.3.xx.tar.gz
panic% tar xzvf mod_perl-1.xx.tar.gz
panic% cd mod_perl-1.xx
panic% perl Makefile.PL APACHE_SRC=../apache_1.3.xx/src \
DO_HTTPD=1 USE_APACI=1 EVERYTHING=1
panic% make && make test
panic# make install
panic# cd ../apache_1.3.xx
panic# make install
This builds Apache statically with mod_perl, installs Apache under
the default /usr/local/apache tree, and installs
mod_perl into the site_perl hierarchy of your
existing Perl installation.
3.5.2 Building mod_perl and Apache Separately
However, sometimes you might need
more flexibility while building mod_perl. If you build mod_perl into
the Apache binary (httpd) in separate steps,
you'll also have the freedom to include other
third-party Apache modules. Here are the steps:
Prepare the Apache source tree. As before, first extract the distributions:
panic% tar xvzf apache_1.3.xx.tar.gz
panic% tar xzvf mod_perl-1.xx.tar.gz
Install mod_perl's
Perl side and prepare the Apache side. Next, install the Perl side of mod_perl into the Perl hierarchy and
prepare the src/modules/perl/ subdirectory
inside the Apache source tree:
panic% cd mod_perl-1.xx
panic% perl Makefile.PL \
APACHE_SRC=../apache_1.3.xx/src \
NO_HTTPD=1 \
USE_APACI=1 \
PREP_HTTPD=1 \
EVERYTHING=1 \
[...]
panic% make
panic# make install
The APACHE_SRC option sets the path to your Apache
source tree, the NO_HTTPD option forces this path
and only this path to be used, the USE_APACI
option triggers the new hybrid build environment, and the
PREP_HTTPD option forces preparation of the
$APACHE_SRC/modules/perl/ tree but no automatic
build.
This tells the configuration process to prepare the Apache side of
mod_perl in the Apache source tree, but doesn't
touch anything else in it. It then just builds the Perl side of
mod_perl and installs it into the Perl installation hierarchy.
Note that if you use PREP_HTTPD as described
above, to complete the build you must go into the Apache source
directory and run make and make
install.
Prepare other third-party modules. Now you have a chance to prepare any other third-party modules you
might want to include in Apache. For instance, you can build PHP
separately, as you did with mod_perl.
Build the Apache package. Now it's time to build Apache, including the Apache
side of mod_perl and any other third-party modules
you've prepared:
panic% cd apache_1.3.xx
panic% ./configure \
--prefix=/path/to/install/of/apache \
--activate-module=src/modules/perl/libperl.a \
[...]
panic% make
panic# make install
You must use the
--prefix option if you want to change the default
target directory of the Apache
installation. The
--activate-module option activates mod_perl for the
configuration process and thus also for the build process. If you
choose --prefix=/usr/share/apache, the
Apache directory tree will be installed in
/usr/share/apache.
If you add other third-party components, such as PHP, include a
separate --activate-module option for each
of them. (See the module's documentation for the
actual path to which --activate-module
should point.) For example, for mod_php4:
--activate-module=src/modules/php4/libphp4.a
Note that the files activated by
--activate-module do not exist at this
time. They will be generated during compilation.
You may also want to go back to the mod_perl source tree and run
make test (to make sure that mod_perl is
working) before running make install inside the
Apache source tree.
For more detailed examples on building mod_perl with other
components, see Section 3.6.
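When several third-party modules are involved, it can help to assemble the configure arguments in one place. The helper below is only an illustration (the name apaci_configure_args is ours); it prints one --prefix option followed by one --activate-module option per library path given.

```shell
# Print the APACI arguments for a prefix and any number of module libraries.
# Usage: apaci_configure_args PREFIX LIB [LIB...]
apaci_configure_args() {
    prefix=$1
    shift
    printf '%s' "--prefix=$prefix"
    for lib in "$@"; do
        printf ' %s' "--activate-module=$lib"
    done
    printf '\n'
}
```

Inside the Apache source directory you could then run ./configure $(apaci_configure_args /usr/share/apache src/modules/perl/libperl.a src/modules/php4/libphp4.a), leaving the command substitution unquoted so that each option becomes a separate argument.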
3.5.3 When DSOs Can Be Used
If you want
to build mod_perl as a DSO, you must make sure that Perl was built
with the system's native malloc().
If Perl was built with its own malloc()
and -Dbincompat5005, it pollutes the
main httpd program with
free and malloc symbols.
When httpd starts or restarts, any references in
the main program to free and
malloc become invalid, causing memory leaks and
segfaults.
Notice that mod_perl's build system warns about this
problem.
With Perl 5.6.0 and later, this pollution can be prevented by
building Perl with -Ubincompat5005. With any version of Perl,
it can be prevented by using -Uusemymalloc. However,
there's a chance that
-Uusemymalloc might hurt performance on your
platform, so -Ubincompat5005 is likely a better
choice where it is available.
If you get the following reports with Perl version 5.6.0+:
% perl -V:usemymalloc
usemymalloc='y';
% perl -V:bincompat5005
bincompat5005='define';
rebuild Perl with -Ubincompat5005.
For pre-5.6.x Perl versions, if you get:
% perl -V:usemymalloc
usemymalloc='y';
rebuild Perl with -Uusemymalloc.
Now rebuild mod_perl.
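The decision rules above are easy to get wrong when you are juggling several Perl installations, so here is a sketch that encodes them. The function name dso_malloc_advice and its "yes"/"no" version argument are our own conventions, not part of Perl or mod_perl.

```shell
# Print advice for a DSO build, following the rules described above.
# $1: "yes" if Perl is 5.6.0 or later, "no" otherwise
# $2: value reported by `perl -V:usemymalloc`   (e.g., y or n)
# $3: value reported by `perl -V:bincompat5005` (e.g., define or undef)
dso_malloc_advice() {
    perl_is_56_plus=$1
    usemymalloc=$2
    bincompat5005=$3
    if [ "$usemymalloc" != "y" ]; then
        echo "ok"
    elif [ "$perl_is_56_plus" = "yes" ]; then
        if [ "$bincompat5005" = "define" ]; then
            echo "rebuild Perl with -Ubincompat5005"
        else
            echo "ok"
        fi
    else
        echo "rebuild Perl with -Uusemymalloc"
    fi
}
```

Because perl -V:name prints its result in the form name='value';, one way to obtain the two values is eval "$(perl -V:usemymalloc)" and eval "$(perl -V:bincompat5005)", after which they are available as shell variables of the same names.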
3.5.4 Building mod_perl as a DSO via APACI
We have already mentioned that the new mod_perl
build environment (with USE_APACI) is a hybrid.
What does that mean? It means, for instance, that you can use the
same src/modules/perl/ configuration to build
mod_perl as a DSO or not, without having to edit any files. To build
libperl.so, just add a single option, depending on
which method you used to build mod_perl.
The static mod_perl library is called
libperl.a, and the shared mod_perl library is
called libperl.so.
Of course, libmodperl would have been a better
prefix, but libperl was used because of
prehistoric Apache issues. Be careful that you don't
confuse mod_perl's libperl.a
and libperl.so files with the ones that are
built with the standard Perl installation.
If you choose the "standard"
all-in-one way of building mod_perl, add:
USE_DSO=1
to the perl Makefile.PL options.
If you choose to build mod_perl and Apache separately, add:
--enable-shared=perl
to Apache's configure options
when you build Apache.
As you can see, whichever way you build mod_perl and Apache, only one
additional option is needed to build mod_perl as a DSO. Everything
else is done automatically: mod_so is automatically enabled, the
Makefiles are adjusted, and the
install target from APACI installs
libperl.so into the Apache installation tree.
Additionally, the LoadModule and
AddModule directives (which dynamically load and
insert mod_perl into httpd) are automatically
added to httpd.conf.
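After a DSO installation, it is worth confirming that both directives actually made it into httpd.conf. The check below is a sketch: the function name is hypothetical, and we assume the module is registered as perl_module in LoadModule and as mod_perl.c in AddModule, which you should verify against your own file.

```shell
# Succeed if the given httpd.conf contains both a LoadModule line for
# perl_module and an AddModule line for mod_perl.c (uncommented).
has_mod_perl_dso_directives() {
    conf=$1
    grep -Eq '^[[:space:]]*LoadModule[[:space:]]+perl_module[[:space:]]' "$conf" &&
    grep -Eq '^[[:space:]]*AddModule[[:space:]]+mod_perl\.c' "$conf"
}
```

With the default layout you might call it as has_mod_perl_dso_directives /usr/local/apache/conf/httpd.conf.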
3.5.5 Building mod_perl as a DSO via APXS
We've seen how to build mod_perl as a DSO
inside the Apache source tree, but there is a
nifty alternative: building mod_perl as a DSO
outside the Apache source tree via the new
Apache 1.3 support tool called APXS. The advantage
is obvious: you can extend an already installed Apache with mod_perl
even if you don't have the sources (for instance,
you may have installed an Apache binary package from your vendor or
favorite distribution).
Here are the build steps:
panic% tar xzvf mod_perl-1.xx.tar.gz
panic% cd mod_perl-1.xx
panic% perl Makefile.PL \
USE_APXS=1 \
WITH_APXS=/path/to/bin/apxs \
EVERYTHING=1 \
[...]
panic% make && make test
panic# make install
This will build the DSO libperl.so outside the
Apache source tree and install it into the existing Apache hierarchy.
4.1 Apache Configuration
Apache configuration can be confusing. To minimize the number of
things that can go wrong, it's a good idea to first
configure Apache itself without mod_perl. So before we go into
mod_perl configuration, let's look at the basics of
Apache itself.
4.1.1 Configuration Files
Prior to Version
1.3.4, the default Apache installation used three configuration
files: httpd.conf,
srm.conf, and access.conf.
Although there were historical reasons for having three separate
files (dating back to the NCSA server), it stopped mattering which
file you used for what a long time ago, and the Apache team finally
decided to combine them. Apache Versions 1.3.4 and later are
distributed with the configuration directives in a single file,
httpd.conf.
Therefore, whenever we mention a configuration file, we are referring
to httpd.conf.
By default, httpd.conf is
installed in the
conf directory under the server root directory.
The default server root is /usr/local/apache/ on
many Unix platforms, but it can be any directory of your choice
(within reason). Users new to Apache and mod_perl will probably find
it helpful to keep to the directory layouts we use in this book.
There is also a special file called
.htaccess, used for per-directory
configuration. When Apache tries to access a file on the filesystem,
it will first search for .htaccess files in the
requested file's parent directories. If found,
Apache scans .htaccess for further configuration
directives, which it then applies only to that directory in which the
file was found and its subdirectories. The name
.htaccess is confusing, because it can contain
almost any configuration directives, not just those related to
resource access control. Note that if the following directive is in
httpd.conf:
<Directory />
AllowOverride None
</Directory>
Apache will not look for .htaccess at all unless
AllowOverride is set to a value other than
None in a more specific
<Directory> section.
.htaccess can be renamed by using the
AccessFileName directive. The following example
configures Apache to look in the target directory for a file called
.acl instead of .htaccess:
AccessFileName .acl
However, you must also make sure that this file
can't be accessed directly from the Web, or else you
risk exposing your configuration. This is done automatically for
.ht* files by Apache, but for other files you
need to use:
<Files .acl>
Order Allow,Deny
Deny from all
</Files>
Another often-mentioned file is the startup file, usually named
startup.pl. This file contains Perl code that will be
executed at server startup. We'll discuss the
startup.pl file in greater detail later in this
chapter, in Section 4.3.
Beware of editing httpd.conf
without understanding all the implications.
Modifying the configuration file and adding new directives can
introduce security problems and have performance implications. If you
are going to modify anything, read through the documentation
beforehand. The Apache distribution comes with an extensive
configuration manual. In addition, each section of the distributed
configuration file includes helpful comments explaining how each
directive should be configured and what the default values are.
If you haven't moved Apache's
directories around, the installation program will configure
everything for you. You can just start the server and test it. To
start the server, use the apachectl utility
bundled with the Apache distribution. It resides in the same
directory as httpd, the Apache
server itself. Execute:
panic% /usr/local/apache/bin/apachectl start
Now you can test the server, for example by accessing
http://localhost/ from a browser running on the
same host.
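Once you begin editing httpd.conf, it is a good habit to check the syntax before starting the server. apachectl provides a configtest command for this; the wrapper below (safe_start is a name we made up) starts the server only if the check succeeds.

```shell
# Run `apachectl configtest` and start the server only if it passes.
# $1: path to apachectl (defaults to the location used in this chapter)
safe_start() {
    apachectl=${1:-/usr/local/apache/bin/apachectl}
    "$apachectl" configtest && "$apachectl" start
}
```

If the configuration check fails, the function returns a nonzero status and the server is not started.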
4.1.2 Configuration Directives
A basic setup
requires little configuration.
If you moved any directories after Apache was installed, they should
be updated in httpd.conf. Here are just a couple
of examples:
ServerRoot "/usr/local/apache"
DocumentRoot "/usr/local/apache/docs"
You can change the port to which the server is bound by editing the
Port directive. This example sets the port to 8080
(the default for the HTTP protocol is 80):
Port 8080
You might want to change the user and group names under which the
server will run. If Apache is started by the user
root (which is generally the case), the parent
process will continue to run as root, but its
children will run as the user and group specified in the
configuration, thereby avoiding many potential security problems.
This example uses the httpd user and group:
User httpd
Group httpd
Make sure that the user and group httpd already
exist. They can be created using useradd(1) and
groupadd(1) or equivalent utilities.
Many other directives may need to be configured as well. In addition
to directives that take a single value, there are whole sections of
the configuration (such as the <Directory>
and <Location> sections) that apply to only
certain areas of the web space. The httpd.conf
file supplies a few examples, and these will be discussed shortly.
4.1.3 <Directory>, <Location>, and <Files> Sections
Let's discuss the basics of the
<Directory>,
<Location>, and
<Files> sections. Remember that there is
more to know about them than what we list here, and the rest of the
information is available in the Apache documentation. The information
we'll present here is just what is important for
understanding mod_perl configuration.
Apache considers directories and files on the machine it runs on as
resources. A particular behavior can be
specified for each resource; that behavior will apply to every
request for information from that particular resource.
Directives in <Directory>
sections apply to specific directories on the host machine, and those
in <Files>
sections apply only to specific files (actually, groups of files with
names that have something in common).
<Location> sections
apply to specific URIs. Locations are given relative to the document
root, whereas directories are given as absolute paths starting from
the filesystem root (/). For example, in the
default server directory layout where the server root is
/usr/local/apache and the document root is
/usr/local/apache/htdocs, files under the
/usr/local/apache/htdocs/pub directory can be
referred to as:
<Directory /usr/local/apache/htdocs/pub>
</Directory>
or alternatively (and preferably) as:
<Location /pub>
</Location>
Exercise caution when using <Location> under
Win32. The Windows family of operating systems is case-insensitive.
In the above example, configuration directives specified for the
location /pub on a case-sensitive Unix machine
will not be applied when the request URI is
/Pub. When URIs map to existing files, such as
Apache::Registry scripts, it is safer to use the
<Directory> or
<Files> directives, which correctly
canonicalize filenames according to local filesystem semantics.
It is up to you to decide which directories on your host machine are
mapped to which locations. This should be done with care, because the
security of the server may be at stake. In particular, essential
system directories such as /etc/
shouldn't be mapped to locations accessible through
the web server. As a general rule, it might be best to organize
everything accessed from the Web under your
ServerRoot, so that it stays organized and you
can keep track of which directories are actually accessible.
Locations do not necessarily have to refer to existing physical
directories, but may refer to virtual resources that the server
creates upon a browser request. As you will see, this is often the
case for a mod_perl server.
When a client (browser)
requests a resource (URI plus optional
arguments) from the server, Apache determines from its configuration
whether or not to serve the request, whether to pass the request on
to another server, what (if any) authentication and authorization is
required for access to the resource, and which module(s) should be
invoked to generate the response.
For any given resource, the various sections in the configuration may
provide conflicting information. Consider, for example, a
<Directory> section that specifies that
authorization is required for access to the resource, and a
<Files> section that says that it is not. It
is not always obvious which directive takes precedence in such cases.
This can be a trap for the unwary.
4.1.3.1 <Directory directoryPath> ... </Directory>
Scope: Can appear in server and virtual host
configurations.
<Directory> and
</Directory> are used to enclose a group
of
directives that will apply to only the named directory and its
contents, including any subdirectories. Any directive that is allowed
in a directory context (see the Apache documentation) may be used.
The path given in the <Directory> directive
is either the full path to a directory, or a string containing
wildcard characters (also called globs). In the
latter case, ? matches any single character,
* matches any sequence of characters, and
[ ] matches character ranges. These are similar to
the wildcards used by sh and similar shells. For
example:
<Directory /home/httpd/docs/foo[1-2]>
Options Indexes
</Directory>
will match /home/httpd/docs/foo1 and
/home/httpd/docs/foo2. None of the wildcards
will match a / character. For example:
<Directory /home/httpd/docs>
Options Indexes
</Directory>
matches /home/httpd/docs and applies to all its
subdirectories.
Matching a regular expression is done by using the
<DirectoryMatch regex> ...
</DirectoryMatch> or <Directory
~ regex> ... </Directory> syntax. For example:
<DirectoryMatch /home/www/.*/public>
Options Indexes
</DirectoryMatch>
will match /home/www/foo/public but not
/home/www/foo/private. In a regular expression,
.* matches any character (represented by
.) zero or more times (represented by
*). This is entirely different from the
shell-style wildcards used by the
<Directory> directive. Regular expressions make it easy to
apply a common configuration to a set of public directories. As
they are more flexible than globs, this method
provides more options to the experienced user.
If multiple (non-regular expression)
<Directory> sections match the directory (or
its parents) containing a document, the directives are applied in the
order of the shortest match first, interspersed with the directives
from any .htaccess files. Consider the following
configuration:
<Directory />
AllowOverride None
</Directory>
<Directory /home/httpd/docs/>
AllowOverride FileInfo
</Directory>
Let us detail the steps Apache goes through when it receives a
request for the file
/home/httpd/docs/index.html:
Apply the directive AllowOverride None (disabling
.htaccess files).
Apply the directive AllowOverride FileInfo for the
directory /home/httpd/docs/ (which now enables
.htaccess in
/home/httpd/docs/ and its subdirectories).
Apply any directives in the group FileInfo, which
control document types (AddEncoding,
AddLanguage, AddType,
etc.; see the Apache documentation for more information), found
in /home/httpd/docs/.htaccess.
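The shortest-match-first rule can be illustrated with a few lines of shell. This is a simplified model, not Apache's implementation: it handles only plain directory paths (no wildcards or regular expressions), and the function name matching_directories is ours.

```shell
# Given a file path ($1) and a list of <Directory> paths on stdin (one per
# line, in configuration file order), print the paths that contain the file,
# shortest first; equally long paths keep their configuration file order.
matching_directories() {
    target=$1
    awk -v target="$target" '
        {
            dir = $0
            sub(/\/+$/, "", dir)                 # drop trailing slashes
            if (dir == "" || index(target "/", dir "/") == 1)
                print length(dir), NR, $0
        }
    ' | sort -k1,1n -k2,2n | cut -d' ' -f3-
}
```

Feeding it the two directories from the example above, in either order, together with /home/httpd/docs/index.html as the target prints / first and /home/httpd/docs/ second, which is the order in which the AllowOverride directives are applied.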
4.1.3.2 <Files filename> ... </Files>
Scope: Can appear in server and virtual host
configurations, as well as in .htaccess files.
The <Files> directive provides access control by
filename and is comparable to the
<Directory> and
<Location> directives.
<Files> should be closed with the
corresponding </Files>. The directives
specified
within this section will be applied to any object with a basename
matching the specified filename. (A basename is the last component of
a path, generally the name of the file.)
<Files> sections are processed in the order
in which they appear in the configuration file, after the
<Directory> sections and
.htaccess files are read, but before
<Location> sections. Note that
<Files> can be nested inside
<Directory> sections to restrict the portion
of the filesystem to which they apply. However,
<Files> cannot be nested inside
<Location> sections.
The filename argument should include a filename or a wildcard string,
where ? matches any single character and
* matches any sequence of characters, just as with
<Directory> sections. Extended regular
expressions can also be used, placing a tilde character
(~) between the directive and the regular
expression. The regular expression should be in quotes. The dollar
symbol ($) refers to the end of the string. The
pipe character (|) indicates alternatives, and
parentheses (()) can be used for grouping. Special
characters in extended regular expressions must be escaped with
backslashes (\). For example:
<Files ~ "\.(pl|cgi)$">
SetHandler perl-script
PerlHandler Apache::Registry
Options +ExecCGI
</Files>
would match all the files ending with the .pl or
.cgi extension (most likely Perl scripts).
Alternatively, the <FilesMatch regex> ...
</FilesMatch> syntax can be used.
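If you are unsure which filenames a pattern such as the one above will match, you can try it from the shell first. grep -E also uses extended regular expressions, so it is a convenient stand-in, although we don't claim that it behaves identically to Apache's matcher in every corner case. The function name filter_registry_scripts is hypothetical.

```shell
# Print only those filenames on stdin that end in .pl or .cgi,
# using the same extended regular expression as the <Files ~ ...> example.
filter_registry_scripts() {
    grep -E '\.(pl|cgi)$'
}
```

For example, ls /home/httpd/perl | filter_registry_scripts lists the files in that (assumed) directory to which the section would apply. A name such as script.pl.bak is not matched, because $ anchors the extension to the end of the string.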
There is much more to regular expressions than what we have shown you
here. As a Perl programmer, learning to use regular expressions is
very important, and what you can learn there will be applicable to
your Apache configuration too.
See the perlretut manpage and the book
Mastering Regular Expressions by Jeffrey E. F.
Friedl (O'Reilly) for more information.
4.1.3.3 <Location URI> ... </Location>
Scope: Can appear in server and virtual host
configurations.
The <Location>
directive
provides for directive scope limitation by URI. It is similar to the
<Directory> directive and starts a section
that is terminated with the </Location>
directive.
<Location> sections are processed in the
order in which they appear in the configuration file, after the
<Directory> sections,
.htaccess files, and
<Files> sections have been interpreted.
The <Location> section is the directive that
is used most often with mod_perl.
Note that URIs do not have to refer to real directories or files
within the filesystem at all; <Location>
operates completely outside the filesystem. Indeed, it may sometimes
be wise to ensure that <Location>s do not
match real paths, to avoid confusion.
The URI may use wildcards. In a wildcard string, ?
matches any single character, * matches any
sequence of characters, and [ ] groups characters
to match. For regular expression matches, use the
<LocationMatch regex> ...
</LocationMatch> syntax.
The <Location> functionality is especially
useful when combined with the SetHandler
directive. For example, to enable server status requests (via
mod_status) but allow them only from browsers at
*.example.com, you might use:
<Location /status>
SetHandler server-status
Order Deny,Allow
Deny from all
Allow from .example.com
</Location>
As you can see, the /status path does not exist
on the filesystem, but that doesn't matter because
the filesystem isn't consulted for this
request; it's passed on directly to mod_status.
4.1.4 Merging <Directory>, <Location>, and <Files> Sections
When configuring the server,
it's important to understand the order in which the
rules of each section are applied to requests. The order of merging
is:
1. <Directory> (except for regular expressions)
and .htaccess are processed simultaneously, with
the directives in .htaccess overriding
<Directory>.
2. <DirectoryMatch> and <Directory
~ > with regular expressions are processed next.
3. <Files> and
<FilesMatch> are processed simultaneously.
4. <Location> and
<LocationMatch> are processed
simultaneously.
Apart from <Directory>, each group is
processed in the order in which it appears in the configuration
files. <Directory>s (group 1 above) are
processed in order from the shortest directory component to the
longest (e.g., first / and only then
/home/www). If multiple
<Directory> sections apply to the same
directory, they are processed in the configuration file order.
Sections inside <VirtualHost> sections are
applied as if you were running several independent servers. The
directives inside one <VirtualHost> section
do not interact with directives in other
<VirtualHost> sections. They are applied
only after processing any sections outside the virtual host
definition. This allows virtual host configurations to override the
main server configuration.
If there is a conflict, sections found later in the configuration
file override those that come earlier.
4.1.5 Subgrouping of <Directory>, <Location>, and <Files> Sections
Let's say that you want all files to be
handled the same way, except for a few of the files in a specific
directory and its subdirectories. For example, say you want all the
files in /home/httpd/docs to be processed as
plain files, but any files ending with .html and
.txt to be processed by the content handler of
the Apache::Compress module (assuming that you are
already running a mod_perl server):
<Directory /home/httpd/docs>
<FilesMatch "\.(html|txt)$">
SetHandler perl-script
PerlHandler +Apache::Compress
</FilesMatch>
</Directory>
The + before Apache::Compress
tells mod_perl to load the Apache::Compress module
before using it, as we will see later.
Using <FilesMatch>,
it is possible to embed sections inside other sections to create
subgroups that have their own distinct behavior. Alternatively, you
could also use a <Files> section inside an
.htaccess file.
Note that you can't put
<Files> or
<FilesMatch> sections inside a
<Location> section, but you can put them
inside a <Directory> section.
4.1.6 Options Directive Merging
Normally, if multiple Options
directives apply to a directory, the most specific one is taken
completely; the options are not merged.
However, if all the options on the Options
directive are preceded by either a + or
- symbol, the options are merged. Any options
preceded by + are added to the options currently
active, and any options preceded by - are removed.
For example, without any + or -
symbols:
<Directory /home/httpd/docs>
Options Indexes FollowSymLinks
</Directory>
<Directory /home/httpd/docs/shtml>
Options Includes
</Directory>
Indexes and FollowSymLinks will
be set for /home/httpd/docs/, but only
Includes will be set for the
/home/httpd/docs/shtml/ directory. However, if
the second Options directive uses the
+ and - symbols:
<Directory /home/httpd/docs>
Options Indexes FollowSymLinks
</Directory>
<Directory /home/httpd/docs/shtml>
Options +Includes -Indexes
</Directory>
then the options FollowSymLinks and
Includes will be set for the
/home/httpd/docs/shtml/ directory.
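The merging rule can be modeled in a few lines. The sketch below is our own simplification (the function name merge_options is hypothetical): if every option in the new directive carries a + or - prefix, it adjusts the inherited set; otherwise the new directive replaces the inherited set completely.

```shell
# Compute the effective options.
# $1: options inherited from the less specific section (space-separated)
# $2: arguments of the more specific Options directive
merge_options() {
    awk -v cur="$1" -v dir="$2" 'BEGIN {
        n = split(dir, new, " ")
        relative = (n > 0)
        for (i = 1; i <= n; i++)
            if (new[i] !~ /^[+-]/) relative = 0
        if (relative) {
            m = split(cur, out, " ")
            for (i = 1; i <= n; i++) {
                name = substr(new[i], 2)
                if (substr(new[i], 1, 1) == "+") {
                    seen = 0
                    for (j = 1; j <= m; j++) if (out[j] == name) seen = 1
                    if (!seen) out[++m] = name      # add the option
                } else {
                    for (j = 1; j <= m; j++) if (out[j] == name) out[j] = ""
                }
            }
        } else {
            m = split(dir, out, " ")                # replace completely
        }
        sep = ""
        for (j = 1; j <= m; j++)
            if (out[j] != "") { printf "%s%s", sep, out[j]; sep = " " }
        printf "\n"
    }'
}
```

With the two examples above, merge_options "Indexes FollowSymLinks" "Includes" yields Includes, whereas merge_options "Indexes FollowSymLinks" "+Includes -Indexes" yields FollowSymLinks Includes.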
4.1.7 MinSpareServers, MaxSpareServers, StartServers, MaxClients, and MaxRequestsPerChild
MinSpareServers,
MaxSpareServers,
StartServers, and
MaxClients are
standard Apache configuration
directives that control the number of servers being launched at
server startup and kept alive during the server's
operation. When Apache starts, it spawns
StartServers child processes. Apache makes sure
that at any given time there will be at least
MinSpareServers but no more than
MaxSpareServers idle servers. However, the
MinSpareServers rule is completely satisfied only
if the total number of live servers is no bigger than
MaxClients.
MaxRequestsPerChild lets you specify the maximum
number of requests to be served by each child. When a process has
served MaxRequestsPerChild requests, the parent
kills it and replaces it with a new one. There may also be other
reasons why a child is killed, so each child will not necessarily
serve this many requests; however, each child will not be allowed to
serve more than this number of requests. This feature is handy to
gain more control of the server, and especially to avoid child
processes growing too big (RAM-wise) under mod_perl.
These five directives are very important for getting the best
performance out of your server. The process of tuning these variables
is described in great detail in Chapter 11.
4.2 mod_perl Configuration
When you
have
tested that the Apache server works on your machine,
it's time to configure the mod_perl part. Although
some of the configuration directives are already familiar to you,
mod_perl introduces a few new ones.
It's a good idea to keep all mod_perl-related
configuration at the end of the configuration file, after the native
Apache configuration directives, thus avoiding any confusion.
To ease maintenance and to simplify multiple-server installations,
the mod_perl-enabled Apache server configuration system provides
several alternative ways to keep your configuration directives in
separate places. The
Include directive in
httpd.conf lets you include the contents of
other files, just as if the information were all contained in
httpd.conf. This is a
feature of Apache itself. For example,
placing all mod_perl-related configuration in a separate file named
conf/mod_perl.conf can be done by adding the
following directive to httpd.conf:
Include conf/mod_perl.conf
If you want to include this configuration conditionally, depending on
whether your Apache has been compiled with mod_perl, you can use the
IfModule directive:
<IfModule mod_perl.c>
Include conf/mod_perl.conf
</IfModule>
mod_perl adds two more directives. <Perl>
sections allow you to execute Perl code from within any configuration
file at server startup time. Additionally, any file containing a Perl
program can be executed at server startup simply by using the
PerlRequire or PerlModule
directives, as we will show shortly.
4.2.1 Alias Configurations
For many reasons, a server can never
allow access to its entire directory hierarchy. Although there is
really no indication of this given to the web browser, every path
given in a requested URI is therefore a virtual path; early in the
processing of a request, the virtual path given in the request must
be translated to a path relative to the filesystem root, so that
Apache can determine what resource is really being requested. This
path can be considered to be a physical path, although it may not
physically exist.
For instance, in mod_perl systems, you may
intend that the translated path does not
physically exist, because your module responds when it sees a request
for this non-existent path by sending a virtual document. It creates
the document on the fly, specifically for that request, and the
document then vanishes. Many of the documents you see on the Web (for
example, most documents that change their appearance depending on
what the browser asks for) do not physically exist. This is one of
the most important features of the Web, and one of the great powers
of mod_perl is that it allows you complete flexibility to create
virtual documents.
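As a minimal sketch of such a virtual document, the following response handler builds its output on the fly; no file backs the requested URI. The module name Book::VirtualDoc and the /virtual location are assumptions made for illustration, and the code targets the mod_perl 1.x API used throughout this chapter.

```perl
# Assumed httpd.conf fragment (no Alias is needed, since no file is served):
#
#   PerlModule Book::VirtualDoc
#   <Location /virtual>
#       SetHandler perl-script
#       PerlHandler Book::VirtualDoc
#   </Location>

package Book::VirtualDoc;    # hypothetical module name
use strict;
use Apache::Constants qw(OK);

sub handler {
    my $r = shift;

    # The document exists only for the duration of this request.
    $r->send_http_header('text/plain');
    $r->print("Generated at ", scalar(localtime),
              " for ", $r->uri, "\n");
    return OK;
}

1;
```

Any URI beneath /virtual would then receive a freshly generated response, even though nothing exists under that path in the filesystem.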
The ScriptAlias and Alias
directives provide a mapping of a URI to a filesystem directory. The
directive:
Alias /foo /home/httpd/foo
will map all requests starting with /foo to the
files starting with /home/httpd/foo/. So when
Apache receives a request to
http://www.example.com/foo/test.pl, the server
will map it to the file test.pl in the directory
/home/httpd/foo/.
Additionally, ScriptAlias assigns all the requests
that match the specified URI (i.e., /cgi-bin) to
be executed by mod_cgi.
ScriptAlias /cgi-bin /home/httpd/cgi-bin
is actually the same as:
Alias /cgi-bin /home/httpd/cgi-bin
<Location /cgi-bin>
SetHandler cgi-script
Options +ExecCGI
</Location>
where the SetHandler directive invokes mod_cgi. You
shouldn't use the ScriptAlias
directive unless you want the request to be processed under mod_cgi.
Therefore, when configuring mod_perl sections, use
Alias instead.
Under mod_perl, the Alias directive will be
followed by a section with at least two directives. The first is the
SetHandler perl-script directive, which tells Apache to
invoke mod_perl to run the script. The second directive (for example,
PerlHandler) tells mod_perl which handler (Perl
module) the script should be run under, and hence for which phase of
the request. Later in this chapter, we discuss the available
Perl*Handlers for
the various request phases. A typical mod_perl configuration that
will execute the Perl scripts under the
Apache::Registry handler looks like this:
Alias /perl/ /home/httpd/perl/
<Location /perl>
SetHandler perl-script
PerlHandler Apache::Registry
Options +ExecCGI
</Location>
The last directive tells Apache to execute the file as a program,
rather than return it as plain text.
When you have decided which methods to use to run your scripts and
where you will keep them, you can add the configuration directive(s)
to httpd.conf. They will look like those below,
but they will of course reflect the locations of your scripts in your
filesystem and the decisions you have made about how to run the
scripts:
ScriptAlias /cgi-bin/ /home/httpd/cgi-bin/
Alias /perl/ /home/httpd/perl/
<Location /perl>
SetHandler perl-script
PerlHandler Apache::Registry
Options +ExecCGI
</Location>
In the examples above, all requests issued for URIs starting with
/cgi-bin will be served from the directory
/home/httpd/cgi-bin/, and those starting with
/perl will be served from the directory
/home/httpd/perl/.
4.2.1.1 Running scripts located in the same directory under different handlers
Sometimes you will want to map the same
directory to a few different locations and execute each file
according to the way it was requested. For example, in the following
configuration:
# Typical for plain cgi scripts:
ScriptAlias /cgi-bin/ /home/httpd/perl/
# Typical for Apache::Registry scripts:
Alias /perl/ /home/httpd/perl/
# Typical for Apache::PerlRun scripts:
Alias /cgi-perl/ /home/httpd/perl/
<Location /perl/>
SetHandler perl-script
PerlHandler Apache::Registry
Options +ExecCGI
</Location>
<Location /cgi-perl/>
SetHandler perl-script
PerlHandler Apache::PerlRun
Options +ExecCGI
</Location>
the following three URIs:
http://www.example.com/perl/test.pl
http://www.example.com/cgi-bin/test.pl
http://www.example.com/cgi-perl/test.pl
are all mapped to the same file,
/home/httpd/perl/test.pl. If
test.pl is invoked with the URI prefix
/perl, it will be executed under the
Apache::Registry handler. If the prefix is
/cgi-bin, it will be executed under mod_cgi, and
if the prefix is /cgi-perl, it will be executed
under the Apache::PerlRun handler.
This means that we can have all our CGI scripts located at the same
place in the filesystem and call the script in any of three ways
simply by changing one component of the URI
(cgi-bin|perl|cgi-perl).
This technique makes it easy to migrate your scripts to mod_perl. If
your script does not seem to work while running under mod_perl, in
most cases you can easily call the script in straight mod_cgi mode or
under Apache::PerlRun without making any script
changes. Simply change the URL you use to invoke it.
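To check which environment actually served a request, a small test script can report it. The sketch below assumes that mod_perl sets the MOD_PERL environment variable, that mod_cgi sets GATEWAY_INTERFACE, and that PerlSendHeader On is set for the mod_perl locations so that the printed header line is parsed; the subroutine name execution_mode is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Describe the execution environment from a list of environment
# key/value pairs (pass %ENV in real use).
sub execution_mode {
    my %env = @_;
    return "mod_perl ($env{MOD_PERL})" if $env{MOD_PERL};
    return "plain CGI"                 if $env{GATEWAY_INTERFACE};
    return "command line";
}

print "Content-type: text/plain\n\n";
print "Running under: ", execution_mode(%ENV), "\n";
```

Saved as test.pl in the shared directory, the script can be requested through each of the three URI prefixes. Note that it distinguishes mod_perl from mod_cgi only; it does not tell Apache::Registry and Apache::PerlRun apart.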
Although in the configuration above we have configured all three
Aliases to point to the same directory within our
filesystem, you can of course have them point to different
directories if you prefer.
This should just be a migration strategy, though. In general,
it's a bad idea to run scripts in plain mod_cgi mode
from a mod_perl-enabled server; the extra resource consumption
is wasteful. It is better to run these on a plain Apache server.
4.2.2 <Location /perl> Sections
The <Location>
section assigns a number of rules that the server follows when the
request's URI matches the location. Just as it is a
widely accepted convention to use /cgi-bin for
mod_cgi scripts, it is habitual to use /perl as
the base URI of the Perl scripts running under mod_perl.
Let's review the following very widely used
<Location> section:
Alias /perl/ /home/httpd/perl/
PerlModule Apache::Registry
<Location /perl>
SetHandler perl-script
PerlHandler Apache::Registry
Options +ExecCGI
Allow from all
PerlSendHeader On
</Location>
This configuration causes all requests for URIs starting with
/perl to be handled by the mod_perl Apache
module with the handler from the Apache::Registry
Perl module.
Remember the Alias from the previous section? We
use the same Alias here. If you use a
<Location> that does not have the same
Alias, the server will fail to locate the script
in the filesystem. You need the Alias setting only
if the code that should be executed is located in a file.
Alias just provides the URI-to-filepath
translation rule.
Sometimes there is no script to be executed. Instead, a method in a
module is being executed, as with /perl-status,
the code for which is stored in an Apache module. In such cases, you
don't need Alias settings for
these <Location>s.
PerlModule is equivalent to
Perl's native use( ) function
call. We use it to load the Apache::Registry
module, later used as a handler in the
<Location> section.
Now let's go through the directives inside the
<Location> section:
- SetHandler perl-script
-
The SetHandler directive assigns the mod_perl
Apache module to handle the content generation phase.
- PerlHandler Apache::Registry
-
The PerlHandler directive tells mod_perl to use
the Apache::Registry Perl module for the actual
content generation.
- Options +ExecCGI
-
Options +ExecCGI ordinarily tells Apache that
it's OK for the directory to contain CGI scripts. In
this case, the flag is required by
Apache::Registry to confirm that you really know
what you're doing. Additionally, all scripts located
in directories handled by Apache::Registry must be
executable, another check against wayward non-script files getting
left in the directory accidentally. If you omit this option, the
script either will be rendered as plain text or will trigger a Save
As dialog, depending on the client.
- Allow from all
-
The Allow directive is used to set access control
based on the client's domain or IP address. The
from all setting allows any client to run the
script.
- PerlSendHeader On
-
The PerlSendHeader On line tells mod_perl to
intercept anything that looks like a header line (such as
Content-Type: text/html) and automatically turn it
into a correctly formatted HTTP header the way mod_cgi does. This
lets you write scripts without bothering to call the request
object's send_http_header( )
method, but it adds a small overhead because of the special handling.
If you use CGI.pm's
header( ) function to generate HTTP headers, you
do not need to activate this directive, because
CGI.pm detects that it's running
under mod_perl and calls send_http_header( ) for
you.
You will want to set PerlSendHeader Off for
non-parsed header (nph) scripts and generate
all the HTTP headers yourself. This is also true for mod_perl
handlers that send headers with the send_http_header(
) method, because having PerlSendHeader
On as a server-wide configuration option might be a
performance hit.
- </Location>
-
</Location> closes the
<Location> section definition.
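With PerlSendHeader Off, a script has to send the HTTP header itself. The following is a minimal sketch, assuming the script runs under Apache::Registry, which passes the request object as the first argument; the filename is hypothetical.

```perl
# /home/httpd/perl/hello.pl (hypothetical script name)
use strict;

my $r = shift;    # Apache request object supplied by Apache::Registry

# Set the content type and send the HTTP header explicitly,
# instead of printing a "Content-type:" line for mod_perl to parse.
$r->content_type('text/plain');
$r->send_http_header;

$r->print("Hello from mod_perl\n");
```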
Suppose you have:
<Location /foo>
SetHandler perl-script
PerlHandler Book::Module
</Location>
To remove a mod_perl handler setting from a location beneath a
location where a handler is set (e.g.,
/foo/bar), just reset the handler like this:
<Location /foo/bar>
SetHandler default-handler
</Location>
Now all requests starting with /foo/bar will be
served by Apache's default handler, which serves the
content directly.
4.2.3 PerlModule and PerlRequire
As we
saw earlier, a module should be loaded
before its handler can be used.
PerlModule and
PerlRequire are the two mod_perl directives that
are used to load modules and code. They are almost equivalent to
Perl's use( ) and
require( ) functions (respectively) and are called
from the Apache configuration file. You can pass one or more module
names as arguments to PerlModule:
PerlModule Apache::DBI CGI DBD::mysql
Generally, modules are preloaded from the startup script, which is
usually called startup.pl. This is a file
containing Perl code that is executed through the
PerlRequire directive. For example:
PerlRequire /home/httpd/perl/lib/startup.pl
A PerlRequire filename can be absolute or relative
to the ServerRoot or to a path in
@INC.
As with any file with Perl code that gets use( )d
or require( )d, it must return a true value. To
ensure that this happens, don't forget to add
1; at the end of startup.pl.
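A startup file along these lines is a reasonable starting point. This is only a sketch: the library path and the choice of preloaded modules are assumptions and should be adapted to your installation.

```perl
# /home/httpd/perl/lib/startup.pl
use strict;

# Assumed location of locally installed modules; adjust to your setup.
use lib qw(/home/httpd/perl/lib);

# Preload modules once, in the parent process. The empty import lists
# avoid importing symbols that the startup file itself does not need.
use Apache::Registry ();
use Apache::DBI ();
use CGI ();

1;    # a startup file must return a true value
```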
4.2.4 Perl*Handlers
As mentioned in Chapter 1, Apache specifies 11 phases of
the request loop. In order of processing,
they are: post-read-request, URI
translation, header parsing,
access control,
authentication,
authorization, MIME type
checking, fixup,
response (also known as the content handling
phase), logging, and finally
cleanup. These are the stages of a request where
the Apache API allows a module to step in and do something. mod_perl
provides dedicated configuration directives for each of these stages:
PerlPostReadRequestHandler
PerlInitHandler
PerlTransHandler
PerlHeaderParserHandler
PerlAccessHandler
PerlAuthenHandler
PerlAuthzHandler
PerlTypeHandler
PerlFixupHandler
PerlHandler
PerlLogHandler
PerlCleanupHandler
These configuration directives usually are referred to as
Perl*Handler directives. The *
in Perl*Handler is a placeholder to be replaced by
something that identifies the phase to be handled. For example,
PerlLogHandler is the Perl handler that (fairly
obviously) handles the logging phase.
In addition, mod_perl adds a few more stages that happen outside the
request loop:
- PerlChildInitHandler
-
Allows your modules to initialize data structures during the startup
of the child process.
- PerlChildExitHandler
-
Allows your modules to clean up during the child process shutdown.
PerlChildInitHandler and
PerlChildExitHandler might be used, for example,
to allocate and deallocate system resources, pre-open and close
database connections, etc. They do not refer to parts of the request
loop.
- PerlRestartHandler
-
Allows you to specify a routine that is called when the server is
restarted. Since Apache always restarts itself immediately after it
starts, this is a good phase for doing various initializations just
before the child processes are spawned.
- PerlDispatchHandler
-
Can be used to take over the process of loading and executing handler
code. Instead of processing the Perl*Handler
directives directly, mod_perl will invoke the routine pointed to by
PerlDispatchHandler and pass it the Apache request
object and a second argument indicating the handler that would
ordinarily be invoked to process this phase. So for example, you can
write a PerlDispatchHandler handler with a logic
that will allow only specific code to be executed.
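As an illustration of a handler that runs outside the request loop, the sketch below reseeds Perl's random number generator in each child as it starts. The module name Book::ChildInit is hypothetical, and the example assumes that mod_perl was built with child init handlers enabled (for example, with EVERYTHING=1).

```perl
# Assumed httpd.conf fragment (outside any configuration section):
#
#   PerlModule Book::ChildInit
#   PerlChildInitHandler Book::ChildInit

package Book::ChildInit;    # hypothetical module name
use strict;
use Apache::Constants qw(OK);

sub handler {
    # Children forked from the same parent inherit the same random
    # seed, so give every child its own.
    srand(time ^ $$);

    # warn() output ends up in the server's error_log.
    warn "child $$ initialized\n";
    return OK;
}

1;
```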
Since most mod_perl applications need to handle only the response
phase, in the default compilation, most of the
Perl*Handlers are disabled. During the
perl Makefile.PL mod_perl build stage, you must
specify whether or not you will want to handle parts of the request
loop other than the usual content generation phase. If this is the
case, you need to specify which phases, or build mod_perl with the
option EVERYTHING=1, which enables them all. All
the build options are covered in detail in Chapter 3.
Note that it is mod_perl that recognizes these directives, not
Apache. They are mod_perl directives, and an ordinary Apache server
will not recognize them. If you get error messages about these
directives being "perhaps
mis-spelled," it is a sure sign that the appropriate
part of mod_perl (or the entire mod_perl module!) is missing from
your server.
All <Location>,
<Directory>, and
<Files> sections contain a physical path
specification. Like PerlChildInitHandler and
PerlChildExitHandler, the directives
PerlPostReadRequestHandler and
PerlTransHandler cannot be used in these sections,
nor in .htaccess files, because the path
translation isn't completed and a physical path
isn't known until the end of the translation
(PerlTransHandler) phase.
PerlInitHandler is more of an alias; its behavior
changes depending on where it is used. In any case, it is the first
handler to be invoked when serving a request. If found outside any
<Location>,
<Directory>, or
<Files> section, it is an alias for
PerlPostReadRequestHandler. When inside any such
section, it is an alias for
PerlHeaderParserHandler.
Starting with the header parsing phase, the
requested URI has been mapped to a physical server pathname, and thus
PerlHeaderParserHandler can be used to match a
<Location>,
<Directory>, or
<Files> configuration section, or to process
an .htaccess file if such a file exists in the
specified directory in the translated path.
PerlDispatchHandler,
PerlCleanupHandler, and
PerlRestartHandler do not correspond to parts of
the Apache API, but allow you to fine-tune the mod_perl API. They are
specified outside configuration sections.
The Apache documentation and the book Writing Apache
Modules with Perl and C (O'Reilly)
provide in-depth information on the request phases.
4.2.5 The handler( ) Subroutine
By default, the mod_perl API expects a subroutine named
handler( ) to handle the request in the registered
Perl*Handler module. Thus, if your module
implements this subroutine, you can register the handler with
mod_perl by just specifying the module name. For example, to set the
PerlHandler to
Apache::Foo::handler, the following setting would
be sufficient:
PerlHandler Apache::Foo
mod_perl will load the specified module for you when it is first
used. Please note that this approach will not preload the module at
startup. To make sure it gets preloaded, you have three options:
- You can explicitly preload it with the PerlModule directive:
PerlModule Apache::Foo
- You can preload it in the startup file:
use Apache::Foo ( );
- You can use a nice shortcut provided by the Perl*Handler syntax:
PerlHandler +Apache::Foo
Note the leading + character. This directive is
equivalent to:
PerlModule Apache::Foo
<Location ..>
...
PerlHandler Apache::Foo
</Location>
If you decide to give the handler routine a name other than
handler( ) (for example, my_handler(
)), you must preload the module and explicitly give the
name of the handler subroutine:
PerlModule Apache::Foo
<Location ..>
...
PerlHandler Apache::Foo::my_handler
</Location>
This configuration will preload the module at server startup.
If a module needs to know which handler is currently being run, it
can find out with the current_callback( ) method.
This method is most useful to PerlDispatchHandlers
that take action for certain phases only.
if ($r->current_callback eq "PerlLogHandler") {
    $r->warn("Logging request");
}
4.2.6 Investigating the Request Phases
Imagine a complex server setup in which many
different Perl and non-Perl handlers participate in the request
processing, and one or more of these handlers misbehaves. A simple
example is one where one of the handlers alters the request record,
which breaks the functionality of other handlers. Or maybe a handler
invoked first for any given phase of the process returns an
unexpected OK status, thus preventing other
handlers from doing their job. You can't just add
debug statements to trace the offender; there are too many
handlers involved.
The simplest solution is to get a trace of all registered handlers
for each phase, stating whether they were invoked and what their
return statuses were. Once such a trace is available,
it's much easier to look only at the players that
actually participated, thus narrowing the search down to a
potentially misbehaving module.
The
Apache::ShowRequest
module shows the phases the request goes through, displaying module
participation and response codes for each phase. The content response
phase is not run, but possible modules are listed as defined. To
configure it, just add this snippet to
httpd.conf:
<Location /showrequest>
SetHandler perl-script
PerlHandler +Apache::ShowRequest
</Location>
To see what happens when you access some URI, append the URI to
/showrequest.
Apache::ShowRequest uses
PATH_INFO to obtain the URI that should be
executed. So, to run /index.html with
Apache::ShowRequest, issue a request for
/showrequest/index.html. For
/perl/test.pl, issue a request for
/showrequest/perl/test.pl.
This module produces rather lengthy output, so we will show only one
section from the report generated while requesting
/showrequest/index.html:
Running request for /index.html
Request phase: post_read_request
[snip]
Request phase: translate_handler
mod_perl ....................DECLINED
mod_setenvif ................undef
mod_auth ....................undef
mod_access ..................undef
mod_alias ...................DECLINED
mod_userdir .................DECLINED
mod_actions .................undef
mod_imap ....................undef
mod_asis ....................undef
mod_cgi .....................undef
mod_dir .....................undef
mod_autoindex ...............undef
mod_include .................undef
mod_info ....................undef
mod_status ..................undef
mod_negotiation .............undef
mod_mime ....................undef
mod_log_config ..............undef
mod_env .....................undef
http_core ...................OK
Request phase: header_parser
[snip]
Request phase: access_checker
[snip]
Request phase: check_user_id
[snip]
Request phase: auth_checker
[snip]
Request phase: type_checker
[snip]
Request phase: fixer_upper
[snip]
Request phase: response handler (type: text/html)
mod_actions .................defined
mod_include .................defined
http_core ...................defined
Request phase: logger
[snip]
For each stage, we get a report of what modules could participate in
the processing and whether they took any action. As you can see, the
content response phase is not run, but possible modules are listed as
defined. If we run a mod_perl script, the response phase looks like:
Request phase: response handler (type: perl-script)
mod_perl ....................defined
4.2.7 Stacked Handlers
With the mod_perl stacked
handlers mechanism, it is possible for more than one
Perl*Handler to be defined and executed during any
stage of a request.
Perl*Handler directives can define any number of
subroutines. For example:
PerlTransHandler Foo::foo Bar::bar
Foo::foo( ) will be executed first and
Bar::bar( ) second. As always, if the
subroutine's name is handler( ),
you can omit it.
With the Apache->push_handlers( ) method,
callbacks (handlers) can be added to a stack at
runtime by mod_perl modules.
Apache->push_handlers( ) takes the callback
handler name as its first argument and a subroutine name or reference
as its second. For example, let's add two handlers
called my_logger1( ) and my_logger2(
) to be executed during the logging phase:
use Apache::Constants qw(:common);
sub my_logger1 {
    # some code here
    return OK;
}
sub my_logger2 {
    # some other code here
    return OK;
}
Apache->push_handlers("PerlLogHandler", \&my_logger1);
Apache->push_handlers("PerlLogHandler", \&my_logger2);
You can also pass a reference to an anonymous subroutine. For example:
use Apache::Constants qw(:common);
Apache->push_handlers("PerlLogHandler", sub {
    print STDERR "__ANON__ called\n";
    return OK;
});
After each request, this stack is erased.
All handlers will be called in turn, unless a handler returns a
status other than OK or
DECLINED.
To enable this feature, build mod_perl with:
panic% perl Makefile.PL PERL_STACKED_HANDLERS=1 [ ... ]
or:
panic% perl Makefile.PL EVERYTHING=1 [ ... ]
To test whether the version of mod_perl you're
running can stack handlers, use the
Apache->can_stack_handlers method. This method will return a true
value if mod_perl was configured with
PERL_STACKED_HANDLERS=1, and a false value
otherwise.
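A module can use this check to fall back gracefully when stacking is unavailable. The following is a minimal sketch; the module name Book::Cleanup and its cleanup( ) subroutine are hypothetical.

```perl
package Book::Cleanup;    # hypothetical module name
use strict;
use Apache::Constants qw(OK);

sub handler {
    my $r = shift;

    if (Apache->can_stack_handlers) {
        # Register the cleanup code at runtime, for this request only.
        Apache->push_handlers(PerlCleanupHandler => \&cleanup);
    }
    else {
        # Without stacked handlers, the cleanup handler has to be
        # configured in httpd.conf instead.
        $r->log_error("stacked handlers are not available; "
            . "configure PerlCleanupHandler Book::Cleanup::cleanup");
    }

    $r->send_http_header('text/plain');
    $r->print("request served\n");
    return OK;
}

sub cleanup {
    my $r = shift;
    # Release per-request resources here.
    return OK;
}

1;
```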
Let's look at a few real-world examples where this
method is used:
The widely used CGI.pm module maintains a global
object for its plain function interface. Since the object is global,
under mod_perl it does not go out of scope when the request is
completed, and the DESTROY method is never called.
Therefore, CGI->new arranges to call the
following code if it detects that the module is used in the mod_perl
environment:
Apache->push_handlers("PerlCleanupHandler", \&CGI::_reset_globals);
This function is called during the final stage of a request,
resetting CGI.pm's globals before
the next request arrives.
Apache::DCELogin establishes a DCE login context
that must exist for the lifetime of a request, so the
DCE::Login object is stored in a global variable.
Without stacked handlers, users must set the following directive in
the configuration file to destroy the context:
PerlCleanupHandler Apache::DCELogin::purge
This is ugly. With stacked handlers,
Apache::DCELogin::handler can call from within the
code:
Apache->push_handlers("PerlCleanupHandler", \&purge);
Apache::DBI, the persistent database connection
module, can pre-open the connection when the child process starts via
its connect_on_init( ) function. This function
uses push_handlers( ) to add a
PerlChildInitHandler:
Apache->push_handlers(PerlChildInitHandler => \&childinit);
Now when the new process gets the first request, it already has the
database connection open.
Apache::DBI also uses push_handlers(
) to have PerlCleanupHandler handle
rollbacks if its AutoCommit attribute is turned
off.
PerlTransHandlers (e.g.,
Apache::MsqlProxy) may decide, based on the URI or
some arbitrary condition, whether or not to handle a request. Without
stacked handlers, users must configure it themselves:
PerlTransHandler Apache::MsqlProxy::translate
PerlHandler Apache::MsqlProxy
PerlHandler is never actually invoked unless
translate( ) sees that the request is a proxy
request ($r->proxyreq). If it is a proxy
request, translate( ) sets
$r->handler("perl-script"), and only then will
PerlHandler handle the request. Now users do not
have to specify PerlHandler Apache::MsqlProxy,
because the translate( ) function can set it with
push_handlers( ).
Now let's write our own example using stacked
handlers. Imagine that you want to piece together a document that
includes footers, headers, etc. without using SSI. The following
example shows how to implement it. First we prepare the code as shown
in Example 4-1.
Example 4-1. Book/Compose.pm
package Book::Compose;
use Apache::Constants qw(OK);
sub header {
    my $r = shift;
    $r->send_http_header("text/plain");
    $r->print("header text\n");
    return OK;
}
sub body {
    shift->print("body text\n");
    return OK;
}
sub footer {
    shift->print("footer text\n");
    return OK;
}
1;
The code defines the package Book::Compose,
imports the OK constant, and defines three
subroutines: header( ) to send the header,
body( ) to create and send the actual content, and
finally footer( ) to add a standard footer to the
page. At the end of each handler we return OK, so
the next handler, if any, will be executed.
To enable the construction of the page, we now supply the following
configuration:
PerlModule Book::Compose
<Location /compose>
SetHandler perl-script
PerlHandler Book::Compose::header Book::Compose::body Book::Compose::footer
</Location>
We preload the Book::Compose module and construct
the PerlHandler directive by listing the handlers
in the order in which they should be invoked.
Finally, let's look at the technique that allows
parsing the output of another PerlHandler. For
example, suppose your module generates HTML responses, but you want
the same content to be delivered in plain text at a different
location. This is a little trickier, but consider the following:
<Location /perl>
SetHandler perl-script
PerlHandler Book::HTMLContentGenerator
</Location>
<Location /text>
SetHandler perl-script
PerlHandler Book::HTML2TextConvertor Book::HTMLContentGenerator
</Location>
Notice that Book::HTML2TextConvertor is listed
first. While its handler( ) will be called first,
the actual code that does the conversion will run last, as we will
explain in a moment. Now let's look at the sample
code in Example 4-2.
Example 4-2. Book/HTML2TextConvertor.pm
package Book::HTML2TextConvertor;
use Apache::Constants qw(OK);
sub handler {
    my $r = shift;
    untie *STDOUT;
    tie *STDOUT => __PACKAGE__, $r;
    return OK;    # let the next handler in the stack run
}
sub TIEHANDLE {
    my($class, $r) = @_;
    bless { r => $r }, $class;
}
sub PRINT {
    my $self = shift;
    for (@_) {
        # copy it so no 'read-only value modification' will happen
        my $line = $_;
        $line =~ s/<[^>]*>//g; # strip the html <tags>
        $self->{r}->print($line);
    }
}
1;
It untie( )s STDOUT and
re-tie( )s it to its own package, so that content
printed to STDOUT by the previous content
generator in the pipe goes through this module. In the
PRINT( ) method, we attempt to strip the HTML
tags. Of course, this is only an example; correct HTML stripping
actually requires more than one line of code and a quite complex
regular expression, but you get the idea.
4.2.8 Perl Method Handlers
If mod_perl was built with:
panic% perl Makefile.PL PERL_METHOD_HANDLERS=1 [ ... ]
or:
panic% perl Makefile.PL EVERYTHING=1 [ ... ]
it's possible to write method handlers in addition
to function handlers. This is useful when you want to write code that
takes advantage of inheritance. To make the handler act as a method
under mod_perl, use the $$ function prototype in
the handler definition. When mod_perl sees that the handler function
is prototyped with $$, it'll pass
two arguments to it: the calling object or a class, depending on how
it was called, and the Apache request object. So you can write the
handler as:
sub handler ($$) {
my($self, $r) = @_;
# ...
}
The configuration
is almost as usual. Just use the
class name if the default method name handler( )
is used:
PerlHandler Book::SubClass
However, if you choose to use a different method name, the
object-oriented notation should be used:
PerlHandler Book::SubClass->my_handler
The my_handler( ) method will then be called as a
class (static) method.
Also, you can use objects created at startup to call methods. For
example:
<Perl>
use Book::SubClass;
$Book::Global::object = Book::SubClass->new( );
</Perl>
...
PerlHandler $Book::Global::object->my_handler
In this example, the my_handler( ) method will be
called as an instance method on the global object
$Book::Global::object.
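To show why method handlers pay off with inheritance, here is a minimal sketch in which a subclass inherits the handler and overrides only one helper method. Book::BaseClass and the greeting( ) method are hypothetical names, and the example assumes a mod_perl build with PERL_METHOD_HANDLERS=1 and that the inherited handler is resolved through Perl's normal method lookup.

```perl
# Assumed httpd.conf fragment:
#
#   PerlModule Book::SubClass
#   <Location /greet>
#       SetHandler perl-script
#       PerlHandler Book::SubClass->handler
#   </Location>

package Book::BaseClass;    # hypothetical base class
use strict;
use Apache::Constants qw(OK);

# The ($$) prototype marks this subroutine as a method handler.
sub handler ($$) {
    my($class, $r) = @_;
    $r->send_http_header('text/plain');
    $r->print($class->greeting, "\n");
    return OK;
}

sub greeting { "Hello from the base class" }

package Book::SubClass;
use strict;
use vars qw(@ISA);
@ISA = ('Book::BaseClass');

# Only the part that differs is overridden; handler( ) is inherited.
sub greeting { "Hello from the subclass" }

1;
```

Both packages are assumed to live in Book/SubClass.pm so that a single PerlModule directive loads them. Because the handler receives the class as its first argument, the call to greeting( ) is dispatched to the subclass.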
4.2.9 PerlFreshRestart
To reload
PerlRequire,
PerlModule, and other use( )d
modules, and to flush the Apache::Registry cache
on server restart, add this directive to
httpd.conf:
PerlFreshRestart On
You should be careful using this setting. It used to cause trouble in
older versions of mod_perl, and some people still report problems
using it. If you are not sure if it's working
properly, a full stop and restart of the server will suffice.
Starting with mod_perl Version 1.22,
PerlFreshRestart is ignored when mod_perl is
compiled as a DSO. But it almost doesn't matter, as
mod_perl as a DSO will do a full tear-down (calling
perl_destruct( )).
4.2.10 PerlSetEnv and PerlPassEnv
In addition to Apache's
SetEnv and PassEnv directives,
respectively setting and passing shell environment variables,
mod_perl provides its own directives:
PerlSetEnv and
PerlPassEnv.
If you want to globally set an environment variable for the server,
you can use the PerlSetEnv directive. For example,
to configure the mod_perl tracing mechanism (as discussed in Chapter 21), add this to httpd.conf:
PerlSetEnv MOD_PERL_TRACE all
This will enable full mod_perl tracing.
Normally, PATH is the only shell environment
variable available under mod_perl. If you need to rely on other
environment variables, you can have mod_perl make those available for
your code with PerlPassEnv.
For example, to forward the environment variable
HOME (which is usually set to the home directory of the user
who invoked the server), add this to httpd.conf:
PerlPassEnv HOME
Once you set the environment variable, it can be accessed via the
%ENV hash in Perl (e.g.,
$ENV{HOME}).
PerlSetEnv and PerlPassEnv work
just like the Apache equivalents, except that they take effect in the
first phase of the Apache request cycle. The standard Apache
directives SetEnv and PassEnv
don't affect the environment until the fixup phase,
which happens much later, just before content generation. This works
for CGI scripts, which aren't run before then, but
if you need to set some environment variables and access them in a
handler invoked before the response stage, you should use the
mod_perl directives. For example, handlers that want to use an Oracle
relational database during the authentication phase might need to set
the following environment variable (among others) in
httpd.conf:
PerlSetEnv ORACLE_HOME /share/lib/oracle/
Note that PerlSetEnv will override the environment
variables that were available earlier. For example, we have mentioned
that PATH is always supplied by Apache itself. But
if you explicitly set:
PerlSetEnv PATH /tmp
this setting will be used instead of the one set in the shell program.
As with other configuration scoping rules, if you place
PerlSetEnv or PerlPassEnv in
the scope of the configuration file, it will apply everywhere (unless
overridden). If placed into a <Location>
section, or another section in the same group, these directives will
influence only the handlers in that section.
4.2.11 PerlSetVar and PerlAddVar
PerlSetVar is another directive introduced by
mod_perl. It is very similar to PerlSetEnv, but
the key/value pairs are stored in an Apache::Table
object and retrieved using the dir_config( )
method.
There are two ways to use PerlSetVar. The first is
the usual way, as a configuration directive. For example:
PerlSetVar foo bar
The other way is via Perl code in <Perl>
sections:
<Perl>
push @{ $Location{"/"}->{PerlSetVar} }, [ foo => 'bar' ];
</Perl>
Now we can retrieve the value of foo using the
dir_config( ) method:
$foo = $r->dir_config('foo');
Note that you cannot use the following code in
<Perl> sections, which we discuss later in
this chapter:
<Perl>
my %foo = (a => 0, b => 1);
push @{ $Location{"/"}->{PerlSetVar} }, [ foo => \%foo ];
</Perl>
All values are passed to Apache::Table as strings,
so you will get a stringified reference to a hash as a value (such as
"HASH(0x87a5108)"). This cannot be turned back
into the original hash upon retrieval.
However, you can use the
PerlAddVar directive to push more values into
the variable, emulating arrays. For example:
PerlSetVar foo bar
PerlAddVar foo bar1
PerlAddVar foo bar2
or the equivalent:
PerlAddVar foo bar
PerlAddVar foo bar1
PerlAddVar foo bar2
To retrieve the values, use the
$r->dir_config->get( ) method:
my @foo = $r->dir_config->get('foo');
Obviously, you can always turn an array into a hash with Perl, so you
can use this directive to pass hashes as well. Consider this example:
PerlAddVar foo key1
PerlAddVar foo value1
PerlAddVar foo key2
PerlAddVar foo value2
You can then retrieve the hash in this way:
my %foo = $r->dir_config->get('foo');
Make sure that you use an even number of elements if you store the
retrieved values in a hash.
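The even-number requirement can be checked in code before the list is turned into a hash. The following is a minimal sketch in plain Perl, not part of mod_perl: pairs_to_hash( ) is a hypothetical helper, and the literal list stands in for what $r->dir_config->get('foo') would return inside a handler.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a flat key/value list into a hash reference,
# refusing lists with an odd number of elements.
sub pairs_to_hash {
    my @values = @_;
    die "odd number of elements (" . scalar(@values) . ")\n" if @values % 2;
    my %hash = @values;
    return \%hash;
}

# Under mod_perl this list would come from $r->dir_config->get('foo').
my @values = qw(key1 value1 key2 value2);
my $foo    = pairs_to_hash(@values);
print "$_ => $foo->{$_}\n" for sort keys %$foo;
```

Dying on an odd count is only one possible policy; a handler might prefer to log the problem and fall back to defaults.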
Passing a list or a hash via the PerlAddVar
directive in a <Perl> section should be
coded in this way:
<Perl>
my %foo = (a => 0, b => 1);
for (%foo) {
push @{ $Location{"/"}->{PerlAddVar} }, [ foo => $_ ];
}
</Perl>
Now you get back the hash as before:
my %foo = $r->dir_config->get('foo');
This might not seem very practical; if you have more complex needs,
think about having dedicated configuration files.
Customized configuration directives can also be created for the
specific needs of a Perl module. To learn how to create these, please
refer to Chapter 8 of Writing Apache Modules with Perl and
C (O'Reilly), which covers this topic in
great detail.
4.2.12 PerlSetupEnv
Certain Perl modules used in CGI code (such as
CGI.pm) rely on a number of environment variables
that are normally set by mod_cgi. For example, many modules depend on
QUERY_STRING, SCRIPT_FILENAME,
and REQUEST_URI. When the
PerlSetupEnv directive is turned on,
mod_perl provides these environment variables in the same fashion
that mod_cgi does. This directive is On by
default, which means that all the environment variables you are
accustomed to being available under mod_cgi are also available under
mod_perl.
The process of setting these environment variables adds overhead for
each request, whether the variables are needed or not. If you
don't use modules that rely on this behavior, you
can turn it off in the general configuration and then turn it on in
sections that need it (such as legacy CGI scripts):
PerlSetupEnv Off
<Location /perl-run>
SetHandler perl-script
PerlHandler Apache::PerlRun
Options +ExecCGI
PerlSetupEnv On
</Location>
You can use mod_perl methods to access the information provided by
these environment variables (e.g.,
$r->path_info instead of
$ENV{PATH_INFO}). For more details, see the
explanation in Chapter 11.
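As an illustration, here is a minimal sketch of a content handler that reads request data through the request object instead of %ENV, so it keeps working with PerlSetupEnv Off. The package name Book::NoEnv is hypothetical, and the comments naming the corresponding environment variables are approximate. The code runs only inside a mod_perl 1.x server.

```perl
package Book::NoEnv;    # hypothetical module name

use strict;
use Apache::Constants qw(OK);

sub handler {
    my $r = shift;

    $r->send_http_header('text/plain');

    # Request-object methods take the place of CGI-style variables.
    $r->print("path_info: ", $r->path_info, "\n");            # cf. PATH_INFO
    $r->print("args:      ", scalar($r->args) || '', "\n");   # cf. QUERY_STRING
    $r->print("uri:       ", $r->uri, "\n");                  # path part of REQUEST_URI
    $r->print("filename:  ", $r->filename, "\n");             # cf. SCRIPT_FILENAME

    return OK;
}

1;
```

Such a handler would be installed with SetHandler perl-script and PerlHandler Book::NoEnv in a <Location> section where PerlSetupEnv is Off.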
4.2.13 PerlWarn and PerlTaintCheck
PerlWarn and
PerlTaintCheck have two possible values,
On and Off.
PerlWarn turns warnings on and off globally for the
whole server, and PerlTaintCheck controls whether
the server is running with taint checking or not. These two directives
are also explained in Chapter 6.
4.3 The Startup File
At server startup, before child processes are
spawned, you can do much more than just preload modules. You might
want to register code that will initialize a database connection for
each child when it is forked, tie read-only DBM files, fill in shared
caches, etc.
The startup.pl file is an ideal place to put
code that should be executed when the server starts. Once you have
prepared the code, load it in httpd.conf before
other mod_perl configuration directives with the
PerlRequire directive:
PerlRequire /home/httpd/perl/lib/startup.pl
Be careful with the startup file. Everything run at server
initialization is run with root privileges if
you start the server as root (which you have to
do unless you choose to run the server on an unprivileged port,
numbered 1024 or higher). This means that anyone who has write access
to a script or module that is loaded by
PerlModule, PerlRequire, or
<Perl> sections effectively has
root access to the system.
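One way to reduce this risk is to check file permissions before starting the server. The sketch below is an assumption-laden illustration rather than a complete audit: loosely_writable( ) is a hypothetical helper that looks only at the group and other write bits of the path itself, and it ignores ownership and parent directories.

```perl
use strict;
use warnings;

# Hypothetical helper: true if group or others may write to the path.
sub loosely_writable {
    my $path = shift;
    my @st = stat $path or return 0;    # paths that cannot be stat'ed are skipped
    return ($st[2] & 022) ? 1 : 0;      # 020 = group write, 002 = other write
}

# Check any files named on the command line (for example, the startup
# file) plus every directory in @INC.
my @paths = (@ARGV, grep { !ref } @INC);
for my $path (@paths) {
    print "WARNING: $path is writable by group or others\n"
        if loosely_writable($path);
}
```

It could be run as root with the startup file as its argument; anything it reports deserves a closer look before the server is started.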
4.3.1 A Sample Startup File
Let's look at a
real-world startup file. The elements of the file are shown here,
followed by their descriptions.
use strict;
This pragma is worth using in every script longer than half a dozen
lines. It will save a lot of time and debugging later.
use lib qw(/home/httpd/lib /home/httpd/extra-lib);
This permanently adds extra directories to @INC,
something that's possible only during server
startup. At the end of each request's processing,
mod_perl resets @INC to the value it had after the
server startup. Alternatively, you can use the
PERL5LIB environment variable to add extra
directories to @INC.
$ENV{MOD_PERL} or die "not running under mod_perl!";
This is a sanity check. If mod_perl wasn't properly
built, the server startup is aborted.
use Apache::Registry ( );
use LWP::UserAgent ( );
use Apache::DBI ( );
use DBI ( );
Preload the
modules that get used by Perl code serving
requests. Unless you need the symbols (variables and subroutines)
exported by preloaded modules to accomplish something within the
startup file, don't import
them; it's just a waste of startup time and
memory. Instead, use the empty import list ( ) to
tell the import( ) function not to import anything.
use Carp ( );
$SIG{__WARN__} = \&Carp::cluck;
This is a useful snippet to enable extended warnings logged in the
error_log file. In addition to basic warnings, a
trace of calls is added. This makes tracking potential problems a
much easier task, since you know who called what.
The only drawback of this method is that it globally overrides the
default warning handler behavior; thus, in some places it might
be desirable to change the settings locally (for example, with
local $^W=0, or no warnings
under Perl 5.6.0 and higher). Usually warnings are turned off on
production machines to prevent unnecessary clogging of the
error_log file if your code is not very clean.
Hence, this method is mostly useful in a development environment.
use CGI ( );
CGI->compile(':all');
Some modules, such as CGI.pm, create their
subroutines at runtime via AUTOLOAD to improve
their loading time. This helps when the module includes many
subroutines but only a few are actually used. (Also refer to the
AutoSplit manpage.) Since the module is loaded
only once with mod_perl, it might be a good idea to precompile all or
some of its methods at server startup. This avoids the overhead of
compilation at runtime. It also helps share more compiled code
between child processes.
CGI.pm's compile( )
method performs this task. Note that
compile( ) is specific to
CGI.pm; other
modules that implement this feature may use another name for the
compilation method.
As with all modules we preload in the startup file, we
don't import symbols from them because they will be
lost when they go out of the file's scope.
The following code snippet makes sure that when the child process is
spawned, a connection to the database is opened automatically,
avoiding this performance hit on the first request:
Apache::DBI->connect_on_init
("DBI:mysql:database=test;host=localhost",
"user", "password", {
PrintError => 1, # warn( ) on errors
RaiseError => 0, # don't die on error
AutoCommit => 1, # commit executes immediately
}
);
We discuss this method in detail in Chapter 20.
The file ends with 1; so it can be successfully
loaded by Perl.
The entire
startup.pl file is shown
in Example 4-3.
Example 4-3. startup.pl
use strict;
use lib qw(/home/httpd/lib /home/httpd/extra-lib);
$ENV{MOD_PERL} or die "not running under mod_perl!";
use Apache::Registry ( );
use LWP::UserAgent ( );
use Apache::DBI ( );
use DBI ( );
use Carp ( );
$SIG{__WARN__} = \&Carp::cluck;
use CGI ( );
CGI->compile(':all');
Apache::DBI->connect_on_init
("DBI:mysql:database=test;host=localhost",
"user", "password", {
PrintError => 1, # warn( ) on errors
RaiseError => 0, # don't die on error
AutoCommit => 1, # commit executes immediately
}
);
1;
4.3.2 Syntax Validation
If the startup file doesn't include any modules that require the
mod_perl runtime environment during their loading, you can validate
its syntax with:
panic% perl -cw /home/httpd/perl/lib/startup.pl
The -c switch tells Perl to validate only the
file's syntax, and the -w
switch enables warnings.
Apache::DBI is an example of a module that cannot
be loaded outside of the mod_perl environment. If you try to load it,
you will get the following error message:
panic% perl -MApache::DBI -c -e 1
Can't locate object method "module" via package "Apache"
(perhaps you forgot to load "Apache"?) at
/usr/lib/perl5/site_perl/5.6.1/Apache/DBI.pm line 202.
Compilation failed in require.
BEGIN failed--compilation aborted.
However, Apache::DBI will work perfectly once
loaded from within mod_perl.
4.3.3 What Modules Should Be Added to the Startup File
Every module loaded at server startup will be shared
among the server children, saving a lot of RAM on your machine.
Usually, we put most of the code we develop into modules and preload
them.
You can even preload CGI scripts with
Apache::RegistryLoader, as explained in Chapter 10.
4.3.4 The Confusion with use( ) in the Server Startup File
Some people
wonder why they need to duplicate use Modulename
in the startup file and in the script itself. The confusion arises
due to misunderstanding use( ).
Let's take the POSIX module as an
example. When you write:
use POSIX qw(setsid);
use( ) internally performs two operations:
BEGIN {
require POSIX;
POSIX->import(qw(setsid));
}
The first operation loads and compiles the module. The second calls
the module's import( ) method and
specifies to import the symbol setsid into the
caller's namespace. The BEGIN
block makes sure that the code is executed as soon as possible,
before the rest of the code is even parsed. POSIX,
like many other modules, specifies a default export list. This is an
especially extensive list, so when you call:
use POSIX;
about 500 KB worth of symbols gets imported.
Usually, we don't need POSIX or
its symbols in the startup file; all we want is to preload it.
Therefore, we use an empty list as an argument for
use( ):
use POSIX ( );
so the POSIX::import( ) method
won't even be called.
When we want to use the POSIX module in the code,
we use( ) it again, but this time no loading
overhead occurs because the module has been loaded already. If we
want to import something from the module, we supply the list of
symbols to import:
use POSIX qw(:fcntl_h);
This example imports the constants defined in fcntl.h, such as the
O_* flags used with sysopen( ).
Technically, you aren't required to supply the
use( ) statement in your handler code if the
module has already been loaded during server startup or elsewhere.
When writing your code, however, don't assume that
the module code has been preloaded. Someday in the future, you or
someone else will revisit this code and will not understand how it is
possible to use a module's methods without first
loading the module itself.
Please refer to the Exporter and
perlmod manpages, and to the section on
use( ) in the perlfunc
manpage for more information about import( ).
Remember that you can always use require( ) to
preload modules at server startup when you don't need to import
anything, because:
require Data::Dumper;
is the same as:
use Data::Dumper ( );
except that it's not executed at compile-time.
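The difference between loading and importing can be observed in a small standalone script. This is a sketch for experimentation only; the imported( ) helper is hypothetical and simply checks whether a subroutine of the given name exists in package main.

```perl
use strict;
use warnings;

# Load POSIX at compile time, but import nothing.
use POSIX ();

# Hypothetical helper: true if a sub of this name exists in package main.
sub imported {
    my $name = shift;
    no strict 'refs';
    return defined &{"main::$name"} ? 1 : 0;
}

my $before = imported('floor');      # nothing has been imported yet
my $value  = POSIX::floor(3.7);      # the fully qualified name still works

# The two steps that "use POSIX qw(floor)" performs, done at runtime:
require POSIX;                       # no reload; POSIX is already in %INC
POSIX->import('floor');
my $after = imported('floor');

print "before=$before value=$value after=$after\n";
```

The script prints before=0 value=3 after=1: the module's code was available from the first line, and only the import step added floor( ) to the caller's namespace.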
4.4 Apache Configuration in Perl
With <Perl> ...
</Perl> sections,
you can
configure your server entirely in Perl. It's
probably not worth it if you have simple configuration files, but if
you run many virtual hosts or have complicated setups for any other
reason, <Perl> sections become very handy.
With <Perl> sections you can easily create
the configuration on the fly, thus reducing duplication and easing
maintenance.
To enable <Perl> sections, build mod_perl
with:
panic% perl Makefile.PL PERL_SECTIONS=1 [ ... ]
or with EVERYTHING=1.
4.4.1 Constructing <Perl> Sections
<Perl> sections can contain
any and as much Perl code as you wish.
<Perl> sections are compiled into a special
package called Apache::ReadConfig. mod_perl looks
through the symbol table for Apache::ReadConfig
for Perl variables and structures to grind through the Apache core
configuration gears. Most of the configuration directives can be
represented as scalars ($scalar) or arrays
(@array). A few directives become hashes.
How do you know which Perl global variables to use? Just take the
Apache directive name and prepend either $,
@, or % (as shown in the
following examples), depending on what the directive accepts. If you
misspell the directive, it is silently ignored, so
it's a good idea to check your settings.
Since Apache directives are case-insensitive, their Perl equivalents
are case-insensitive as well. The following statements are
equivalent:
$User = 'stas';
$user = 'stas'; # the same
Let's look at all possible cases we might encounter
while configuring Apache in Perl:
Directives that accept zero or one argument are represented as
scalars. For example, CacheNegotiatedDocs is a
directive with no arguments. In Perl, we just assign it an empty
string:
<Perl>
$CacheNegotiatedDocs = '';
</Perl>
Directives that accept a single value are simple to handle. For
example, to configure Apache so that child processes run as user
httpd and group httpd, use:
User httpd
Group httpd
What if we don't want user and group definitions to
be hardcoded? Instead, what if we want to define them on the fly
using the user and group with which the server is started? This is
easily done with <Perl> sections:
<Perl>
$User = getpwuid($>) || $>;
$Group = getgrgid($)) || $);
</Perl>
We use the power of the Perl API to retrieve the data on the fly.
$User is set to the name of the effective user ID
with which the server was started or, if the name is not defined, the
numeric user ID. Similarly, $Group is set to
either the symbolic value of the effective group ID or the numeric
group ID.
Notice that we've just taken the Apache directives
and prepended a $, as they represent scalars.
Directives that accept more than one argument are represented as
arrays or as a space-delimited string. For example, this directive:
PerlModule Mail::Send Devel::Peek
becomes:
<Perl>
@PerlModule = qw(Mail::Send Devel::Peek);
</Perl>
@PerlModule is an array variable, and we assign it
a list of modules. Alternatively, we can use the scalar notation and
pass all the arguments as a space-delimited string:
<Perl>
$PerlModule = "Mail::Send Devel::Peek";
</Perl>
Directives that can be repeated more than once with different values
are represented as arrays of arrays. For example, this configuration:
AddEncoding x-compress Z
AddEncoding x-gzip gz tgz
becomes:
<Perl>
@AddEncoding = (
['x-compress' => qw(Z)],
['x-gzip' => qw(gz tgz)],
);
</Perl>
Directives that implement a container block, with beginning and
ending delimiters such as <Location> ...
</Location>, are represented as Perl hashes.
In these hashes, the keys are the arguments of the opening directive,
and the values are the contents of the block. For example:
Alias /private /home/httpd/docs/private
<Location /private>
DirectoryIndex index.html index.htm
AuthType Basic
AuthName "Private Area"
AuthUserFile /home/httpd/docs/private/.htpasswd
Require valid-user
</Location>
These settings tell Apache that URIs starting with
/private are mapped to the physical directory
/home/httpd/docs/private/ and will be processed
according to the following rules:
The users are to be authenticated using basic authentication.
"Private Area" will be used as the title of the
pop-up box displaying the login and password entry form.
Only valid users listed in the password file
/home/httpd/docs/private/.htpasswd and who
provide a valid password may access the resources under
/private/.
If the filename is not provided, Apache will attempt to respond with
the index.html or index.htm
directory index file, if found.
Now let's see the equivalent
<Perl> section:
<Perl>
push @Alias, qw(/private /home/httpd/docs/private);
$Location{"/private"} = {
DirectoryIndex => [qw(index.html index.htm)],
AuthType => 'Basic',
AuthName => '"Private Area"',
AuthUserFile => '/home/httpd/docs/private/.htpasswd',
Require => 'valid-user',
};
</Perl>
First, we convert the Alias directive into an
array @Alias. Instead of assigning, however, we
push the values at the end. We do this because it's
possible that we have assigned values earlier, and we
don't want to overwrite them. Alternatively, you may
want to push references to lists, like this:
push @Alias, [qw(/private /home/httpd/docs/private)];
Second, we convert the Location block, using
/private as a key to the hash
%Location and the rest of the block as its value.
When the structures are nested, the normal Perl rules
apply; that is, arrays and hashes turn into references.
Therefore, DirectoryIndex points to an array
reference. As shown earlier, we can always replace this array with a
space-delimited string:
$Location{"/private"} = {
DirectoryIndex => 'index.html index.htm',
...
};
Also notice how we specify the value of the
AuthName attribute:
AuthName => '"Private Area"',
The value is quoted twice because Apache expects a single value for
this argument, and if we write:
AuthName => 'Private Area',
<Perl> will pass two values to Apache,
"Private" and
"Area", and Apache will refuse to
start, with the following complaint:
[Thu May 16 17:01:20 2002] [error] <Perl>: AuthName takes one
argument, The authentication realm (e.g. "Members Only")
If a block section accepts two or more identical keys (as the
<VirtualHost> ...
</VirtualHost> section does), the same rules
as in the previous case apply, but a reference to an array of hashes
is used instead.
In one company, we had to run an Intranet machine behind a
NAT/firewall (using the 10.0.0.10 IP address). We decided up front to
have two virtual hosts to make both the management and the
programmers happy. We had the following simplistic setup:
NameVirtualHost 10.0.0.10
<VirtualHost 10.0.0.10>
ServerName tech.intranet
DocumentRoot /home/httpd/docs/tech
ServerAdmin webmaster@tech.intranet
</VirtualHost>
<VirtualHost 10.0.0.10>
ServerName suit.intranet
DocumentRoot /home/httpd/docs/suit
ServerAdmin webmaster@suit.intranet
</VirtualHost>
In Perl, we wrote it as follows:
<Perl>
$NameVirtualHost = '10.0.0.10';
my $doc_root = "/home/httpd/docs";
$VirtualHost{'10.0.0.10'} = [
{
ServerName => 'tech.intranet',
DocumentRoot => "$doc_root/tech",
ServerAdmin => 'webmaster@tech.intranet',
},
{
ServerName => 'suit.intranet',
DocumentRoot => "$doc_root/suit",
ServerAdmin => 'webmaster@suit.intranet',
},
];
</Perl>
Because normal Perl rules apply, more entries can be added as needed
using push( ). Let's say we want to create a special
virtual host for the company's president to show off
to his golf partners, but his fancy vision doesn't
really fit the purpose of the Intranet site. We just let him handle
his own site:
push @{ $VirtualHost{'10.0.0.10'} },
{
ServerName => 'president.intranet',
DocumentRoot => "$doc_root/president",
ServerAdmin => 'webmaster@president.intranet',
};
Nested block directives naturally become Perl nested data structures.
Let's extend an example from the previous section:
<Perl>
my $doc_root = "/home/httpd/docs";
push @{ $VirtualHost{'10.0.0.10'} },
{
ServerName => 'president.intranet',
DocumentRoot => "$doc_root/president",
ServerAdmin => 'webmaster@president.intranet',
Location => {
"/private" => {
Options => 'Indexes',
AllowOverride => 'None',
AuthType => 'Basic',
AuthName => '"Do Not Enter"',
AuthUserFile => 'private/.htpasswd',
Require => 'valid-user',
},
"/perlrun" => {
SetHandler => 'perl-script',
PerlHandler => 'Apache::PerlRun',
PerlSendHeader => 'On',
Options => '+ExecCGI',
},
},
};
</Perl>
We have added two Location blocks. The first,
/private, is for the juicy stuff and accessible
only to users listed in the president's password
file. The second, /perlrun, is for running dirty
Perl CGI scripts, to be handled by the
Apache::PerlRun handler.
<Perl> sections don't
provide equivalents for <IfModule> and
<IfDefine> containers. Instead, you can use
the module( ) and define( )
methods from the Apache package. For example:
<IfModule mod_ssl.c>
Include ssl.conf
</IfModule>
can be written as:
if (Apache->module("mod_ssl.c")) {
push @Include, "ssl.conf";
}
And this configuration example:
<IfDefine SSL>
Include ssl.conf
</IfDefine>
can be written as:
if (Apache->define("SSL")) {
push @Include, "ssl.conf";
}
Now that you know how to convert the usual configuration directives
to Perl code, there's no limit to what you can do
with it. For example, you can put environment variables in an array
and then pass them all to the children with a single configuration
directive, rather than listing each one via
PassEnv or PerlPassEnv:
<Perl>
my @env = qw(MYSQL_HOME CVS_RSH);
push @PerlPassEnv, \@env;
</Perl>
Or suppose you have a cluster of machines with similar configurations
and only small distinctions between them. Ideally, you would want to
maintain a single configuration file, but because the configurations
aren't exactly the same (for
example, the ServerName directive will have to
differ), it's not quite that simple.
<Perl> sections come to the rescue. Now you
can have a single configuration file and use the full power of Perl
to tweak the local configuration. For example, to solve the problem
of the ServerName directive, you might have this
<Perl> section:
<Perl>
use Sys::Hostname;
$ServerName = hostname( );
</Perl>
and the right machine name will be assigned automatically.
Or, if you want to allow personal directories on all machines except
the ones whose names start with secure, you
can use:
<Perl>
use Sys::Hostname;
$ServerName = hostname( );
if ($ServerName !~ /^secure/) {
$UserDir = "public.html";
}
</Perl>
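To make the mapping between Perl structures and directives more tangible, the following standalone sketch renders a %Location-style hash as httpd.conf text. It is an illustration only: render_container( ) is a hypothetical helper, and mod_perl itself hands the structures to Apache's configuration machinery rather than generating text like this.

```perl
use strict;
use warnings;

# Hypothetical helper: render a container hash (such as %Location)
# as the equivalent httpd.conf text.
sub render_container {
    my ($name, $blocks) = @_;
    my $out = '';
    for my $arg (sort keys %$blocks) {
        $out .= "<$name $arg>\n";
        my $directives = $blocks->{$arg};
        for my $directive (sort keys %$directives) {
            my $value = $directives->{$directive};
            # An array reference and a space-delimited string are equivalent.
            $value = join ' ', @$value if ref $value eq 'ARRAY';
            $out .= "    $directive $value\n";
        }
        $out .= "</$name>\n";
    }
    return $out;
}

my %Location = (
    '/private' => {
        DirectoryIndex => [qw(index.html index.htm)],
        AuthType       => 'Basic',
        AuthName       => '"Private Area"',
        Require        => 'valid-user',
    },
);

print render_container(Location => \%Location);
```

The output is a <Location /private> block with the four directives in alphabetical order. Note how the doubly quoted AuthName value arrives in the text as a single quoted argument.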
4.4.2 Breaking Out of <Perl> Sections
Behind the scenes, mod_perl defines a package called
Apache::ReadConfig in which it keeps all the
variables that you define inside the <Perl>
sections. So <Perl> sections
aren't the only way to use mod_perl to configure the
server: you can also place the Perl code in a separate file that will
be called during the configuration parsing with either
PerlModule or PerlRequire
directives, or from within the startup file. All you have to do is to
declare the package Apache::ReadConfig before
writing any code in this file.
Using the last example from the previous section, we place the code
into a file named apache_config.pl, shown in
Example 4-4.
Example 4-4. apache_config.pl
package Apache::ReadConfig;
use Sys::Hostname;
$ServerName = hostname( );
if ($ServerName !~ /^secure/) {
$UserDir = "public.html";
}
1;
Then we execute it either from httpd.conf:
PerlRequire /home/httpd/perl/lib/apache_config.pl
or from the startup.pl file:
require "/home/httpd/perl/lib/apache_config.pl";
4.4.3 Cheating with Apache->httpd_conf
In fact, you can create a complete
configuration file in Perl. For example, instead of putting the
following lines in httpd.conf:
NameVirtualHost 10.0.0.10
<VirtualHost 10.0.0.10>
ServerName tech.intranet
DocumentRoot /home/httpd/docs/tech
ServerAdmin webmaster@tech.intranet
</VirtualHost>
<VirtualHost 10.0.0.10>
ServerName suit.intranet
DocumentRoot /home/httpd/docs/suit
ServerAdmin webmaster@suit.intranet
</VirtualHost>
You can write it in Perl:
use Socket;
use Sys::Hostname;
my $hostname = hostname( );
(my $domain = $hostname) =~ s/[^.]+\.//;
my $ip = inet_ntoa(scalar gethostbyname($hostname || 'localhost'));
my $doc_root = '/home/httpd/docs';
Apache->httpd_conf(qq{
NameVirtualHost $ip
<VirtualHost $ip>
ServerName tech.$domain
DocumentRoot $doc_root/tech
ServerAdmin webmaster\@tech.$domain
</VirtualHost>
<VirtualHost $ip>
ServerName suit.$domain
DocumentRoot $doc_root/suit
ServerAdmin webmaster\@suit.$domain
</VirtualHost>
});
First, we prepare the data, such as deriving the domain name and IP
address from the hostname. Next, we construct the configuration file
in the "usual" way, but using the
variables that were created on the fly. We can reuse this
configuration file on many machines, and it will work anywhere
without any need for adjustment.
Now consider that you have many more virtual hosts with a similar
configuration. You have probably already guessed what we are going to
do next:
use Socket;
use Sys::Hostname;
my $hostname = hostname( );
(my $domain = $hostname) =~ s/[^.]+\.//;
my $ip = inet_ntoa(scalar gethostbyname($hostname || 'localhost'));
my $doc_root = '/home/httpd/docs';
my @vhosts = qw(suit tech president);
Apache->httpd_conf("NameVirtualHost $ip");
for my $vh (@vhosts) {
Apache->httpd_conf(qq{
<VirtualHost $ip>
ServerName $vh.$domain
DocumentRoot $doc_root/$vh
ServerAdmin webmaster\@$vh.$domain
</VirtualHost>
});
}
In the loop, we create new virtual hosts. If we need to create 100
hosts, it doesn't take a long time; just adjust
the @vhosts array.
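Before handing generated text to Apache->httpd_conf( ), it can be useful to build it as a plain string and review it. The following is a minimal sketch of that idea in standalone Perl; vhost_config( ) is a hypothetical helper, and the IP address and domain are hardcoded here instead of being derived from the hostname.

```perl
use strict;
use warnings;

# Hypothetical helper: build the text that would be passed to
# Apache->httpd_conf( ), so it can be printed and reviewed first.
sub vhost_config {
    my ($ip, $domain, $doc_root, @names) = @_;
    my $conf = "NameVirtualHost $ip\n";
    for my $vh (@names) {
        $conf .= <<"EOC";
<VirtualHost $ip>
    ServerName   $vh.$domain
    DocumentRoot $doc_root/$vh
    ServerAdmin  webmaster\@$vh.$domain
</VirtualHost>
EOC
    }
    return $conf;
}

print vhost_config('10.0.0.10', 'intranet', '/home/httpd/docs',
                   qw(suit tech president));
```

Inside the server, the returned string could be passed to Apache->httpd_conf( ) in place of the loop; keeping the generation in a function makes it easy to inspect the result from the command line.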
4.4.4 Declaring Package Names in Perl Sections
Be careful when you
declare package names inside <Perl>
sections. For example, this code has a problem:
<Perl>
package Book::Trans;
use Apache::Constants qw(:common);
sub handler { OK }
$PerlTransHandler = "Book::Trans";
</Perl>
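When you put code inside a <Perl> section,
by default it goes into the Apache::ReadConfig
package, which is already declared for you. Here, however, the
package Book::Trans declaration is still in effect when
$PerlTransHandler is assigned, so the variable is created in the
Book::Trans package. This means that the
PerlTransHandler we tried to define will be
ignored, since it's not a global variable in the
Apache::ReadConfig package.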
If you define a different package name within a
<Perl> section, make sure to close the scope
of that package and return to the
Apache::ReadConfig package when you want to define
the configuration directives. You can do this by either explicitly
declaring the Apache::ReadConfig package:
<Perl>
package Book::Trans;
use Apache::Constants qw(:common);
sub handler { OK }
package Apache::ReadConfig;
$PerlTransHandler = "Book::Trans";
</Perl>
or putting the code that resides in a different package into a block:
<Perl>
{
package Book::Trans;
use Apache::Constants qw(:common);
sub handler { OK }
}
$PerlTransHandler = "Book::Trans";
</Perl>
so that when the block is over, the Book::Trans
package's scope is over, and you can use the
configuration variables again.
However, it's probably a good idea to use
<Perl> sections only to create or adjust
configuration directives. If you need to run some other code not
related to configuration, it might be better to place it in the
startup file or in its own module. Your mileage may vary, of course.
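The package scoping rules described in this section can be tried outside the server. The sketch below is plain Perl in which a package Apache::ReadConfig statement stands in for the package that mod_perl declares for <Perl> sections; the Book::Fixup package and the 'lost' value are made up for the demonstration.

```perl
use strict;
use warnings;
no strict 'vars';    # <Perl> sections rely on unqualified package globals

package Apache::ReadConfig;    # stand-in for the package mod_perl provides

package Book::Trans;           # switches the current package...
sub handler { 0 }
$PerlTransHandler = 'lost';    # ...so this is $Book::Trans::PerlTransHandler

package Apache::ReadConfig;    # switch back explicitly
$PerlTransHandler = 'Book::Trans';

{
    package Book::Fixup;       # this package's scope ends with the block
    sub handler { 0 }
}
$PerlFixupHandler = 'Book::Fixup';    # still in Apache::ReadConfig

package main;
print "ReadConfig trans:  $Apache::ReadConfig::PerlTransHandler\n";
print "Book::Trans trans: $Book::Trans::PerlTransHandler\n";
print "ReadConfig fixup:  $Apache::ReadConfig::PerlFixupHandler\n";
```

The second line of output shows where a misplaced assignment ends up: in the handler's own package, where mod_perl does not look for configuration variables.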
4.4.5 Verifying <Perl> Sections
How do we know whether the
configuration made inside <Perl> sections
was correct?
First we need to check the validity of the Perl syntax. To do that,
we should turn it into a Perl script, by adding
#!perl at the top of the section and __END__ at the bottom:
<Perl>
#!perl
# ... code here ...
__END__
</Perl>
Notice that #!perl and __END__
must start in column zero. Also, the same rules as
we saw earlier with validation of the startup file apply: if the
<Perl> section includes some modules that
can be loaded only when mod_perl is running, this validation is not
applicable.
Now we may run:
panic% perl -cx httpd.conf
If the Perl code doesn't compile, the server
won't start. If the Perl code is syntactically
correct, but the generated Apache configuration is invalid,
<Perl> sections will just log a warning and
carry on, since there might be globals in the section that are not
intended for the configuration at all.
If you have more than one <Perl> section,
you will have to repeat this procedure for each section, to make sure
they all work.
To check the Apache configuration syntax, you can use the variable
$Apache::Server::StrictPerlSections,
added in mod_perl Version 1.22. If you set this variable to a true
value:
$Apache::Server::StrictPerlSections = 1;
then mod_perl will not tolerate invalid Apache configuration syntax
and will croak (die) if it encounters invalid
syntax. The default value is 0. If you
don't set
$Apache::Server::StrictPerlSections to
1, you should localize variables unrelated to
configuration with my( ) to avoid errors.
If the syntax is correct, the next thing we need to look at is the
parsed configuration as seen by Perl. There are two ways to see it.
First, we can dump it at the end of the section:
<Perl>
use Apache::PerlSections ( );
# code goes here
print STDERR Apache::PerlSections->dump( );
</Perl>
Here, we load the Apache::PerlSections module at
the beginning of the section, and at the end we can use its
dump( ) method to print out the configuration as
seen by Perl. Notice that only the configuration created in the
section will be seen in the dump. No plain Apache configuration can
be found there.
For example, if we adjust this section (parts of which we have seen
before) to dump the parsed contents:
<Perl>
use Apache::PerlSections ( );
$User = getpwuid($>) || $>;
$Group = getgrgid($)) || $);
push @Alias, [qw(/private /home/httpd/docs/private)];
my $doc_root = "/home/httpd/docs";
push @{ $VirtualHost{'10.0.0.10'} },
{
ServerName => 'president.intranet',
DocumentRoot => "$doc_root/president",
ServerAdmin => 'webmaster@president.intranet',
Location => {
"/private" => {
Options => 'Indexes',
AllowOverride => 'None',
AuthType => 'Basic',
AuthName => '"Do Not Enter"',
AuthUserFile => 'private/.htpasswd',
Require => 'valid-user',
},
"/perlrun" => {
SetHandler => 'perl-script',
PerlHandler => 'Apache::PerlRun',
PerlSendHeader => 'On',
Options => '+ExecCGI',
},
},
};
print STDERR Apache::PerlSections->dump( );
</Perl>
This is what we get as a dump:
package Apache::ReadConfig;
#hashes:
%VirtualHost = (
'10.0.0.10' => [
{
'Location' => {
'/private' => {
'AllowOverride' => 'None',
'AuthType' => 'Basic',
'Options' => 'Indexes',
'AuthUserFile' => 'private/.htpasswd',
'AuthName' => '"Do Not Enter"',
'Require' => 'valid-user'
},
'/perlrun' => {
'PerlHandler' => 'Apache::PerlRun',
'Options' => '+ExecCGI',
'PerlSendHeader' => 'On',
'SetHandler' => 'perl-script'
}
},
'DocumentRoot' => '/home/httpd/docs/president',
'ServerAdmin' => 'webmaster@president.intranet',
'ServerName' => 'president.intranet'
}
]
);
#arrays:
@Alias = (
[
'/private',
'/home/httpd/docs/private'
]
);
#scalars:
$Group = 'stas';
$User = 'stas';
1;
__END__
You can see that the configuration was created properly. The dump
places the output into three groups: arrays, hashes, and scalars. The
server was started as user stas, so the
$User and $Group settings were
dynamically assigned to the user stas.
A different approach to seeing the dump at any time (not only during
startup) is to use the Apache::Status module (see
Chapter 9). First we store the Perl configuration:
<Perl>
$Apache::Server::SaveConfig = 1;
# the actual configuration code
</Perl>
Now the Apache::ReadConfig namespace (in which the
configuration data is stored) will not be flushed, making
configuration data available to Perl modules at request time. If the
Apache::Status module is configured, you can view
it by going to the /perl-status URI (or another
URI that you have chosen) in your browser and selecting
"Perl Section Configuration" from
the menu. The configuration data should look something like that
shown in Figure 4-1.
Since the Apache::ReadConfig namespace is not
flushed when the server is started, you can access the configuration
values from your code; the data resides in the
Apache::ReadConfig package. So if you had the
following Perl configuration:
<Perl>
$Apache::Server::SaveConfig = 1;
$DocumentRoot = "/home/httpd/docs/mine";
</Perl>
at request time, you could access the value of
$DocumentRoot with the fully qualified name
$Apache::ReadConfig::DocumentRoot. But usually you
don't need to do this, because mod_perl provides you
with an API to access the most interesting and useful server
configuration bits.
4.4.6 Saving the Perl Configuration
Instead of
dumping the generated Perl
configuration, you may decide to store it in a file. For example, if
you want to store it in httpd_config.pl, you can
do the following:
<Perl>
use Apache::PerlSections ( );
# code goes here
Apache::PerlSections->store("httpd_config.pl");
</Perl>
You can then require( ) that file in some other
<Perl> section. If you have the whole server
configuration in Perl, you can start the server using the following
trick:
panic% httpd -C "PerlRequire httpd_config.pl"
Apache will fetch all the configuration directives from
httpd_config.pl, so you don't
need httpd.conf at all.
4.4.7 Debugging
If your configuration doesn't seem to do what it's
supposed to do, you should debug it. First, build mod_perl with:
panic% perl Makefile.PL PERL_TRACE=1 [...]
Next, set the environment variable MOD_PERL_TRACE
to s (as explained in Chapter 21). Now you should be able to see how the
<Perl> section globals are converted into
directive string values. For example, suppose you have the following
Perl section:
<Perl>
$DocumentRoot = "/home/httpd/docs/mine";
</Perl>
If you start the server in single-server mode (e.g., under
bash):
panic% MOD_PERL_TRACE=s httpd -X
you will see these lines among the printed trace:
...
SVt_PV: $DocumentRoot = `/home/httpd/docs/mine'
handle_command (DocumentRoot /home/httpd/docs/mine): OK
...
But what if you mistype the directory name and pass two values
instead of a single value? When you start the server,
you'll see the following error:
...
SVt_PV: $DocumentRoot = `/home/httpd/docs/ mine'
handle_command (DocumentRoot /home/httpd/docs/ mine):
DocumentRoot takes one argument,
Root directory of the document tree
...
and of course the error will be logged in the
error_log file:
[Wed Dec 20 23:47:31 2000] [error]
(2)No such file or directory: <Perl>:
DocumentRoot takes one argument,
Root directory of the document tree
4.5 Validating the Configuration Syntax
Before
you restart a server on a live production
machine after the configuration has been changed,
it's essential to validate that the configuration
file is not broken. If the configuration is broken, the server
won't restart and users will find your server
offline for the time it'll take you to fix the
configuration and start the server again.
You can use apachectl configtest or
httpd -t to validate the configuration file
without starting the server. You can safely validate the
configuration file on a running production server, as long as you run
this test before you restart the server with apachectl
restart. Of course, it is not 100% perfect, but it will
reveal any syntax errors you might have made while editing the file.
The validation procedure doesn't just parse the code
in startup.pl; it executes it too.
<Perl> sections invoke the Perl interpreter
when reading the configuration files, and
PerlRequire and PerlModule do
so as well.
Of course, we assume that the code that gets called during this test
cannot cause any harm to your running production environment. If
you're worried about that, you can prevent the code
in the startup script and in <Perl> sections
from being executed during the syntax check. If the server
configuration is tested with -Dsyntax_check:
panic% httpd -t -Dsyntax_check
you can check in your code whether syntax_check
was set with:
Apache->define('syntax_check')
If, for example, you want to prevent the code in
startup.pl from being executed, add the
following at the top of the code:
return 1 if Apache->define('syntax_check');
Of course, there is nothing magical about using the string
'syntax_check' as a flag; you can use any
other string as well.
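The same test can guard code inside a <Perl> section. The following is a sketch only: the guarded block is a placeholder for whatever side effects you want to avoid, and the DocumentRoot value is simply the one used earlier in this chapter.

```perl
<Perl>
    unless (Apache->define('syntax_check')) {
        # Code with side effects goes here (connecting to a database,
        # sending notifications, and so on). It is skipped when the
        # server is run as: httpd -t -Dsyntax_check
    }

    # Plain configuration has no side effects, so it stays outside the
    # guard and is still validated during the syntax check.
    $DocumentRoot = "/home/httpd/docs/mine";
</Perl>
```

Keeping the configuration assignments outside the guard means the syntax check still exercises the directives that the section generates.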
4.6 The Scope of mod_perl Configuration Directives
Table 4-1 depicts
where the various mod_perl configuration
directives can be used.
Table 4-1. The Scope of mod_perl configuration directives
Directive        | Global | <VirtualHost> | <Directory>, <Location>, <Files>
-----------------+--------+---------------+---------------------------------
PerlTaintCheck   | V      |               |
PerlWarn         | V      |               |
PerlFreshRestart | V      |               |
PerlPassEnv      | V      | V             |
PerlRequire      | V      | V             | V
PerlModule       | V      | V             | V
PerlAddVar       | V      | V             | V
PerlSetEnv       | V      | V             | V
PerlSetVar       | V      | V             | V
PerlSetupEnv     | V      | V             | V
PerlSendHeader   | V      | V             | V
<Perl> Sections  | V      | V             | V
The first column represents directives that can appear in the global
configuration; that is, outside all sections. Note that
PerlTaintCheck, PerlWarn, and
PerlFreshRestart can be placed inside
<VirtualHost> sections. However, because
there's only one Perl interpreter for all virtual
hosts and the main server, setting any of these values in one virtual
host affects all other servers. Therefore, it's
probably a good idea to think of these variables as being allowed
only in the global configuration.
The second column represents directives that can appear inside the
<VirtualHost> sections.
The third column represents directives that can appear in the
<Directory>,
<Location>, and
<Files> sections and all their regex
variants. These mod_perl directives can also appear in
.htaccess files.
For example, PerlWarn cannot be used in
<Directory> and
<VirtualHost> sections. However,
PerlSetEnv can be used anywhere, which allows you
to provide different behavior in different sections:
PerlSetEnv ADMIN_EMAIL webmaster@example.com
<Location /bar/manage/>
PerlSetEnv ADMIN_EMAIL bar@example.com
</Location>
In this example, a handler invoked from
/bar/manage/ will see the
ADMIN_EMAIL environment variable as
bar@example.com, while other handlers configured
elsewhere will see ADMIN_EMAIL as the default
value, webmaster@example.com.
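A handler can pick up the value from %ENV. The sketch below assumes a mod_perl 1.x server with the default PerlSetupEnv setting; the package name My::AdminEmail and the fallback string are hypothetical.

```perl
package My::AdminEmail;    # hypothetical module name
use strict;
use Apache::Constants qw(OK);

sub handler {
    my $r = shift;

    # PerlSetEnv values appear in %ENV for the section that matched
    # the request, so the same code reports different addresses
    # depending on where it was invoked.
    my $email = $ENV{ADMIN_EMAIL} || 'unknown';

    $r->send_http_header('text/plain');
    $r->print("Contact: $email\n");

    return OK;
}

1;
```

Configured under /bar/manage/ this handler would report bar@example.com, and under any other location it would report webmaster@example.com.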
4.7 Apache Restarts Twice
When the server is started, the
configuration and module initialization phases are called twice
before the children are forked. The second pass is an internal
restart, done to test that all modules can survive a restart (SIGHUP),
in order to ensure that future graceful restarts will work correctly.
This is very important if you are going to restart a production
server.
You can control what Perl code will be executed on the start or
restart by checking the values of
$Apache::Server::Starting and
$Apache::Server::ReStarting. The former variable
is true when the server is starting, and the latter is true when
it's restarting.
For example, if you want to be notified when the server starts or
restarts, you can do:
<Perl>
email_notify("start") if $Apache::Server::Starting;
email_notify("restart") if $Apache::Server::ReStarting;
</Perl>
where the function email_notify( ) (that you have
to write) performs the notification. Since Apache restarts itself on
start, you will get both notifications when Apache is started, and
only one when it's restarted.
The startup.pl file and similar files loaded via
PerlModule or PerlRequire are
compiled only once, because once the module is compiled, it enters
the special %INC hash. When Apache restarts, Perl
checks whether the module or script in question is already registered
in %INC and won't try to compile
it again.
Thus, the only code that you might need to protect from running on
restart is the code in <Perl> sections. But
since <Perl> sections are primarily used for
creating on-the-fly configurations, it shouldn't be
a problem to run the code more than once.
4.8 Enabling Remote Server Configuration Reports
The nifty mod_info Apache
module displays the complete server configuration in your browser. In
order to use it, you have to compile it in or, if the server was
compiled with DSO mode enabled, load it as an object. Then just
uncomment the already prepared section in the
httpd.conf file:
<Location /server-info>
SetHandler server-info
Order deny,allow
Deny from all
Allow from localhost
</Location>
Now restart the server and issue the request:
http://localhost/server-info
We won't show a snapshot of the output here, as
it's very lengthy. However, you should know that
mod_info is unaware of the configuration created or modified by
<Perl> sections or equivalent methods
discussed earlier in this chapter.
4.9 Tips and Tricks
The following are miscellaneous tips and tricks that might save you
lots of time when configuring mod_perl and Apache.
4.9.1 Publishing Port Numbers Other Than 80
If you are using a dual-server setup, with a
mod_perl server listening on a high port (e.g., 8080),
don't publish the high port number in URLs. Rather,
use a proxying rewrite rule in the non-mod_perl server:
RewriteEngine On
RewriteLogLevel 0
RewriteRule ^/perl/(.*) http://localhost:8080/perl/$1 [P]
ProxyPassReverse / http://localhost/
In the above example, all requests for URLs starting with
/perl are proxied to the backend server,
which listens on port 8080. The backend server is not directly
accessible; it can be reached only through the frontend server.
One of the problems with publishing high port numbers is that
Microsoft Internet Explorer (IE) 4.x has a bug when re-posting data
to a URL with a nonstandard port (i.e., anything but 80). It drops
the port designator and uses port 80 anyway. Hence, your service will
be unusable for IE 4.x users.
Another problem is that firewalls will probably have most of the high
ports closed, and users behind them will be unable to reach your
service if it is running on a blocked port.
4.9.2 Running the Same Script from Different Virtual Hosts
When running under a virtual host, Apache::Registry and
other registry family handlers will compile each script into a
separate package. The package name includes the name of the virtual
host if the variable
$Apache::Registry::NameWithVirtualHost is set to
1. This is the default behavior.
Under this setting, two virtual hosts can have two different scripts
accessed via the same URI (e.g.,
/perl/guestbook.pl) without colliding with each
other. Each virtual host will run its own version of the script.
However, if you run a big service and provide a set of identical
scripts to many virtual hosts, you will want to have only one copy of
each script compiled in memory. By default, each virtual host will
create its own copy, so if you have 100 virtual hosts, you may end up
with 100 copies of the same script compiled in memory, which is very
wasteful. If this is the case, you can override the default behavior
by setting the following directive in a startup file or in a
<Perl> section:
$Apache::Registry::NameWithVirtualHost = 0;
But be careful: this makes sense only if you are sure that there are
no other scripts with identical URIs but different content on
different virtual hosts.
Users of mod_perl v1.15 are encouraged to upgrade to the latest
stable version if this problem is encountered; it was solved
starting with mod_perl v1.16.
4.10 Configuration Security Concerns
Any service open to the Internet at large must take security into
account. Large, complex software tends to expose subtle
vulnerabilities that attackers can exploit to gain unauthorized
access to the server host. Third-party modules or libraries can also
contain similarly exploitable bugs. Perl scripts
aren't immune either: incorrect untainting and
sanitizing of user input can lead to disaster when this input is fed
to the open( ) or system( )
functions.
Also, if the same mod_perl server is shared by more than one user,
you may need to protect users of the server from each other (see
Appendix C).
4.10.1 Using Only Absolutely Necessary Components
The more modules you have enabled in your web
server, the more complex the code and interaction between these
modules will be. The more complex the code in your web server, the
more chances for bugs there are. The more chances for bugs, the more
chance there is that some of those bugs may involve security holes.
Before you put the server into production, review the server setup
and disable any unused modules. As time goes by, the server
environment may change and some modules may not be used anymore.
Review your setup periodically and disable modules that
aren't in use.
4.10.2 Taint Checking
Make sure to run the server with the following
setting in the httpd.conf file:
PerlTaintCheck On
As discussed in Chapter 6, taint checking
doesn't ensure that your code is completely safe
from external hacks, but it does force you to improve your code to
prevent many potential security problems.
4.10.3 Hiding Server Information
We aren't completely sure why the default value
of the ServerTokens directive in Apache is
Full rather than Minimal. It
seems like Full is really useful only for
debugging purposes. A probable reason for using ServerTokens
Full is publicity: it means that Netcraft
(http://netcraft.com/) and other similar survey
services will count more Apache servers, which is good for all of us.
In general, though, you really want to reveal as little information
as possible to potential crackers.
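The most direct way to do that is to change the directive itself. As a minimal sketch, assuming Apache 1.3 and no later ServerTokens line that overrides this one, the httpd.conf entry would be:

```apache
# Report only the server name and version in the Server: header,
# omitting the operating system and the list of compiled-in modules.
ServerTokens Minimal
```

After a restart, the Server: field should be reduced to the bare product name and version.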
Another approach is to modify the httpd sources
to not reveal any unwanted information, so that all responses return
an empty or phony Server: field.
Be aware, however, that there's no security by
obscurity (as the old saying goes). Any determined cracker will
eventually figure out what version of Apache is running and what
third-party modules are built in.
You can see what information is revealed by your server by telnetting
to it and issuing some request. For example:
panic% telnet localhost 8080
Trying 127.0.0.1
Connected to localhost
Escape character is '^]'.
HEAD / HTTP/1.0
HTTP/1.1 200 OK
Date: Sun, 16 Apr 2000 11:06:25 GMT
Server: Apache/1.3.24 (Unix) mod_perl/1.26 mod_ssl/2.8.8 OpenSSL/0.9.6
[more lines snipped]
As you can see, a lot of information is revealed when
ServerTokens Full has been specified.
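If you check several servers regularly, you may prefer to pull the product tokens out of the response with a few lines of Perl. The following is a rough sketch: the function names server_header( ) and product_tokens( ) are our own, the response text is the sample shown above rather than live output, and the parsing simply treats every space-separated word containing a slash as a product token.

```perl
use strict;
use warnings;

# Return the value of the Server: header found in raw HTTP response
# text, or undef if the header is absent.
sub server_header {
    my ($response) = @_;
    return $response =~ /^Server:[ \t]*(.*?)\r?$/mi ? $1 : undef;
}

# Split a Server: value into its product/version tokens, skipping
# parenthesized comments such as "(Unix)" that contain no slash.
sub product_tokens {
    my ($value) = @_;
    return grep { m{/} } split ' ', $value;
}

# Sample response headers, taken from the telnet session above.
my $response = join "\r\n",
    'HTTP/1.1 200 OK',
    'Date: Sun, 16 Apr 2000 11:06:25 GMT',
    'Server: Apache/1.3.24 (Unix) mod_perl/1.26 mod_ssl/2.8.8 OpenSSL/0.9.6',
    '', '';

# Print one revealed component per line.
print "$_\n" for product_tokens(server_header($response));
```

Each line printed is one piece of version information that an attacker could look up in a list of known vulnerabilities.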
4.10.4 Making the mod_perl Server Inaccessible from the Outside
It is best not to expose mod_perl to the
outside world, as it creates a potential security risk by revealing
which modules you use and which operating system you are running your
web server on. In Chapter 12, we show how to make
mod_perl inaccessible directly from the outside by listening only to
requests coming from mod_proxy at the local host (127.0.0.1).
4.10.5 Protecting Private Status Locations
It's a good idea to protect your various monitors,
such as /perl-status, with a password. The less
information you provide for intruders, the harder it will be for them
to break in. (One of the biggest favors you can do for these bad
guys is to show them all the scripts you use. If any of these are in
the public domain, they can grab the source of the script from the
Web, study it, and probably find a few or even many security holes in
it.)
Security by obscurity may help to wave away some of the
less-determined malicious fellas, but it doesn't
really work against a determined intruder. For example, consider the
old <Limit> container:
<Location /sys-monitor>
SetHandler perl-script
PerlHandler Apache::VMonitor
AuthUserFile /home/httpd/perl/.htpasswd
AuthGroupFile /dev/null
AuthName "Server Admin"
AuthType Basic
<Limit GET POST>
require user foo bar
</Limit>
</Location>
Use of the <Limit> container is a leftover
from NCSA server days that is still visible in many configuration
examples today. In Apache, it will limit the scope of the
require directive to the GET
and POST request methods. Use of another method
will bypass authentication. Since most scripts don't
bother checking the request method, content will be served to the
unauthenticated users.
For this reason, the
Limit directive generally should not be
used. Instead, use this secure configuration:
<Location /sys-monitor>
SetHandler perl-script
PerlHandler Apache::VMonitor
AuthUserFile /home/httpd/perl/.htpasswd
AuthGroupFile /dev/null
AuthName "Server Admin"
AuthType Basic
require user foo bar
</Location>
The contents of the password file
(/home/httpd/perl/.htpasswd) are populated by
the htpasswd utility, which comes bundled with
Apache:
foo:1SA3h/d27mCp
bar:WbWQhZM3m4kl