Apache Modules
09/27/2001
Over the last few articles, I have covered some of the new features in Apache 2.0, and how you can take advantage of them in your web servers. This time, I am going to cover one of the least-discussed features in Apache 2.0.
One of the biggest advantages of Apache over other web servers is how easy it is to write powerful modules. Apache has used modules since version 1.0 to implement everything but serving basic, static files. Because Apache itself uses modules for nearly everything, modules need access to every stage of serving a request. Modules have, however, always suffered from one major flaw.
Modules have always been solitary entities. If two modules
both have to do the same operation, they both need to implement
the feature. This makes maintaining modules very difficult
because if you modify the behavior in one module, you must also
modify it in every other location. Perhaps the best examples
of this are the mod_include and mod_cgi modules. mod_include
implements server-side include (SSI) processing. mod_cgi spawns
CGI scripts. However, mod_include also needs to be able to
spawn CGI scripts whenever it encounters the following code
in an SSI:
<!--#exec cgi=/cgi-bin/printenv -->
In Apache 1.3, both modules had logic to create a CGI process
and execute the correct program in it. This logic was very
complex on some platforms -- especially platforms that did not
look at the #! line of a script to find the interpreter.
In Apache 2.0, we can remove this maintenance problem by
allowing mod_include to call into mod_cgi to run the CGI
script. This does have a drawback: mod_include cannot
include the output of CGI scripts unless mod_cgi is loaded
into the server. The Apache developers decided that this was
a valid tradeoff because by not including mod_cgi, the
administrator has stated that he or she does not want to allow
CGI scripts to be run. We could not find a valid reason to
allow CGI scripts within SSI files if they weren't allowed
for direct requests.
The mechanism added to mod_include for this purpose also allows
other module authors to easily extend the tags that are allowed
in SSI files, without having to modify the base mod_include code.
To demonstrate how easy it is to extend mod_include, let's
take a look at how mod_cgi does it. The first step is to
retrieve a couple of functions from mod_include. This is
done using optional functions, a new feature in Apache 2.0 that
lets a module register functions with the core
as optional. When another module wants to use
those functions, it queries the core server, asking whether
those functions have been registered. The core
uses the name of the function as the key to find the
pointer that was stored when the function was registered.
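The register-then-look-up-by-name idea can be modeled in a few lines of standalone C. This is only a sketch of the concept, not the APR implementation: the names register_optional_fn, retrieve_optional_fn, opt_fn_t, and add_one are hypothetical, and a small fixed-size array stands in for whatever storage the core really uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic function pointer; callers cast back to the real signature. */
typedef void (*opt_fn_t)(void);

#define MAX_OPT_FNS 16

static struct {
    const char *name;
    opt_fn_t fn;
} opt_table[MAX_OPT_FNS];
static size_t opt_count = 0;

/* Provider side: store a pointer under the function's name.
 * Returns 0 on success, -1 if the table is full. */
static int register_optional_fn(const char *name, opt_fn_t fn)
{
    assert(name != NULL && fn != NULL);
    if (opt_count == MAX_OPT_FNS) {
        return -1;
    }
    opt_table[opt_count].name = name;
    opt_table[opt_count].fn = fn;
    opt_count++;
    return 0;
}

/* Consumer side: look the pointer up by name.
 * NULL means the providing module never registered it. */
static opt_fn_t retrieve_optional_fn(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < opt_count; i++) {
        if (strcmp(opt_table[i].name, name) == 0) {
            return opt_table[i].fn;
        }
    }
    return NULL;
}

/* A hypothetical function that some provider module exports. */
typedef int (*add_one_fn_t)(int);
static int add_one(int x)
{
    return x + 1;
}
```

A consumer would call retrieve_optional_fn("add_one"), cast the result to add_one_fn_t, and check it against NULL before use. That NULL check is the point of the design: the consumer keeps working, with reduced functionality, when the provider is not loaded.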
There are three functions that are important for mod_include
extensions: ap_register_include_handler, ap_ssi_get_tag_and_value,
and ap_ssi_parse_string. The ap_register_include_handler function
specifies the tag that this module will handle.
ap_ssi_get_tag_and_value gets the next attribute
and value from the SSI tag that is being parsed. SSI extension
functions call this function in a loop, retrieving each
of the attributes until the function returns a NULL
attribute, which signifies that all attributes have been
found. The final function, ap_ssi_parse_string,
performs variable substitution on a string.
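The loop-until-NULL pattern described above might look like the following standalone sketch. It does not use the real mod_include API: get_tag_and_value here is a toy stand-in that walks a pre-tokenized attribute array, and attr_t, tag_ctx_t, and find_cgi_target are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One attribute of an SSI tag, e.g. cgi=/cgi-bin/printenv */
typedef struct {
    const char *name;
    const char *value;
} attr_t;

/* Parsing state: the attributes of the current tag and a cursor. */
typedef struct {
    const attr_t *attrs;
    size_t count;
    size_t pos;
} tag_ctx_t;

/* Toy stand-in for the real accessor: hands back the next attribute,
 * or sets *name to NULL once every attribute has been consumed. */
static void get_tag_and_value(tag_ctx_t *ctx,
                              const char **name, const char **value)
{
    assert(ctx != NULL && name != NULL && value != NULL);
    if (ctx->pos >= ctx->count) {
        *name = NULL;
        *value = NULL;
        return;
    }
    *name = ctx->attrs[ctx->pos].name;
    *value = ctx->attrs[ctx->pos].value;
    ctx->pos++;
}

/* Extension handler: loop until the NULL attribute, remembering the
 * value of the last "cgi" attribute seen. Returns NULL if none. */
static const char *find_cgi_target(tag_ctx_t *ctx)
{
    const char *name;
    const char *value;
    const char *target = NULL;

    for (;;) {
        get_tag_and_value(ctx, &name, &value);
        if (name == NULL) {
            break; /* all attributes have been retrieved */
        }
        if (strcmp(name, "cgi") == 0) {
            target = value;
        }
    }
    return target;
}
```

A real handler would typically also pass each value through the variable-substitution function before acting on it; that step is left out here to keep the loop itself in focus.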
Once all these function pointers are retrieved, your
module must register a tag with mod_include. This can only
be done if all the functions were retrieved
successfully. To register a tag, just call the register
function with the string that mod_include should look for
when processing SSI output, and a function to call when the
string is found. mod_cgi does this in the following function:
static void cgi_post_config(apr_pool_t *p, apr_pool_t *plog,
                            apr_pool_t *ptemp, server_rec *s)
{
    cgi_pfn_reg_with_ssi = APR_RETRIEVE_OPTIONAL_FN(ap_register_include_handler);
    cgi_pfn_gtv = APR_RETRIEVE_OPTIONAL_FN(ap_ssi_get_tag_and_value);
    cgi_pfn_ps = APR_RETRIEVE_OPTIONAL_FN(ap_ssi_parse_string);

    if ((cgi_pfn_reg_with_ssi) && (cgi_pfn_gtv) && (cgi_pfn_ps)) {
        /* Required by mod_include filter. This is how mod_cgi registers
         * with mod_include to provide processing of the exec directive.
         */
        cgi_pfn_reg_with_ssi("exec", handle_exec);
    }
}
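To make the receiving side of that registration call concrete, here is a rough standalone model of a tag table: strings mapped to handler functions, with a dispatch step that runs the handler for a tag. It is a sketch under stated assumptions, not mod_include's code. The names tags_init, register_tag, dispatch_tag, and demo_exec_handler are hypothetical, the handler signature is simplified, and a fixed array replaces the hash table. The model also refuses registrations made before the table has been set up, a case the real server does not guard against.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified handler signature for this sketch only. */
typedef int (*tag_handler_t)(const char *args);

typedef struct {
    const char *tag;
    tag_handler_t fn;
} tag_entry_t;

#define MAX_TAGS 8

static tag_entry_t tag_storage[MAX_TAGS];
static tag_entry_t *tag_table = NULL; /* stays NULL until tags_init() runs */
static size_t tag_count = 0;

/* Models the table owner setting up its storage at startup. */
static void tags_init(void)
{
    tag_table = tag_storage;
    tag_count = 0;
}

/* Returns 0 on success, -1 if the table is not set up yet or is full. */
static int register_tag(const char *tag, tag_handler_t fn)
{
    assert(tag != NULL && fn != NULL);
    if (tag_table == NULL || tag_count == MAX_TAGS) {
        return -1;
    }
    tag_table[tag_count].tag = tag;
    tag_table[tag_count].fn = fn;
    tag_count++;
    return 0;
}

/* Runs the handler registered for tag; returns -1 if there is none. */
static int dispatch_tag(const char *tag, const char *args)
{
    assert(tag != NULL);
    for (size_t i = 0; i < tag_count; i++) {
        if (strcmp(tag_table[i].tag, tag) == 0) {
            return tag_table[i].fn(args);
        }
    }
    return -1;
}

/* Demo handler: just reports the length of its argument string. */
static int demo_exec_handler(const char *args)
{
    return (int)strlen(args);
}
```

Calling register_tag("exec", demo_exec_handler) before tags_init() returns -1 in this model, while the same call afterward succeeds and dispatch_tag("exec", ...) reaches the handler.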
mod_cgi's cgi_post_config function runs during the
post_config phase, and that creates an ordering problem:
mod_include uses a hash table to store the string/function
pairs, and it allocates that hash table during its own post_config
function. If the mod_cgi function is called first, then the server
will "seg fault" during startup. This is easily solved using the
new hook mechanism in Apache 2.0. When mod_cgi registers its
post_config function, it must specify that mod_include should run
first. Below is the mod_cgi module's register_hooks function,
which does exactly that.
static void register_hooks(apr_pool_t *p)
{
    static const char * const aszPre[] = { "mod_include.c", NULL };

    ap_hook_handler(cgi_handler, NULL, NULL, APR_HOOK_MIDDLE);
    ap_hook_post_config(cgi_post_config, aszPre, NULL, APR_HOOK_REALLY_FIRST);
}
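The effect of the predecessor list can be illustrated with a small standalone ordering routine. This is a simplified model and not how the server sorts its hooks: hook_t, is_pending, and order_hooks are hypothetical names, and the sketch considers only the "must run after these modules" list, ignoring the numeric ordering hint (such as APR_HOOK_MIDDLE) that the real calls also take.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HOOKS 8

/* One registered hook: the module that owns it and a NULL-terminated
 * list of modules that must run first (or NULL for no constraint). */
typedef struct {
    const char *module;
    const char *const *pre;
} hook_t;

/* Returns 1 if a hook owned by `module` has not been placed yet. */
static int is_pending(const hook_t *hooks, const int *placed, size_t n,
                      const char *module)
{
    for (size_t i = 0; i < n; i++) {
        if (!placed[i] && strcmp(hooks[i].module, module) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Fills order[] with indices into hooks[] in a valid run order.
 * Returns how many hooks were placed; fewer than n means the
 * constraints form a cycle and cannot all be satisfied. */
static size_t order_hooks(const hook_t *hooks, size_t n, size_t *order)
{
    int placed[MAX_HOOKS] = { 0 };
    size_t done = 0;

    assert(n <= MAX_HOOKS);
    for (;;) {
        int progressed = 0;
        for (size_t i = 0; i < n; i++) {
            int ready = 1;
            if (placed[i]) {
                continue;
            }
            /* A hook is ready once none of its predecessors is pending. */
            for (const char *const *p = hooks[i].pre; p && *p; p++) {
                if (is_pending(hooks, placed, n, *p)) {
                    ready = 0;
                    break;
                }
            }
            if (ready) {
                placed[i] = 1;
                order[done++] = i;
                progressed = 1;
            }
        }
        if (!progressed) {
            break;
        }
    }
    return done;
}
```

With mod_cgi registered first but naming "mod_include.c" as a predecessor, order_hooks places the mod_include hook ahead of the mod_cgi hook. A predecessor that names a module with no registered hook is simply ignored, which mirrors the case where mod_include is not loaded at all.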
Of course, this is just one example of where modules extending
modules is useful. mod_log_config is another place where this ability
has been added to the server. In the case of mod_log_config, we now
allow modules to extend the type of information that can be logged.
As more people implement Apache 2.0 modules, more situations like
this will be discovered, and the core team will continue to create
opportunities for modules to help each other solve problems.
Apache 2.0 has many features that were not possible in Apache 1.3. These features will help to keep Apache the most popular web server on the Internet. As this is my last article for OnLAMP.com, I hope that in the last six months I have taught you something about Apache 2.0. And I hope that you will download it and experiment to see where it can provide a better solution than Apache 1.3.
Ryan Bloom is a member of the Apache Software Foundation, and the Vice President of the Apache Portable Run-time project.

