Writing Apache 2.0 Output Filters
09/13/2001
In the last article, we discussed the basics of Apache 2.0 filters. That was enough information to get started, but not enough to write a functioning filter. In this article, we will finish discussing output filters. After reading it, you should be able to write your own Apache filters.
When the filter interface was first designed, it was written for the Apache developers and very little attention was paid to making it easy for other people to write their own filters. Since that time, the developers have looked at the API again and have added a simple layer on top of the original interface. The original API required that the filter writer take into account how much data they were passing to the next filter to take full advantage of the filters. For example, if you were writing a filter that swapped every other word, you had two choices -- either convert the file to a series of buckets and move the pointers in the bucket list around, or copy each word into a large block of memory.
Each of these approaches has problems that must be solved, but with the new API this problem becomes simple. All the developer must do is split the file into individual words and write the words to the next filter. Apache itself will take care of copying the data when it should, and it will take care of passing the data to the next filter when appropriate. This module implements this swap filter.
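To make the "split into words and write them out" idea concrete, here is a minimal sketch in plain C. It is not httpd code and is not the linked module: the emit_fn callback is a hypothetical stand-in for writing to the next filter, and strsink, strsink_emit, and swap_words are names invented for this illustration. For simplicity it assumes the whole text is available at once and joins words with single spaces.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for "write these bytes to the next filter". */
typedef void (*emit_fn)(void *sink, const char *data, size_t len);

/* Example sink for the sketch: appends to a small NUL-terminated buffer. */
struct strsink {
    char buf[256];
    size_t len;
};

static void strsink_emit(void *sink, const char *data, size_t len)
{
    struct strsink *s = sink;
    assert(s->len + len < sizeof(s->buf));   /* sketch only: no growth */
    memcpy(s->buf + s->len, data, len);
    s->len += len;
    s->buf[s->len] = '\0';
}

/* Swap each pair of adjacent whitespace-separated words, emitting the
 * words one at a time. A trailing unpaired word is emitted unchanged. */
static void swap_words(const char *text, emit_fn emit, void *sink)
{
    const char *held = NULL;   /* first word of the current pair */
    size_t held_len = 0;
    int emitted = 0;           /* nonzero once any word was written */
    const char *p = text;

    while (*p != '\0') {
        const char *start;
        size_t len;

        while (*p != '\0' && isspace((unsigned char)*p)) p++;
        start = p;
        while (*p != '\0' && !isspace((unsigned char)*p)) p++;
        len = (size_t)(p - start);
        if (len == 0) break;   /* only trailing whitespace was left */

        if (held == NULL) {
            held = start;
            held_len = len;
        } else {
            if (emitted) emit(sink, " ", 1);
            emit(sink, start, len);      /* second word goes first */
            emit(sink, " ", 1);
            emit(sink, held, held_len);  /* then the held first word */
            held = NULL;
            emitted = 1;
        }
    }
    if (held != NULL) {
        if (emitted) emit(sink, " ", 1);
        emit(sink, held, held_len);
    }
}
```

In a real filter, the emit callback would be replaced by calls into the filter write API, and the input would arrive in pieces rather than as one string.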
The new API very closely resembles the buffered file I/O API from POSIX. In fact, the developers used that API as the model when redesigning filters. There are five functions used to send data from the current filter to the next. All five share the same first two parameters: the next filter in the filter chain, and the bucket brigade used to store the data when it must be buffered. They also share the same buffering behavior. Data is copied into the last bucket in the brigade until the brigade holds more than 8K of data. Once that threshold is reached, the entire brigade is sent to the next filter in the chain.
ap_fwrite(ap_filter_t *f, apr_bucket_brigade *bb, const char *data, apr_size_t nbyte)
Write a specified number of characters from the data string to the next filter. The variable nbyte specifies how many characters should be written.
ap_fputs(ap_filter_t *f, apr_bucket_brigade *bb, const char *s)
Write the string s to the next filter.
ap_fputc(ap_filter_t *f, apr_bucket_brigade *bb, const char c)
Write the character c to the next filter.
ap_fputstrs(ap_filter_t *f, apr_bucket_brigade *bb, ...)
Write all the strings passed to this function to the next filter.
ap_fprintf(ap_filter_t *f, apr_bucket_brigade *bb, const char *fmt, ...)
Write printf()-formatted data to the next filter.
In addition to those five standard functions, there is one more that is important. Because these functions buffer data until there is enough to send, it is vital that filter writers be able to dictate that the data must be sent immediately. This is done using the final function, ap_fflush(ap_filter_t *f, apr_bucket_brigade *bb). This function just takes the current brigade and sends it to the next filter.
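The buffering behavior described above can be modeled in a few lines of plain C. This is only a toy model, not the httpd implementation: buffered_out, next_stage, model_write, model_puts, and model_flush are hypothetical names that loosely play the roles of the brigade, the next filter, ap_fwrite, ap_fputs, and ap_fflush. One simplification to note: the model hands data over as soon as its 8K buffer is full, and it skips empty flushes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_THRESHOLD 8192   /* mirrors the 8K figure in the text */

/* Hypothetical next stage: records how often and how much it received. */
struct next_stage {
    size_t calls;
    size_t bytes;
};

/* Hypothetical buffered writer standing in for the brigade. */
struct buffered_out {
    char buf[MODEL_THRESHOLD];
    size_t len;
    struct next_stage *next;
};

/* Model of a flush: hand over whatever is buffered, right now. */
static void model_flush(struct buffered_out *o)
{
    assert(o->next != NULL);
    if (o->len == 0) return;   /* nothing buffered, nothing to pass on */
    o->next->calls++;
    o->next->bytes += o->len;
    o->len = 0;
}

/* Model of a buffered write: copy data in, and pass the buffer to the
 * next stage each time it fills up. */
static void model_write(struct buffered_out *o, const char *data, size_t n)
{
    while (n > 0) {
        size_t room = sizeof(o->buf) - o->len;
        size_t chunk = n < room ? n : room;

        memcpy(o->buf + o->len, data, chunk);
        o->len += chunk;
        data += chunk;
        n -= chunk;
        if (o->len == sizeof(o->buf)) model_flush(o);
    }
}

/* Model of a string write, built on model_write. */
static void model_puts(struct buffered_out *o, const char *s)
{
    model_write(o, s, strlen(s));
}
```

The point of the model is the calling pattern: small writes cost nothing downstream until the buffer fills, and an explicit flush is the only way to force data out early.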
Now that you know how to send data, there is only one more thing that you must know before you can write your own filter. Apache filters are called as many times as necessary to process all of the data produced by the handler. This means that it is possible and even likely that at some point your filter will begin to process data and find that it doesn't have enough to finish processing. In some cases, you can just save a state and return to the previous filter to wait for more information. However, more often you will need to save some of the data that you have already parsed for the next time your filter is called. This is done using:
ap_save_brigade(ap_filter_t *f, apr_bucket_brigade **save_to, apr_bucket_brigade **b, apr_pool_t *p)
This function accepts the current filter pointer as the first argument. The second argument is the bucket brigade used to save the data. If that brigade is NULL, it will be created inside the ap_save_brigade function. The third parameter is the current brigade. This brigade should contain the data that you want to save for the next time the filter is called. Finally, this function accepts a pool which is used to allocate any required data. When this function returns, the save_to brigade contains a complete copy of all the data to be sent to the next filter. This brigade can then be saved in the filter's ctx pointer for use the next time the filter is called.
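The idea of carrying unfinished data from one invocation to the next can be sketched without any httpd types. In the toy model below, word_ctx plays the role of the filter's ctx pointer and its saved buffer plays the role of the saved brigade; all names are hypothetical, and the fixed 64-byte limit is an assumption made only to keep the sketch short. The filter is called once per chunk and only completes a word when it has seen the whitespace, or the end of the data, that follows it.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-filter context, playing the role of f->ctx. */
struct word_ctx {
    char saved[64];       /* partial word carried over between calls */
    size_t saved_len;
    char last_word[64];   /* most recently completed word */
    size_t words;         /* number of complete words seen so far */
};

/* A word is complete: record it and clear the saved partial data. */
static void finish_word(struct word_ctx *ctx)
{
    memcpy(ctx->last_word, ctx->saved, ctx->saved_len);
    ctx->last_word[ctx->saved_len] = '\0';
    ctx->words++;
    ctx->saved_len = 0;
}

/* Process one chunk of input. "last" is nonzero for the final chunk.
 * Anything left in ctx->saved on return is waiting for more data. */
static void word_filter(struct word_ctx *ctx, const char *chunk,
                        size_t n, int last)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (isspace((unsigned char)chunk[i])) {
            if (ctx->saved_len > 0) finish_word(ctx);
        } else {
            /* sketch only: words longer than 63 bytes are not handled */
            assert(ctx->saved_len + 1 < sizeof(ctx->saved));
            ctx->saved[ctx->saved_len++] = chunk[i];
        }
    }
    if (last && ctx->saved_len > 0) finish_word(ctx);
}
```

Feeding this model the chunks "hel", "lo wor", and "ld" yields the words "hello" and "world", even though neither word arrives whole in a single call.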
When writing filters, it is important to realize that filters are called as often as necessary to process all the data. This is a good thing because it allows Apache to stream information to the clients as soon as it is available. However, this also has some drawbacks for filter writers.
It is important to realize that there are some things that can be done the first time a filter is called that can never be done again. For example, the first time a filter is called, it is possible to modify the headers associated with a response. It is also possible to add a special error bucket to the brigade. If an error bucket is added to the brigade, then one of Apache's core filters will find the bucket and issue an error response instead of sending the data that has been generated.
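One common way to tell the first invocation apart from later ones is to check whether the filter's context has been set up yet. The sketch below models that idiom in plain C, assuming the context starts out as NULL; model_filter, model_state, and model_invoke are hypothetical names, and the header_updates counter merely stands in for first-call work such as modifying response headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-filter state created on the first invocation. */
struct model_state {
    int started;
};

/* Hypothetical filter object; ctx plays the role of f->ctx. */
struct model_filter {
    struct model_state *ctx;      /* NULL until the first invocation */
    struct model_state storage;   /* backing store, to avoid malloc here */
    int header_updates;           /* times the first-call work ran */
    int invocations;              /* total calls to the filter */
};

/* One invocation of the filter. The first-call block runs exactly once,
 * no matter how many times the filter is called afterwards. */
static void model_invoke(struct model_filter *f)
{
    assert(f != NULL);
    if (f->ctx == NULL) {
        /* First call only: the one chance for header-style changes. */
        f->header_updates++;
        f->storage.started = 1;
        f->ctx = &f->storage;
    }
    f->invocations++;
}
```

However many times the filter runs for a response, the first-call block executes once, which is where header changes belong.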
Between the description of the filter functions in this article, and the example output filter linked to above, you should have enough information to be able to write your own Apache filters. Many things can be done with Apache filters and I encourage everybody to experiment with the new filtering abilities in Apache 2.0. In the next article in this series, we will explore Apache 2.0 input filters. Input and output filters share some characteristics, but they are different enough that spending one article specifically on input filters is a valuable exercise.
Finally, I want to catch everybody up on the current status of Apache 2.0. We are still tracking down a show-stopper problem with the threaded MPM that could take down the web server in edge cases. A beginning patch has been applied, and as soon as it is completed the Apache developers will tag and roll another distribution of Apache 2.0. This distribution should become the next beta release of Apache.
Ryan Bloom is a member of the Apache Software Foundation, and the Vice President of the Apache Portable Run-time project.