 Published on ONLamp.com (http://www.onlamp.com/)


Building a PHP Front Controller

by Q Ethan McCallum
07/08/2004

I recently had the opportunity to implement some small (noncommercial) web sites in PHP. One of many decisions I faced concerned the templating strategy: duplicated, include()-based template pages weren't quite future-proof, while a formal, external template library would have been overkill.

Luckily, the web host in question permitted certain key Apache directives in .htaccess that let me customize request handling. Add to that PHP's OO support and my templating decision reached a comfortable middle ground in the form of a custom Front Controller.

The Front Controller design pattern describes a way to centralize processing in a web-based application. Routing all requests through a single entry point provides a place to apply common application logic and, at the same time, reduces dependencies between other components. All of this adds up to a site that is easier to maintain and extend.

In this article, I will share a stripped-down version of the controller I implemented to mediate the relationship between content and layout.

What I explain here is not limited to PHP; the Front Controller design pattern is available in many server-based dynamic web design toolkits, such as mod_perl and Java servlets/JSP. Unlike those technologies, this article's implementation of the Front Controller requires only a modest Apache setup. Your PHP web host most likely supports the required configuration.

The sample code was tested under Apache 1.3.27 and PHP 4.3.4.

First, Some Theory

A web server is typically a glorified file service. A client asks for a file and the server retrieves it. The Uniform Resource Identifier (URI) specifies a file under the document root, so the request and response are closely related.

Web servers may also invoke executables, such as CGI programs or PHP scripts, instead of fetching raw files. Executables insert logic between the request and response, changing portions of the latter (to add a customized greeting or pull different records from a database, for example) with each call.

It's possible to route all requests through a single executable, generating an entirely different page for each one. In this case the executable is the site's Front Controller and the generated response is the view. Together, the URI and request parameters form a command issued to the controller, which performs some logic to decide which view to return to the client.

The controller's central location in the request-response cycle makes it a suitable place to apply common logic for checking credentials or setting response headers.

A controller-managed web site operates independently of the underlying web server to a certain extent; file access, authentication, and content generation take place within the code. This is why the controller can access files from directories that otherwise have .htaccess restrictions.

Purists will note that the description here is abbreviated, with some participants, such as the dispatcher, folded into the main controller. The code is a starting point which you may extend as needed.

Direct and Indirect

Clients must call the controller either directly or indirectly. The direct method adds request parameters to the controller URI:

http://host/controller.php?display=main
http://host/controller.php?display=contact_form
http://host/controller.php?display=privacy_policy

All intra-site links point to /controller.php, but the varied request parameters change which page to show. This works and has wide support among web servers, but the string of parameters can grow unwieldy.
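The direct method can be sketched in a few lines. The following is a minimal illustration, not the article's sample code: the `$views` whitelist, the `resolveView()` helper, and the content file names are all assumptions, and the syntax targets current PHP releases rather than the PHP 4.3.4 used for the article.

```php
<?php
// Hypothetical whitelist: maps the "display" parameter to a content file.
// The file names are illustrative, not taken from the article's sample site.
$views = array(
    'main'           => 'content/main.php',
    'contact_form'   => 'content/contact_form.php',
    'privacy_policy' => 'content/privacy_policy.php',
);

/**
 * Return the content file for a requested view name,
 * or null when the name is not in the whitelist.
 */
function resolveView($name, array $views)
{
    return (is_string($name) && isset($views[$name])) ? $views[$name] : null;
}

// Entry point: every link on the site points here with ?display=...
// (skipped when the file is loaded from the command line).
if (PHP_SAPI !== 'cli') {
    $requested = isset($_GET['display']) ? $_GET['display'] : 'main';
    $file      = resolveView($requested, $views);

    if ($file === null) {
        header('HTTP/1.0 404 Not Found');
        echo 'Page not found';
    } else {
        include $file;
    }
}
```

Looking the parameter up in a fixed list, rather than building a file name from it, keeps a client from asking the controller to include arbitrary files.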

The indirect way is cleaner and looks more natural, but the specific setup depends on the web server used. Apache's AddHandler configuration directive associates an executable or module with a path (such as /special) or filename extension (.site):

http://host/intro/main.site
http://host/intro/contact_form.site
http://host/about/privacy_policy.site

Notice that there is no explicit mention of the controller. The association of the controller with the file extension .site takes place behind the scenes in the httpd.conf or .htaccess file.

The (hidden) executable here is an Apache content handler. When a user requests a matching URI, Apache short-circuits its normal request/response flow and passes control to the registered handler. It is then up to the handler to generate the content to return to the requesting browser.

This article will demonstrate the indirect method and use filename extensions.

A Simple Example

A concrete example will clarify the previous sections' points. Consider the following code, simple.php:

<p>
Controller called for URI
<code>
    <?= htmlspecialchars( $_SERVER[ 'REQUEST_URI' ] ) ?>
</code>
</p>

<ul>
<?php foreach( $_SERVER as $key => $val ){ ?>
    <li>
    <b><?= htmlspecialchars( $key ) ?>:</b>
    <tt><?= htmlspecialchars( print_r( $val , true ) ) ?></tt>
    </li>
<?php } ?>
</ul>

Assume simple.php exists in the base directory of the document root of a web server listening on localhost:8000.

In .htaccess, the lines:

Action     controller-test /simple.php
AddHandler controller-test .tst

define /simple.php as the action controller-test and associate that action with the file extension .tst. When Apache receives a request for a filename ending in .tst, it will execute /simple.php instead of trying to serve a file from disk. (The executable named by Action is relative to the document root.)

Try it out: make web requests for simple.php and then for various .tst files. For example:

http://localhost:8000/simple.php
http://localhost:8000/foo.tst
http://localhost:8000/params.tst?p1=one&p2=two
http://localhost:8000/not_here/this_will_fail.tst

The first URL prints out all request parameters and server variables. So does the second URL, although the "request URI" line changes. Here, foo.tst was requested, and because of the AddHandler call in .htaccess, Apache let simple.php take over. Though it doesn't demonstrate much in the way of aesthetics, the output from simple.php shows the wealth of variables available to a PHP script that is called as a controller. (Sharp eyes may notice that it's the same set of variables available during normal PHP script execution.)

Fittingly, the last URL doesn't work. A requested resource may be purely virtual, but directories leading up to that resource must exist. Requests for /not_here/this_will_fail.tst fail because the directory not_here does not exist in the document root.

Nor can a controller-managed resource act as an index page. An explicit request for index.site will work, but a request of http://localhost:8000/ will not attempt to load /index.site, regardless of DirectoryIndex directives.

These two limitations are based on how Apache maps requests, described in detail in Writing Apache Modules with Perl and C.

What About Apache 2.x?

Users of Apache 2 will notice that simple.php does not work as a content handler. For Apache 2.0.x, AddHandler works only for existing files. According to reports (specifically, Bugzilla bug ID 8431), Apache 1.x misused the term "handler" so this changed in 2.x.

There is hope. Per that bug report, Apache 2.1 may add a configuration directive to relax the restriction, such that AddHandler can refer to nonexistent files just as it did in Apache 1.x.

Putting It Together: An Overview of the Sample Site

This article's sample web site uses a Front Controller to separate content from layout. The controller maps request URIs to content files, then merges the content with a layout template at request time.

php_include/classes.php contains the helper classes that simplify the process, among them AppConfig, SitePage, PageMap, and AliasMap.

SitePage, PageMap, and AliasMap could be replaced by an XML document; but they are straight code here for the sake of simplicity.

php_include/mappings.php configures the page mappings and populates AppConfig with some values. End-users can adjust site settings here without touching the support classes or controller code.
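To make the division of labor concrete, here is a rough sketch of what such helper classes and a mappings file might look like. Only the method names that the controller calls (getValue, getPage, getNotFoundPage, and getLastModified) come from the article; the constructors, the other methods, the file paths, and the omission of AliasMap are assumptions made for illustration, and the syntax targets current PHP releases.

```php
<?php
// Minimal stand-ins for the helper classes (assumed design, see lead-in).

class AppConfig
{
    private $values = array();

    public function setValue($key, $value) { $this->values[$key] = $value; }

    public function getValue($key)
    {
        return isset($this->values[$key]) ? $this->values[$key] : null;
    }
}

class SitePage
{
    private $file;

    public function __construct($file) { $this->file = $file; }

    public function getFile() { return $this->file; }

    public function isAvailable()
    {
        return is_file($this->file) && is_readable($this->file);
    }

    // Unix timestamp of the content file's last change.
    public function getLastModified() { return filemtime($this->file); }
}

class PageMap
{
    private $pages = array();
    private $notFound;

    public function __construct(SitePage $notFound) { $this->notFound = $notFound; }

    public function addPage($uri, SitePage $page) { $this->pages[$uri] = $page; }

    // Null when the URI is unmapped or its content file is missing or unreadable.
    public function getPage($uri)
    {
        if (!isset($this->pages[$uri])) {
            return null;
        }
        $page = $this->pages[$uri];
        return $page->isAvailable() ? $page : null;
    }

    public function getNotFoundPage() { return $this->notFound; }
}

// What a mappings file might contain (hypothetical paths).
$config = new AppConfig();
$config->setValue('URISuffix', '.site');

$pageMappings = new PageMap(new SitePage('content/errors/404.php'));
$pageMappings->addPage('/main/index', new SitePage('content/main/index.php'));
```

Keeping the mappings in one file means that adding a page is a one-line change that never touches the classes or the controller.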

Controller.php, the controller, weighs in around 30 lines of code because the helper classes do the heavy lifting:

<?php

require_once( 'php_include/classes.php'  );
require_once( 'php_include/mappings.php' );

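$daysToCache  = 1.5;
$cacheMaxAge  = (int)( $daysToCache * 24 * 60 * 60 );

$layoutFile   = 'php_include/layout_main.php';

// Strip the custom suffix (for example, ".site") from the end of the URI.
// preg_replace() stands in for ereg_replace(), which PHP 7 removed.
$requestedURI = preg_replace(
    '/' . preg_quote( $config->getValue( 'URISuffix' ) , '/' ) . '$/' ,
    "" ,
    $_SERVER["REQUEST_URI"]
);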

$page         = $pageMappings->getPage( $requestedURI );

if( is_null( $page ) )
{
    header( 'HTTP/1.0 404 Not Found' );
    $page = $pageMappings->getNotFoundPage();
}

if( ! headers_sent() )
{
    header(
        'Cache-Control: max-age=' . $cacheMaxAge . ', must-revalidate'
    );
    header(
        'Last-Modified: '
        . gmdate( 'D, d M Y H:i:s' , $page->getLastModified() )
        . ' GMT'
    );
}

include( $layoutFile );

?>

The actual content files in the content directory are HTML fragments with no layout. The include() function merges them with the layout template. Adding a new page to the site requires placing the file under content, updating the PageMap, and (optionally) updating AliasMap.

The content and php_include directories should never allow direct access, so .htaccess files restrict normal web requests.

.htaccess in the base of the document root contains various Apache settings. Since virtual resources cannot be index documents, RedirectMatch forwards requests for the root URI to /main/index.site. (Change the host and port statements to make the sample code work on your computer.) AddHandler directives associate Controller.php with requests ending in .site. ErrorDocument directives map HTTP status codes to controller-managed error pages.

The stub directories main and errors permit us to call controller-managed resources with directory paths, for logical grouping. (Recall Apache's handling of nonexistent directories, described above.)
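As a rough illustration of the settings just described, the .htaccess files might contain something like the following. The handler name site-controller, the error page path, and the localhost:8000 address are assumptions, and the access-control lines use the Apache 1.3-era syntax that matches the article's test environment.

```apache
# Document root .htaccess (sketch)

# Route requests ending in .site through the controller
Action     site-controller /Controller.php
AddHandler site-controller .site

# Virtual resources cannot be index documents, so forward the root URI
RedirectMatch ^/$ http://localhost:8000/main/index.site

# Map an HTTP status code to a controller-managed error page (hypothetical path)
ErrorDocument 404 /errors/404.site

# content/.htaccess and php_include/.htaccess (sketch):
# refuse all direct web requests for files in these directories
Order deny,allow
Deny from all
```

The deny rules affect only requests that arrive over HTTP. The controller reads the same files through include(), which goes to the filesystem directly.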

Request Flow

When the site receives a request for a controller-managed file:

  1. The controller loads its helper classes and the mappings file, which creates $config, the app-wide configuration object.
  2. The controller strips the custom file extension (as declared in mappings.php) from the request URI and pulls the matching SitePage object from the PageMap.
  3. If the content file exists and is readable, the $page variable will contain the SitePage object. Otherwise, $page refers to the designated 404-error page.
  4. The controller sets headers (in this case, Cache-Control and Last-Modified) and passes control to the layout template.
  5. The layout template (php_include/layout_main.php) uses $config to set the header bar's background color and pulls the name of the content file from $page.
  6. The include() call inserts the content page in the middle of the template to complete the view. The server will interpret content files as PHP documents, so they may contain any valid PHP logic.

At this point, the server can return a complete HTML document to the client browser.
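The last two steps of the flow can be sketched as a single function. This is an assumed shape, not the article's layout_main.php: the renderLayout() name, its parameters, and the markup are hypothetical, and $headerColor and $contentFile stand in for the values the real template reads from $config and $page.

```php
<?php
/**
 * Render the full page: a fixed layout wrapped around one content file.
 * Both parameters are illustrative stand-ins (see the lead-in).
 */
function renderLayout($headerColor, $contentFile)
{
    ob_start();
    echo "<html>\n<body>\n";
    echo '<div style="background-color: '
        . htmlspecialchars($headerColor) . '">Sample Site</div>' . "\n";
    // The content file is itself run as PHP, so it may contain logic.
    include $contentFile;
    echo "\n</body>\n</html>\n";
    return ob_get_clean();
}
```

Because the content arrives through include(), a content fragment can mix markup with PHP and still land inside the same header and footer as every other page.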

Is It Appropriate?

Few techniques are suitable for everyone. If your web site updates involve adding new pages in-flight, then the overhead of the map maintenance may work against you. However, if the dynamic nature of your site depends on something outside of the raw pages (such as a database) or if your changes involve carefully-tested migrations, then the Front Controller technique may benefit you long-term.

Conclusion

The Front Controller demonstrated in this article separates the request from the response, the layout from the content, and the content pages from each other. Such decoupling offers significant flexibility over both static HTML pages and duplicated, include()-based template pages.

The sample code is just a starting point. Certainly, the controller could do more in terms of logging or authentication checks. The SitePage class could also provide more information, such as per-page <META> tags. Finally, it would be trivial to implement automatic generation of a site map through a helper class that logically groups internal aliases.
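As one possible take on the site map idea, the sketch below groups URIs by their first path segment and prints nested lists. The buildSiteMap() function and the shape of its input (URI mapped to link title) are assumptions, since the internals of PageMap and AliasMap are not shown here.

```php
<?php
/**
 * Group site URIs by their first path segment and render nested lists.
 * $pages maps URI => link title (an assumed structure, see the lead-in).
 */
function buildSiteMap(array $pages, $suffix = '.site')
{
    $groups = array();
    foreach ($pages as $uri => $title) {
        $parts = explode('/', trim($uri, '/'));
        $groups[$parts[0]][$uri] = $title;
    }
    ksort($groups);

    $html = "<ul>\n";
    foreach ($groups as $group => $entries) {
        $html .= '  <li>' . htmlspecialchars($group) . "\n    <ul>\n";
        foreach ($entries as $uri => $title) {
            $html .= '      <li><a href="' . htmlspecialchars($uri . $suffix) . '">'
                   . htmlspecialchars($title) . "</a></li>\n";
        }
        $html .= "    </ul>\n  </li>\n";
    }
    return $html . "</ul>\n";
}

// Example input with made-up pages
echo buildSiteMap(array(
    '/main/index'   => 'Home',
    '/main/contact' => 'Contact',
    '/about/policy' => 'Privacy Policy',
));
```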

For more detail on the Front Controller design pattern, refer to the texts listed in the Resources section.

Resources

Q Ethan McCallum grew from curious child to curious adult, turning his passion for technology into a career.



Copyright © 2009 O'Reilly Media, Inc.