HTTP Wrangler

AxKit: An XML-Delivery Toolkit for Apache

by Rael Dornfest
11/29/2000

AxKit is a feature-rich XML application server for Apache that brings together flexible XML transformation, a dynamic component architecture, and the power of an embedded Perl interpreter.

I can't say enough about Matt Sergeant's AxKit -- thankfully he has, with instructions, tutorials, and articles galore. Termed "an XML Delivery Toolkit," AxKit is just that -- a method of delivering all that static or dynamic XML content you've spent so much effort building. And AxKit's pipeline concept makes it a snap to present the same content in a format targeted to the Web browser, handheld device, WAP phone, printer, or other platform at hand.

AxKit is built on top of mod_perl, the integration of the Perl interpreter and Apache Web server. Not simply an XSLT module for Apache, AxKit allows you to tap into the power of Perl and the Apache API to extend its transformative abilities modularly.

Installation and configuration

AxKit installation assumes:

  • A mod_perl-enabled Apache Web server. If you don't already have mod_perl installed and configured, visit the HTTP Wrangler column on Installing mod_perl from RPM and come on back when finished. For the intricate details on mod_perl installation, visit the mod_perl Installation Guide.

  • A few requisite Perl modules. While some are not strictly needed for basic AxKit functionality, all are useful in their own right and well worth installing.

    • XML::XPath
    • XML::Parser
    • MIME::Base64
    • Digest::MD5
    • mod_perl
    • Storable
    • XML::Sablotron
    • Apache::Request
    • DBI
    • Unicode::Map8
    • Unicode::String
    • Time::HiRes
    • Compress::Zlib
    • Error

    I used the handy CPAN module to install the modules I was missing. Here I am installing the Error module.

    # perl -MCPAN -e shell;
    
    cpan shell -- CPAN exploration and modules installation (v1.48)
    ReadLine support available (try ``install Bundle::CPAN'')
    
    cpan> install Error
    Running make for G/GB/GBARR/Error-0.13.tar.gz
    Fetching with LWP:
      ftp://ftp.perl.org/pub/perl/CPAN/authors/id/G/GB/GBARR/Error-0.13.tar.gz
    Fetching with LWP:
      ftp://ftp.perl.org/pub/perl/CPAN/authors/id/G/GB/GBARR/CHECKSUMS
    Checksum for /root/.cpan/sources/authors/id/G/GB/GBARR/Error-0.13.tar.gz ok
    Error-0.13/
    Error-0.13/t/
    Error-0.13/t/02order.t
    Error-0.13/t/01throw.t
    Error-0.13/Error.pm
    Error-0.13/MANIFEST
    Error-0.13/ChangeLog
    Error-0.13/Makefile.PL
    Error-0.13/Error.ppd
    Error-0.13/example
    Error-0.13/README
    
      CPAN.pm: Going to build G/GB/GBARR/Error-0.13.tar.gz
    
    Checking if your kit is complete...
    Looks good
    Writing Makefile for Error
    mkdir blib
    mkdir blib/lib
    mkdir blib/arch
    mkdir blib/arch/auto
    mkdir blib/arch/auto/Error
    mkdir blib/lib/auto
    mkdir blib/lib/auto/Error
    mkdir blib/man3
    cp Error.pm blib/lib/Error.pm
    Manifying blib/man3/Error.3
      /usr/bin/make  -- OK
    Running make test
    PERL_DL_NONLAZY=1 /usr/bin/perl -Iblib/arch -Iblib/lib 
      -I/usr/lib/perl5/5.00503/i386-linux -I/usr/lib/perl5/5.00503 
      -e 'use Test::Harness qw(&runtests $verbose); $verbose=0; runtests @ARGV;' 
      t/*.t
    t/01throw...........ok
    t/02order...........ok
    All tests successful.
    Files=2,  Tests=10,  0 wallclock secs ( 0.14 cusr +  0.00 csys =  0.14 CPU)
      /usr/bin/make test -- OK
    Running make install
    Installing /usr/lib/perl5/site_perl/5.005/Error.pm
    Installing /usr/lib/perl5/man/man3/Error.3
    Writing /usr/lib/perl5/site_perl/5.005/i386-linux/auto/Error/.packlist
    Appending installation info to /usr/lib/perl5/5.00503/i386-linux/perllocal.pod
      /usr/bin/make install  -- OK
    
    cpan>
  • The Sablotron XSLT Processor. I grabbed the Linux binary. Unfortunately, installation is left to you, but this is just a matter of copying the appropriate files into place, as shown below.

    [~]% tar xvzf Sablot-linux-ix86-0.44.tar.gz
    Sablot-0.44/
    Sablot-0.44/bin/
    Sablot-0.44/bin/sabcmd
    Sablot-0.44/include/
    Sablot-0.44/include/xmlparse.h
    Sablot-0.44/include/sablot.h
    Sablot-0.44/include/shandler.h
    Sablot-0.44/lib/
    Sablot-0.44/lib/libxmltok.so.1.1.1
    Sablot-0.44/lib/libxmltok.so.1
    Sablot-0.44/lib/libxmltok.so
    Sablot-0.44/lib/libxmlparse.so.1
    Sablot-0.44/lib/libxmlparse.so.1.1.1
    Sablot-0.44/lib/libsablot.so.0
    Sablot-0.44/lib/libxmlparse.so
    Sablot-0.44/lib/libsablot.so
    Sablot-0.44/lib/libsablot.so.0.44.0
    Sablot-0.44/README
    Sablot-0.44/INSTALL
    Sablot-0.44/RELEASE
    [~]% cd Sablot-0.44
    [Sablot-0.44]% su
    # cp -v bin/* /usr/bin/
    bin/sabcmd -> /usr/bin/sabcmd
    # cp -v lib/* /usr/lib/
    lib/libsablot.so -> /usr/lib/libsablot.so
    lib/libsablot.so.0 -> /usr/lib/libsablot.so.0
    lib/libsablot.so.0.44.0 -> /usr/lib/libsablot.so.0.44.0
    lib/libxmlparse.so -> /usr/lib/libxmlparse.so
    lib/libxmlparse.so.1 -> /usr/lib/libxmlparse.so.1
    lib/libxmlparse.so.1.1.1 -> /usr/lib/libxmlparse.so.1.1.1
    lib/libxmltok.so -> /usr/lib/libxmltok.so
    lib/libxmltok.so.1 -> /usr/lib/libxmltok.so.1
    lib/libxmltok.so.1.1.1 -> /usr/lib/libxmltok.so.1.1.1
    # cp -v include/* /usr/include/
    include/sablot.h -> /usr/include/sablot.h
    include/shandler.h -> /usr/include/shandler.h
    include/xmlparse.h -> /usr/include/xmlparse.h
    # rehash ; sabcmd --version
    
    sabcmd 0.44 (September 13, 2000)
    copyright (C) 2000 Ginger Alliance (www.gingerall.com)
    
    The Sablotron XSLT Processor comes with NO WARRANTY.
    It is subject to the Mozilla Public License Version 1.1.
    Alternatively, you may use Sablotron under the GNU General Public License.
    
    # exit
    [Sablot-0.44]%
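
Before moving on to AxKit itself, it can be worth confirming which of the Perl modules listed above your Perl can already load. The following is only a rough sketch, not part of AxKit: the helper names perl_can_load and missing_modules are made up for illustration, and it assumes a perl binary on your PATH. It relies on the standard trick that `perl -MSome::Module -e 1` exits non-zero when the module cannot be loaded. Bear in mind that a failed load is not proof a module is absent; Apache::Request, for instance, may refuse to load outside a running mod_perl server.

```python
import subprocess

# Module names copied from the prerequisite list above.
MODULES = [
    "XML::XPath", "XML::Parser", "MIME::Base64", "Digest::MD5", "mod_perl",
    "Storable", "XML::Sablotron", "Apache::Request", "DBI", "Unicode::Map8",
    "Unicode::String", "Time::HiRes", "Compress::Zlib", "Error",
]


def perl_can_load(module, perl="perl"):
    """Return True if `perl -MModule -e 1` exits with status 0."""
    try:
        result = subprocess.run(
            [perl, "-M" + module, "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # No usable perl binary was found: report the module as unavailable.
        return False
    return result.returncode == 0


def missing_modules(modules, can_load=perl_can_load):
    """Return the modules that could not be loaded, in their original order.

    `can_load` is a hypothetical hook so the check can be swapped out
    (for example, in a test) without invoking perl at all.
    """
    return [name for name in modules if not can_load(name)]
```

Calling missing_modules(MODULES) returns the names you would then hand to the CPAN shell's install command, one at a time, as in the session shown earlier.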

AxKit itself is simple to install and configure; I leave you to the instructions at Axkit.org.

XSLT

XSLT, or Extensible Stylesheet Language Transformations, is "a language for transforming XML documents into other XML documents." XSLT stylesheets are themselves well-formed XML documents. The language is declarative ("transform by example"): you write templates that match parts of the input, rather than the step-by-step procedural code of most scripting languages you might be used to. Whether you consider XSLT ugly or elegant, simple or confusing, it is powerful magic once you're used to it.

AxKit sports not one, but two XSLT modules -- the XML::XSLT Perl module and Ginger Alliance's speedy C++-based Sablotron Open Source XSLT Processor.

Example 1: XSLT

As XML input, I used the O'Reilly Network's RSS 1.0 reference feed. An RSS file describes a channel consisting of one or more <item>s, each describing a Web resource; one such item appears in the excerpt below. The only change made to the raw feed is the addition of the <?xml-stylesheet...> processing instruction, telling AxKit to use one.xsl, an XSLT stylesheet, to transform the XML file.

one.xml

<?xml version="1.0" encoding="iso-8859-1"?>
<?xml-stylesheet href="one.xsl" type="text/xsl"?>

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns="http://purl.org/rss/1.0/"
>

...

<item rdf:about="http://www.oreillynet.com/pub/a/p2p/2000/11/24/shirky1-whatisp2p.html">
 <title>What Is P2P ... And What Isn't</title>
 <link>http://www.oreillynet.com/pub/a/p2p/2000/11/24/shirky1-whatisp2p.html</link>
 <dc:description>
  Shirky sheds some light on what's distinctive, new, and useful about P2P.
 </dc:description>
 <dc:creator>Clay Shirky</dc:creator>
 <dc:subject>P2P, Distributed</dc:subject>
 <dc:type>Article</dc:type>
 <dc:language>en-us</dc:language>
 <dc:date>2000-11-24</dc:date>
 <dc:format>text/html</dc:format>
 <dc:rights>Copyright 2000, O'Reilly and Associates</dc:rights>
 <dc:publisher>O'Reilly and Associates</dc:publisher>
</item>

...

</rdf:RDF>

The XSLT stylesheet starts by wrapping the page in the usual <html> and <body> tags. Then, for each RSS <item>, it displays a hyperlinked title and associated description.

one.xsl

<?xml version="1.0"?>

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rss="http://purl.org/rss/1.0/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  version="1.0">

 <xsl:template match="/rdf:RDF">
  <html>
  <body>
   <xsl:apply-templates select="rss:item" />
  </body>
  </html>
 </xsl:template>

 <xsl:template match="rss:item">
  <p>
  <a>
  <xsl:attribute name="href"><xsl:value-of select="@rdf:about" /></xsl:attribute>
  <xsl:value-of select="rss:title" />
  </a>
  <br />
  <xsl:value-of select="dc:description" />
  </p>
 </xsl:template>

</xsl:stylesheet>

I point my browser at http://.../one.xml and retrieve a nicely rendered HTML page.

What Is P2P ... And What Isn't
Shirky sheds some light on what's distinctive, new, and useful about P2P.

What's New in Qt 2.2.2
A review of Qt 2.2.2, a cross-platform C++ toolkit, and a first look at the Qt Palmtop Environment.

XML-RPC in Python
Python's xmlrpclib module makes using the XML-RPC protocol easy.

...
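
If the template-matching style of one.xsl feels unfamiliar, it may help to see roughly the same logic written procedurally. The sketch below is an approximation in Python using only the standard library; it is not how AxKit or Sablotron work internally, and the function names render_item and render_feed are invented for this illustration. Each function stands in for one of the two templates in the stylesheet. One liberty taken: the description is stripped of surrounding whitespace, which the stylesheet does not do.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Prefix-to-URI map mirroring the xmlns declarations in one.xsl.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# ElementTree stores a namespaced attribute under the key "{uri}local".
RDF_ABOUT = "{%s}about" % NS["rdf"]


def render_item(item):
    """Counterpart of the rss:item template: linked title, then description."""
    href = item.get(RDF_ABOUT, "")
    title = item.findtext("rss:title", default="", namespaces=NS)
    description = item.findtext("dc:description", default="", namespaces=NS)
    return "<p><a href=%s>%s</a><br />%s</p>" % (
        quoteattr(href), escape(title), escape(description.strip()))


def render_feed(xml_text):
    """Counterpart of the /rdf:RDF template: wrap every item in html/body."""
    root = ET.fromstring(xml_text)
    # Like select="rss:item", this only looks at direct children of the root.
    items = root.findall("rss:item", NS)
    body = "\n".join(render_item(item) for item in items)
    return "<html>\n<body>\n%s\n</body>\n</html>" % body
```

The contrast is the point: in XSLT the processor walks the document and decides which template fires, whereas here the code drives the traversal itself.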

XPathScript

XPathScript pairs XPath node selection with the full Perl language for unrivaled transformative functionality. XPathScript borrows its syntax from ASP (Active Server Pages), embedding code right into the output template between <% %> delimiters. "The result is a language for server-side transformation that provides the power and flexibility of XSLT, combined with the full capabilities of the Perl language, and the ability to produce style sheets in any ASP-capable or ordinary text editor" (XPathScript: An Alternative To XSLT).

Example 2: XPathScript

This example highlights XPathScript's ability to output content in formats other than XML. Here the result will be a plain text document. I am again using the O'Reilly Network's RSS 1.0 feed as XML input. You'll notice the only change is the <?xml-stylesheet...> processing instruction, indicating that we'll be using two.xps, an XPathScript stylesheet.

two.xml

<?xml version="1.0" encoding="iso-8859-1"?>
<?xml-stylesheet href="two.xps" type="application/x-xpathscript"?>

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns="http://purl.org/rss/1.0/"
>

...

<item rdf:about="http://www.oreillynet.com/pub/a/p2p/2000/11/24/shirky1-whatisp2p.html">
 <title>What Is P2P ... And What Isn't</title>
 <link>http://www.oreillynet.com/pub/a/p2p/2000/11/24/shirky1-whatisp2p.html</link>
 <dc:description>
  Shirky sheds some light on what's distinctive, new, and useful about P2P.
 </dc:description>
 <dc:creator>Clay Shirky</dc:creator>
 <dc:subject>P2P, Distributed</dc:subject>
 <dc:type>Article</dc:type>
 <dc:language>en-us</dc:language>
 <dc:date>2000-11-24</dc:date>
 <dc:format>text/html</dc:format>
 <dc:rights>Copyright 2000, O'Reilly and Associates</dc:rights>
 <dc:publisher>O'Reilly and Associates</dc:publisher>
</item>

...

</rdf:RDF>

A stunningly simple XPathScript stylesheet feeds each <item> found by the findnodes() function to a Perl foreach loop.

two.xps


<%

# Set Content-Type to plain text 
$r->content_type('text/plain');

# Loop through the items
foreach my $item (findnodes('/rdf:RDF/item')) {

  # Print a nicely formatted block of text using the values
  # returned by XPathScript's findvalue() function
  printf("Title:  %s\nAuthor: %s\nDate:   %s\nSubject: %s\n\n",
      findvalue('./title/text()', $item),
      findvalue('./dc:creator/text()', $item),
      findvalue('./dc:date/text()', $item),
      findvalue('./dc:subject/text()', $item)
  );
}
%>

This produces a nicely formatted plain-text document listing the latest articles on the O'Reilly Network...

Title:  What Is P2P ... And What Isn't
Author: Clay Shirky
Date:   2000-11-24
Subject: P2P, Distributed

Title:  What's New in Qt 2.2.2
Author: Boudewijn Rempt
Date:   2000-11-24
Subject: Network

Title:  XML-RPC in Python
Author: Dave Warner
Date:   2000-11-22
Subject: Python, Internet, Programming

...
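
For comparison, here is a loose Python rendition of what two.xps does, using only the standard library. It is offered as a sketch for readers who want to experiment outside AxKit, not as an equivalent of XPathScript; the function name feed_to_text is made up. One difference worth noting: ElementTree is strict about namespaces, so the RSS elements have to be addressed through a prefix mapped to the RSS 1.0 namespace (rss:item, rss:title) rather than by the bare names used in two.xps.

```python
import xml.etree.ElementTree as ET

# Prefixes are local to this sketch; only the URIs have to match the feed.
NS = {
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Same layout as the printf() format string in two.xps.
BLOCK = "Title:  %s\nAuthor: %s\nDate:   %s\nSubject: %s\n\n"

# The four values printed per item, in the order two.xps prints them.
FIELDS = ("rss:title", "dc:creator", "dc:date", "dc:subject")


def feed_to_text(xml_text):
    """Return one plain-text block per <item> found under the root element."""
    root = ET.fromstring(xml_text)
    blocks = []
    for item in root.findall("rss:item", NS):
        values = tuple(
            item.findtext(path, default="", namespaces=NS).strip()
            for path in FIELDS
        )
        blocks.append(BLOCK % values)
    return "".join(blocks)
```

A missing field simply comes out as an empty string here, which keeps the four labels aligned for every item.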

This look at AxKit was just a toe in the water, barely scratching the surface of the powerful features and incredible flexibility this XML application server has to offer. For more information, visit Axkit.org, read the "Introduction to AxKit," and visit some of the links in the Resources section at the end of this article.

Alternatives

If Java's your bag, then the Apache Software Foundation's 100% Java publishing framework Cocoon may be a good fit. Cocoon provides XSLT transformation, targeted formatting, and caching functionality similar to AxKit's, but it lacks an XPathScript equivalent. It does, however, support XSL-FO for generating PDF documents on the fly, which is rather nice.

If you simply want to perform dynamic XSLT transformations on your Web content and all the extras AxKit and Cocoon provide are overkill, you may want to try mod_xslt, an Apache XSLT module.

And, of course, you can always do static transformation using the Perl XML::XSLT module, Sablotron and its ties into various programming languages (e.g. PHP, Python, Ruby), or the like. This obviously doesn't work for dynamically generated content. It will also become quite a chore as your content quantity grows -- as it always does!

Resources

The following is a list of starting points from which to explore further some of the topics covered (or not) in this article.

Rael Dornfest is Founder and CEO of Portland, Oregon-based Values of n. Rael leads the Values of n charge with passion, unearthly creativity, and a repertoire of puns and jokes — some of which are actually good. Prior to founding Values of n, he was O'Reilly's Chief Technical Officer, program chair for the O'Reilly Emerging Technology Conference (which he continues to chair), series editor of the bestselling Hacks book series, and instigator of O'Reilly's Rough Cuts early access program. He built Meerkat, the first web-based feed aggregator, was champion and co-author of the RSS 1.0 specification, and has written and contributed to six O'Reilly books. Rael's programmatic pride and joy is the nimble, open source blogging application Blosxom, the principles of which you'll find in the Values of n philosophy and embodied in Stikkit: Little yellow notes that think.

