ONJava.com    
 Published on ONJava.com (http://www.onjava.com/)


O'Reilly Book Excerpts: Learning Java, 2nd Edition

XML Basics for Java Developers, Part 5

by Patrick Niemeyer and Jonathan Knudsen

In this final installment in a series of book excerpts on XML basics for Java developers from Learning Java, 2nd Edition, get an introduction to XSL/XSLT and web services.

Related Reading

Learning Java
By Patrick Niemeyer, Jonathan Knudsen

XSL/XSLT

Earlier in this chapter, we used a Transformer object to copy a DOM representation of an example back to XML text. We mentioned then that we were not really tapping the potential of the Transformer. Now we'll give you the full story.

The javax.xml.transform package is the API for using the XSL/XSLT transformation language. XSL stands for Extensible Stylesheet Language. Like Cascading Style Sheets for HTML, XSL allows us to "mark up" XML documents by adding tags that provide presentation information. XSL Transformation (XSLT) takes this further by adding the ability to completely restructure the XML and produce arbitrary output. XSL and XSLT together make up their own programming language for processing an XML document as input and producing another (usually XML) document as output. (From here on, we'll refer to them collectively as XSL.)

XSL is extremely powerful, and new applications for it arise every day. For example, consider a web portal that is frequently updated and that must be accessible from a variety of devices, from PDAs to cell phones to traditional browsers. Rather than recreating the site for each of these platforms, you can use XSL to transform the content into an appropriate format for each one. Multilingual sites also benefit from XSL.

You can probably guess the caveat that we're going to issue next: XSL is a big topic worthy of its own books (see, for example, O'Reilly's Java and XSLT by Eric Burke, a fellow St. Louis author), and we can only give you a taste of it here. Furthermore, some people find XSL difficult to understand at first glance because it requires thinking in terms of recursively processing document tags. Don't be put off if you have trouble following this example; just file it away and return to it when you need it. At some point, you will be interested in the power transformation can offer you.

XSL Basics

XSL is an XML-based standard, so it should come as no surprise that stylesheets are themselves written in XML. An XSL stylesheet is an XML document using special tags defined by the XSL namespace to describe the transformation. The most basic XSL operations include matching parts of the input XML document and generating output based on their contents. One or more XSL templates live within the stylesheet and are called in response to tags appearing in the input. XSL is often used in a purely input-driven way, where input XML tags trigger output in the order that they appear, using only the information they contain. But more generally, the output can be constructed from arbitrary parts of the input, drawing from it like a database, composing elements and attributes. The transformation part of XSL (XSLT) adds things like conditionals and for loops to this mix, enabling arbitrary output to be generated based on the input.

An XSL stylesheet contains as its root element a stylesheet tag. By convention, the stylesheet defines a namespace prefix xsl for the XSL namespace. Within the stylesheet are one or more template tags containing a match attribute describing the element upon which they operate.

<xsl:stylesheet
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

   <xsl:template match="/">
     I found the root of the document!
   </xsl:template>

</xsl:stylesheet>

When a template matches an element, it has an opportunity to handle all the children of the element. The simple stylesheet above has one template that matches the root of the input document and simply outputs some plain text. By default, input not matched is simply copied to the output with its tags stripped (HTML convention). But here we match the root so we consume the entire input.
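To see this template in action from Java, here is a minimal sketch that applies the stylesheet above to a tiny in-memory document. The RootTemplateDemo class, its transform helper, and the sample XML (a stand-in for zooinventory.xml) are assumptions made for illustration; only the stylesheet itself comes from the text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class RootTemplateDemo {
    // The stylesheet from the text: a single template matching the document root.
    static final String XSL =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:template match='/'>I found the root of the document!</xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical input standing in for zooinventory.xml.
    static final String XML =
        "<Inventory><Animal><Name>Cocoa</Name><Species>Gorilla</Species></Animal></Inventory>";

    // Applies a stylesheet to a document, both held in strings, and returns the result.
    static String transform(String xsl, String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(XSL, XML));
    }
}
```

Running it should print the template's text (typically preceded by an XML declaration, since the default output method is XML) and none of the input's content, because the root template consumes the whole document.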

The match attribute can refer to elements in a hierarchical path fashion starting with the root. For example, match="/Inventory/Animal" would match only the Animal elements from our zooinventory.xml file. The path may be absolute (starting with "/") or relative, in which case the template detects whenever that element appears in any context. The match attribute actually uses an expression format called XPath that allows you to describe element names using a syntax somewhat similar to a regular expression. XPath is a powerful syntax for describing sets of nodes in XML, and it includes notation for describing sets of child nodes based on path and even attributes.
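If you want to experiment with XPath expressions on their own, outside of a stylesheet, the following sketch may help. It assumes a JDK that includes the javax.xml.xpath package (added to Java after this chapter was written), and the sample document, including its animalClass attribute, is a hypothetical stand-in for zooinventory.xml.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathDemo {
    // Hypothetical stand-in for zooinventory.xml; the animalClass attribute is assumed.
    static final String XML =
        "<Inventory>"
      + "<Animal animalClass='mammal'><Name>Cocoa</Name><Species>Gorilla</Species></Animal>"
      + "<Animal animalClass='bird'><Name>Arnold</Name><Species>Condor</Species></Animal>"
      + "</Inventory>";

    // Evaluates an XPath expression against the sample document and
    // returns the result converted to a string.
    static String evaluate(String expression) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
    }

    public static void main(String[] args) throws XPathExpressionException {
        // Absolute path: how many Animal elements sit directly under Inventory?
        System.out.println(evaluate("count(/Inventory/Animal)"));
        // Select by attribute value.
        System.out.println(evaluate("/Inventory/Animal[@animalClass='bird']/Name"));
        // Select by the content of a child element.
        System.out.println(evaluate("/Inventory/Animal[Name='Cocoa']/Species"));
    }
}
```

The expressions here are full XPath expressions rather than match patterns, but the path and attribute notation is the same one the match attribute draws on.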

Within the template, we can put whatever we want, as long as it is well-formed XML (if not, we can use a CDATA section). But the real power comes when we use parts of the input to generate output. The XSL value-of tag is used to output the content of an element or a child of the element. For example, the following template would match an Animal element and output the value of its Name child:

<xsl:template match="Animal">
   Name: <xsl:value-of select="Name"/>
</xsl:template>

The select attribute uses a similar expression format to match. Here we tell it to print the value of the Name element within Animal. We could have used a relative path to a more deeply nested element within Animal or even an absolute path to another part of the document. To refer to its own element, we can simply use "." as the path. The select expression can also retrieve attributes from the elements it refers to.
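As a rough illustration of these select forms, the sketch below pulls a child element, an attribute, and a more deeply nested element out of a single Animal. The ValueOfSelectDemo class, the sample document, and its animalClass attribute are assumptions. The stylesheet deliberately has no template for the root, so the processor's default rules carry it down to the Animal template, and it uses the text output method to keep the result free of markup.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ValueOfSelectDemo {
    static final String XSL =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='Animal'>"
        // child element, then attribute, then a relative path to a nested element
      + "<xsl:value-of select='Name'/> (<xsl:value-of select='@animalClass'/>): "
      + "<xsl:value-of select='FoodRecipe/Name'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical stand-in for one entry of zooinventory.xml.
    static final String XML =
        "<Inventory><Animal animalClass='mammal'>"
      + "<Name>Cocoa</Name><Species>Gorilla</Species>"
      + "<FoodRecipe><Name>Gorilla Chow</Name><Ingredient>Fruit</Ingredient></FoodRecipe>"
      + "</Animal></Inventory>";

    static String transform(String xsl, String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(XSL, XML));
    }
}
```

Note that select="Name" picks up only the Name child of Animal, not the Name nested inside FoodRecipe; reaching that one takes the longer relative path.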

Now if we try to add the Animal template to our simple example, it won't generate any output. What's the problem? Well, if you recall, we said that a template matching an element has the opportunity to process all its children. We already have a template matching the root ("/"), so it is consuming all the input. The answer to our dilemma--and this is where things get a little tricky--is to delegate the matching to other templates using the apply-templates tag. The following example correctly prints the names of all the animals in our document:

<xsl:stylesheet   
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

   <xsl:template match="/">
      Found the root!
      <xsl:apply-templates/>
   </xsl:template>

   <xsl:template match="Animal">
      Name: <xsl:value-of select="Name"/>
   </xsl:template>

</xsl:stylesheet>

Note that we still have the opportunity to add output before and after the apply-templates tag. But upon invoking it, the template matching continues from the current node. Next we'll use what we have so far and add a few bells and whistles.
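Here is a small Java sketch of the delegating stylesheet, run against an in-memory document. The DelegatingTemplateDemo class and the sample XML are assumptions, and the stylesheet differs slightly from the one above: it selects the text output method and uses xsl:text elements to emit line breaks, purely so the result is easy to read on a console.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class DelegatingTemplateDemo {
    static final String XSL =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:output method='text'/>"
        // The root template writes its own output, then hands off to other templates.
      + "<xsl:template match='/'>Found the root!<xsl:text>&#10;</xsl:text>"
      + "<xsl:apply-templates/></xsl:template>"
        // Invoked once for every Animal reached through apply-templates.
      + "<xsl:template match='Animal'>Name: <xsl:value-of select='Name'/>"
      + "<xsl:text>&#10;</xsl:text></xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical stand-in for zooinventory.xml.
    static final String XML =
        "<Inventory>"
      + "<Animal><Name>Cocoa</Name><Species>Gorilla</Species></Animal>"
      + "<Animal><Name>Arnold</Name><Species>Condor</Species></Animal>"
      + "</Inventory>";

    static String transform(String xsl, String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.print(transform(XSL, XML));
    }
}
```

The output should start with the root template's line, followed by one "Name:" line per animal. The Species values do not appear, because the Animal template handles each Animal itself and outputs only the Name.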

Transforming the Zoo Inventory

Your boss just called, and it's now imperative that your zoo clients have access to the zoo inventory through the Web, today! Well, after reading Chapter 14, you should be thoroughly prepared to build a nice "zoo portal." Let's get you started by creating an XSL stylesheet to turn our zooinventory.xml into HTML:

<xsl:stylesheet   
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

   <xsl:template match="/Inventory">
      <html><head><title>Zoo Inventory</title></head>
      <body><h1>Zoo Inventory</h1>
      <table border="1">
      <tr><td><b>Name</b></td><td><b>Species</b></td>
      <td><b>Habitat</b></td><td><b>Temperament</b></td>
      <td><b>Diet</b></td></tr>
         <xsl:apply-templates/>
           <!-- Process Inventory -->
      </table>
      </body>
      </html>
   </xsl:template>
   <xsl:template match="Inventory/Animal">
      <tr><td><xsl:value-of select="Name"/></td>
          <td><xsl:value-of select="Species"/></td>
         <td><xsl:value-of select="Habitat"/></td>
         <td><xsl:value-of select="Temperament"/></td> 
         <td><xsl:apply-templates select="Food|FoodRecipe"/>
            <!-- Process Food,FoodRecipe--></td></tr>
   </xsl:template>

   <xsl:template match="FoodRecipe">
      <table>
      <tr><td><em><xsl:value-of select="Name"/></em></td></tr>
      <xsl:for-each select="Ingredient">
         <tr><td><xsl:value-of select="."/></td></tr>
      </xsl:for-each>
      </table>
   </xsl:template>

</xsl:stylesheet>

The stylesheet contains three templates. The first matches /Inventory and outputs the beginning of our HTML document (the header) along with the start of a table for the animals. It then delegates using apply-templates before closing the table and adding the HTML footer. The next template matches Inventory/Animal, printing one row of an HTML table for each animal. Although there are no other Animal elements in the document, it still doesn't hurt to specify that we will match an Animal only in the context of an Inventory, because in this case we are relying on Inventory to start and end our table. (This template makes sense only in the context of an Inventory.) Finally, we provide a template that matches FoodRecipe and prints a small (nested) table for that information. FoodRecipe makes use of the for-each operation to loop over child nodes with a select specifying that we are only interested in Ingredient children. For each Ingredient, we output its value in a row.

There is one more thing to note in the Animal template. Our apply-templates element has a select attribute that limits the elements affected. In this case, we are using the XPath union operator "|" to say that we want to apply templates for only the Food or FoodRecipe child elements. Why do we do this? Because we didn't match the root of the document (only Inventory), we still have the default stylesheet behavior of outputting the plain text of nodes that aren't matched. We want this behavior for the Food element in the event that a FoodRecipe isn't there. But we don't want it for all the other elements of Animal that we've handled explicitly. Alternatively, we could have been more verbose, adding a template matching the root and another template just for the Food element. That would also mean that new tags added to our XML would be ignored and not change the output. This may or may not be the behavior you want, and there are other options as well. As with all powerful tools, there is usually more than one way to do something.
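The effect of the select attribute on apply-templates is easy to check in isolation. The following sketch, built on a hypothetical ApplyTemplatesSelectDemo class and a one-animal sample document, runs the same Animal template twice: once delegating to all children and once delegating only to Food or FoodRecipe. Neither stylesheet defines a template for the child elements, so whatever is selected falls through to the default behavior of copying its text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ApplyTemplatesSelectDemo {
    // Hypothetical stand-in for one entry of zooinventory.xml.
    static final String XML =
        "<Inventory><Animal><Name>Cocoa</Name><Species>Gorilla</Species>"
      + "<Food>Fruit</Food></Animal></Inventory>";

    // Builds a stylesheet whose Animal template wraps the given
    // apply-templates instruction in square brackets.
    static String stylesheet(String applyTemplates) {
        return "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
             + "<xsl:output method='text'/>"
             + "<xsl:template match='Animal'>[" + applyTemplates + "]</xsl:template>"
             + "</xsl:stylesheet>";
    }

    static String transform(String xsl, String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Every child of Animal is processed, so all of their text is copied.
        System.out.println(transform(stylesheet("<xsl:apply-templates/>"), XML));
        // Only Food or FoodRecipe children are processed.
        System.out.println(transform(
            stylesheet("<xsl:apply-templates select='Food|FoodRecipe'/>"), XML));
    }
}
```

With this input, the first run should print [CocoaGorillaFruit] and the second [Fruit], which is why the zoo stylesheet restricts the selection after printing the other fields explicitly.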

XSLTransform

Now that we have a stylesheet, let's apply it! The following simple program, XSLTransform, uses the javax.xml.transform package to apply the stylesheet to an XML document and print the result. You can use it to experiment with XSL and our example code.

import javax.xml.transform.*;
import javax.xml.transform.stream.*;
  
public class XSLTransform 
{
   public static void main( String [] args ) throws Exception
   {
      // Expect the stylesheet first, then the XML document to transform
      if ( args.length < 2 || !args[0].endsWith(".xsl") ) {
         System.err.println("usage: XSLTransform file.xsl file.xml");
         System.exit(1);
      }
      // Build a Transformer from the stylesheet source
      TransformerFactory factory = TransformerFactory.newInstance(  );
      Transformer transformer = 
         factory.newTransformer( new StreamSource( args[0] ) );
      // Read the XML input and write the transformed result to standard output
      StreamSource xmlsource = new StreamSource( args[1] );
      StreamResult output = new StreamResult( System.out );
      transformer.transform( xmlsource, output );
   }
}

Run XSLTransform, passing the XSL stylesheet and XML input, as in the following command:

% java XSLTransform zooinventory.xsl zooinventory.xml > zooinventory.html

The output should look like Figure 23-2.


Figure 23-2. Image of the zoo inventory table

Constructing the transformer is similar to getting a SAX or DOM parser. The difference from our earlier use of the TransformerFactory is that this time we construct the transformer, passing it the XSL stylesheet source. The resulting Transformer object is then a dedicated machine that knows how to take input XML and generate output according to its rules.

One important thing to note about a Transformer is that it is not guaranteed to be thread-safe. If you must perform concurrent transformations in many threads, the threads must either coordinate their use of the transformer or each have their own instance.
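One way to give each thread its own instance without re-parsing the stylesheet every time is the Templates interface in javax.xml.transform, which represents a compiled stylesheet that may be shared between threads. The sketch below is only one possible arrangement; the SharedTemplatesDemo class, its helper methods, the thread names, and the sample data are all assumptions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SharedTemplatesDemo {
    static final String XSL =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='Animal'><xsl:value-of select='Name'/>;</xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical stand-in for zooinventory.xml.
    static final String XML =
        "<Inventory><Animal><Name>Cocoa</Name></Animal>"
      + "<Animal><Name>Arnold</Name></Animal></Inventory>";

    // Parse the stylesheet once; the resulting Templates object can be shared.
    static Templates compile(String xsl) throws TransformerException {
        return TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(xsl)));
    }

    // Each call creates a fresh Transformer, so no Transformer is shared between threads.
    static String transform(Templates templates, String xml) throws TransformerException {
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        final Templates templates = compile(XSL);
        Runnable job = () -> {
            try {
                System.out.println(
                    Thread.currentThread().getName() + ": " + transform(templates, XML));
            } catch (TransformerException e) {
                throw new IllegalStateException(e);
            }
        };
        Thread first = new Thread(job, "worker-1");
        Thread second = new Thread(job, "worker-2");
        first.start();
        second.start();
        first.join();
        second.join();
    }
}
```

The design choice here is to share the expensive part (the compiled stylesheet) and keep the stateful part (the Transformer) private to each call.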

XSL in the Browser

With our XSLTransform example, you can see how you'd go about rendering XML to an HTML document on the server side. But as mentioned in the introduction, modern web browsers support XSL on the client side as well. Internet Explorer 5.x and above, Netscape 6.x, and Mozilla can automatically download an XSL stylesheet and use it to transform an XML document. To make this happen, just add a standard XSL stylesheet reference in your XML. You can put the stylesheet directive next to your DOCTYPE declaration in the zooinventory.xml file:

<?xml-stylesheet type="text/xsl" href="zooinventory.xsl"?>

Now, as long as the zooinventory.xsl file is available at the same location (base URL) as the zooinventory.xml file, the browser will use it to render HTML on the client side.
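If your XML is generated by a program rather than edited by hand, the same directive can be added through the DOM. The following is a minimal sketch under that assumption; the StylesheetLinkDemo class and its addStylesheetLink method are hypothetical names, and the one-element input is a placeholder for the real inventory document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

class StylesheetLinkDemo {
    // Parses the XML, inserts an xml-stylesheet processing instruction
    // ahead of the root element, and returns the document as text.
    static String addStylesheetLink(String xml, String href) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        ProcessingInstruction link = doc.createProcessingInstruction(
            "xml-stylesheet", "type=\"text/xsl\" href=\"" + href + "\"");
        doc.insertBefore(link, doc.getDocumentElement());

        // A Transformer created without a stylesheet simply copies its input.
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
            .transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(
            addStylesheetLink("<Inventory><Animal/></Inventory>", "zooinventory.xsl"));
    }
}
```

This reuses the stylesheet-less Transformer from earlier in the chapter to write the DOM back out as XML text.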

Web Services

One of the most interesting directions for XML is web services. A web service is simply an application service supplied over the network, making use of XML to describe the request and response. Normally, web services run over HTTP and use an XML-based protocol called SOAP. SOAP stands for Simple Object Access Protocol and is an evolving W3C standard. The combination of XML and HTTP provides a universally accessible interface for services.

SOAP and other XML-based remote procedure call mechanisms can be used in place of Java RMI for cross-platform communications and as an alternative to CORBA. There is a lot of excitement surrounding web services, and it is likely that they will grow in importance in coming years. To learn more about SOAP, see http://www.w3.org/TR/SOAP/. To learn more about Java APIs related to web services, keep an eye on http://java.sun.com/webservices/.

Well, that's it for our brief introduction to XML. There is a lot more to learn about this exciting new area, and many of the APIs are evolving rapidly. We hope we've given you a good start.

With this chapter we also wrap up the main part of our book. We hope that you've enjoyed Learning Java. We welcome your feedback to help us keep making this book better in the future.



Copyright © 2009 O'Reilly Media, Inc.