Python DevCenter
oreilly.comSafari Books Online.Conferences.


Advanced OOP: Declarative Programming and Mini-Languages

by David Mertz

This article extends my discussion of advanced programming, but strays into an area that is not exclusively object oriented. What we are interested in for this installment is ways of writing programs that are declarative rather than imperative. In many cases, simply notating facts is more concise and less error prone than providing instructions. A number of less common programming languages make declarative styles predominant, but it is also possible to use a declarative style within generally imperative languages. In this article, as with the others in this series, I will focus on techniques as exemplified in Python.

Background on Declarative Styles

When most programmers think about programming, they imagine imperative styles and techniques for writing applications. The most popular general purpose programming languages are predominantly imperative in style. On the other hand, there are also many programming languages that are declarative in style, including both functional and logic languages.

Let me list a few languages that fall in various categories. Many readers have used many of these tools, without necessarily thinking about the categorical differences among them. Python, C, C++, Java, Perl, Ruby, Smalltalk, Fortran, Basic, and xBase are all straightforwardly imperative programming languages. Some of these are object oriented, but that is simply a matter of the organization of code and data, not of the fundamental programming style. In these languages, you command the program to carry out a sequence of instructions: put some data in a variable; fetch the data back out of the variable; loop through a block of instructions until some condition is satisfied; do something if something else is true. One nice thing about all these languages is that it is easy to think about them within familiar temporal metaphors. Ordinary life consists of doing one thing, making a choice then doing another thing, and maybe using some tools along the way. It is easy to imagine the computer that runs a program as a cook, or a bricklayer, or an automobile driver.

Languages like Prolog, Mercury, SQL, XSLT, EBNF grammars, and indeed configuration files of various formats, all declare that something is the case or that certain constraints apply. Mathematics is generally the same. The functional languages—such as Haskell, ML, Dylan, Ocaml, and Scheme—are similar, but with more of an emphasis on stating internal (functional) relationships between programming objects (recursion, lists, etc.). Our ordinary life, at least in its narrative quality, provides no direct analog for the programming constructs of these languages. For those problems you can naturally describe in these languages, however, declarative descriptions are far more concise and far less error prone than are imperative solutions.

Prolog is a language that comes close to logic or mathematics. In it you simply write statements you know to be true, then ask the application to derive consequences for you. Statements are composed in no particular order, and you the programmer or user need have no real idea what steps are taken to derive results. For example:

Example 1. Prolog sample

/* Adapted from sample at:
This app can answer questions about sisterhood & love, e.g.:
# Is alice a sister of harry?
?-sisterof( alice, harry )
# Which of alice's sisters love wine?
?-sisterof( X, alice ), love( X, wine)
sisterof( X, Y ) :- parents( X, M, F ),
                    female( X ),
                    parents( Y, M, F ).
parents( edward, victoria, albert ).
parents( harry, victoria, albert ).
parents( alice, victoria, albert ).
female( alice ).
loves( harry, wine ).
loves( alice, wine ).

Another declarative example is a document type definition that describes a dialect of valid XML documents.

Example 2. An XML document type declaration

<!ELEMENT dissertation (chapter+)>
<!ELEMENT chapter (title, paragraph+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT paragraph (#PCDATA | figure)+>

The DTD language does not contain any instructions about what to do to recognize or create a valid XML document. It merely describes what one would be like if it were to exist.

Python in a Nutshell

Related Reading

Python in a Nutshell
By Alex Martelli

Python as Interpreter Versus Python as Environment

Python libraries can use declarative languages in two fairly distinct ways. Perhaps the more common technique is to parse and process non-Python declarative languages as data. An application or a library can read in an external source (or a string defined internally, but just as a "blob"), then figure out a set of imperative steps to carry out that conform in some way with those external declaration. In essence, these types of libraries are "data-driven" systems; there is a conceptual and category gap between the declarative language and what a Python application does to carry out or apply its declarations.

There is a Python library to implement Prolog, called PyLog. There are several Python libraries that work with Extended Backus-Naur Form (EBNF) grammars. The Python module xmlproc determines if an XML document conforms with a DTD. Moreover, libraries to process those identical declarations have also been implemented in other programming languages. For example, in the Python library SimpleParse, you can declare grammar rules in a data file, then use those grammar rules to parse a conforming document into a tree structure. Such a rule file might look like:

Example 3. EBNF sample

word        := alphanums, (wordpunct, alphanums)*, contraction?
alphanums   := [azAZ0-9]+
wordpunct   := [-_]
contraction := "'", ("clock"/"d"/"ll"/"m"/"re"/"s"/"t"/"ve")

The data file read by SimpleParse is a pretty standard looking EBNF grammar, which is to say that it doesn't look much like Python. The same is true for the other mentioned tools that happen to be written in Python, but use declarations that are quite unlike Python.

While there is nothing wrong with this approach and these tools (I use them all the time), it might be more elegant, and in some ways more expressive, if Python itself could be the declarative language. If nothing else, libraries that facilitated this would not require programmers to think about two (or more) languages when writing one application.

The Magic of Introspection

The EBNF parsers Spark and PLY let users declare Python values in Python, then use some magic to let the Python runtime environment act as the configuration of parsing. Using PLY, you can write a grammar that is equivalent to the prior SimpleParse EBNF example:

Example 4. PLY sample

t_ALPHANUMS = r"[azAZ0-0]+"
t_WORDPUNCT = r"[-_]"
t_CONTRACTION = r"'(clock|d|ll|m|re|s|t|ve)"
def t_WHITESPACE(t):
    t.value = " "
    return t
import lex
while 1:
    t = lex.token()
    if not t: break

Without going into details of the PLY library, what you should notice here is that it is the Python bindings themselves that configure the parsing (actually lexing/tokening in this example). The PLY module just happens to know enough about the Python environment it is running in to act on these pattern declarations. Python's introspective capabilities are quite powerful, but the details of how PLY works internally are omitted here.

Inheritance as Declaration

The module gnosis.xml.validity is a framework for creating classes that map directly to DTD productions. It is available as part of Gnosis Utils. Any gnosis.xml.validity class can only be instantiated with arguments obeying XML dialect validity constraints. Actually that is not quite true; the module will also infer the proper types from simpler arguments when there is only one, unambiguous, way of "lifting" the arguments to the correct types.

For this article, I only want to look at the declarative style in which validity classes are created, not explore the uses of the module. A set of rules/classes matching the DTD listed above consists of the following:

Example 5. gnosis.xml.validity rule declarations

from gnosis.xml.validity import *
class figure(EMPTY):      pass
class _mixedpara(Or):     _disjoins = (PCDATA, figure)
class paragraph(Some):    _type = _mixedpara
class title(PCDATA):      pass
class _paras(Some):       _type = paragraph
class chapter(Seq):       _order = (title, _paras)
class dissertation(Some): _type = chapter

You might create instances out of these declarations using:

ch1  = LiftSeq(chapter, ("1st Title","Validity is important"))
ch2  = LiftSeq(chapter, ("2nd Title","Declaration is fun"))
diss = dissertation([ch1, ch2])
print diss

The classes defined closely match the prior DTD. The mapping is basically one-to-one; except it is necessary to use intermediaries for quantification and alternation of nested tags (intermediary names are marked by a leading underscore).

Notice that these classes, while created using standard Python syntax, are unusual (and more concise) in having no methods or instance data. Classes are defined solely to inherit from some framework, where that framework is narrowed by a single class attribute. For example, a <chapter> is a sequence of other tags, namely a <title> followed by one or more <paragraph> tags. All we need to do to assure the constrain is obeyed in the instances is declare the chapter class in this straightforward manner. No methods are programmed by the library user, and there is no imperative flow within the classes.

The main "trick" involved in programming parent classes like gnosis.xml.validity.Seq is to look at the .__class__ attribute of an instance during initialization. The class chapter does not have its own initialization, so its parent's __init__() method is called, but the self passed to the parent __init__() is an instance of chapter, and it knows it. To illustrate, this is part of the implementation of gnosis.xml.validity.Seq:

Example 6. Class gnosis.xml.validity.Seq

class Seq(tuple):
    def __init__(self, inittup):
        if not hasattr(self.__class__, '_order'):
            raise NotImplementedError, \
                "Child of Abstract Class Seq must specify order"
        if not isinstance(self._order, tuple):
            raise ValidityError, "Seq must have tuple as order"
        self._tag = self.__class__.__name__

Once an application programmer tries to create a chapter instance, the instantiation code checks that chapter was declared with the required _order class attribute, and that this attribute is the needed tuple object. The method validate() performs some further checks to make sure that the objects the instance was initialized with belong to the corresponding classes specified in _order.

When to Declare

A declarative programming style is almost always a more direct way of stating constraints than is an imperative or procedural one. Of course, not all programming problems are about constraints; or at least, that is not always a natural formulation. Problems of rule-based systems, though, such as grammars and inference systems, are much easier to manage if they can be described declaratively. Imperative verification of grammaticality quickly turns into spaghetti code and is difficult to debug. Statements of patterns and rules can remain much simpler.

David Mertz , being a sort of Foucauldian Berkeley, believes, esse est denunte.

Return to the Python DevCenter.

Sponsored by: