This article extends my discussion of advanced programming, but strays into an area that is not exclusively object oriented. What we are interested in for this installment is ways of writing programs that are declarative rather than imperative. In many cases, simply notating facts is more concise and less error prone than providing instructions. A number of less common programming languages make declarative styles predominant, but it is also possible to use a declarative style within generally imperative languages. In this article, as with the others in this series, I will focus on techniques as exemplified in Python.
When most programmers think about programming, they imagine imperative styles and techniques for writing applications. The most popular general purpose programming languages are predominantly imperative in style. On the other hand, there are also many programming languages that are declarative in style, including both functional and logic languages.
Let me list a few languages that fall in various categories. Many readers have used many of these tools, without necessarily thinking about the categorical differences among them. Python, C, C++, Java, Perl, Ruby, Smalltalk, Fortran, Basic, and xBase are all straightforwardly imperative programming languages. Some of these are object oriented, but that is simply a matter of the organization of code and data, not of the fundamental programming style. In these languages, you command the program to carry out a sequence of instructions: put some data in a variable; fetch the data back out of the variable; loop through a block of instructions until some condition is satisfied; do something if something else is true. One nice thing about all these languages is that it is easy to think about them within familiar temporal metaphors. Ordinary life consists of doing one thing, making a choice then doing another thing, and maybe using some tools along the way. It is easy to imagine the computer that runs a program as a cook, or a bricklayer, or an automobile driver.
Languages like Prolog, Mercury, SQL, XSLT, EBNF grammars, and indeed configuration files of various formats, all declare that something is the case or that certain constraints apply. Mathematics is generally the same. The functional languages—such as Haskell, ML, Dylan, Ocaml, and Scheme—are similar, but with more of an emphasis on stating internal (functional) relationships between programming objects (recursion, lists, etc.). Our ordinary life, at least in its narrative quality, provides no direct analog for the programming constructs of these languages. For those problems you can naturally describe in these languages, however, declarative descriptions are far more concise and far less error prone than are imperative solutions.
Prolog is a language that comes close to logic or mathematics. In it you simply write statements you know to be true, then ask the application to derive consequences for you. Statements are composed in no particular order, and you the programmer or user need have no real idea what steps are taken to derive results. For example:
family.pro Prolog sample
/* Adapted from sample at: http://www.engin.umd.umich.edu/CIS/course.des/cis479/prolog/ This app can answer questions about sisterhood & love, e.g.: # Is alice a sister of harry? ?-sisterof( alice, harry ) # Which of alice's sisters love wine? ?-sisterof( X, alice ), love( X, wine) */ sisterof( X, Y ) :- parents( X, M, F ), female( X ), parents( Y, M, F ). parents( edward, victoria, albert ). parents( harry, victoria, albert ). parents( alice, victoria, albert ). female( alice ). loves( harry, wine ). loves( alice, wine ).
Another declarative example is a document type definition that describes a dialect of valid XML documents.
Example 2. An XML document type declaration
<!ELEMENT dissertation (chapter+)> <!ELEMENT chapter (title, paragraph+)> <!ELEMENT title (#PCDATA)> <!ELEMENT paragraph (#PCDATA | figure)+> <!ELEMENT figure EMPTY>
The DTD language does not contain any instructions about what to do to recognize or create a valid XML document. It merely describes what one would be like if it were to exist.
Python libraries can use declarative languages in two fairly distinct ways. Perhaps the more common technique is to parse and process non-Python declarative languages as data. An application or a library can read in an external source (or a string defined internally, but just as a "blob"), then figure out a set of imperative steps to carry out that conform in some way with those external declaration. In essence, these types of libraries are "data-driven" systems; there is a conceptual and category gap between the declarative language and what a Python application does to carry out or apply its declarations.
There is a Python library to implement Prolog, called
PyLog. There are several Python libraries that work with
Extended Backus-Naur Form (EBNF) grammars. The Python module
xmlproc determines if an XML document conforms with a
DTD. Moreover, libraries to process those identical declarations have also
been implemented in other programming languages. For example, in the
SimpleParse, you can declare grammar rules in
a data file, then use those grammar rules to parse a conforming document
into a tree structure. Such a rule file might look like:
Example 3. EBNF sample
word := alphanums, (wordpunct, alphanums)*, contraction? alphanums := [azAZ0-9]+ wordpunct := [-_] contraction := "'", ("clock"/"d"/"ll"/"m"/"re"/"s"/"t"/"ve")
The data file read by
SimpleParse is a pretty standard
looking EBNF grammar, which is to say that it doesn't look much like
Python. The same is true for the other mentioned tools that happen to be
written in Python, but use declarations that are quite unlike Python.
While there is nothing wrong with this approach and these tools (I use them all the time), it might be more elegant, and in some ways more expressive, if Python itself could be the declarative language. If nothing else, libraries that facilitated this would not require programmers to think about two (or more) languages when writing one application.
The EBNF parsers
PLY let users
declare Python values in Python, then use some magic to let the Python
runtime environment act as the configuration of parsing. Using
PLY, you can write a grammar that is equivalent to the prior
SimpleParse EBNF example:
Example 4. PLY sample
tokens = ('ALPHANUMS','WORDPUNCT','CONTRACTION','WHITESPACE') t_ALPHANUMS = r"[azAZ0-0]+" t_WORDPUNCT = r"[-_]" t_CONTRACTION = r"'(clock|d|ll|m|re|s|t|ve)" def t_WHITESPACE(t): r"\s+" t.value = " " return t import lex lex.lex() lex.input(sometext) while 1: t = lex.token() if not t: break
Without going into details of the
PLY library, what you
should notice here is that it is the Python bindings themselves that
configure the parsing (actually lexing/tokening in this example). The
PLY module just happens to know enough about the Python
environment it is running in to act on these pattern declarations.
Python's introspective capabilities are quite powerful, but the details of
PLY works internally are omitted here.
gnosis.xml.validity is a framework for
creating classes that map directly to DTD productions. It is available as
part of Gnosis
gnosis.xml.validity class can only
be instantiated with arguments obeying XML dialect validity
constraints. Actually that is not quite true; the module will also infer
the proper types from simpler arguments when there is only one,
unambiguous, way of "lifting" the arguments to the correct types.
For this article, I only want to look at the declarative style in which validity classes are created, not explore the uses of the module. A set of rules/classes matching the DTD listed above consists of the following:
gnosis.xml.validity rule declarations
from gnosis.xml.validity import * class figure(EMPTY): pass class _mixedpara(Or): _disjoins = (PCDATA, figure) class paragraph(Some): _type = _mixedpara class title(PCDATA): pass class _paras(Some): _type = paragraph class chapter(Seq): _order = (title, _paras) class dissertation(Some): _type = chapter
You might create instances out of these declarations using:
ch1 = LiftSeq(chapter, ("1st Title","Validity is important")) ch2 = LiftSeq(chapter, ("2nd Title","Declaration is fun")) diss = dissertation([ch1, ch2]) print diss
The classes defined closely match the prior DTD. The mapping is basically one-to-one; except it is necessary to use intermediaries for quantification and alternation of nested tags (intermediary names are marked by a leading underscore).
Notice that these classes, while created using standard Python syntax,
are unusual (and more concise) in having no methods or instance
data. Classes are defined solely to inherit from some framework, where
that framework is narrowed by a single class attribute. For example, a
<chapter> is a sequence of other tags, namely a
<title> followed by one or more
<paragraph> tags. All we need to do to assure the
constrain is obeyed in the instances is declare the
chapter class in this straightforward manner. No methods are
programmed by the library user, and there is no imperative flow
within the classes.
The main "trick" involved in programming parent classes like
gnosis.xml.validity.Seq is to look at the
.__class__ attribute of an instance during
initialization. The class
chapter does not have its own
initialization, so its parent's
__init__() method is called,
self passed to the parent
an instance of
chapter, and it knows it. To illustrate, this
is part of the implementation of
Example 6. Class
class Seq(tuple): def __init__(self, inittup): if not hasattr(self.__class__, '_order'): raise NotImplementedError, \ "Child of Abstract Class Seq must specify order" if not isinstance(self._order, tuple): raise ValidityError, "Seq must have tuple as order" self.validate() self._tag = self.__class__.__name__
Once an application programmer tries to create a
instance, the instantiation code checks that
declared with the required
_order class attribute, and that
this attribute is the needed tuple object. The method
validate() performs some further checks to make sure that the
objects the instance was initialized with belong to the corresponding
classes specified in
A declarative programming style is almost always a more direct way of stating constraints than is an imperative or procedural one. Of course, not all programming problems are about constraints; or at least, that is not always a natural formulation. Problems of rule-based systems, though, such as grammars and inference systems, are much easier to manage if they can be described declaratively. Imperative verification of grammaticality quickly turns into spaghetti code and is difficult to debug. Statements of patterns and rules can remain much simpler.
David Mertz , being a sort of Foucauldian Berkeley, believes, esse est denunte.
Return to the Python DevCenter.
Copyright © 2009 O'Reilly Media, Inc.