ONLamp.com

Rexx: Power Through Simplicity

Content-Addressable Arrays

You've read one Rexx script, so given that the language is easy to learn, you're ready for something interesting. The next script illustrates a Rexx feature common to some of today's more powerful languages: the ability to index arrays or tables by non-numeric subscripts.



This example program includes an array that defines three Chicago-area telephone area codes. It prompts the user to enter the name of a town in the Chicago metro area. Then it retrieves and displays the telephone area code associated with that town. The interesting feature of this script is that the town name is the index into the array the script uses to retrieve the telephone area code. Here's the script:

        /*******************************************************************/
        /* Code Lookup:                                                    */
        /*     Looks up the areacode for the town the user enters.         */
        /*******************************************************************/

1       area. = ''                     /* Initialize array entries to null */

2       area.CHICAGO  = 312            /* Define a table of area codes     */    
3       area.HOMEWOOD = 708   
4       area.EVANSTON = 847   

5       do while town <> ''            /* Loop until user enters null line */

6          say 'For which town do you want the area code?'
7          pull town 

8          if town <> '' then do
9             if area.town = '' 
10               then  say 'Town' town 'is not in my database'
11               else  say 'The area code for' town 'is' area.town
12         end

13      end

The first line in the program defines an array or table named area. You can tell it's an array because it ends with a period. You don't have to declare or define an array prior to using it, but I found it useful because I wanted to initialize all array elements to the null string. The script does this without specifying how many elements the array might contain or the "data types" of those elements. (Rexx arrays can typically grow to the size of available memory.)

Lines 2 through 4 populate the array, associating three towns with their respective telephone area codes. Variables that contain internal periods are array variables or compound elements. The interesting thing here is that the array elements have string subscripts (the names of towns) rather than numeric ones. This means that the array is content-addressable. It is a form of associative memory, in that it associates or relates values represented by arbitrary strings.

Line 5 continues the script while the value of variable town is anything other than the null string. The town variable has had no previous use or declaration, so its "uninitialized value" is its own name in upper case (TOWN). After prompting the user in line 6, the script reads a town from the user in line 7. The pull instruction reads input into one or more variables and automatically translates it into upper case.
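The behavior described above can be seen in isolation with a minimal sketch. It is not part of the original script; it only assumes a standard Rexx interpreter and uses the built-in symbol function to report whether a name has been assigned a value.

```rexx
/* Sketch: uninitialized variables and the effect of pull             */
say town                      /* never assigned, so this displays TOWN */
say symbol('TOWN')            /* LIT: the name has no assigned value   */

town = 'Evanston'
say symbol('TOWN')            /* VAR: the name is now a variable       */

say 'Type a town name:'
pull town                     /* input is translated to upper case     */
say 'pull stored:' town
```

Because the uninitialized value TOWN is not the null string, the do while test in line 5 of the script succeeds on the first pass even though the user has not yet typed anything.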

If the user declines to enter a town by not entering anything and pressing the Enter key, line 8 identifies the situation, skips the if statement in lines 9 through 11, and exits the do while loop and the program. If the user does enter a town, line 9 looks it up in the area table. Either line 10 displays a message that it is not in the script's "database" (the area array) or line 11 displays the proper area code for the town. In line 11, the compound variable reference area.town is what displays the relevant area code to the user.

There's one other feature to mention in this script. The table lookup works properly because the pull instruction automatically translates the user's input to upper case. Because Rexx considers variable names internally as upper case, the comparison would also have worked if I had coded the array names as area.Chicago, area.chicago, AREA.CHICAGO, or ... you get the idea. Rexx routinely provides this kind of convenient automation, but always offers easy ways to avoid it. For example, to avoid automatic upper-case translation in strings, quote them. To avoid it in reading variable values, use the parse pull instruction instead of pull.
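As a hedged illustration of avoiding both conversions, the sketch below stores a mixed-case key and reads the input with parse pull. The variable name key is an assumption introduced only for this example. With this version the lookup becomes case-sensitive, so only the exact spelling Chicago matches.

```rexx
/* Sketch: keeping mixed case in both the stored key and the input     */
area. = ''                    /* default value for unassigned elements  */

key = 'Chicago'               /* a quoted string keeps its case         */
area.key = 312                /* element is stored under tail Chicago   */

say 'For which town do you want the area code?'
parse pull town               /* reads the line without upper-casing it */

if area.town = ''
   then say 'Town' town 'is not in my database'
   else say 'The area code for' town 'is' area.town
```

Assigning through a variable works because Rexx substitutes the value of key, case intact, when it resolves the compound name area.key.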

Data Structures

As variable-sized, content-addressable entities, Rexx arrays have a wide range of uses. Here's another example. This script implements the weighted-retrieval algorithm that forms the basis for bibliographic search services in libraries. The idea is that resources like books, magazine articles, and videos have assigned lists of descriptors. The user inputs his own search descriptors along with a weight: the number of his descriptors that must match the resource's descriptors in order for that resource to match his query. The search system retrieves the most relevant resources, as determined by the weight of their matches, and typically displays them in ranked (weighted) order.

For clarity of illustration, I've simplified the algorithm. I hardcoded the search descriptors (or keywords) and only allow the user to input the weight or threshold for retrieval. I also coded the resources (book titles) and their descriptors right into the program, rather than reading them from a database.

Here's the script. It uses two arrays. The first is a list of keywords that describe the retrieval topics. The second is a list of three books, categorized by three descriptors apiece:

        /*********************************************************************/
        /*  Find Books:                                                      */
        /*     This program illustrates how arrays may be of any dimension   */
        /*     in retrieving book titles based on their keyword weightings.  */
        /*********************************************************************/

1       keyword. = ''         /* Initialize both arrays to all null strings  */
2       title.   = ''
     
        /* The array of keywords to search for among the book descriptors    */

3       keyword.1 = 'earth'   ;   keyword.2 = 'computers'
4       keyword.3 = 'life'    ;   keyword.4 = 'environment'

        /* The array of book titles, each having several descriptors         */

5       title.1 = 'Saving Planet Earth'
6          title.1.1 = 'earth' 
7          title.1.2 = 'environment' 
8          title.1.3 = 'life'
9       title.2 = 'Computer Lifeforms'   
10         title.2.1 = 'life'
11         title.2.2 = 'computers'
12         title.2.3 = 'intelligence'
13      title.3 = 'Algorithmic Insanity'
14         title.3.1 = 'computers'
15         title.3.2 = 'algorithms'
16         title.3.3 = 'programming' 

17      arg weight      /* Get number keyword matches required for retrieval */

18      say 'For weight of' weight 'retrieved titles are:'  /* Output header */

19      do j = 1  while title.j <> ''                /* Look at each book    */
20         count = 0
  
21         do k = 1  while keyword.k <> ''           /* Inspect its keywords */
   
22            do l = 1  while title.j.l <> ''        /* Compute its weight   */
23               if  keyword.k = title.j.l  then count = count + 1
24            end
   
25         end

26         if count >= weight then   /* Display titles matching the criteria */
27            say title.j
28      end

The first two lines of the program initialize all positions in both arrays to the null string. Lines 3 and 4 define the elements in the keyword array, while lines 5 through 16 define the three titles and the list of descriptors for each. I've defined all array elements as quotation-delimited character strings. This preserves their case. I've placed more than one Rexx statement per line in defining the keywords array by using the semicolon to separate the statements. I created a hierarchical or tree structure in the title array merely by using multiple subscripts to describe array elements. Rexx enables subscripting to an arbitrary depth, with any number of subscripts or dimensions, limited only by available memory.

Line 17 reads the weight the user enters from the command-line argument provided to the script. Line 18 writes a header for the program's output list of retrieved titles.
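The script trusts whatever the user supplies. A more defensive variant might look like the following sketch, which is an assumption layered on the original rather than part of it: the default of 1 and the error message are illustrative choices, and datatype is the standard built-in function.

```rexx
/* Sketch: read and validate the weight argument                       */
arg weight .                  /* keep the first word, discard the rest  */

if weight = '' then weight = 1          /* assumed default if omitted   */

if datatype(weight, 'W') = 0 then do    /* 'W' tests for a whole number */
   say 'Weight must be a whole number, not' weight
   exit 1
end

say 'For weight of' weight 'retrieved titles are:'
```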

Each pass through the do while loop of lines 19 through 28 processes a single book or title. Line 20 initializes the weight or count of descriptor matches for the book title to 0. The loop of lines 21 through 25 processes all of the retrieval keywords against a single book title, while the innermost loop of lines 22 to 24 accumulates the matching weight for an individual book title. Lines 26 and 27 display the books with a number of matches at least equal to the hit count or weight dictated by the user.

Unlike most programming languages, Rexx does not have many built-in data structures. This script shows why. Associative arrays are easy to use to implement data structures of arbitrary complexity. The processing logic in lines 19 through 28 would work without alteration even if I changed the number of keywords or book descriptors, assigned different numbers of descriptors per book, or read the contents of either array from the user or from a file or a database.
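To make that claim concrete, here is a reduced sketch with two keywords and two books that carry different numbers of descriptors. The title 'Pond Life' and the fixed weight are hypothetical, chosen only for illustration. The three nested loops are the same as in lines 19 through 25 of the script.

```rexx
/* Sketch: books with different numbers of descriptors                 */
keyword. = '' ; title. = ''

keyword.1 = 'earth' ; keyword.2 = 'life'

title.1 = 'Saving Planet Earth'
   title.1.1 = 'earth'
   title.1.2 = 'environment'
   title.1.3 = 'life'
title.2 = 'Pond Life'               /* hypothetical, one descriptor     */
   title.2.1 = 'life'

weight = 1                          /* fixed here instead of using arg  */

do j = 1 while title.j <> ''
   count = 0
   do k = 1 while keyword.k <> ''
      do l = 1 while title.j.l <> ''
         if keyword.k = title.j.l then count = count + 1
      end
   end
   if count >= weight then say title.j '(matches:' count')'
end
```

With a weight of 1, both titles are listed: the first with two matches and the second with one. The loops stop at the first null element, which is why neither array needs a stored length.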

The second scripting example demonstrated a lookup table, a data structure embodying the key-value pairs popular in Perl and in the open source Berkeley DB database. This example script demonstrates a list in its keywords table and a tree in its array of titles. This tree happens to be a balanced tree, but skewed or unbalanced trees are also possible.

Arrays can also group heterogeneous data items to implement the equivalent of C/C++ structures or Pascal or COBOL record definitions. They can even implement structures requiring symbolic pointers, such as linked lists and doubly linked lists. Rexx arrays may be dense or sparse, contain homogeneous or heterogeneous elements, and expand and contract as necessary.
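One possible way to sketch both ideas is shown below: each node is a small record with named fields, and the next field holds the subscript of the following node as a symbolic pointer. The names node, head, and p are assumptions made for this example. Note the caveat in the comments: the field names TOWN, CODE, and NEXT behave as constants only because the sketch never assigns variables with those names.

```rexx
/* Sketch: records and a singly linked list built from one array       */
/* town, code, and next must stay unassigned so that they resolve to   */
/* their own names (TOWN, CODE, NEXT) inside the compound names.       */
node. = ''                          /* unassigned elements are null     */

node.1.town = 'CHICAGO'  ; node.1.code = 312 ; node.1.next = 3
node.2.town = 'EVANSTON' ; node.2.code = 847 ; node.2.next = ''
node.3.town = 'HOMEWOOD' ; node.3.code = 708 ; node.3.next = 2

head = 1                            /* subscript of the first node      */

p = head
do while p <> ''                    /* a null pointer ends the list     */
   say node.p.town node.p.code
   p = node.p.next                  /* follow the symbolic pointer      */
end
```

The traversal visits the nodes in link order (1, 3, 2), not in subscript order, so it displays Chicago, then Homewood, then Evanston. A doubly linked list would add a second pointer field alongside next.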

Here is Rexx, a language that "lacks data structures"--and yet permits you to create them without any special syntax. Power comes from ease of use, not from adding features or complexity.

What Next

This article offers just a taste of what Rexx is about. It excludes object-oriented Rexx, which runs standard procedural Rexx programs without alteration, yet fully supports object-oriented scripting.

Open Object Rexx includes classes, methods, messaging, encapsulation, abstraction, multiple inheritance, polymorphism, and a huge hammer of a class library. It retains Rexx's ease of use while providing the full power of object-oriented programming.

I also omitted NetRexx, a "Rexx-like" language that extends Rexx's ease of use into the Java environment. NetRexx runs on both clients and servers. Use it to develop classes, applets, applications, servlets, and beans. NetRexx functions as an interpreter or compiler, so you can run it with or without a Java Virtual Machine and even use it to generate formatted Java code.

No single programming language is best for every task--which is one reason why there are so many of them--but Rexx is certainly a useful one to have around. It makes a nice complement to syntax-based power languages like Perl, Bash, and Korn. You can become fluent in Rexx in a matter of days, yet you won't run out of power as your knowledge grows.

Here is a list of free software and other resources.

Interpreters with Tools

Object-Oriented Rexx

For Handhelds

For Java Integration

Example Scripts

Books

The Rexx Language Forums

International Users Group

Howard Fosdick is an independent consultant who has worked with most major scripting languages.
