ONJava.com    
 Published on ONJava.com (http://www.onjava.com/)
 See this if you're having trouble printing code examples


Web Services and the Search for Really Big Prime Numbers

by Eoin Lane
08/29/2002

What do searching for extraterrestrials, curing cancer, and finding big prime numbers all have in common? These problems are all being attacked with grid computing, a a technique of breaking a large problem into small tasks that can be computed independently. While projects like Seti@home and The Greatest Internet Mersenne Prime Search have received plenty of press for using the Internet to distribute tasks to end users around the globe, grid computing also takes place in more controlled environments, such as research and financial settings. But it is by using the power of the Internet and the ability to discover and access idle processes on users' machines that grid computing (once called distributed computing), can access massive numbers of machines and processes.

What does all this have to do with Web services? The grid's principal tasks -- discovery and utilization of idle processes -- are best done over a ubiquitous protocol like HTTP. Although not tied to HTTP, Web services are effectively synonymous with HTTP and allow middleware components to be invoked using a verbose ASCII-formatted message. Web services provide a framework that is ideal for grid computing architectures. The Web services framework consists of UDDI for lookup and discovery, WSDL for service definition, and SOAP/HTTP for service invocation.

In this article, this Web services framework will be applied to the problem of factorizing large numbers to check if they are prime. The problem must first be understood in terms of the mathematics behind it. This problem will then be conceptualized and remodeled in UML, and finally the problem implemented in a service-oriented manner that can easily be translated to a grid computing architecture for scalability and rapid solution of the complex problem.

Source Code

Download the source for this program. For future updates, see primes.sourceforge.net.

To this end, we'll explore the factorization of Mersenne numbers. Mersenne numbers are numbers of the type 2k-1 where p=1,2,3 ... but the real challenge here is to find Mersenne prime numbers -- that is, Mersenne numbers that are prime. These are currently the biggest known prime numbers. Let's begin with a bit of math.

The Problem

In this article, four types of numbers will be considered: prime numbers, Mersenne numbers, Mersenne prime numbers, and perfect numbers.

A prime number is any number that is only divisible by itself and one. Examples of prime numbers are 1, 2, 3, 5, 7, and 11.

A Mersenne number is a number that can be written in the form of 2k-1. The first few Mersenne numbers are 1, 3 , 7, 15, 31, 63, 127, 255, and 511.

A Mersenne Prime Number is a special type of Mersenne number, in that it is also a prime number. The first few Mersenne primes (where k = 2, 3, 5, and 7) are 3, 7, 31, and 127.

A perfect number, on the other hand is one that is equal to the sum of it parts. Put another way, a whole number is perfect if it is equal to the sum of its proper divisors. So, for example, the number 6 is perfect, because its proper divisors are 1, 2, and 3, and 1 + 2 + 3 = 6. Also perfect are 28 (1 + 2 + 4 + 7 + 14), 496, and 8128. These numbers constitute the first four perfect numbers; as is evident, there are not a lot of them.

So what is the relationship between these numbers? By definition, a Mersenne prime is also a prime number. A consequence of this is that k in the 2k-1 equation must also be prime. Also, a relationship between prime numbers and perfect numbers has been known since antiquity and was formulated by Euclid in the following theorem: If 2k-1 is prime, and if n = 2k-1(2k-1), then n is perfect.

But it wasn't until almost 2000 years later that the exact relationship between these types of numbers was resolved by Euler, in the following theorem: If n is an even perfect number, then n = 2k-1(2k-1), where 2k-1 is prime.

In plain language, an even perfect number has a Mersenne prime associated with it.

A Bit of UML

A Mersenne prime, then, is a special type of prime number. In UML, this is represented as an is a relationship: a Mersenne prime is a special type of prime number. And since k in 2k-1 is also prime, then there is also a has a relationship: a Mersenne number has a prime number.

Mersenne Prime

From the Euler theorem on perfect numbers above, it is clear that there are two types of perfect numbers -- odd and even. Again, this can be represented in UML as follows:

Perfect Number

It is interesting to note that an odd perfect number has never been found, even though some extraordinary properties about them are known, for instance:

Check out item three; imagine trying to do some long division sums in your head with that number.

Finally, again from the Euler theorem, perfect numbers are related to prime numbers in that an even perfect number can be expressed in the form n = 2k-1(2k-1); i.e., n = 2k-1 (a Mersenne number). Therefore, every even perfect number has a Mersenne number. Once more, in UML this can be diagrammatically depicted as follows:

Perfect Prime Number

Understanding the Server-Side Architecture

The server-side architecture can be taught as two use cases. The first is to find the factor of a Mersenne number to check if it is a Mersenne prime or not. The second it to take a Mersenne prime number and calculate the corresponding perfect number.

Use Case 1: Factorizing a Mersenne Number.

This is by far the more complicated of the two use cases, and can be thought of in terms of a number of sub-use cases. It involves taking a Mersenne number and finding its factors; since this is computationally a very intensive process, the client is informed of the result asynchronously by email when the calculation is finished.

Furthermore, this now involves the notion of state, since with multiple client invocations, each Mersenne number, and its corresponding email address and factors (once it has finished factorizing), must be remembered. Since Web services are largely stateless services (a consequence of using stateless HTTP as the protocol), they must impersonate some type of stateful service. The client must also be informed of the results, so there must be the ability to email the results to the calling client. Finally, some sort of state-lifecycle management is also desirable. On a server restart, a Mersenne number that is not finished factorizing should, at a minimum, be restarted. The sub-use cases can therefore be listed as follows:

  1. The basic service
  2. Making a stateless service stateful
  3. Informing the client
  4. Making the calculation
  5. Adding a persistence mechanism

These use cases are discussed in more detail below; however, before discussing them individually, it is worth looking at an overview of the sequence diagrams and class diagrams for this use case.

The Basic Service

The basic service is the PerfectCalc interface (here is the WSDL). It defines the two methods listed below; however, the second method belongs to the second use case and will be addressed as part of that discussion.

public boolean checkMersenneNumber(MersenneNumber k, String email)

takes a Mersenne number (which is not necessarily a prime number) and an email address of the invoker, factors the Mersenne number, and emails the result to the invoker. If the Mersenne number has no factor other than itself and one, then it is a Mersenne prime.

public PerfectNumber calcPerfectNumber(MersennePrime MersennePrime)

takes a Mersenne prime and calculates the corresponding perfect number.

Making a Stateless Service Stateful

The FactorFactory object is a factory pattern responsible for making the stateless PerfectCalc service stateful. The FactorFactory object is also a singleton with a protected constructor and a public static initializer. Example 1 gives the code for the FactorFactory object.

Example 1. FactorFactory

/**
 * A Factory class that is also a singleton object for creating multiple 
 * instances of the stateful Factor class
 * @testcase test.TestFactorFactory
 * @version 1.0
 * @author Eoin Lane
 */
public class FactorFactory implements IFactorFactory, Observer {
    /** This contructor is only called once and when it is 
     * the Caretaker is created and initalised 
     */
    protected FactorFactory() {
        // Initalise the Map
        this.map = new HashMap();
        //factors = new Vector();
        caretaker = new SimpleCaretaker();
        System.out.println("Caretaker created and initialized");
    }

    /**
     * Factory method to create instances of IFactor object
     * @param MersenneNumber number of the form (2^k-1)
     * @param email of the invoking client
     */
    public IFactor getFactor(MersenneNumber MersenneNumber, String email) {
        // Store the Mersenne number k value into the Map
        Double d = new Double(MersenneNumber.getK());
        this.map.put(d, email);
        Factor factor = new Factor(MersenneNumber);
        factor.attach(this);
        factor.factorize();
        // Create a Memento of the newly created factor object
        IMemento memento = new SimpleMemento(MersenneNumber.getK(), email);
        //Tell the caretaker about it
        this.caretaker.addElement(memento);
        return (IFactor)factor;
    }

    /** Returns a one and only one instance of this class */
    public static FactorFactory getInstance() {
        if (instance == null) {
            synchronized(FactorFactory.class) {
                if (instance == null) {
                    instance = new FactorFactory();
                }
            }
        }
        return instance;
    }

    /**
     * The callback method that the subject will call on this listener
     * @param factor the finished Factor class
     */
    public void update(Factor factor) {
        // Get the k value from the Mersenne number in the factor class
        double k_value = factor.getMersenneNumber().getK();
        System.out.println("Observer informed of finished 
                  calculation of Mersenne number: " + k_value);
        // Get the email corrosponding th the Factor's Mersenne k value
        String email = (String)(this.map.get(new Double(k_value)));
        Collection bag = factor.getFactors();
        mail(bag, email, k_value);
        // since this Factor class has now finished process remove 
        // it's corrosponding memento object from the caretaker
        // Create a Memento of the newly created factor object
        IMemento memento = new SimpleMemento(k_value, email);
        this.caretaker.removeElement(factor);
    }
  
    private static FactorFactory instance = null;
    // Looks after and tracks all the created Factor objects
    private ICaretaker caretaker;
    // A map containing the Mersenne number, email value pair
    Map map;
}

The important thing to note is the protected constructor, where a Vector of Factor class is created and initialized. This Vector keeps a record of all of the created Factor classes. The Factor classes are then created with the getFactor(MersenneNumber MersenneNumber) method. This pattern -- using a singleton factory object to create and keep track of the generated Factor object -- allows a stateless facade to be exposed to the outside while internally stateful classes can be created and their lifecycle managed.

Informing the Client

The observer pattern is used here as a callback to the FactorFactory object to tell it that the Factor object created to factorize a particular Mersenne number is now finished. To achieve this, FactorFactory implements the observer interface while the Factor object implements the subject interface.

Example 2. Implementing the observer interface.

/**
 * This object is used to factorize a Mersenne number. There are 
 * potentially many of these created in memory at run time.
 * @testcase test.TestFactor
 * @author Eoin Lane
 * @version 1.0
 */
public class Factor implements IFactor, Subject {
    /**
     * Constructor
     * @param MersenneNumber a number of the form (2^k-1)
     */
    public Factor(MersenneNumber MersenneNumber) {
        strategy = new LLFactorer();
        this.MersenneNumber = MersenneNumber;
    }

    /** factorize this object */
    public void factorize() {
        factors = this.strategy.factorize(this.MersenneNumber);
        this.inform();
    }

    public MersenneNumber getMersenneNumber() {
        return this.MersenneNumber;
    }

    /**
     * Register an observer with this object.
     * @param observer an observer
     */
    public void attach(Observer observer) {
        observersVector.addElement(observer);
    }

    /**
     * detach an observer from this object.
     * @param observer an observer
     */
    public void detach(Observer observer) {
        observersVector.removeElement(observer);
    }

    /** Inform all the observers of an event */
    public void inform() {
        java.util.Enumeration enumeration = observers();
        Collection bag = this.getFactors();
        while (enumeration.hasMoreElements()) {
            ((Observer)enumeration.nextElement()).update(this);
        }
    }

    public Enumeration observers() {
        return ((java.util.Vector) observersVector.clone()).elements();
    }

    public Collection getFactors() {
        return factors;
    }

    /*# private FactorFactory _factorFactory; */

    // The (2^k-1) object
    private MersenneNumber MersenneNumber;
    // The factors returned by the factorisation method
    private Collection factors;
    // The strategy to be used to factorize this object
    private IFactorer strategy = null;
   
    /** @associates <{Observer}> */
    private Vector observersVector = new java.util.Vector();
}

Making the Calculation

A strategy pattern is used to implement the factorization. The strategy pattern allows an algorithmic change of the implementation at runtime, depending on the choice of strategies (i.e., a different algorithm may be used, depending on the size of the number to be factored). Example 3 gives the code for LLFactorer, a simple algometric implementation of the IFactor interface.

Example 3. LLFactorer

import java.util.Collection;
import primes.PrimeNumber;
import java.lang.Math;
import perfectNumbers.MersenneNumber;

/**
 * A simple concrete implementation of the strategy IFactorer interface
 * @testcase test.TestLLFactorer
 * @author Eoin Lane
 * @version 1.0
 */
public class LLFactorer implements IFactorer {
    public Collection factorize(Number number) {
        return new java.util.ArrayList(null);
    }

    // We can check a potential factor n of Mp by seeing if 2^p (mod n) =1
    public Collection factorize(MersenneNumber MersenneNumber) {
        System.out.println("Got Mersenne number for processsing");
        // A Collection bag to collect the factors
        Collection bag = new java.util.ArrayList();
        double maxNumberOfIteration = 
                Math.pow((double)2, (double)(MersenneNumber.getK()));
        double siveNumberOfIteration = sive(maxNumberOfIteration);
        for (long i = 1; i < siveNumberOfIteration; i++) {
            if ((factorize(MersenneNumber.getK(), (long)i) == 0)) {
                bag.add(new Long(i));
                System.out.println(i + "\t" + factorize(MersenneNumber.getK(), i));
            }
        }
        return bag;
    }

    //a simple algorithm for checking for find a^b (mod c)
    public static double factorize(double b, double c) {
        double a = 2;
        double res = 1;
        double d;
        while (b > 0) {
            d = b % 2;
            if (d == 1) {
                res = a * res;
                if (res > c) res = res % c;
            }
            a = a * a;
            if (a > c)
                a = a % c;
            b = (b - d) / 2;
        }
        return res;
    }

    // A simple sive to cut down the number of potential factors
    public static double sive(double a) {
        return (a + 1) / 2;
    }
}

The Factor object calls the public Collection factorize(MersenneNumber MersenneNumber) method defined in the Factorer interface to do the factorization. The LLFactorer object (shown above) implements this method by first calling the public static long sive(long a) method to cut down on the potential factor. This is a very simple method that just cuts the number of factors in half. For instance, for checking 24-1 = 15, only factors up to half of that number need to be checked.

We then iterate through the potential factors and individually check them using the factorized algorithm found in the public static double factorize(double b, double c) method. Most of the material for the above strategy code can be found at the Mersenne FAQ.

Adding a Persistence Mechanism

In case of a server restart, the current Mersenne numbers being factored need to be persisted. To achieve this, a memento pattern is used. This is in conjunction with a Caretaker object, which manages and persists the Memento objects. The Memento objects represent the state of the Factor objects currently being computed, and the Caretaker object manages and persists these objects. On startup, the Caretaker checks if all of the previous Factor objects have finished. Unfinished Factor objects are reinitialized and their calculations restarted. In a later iteration, the stage of factorization will also be persisted.

Example 4. Caretaker interface

/**
 * A simple implementation of the Caretaker interface. This Caretaker 
 * implementation is used to persist Factor objects by keeping a list of 
 * Memento representing the state of the current Factor instance. 
 * It writes that list to disk using a a JDO DAO object
 * @author Eoin Lane
 * @version 1.0
 */
public class SimpleCaretaker implements ICaretaker {
    /** The current list of Memento objects */
    private Vector currentList;

    /** The initalisign list of factor objects */
    private Vector factors;

    /**
     * @link dependency
     * @label mementos
     */

    /*# IMemento lnkIMemento; */

    /**
     * Initialise the caretaker load Memento object using 
     * JDO from disk if necessary
     * @param a vector of factory objects
     */
    public SimpleCaretaker(Vector factors) {
        this.currentList = new Vector();
        this.factors = factors;
        System.out.println("Loading Memento 
                  objects using JDO from if necessary disk");
    }

    /** Clears the current Vector list */
    public void clear(Vector factor) {
        currentList = new Vector();
        this.factors = factors;
    }

    /** Adds an element */
    public void addElement(Object obj) {
        currentList.addElement(obj);
    }

    /** Removes an element */
    public void removeElement(Object obj) {
        currentList.removeElement(obj);
    }

    //------------------------------------
    private void remove(IMemento obj) {
        IMemento m = (IMemento)obj;
        m.restore(); //and restore the old position
    }
}

The persistence mechanism used here is the new Java Data Object (JDO) mechanism. A JDO data access object (DAO) is used to encapsulate the persistence mechanism and the Caretaker object uses the JDODOA object to persist the object graph of Memento objects (these Memento objects contain the state of the Factor objects currently in memory).

Use Case 2: Calculate a Perfect Number From a Mersenne Prime

This use is also defined in the IPerfectCalc interface and implemented in PerfectCalc. As a reminder, there are two methods declared in IPerfectCalc. See the above section for more information about these methods.

As is evident from the service description, this service takes a Mersenne prime and returns the corresponding perfect number. Here are the sequence diagrams and class diagram for this use case.

The Patterns

This server-side architecture uses a number of GOF patterns, which are detailed below.

A stateless facade service class is exposed to the client via a Web service. This facade calls the FactorFactory (which is a factory class, a mediator, and a singleton) to create Factor stateful objects. Potentially, there will be many of these objects, depending on the number of client requests for factorization. The Factor stateful object uses a strategy pattern to find the factors of the Mersenne number and informs the FactorFactory of the state change using an observer pattern. The FactorFactory will then email the result. Finally, the FactorFactoryalso uses a memento pattern to persist the state of the Mersenne number factorization calculation, so it may be reinstated in case of a server restart.

Here is a list of and references to the GOF patterns used:

  1. The facade pattern (the stateless service exposes a simple interface of checkMersenne number to the client).
  2. The factory method (a FactorFactory object for creating Factor (or any other type of object implementing the IFactor interface) at runtime),
  3. The singleton pattern is used for the FactorFactory object to ensure that there is only one object creating and managing Factor objects.
  4. The strategy pattern (for factorizing the MersenneNumber, decoupling the Factor object from how it factors).
  5. The mediator pattern is used in the FactorFactory again to control and manage the lifecycle of the Factor objects that it is responsible for creating.
  6. The observer pattern (a callback to the FactorFactory object that the stateful Factor object has finished factoring and the results should be emailed to the calling client).
  7. The memento pattern to persist the state of the Factor object in case of a server restart.

Grid Computing

Where does the notion of grid computing come into all of this? In the above example, an instance of the strategy pattern -- the LLFactorer -- has been looked at. Another potential factorization algorithm uses the Elliptic Curve Method (ECM) to check for primality. This can be another implementation of the IFactorer interface called ECMFactorer. Numerous other algorithms exist that could be used for this; a comprehensive list can be found here.

For grid computing, the algorithm must be capable of breaking this problem down into nonsequential units of work that can be run in a simultaneous, or parallel, manner. The LL algorithm cannot be parallelized very well, but the ECM algorithm can. Thus, a master Web service could accept a potential Mersenne number for primality testing, only to de-convolute the problem into multiple, paralleled problems that can be farmed out to other Web services from the ECMFactorer strategy implementation.

Idle processes, within an Intranet exposing an algorithm service as WSDL, for example, could be looked up using UDDI and invoked using SOAP/HTTP. Care should be taken here in long-running blocking transactions, and in extreme cases, a more flexible non-blocking protocol like SOAP/JMS or SOAP/SMTP could be used.

Therefore, our choice of algorithmic implementation dictates the usefulness of applying the problem to a grid solution. The easy infrastructure that Web services provide for solving this sort of computationally-intense problem is unattainable in any other middleware solution currently available.

Conclusion

The combination of Web services and grid computing provides a powerful duo for tackling computationally difficult problems. The three amigos of Web services -- WSDL, UDDI, and SOAP -- provide a framework on which a grid can flourish.

It remains to be seen how effective this will be in terms of providing solutions in realistic timeframes. Other standards within the Web services domain will need to mature in order for this to be a true robust enterprise solution. However, this type of grid computing is already being actively developed by Sun and HP, and applications may not be far off.

Eoin Lane is a Principal Technologist with Cape Clear Software. He offers advice and consultancy to Cape Clear clients in the design and implementation of Web services projects.


Return to ONJava.com.

Copyright © 2009 O'Reilly Media, Inc.