ONLamp.com

Extending Ruby with C

Adding Methods

Now it's time to add some actual functionality to the new class, which means deciding exactly how it should work. The basic usage pattern for GenX itself is simple: create a document, then create a variety of elements, possibly mixed with character data, and finally close the document. In the simplest terms, this boils down to five methods: GenX::Writer#begin_document, GenX::Writer#end_document, GenX::Writer#begin_element, GenX::Writer#end_element, and GenX::Writer#text. Because you can't do anything without a document, I'll start with begin_document and end_document.

GenX::Writer#begin_document

The underlying GenX library has two functions for starting a new document. The first, genxStartDocFile, takes a FILE * argument to which to write the contents of the document. I want something a bit more generic than that, because there's no certainty that the user wants to send the output to a file. It might be nice to be able to write the XML to a network socket, or a string, or any number of other things. Fortunately, GenX also provides the genxStartDocSender function, which requires a genxSender structure containing pointers to send, sendBounded, and flush functions. The extension needs to implement those functions. Here is writer_send, the send function:

static genxStatus
writer_send (void *baton, constUtf8 s)
{
  VALUE file = (VALUE) baton;
  VALUE ary;

  if (! rb_respond_to (file, rb_intern ("<<")))
    rb_raise (rb_eRuntimeError, "target must respond to '<<'");

  ary = rb_ary_new2 (2);

  rb_ary_store (ary, 0, file);
  rb_ary_store (ary, 1, rb_str_new2 (s));

  if (rb_rescue (call_write, ary, handle_exception, Qnil) == Qfalse)
    return GENX_IO_ERROR;
  else
    return GENX_SUCCESS;
}

writer_send takes two arguments. First is a void * that points to the "user data" associated with the genxWriter. In this case, the pointer holds a VALUE that refers to the object into which to send data. This could be a File object, a String, or anything else; what's important is that it responds to the << method, which the code uses to write the data from GenX. The second argument is a constUtf8, which is GenX's typedef for a pointer to the start of a null-terminated UTF-8 string containing the data to write out.

The code first checks whether the object it received actually responds to the << method. If it doesn't, then it can't do much else, so it raises an exception via rb_raise. If it passes that check, it creates a Ruby array to hold the object into which to write the data and a Ruby string to hold the data to write. The array exists to make it easier to pass them to call_write, a helper function that does the writing by way of the rb_rescue function. Here's the implementation of call_write:

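static VALUE
call_write (VALUE ary)
{
  rb_funcall (rb_ary_entry (ary, 0),
              rb_intern ("<<"),
              1,
              rb_ary_entry (ary, 1));

  /* Return a value that cannot be mistaken for the Qfalse that
     handle_exception returns on failure. */
  return Qtrue;
}

All this does is use rb_funcall to call the << method on the file object, extracted from the array using rb_ary_entry. The << method takes a single argument (thus the 1): the string built up in writer_send, also extracted from the array. The function then returns Qtrue, because it is declared to return a VALUE and writer_send compares the result of rb_rescue against Qfalse.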

If all this code needs to do is call a function, why not do it directly in writer_send, thus avoiding the rb_rescue gymnastics? The answer is that the << method might raise an exception. If it does, rb_rescue calls the handle_exception function passed to it and hands that function's return value back to its own caller. In this case, handle_exception returns Qfalse so that writer_send knows to return GENX_IO_ERROR to its caller. Here's the (trivial) implementation of handle_exception:

static VALUE
handle_exception (VALUE unused)
{
  return Qfalse;
}

All right, that's the implementation of send. The next function is sendBounded. This is pretty much the same thing; it just needs to create the string it passes in to call_write based on a start pointer and an end pointer instead of a null-terminated string. The implementation looks like this:

static genxStatus
writer_send_bounded (void *baton, constUtf8 start, constUtf8 end)
{
  VALUE file = (VALUE) baton;
  VALUE ary;

  if (! rb_respond_to (file, rb_intern ("<<")))
    rb_raise (rb_eRuntimeError, "target must respond to '<<'");

  ary = rb_ary_new2 (2);

  rb_ary_store (ary, 0, file);
  rb_ary_store (ary, 1, rb_str_new (start, end - start));

  if (rb_rescue (call_write, ary, handle_exception, Qnil) == Qfalse)
    return GENX_IO_ERROR;
  else
    return GENX_SUCCESS;
}

As you can see, the only difference here is in calling rb_str_new instead of rb_str_new2. This takes a pointer and a length, calculated based on the given start and end pointers.
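
The pointer arithmetic behind that length is worth seeing in isolation. The following is only a sketch in plain C, with no Ruby or GenX involved; copy_bounded is a hypothetical helper invented for illustration, and it assumes that end points one byte past the last byte to copy, as it does for sendBounded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GenX or Ruby: copies the bytes in
   [start, end) into buf and null-terminates the result.  Returns the
   number of bytes copied, or -1 if buf is too small. */
static long
copy_bounded (char *buf, size_t bufsize, const char *start, const char *end)
{
  size_t len;

  assert (start != NULL && end != NULL && end >= start);

  /* Same arithmetic as in writer_send_bounded: the length is the
     distance between the two pointers. */
  len = (size_t) (end - start);

  if (len + 1 > bufsize)
    return -1;

  memcpy (buf, start, len);
  buf[len] = '\0';

  return (long) len;
}
```

Because the range carries its own length, the source data does not need a terminating null byte, which is what lets a caller pass a slice out of the middle of a larger buffer.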

Finally, this code needs a flush function. Here's writer_flush and its helper function call_flush:

static VALUE
call_flush (VALUE file)
{
  rb_funcall (file, rb_intern ("flush"), 0);

  return Qtrue;
}

static genxStatus
writer_flush (void *baton)
{
  VALUE file = (VALUE) baton;

  /* if we can't flush, just let it go... */
  if (! rb_respond_to (file, rb_intern ("flush")))
    return GENX_SUCCESS;

  if (rb_rescue (call_flush, file, handle_exception, Qnil) == Qfalse)
    return GENX_IO_ERROR;
  else
    return GENX_SUCCESS;
}

This is rather similar to the send and sendBounded functions, so let's consider only the differences. First of all, if the object to which to write the data doesn't respond to the flush method, the code returns success, on the assumption that the target holds its data in memory (a String, for example) or is otherwise something for which flushing makes no sense. The only other difference is that instead of the << method, it calls the flush method on the object.

With all three helper functions in place, it's time to write the writer_begin_document function itself. That'll finally make it possible to start a new document with the GenX::Writer object.

static genxSender writer_sender = { writer_send,
                                    writer_send_bounded,
                                    writer_flush };

static VALUE
writer_begin_document (VALUE self, VALUE file)
{
  genxWriter w;

  Data_Get_Struct (self, struct genxWriter_rec, w);

  if (! rb_respond_to (file, rb_intern ("<<")))
    rb_raise (rb_eRuntimeError, "target must respond to '<<'");

  genxSetUserData(w, (void *) file);

  GENX4R_ERR (genxStartDocSender (w, &writer_sender), w);

  return Qnil;
}

The first thing to do is to unwrap the contents of the self object. Recall the earlier use of Data_Wrap_Struct to turn a struct genxWriter_rec into a Ruby object. This code uses its counterpart, Data_Get_Struct, to pull it back out again. For paranoia's sake, it confirms that the file it received responds to the << method, because it will call that method later on. Next, calling genxSetUserData stores file as the writer's "user data," so that when the time comes to call the sender functions, they can call the appropriate methods on it. Finally, the code calls genxStartDocSender, passing it the writer and a pointer to the genxSender so it knows which functions it should call later on. The object is now ready to do some real work.
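
The combination of a table of callbacks and an opaque user-data pointer is a common C idiom, and it may be easier to follow stripped of the Ruby and GenX details. The sketch below is a plain C illustration only: demo_sender, demo_writer, demo_buffer, buffer_send, and demo_emit are all hypothetical names made up for this example, and the real genxSender has three callbacks with different types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for genxSender and the writer; the real
   definitions live in genx.h and differ in detail. */
typedef int (*demo_send_fn) (void *baton, const char *s);

typedef struct
{
  demo_send_fn send;
} demo_sender;

typedef struct
{
  void *user_data;            /* opaque pointer the library never inspects */
  const demo_sender *sender;  /* callbacks supplied by the caller */
} demo_writer;

/* A destination that only the callback knows how to interpret. */
typedef struct
{
  char data[64];
  size_t used;
} demo_buffer;

/* Callback: casts the baton back to its real type and appends to it. */
static int
buffer_send (void *baton, const char *s)
{
  demo_buffer *buf = (demo_buffer *) baton;
  size_t len = strlen (s);

  if (buf->used + len + 1 > sizeof buf->data)
    return 1;                 /* nonzero signals failure */

  memcpy (buf->data + buf->used, s, len + 1);
  buf->used += len;

  return 0;
}

static const demo_sender buffer_sender = { buffer_send };

/* Library side: forwards output to whatever callback was registered,
   handing back the user data untouched. */
static int
demo_emit (demo_writer *w, const char *s)
{
  assert (w != NULL && w->sender != NULL);

  return w->sender->send (w->user_data, s);
}
```

The library side never needs to know what the baton points at. In the extension the baton is a Ruby VALUE rather than a demo_buffer, but the round trip through void * works the same way.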

There is something a little strange about the call to genxStartDocSender, though: it is wrapped in the GENX4R_ERR macro. This is a helper that lets you avoid writing the same boilerplate error-handling code in a lot of different places. Here's the definition:

#define GENX4R_ERR(expr, w)                                         \
  do {                                                              \
    genxStatus genx4r__status = (expr);                             \
    if (genx4r__status)                                             \
      rb_raise (rb_cGenXException, "%s", genxLastErrorMessage (w)); \
  } while (0)

All it does is declare a genxStatus to hold the return value of the expression it's calling. If that status is nonzero (meaning anything other than GENX_SUCCESS), it raises an exception via rb_raise. The exception is of type rb_cGenXException and holds the result of genxLastErrorMessage, making it possible to present a reasonable error to the caller. A do { } while (0) block (with no trailing semicolon in the macro definition) wraps the whole thing, so that the macro call plus the semicolon the caller writes after it behaves just like a single regular C statement.
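
To see why the do { } while (0) wrapper matters, consider a macro of the same shape used in an unbraced if/else. This is a minimal sketch in plain C; DEMO_CHECK, demo_step, and demo_run are hypothetical names invented for the illustration, and the macro records the status in a variable instead of raising a Ruby exception.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro in the style of GENX4R_ERR: evaluates expr exactly
   once and, if the status is nonzero, stores it in *errp. */
#define DEMO_CHECK(expr, errp)            \
  do {                                    \
    int demo__status = (expr);            \
    if (demo__status)                     \
      *(errp) = demo__status;             \
  } while (0)

/* Returns the given status and counts how often it was called. */
static int
demo_step (int status, int *calls)
{
  assert (calls != NULL);

  ++*calls;
  return status;
}

/* Because the macro expands to a single statement, it can sit in an
   unbraced if/else without the else attaching to the macro's inner if. */
static int
demo_run (int enabled, int status, int *calls)
{
  int err = 0;

  if (enabled)
    DEMO_CHECK (demo_step (status, calls), &err);
  else
    err = -1;

  return err;
}
```

Had the macro body been a bare { } block, the semicolon after the macro call would end the if statement early and the else would fail to compile. Had it been a bare if with no wrapper at all, the else would silently bind to the macro's own if.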

This is the first appearance of rb_cGenXException. Where did it come from? Just like the rb_cGenXWriter VALUE used to hold the reference to the GenX::Writer class, rb_cGenXException is a VALUE holding a reference to the GenX::Exception class. Its definition is in Init_genx4r:

rb_cGenXException = rb_define_class_under (rb_mGenX,
                                           "Exception",
                                           rb_eStandardError);

This creates a new class within the rb_mGenX module named Exception; the class inherits from the rb_eStandardError class, better known in Rubyland as StandardError.

The final step here is to add a call to rb_define_method in the init function in order to hook up writer_begin_document to the GenX::Writer class. It looks like this:

rb_define_method (rb_cGenXWriter,
                  "begin_document",
                  writer_begin_document,
                  1);

Calling the begin_document method on an instance of the GenX::Writer class causes Ruby to call writer_begin_document. The 1 indicates that begin_document takes a single VALUE as its argument.

GenX::Writer#end_document

After starting a document with GenX::Writer#begin_document, there must be some way to end it, by way of the corresponding GenX::Writer#end_document method. This is, fortunately, much simpler than begin_document. All it needs to do is call genxEndDocument. The actual implementation looks like this:

static VALUE
writer_end_document (VALUE self)
{
  genxWriter w;

  Data_Get_Struct (self, struct genxWriter_rec, w);

  GENX4R_ERR (genxEndDocument (w), w);

  return Qnil;
}

As you can see, this is basically boilerplate stuff. It pulls the writer out of the Ruby object using Data_Get_Struct, calls genxEndDocument on it (wrapped in the GENX4R_ERR macro to handle the error checking), and returns Qnil, which in Ruby terms returns nil to the caller.

The method is then hooked up via the appropriate rb_define_method call in Init_genx4r:

rb_define_method (rb_cGenXWriter,
                  "end_document",
                  writer_end_document,
                  0);

This is exactly like hooking up the begin_document method, except that it takes no arguments instead of one.
