Extending Ruby with C
GenX::Writer#begin_element
It has taken an awful lot of trouble to put the begin_document and end_document methods in place. Now users can start and end documents, but for this to be of much use, they'll want to put some content in the document. Because this is XML, and XML documents all start with a root element, the next methods to implement are GenX::Writer#begin_element and GenX::Writer#end_element. The obvious place to start is GenX::Writer#begin_element.
GenX::Writer#begin_element is a thin wrapper around the genxStartElementLiteral function. It's really similar to the methods already shown. Here's the implementation:
static VALUE
writer_begin_element (int argc, VALUE *argv, VALUE self)
{
  genxWriter w;
  VALUE xmlns, name;

  /* One argument is the element name; two are namespace, then name. */
  switch (argc)
    {
    case 1:
      xmlns = Qnil;
      name = argv[0];
      break;
    case 2:
      xmlns = argv[0];
      Check_Type (xmlns, T_STRING);
      name = argv[1];
      break;
    default:
      rb_raise (rb_eArgError, "wrong number of arguments (expected 1 or 2)");
    }

  Check_Type (name, T_STRING);
  Data_Get_Struct (self, struct genxWriter_rec, w);

  /* RSTRING (...)->ptr is the Ruby 1.8 idiom; later versions of Ruby
     provide the RSTRING_PTR () macro for the same purpose. */
  GENX4R_ERR (genxStartElementLiteral
              (w,
               NIL_P (xmlns) ? NULL : (constUtf8) RSTRING (xmlns)->ptr,
               (constUtf8) RSTRING (name)->ptr), w);

  return Qnil;
}
A few things here haven't appeared before. First of all, this method takes a variable number of arguments. Most of the code in this function goes to figuring out how many arguments it received and setting things up as appropriate. The way Ruby lets you do this at the C level is that the underlying C function takes as arguments an integer that holds the number of arguments passed, a pointer to an array of VALUEs that contains each of the arguments, and a VALUE that holds the invoking object.
If it receives one argument, it uses that as the name of the element. If it receives two arguments, then the first is the namespace and the second is the name. Given an xmlns argument, the code verifies that it is a String using the Check_Type macro with the T_STRING constant. The same check occurs for the element's name. Then, as usual, it pulls the genxWriter out of self and finally calls the underlying genxStartElementLiteral function, passing in the namespace if provided and valid, and a NULL otherwise. When passing the namespace and element name, note that the code uses the RSTRING macro to cast the VALUE to the underlying string data structure before accessing the C-string pointer via the ptr field in that structure.
Once again, Init_genx4r needs more code to hook up this method:
rb_define_method (rb_cGenXWriter,
                  "begin_element",
                  writer_begin_element,
                  -1);
Notice the -1 that tells Ruby to call this method via the count/array/object style of argument passing.
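For comparison, the count/array handling has roughly the same shape as a Ruby method declared with a splat parameter. The following is only a sketch in plain Ruby: parse_element_args is a hypothetical helper that is not part of GenX4r, and it raises ArgumentError and TypeError where the C code relies on rb_raise and Check_Type.

```ruby
# Hypothetical pure-Ruby analogue of the argc/argv handling in
# writer_begin_element. Not part of GenX4r; for illustration only.
def parse_element_args(*args)
  case args.length
  when 1
    # One argument: the element name, with no namespace.
    xmlns = nil
    name = args[0]
  when 2
    # Two arguments: the namespace first, then the element name.
    xmlns, name = args
    raise TypeError, "namespace must be a String" unless xmlns.is_a?(String)
  else
    raise ArgumentError, "wrong number of arguments (#{args.length} for 1..2)"
  end
  raise TypeError, "name must be a String" unless name.is_a?(String)
  [xmlns, name]
end

p parse_element_args("foo")                        # prints [nil, "foo"]
p parse_element_args("http://example.com", "foo")  # prints ["http://example.com", "foo"]
```

In both the C and the Ruby form, the method itself is responsible for rejecting argument counts it does not understand, because the interpreter no longer checks the arity on its behalf.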
Now that there's a way to start an element, there must be a way to end it. That's the purpose of the GenX::Writer#end_element method.
GenX::Writer#end_element
As you might have guessed, GenX::Writer#end_element is very similar to GenX::Writer#end_document. Here's the implementation:
static VALUE
writer_end_element (VALUE self)
{
  genxWriter w;

  Data_Get_Struct (self, struct genxWriter_rec, w);
  GENX4R_ERR (genxEndElement (w), w);

  return Qnil;
}
All it does is pull out the writer and call genxEndElement on it. GenX does the rest. As usual, it takes one call in Init_genx4r to hook up the method:
rb_define_method (rb_cGenXWriter,
                  "end_element",
                  writer_end_element,
                  0);
Now GenX4r can actually produce XML. Jump into irb and try it out.
$ irb
irb(main):001:0> require 'genx4r'
=> true
irb(main):002:0> w = GenX::Writer.new
=> #<GenX::Writer:0x321f0c>
irb(main):003:0> s = ''
=> ""
irb(main):004:0> w.begin_document(s)
=> nil
irb(main):005:0> w.begin_element("foo")
=> nil
irb(main):006:0> w.end_element
=> nil
irb(main):007:0> w.end_document
=> nil
irb(main):008:0> s
=> "<foo></foo>"
irb(main):009:0>
There you have it! The extension actually produced some XML output! Of course, most XML needs some textual content within at least some of the tags. Making that work means implementing GenX::Writer#text, a wrapper around genxAddText.
GenX::Writer#text
After everything implemented so far, GenX::Writer#text doesn't have anything all that new to it. Take a look:
static VALUE
writer_text (VALUE self, VALUE text)
{
  genxWriter w;

  Check_Type (text, T_STRING);
  Data_Get_Struct (self, struct genxWriter_rec, w);
  GENX4R_ERR (genxAddText (w, (constUtf8) RSTRING (text)->ptr), w);

  return Qnil;
}
There are the usual hoops to access the genxWriter and then a call to pass the text through to genxAddText. Here's code to hook up the method in Init_genx4r.
rb_define_method (rb_cGenXWriter,
                  "text",
                  writer_text,
                  1);
There you have it, a functionally complete wrapper. Try it out in irb to prove it.
$ irb
irb(main):001:0> require 'genx4r'
=> true
irb(main):002:0> w = GenX::Writer.new
=> #<GenX::Writer:0x321f0c>
irb(main):003:0> s = ''
=> ""
irb(main):004:0> w.begin_document(s)
=> nil
irb(main):005:0> w.begin_element("foo")
=> nil
irb(main):006:0> w.text("bar")
=> nil
irb(main):007:0> w.end_element
=> nil
irb(main):008:0> w.end_document
=> nil
irb(main):009:0> s
=> "<foo>bar</foo>"
irb(main):010:0>
With the combination of elements and text, you can now start using GenX4r for some nontrivial tasks. Before that, I'd like to write some tests to verify that everything works now and that it will continue to work as I make changes in the future.
Unit Testing
In Ruby, the accepted way to write unit tests is to use the Test::Unit framework. This is a standard unit test framework, written along the lines of the popular JUnit package. To use it, subclass the Test::Unit::TestCase class and implement your tests as methods that are named test_something (where the something part changes for each test). Inside the tests, use the assert method to indicate what conditions need to be true for the tests to pass. Here's a simple test case to start:
require 'test/unit'
require 'genx4r'

class BasicsTest < Test::Unit::TestCase
  def test_element
    w = GenX::Writer.new
    s = ''
    w.begin_document(s)
    w.begin_element('foo')
    w.text('bar')
    w.end_element
    w.end_document
    assert s == '<foo>bar</foo>'
  end
end
Run the tests by running that file. You should receive output similar to the following:
$ ruby test.rb
Loaded suite test
Started
.
Finished in 0.005774 seconds.
1 tests, 1 assertions, 0 failures, 0 errors
The line with the single dot on it is where you see the output for the tests. Each passing test prints a . whereas failing tests print an F. To add more tests, fill in more test methods. They will run automatically when you run the file.
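As a sketch of what further test methods might look like, the file below adds a nesting test and an error test. It makes two assumptions that are not from the article: StubWriter is a hypothetical pure-Ruby stand-in so the file runs without the compiled extension (a real test file would require 'genx4r' and use GenX::Writer), and the exception class checked in the second test belongs to the stub, not necessarily to GenX4r. The assert_equal and assert_raise methods come from Test::Unit itself; assert_equal reports both the expected and the actual value on failure, which a bare assert cannot do.

```ruby
require 'test/unit'

# Hypothetical stand-in so this file runs without the compiled extension;
# in a real test file, require 'genx4r' and use GenX::Writer instead.
class StubWriter
  def begin_document(out)
    @out = out
    @open = []
  end

  def begin_element(name)
    @open.push(name)
    @out << "<#{name}>"
  end

  def text(str)
    @out << str
  end

  def end_element
    raise "no open element" if @open.empty?
    @out << "</#{@open.pop}>"
  end

  def end_document; end
end

class MoreBasicsTest < Test::Unit::TestCase
  # Runs before every test method, so each test gets a fresh writer.
  def setup
    @w = StubWriter.new
    @s = String.new
    @w.begin_document(@s)
  end

  def test_nested_elements
    @w.begin_element('foo')
    @w.begin_element('bar')
    @w.text('baz')
    @w.end_element
    @w.end_element
    @w.end_document
    assert_equal '<foo><bar>baz</bar></foo>', @s
  end

  def test_unbalanced_end_element
    # The stub raises RuntimeError when nothing is open.
    assert_raise(RuntimeError) { @w.end_element }
  end
end
```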
Making Things a Bit More Ruby-esque
All right, now there's a working module and a test suite to make sure it keeps on working. I'm all set to release this new toy to the unsuspecting masses out there on the Internet, right? Not quite. Although the API works, it's not ideal. You have to remember to call the GenX::Writer#end_element and GenX::Writer#end_document methods at exactly the right times; otherwise you'll either mess up the output (if elements nest incorrectly) or even possibly throw an exception because you call underlying GenX functions out of order. Remember that GenX is big on enforcing correctness, so if you screw up, it will tell you about it.
It would be really nice to arrange for the module to call these end methods at the appropriate times. Fortunately, Ruby has a way to do that: blocks.
A block in Ruby is a chunk of code passed to a method alongside its arguments. The method can then use the yield keyword to invoke the block whenever it wants. The syntax looks like this:
def takes_a_block(&block)
  puts "before yield"
  yield
  puts "after yield"
end

takes_a_block do
  puts "in the block"
end
Running this code produces the following output:
$ ruby blocks-example.rb
before yield
in the block
after yield
Note that Ruby allows braces as block delimiters instead of do and end, in which case the method call could have looked like takes_a_block { puts "in the block" }. Both ways are valid. Which one you use is mostly just a question of style.
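Two more block features are worth a quick sketch before moving back to C: a method can ask whether it received a block with block_given?, and yield can pass values into the block. The each_tag method here is a made-up example, not part of GenX4r.

```ruby
# Yields each name in turn, or raises if the caller forgot the block.
def each_tag(*names)
  raise ArgumentError, "must be called with a block" unless block_given?
  names.each { |name| yield name }
  names.length
end

seen = []
count = each_tag("foo", "bar") { |tag| seen << tag.upcase }
p seen    # prints ["FOO", "BAR"]: the block ran once per yielded value
p count   # prints 2: the method's own return value
```

The C API has direct counterparts for both: rb_block_given_p performs the check and rb_yield invokes the block with a value.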
Using blocks to indicate the beginning and end of an element in the XML to generate solves the API problem perfectly. Here's how to implement this with a new GenX::Writer#element method defined at the C level.
static VALUE
writer_element (int argc, VALUE *argv, VALUE self)
{
  /* Check for the block first, so a missing block does not leave an
     element open in the writer. */
  if (! rb_block_given_p ())
    rb_raise (rb_eRuntimeError, "must be called with a block");

  writer_begin_element (argc, argv, self);
  rb_yield (Qnil);
  writer_end_element (self);

  return Qnil;
}
All of this is merely a new method that calls begin_element, invokes the block it received, and then calls end_element, raising an exception if it didn't receive a block. In order to nest elements or put text inside them, the passed-in block needs to contain the code to create that content. There are two new C-level functions here: rb_block_given_p, the predicate that asks "Was I given a block?", and rb_yield, which invokes the block. Because there's nothing else to pass to the block, the code passes Qnil.
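The same open, yield, close pattern can be written in a few lines of plain Ruby, which may make the C version easier to follow. This is a sketch only: SketchWriter is a hypothetical stand-in that builds its output by string concatenation and does none of GenX's validation. It checks for the block before opening the element, so that a missing block leaves nothing half-open.

```ruby
# Hypothetical stand-in for GenX::Writer, just enough to show the pattern.
class SketchWriter
  def initialize(out)
    @out = out
    @open = []
  end

  def begin_element(name)
    @open.push(name)
    @out << "<#{name}>"
    nil
  end

  def end_element
    @out << "</#{@open.pop}>"
    nil
  end

  def text(str)
    @out << str
    nil
  end

  # Ruby-level counterpart of writer_element: open, yield, close.
  def element(name)
    raise "must be called with a block" unless block_given?
    begin_element(name)
    yield
    end_element
  end
end

out = String.new
w = SketchWriter.new(out)
w.element('foo') { w.element('bar') { w.text('baz') } }
puts out   # prints <foo><bar>baz</bar></foo>
```

If the block raises, end_element is skipped in this sketch. Wrapping the yield in begin/ensure would close the element anyway; whether that is desirable depends on whether partial output is of any use to the caller.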
As usual, the code to hook up this new method in Init_genx4r looks like:
rb_define_method (rb_cGenXWriter, "element", writer_element, -1);
With that in place, using the new method like this:
w = GenX::Writer.new
s = ''
w.begin_document(s)
w.element('foo') do
  w.element('bar') do
    w.text('baz')
  end
end
w.end_document
puts s
produces this output:
<foo><bar>baz</bar></foo>
Isn't that a much nicer API? Instead of having to remember to handle the nesting of elements manually, users can encode it directly into their program, making it much more difficult to do incorrectly. Note that the same technique can easily apply to the GenX::Writer#begin_document and GenX::Writer#end_document methods.
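One possible shape for such a document wrapper, sketched in plain Ruby rather than C: the BlockDocument module and the TinyWriter class are both hypothetical names invented for this illustration, and yielding the writer to the block is a design choice of the sketch, not something GenX4r is known to do.

```ruby
# Hypothetical mixin: adds a block-based document method to any object
# that responds to begin_document(out) and end_document.
module BlockDocument
  def document(out)
    raise ArgumentError, "must be called with a block" unless block_given?
    begin_document(out)
    yield self
    end_document
    out
  end
end

# Minimal stand-in writer so the sketch runs without the extension.
class TinyWriter
  include BlockDocument

  def begin_document(out)
    @out = out
  end

  def text(str)
    @out << str
  end

  def end_document
    @out = nil
  end
end

result = TinyWriter.new.document(String.new) do |w|
  w.text('hello')
end
puts result   # prints hello
```

Returning the output object from document lets the caller create the writer, produce the document, and capture the result in a single expression.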
Some Conclusions
This whole idea started off as something of an experiment. Is it really as easy as I thought it would be to wrap a C library in Ruby? I don't know about you, but I think it was a success. In fewer than 300 lines of C, I've provided users with access to a useful subset of the GenX library's functionality. If I were going to implement this code directly in Ruby, it would be longer and most likely buggier, simply because the C version has been debugged already and our hypothetical Ruby version has not.
That said, GenX4r is still somewhat incomplete. The begin_document and end_document methods still need block-based wrapper methods, and for efficiency I also want to provide users the ability to predeclare namespaces and elements, to avoid having to validate them each time they're used. Plus, I'm a reasonably new Ruby hacker, so it's not out of the realm of possibility that there are bugs in the wrapper. Even so, I think this is a reasonable proof of concept. The additional work I've done on GenX4r that isn't documented here indicates to me that it is a success. On the strength of this experience, I have no trouble recommending Ruby as a convenient scripting language to wrap around libraries written in C.
For the record, the current version of GenX4r, which includes all this hypothetical functionality in one form or another, constitutes only 584 lines of C. If you're interested in using it or helping me develop it further, please grab the latest version from the GenX4r home page.
Garrett Rooney is a software developer at FactSet Research Systems, where he works on real-time market data.