ONLamp.com

OpenGuides: City Wikis in Perl

Getting the Data Back out Again

So, what else makes OpenGuides different from other wiki software? Why would someone choose OpenGuides to run her city guide rather than, say, MediaWiki, or Kwiki, or MoinMoin?



OpenGuides' geographical awareness is certainly an advantage: having distance searching and Google Maps support built in is very handy. Perl programmers will also appreciate the fact that it's written in Perl, of course! But the big win, as far as I'm concerned, is its use of structured data. Not only does this allow us to build complex queries like "find me all the real ale pubs within 500m of King's Cross station that serve food at lunchtime" or "find me all the restaurants that Kake wants to be taken to that are within 500m of any Jubilee Line station," it also makes it easier for people to contribute content to the guides. A number of people have told me that they really appreciate seeing some structure on the edit page rather than just a big white blank box; I suppose a blank edit box brings on the wiki equivalent of writer's block!

In theory, the possibilities for structured data in OpenGuides are endless, since the underlying Wiki::Toolkit software, which handles all data storage and output for OpenGuides, puts no restrictions on what kind of data can be stored. The most useful structured data fields at the moment are categories, locales, latitude, and longitude. Address, postal code, etc., are also stored in structured data fields, but mainly for reasons of presentation and usability rather than for searching.

As well as the search and update tools incorporated within OpenGuides itself, individual guide admins are also free to write custom search tools. The Randomness Guide has a number of custom search scripts, including various tools aimed at making life easier for the guide's contributors, such as a way of finding all stub pages (pages with very little content), without those pages needing to be specifically marked as such.

OpenGuides data is accessible to programmers in a number of ways. Firstly, the OpenGuides modules have a number of methods which can be called externally; for example, text search results can be returned as a Perl data structure (or indeed a hash of Template Toolkit variables) instead of as a nicely formatted HTML page. If you'd like to get closer to the wires, the underlying Wiki::Toolkit modules offer additional methods, allowing you, for example, to grab a list of all categories that a given page has been placed in, or a list of all pages within a certain distance of a certain latitude/longitude. If that's not good enough (or is too slow), the fact that all OpenGuides data is structured, and is stored in an SQL database (your choice of Postgres, MySQL, or SQLite) means that even the most baroque questions can be answered simply by writing some SQL.

For example, when writing the stub page finder mentioned above, I went back to the SQL:

my $sql = "
SELECT node.name,
       locale.metadata_value,
       category.metadata_value
FROM node
LEFT JOIN metadata as locale
  ON ( node.id = locale.node_id
       AND node.version = locale.version
       AND lower( locale.metadata_type ) = 'locale'
     )
LEFT JOIN metadata as category
  ON ( node.id = category.node_id
       AND node.version = category.version
       AND lower( category.metadata_type ) = 'category'
     )
WHERE char_length( node.text ) < ?
AND node.text NOT LIKE '%#REDIRECT%'
";

if ( $q->param( "exclude_locales" ) ) {
    $sql .= " AND node.name NOT LIKE 'Locale %'";
}
if ( $q->param( "exclude_categories" ) ) {
    $sql .= " AND node.name NOT LIKE 'Category %'";
}

my $sth = $dbh->prepare( $sql ) or die $dbh->errstr;
$sth->execute( $length ) or die $sth->errstr;
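
Because the query joins the metadata table twice, a page with several locales and several categories comes back as one row per combination. The following is a minimal sketch of one way to collapse those rows into a single entry per page. It is not the script's actual code: the sample rows, the page names, and the group_rows helper are all hypothetical, and the rows are simply assumed to be in the array-of-arrayrefs shape that DBI's fetchall_arrayref returns.

```perl
use strict;
use warnings;

# Hypothetical rows shaped as ( name, locale, category ). An undef stands
# for a NULL produced by a LEFT JOIN that found no matching metadata.
my @rows = (
    [ "Red Lion",  "Soho", "Pubs" ],
    [ "Red Lion",  "Soho", "Real Ale" ],
    [ "Red Lion",  "W1",   "Pubs" ],
    [ "Red Lion",  "W1",   "Real Ale" ],
    [ "Blue Cafe", undef,  undef ],
);

# Collapse the rows into one entry per page name, recording each distinct
# locale and category once. (Hypothetical helper, for illustration only.)
sub group_rows {
    my @rows = @_;
    my %stubs;
    for my $row (@rows) {
        my ( $name, $locale, $category ) = @$row;
        $stubs{$name} ||= { locales => {}, categories => {} };
        $stubs{$name}{locales}{$locale}      = 1 if defined $locale;
        $stubs{$name}{categories}{$category} = 1 if defined $category;
    }
    return %stubs;
}

my %stubs = group_rows(@rows);
for my $name ( sort keys %stubs ) {
    my $locales    = join ", ", sort keys %{ $stubs{$name}{locales} };
    my $categories = join ", ", sort keys %{ $stubs{$name}{categories} };
    print "$name: locales [$locales]; categories [$categories]\n";
}
```

Keying the inner hashes on the locale and category values is what removes the duplicates; a page with no metadata at all still gets an entry, just with empty lists.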

Conversely, when writing a little widget to find the nearest Tube (subway) station to a given place, I made use of a couple of Wiki::Toolkit methods, first to find everything within a kilometre of the place, and second to get a list of all Tube stations; the simple intersection of these arrays gives me my answer:

my $config_file = $ENV{OPENGUIDES_CONFIG_FILE}
                  || "../wiki.conf";
my $config = OpenGuides::Config->new( file => $config_file );

my $guide = OpenGuides->new( config => $config );
my $wiki = $guide->wiki;

my $locator = Wiki::Toolkit::Plugin::Locator::Grid->new(
    x => "os_x", y => "os_y" );
$wiki->register_plugin( plugin => $locator );

[...]

my @nearby = $locator->find_within_distance(node => $origin,
                                            metres => 1000 );
my @stations = $wiki->list_nodes_by_metadata(
    metadata_type  => "category",
    metadata_value => "tube",
    ignore_case    => 1,
);
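
The excerpt stops before the intersection step, so here is a minimal sketch of how two such lists might be intersected. The page names are made-up sample data standing in for the real results of find_within_distance and list_nodes_by_metadata, and the intersection helper is a hypothetical name, not part of Wiki::Toolkit.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two method results above.
my @nearby   = ( "Euston Station", "The Doric Arch", "Warren Street Station" );
my @stations = ( "Angel Station", "Euston Station", "Warren Street Station" );

# Return the elements of the first list that also appear in the second,
# keeping the order of the first list. (Hypothetical helper.)
sub intersection {
    my ( $first, $second ) = @_;
    my %in_second = map { $_ => 1 } @$second;
    return grep { $in_second{$_} } @$first;
}

my @nearby_stations = intersection( \@nearby, \@stations );
print "$_\n" for @nearby_stations;
```

Note that this yields every station within the search radius; picking the single nearest one would need a further step that compares distances.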

The Data Determines the Structure

While OpenGuides does impose some structure, we deliberately left the choice of categories, locales, and house style to contributors to each guide. Although this makes it slightly harder to automatically transfer data between guides, it's more important to make it easier for local people to create the kind of guide that they find most useful. For example, the Randomness Guide uses postal districts (W1, WC2, SE1, etc.) in addition to named locales such as Hammersmith or Marylebone, because Londoners are used to navigating by postcode. The two Oxford guides have also adapted to their specific locality, and created a number of locales which are restricted to a single street; Oxford is so small that this makes a lot of sense.

Freedom of category choice also makes it easier to write custom search scripts; one of the more popular searches on the Randomness Guide is the pub search, which takes advantage of a number of categories that the guide's users have come up with—Real Ale, Real Cider, Food Served Lunchtimes, Food Served Evenings, Free Wireless, and so on.
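
As a rough illustration of how a category-based search of this kind can work, the sketch below picks out the pages that carry every one of a set of requested categories. It is not the Randomness Guide's pub search: the page names, the %categories_for data, and the pages_with_all helper are all hypothetical, and a real script would draw the categories from the guide's database rather than a hard-coded hash.

```perl
use strict;
use warnings;

# Hypothetical pages and the categories each has been placed in.
my %categories_for = (
    "The Harp"      => [ "Pubs", "Real Ale", "Real Cider" ],
    "The Old Crown" => [ "Pubs", "Real Ale", "Food Served Lunchtimes" ],
    "Cafe Example"  => [ "Cafes", "Free Wireless" ],
);

# Return the names of pages that carry every wanted category, comparing
# category names case-insensitively. (Hypothetical helper.)
sub pages_with_all {
    my ( $pages, @wanted ) = @_;
    my @matches;
    for my $name ( sort keys %$pages ) {
        my %has = map { lc($_) => 1 } @{ $pages->{$name} };
        my $missing = grep { !$has{ lc $_ } } @wanted;   # count of absent categories
        push @matches, $name unless $missing;
    }
    return @matches;
}

my @pubs = pages_with_all( \%categories_for, "Real Ale", "Food Served Lunchtimes" );
print "$_\n" for @pubs;
```

Since the search only tests for category names, contributors can invent a new category and have it become searchable without any change to the matching logic.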

One serendipitous outcome of the category system stemmed from our use of the Google Maps API, which allows us to plot search results on a map. We created a category for each of the Transport for London Travelcard zones (I used the Wikimedia Commons Tube station data and a WWW::Mechanize script for the Tube stations, though the rail stations had to be done by hand), which gave us, with no additional work, some strangely fascinating maps of the extent of each zone: 1, 2, 3, 4, 5, and 6 (best viewed in tabs).

How to Get Involved

There are many ways to get involved in OpenGuides. First, take a look at openguides.org to see if there's a guide local to you. If there is, take a look and see if you can improve or add to any of the information on it! If there's no guide covering your area, you might even be interested in setting one up yourself—if this is the case, the best way to start is by joining the openguides-dev mailing list, and posting there about your interest. People on the list have years of experience of setting up and maintaining an Open Guide, and will be happy to guide you through the technical and social issues involved.

Finally, if you're interested in getting involved in the programming side of things, you can download the OpenGuides and Wiki::Toolkit releases from CPAN, or take a look at our Trac install, and browse or check out our Subversion repository for the latest code. Then, why not come along and meet us at one of our hackfests? The next one will be held in London on Saturday and Sunday, July 21 and 22, and we'll also be travelling to Vienna in August to hold a short three-hour hackathon as part of YAPC::Europe.

You don't need to be an expert to work on OpenGuides, whether as a content contributor or a programmer; we have people of all levels of expertise involved in the project, and there's certainly no shortage of things to do! Our friendly community is full of helpful people, and it's growing all the time—most appropriate for a project that started off with a conversation between a couple of friends in a pub.

Kake Pugh is a freelance academic copy editor who writes Perl in her spare time. She likes test-first development, databases, and documentation.
