The code behind OpenAustralia


How to Install

This document explains how to install the OpenAustralia parser. It assumes you’ve already installed the web application. The parser generates XML files, which get loaded into the database of the web application. If all you need is the XML files and you aren’t interested in installing and configuring the parser, you can just download the XML files from data.openaustralia.org.

Requirements

Install the dependencies

Mac OS X Leopard

Install MacPorts (formerly DarwinPorts) and then install ImageMagick and ghostscript:

$ sudo port install ImageMagick
$ sudo port install ghostscript

Note: the previous step takes a long while to complete, so make yourself a coffee (or two).

Install the required rubygems:

$ sudo gem install -y mechanize -v 0.9.2
$ sudo gem install -y builder rmagick rcov htmlentities rspec activesupport log4r hpricot

Note: Currently OpenAustralia requires an older version of mechanize (0.9.2), but this might change in the future.

Ubuntu 10.04

Use apt-get to install the requirements:

$ sudo apt-get install imagemagick libmagick9-dev ghostscript ruby rubygems ruby1.8-dev libxslt1-dev

Install the required rubygems:

$ sudo gem install mechanize -v 0.9.2
$ sudo gem install activesupport -v 2.3.7
$ sudo gem install builder rmagick rcov htmlentities rspec log4r hpricot json

Note: Currently OpenAustralia requires an older version of mechanize (0.9.2), but this might change in the future.
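
Before moving on, it can help to confirm that the tools installed above are actually on your PATH. The following is only a sketch: missing_tools is a hypothetical helper, not part of the parser, and the command names (convert for ImageMagick, gs for ghostscript, plus ruby and gem) are the ones these packages normally provide, so adjust the list if your system names them differently.

```shell
# Hypothetical helper: print the name of every listed command
# that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# convert (ImageMagick), gs (ghostscript), ruby and gem are assumed to be
# the command names the packages above provide on your system.
missing=$(missing_tools convert gs ruby gem)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All required tools found."
fi
```

This only checks that the commands exist. It does not check gem versions, so the mechanize 0.9.2 requirement still needs to be verified separately.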

For Windows

Ruby has its own Windows builds, which you need to get from RubyInstaller for Windows (choose the one-click installer option).

In addition to the Ruby gems required above you’ll need to install Ruby-MySQL, which can be downloaded from http://www.tmtm.org/en/ruby/mysql/.

Connecting to MySQL 5.0+ from Ruby fails because of a change in password hash length, so you will have to recompile with the include path set to the new MySQL C library.

Configure the Parser

Copy the example configuration:

$ cd openaustralia-parser
$ cp configuration.yml.example configuration.yml

The only configuration necessary is to change the web root if you have installed the web application in another location. The setting is web_root in the file you just copied, configuration.yml.
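
If you would rather script this change than edit the file by hand, something along these lines might work. It is a sketch under one big assumption: that the setting sits on a single line of the form "web_root: /some/path". Check your copy of configuration.yml first. set_web_root is a hypothetical helper, not something shipped with the parser.

```shell
# Hypothetical helper: rewrite the web_root line of a YAML file in place.
# Assumes the setting is on one line of the form "web_root: /some/path".
# The new path must not contain the characters | & or \ because they are
# special in the sed expression below.
set_web_root() {
    config_file=$1
    new_root=$2
    tmp_file=$(mktemp) || return 1
    # Replace everything after "web_root:" and copy all other lines unchanged.
    sed "s|^\([[:space:]]*web_root:\).*|\1 ${new_root}|" "$config_file" > "$tmp_file" &&
        cat "$tmp_file" > "$config_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}

# Usage, from the openaustralia-parser directory (the path is a placeholder):
# set_web_root configuration.yml /path/to/your/web/application
```

Going through a temporary file avoids sed -i, whose syntax differs between Mac OS X and Ubuntu.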

Run the Parser

Before you can run the parser, you will need to create the directories that will hold the images of the MPs.

$ mkdir -p pwdata/images/mps pwdata/images/mpsL

You are now ready to create the members’ information. Run:

$ ./parse-members.rb
# you should see messages on the console similar to the following
Reading members data...
Running consistency checks...
Writing XML...
Replacing existing member with new data for 5
This is for your information only, just check it looks OK.

$VAR1 = [
          '5',
          '10006',
          1,
          '',
          'Albert',
          'Adermann',
          'Fisher',
          'National Party',
          '1972-12-02',
          '1984-12-01',
          'general_election',
          'elected_elsewhere'
        ];
[...]

To download the members’ images (this will take a while):

$ ./member-images.rb 

Optionally (it is not particularly important initially), you can also download the links information, which goes on the Representatives’ and Senators’ pages, by running:

$ ./parse-member-links.rb
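
The three member scripts above can also be chained so that a failure in one stops the rest. This is a rough sketch: run_steps is a hypothetical helper, and it assumes each script exits with a non-zero status when it fails, which has not been checked against the parser itself.

```shell
# Hypothetical helper: run each command in turn and stop at the first failure.
# Assumes a failing script reports it through a non-zero exit status.
run_steps() {
    for step in "$@"; do
        echo "Running $step"
        "$step" || { echo "$step failed" >&2; return 1; }
    done
}

# Usage, from the openaustralia-parser directory:
# run_steps ./parse-members.rb ./member-images.rb ./parse-member-links.rb
```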

To download the Hansard data (the speeches) for one day, say 20 September 2007, and load it into the database:

$ ./parse-speeches.rb 2007.09.20
parse-speeche: 100% |oooooooooooooooooooooooooooooooooooooooooo| Time: 00:01:27
db loading  2007-09-20
db loading  2007-09-20
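
To load several days, one option is to call the parser once per date, in the same YYYY.MM.DD form as the example above. The sketch below assumes GNU date (as shipped with Ubuntu; the date on Mac OS X does not accept -d), and dates_between is a hypothetical helper. How the parser behaves on days when parliament was not sitting is not covered here, so try it on a short range first.

```shell
# Hypothetical helper: print every date from START to END inclusive,
# one per line, in the YYYY.MM.DD form used by parse-speeches.rb above.
# Arguments are given as YYYY-MM-DD. Requires GNU date.
dates_between() {
    current=$1
    last=$2
    # Compare the two dates as integers in YYYYMMDD form.
    while [ "$(date -u -d "$current" +%Y%m%d)" -le "$(date -u -d "$last" +%Y%m%d)" ]; do
        date -u -d "$current" +%Y.%m.%d
        current=$(date -u -d "$current + 1 day" +%Y-%m-%d)
    done
}

# Usage, from the openaustralia-parser directory:
# for day in $(dates_between 2007-09-17 2007-09-20); do
#     ./parse-speeches.rb "$day"
# done
```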

You should now be able to view the results at your webserver URL, dev.openaustralia.org, where you should see a version of openaustralia.org populated with data.

Congratulations, you’ve got a mostly complete running version of OpenAustralia! Give yourself a big pat on the back.
