Posted by Daniel Haran on Mar 11, 2007; 2:31pm
URL: http://civicaccess.416.s1.nabble.com/postal-codes-campaign-next-steps-tp932p941.html
Hi all,
Thanks to Michael for translating what I did :)
Here's the initial result:
http://lokobo.com:3000/mps
Could Russell and others let me know if there is anything obviously
missing (Party affiliations aren't being recognized and the addresses
are a mess)? In what format should I make the database available? Is
anyone going to use this?
The main objective I had was to get constituency phone numbers for
each MP, since that could have been useful for Make Poverty History.
I've no affiliation - it just bugs me to see an advocacy group not
being quite as effective as they could because information is
disorganized.
And boy, is it EVER disorganized. One MP, Deepak Obhrai, doesn't even
have his constituency phone listed on the government website:
http://webinfo.parl.gc.ca/MembersOfParliament/ProfileMP.aspx?Key=78365&Language=E
I have a small number of technical issues I'd like to resolve. However,
the software that did the scraping and runs the website and web
service is in the public domain on rubyforge:
http://rubyforge.org/projects/mp-ca-scraper/
(stats aren't updated on the main page, but the source IS there:)
svn checkout svn://rubyforge.org/var/svn/mp-ca-scraper/trunk scraper
Web pages are available for human and computer consumption:
http://lokobo.com:3000/mps
http://lokobo.com:3000/mps.xml
MP pages are indexed by EDID - Electoral District ID:
http://lokobo.com:3000/mps/35049
http://lokobo.com:3000/mps/35049.xml
Cheers,
Daniel.
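P.S. For anyone who wants to read the XML pages from a script, here is a minimal sketch in Python. It only relies on the URL pattern above (the list at /mps, one MP at /mps/<EDID>, with .xml appended for the machine-readable version). The function names mp_url, flatten and fetch_mp are made up for illustration, and nothing is assumed about the XML element names: flatten simply keeps whatever direct child elements come back. The server may also not be reachable any more, so treat fetch_mp as untested against the live service.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Base address taken from the post; the service may no longer be running.
BASE_URL = "http://lokobo.com:3000/mps"


def mp_url(edid=None, xml=False):
    """Build the URL for the full list (edid=None) or for one MP by EDID."""
    url = BASE_URL if edid is None else f"{BASE_URL}/{edid}"
    return url + ".xml" if xml else url


def flatten(xml_text):
    """Map each direct child tag of the root element to its stripped text.

    No schema is assumed: whatever child elements the service returns
    are kept under their own tag names.
    """
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


def fetch_mp(edid, timeout=10):
    """Download one MP's XML record and flatten it (needs network access)."""
    with urllib.request.urlopen(mp_url(edid, xml=True), timeout=timeout) as resp:
        return flatten(resp.read())


if __name__ == "__main__":
    # Prints the XML address for the example EDID used in the post.
    print(mp_url(35049, xml=True))
```

Calling fetch_mp(35049) would then give a plain dictionary for that riding, which is easy to dump to CSV or load into whatever database format people end up wanting.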
On 3/8/07, Michael Lenczner <[hidden email]> wrote:
> In case the non-computer geeks are feeling left out - here's a mini translation.
>
> In the email below - "scraping":
> "Screen scraping is a technique in which a computer program extracts
> text data from the display output of another program.... The program
> doing the scraping is called a screen scraper... There are a number
> of synonyms for screen scraping, including: Data scraping, data
> extraction, web scraping, page scraping, web page wrapping and HTML
> scraping (the last four being specific to scraping web pages)."
>