postal codes campaign - next steps?


postal codes campaign - next steps?

Daniel Haran
I just had a chance to re-read the Postal Codes page on the wiki*.
Michael, you've done some great work; the letter seems very compelling
now!

http://civicaccess.ca/wiki/PostalCodes

What tangible forms of support are we asking for from political parties
(and the others listed)?

What organization, if any, should spearhead this?

-Daniel.

* That was prompted by makepovertyhistory's latest action alert, which
could have been made more effective by personalizing with MP names and
phone numbers. You can see a copy of it here:
http://www.makepovertyhistory.ca/e/take-action/e-alerts/2007-03-05.html



Re: postal codes campaign - next steps?

Russell McOrmond-2


Off-topic, but might be relevant to some other later project...

Daniel Haran wrote:
> * That was prompted by makepovertyhistory's latest action alert, which
> could have been made more effective by personalizing with MP names and
> phone numbers. You can see a copy of it here:
> http://www.makepovertyhistory.ca/e/take-action/e-alerts/2007-03-05.html

   Sometimes it isn't a CivicAccess issue, but other timing/funding
issues that cause these things.  There are other reasons why that
specific e-alert pointed to parl.gc.ca rather than using the already
purchased postal-code --> EDID database.


   There are 308 MPs (plus the Minister for Public Works), each with
their own contact information, and I don't know of any group that is
maintaining a table of information about them.   While we at MPH keep
the names and email addresses in our database up to date, we didn't keep
the phone numbers or constituency office details updated, and those were
among the things we wanted people to look up.
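
   To illustrate the kind of lookup such a table would allow, here is a
minimal sketch in Python. It assumes two tables already exist: a
postal-code --> EDID mapping and a table of MP details keyed by EDID. The
function names, field names and every value in the sample data are
placeholders invented for the example, not real records.

```python
def normalize_postal_code(raw):
    """Uppercase and drop spaces/hyphens so 'a1a 1a1' and 'A1A-1A1' compare equal."""
    return "".join(ch for ch in raw.upper() if ch.isalnum())


def find_mp(postal_code, postal_to_edid, mps_by_edid):
    """Return the MP record for a postal code, or None if either lookup fails."""
    edid = postal_to_edid.get(normalize_postal_code(postal_code))
    if edid is None:
        return None
    return mps_by_edid.get(edid)


# Placeholder data: neither the mapping nor the contact details are real.
POSTAL_TO_EDID = {"A1A1A1": 99001}
MPS_BY_EDID = {
    99001: {
        "name": "Placeholder MP",
        "email": "mp@example.org",
        "constituency_phone": "555-0100",
    }
}

if __name__ == "__main__":
    print(find_mp("a1a 1a1", POSTAL_TO_EDID, MPS_BY_EDID))
```

   With both tables kept current, an e-alert could be personalized with
the MP's name and phone number instead of sending people to look them up.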


   Does anyone know of a structured wiki, something that would allow a
group to collaboratively maintain a table of information?

--
  Russell McOrmond, Internet Consultant: <http://www.flora.ca/>
  Please help us tell the Canadian Parliament to protect our property
  rights as owners of Information Technology. Sign the petition!
  http://www.digital-copyright.ca/petition/ict/

  "The government, lobbied by legacy copyright holders and hardware
   manufacturers, can pry my camcorder, computer, home theatre, or
   portable media player from my cold dead hands!"



Re: postal codes campaign - next steps?

Daniel Haran
Hello,

I made some progress trying to get some information from parl.gc.ca.
The following may only be of interest to techies...

After extracting the list of MP codes from
http://webinfo.parl.gc.ca/MembersOfParliament/MainMPsCompleteList.aspx?TimePeriod=Current&Language=E

I tried getting more information from individual MP pages. Two
scraping kits later, after REXML choked on XPath queries and various
other tech horrors (take a look at the source... __VIEWSTATE weighs in
at 8k, even tidy can't parse it, etc), I decided to resort to Dapper.
E.g.:

http://webinfo.parl.gc.ca/MembersOfParliament/ProfileMP.aspx?Key=78902&Language=E
=>
http://www.dapper.net/RunDapp?dappName=CanadianMPdetails&v=1&variableArg_0=78902

I'll use some regexps to clean up what I couldn't get Dapper to
extract, and publish the whole as a db and a RESTful web service so no
one else ever need go through this.
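
For the curious, the first step (pulling the MP codes out of the list
page and building each profile URL) can be sketched roughly as below.
This is an illustration in Python, not the actual scraper. It assumes
the list page links to each profile as `ProfileMP.aspx?Key=<number>`, as
in the example URL above; the sample HTML and the function names are
made up.

```python
import re
from urllib.parse import urlencode

PROFILE_BASE = "http://webinfo.parl.gc.ca/MembersOfParliament/ProfileMP.aspx"

# Assumption: every profile link on the list page carries a numeric Key parameter.
KEY_PATTERN = re.compile(r"ProfileMP\.aspx\?Key=(\d+)")


def extract_mp_keys(list_page_html):
    """Return the unique MP keys found in the HTML, in order of first appearance."""
    seen = set()
    keys = []
    for key in KEY_PATTERN.findall(list_page_html):
        if key not in seen:
            seen.add(key)
            keys.append(key)
    return keys


def profile_url(key, language="E"):
    """Build the profile page URL for one MP key."""
    return PROFILE_BASE + "?" + urlencode({"Key": key, "Language": language})


# Hypothetical fragment standing in for the downloaded list page.
SAMPLE_HTML = """
<a href="ProfileMP.aspx?Key=78902&amp;Language=E">First MP</a>
<a href="ProfileMP.aspx?Key=78365&amp;Language=E">Second MP</a>
<a href="ProfileMP.aspx?Key=78902&amp;Language=E">First MP (repeated link)</a>
"""

if __name__ == "__main__":
    for key in extract_mp_keys(SAMPLE_HTML):
        print(profile_url(key))
```

Matching the links with a regexp sidesteps parsing the whole page, which
matters when the markup is too broken for an XML parser to accept.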

Let me know if this is useful and/or if anything is missing.

-Daniel.


--
Change the world one loan at a time - visit Kiva.org to find out how



Re: postal codes campaign - next steps?

Michael Lenczner
In case the non-computer-geeks are feeling left out - here's a mini translation.

In the email below - "scraping":
"Screen scraping is a technique in which a computer program extracts
text data from the display output of another program.... The program
doing the scraping is called a screen scraper...   There are a number
of synonyms for screen scraping, including: Data scraping, data
extraction, web scraping, page scraping, web page wrapping and HTML
scraping (the last four being specific to scraping web pages)."

It basically means writing a program that automatically fetches web
pages, extracts their information, and puts it in a database, so that
you can do more useful stuff with it.

HowdtheyVote does that with the parliamentary Hansards.

Daniel also referred to Dappit / Dapper.  It's a web service that
tries to make it easier to make screen scrapers.

Regexp means "regular expressions".  It's geek talk for patterns used
to search through text and pick out certain things.
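
For those who want to see one, here is a small sketch in Python of a
regular expression that picks phone numbers out of a piece of text. The
pattern and the sample text are only assumptions for illustration; the
real pages may write their numbers differently.

```python
import re

# Matches North American numbers written like (613) 555-0100 or 613-555-0100.
PHONE_PATTERN = re.compile(r"\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-](\d{4})")


def find_phone_numbers(text):
    """Return every phone number in the text, normalized to 613-555-0100 style."""
    return ["-".join(match.groups()) for match in PHONE_PATTERN.finditer(text)]


# Made-up text; 555-01xx numbers are reserved for fictional use.
SAMPLE_TEXT = "Hill office: (613) 555-0100. Constituency office: 403-555-0123."

if __name__ == "__main__":
    print(find_phone_numbers(SAMPLE_TEXT))
```

The odd-looking string in the middle is the pattern: three digits, three
digits, four digits, with optional brackets and separators in between.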

So Daniel is talking about downloading the webpages from Parliament,
and searching through them for specific information which he then
stores in a database.

This info is all on the Tech page on our wiki:
 http://civicaccess.ca/wiki/Tech

There's some other stuff that he's talking about - but it's a bit out
of my area.  And it's not absolutely necessary for everyone to
understand all of it - as long as we get the gist of using tools like
scrapers to collect / liberate civic info.

Thanks to Daniel for sharing this.  Non-techies - please don't be
scared off.  We need your experience + expertise if we're ever going
to get anywhere with this stuff.



On 3/8/07, Daniel Haran <[hidden email]> wrote:

> Hello,
>
> I made some progress trying to get some information from parl.gc.ca.
> The following may only be of interest to techies...
>
> After extracting the list of MP codes from
> http://webinfo.parl.gc.ca/MembersOfParliament/MainMPsCompleteList.aspx?TimePeriod=Current&Language=E
>
> I tried getting more information from individual MP pages. Two
> scraping kits later, after REXML choked on XPath queries and various
> other tech horrors (take a look at the source... __VIEWSTATE weighs in
> at 8k, even tidy can't parse it, etc), I decided to resort to Dapper.
> E.g.:
>
> http://webinfo.parl.gc.ca/MembersOfParliament/ProfileMP.aspx?Key=78902&Language=E
> =>
> http://www.dapper.net/RunDapp?dappName=CanadianMPdetails&v=1&variableArg_0=78902
>
> I'll use some regexps to clean up what I couldn't get Dapper to
> extract, and publish the whole as a db and a RESTful web service so no
> one else ever need go through this.
>
> Let me know if this is useful and/or if anything is missing.
>
> -Daniel.



Re: postal codes campaign - next steps?

Daniel Haran
Hi all,

Thanks to Michael for translating what I did :)

Here's the initial result:
http://lokobo.com:3000/mps

Could Russell and others let me know if there is anything obviously
missing (Party affiliations aren't being recognized and the addresses
are a mess)? In what format should I make the database available? Is
anyone going to use this?

The main objective I had was to get constituency phone numbers for
each MP, since that could have been useful for Make Poverty History.
I've no affiliation - it just bugs me to see an advocacy group not
being quite as effective as it could be because information is
disorganized.

And boy, is it EVER disorganized. One MP, Deepak Obhrai, doesn't even
have his constituency phone listed on the government website:
http://webinfo.parl.gc.ca/MembersOfParliament/ProfileMP.aspx?Key=78365&Language=E
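
Gaps like that are easy to list once the scraped records sit in a
database. As a rough sketch (Python with the standard sqlite3 module and
an in-memory database; the table layout, column names and placeholder
rows are assumptions for illustration, not the schema the scraper
actually uses):

```python
import sqlite3


def create_mp_table(conn):
    """Create a simple table keyed by Electoral District ID."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mps ("
        " edid INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " constituency_phone TEXT)"  # stays NULL when the page lists no number
    )


def save_mp(conn, edid, name, constituency_phone=None):
    """Insert one MP record, replacing any earlier record for the same EDID."""
    conn.execute(
        "INSERT OR REPLACE INTO mps (edid, name, constituency_phone) VALUES (?, ?, ?)",
        (edid, name, constituency_phone),
    )
    conn.commit()


def mps_missing_phone(conn):
    """List the names of MPs whose constituency phone was not found."""
    rows = conn.execute(
        "SELECT name FROM mps WHERE constituency_phone IS NULL ORDER BY name"
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_mp_table(conn)
    # Placeholder rows, not real MPs or phone numbers.
    save_mp(conn, 99001, "Placeholder MP A", "555-0100")
    save_mp(conn, 99002, "Placeholder MP B")
    print(mps_missing_phone(conn))
```

Re-running the scrape and saving again overwrites the old row, so the
list of missing numbers shrinks as the source pages get fixed.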

I have a small number of technical issues I'd like to resolve. However,
the software that did the scraping and runs the website and web
service is in the public domain on RubyForge:
http://rubyforge.org/projects/mp-ca-scraper/

(stats aren't updated on the main page, but the source IS there:)
svn checkout svn://rubyforge.org/var/svn/mp-ca-scraper/trunk scraper

Web pages are available for human and computer consumption:
http://lokobo.com:3000/mps
http://lokobo.com:3000/mps.xml

MP pages are indexed by EDID - Electoral District ID:
http://lokobo.com:3000/mps/35049
http://lokobo.com:3000/mps/35049.xml
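
For anyone wanting to consume the XML from a program, here is a rough
sketch in Python. The element names (`mp`, `name`, `constituency-phone`)
are guesses made for the example, so check the real output before
relying on them; the sample document is made up as well.

```python
import xml.etree.ElementTree as ET

BASE_URL = "http://lokobo.com:3000/mps"


def mp_xml_url(edid):
    """URL of the XML record for one electoral district."""
    return f"{BASE_URL}/{edid}.xml"


def parse_mp(xml_text):
    """Turn one MP record into a dict; missing or empty fields become None."""
    root = ET.fromstring(xml_text)
    record = {}
    for field in ("name", "constituency-phone"):
        text = root.findtext(field)
        record[field] = text.strip() if text and text.strip() else None
    return record


# Hypothetical response body; the real service may use different element names.
SAMPLE_XML = """<mp>
  <name>Placeholder MP</name>
  <constituency-phone></constituency-phone>
</mp>"""

if __name__ == "__main__":
    print(mp_xml_url(35049))
    print(parse_mp(SAMPLE_XML))
```

Treating empty elements as None makes records with no listed phone easy
to spot downstream.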

Cheers,

Daniel.

On 3/8/07, Michael Lenczner <[hidden email]> wrote:

> In case the non computer geeks are feeling left out - here's a mini translation.
>
> In the email below - "scraping":
> "Screen scraping is a technique in which a computer program extracts
> text data from the display output of another program.... The program
> doing the scraping is called a screen scraper...   There are a number
> of synonyms for screen scraping, including: Data scraping, data
> extraction, web scraping, page scraping, web page wrapping and HTML
> scraping (the last four being specific to scraping web pages)."
>