MPs by postal code


MPs by postal code

Daniel Haran
All right, since a few people said they would actually use this, I
decided to try extracting the list of MPs by postal code.

Tools like the lobby module in Drupal try to extract the information
from the page - an arduous task since the web page is a *mess*.
Already having the list of MPs, I only needed to extract the email
address.

If I have time this week-end, it will go inside the mp-scraper, with a
REST interface for anyone to use. Here's the code for you geeks:

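require 'rubygems'
require 'hpricot'
require 'open-uri'

# Strip spaces so 'H1T 4C6' and 'H1T4C6' produce the same query.
#postal_code = 'A1A 1A1'.gsub(/ /, '')
postal_code = 'H1T4C6'.gsub(/ /, '')

# Fetch and parse the lookup page (on Ruby 2.5 and later, call URI.open instead of open).
doc = Hpricot(open('http://www.parl.gc.ca/information/about/people/house/PostalCode.asp?Language=E&txtPostalCode='+postal_code))

# For every <h4> that mentions "Parliament", take the element that follows it
# and pull out the parl.gc.ca address. The trailing dot keeps the block
# attached to collect across the line break.
emails = (doc/"h4").select {|e| e.innerHTML =~ /Parliament/}.
  collect {|e| e.next_sibling.innerHTML.match(/ (.*@parl\.gc\.ca)/)[1]}

# Look up the MP records already in the database (Rails dynamic finder).
mps = Mp.find_all_by_email(emails)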

Sample output on H1T4C6:
>>emails
=> ["[hidden email]", "[hidden email]",
"[hidden email]", "[hidden email]",
"[hidden email]"]

Cheerio,

Daniel.
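
One fragile spot in the snippet above is `match(...)[1]`, which raises NoMethodError whenever a block contains no address. Here is a minimal sketch of a more defensive extraction using only the standard library. The `extract_emails` helper, the pattern for the local part of the address, and the sample HTML are all made up for illustration and do not reflect the real page's markup.

```ruby
# Hypothetical helper: pull every parl.gc.ca address out of a chunk of text.
# The character class for the local part is an assumption.
PARL_EMAIL = /[A-Za-z0-9._-]+@parl\.gc\.ca/

def extract_emails(html)
  # scan returns [] when nothing matches, so there is no nil to trip over
  html.scan(PARL_EMAIL).uniq
end

# Invented fragment standing in for the lookup page
sample = <<~HTML
  <h4>Member of Parliament</h4>
  <p>Email: Doe.J@parl.gc.ca</p>
  <h4>Constituency office</h4>
  <p>No address listed</p>
  <p>Email: Doe.J@parl.gc.ca, Roe.A@parl.gc.ca</p>
HTML

emails = extract_emails(sample)
puts emails.inspect
```

Scanning the whole page is cruder than walking the h4 elements, but it keeps working when one block has no address.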

--
Change the world one loan at a time - visit Kiva.org to find out how



Re: MPs by postal code

Daniel Haran
MP search by postal code is now live:

http://lokobo.com:3000/

e.g.:
http://lokobo.com:3000/mps;search?postal_code=A1A1A1
http://lokobo.com:3000/mps;search?postal_code=H1T4C6

Results are 'cached': a copy of each search result is saved in a database
so the information does not have to be retrieved again. Cached searches are
about 100 times faster, so the more people use this service, the
better it is for everyone :)
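
For anyone curious how this kind of caching works, here is a rough sketch under stated assumptions, not the code running on the site. A Hash stands in for the database table, and `CachedLookup` and the stub fetcher are hypothetical names introduced here.

```ruby
# Hypothetical cache-aside lookup. A Hash stands in for the database table,
# and the fetcher is any callable that performs the slow remote lookup.
class CachedLookup
  attr_reader :remote_calls

  def initialize(fetcher, store = {})
    @fetcher = fetcher
    @store = store
    @remote_calls = 0
  end

  # Normalize so 'h1t 4c6' and 'H1T4C6' share one cache entry
  def normalize(postal_code)
    postal_code.gsub(/\s+/, '').upcase
  end

  def find(postal_code)
    key = normalize(postal_code)
    return @store[key] if @store.key?(key) # cache hit: no remote request

    @remote_calls += 1
    @store[key] = @fetcher.call(key)       # cache miss: fetch once, keep a copy
  end
end

# Stub fetcher returning a made-up address, in place of the real scrape
fetcher = ->(code) { ["mp-for-#{code}@example.org"] }
lookup = CachedLookup.new(fetcher)

lookup.find('H1T 4C6')
lookup.find('h1t4c6')
puts lookup.remote_calls
```

Both calls resolve to the same key, so only the first one reaches the fetcher.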

Please let me know if you use it and/or encounter any problems. Thanks!

-Daniel.



Re: MPs by postal code

David Akin
Fabulous work, Daniel!

Not that there are a lot of political reporters who are handy with a
database, but for those of us that are, this should be a helpful
little dataset for all sorts of projects as we approach a possible
federal election.

Thanks!




--
David Akin
-------------------
http://www.davidakin.com



Re: MPs by postal code

Hugh McGuire
Daniel: superb. Is the database open (& legal!) for all to use? ...

David: the idea (I hope) is that other citizen groups etc can build
on that database to make more user-friendly tools. 1st step is
getting the data available in an open format; 2nd step is using it in
creative & useful ways.








Re: MPs by postal code

Daniel Haran
On 3/24/07, Hugh McGuire <[hidden email]> wrote:
> Daniel: superb. Is the database open (& legal!) for all to use? ...

Thanks! As far as I'm concerned, yes. As for the government, I hope so.

:)

The list of MPs clearly ought to be in the public domain. I'm only
caching results from the postal code lookups, and not attempting to
build a complete database, so I believe I'm in the clear for that too.

Daniel.



Re: MPs by postal code

Hugh McGuire
> I'm only
> caching results from the postal code lookups, and not attempting to
> build a complete database, so I believe I'm in the clear for that too.
OK so the project to build an open, free database of ridings v postal  
codes is still desirable?







Re: MPs by postal code

Cory Horner
In reply to this post by Daniel Haran

>> Daniel: superb. Is the database open (& legal!) for all to use? ...
>>    
> Thanks! As far as I'm concerned, yes. As for the government, I hope so.
>
> :)
>
> The list of MPs clearly ought to be in the public domain. I'm only
> caching results from the postal code lookups, and not attempting to
> build a complete database, so I believe I'm in the clear for that too.
>  

It would be nice if we could use this data, but I don't think that this
is legal.  Given that the data costs thousands to acquire, scraping it
from their website would not be considered fair play.  Perhaps we should
contact the site and ask for a clarification on their use restrictions.

In actuality, we should be lobbying for the release of this data under
more accessible licensing terms from the people that make it.  This was
our original intent, no?

Cory.



Re: MPs by postal code

Robin Millette
On 3/24/07, Cory Horner <[hidden email]> wrote:

> It would be nice if we could use this data, but I don't think that this
> is legal.  Given that the data costs thousands to acquire, scraping it
> from their website would not be considered fair play.  Perhaps we should
> contact the site and ask for a clarification on their use restrictions.

I tend to think this is ok. The data isn't copyright itself, only its
disposition and grouping and layout, and we're not using that. I would
be comfortable using this new database and sharing it with anyone, I
would suggest making it public domain for now even and take some more
time to come up with a licence to use.

> In actuality, we should be lobbying for the release of this data under
> more accessible licensing terms from the people that make it.  This was
> our original intent, no?

I think Canadians can do both. If we can get it officially, then
great. It's also why we need to explore licencing issues a bit more.

--
Robin 'oqp' Millette : http://rym.waglo.com/
Bande-Passante : http://bande-passante.info/
SQIL 2007 : http://2007.sqil.info/



Re: MPs by postal code

Russell McOrmond-2
In reply to this post by Hugh McGuire
Hugh McGuire wrote:
>> I'm only
>> caching results from the postal code lookups, and not attempting to
>> build a complete database, so I believe I'm in the clear for that too.
> OK so the project to build an open, free database of ridings v postal  
> codes is still desirable?

   Yes, this is still needed.  Screen scraping is a very ugly kludge
that doesn't solve the underlying technical, legal or political problems.

   Elections Canada deliberately tries to break screen scraping, and has
randomly changed the method they use over recent years to kill screen
scraping tools (Drupal's Lobby module, the ECTOOLS tool that I used in
the past, etc).


http://sourceforge.net/projects/campaigntoolz/

   ECTools was a PHP system which used XML-RPC to split a caching server
that would screen scrape from a small client which would do a database
lookup.  The idea was to have many sites using the client, and one site
running the caching server.  This never really worked well as the screen
scraping kept breaking and it was very hard to keep it up-to-date.
http://campaigntoolz.cvs.sourceforge.net/campaigntoolz/ectools/



   The parl.gc.ca site won't work during the election, or at least it
was shut down in the past.  This would work if there was a secondary
database that converted the incumbent MP link into an Electoral District
that could then be used as the index against the current candidates
database.  Expect parl.gc.ca to follow elections.ca in shutting things
down if screen scraping becomes common.
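
The two-step lookup described above (incumbent MP to electoral district, then district to current candidates) might look roughly like this. It is only a sketch: the two tables, every record in them, and the `candidates_for_mps` helper are invented for illustration.

```ruby
# All names and records below are invented for illustration.
MP_TO_DISTRICT = {
  'mp.one@example.org' => 'District A',
  'mp.two@example.org' => 'District B'
}

CANDIDATES_BY_DISTRICT = {
  'District A' => ['Candidate 1', 'Candidate 2'],
  'District B' => ['Candidate 3']
}

# Step 1: incumbent MP -> electoral district (the "secondary database").
# Step 2: electoral district -> current candidates.
def candidates_for_mps(mp_emails)
  districts = mp_emails.map { |email| MP_TO_DISTRICT[email] }.compact.uniq
  districts.each_with_object({}) do |district, result|
    result[district] = CANDIDATES_BY_DISTRICT.fetch(district, [])
  end
end

puts candidates_for_mps(['mp.one@example.org', 'unknown@example.org']).inspect
```

Because the district is the join key, the candidate lookup no longer depends on the parl.gc.ca page being reachable once the MP to district table exists.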


   We need this information to be released directly with a clear open
license so that it can be shared and imported without the problems that
screen scraping has (IE: may work this moment, may be dead a minute from
now).

   While IANAL, I believe this screen scraping is a clear copyright
infringement, but one where the copyright holder is quite unlikely to
sue for infringement.  The bad politics of Elections Canada or the
Library of Parliament suing someone for screen scraping this data could
even be a win for us as the data is then made public legally.


   One-time screen scraping like the collection of the contact and other
information for MPs is different in that we can scrape, verify the data,
and publish the results as has been done.  We don't need to rely on the
parl.gc.ca site letting us in tomorrow as we already have the relevant
information today.

Note: During elections, Elections Canada releases a database of all
candidates, which is what makes sites that list candidates so easy to
build.  It is unfortunate that parl.gc.ca doesn't already do this for
sitting MPs (it would be more reliable than screen scraping), and the
postal code database still needs to be released.


Some interesting stuff from Elections Canada to be aware of, especially
if we are heading into an election (Possibly over the Clean Air Act
since the budget won't be an issue).


  Final List of Confirmed Candidates – 39th General Election  (This is
live updated during the election, and can be imported directly into a
database)
http://www.elections.ca/content.asp?section=pas&document=index&dir=39ge/loc&lang=e&textonly=false

   Here is a tool I wrote to allow people to browse this database (with
website and email contact details added by our community during the
election)
http://www.digital-copyright.ca/election2006/candidates


Official Voting Results of the 39th General Election – Poll-by-Poll
Results – Raw Data
http://www.elections.ca/scripts/resval/ovr_39ge.asp?prov=&lang=e

...

--
  Russell McOrmond, Internet Consultant: <http://www.flora.ca/>
  Please help us tell the Canadian Parliament to protect our property
  rights as owners of Information Technology. Sign the petition!
  http://www.digital-copyright.ca/petition/ict/

  "The government, lobbied by legacy copyright holders and hardware
   manufacturers, can pry my camcorder, computer, home theatre, or
   portable media player from my cold dead hands!"



Re: MPs by postal code

Tracey P. Lauriault-2
In reply to this post by Daniel Haran
I just tried it and it is wonderful!

The only issue I can think of is the look, feel and findability.
General or novice web users may not find this service and may not be
able to relate to the tool because of how it looks.  It is perfect for
most of us on this list but perhaps not for the general public.  I can
see this tool being very powerful particularly if there are elections
coming up.

I will send it around to some advocacy groups who may also find it useful.

Cheers
t





Re: MPs by postal code

Daniel Haran
Hi all,

Thanks all for the great feedback.

The site as is only demonstrates data access and redistribution under
various formats. My hope is that others will use it in their projects
and offer a foretaste of what civic benefits we could get from freeing
this data. Please get in touch with me if you need help integrating
this, or have any questions.

There are a few ways in which this data can be obtained legally:

-> Government finally stops duplicating its work and releases the data
in the public domain.

By far the best alternative, although it's going to take an organized
(and louder) political effort to get them to notice us.

-> Canada Post could allow derivative products.

Ages ago I inquired with Canada Post about using one of their data
products and was told they did not see a web service as
redistribution, which significantly changes the licensing cost. Would
someone check with them? A polygon file of all the postal codes would
be enough to generate the correspondence database since the MP
polygons are already public domain.

-> StatsCan allows a web service to operate under a single license.

Just as Canada Post didn't count a web service as redistribution,
maybe StatsCan will compromise too.
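
As a rough illustration of the polygon idea mentioned under the Canada Post option: given riding polygons and one representative point per postal code, a point-in-polygon test is enough to build the correspondence table. The sketch below uses the standard ray-casting test. The square "ridings", the centroids, and the helper names are all made up, and it assumes planar coordinates and a single point per postal code, which is a simplification since a real postal code can cover an area.

```ruby
# Ray-casting point-in-polygon test. Polygons are arrays of [x, y] vertices.
def point_in_polygon?(point, polygon)
  x, y = point
  inside = false
  j = polygon.length - 1
  polygon.each_with_index do |(xi, yi), i|
    xj, yj = polygon[j]
    # Count the edge if it straddles the horizontal line through the point
    # and crosses that line to the right of the point.
    if (yi > y) != (yj > y) &&
       x < (xj - xi) * (y - yi).to_f / (yj - yi) + xi
      inside = !inside
    end
    j = i
  end
  inside
end

# Invented square "ridings" and postal code centroids
RIDINGS = {
  'Riding West' => [[0, 0], [5, 0], [5, 5], [0, 5]],
  'Riding East' => [[5, 0], [10, 0], [10, 5], [5, 5]]
}
CENTROIDS = { 'A1A1A1' => [2, 3], 'B2B2B2' => [7, 1] }

# Name of the first riding whose polygon contains the point, or nil
def riding_for(point)
  match = RIDINGS.find { |_name, polygon| point_in_polygon?(point, polygon) }
  match && match.first
end

correspondence = CENTROIDS.transform_values { |pt| riding_for(pt) }
puts correspondence.inspect
```

A production version would need real geographic data and a rule for postal codes that straddle a boundary.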

---

My preference would be for government to wake up and smell the roses.
In the meantime, if enough groups were to rely on a web service,
perhaps we could obtain a common license as a stop-gap
measure.

While others pursue rigorously legal avenues, I'm happy to work with
code and reveal the grey areas. I believe both roles are needed at
this stage.

Daniel.




Re: MPs by postal code

Tracey P. Lauriault-2
I understand better! Thanks!
I sent it out to some advocacy groups this morning and we'll see what
they do and say!





Re: MPs by postal code

Russell McOrmond-2
In reply to this post by Robin Millette
Robin Millette wrote:
> I tend to think this is ok. The data isn't copyright itself, only
> its disposition and grouping and layout, and we're not using that.

   Please be careful here.  If you take source material that is under
copyright, and manipulate it such that it is no longer the same work
(Remix, etc), this doesn't mean that the new work is no longer a
copyright infringement.

   In Canada, "original" databases are protected under copyright, but
the Federal Court of Appeal has held that "non-original" databases are
not protected.  There was discussion as part of the Section 92 report
and the 2001 consultation about whether non-original databases should
also receive the copyright monopoly.

http://strategis.ic.gc.ca/epic/site/crp-prda.nsf/en/rp00872e.html#A1_4

   Database protection was not a big part of the process, given that
larger issues such as legal protection for TPMs and other digital issues
were seen as more critical.


   The lines aren't always black-and-white, although I think in this case
if there were a court case we would easily lose, as the cache and
the use of the screen scraping isn't even a remixing but a simple use of
the Crown Copyright database of postal-code to MPs.  IANAL, TINLA, but I
want people to be very careful and not blindly believe that screen
scraping and remixing avoids any copyright questions.

> I would be comfortable using this new database and sharing it with
> anyone, I would suggest making it public domain for now even and take
> some more time to come up with a licence to use.

   Dedicating things to the public domain will also make it easier to win
in the "court of public opinion", where applying a strong license (such
as a CopyLeft/ShareAlike) could backfire.





Re: MPs by postal code

Russell McOrmond-2
In reply to this post by Daniel Haran
Daniel Haran wrote:
> Ages ago I inquired with Canada Post about using one of their data
> products and was told they did not see a web service as
> redistribution, which significantly changes the licensing cost.


   I am wondering if someone has the time to do the footwork to check
with Statistics Canada on the PCFRF file.  If it turns out that a web
service would not be a problem, then I can talk to the people at Make
Poverty History about setting up a web service for this type of thing.
I'd envision some XML-RPC type of service, similar to the ECTOOLS
scripts at http://sourceforge.net/projects/campaigntoolz/ .

   They may be willing in exchange for credit for running the web
service, which can hopefully drive more 'human' traffic to their site
that would then generate more letters to MPs/etc.


(I would do it, but I'm a bit over-booked for the next little while..)



Re: MPs by postal code

Robin Millette
In reply to this post by Russell McOrmond-2
On 3/26/07, Russell McOrmond <[hidden email]> wrote:

>    The lines aren't always black-and-white, although I think in this case
> if there were a court case we would easily lose, as the cache and
> the use of the screen scraping isn't even a remixing but a simple use of
> the Crown Copyright database of postal-code to MPs.  IANAL, TINLA, but I
> want people to be very careful and not blindly believe that screen
> scraping and remixing avoids any copyright questions.

Oh, I support 100% the notion that laws should be changed, etc.
Scraping is just a stopgap measure. I really think we need to do both,
that is, to get the word out why it's important we have legitimate
access to this data. That's the main mission of COACID, no?




Re: MPs by postal code

Tracey P. Lauriault-2
In reply to this post by Daniel Haran
absolutely!

Tracey P. Lauriault
Geomatics and Cartographic Research Centre
Department of Geography and Environmental Studies
Carleton University
Ottawa (ON) K1S 5B6 Canada
[hidden email]
https://gcrc.carleton.ca/confluence/display/GCRCWEB/Lauriault



Re: MPs by postal code

Russell McOrmond-2
In reply to this post by Robin Millette

Robin Millette wrote:
> Oh, I support 100% the notion that laws should be changed, etc.
> Scraping is just a stopgap measure. I really think we need to do both,
> that is, to get the word out why it's important we have legitimate
> access to this data. That's the main mission of COACID, no?

   Some of us are not in the position to make use of stopgap measures
that possibly push the legal envelope.  For instance, I think I would
have a hard time going to the coalition members of Make Poverty History
with a proposal to use data that might (or might not) infringe Crown
Copyright.  Given that many of their members receive government funding,
they wouldn't be interested in participating in pushing that envelope.


   In the case of Elections Canada releasing their version of the postal
code-->EDID mapping, no laws need to change.  This is a simple policy
decision on their part to make this database consistent with the public
releases they already make for things like detailed election
results (after elections) and candidate lists (during elections).

   Note that I believe that the PCFRF product from Statistics Canada may
be a distraction for our goal.  Whether Statistics Canada has this data
or not as part of a larger data product shouldn't discourage Elections
Canada from freely releasing this information.



BTW: Those who wanted to investigate possible inconsistencies in the
data between government agencies may want to look at the additional
PCFRF query results I added to
http://www.digital-copyright.ca/node/1607#comment-1667

I provide examples of postal codes that map to multiple electoral
districts.  There are 2 of these that have 6 matches, 11 that have 5
matches, and I provide a sample of 10 that have between 2 and 4 matches.
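
For anyone who wants to repeat this kind of check on their own copy of a postal code to district table, here is a hedged sketch of the grouping step. The rows are invented and do not come from the PCFRF, and the helper names are hypothetical.

```ruby
# Invented (postal code, electoral district) pairs; real rows will differ.
ROWS = [
  ['A1A1A1', 'District 1'],
  ['A1A1A1', 'District 2'],
  ['B2B2B2', 'District 3'],
  ['C3C3C3', 'District 1'],
  ['C3C3C3', 'District 2'],
  ['C3C3C3', 'District 4'],
  ['B2B2B2', 'District 3'] # duplicate row, must not count twice
]

# postal code => list of distinct districts it maps to
def districts_by_code(rows)
  rows.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(code, district), acc|
    acc[code] << district unless acc[code].include?(district)
  end
end

# Only the postal codes that map to more than one district
def ambiguous_codes(rows)
  districts_by_code(rows).select { |_code, districts| districts.size > 1 }
end

# number of matches => how many postal codes have that many matches
def match_histogram(rows)
  districts_by_code(rows).values.each_with_object(Hash.new(0)) do |districts, counts|
    counts[districts.size] += 1
  end
end

puts ambiguous_codes(ROWS).keys.inspect
puts match_histogram(ROWS).inspect
```

The histogram is the same shape as the counts quoted above (how many codes have 6 matches, 5 matches, and so on).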
