For anyone running a web site dealing with federal elections or Parliament, allowing your users to search for their electoral riding by their postal code is a key feature. As many of you know, Statscan makes available a mapping between postal codes and electoral districts, but to the tune of $2,500 for the first year and $500 for every subsequent year.
There have been many efforts to get around paying for the data. However, none were both free and able to handle postal codes that mapped to multiple electoral ridings (there are over 900 such postal codes just in Ontario).
So, I am very happy to announce a new/improved version of Daniel Haran's postal code to electoral district web service: http://postal-code-to-edid-webservice.heroku.com/
The web service returns electoral district IDs (the same used by Elections Canada) either as JSON, JSONP or CSV. It determines the districts by first querying elections.ca, which works for all postal codes that contain a single riding. If there is no match, it queries cbc.ca, which works for postal codes that contain multiple ridings. The code is also capable of querying the web sites of the major parties and a few others.
If anyone has any questions/comments/feature requests, please let me know! James |
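To illustrate the lookup order described in the announcement (elections.ca first, then cbc.ca only when there is no match), here is a minimal Python sketch. It is not the web service's actual code: `lookup_edids`, `normalize_postal_code`, and the `Source` type are hypothetical names, and each source is assumed to be a function that takes a postal code and returns a list of electoral district IDs (in practice it would fetch and parse one web site).

```python
from typing import Callable, List, Sequence

# Assumption: a "source" is any function mapping a normalized postal code
# to a list of electoral district IDs (empty when it finds no match).
Source = Callable[[str], List[str]]


def normalize_postal_code(raw: str) -> str:
    """Strip whitespace and uppercase, so 'a1a 1a1' becomes 'A1A1A1'."""
    return "".join(raw.split()).upper()


def lookup_edids(postal_code: str, sources: Sequence[Source]) -> List[str]:
    """Try each source in order and return the first non-empty result."""
    code = normalize_postal_code(postal_code)
    for source in sources:
        try:
            edids = source(code)
        except Exception:
            # A source that fails (site down, layout changed) is skipped
            # so that the next one can still answer.
            continue
        if edids:
            # De-duplicate and sort so the output is stable.
            return sorted(set(edids))
    return []
```

Passing the sources as an ordered list keeps the "single riding first, multiple ridings second" behaviour explicit, and makes it easy to append further sources such as the party web sites.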
way to go Daniel!
On Sun, Apr 17, 2011 at 10:31 AM, James McKinney <[hidden email]> wrote:
-- Tracey P. Lauriault 613-234-2805 |
For clarification, Daniel was the original author, but I made the new version :)
On Sun, Apr 17, 2011 at 10:51 AM, Tracey P. Lauriault <[hidden email]> wrote: way to go Daniel! |
Congrats James, great work. You should ping the people at catch22.ca and swing33.ca - they would probably be interested.
On 2011-04-17, at 10:54 AM, James McKinney wrote: For clarification, Daniel was the original author, but I made the new version :) |
Ditto to what Jon and Tracey said - this is great. On 2011-04-17, at 2:31 PM, Jonathan Brun <[hidden email]> wrote:
|
Fantastic!
Quick suggestion: add to the docs a link to a page mapping EDIDs to riding names, e.g. http://elections.ca/content.aspx?section=res&dir=cir/list&document=index&lang=e#list. And maybe stick a CSV file with edid/name_en/name_fr/province into the repository?
On Sun, Apr 17, 2011 at 3:04 PM, Bernard Rudny <[hidden email]> wrote:
|
In reply to this post by James McKinney
On Sun, Apr 17, 2011 at 10:31 AM, James McKinney
<[hidden email]> wrote:
> The web service returns electoral district IDs (the same used by Elections
> Canada) either as JSON, JSONP or CSV. It determines the districts by first
> querying elections.ca, which works for all postal codes that contain a
> single riding. If there is no match, it queries cbc.ca, which works for
> postal codes that contain multiple ridings. The code is also capable of
> querying the web sites of the major parties and a few others.

It is great that people continue to push forward with this.

I would caution that screen scraping brings with it potential legal liability. You may be in violation of the terms of use of the website, and thus unauthorized access may be claimed.

This also doesn't really get around the copyright issue, any more than downloading a movie via P2P where "non-substantial parts" come from a variety of sources rather than streaming where it comes from one source.

In this case we are talking about facts as individual things (i.e., that this postal code maps to this set of districts), but we do have copyright on collections in Canada which could theoretically come into play.

I'm not wanting to discourage this -- I just want to ensure that people are aware of any potential risks. Those who have done screen scraping (and that is what I did at first on my site) are partly relying on the fact that lawsuits are unlikely, as they would lead to bad publicity. But I am not sure that is the safest assumption for us to always rely on.

I'm hoping we will have a long-term solution to this by 2013, when I believe the next boundary redrawing based on the census is done. We do have the shapes for the electoral districts, but have not yet been able to get data out of Canada Post -- this should be open data, and not something that has either terms of use, copyright, or $$ issues with it.

In case anyone didn't see it, I'm still hosting a sample letter to MPs on this issue. It is not useful at the moment, as the MP list is updated now that the election was called, but it is something we should keep in the back of our minds.

http://creform.ca/edid/letter5

--
Russell McOrmond, Internet Consultant: <http://www.flora.ca/>
Please help us tell the Canadian Parliament to protect our property rights as owners of Information Technology. Sign the petition! http://creform.ca/petition/ict/
"The government, lobbied by legacy copyright holders and hardware manufacturers, can pry my camcorder, computer, home theatre, or portable media player from my cold dead hands!"
|
On Sun, Apr 17, 2011 at 6:05 PM, Russell McOrmond <[hidden email]> wrote:
> I would caution that screen scraping brings with it potential legal

Thanks, Russell. I have a copy of a fairly lengthy opinion by David Fewer of CIPPIC, which was prepared for another person who scraped two web sites for postal-code-to-MP data. It concludes that "there is probably no copyright infringement, breach of contract, trespass to chattels, private nuisance, public nuisance, or an actionable appropriation." My API is even less aggressive than that one, as it only scrapes other web sites when a request is made to the API.

Daniel Haran also sought a legal opinion (not sure from whom), which agreed that it is probably legal. But we can't be sure how things will play out, and I agree it is important to understand the risks.
|
In reply to this post by Michael Mulley
On Sun, Apr 17, 2011 at 4:44 PM, Michael Mulley <[hidden email]> wrote:
> Fantastic!

Done! I am reluctant to add riding names to the API as these change at unpredictable times, but I don't mind sticking in static files and links. MS Excel is notoriously bad at opening CSVs with UTF-8 characters (French accents, etc.), so I added a UTF-16LE-encoded TSV (tab-separated values) file, too.
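For anyone who wants to produce a similar Excel-friendly file, here is a minimal Python sketch of writing a UTF-16LE, tab-separated file. It is an illustration, not the script used for the repository: `rows_to_excel_tsv` is a hypothetical helper, the leading byte-order mark is an assumption about what Excel needs to detect the encoding, and the column names simply follow the edid/name_en/name_fr/province suggestion made earlier in the thread.

```python
import codecs
import csv
import io
from typing import Iterable, Sequence


def rows_to_excel_tsv(rows: Iterable[Sequence[str]]) -> bytes:
    """Serialize rows as tab-separated text encoded as UTF-16LE with a BOM."""
    buffer = io.StringIO(newline="")
    # csv.writer handles quoting of any tabs or quotes inside a field.
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\r\n")
    writer.writerows(rows)
    # "utf-16-le" does not emit a byte-order mark, so prepend it explicitly.
    return codecs.BOM_UTF16_LE + buffer.getvalue().encode("utf-16-le")
```

The returned bytes can be written to disk with `open(path, "wb")`. Building the file in memory first keeps the encoding step in one place and makes the output easy to check.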
|
In reply to this post by James McKinney
Is the opinion from David private advice, or something he published? It being published may also be useful as the relevant government agencies and other entities may start by looking at that before (and hopefully instead of) launching any legal action. -- On Apr 17, 2011 6:38 PM, "James McKinney" <[hidden email]> wrote:
> On Sun, Apr 17, 2011 at 6:05 PM, Russell McOrmond <[hidden email]> wrote:
>
>> I would caution that screen scraping brings with it potential legal
>> liability. You may be in violation of the terms of use of the
>> website, and thus unauthorized access may be claimed.
>>
>> This also doesn't really get around the copyright issue, any more
>> than downloading a movie via P2P where "non-substantial parts" come
>> from a variety of sources rather than streaming where it comes from
>> one source.
>>
>> In this case we are talking about facts as individual things (IE:
>> that this postal code maps to this set of districts), but we do have
>> copyright on collections in Canada which could theoretically come into
>> play.
>>
>
> Thanks, Russell. I have a copy of a fairly lengthy opinion by David Fewer of
> CIPPIC, which was prepared for another person who scraped two web sites for
> postal-code-to-MP data. It concludes that "there is probably no copyright
> infringement, breach of contract, trespass to chattels, private nuisance,
> public nuisance, or an actionable appropriation." My API is even less
> aggressive than that one, as it only scrapes other web sites when a request
> is made to the API. Daniel Haran also sought a legal opinion (not sure who
> from), who agreed that it is probably legal. But, we can't be sure how
> things will play out, and I agree it is important to understand the risks.
|
In reply to this post by James McKinney
While the topic is open, I have a question about legal/copyright infringement on data scraped from websites: does anybody know of any examples of governments or firms suing people for scraping data available from websites? If so, with what consequences?

In the end, with people like Montreal Ouvert, we are trying more and more to show examples of what could be done by reusing data available on websites, but we rarely ask ourselves whether we are running into any legal problem.

Steph

On 11-04-17 18:37, James McKinney wrote: On Sun, Apr 17, 2011 at 6:05 PM, Russell McOrmond <[hidden email]> wrote: |
2011/4/17 Stéphane Guidoin <[hidden email]>
The United States and Canada differ on copyright law, in any case. "Sweat of the brow" is a legal concept that is sometimes used to determine whether a dataset is copyrightable - basically, how hard was it to make the dataset.

http://en.wikipedia.org/wiki/Sweat_of_the_brow

It is a legally untested concept in Canada, to my knowledge. In the US, I believe sweat of the brow has been rejected. Also, in some jurisdictions, "sweat of the brow" means that I can take a copyrighted dataset, and if I sufficiently sweat over it, I get copyright over my "new" dataset.
|
In reply to this post by Russell McOrmond-3
It was posted to the Files section of Visible Government's Google group, before Google removed all Files sections.
http://www.scribd.com/doc/53218667/2005-04-05-CIPPIC-Paul-Schreiber
On Sun, Apr 17, 2011 at 7:12 PM, Russell McOrmond <[hidden email]> wrote:
|
In reply to this post by Stéphane Guidoin
Stéphane, I'm not aware of any Canadian cases directly on point for scraping data (though there very well may be one I don't know about). However, as for the copyright component, there's a 1997 Federal Court of Appeal case that's relevant -- check out Tele-Direct (Publications) Inc. v. American Business Information, Inc., [1998] 2 F.C. 22.

In this case, Tele-Direct created and sold compilations of subscriber information that was originally from Bell. The court held there was no copyright violation, as the compilation was not "original". Keep in mind that, in Canada, there's no copyright in databases themselves (unlike in the EU). As long as the data that you're collecting is actually data (and not original works in themselves), you're likely okay on that front.

However, Terms of Use are a different story. This is a contract that you make with the website by using it: if the terms are clear that you cannot scrape from their site, you're breaching the contract by doing so. The liability isn't as strong in this context as it is for copyright violations (there are no statutory damages), but you should particularly watch out if you're commercializing the data you collect.

In any case, I'm sure the legal opinion by David Fewer that James mentioned is a lot more detailed, so you might want to also check that out....

Kent

NOTE: All information on this e-mail is general information and should in no way be construed as legal advice. I do not provide any legal advice over mailing lists and readers should consult with their lawyer for advice applicable to their particular factual situations.

2011/4/18 Stéphane Guidoin <[hidden email]>
|