
Re: Scraping election results..

Posted by Russell McOrmond-2 on Oct 15, 2008; 7:03pm
URL: http://civicaccess.416.s1.nabble.com/Scraping-election-results-tp1344p1353.html

Robin Millette wrote:
> I didn't include URLs, perhaps I should have...

   Thanks for the sample -- I've not seen SimpleXMLElement used before.

Here is the foreach() loop, now grabbing the URLs as well.

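// Each $a is one table row; quotes() is defined elsewhere in the script.
foreach ($z as $a) {
    $data = array();
    $data[] = trim((string)$a->td[0], "\n");
    // Second cell: the link text, then the link's URL.
    $data[] = trim((string)$a->td[1]->a, "\n");
    $temp = $a->td[1]->a->attributes();
    $data[] = trim((string)$temp['href']);
    // Third cell: link text and URL again.
    $data[] = trim((string)$a->td[2]->a, "\n");
    $temp = $a->td[2]->a->attributes();
    $data[] = trim((string)$temp['href']);
    $data[] = trim((string)$a->td[3], "\n");
    $data[] = trim((string)$a->td[4], "\n");

    // Quote every field and print the row as one CSV line.
    $data2 = array_map('quotes', $data);
    echo join(',', $data2) . "\n";
}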
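
For anyone who wants to run the idea on its own, here is a rough, self-contained sketch of the same approach. The sample table, the names in it, and the quote_field() and row_to_fields() helpers are all made up for illustration; quote_field() only stands in for the quotes() helper, which is not shown in this post. Instead of indexing td[0] to td[4] by hand, it walks every cell and adds the href whenever the cell wraps a link.

```php
<?php
// Hypothetical stand-in for quotes(): wrap a field in double quotes,
// doubling any embedded quotes (CSV style).
function quote_field(string $s): string
{
    return '"' . str_replace('"', '""', $s) . '"';
}

// Turn one <tr> into a flat list of fields. Cells that wrap a link
// contribute two fields: the link text, then the link target.
function row_to_fields(SimpleXMLElement $tr): array
{
    $fields = array();
    foreach ($tr->td as $td) {
        if (isset($td->a)) {
            $fields[] = trim((string) $td->a);
            $fields[] = trim((string) $td->a['href']);
        } else {
            $fields[] = trim((string) $td);
        }
    }
    return $fields;
}

// Made-up sample markup; the real results page will differ.
$xml = <<<XML
<table>
  <tr>
    <td>Sample Riding</td>
    <td><a href="/candidate/1">Jane Doe</a></td>
    <td><a href="/party/2">Example Party</a></td>
    <td>12345</td>
    <td>54.3</td>
  </tr>
</table>
XML;

$table = new SimpleXMLElement($xml);
$lines = array();
foreach ($table->xpath('//tr') as $tr) {
    $lines[] = join(',', array_map('quote_field', row_to_fields($tr)));
}
echo join("\n", $lines) . "\n";
```

Note that SimpleXMLElement only accepts well-formed XML, so a real page may need tidying first.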


   Enough playing around for today.  What I want is something handy
that I can point at pages like this, and that will automatically fetch
the pages at those URLs and parse them as well, to add data like the
EDID, email, phone and website information for MPs.
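
As a rough sketch of that second step, the function below pulls email addresses out of an already-fetched page by looking for mailto: links. This is only one guess at how the linked pages might expose contact details; the function name extract_mailto_addresses() and the sample fragment are made up, and the EDID, phone and website fields would need their own rules once the real markup is known. It uses DOMDocument rather than SimpleXMLElement because fetched HTML is often not well-formed XML.

```php
<?php
// Return the unique addresses found in mailto: links of an HTML page.
function extract_mailto_addresses(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $addresses = array();
    foreach ($xpath->query('//a[starts-with(@href, "mailto:")]/@href') as $attr) {
        // Drop the scheme and any ?subject=... suffix.
        $address = substr($attr->value, strlen('mailto:'));
        $address = explode('?', $address, 2)[0];
        $addresses[] = trim($address);
    }
    return array_values(array_unique($addresses));
}

// Made-up fragment standing in for a fetched page.
$page = '<html><body>'
      . '<p>Contact: <a href="mailto:jane.doe@example.org?subject=Hi">email</a></p>'
      . '<p><a href="http://example.org/">website</a></p>'
      . '</body></html>';
$emails = extract_mailto_addresses($page);
echo join("\n", $emails) . "\n";
```

In real use, $page would come from something like file_get_contents($url) on each href collected in the loop, assuming allow_url_fopen is enabled and relative links have first been resolved against the page's own address.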

--
  Russell McOrmond, Internet Consultant: <http://www.flora.ca/>
  Please help us tell the Canadian Parliament to protect our property
  rights as owners of Information Technology. Sign the petition!
  http://www.digital-copyright.ca/petition/ict/

  "The government, lobbied by legacy copyright holders and hardware
   manufacturers, can pry my camcorder, computer, home theatre, or
   portable media player from my cold dead hands!"