web scraper (email extractor) (ID:1769)
Project Creator: | bioeducation; FC Member For 6742 Days; Credits 20; Completed Proj. Num. 0 / 4; Total payment USD; Avg Daily Online 0.00 h (From 21/5/2007); Available on MSN/Skype: No; Last Login 1/28/2007; Peers Rating 0.00% |
---|---|
Budget: | Less than 250 |
Created: | 1/12/2007 6:20:27 PM EST |
Bidding Ends: | 1/19/2007 6:20:27 PM EST ( Expired ) |
Development Cycle: | 21 Days |
Bid Count: | 8 |
Average Bid: | 225.00 |
Project Description: | Hi, I need a very simple web scraper to collect email addresses from a phone directory in Australia. Visit this link for an example: http://www.yellowpages.com.au/search/postCategorySearch.do?headingCode=10774&bookId=21&businessType=massage&areaId=1064&locationClue=brisbane&stateId=4 I need to be able to copy a URL like the one above from the directory, paste it into a search box in your scraper program, then have the scraper open the link to each of the listings and extract the email address on that page. The email addresses are not prefixed with the usual "mailto:" to identify them; they are prefixed with id="emailBusinessLink" onclick="emailWin(this.href); return false;"> Have a look at the source code for one of the listings in the directory link I've pasted above and you will see what I mean. Ideally, I would like the program to automatically visit all the links listed on the page and get their email addresses, then go to page 2 and do the same, go to page 3 and do the same, and so on. Save all the email addresses to a text file, and let me choose the name the file is saved under. That's it. You might be able to simply adapt a web scraper script from somewhere online to search for the tag I've given you instead of the "mailto:" prefix, which they all seem to use. If you can make the program really simple, small, fast and easy to use, that would be great - it doesn't have to look good, just do the job well, first time. I'm using Windows on my computer - you can build the program in any language you like, as long as it runs on Windows without using much memory. I'd like the program completed within about 2 weeks after bidding time finishes, if possible. |
Job Type | |
Attached Files: | N/A |
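
The extraction step described in the brief could be sketched roughly as follows. This is only a minimal illustration in Python using the standard library, not a finished scraper. It assumes the email address appears as the visible text of the link carrying id="emailBusinessLink", which is how the description reads but has not been checked against the live pages. The names `EmailLinkParser`, `extract_emails` and `save_emails` are hypothetical. Fetching pages, following the listing links and moving to page 2, 3 and so on are left out, because the brief does not give the markup for those links.

```python
import re
from html.parser import HTMLParser

# Loose pattern used only to pick an address out of the link text.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class EmailLinkParser(HTMLParser):
    """Collects email addresses found inside <a id="emailBusinessLink"> elements.

    Assumption: the address is the link's visible text, as the brief suggests.
    """

    def __init__(self):
        super().__init__()
        self._inside = False   # True while we are inside the target link
        self._buffer = []      # text fragments seen inside the link
        self.emails = []

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("id") == "emailBusinessLink":
            self._inside = True
            self._buffer = []

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._inside:
            self._inside = False
            match = EMAIL_RE.search("".join(self._buffer))
            if match:
                self.emails.append(match.group(0))


def extract_emails(html):
    """Return the addresses found in one listing page's HTML source."""
    parser = EmailLinkParser()
    parser.feed(html)
    parser.close()
    return parser.emails


def save_emails(emails, path):
    """Write unique addresses, one per line, to a file name chosen by the user."""
    unique = list(dict.fromkeys(emails))  # drop duplicates, keep first-seen order
    with open(path, "w", encoding="utf-8") as fh:
        for address in unique:
            fh.write(address + "\n")
    return unique


if __name__ == "__main__":
    # Made-up fragment modelled on the attributes quoted in the description.
    sample = (
        '<a href="#" id="emailBusinessLink" '
        'onclick="emailWin(this.href); return false;">info@example.com</a>'
    )
    print(extract_emails(sample))
```

A full program would download each listing page (for example with `urllib.request.urlopen`), pass its source to `extract_emails`, and call `save_emails` once at the end with the file name the user typed in.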