Screen scraping of 6 websites using an open source tool

Completed. Posted Dec 26, 2008. Paid on delivery.

I require a very simple application that scrapes data from 6 different sites and creates XML output.

The program should use an open source scraping tool called WebHarvest (you can find it at: [url removed, login to view]).

What I need from you is a set of Web Harvest script files, each of which creates a variable containing the XML, and a small Java application that executes a script and prints the XML (example: [url removed, login to view]).

The Java main should contain no code other than running the script, passing parameter values to it, and outputting the XML. All the logic, including the creation of the XML, will reside in the scripts.
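As a rough sketch of how thin the Java entry point could be: the class below only parses `name=value` arguments, hands them to a script runner, and prints the returned XML. `ScrapeLauncher`, `ScriptRunner`, and the argument layout are hypothetical names chosen for illustration. The actual Web Harvest invocation would sit behind the `ScriptRunner` interface and is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ScrapeLauncher {

    /** Hypothetical seam: a real implementation would execute the Web Harvest script. */
    public interface ScriptRunner {
        String run(String scriptPath, Map<String, String> params);
    }

    /** Turns arguments such as "state=TX" into an ordered name/value map. */
    public static Map<String, String> parseParams(String[] args, int from) {
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = from; i < args.length; i++) {
            int eq = args[i].indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected name=value, got: " + args[i]);
            }
            params.put(args[i].substring(0, eq), args[i].substring(eq + 1));
        }
        return params;
    }

    /** Runs the script named by args[0] with the remaining arguments as parameters. */
    public static String launch(ScriptRunner runner, String[] args) {
        if (args.length < 1) {
            throw new IllegalArgumentException("Usage: <script.xml> [name=value ...]");
        }
        return runner.run(args[0], parseParams(args, 1));
    }

    public static void main(String[] args) {
        // Placeholder runner (assumption): swap in one backed by Web Harvest.
        ScriptRunner placeholder = (script, params) ->
                "<results script=\"" + script + "\" params=\"" + params.size() + "\"/>";
        System.out.println(launch(placeholder, args));
    }
}
```

Keeping the runner behind a one-method interface means the main stays free of scraping logic, which matches the requirement that everything else lives in the scripts.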

There are a total of 6 URLs that require web scraping. They are listed here with their requirements. Each site requires its own script:

[url removed, login to view]

Takes a state as the search criterion. Returns pages of results. Each result (across all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

[url removed, login to view]

Takes a state as the search criterion. Returns results in a Flash-rendered view. Each result (across all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

[url removed, login to view]

Takes a state as the search criterion. Each result (across all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

[url removed, login to view]

[url removed, login to view] (list view)

Takes a zip code AND a price range. Each result (across all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

[url removed, login to view] with the real estate plugin

Takes a zip code AND a price range. Each result (across all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

Because all of these are real estate websites, you will first need to submit a POST search on each of them in order to scrape the results. The POST search query typically requires a zip code, state, and/or city.
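To illustrate what such a POST search carries, here is a minimal sketch that builds a form-encoded request body from the search criteria. The field names `zip`, `state`, and `city` are assumptions; each site defines its own form fields, and in the deliverable this step would be handled by the Web Harvest script, not by Java code. The sketch uses the `URLEncoder.encode(String, Charset)` overload, which requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class SearchFormBody {

    /** Encodes non-empty fields as an application/x-www-form-urlencoded body. */
    public static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String value = field.getValue();
            if (value == null || value.isEmpty()) {
                continue; // skip criteria the caller did not supply
            }
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Hypothetical field names; real sites will differ.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("zip", "90210");
        fields.put("state", "CA");
        fields.put("city", "Beverly Hills");
        System.out.println(encode(fields)); // prints zip=90210&state=CA&city=Beverly+Hills
    }
}
```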

The scripts should be callable from Java code. You will provide both the scripts and the Java code.

Skills: Engineering, Java, MySQL, PHP, Software Architecture, Software Testing, Web Hosting, Website Management, Website Testing

Project ID: #3498408

About the project

5 proposals. Remote project. Active Dec 27, 2008.

Awarded to:

abhay78

See private message.

$212.5 USD in 14 days
(94 Reviews)
6.4

5 freelancers bid an average of $315 for this job

smartsallar

See private message.

$425 USD in 14 days
(26 Reviews)
5.4
brainwithstorm

See private message.

$425 USD in 14 days
(30 Reviews)
4.9
ashwinc

See private message.

$212.5 USD in 14 days
(4 Reviews)
3.6
cicdev

See private message.

$297.5 USD in 14 days
(3 Reviews)
0.0