There is a book on this topic, "Webbots, Spiders, and Screen Scrapers: A Guide to Developing Internet Agents with PHP/CURL" - see a review here
PHP-Architect covered it in a well-written article by Matthew Turland in the December 2007 issue
Scraping generally encompasses 3 steps:
My favorite program for working with regexes is RegexBuddy. I would advise you to try the demo of that product even if you have no intention of buying it. It is an invaluable tool and will even generate code for your regexes in your language of choice (including PHP).
Usage: