Chromeless : Chrome automation made simple. Runs locally or headless on AWS Lambda.
Chromeless can be used to...
Run 1000s of browser integration tests in paralle...
PHP_CodeSniffer : PHP_CodeSniffer tokenises PHP, JavaScript and CSS files and detects violations of a defined set of coding standards.
Colly : Colly is a Go framework that provides a clean interface for writing any kind of crawler, scraper, or spider.
With Colly you can easily extract structured data ...
Goutte : Goutte is a screen scraping and web crawling library for PHP.
Goutte provides a nice API to crawl websites and extract data from the HTML/XML responses.