cssQuery() : cssQuery() is a powerful cross-browser JavaScript function that enables querying of a DOM document using CSS selectors. All CSS1 and CSS2 selectors ar...
SpaCy.io : spaCy is a library for industrial-strength natural language processing in Python and Cython. It features state-of-the-art speed and accuracy, a concis...
notifiers : A Python one-stop shop for all notification providers, with a unified and simple interface.
Scrapy : Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be...
docopt : Command-line interface description language. docopt helps you define the interface for your command-line app and automatically generates a parser for it.
In a few words: given an HTML document, it pulls out the main body text and cleans it up. It can also clean up the title, based on the latest readability.js code.