arkOS : A project to help users self-host their websites, email, files and more. Decentralize your web and reclaim your privacy rights while keeping the conve...
Feedbin : Feedbin is an open source web based RSS reader. It provides a user interface for reading and managing feeds as well as a REST-like API for clients to ...
HTML Purifier : HTML Purifier is a standards-compliant HTML filter library written in PHP. HTML Purifier will not only remove all malicious code (better known as XSS)...
docopt : Command-line interface description language. docopt helps you define the interface for your command-line app and automatically generates a parser for it.
Cliquet : Cliquet is a toolkit to ease the implementation of HTTP microservices, such as data-driven REST APIs.
TextBlob : TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) ...
pydantic : Data validation and settings management using Python type annotations.
In short: given an HTML document, it pulls out the main body text and cleans it up. It can also clean up the title, based on the latest readability.js code.
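
To give a rough feel for what "pulling out the main body text" of an HTML document can mean, here is a minimal sketch using only Python's standard library. It is not the readability.js algorithm and not the API of the project described above. The names `MainTextExtractor`, `extract_main_text` and `SKIP_TAGS` are hypothetical, and the heuristic (keep `<p>` text, skip scripts, styles and navigation-like containers) is an assumption made purely for illustration.

```python
import re
from html.parser import HTMLParser

# Elements whose contents this sketch treats as non-content (an assumption).
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


def _clean(text):
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


class MainTextExtractor(HTMLParser):
    """Collect <title> and <p> text, ignoring anything nested in SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0        # > 0 while inside a skipped element
        self.in_title = False
        self.in_paragraph = False
        self.title = ""
        self.paragraphs = []
        self._buffer = []          # text pieces of the current paragraph

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag == "title":
            self.in_title = True
        elif tag == "p" and self.skip_depth == 0:
            self.in_paragraph = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == "title":
            self.in_title = False
        elif tag == "p" and self.in_paragraph:
            text = _clean("".join(self._buffer))
            if text:
                self.paragraphs.append(text)
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data
        elif self.in_paragraph and self.skip_depth == 0:
            self._buffer.append(data)


def extract_main_text(html):
    """Return (cleaned title, body text with paragraphs separated by blank lines)."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return _clean(parser.title), "\n\n".join(parser.paragraphs)
```

Calling `extract_main_text(page_source)` returns a `(title, body)` tuple. Real readability implementations score candidate nodes by link density, text length and similar signals, so this sketch should be read only as the simplest possible stand-in for that idea.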