Scrapy : Scrapy is a fast, high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be...
TextBlob : TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) ...
mock : A Python Mocking and Patching Library for Testing
readability-lxml : Given an HTML document, it pulls out the main body text and cleans it up. It can also clean up the title, based on the latest readability.js code.
notifiers : A Python one-stop shop for all notification providers, with a unified and simple interface.
ScrapeGraphAI : ScrapeGraphAI is a web scraping Python library that uses LLMs and direct graph logic to create scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.).