Scrapy : Scrapy is a fast, high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be...
readability-lxml : Given an HTML document, it pulls out the main body text and cleans it up. It can also clean up the title, based on the latest readability.js code.
Upton : Upton is a framework for easy web-scraping with a useful debug mode that doesn't hammer your target's servers. It does the repetitive parts of writing...
Kartograph : The Kartograph map generator has just one method, which is there to generate SVG maps (surprise).
scikit-learn : Machine Learning in Python. Simple and efficient tools for data mining and data analysis. Accessible to everybody, and reusable in various contexts. Buil...
ScrapeGraphAI : ScrapeGraphAI is a web scraping Python library that uses LLMs and direct graph logic to create scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.).