| The fast, flexible, and elegant library for parsing and manipulating HTML a... | Pushed a day ago, 138 contributors, created 13 years ago | 28.7k |
| The scalable web scraping and crawling library for JavaScript/Node.js. Enab... | Pushed 2 days ago, 95 contributors, created 8 years ago | 15.7k |
| Extract the Readable Content from an HTML Document | Pushed a month ago, 80 contributors, created 10 years ago | 9.02k |
| Extract meaningful content from the chaos of a web page | Pushed 2 years ago, 57 contributors, created 8 years ago | 5.46k |
| A command-line tool to turn web pages into readable PDF, EPUB, HTML, or Mar... | Pushed 15 days ago, 20 contributors, created 6 years ago | 4.32k |
| A Node.js scraper for humans. | Pushed 7 days ago, 20 contributors, created 9 years ago | 4.01k |
| Get unified metadata from websites using Open Graph, Microdata, RDFa, Twitt... | Pushed a day ago, 39 contributors, created 8 years ago | 2.35k |
| Extract main article, main image and meta data from URL | Pushed 12 days ago, 16 contributors, created 9 years ago | 1.6k |
| Download website to local directory (including all css, images, js, etc.) | Pushed 2 months ago, 16 contributors, created 10 years ago | 1.57k |
| A super simple site crawler and broken link checker | Pushed 23 days ago, 26 contributors, created 6 years ago | 1.04k |
| Metadata scraper with support for oEmbed, Twitter Cards and Open Graph Prot... | Pushed 9 months ago, 22 contributors, created 8 years ago | 480 |
| AI-powered query language for web scraping and automation. It uses natural ... | Pushed 2 days ago, 18 contributors, created 9 months ago | 240 |