 | The fast, flexible, and elegant library for parsing and manipulating HTML a... | Pushed 3 days ago, 143 contributors, created 14 years ago | 29.7k |
 | The scalable web scraping and crawling library for JavaScript/Node.js. Enab... | Pushed a day ago, 110 contributors, created 9 years ago | 19.2k |
 | Pack an entire repository into a single, AI-friendly file. Perfect for when... | Pushed a day ago, 61 contributors, created a year ago | 18.8k |
 | Extract the Readable Content from an HTML Document | Pushed 14 days ago, 87 contributors, created 11 years ago | 10.4k |
 | A command-line tool to turn web pages into readable PDF, EPUB, HTML, or Mar... | Pushed a day ago, 21 contributors, created 7 years ago | 4.48k |
 | A Node.js scraper for humans. | Pushed a month ago, 21 contributors, created 9 years ago | 4.06k |
 | Extract the main content from web pages. | Pushed 15 days ago, 14 contributors, created 6 months ago | 2.81k |
 | Get unified metadata from websites using Open Graph, Microdata, RDFa, Twitt... | Pushed 16 days ago, 41 contributors, created 9 years ago | 2.55k |
 | Extract main article, main image and meta data from URL | Pushed 4 months ago, 16 contributors, created 10 years ago | 1.74k |
 | Download website to local directory (including all css, images, js, etc.) | Pushed 17 days ago, 16 contributors, created 11 years ago | 1.64k |
 | A super simple site crawler and broken link checker | Pushed 18 days ago, 25 contributors, created 7 years ago | 1.09k |
 | AI-powered query language for web scraping and automation. It uses natural ... | Pushed 2 days ago, 22 contributors, created 2 years ago | 924 |