|  | The fast, flexible, and elegant library for parsing and manipulating HTML a... | Pushed a day ago; 144 contributors; Created 14 years ago | 29.9k | 
|  | The scalable web scraping and crawling library for JavaScript/Node.js. Enab... | Pushed a day ago; 115 contributors; Created 9 years ago | 20.3k | 
|  | Pack an entire repository into a single, AI-friendly file. Perfect for when... | Pushed a day ago; 64 contributors; Created a year ago | 20k | 
|  | Extract the Readable Content from an HTML Document | Pushed a month ago; 89 contributors; Created 11 years ago | 10.6k | 
|  | A command-line tool to turn web pages into readable PDF, EPUB, HTML, or Mar... | Pushed 2 months ago; 21 contributors; Created 7 years ago | 4.51k | 
|  | A Node.js scraper for humans. | Pushed 18 days ago; 21 contributors; Created 10 years ago | 4.06k | 
|  | Extract the main content from web pages. | Pushed a month ago; 14 contributors; Created 8 months ago | 2.93k | 
|  | Get unified metadata from websites using Open Graph, Microdata, RDFa, Twitt... | Pushed 11 days ago; 41 contributors; Created 9 years ago | 2.57k | 
|  | Extract main article, main image and meta data from URL | Pushed 2 months ago; 16 contributors; Created 10 years ago | 1.76k | 
|  | Download website to local directory (including all css, images, js, etc.) | Pushed 2 days ago; 17 contributors; Created 11 years ago | 1.65k | 
|  | A super simple site crawler and broken link checker | Pushed 2 days ago; 28 contributors; Created 7 years ago | 1.1k | 
|  | AI-powered query language for web scraping and automation. It uses natural ... | Pushed 25 days ago; 22 contributors; Created 2 years ago | 988 |