 | The fast, flexible, and elegant library for parsing and manipulating HTML a... | Pushed a day ago, 142 contributors, created 13 years ago | 29.3k |
 | The scalable web scraping and crawling library for JavaScript/Node.js. Enab... | Pushed a day ago, 104 contributors, created 9 years ago | 17.3k |
 | Extract the Readable Content from an HTML Document | Pushed 8 days ago, 85 contributors, created 10 years ago | 9.69k |
 | Extract meaningful content from the chaos of a web page | Pushed 2 years ago, 57 contributors, created 9 years ago | 5.57k |
 | A command-line tool to turn web pages into readable PDF, EPUB, HTML, or Mar... | Pushed 3 months ago, 21 contributors, created 7 years ago | 4.4k |
 | A Node.js scraper for humans. | Pushed 2 months ago, 20 contributors, created 9 years ago | 4.04k |
 | Get unified metadata from websites using Open Graph, Microdata, RDFa, Twitt... | Pushed 25 days ago, 40 contributors, created 9 years ago | 2.43k |
 | Extract main article, main image and meta data from URL | Pushed 2 months ago, 16 contributors, created 9 years ago | 1.68k |
 | Download website to local directory (including all css, images, js, etc.) | Pushed 9 days ago, 16 contributors, created 11 years ago | 1.6k |
 | A super simple site crawler and broken link checker | Pushed 9 days ago, 26 contributors, created 6 years ago | 1.06k |
 | AI-powered query language for web scraping and automation. It uses natural ... | Pushed 7 days ago, 22 contributors, created a year ago | 678 |