node-poppler
Asynchronous node.js wrapper for the Poppler PDF rendering library
Overview
Poppler is a PDF rendering library that also includes a collection of utility binaries, which allow for the manipulation of PDF documents and the extraction of data from them, such as converting PDF files to HTML, TXT, or PostScript.
The node-poppler module provides an asynchronous node.js wrapper around said utility binaries for easier use.
Installation
Install using npm:
npm i node-poppler
Linux and macOS/Darwin support
Windows binaries are provided with this repository.
Linux users will need to download the poppler-data and poppler-utils binaries separately.
An example of downloading the binaries on a Debian system:
sudo apt-get install poppler-data poppler-utils
macOS users can download the latest versions with Homebrew:
brew install poppler
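After installing, it can be useful to confirm that the Poppler utility binaries are reachable on the PATH before calling the wrapper. The following is a minimal sketch using only the Node.js standard library; `isBinaryAvailable` is a hypothetical helper and not part of node-poppler, and the binary names checked are the ones poppler-utils is expected to provide:

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper (not part of node-poppler): returns true if the
// given binary could be started, false if spawning it failed (e.g. ENOENT).
function isBinaryAvailable(binary, args = ["-v"]) {
  const result = spawnSync(binary, args, { stdio: "ignore" });
  // spawnSync sets `error` when the child process could not be started
  return !result.error;
}

// Binaries assumed to be installed by poppler-utils
for (const binary of ["pdftocairo", "pdftohtml", "pdftotext"]) {
  const status = isBinaryAvailable(binary) ? "found" : "not found on PATH";
  console.log(`${binary}: ${status}`);
}
```

The check only tests whether the process can be spawned; it does not verify the installed Poppler version.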
Example usage
Please refer to the JSDoc comments in the source code or the generated type definitions for information on the available options.
poppler.pdfToCairo
Example of an async/await call to poppler.pdfToCairo(), to convert only the first and second page of a PDF file to PNG:
const { Poppler } = require("node-poppler");

// Wrapped in an async function because top-level await is not available in CommonJS modules
async function main() {
  const file = "test_document.pdf";
  const poppler = new Poppler();
  const options = {
    firstPageToConvert: 1,
    lastPageToConvert: 2,
    pngFile: true,
  };
  const outputFile = "test_document.png";

  const res = await poppler.pdfToCairo(file, outputFile, options);
  console.log(res);
}

main().catch(console.error);
Example of an async/await call to poppler.pdfToCairo(), to convert only the first page of a PDF file to a new PDF file using stdout:
const { writeFile } = require("node:fs/promises");
const { Poppler } = require("node-poppler");

// Wrapped in an async function because top-level await is not available in CommonJS modules
async function main() {
  const file = "test_document.pdf";
  const poppler = new Poppler();
  const options = {
    lastPageToConvert: 1,
    pdfFile: true,
  };

  const res = await poppler.pdfToCairo(file, undefined, options);
  // pdfToCairo writes to stdout using binary encoding if pdfFile or singleFile options are used
  await writeFile("new_file.pdf", res, { encoding: "binary" });
}

main().catch(console.error);
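The encoding passed to writeFile matters here. The sketch below uses only the Node.js standard library and assumes, as the comment in the example above suggests, that the result is a binary-encoded (latin1) string. It shows that such a string round-trips to the original bytes when decoded as "binary", but not when treated as UTF-8. The sample bytes are illustrative and do not form a real PDF:

```javascript
// Illustrative bytes: "%PDF-" followed by four bytes above 0x7f
const bytes = Buffer.from([0x25, 0x50, 0x44, 0x46, 0x2d, 0xe2, 0xe3, 0xcf, 0xd3]);

// A binary (latin1) string maps every byte to one character
const asBinaryString = bytes.toString("binary");

// Decoding with the same encoding restores the original bytes
const roundTripped = Buffer.from(asBinaryString, "binary");

// Encoding the same string as UTF-8 expands each byte above 0x7f to two bytes
const viaUtf8 = Buffer.from(asBinaryString, "utf8");

console.log(roundTripped.equals(bytes)); // true
console.log(viaUtf8.equals(bytes)); // false
console.log(bytes.length, viaUtf8.length); // 9 13
```

Writing the string with a different encoding would therefore likely produce a corrupted PDF file.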
poppler.pdfToHtml
Example of calling poppler.pdfToHtml() with a promise chain:
const { Poppler } = require("node-poppler");

const file = "test_document.pdf";
const poppler = new Poppler();
const options = {
  firstPageToConvert: 1,
  lastPageToConvert: 2,
};

poppler
  .pdfToHtml(file, undefined, options)
  .then((res) => {
    console.log(res);
  })
  .catch((err) => {
    console.error(err);
    throw err;
  });
Example of calling poppler.pdfToHtml() with a promise chain, providing a Buffer as an input:
const { readFileSync } = require("node:fs");
const { Poppler } = require("node-poppler");

const file = readFileSync("test_document.pdf");
const poppler = new Poppler();
const options = {
  firstPageToConvert: 1,
  lastPageToConvert: 2,
};

poppler
  .pdfToHtml(file, "tester.html", options)
  .then((res) => {
    console.log(res);
  })
  .catch((err) => {
    console.error(err);
    throw err;
  });
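When passing a Buffer rather than a file path, a quick sanity check on the buffer contents can catch obviously wrong input before it reaches the Poppler binaries. The following is a minimal sketch using only the Node.js standard library; `looksLikePdf` is a hypothetical helper, not part of node-poppler, and it applies a strict check that the data starts with the `%PDF-` header:

```javascript
// Hypothetical helper (not part of node-poppler): returns true if the
// buffer starts with the "%PDF-" header that PDF files begin with.
function looksLikePdf(buffer) {
  if (!Buffer.isBuffer(buffer) || buffer.length < 5) {
    return false;
  }
  return buffer.subarray(0, 5).toString("latin1") === "%PDF-";
}

console.log(looksLikePdf(Buffer.from("%PDF-1.7\n"))); // true
console.log(looksLikePdf(Buffer.from("<html></html>"))); // false
```

This only inspects the first five bytes, so it says nothing about whether the rest of the document is valid.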
poppler.pdfToText
Example of calling poppler.pdfToText() with a promise chain:
const { Poppler } = require("node-poppler");

const file = "test_document.pdf";
const poppler = new Poppler();
const options = {
  firstPageToConvert: 1,
  lastPageToConvert: 2,
};
const outputFile = "test_document.txt";

poppler
  .pdfToText(file, outputFile, options)
  .then((res) => {
    console.log(res);
  })
  .catch((err) => {
    console.error(err);
    throw err;
  });
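The examples above hard-code an output file name that mirrors the input name. When converting several files, the output path could instead be derived from the input path. The following is a minimal sketch using only the Node.js standard library; `outputPathFor` is a hypothetical helper and not part of node-poppler:

```javascript
const path = require("node:path");

// Hypothetical helper (not part of node-poppler): swaps the extension of
// the input path, keeping the directory and base name unchanged.
function outputPathFor(inputFile, extension) {
  const { dir, name } = path.parse(inputFile);
  return path.format({ dir, name, ext: extension });
}

console.log(outputPathFor("test_document.pdf", ".txt")); // test_document.txt
```

The returned string could then be passed as the output file argument in place of the literal "test_document.txt".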
Contributing
Contributions are welcome, and any help is greatly appreciated!
See the contributing guide for details on how to get started. Please adhere to this project's Code of Conduct when contributing.
Acknowledgements
- Albert Astals Cid - Poppler developer
- Filipe Fernandes - poppler-feedstock maintainer
- Peter Williams - poppler-feedstock maintainer
- Owen Schwartz - poppler-windows developer
- Uwe Korn - poppler-feedstock maintainer
- Xylar Asay-Davis - poppler-feedstock maintainer
License
node-poppler is licensed under the MIT license.