pdf-parse
semver: >=2.0.0 <3.0.0
postconditions: 8
functions: 7
last verified: 2026-04-16
coverage score: 88%

Postconditions: what we check
- pdf · pdf-func-no-try-catch (error)
  When: pdf(buffer) called without try-catch or .catch() handler
  Throws:
  InvalidPDFException | PasswordException | FormatError | ResponseException | AbortException | UnknownErrorException
  Required handling: Caller MUST wrap await pdf(buffer) in try-catch. The function can throw multiple documented exception types for malformed PDFs, password-protected files, network failures (when loading from URL), and other pdfjs errors. Silent failure means corrupted data is silently swallowed.
  Cost: medium · In prod: immediate exception · Users see: service unavailable · Visibility: visible
  Sources: [1]

- PDFParse.getText · get-text-no-try-catch (error)
  When: parser.getText() called without try-catch or .catch() handler
  Throws:
  InvalidPDFException | PasswordException | FormatError | ResponseException | AbortException | UnknownErrorException
  Required handling: Caller MUST wrap await parser.getText() in try-catch. Same exception types apply as the functional API. Additionally, destroy() should be called in a finally block to free memory.
  Cost: medium · In prod: immediate exception · Users see: service unavailable · Visibility: visible
  Sources: [1]

- PDFParse.getText · get-text-generic-catch (warning)
  When: parser.getText() in try-catch but catch block doesn't differentiate PasswordException
  Throws:
  PasswordException
  Required handling: When handling user-uploaded PDFs, caller SHOULD check for PasswordException specifically to return a user-friendly error (e.g. "PDF is password-protected") rather than a generic 500 error.
  Cost: low · In prod: degraded service · Users see: service unavailable · Visibility: visible
  Sources: [1]

- PDFParse.getInfo · get-info-no-try-catch (error)
  When: parser.getInfo() called without try-catch or .catch() handler
  Throws:
  InvalidPDFException | PasswordException | FormatError
  Required handling: Caller MUST wrap await parser.getInfo() in try-catch. Can throw the same exception types as getText(). Always call destroy() in a finally block.
  Cost: medium · In prod: immediate exception · Users see: service unavailable · Visibility: visible
  Sources: [1]

- PDFParse.getImage · get-image-no-try-catch (error)
  When: parser.getImage() called without try-catch or .catch() handler
  Throws:
  InvalidPDFException | PasswordException | FormatError | ResponseException | AbortException | UnknownErrorException (all via load() → getException()). Also throws generic Error for unsupported/unrecognizable image pixel formats (convertToRGBA: Unsupported image kind / Cannot infer image format).
  Required handling: Caller MUST wrap await parser.getImage() in try-catch. The method calls load() internally, so all PDF-level exceptions apply. Additionally, malformed embedded image data in otherwise-valid PDFs can trigger internal canvas errors that are distinct from PDF parse errors. Always call destroy() in a finally block to prevent memory leaks.
  Example:
    try {
      const images = await parser.getImage({ imageDataUrl: true });
      return images.pages;
    } catch (error) {
      if (error instanceof InvalidPDFException) {
        throw new Error('Document is not a valid PDF');
      }
      if (error instanceof PasswordException) {
        throw new Error('PDF is password-protected');
      }
      throw error;
    } finally {
      await parser.destroy();
    }
  Cost: medium · In prod: immediate exception · Users see: service unavailable · Visibility: visible
  Sources: [2]

- PDFParse.getScreenshot · get-screenshot-no-try-catch (error)
  When: parser.getScreenshot() called without try-catch or .catch() handler
  Throws:
  InvalidPDFException | PasswordException | FormatError | ResponseException | AbortException | UnknownErrorException (via load()). Also throws generic Error('PDF document not loaded') if load fails silently, and canvas-related errors if node-canvas is not installed in Node.js (TypeError: Cannot read properties of undefined; canvasFactory not configured).
  Required handling: Caller MUST wrap await parser.getScreenshot() in try-catch. This method is particularly sensitive to the runtime environment: in Node.js, it requires node-canvas to be installed and configured. Missing canvas support causes runtime errors that are distinct from PDF parse errors. Always call destroy() in a finally block.
  Example:
    try {
      const screenshots = await parser.getScreenshot({ scale: 1.5 });
      return screenshots.pages;
    } catch (error) {
      if (error instanceof InvalidPDFException) {
        throw new Error('Cannot render — invalid PDF document');
      }
      throw error;
    } finally {
      await parser.destroy();
    }
  Cost: medium · In prod: immediate exception · Users see: service unavailable · Visibility: visible
  Sources: [2]

- PDFParse.getTable · get-table-no-try-catch (error)
  When: parser.getTable() called without try-catch or .catch() handler
  Throws:
  InvalidPDFException | PasswordException | FormatError | ResponseException | AbortException | UnknownErrorException (via load()). Also throws generic Error('PDF document not loaded') on invalid state.
  Required handling: Caller MUST wrap await parser.getTable() in try-catch. PDFs without vector drawing operators return empty table arrays (no error), but malformed or password-protected PDFs throw on load. Always call destroy() in a finally block.
  Example:
    try {
      const tables = await parser.getTable({ partial: [1, 2, 3] });
      return tables.pages;
    } catch (error) {
      if (error instanceof PasswordException) {
        throw new Error('PDF is password-protected — cannot extract tables');
      }
      throw error;
    } finally {
      await parser.destroy();
    }
  Cost: medium · In prod: immediate exception · Users see: service unavailable · Visibility: visible
  Sources: [2]

- getHeader · get-header-ok-not-checked (warning)
  When: getHeader() return value used without checking result.ok before accessing result.status, result.size, result.headers, or result.magic
  Throws:
  Never throws; all errors are caught internally. Returns { ok: false, status: undefined, size: undefined, magic: false, headers: {}, error: Error } on any network failure (ECONNREFUSED, ENOTFOUND, ETIMEDOUT, invalid URL, HTTP error status).
  Required handling: Caller MUST check result.ok before treating the result as valid. Skipping this check means network failures are silently ignored: the PDF URL is treated as valid even when unreachable, causing downstream load() calls to fail with a harder-to-diagnose error.
  Example:
    // WRONG: silent failure on network error
    const result = await getHeader(url, true);
    if (result.magic) { /* assumes ok */ }

    // CORRECT: check ok first
    const result = await getHeader(url, true);
    if (!result.ok) {
      throw new Error(`PDF URL unreachable: ${result.error?.message}`);
    }
    if (!result.magic) {
      throw new Error('URL does not point to a valid PDF file');
    }
  Cost: low · In prod: silent failure · Users see: lost data · Visibility: silent
  Sources: [3]
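
The getText() postconditions above (wrap in try-catch, single out PasswordException, call destroy() in a finally block) can be combined in one small wrapper. The following is a minimal sketch, not part of pdf-parse: extractText is a hypothetical helper, the parser argument is assumed to be any object exposing async getText() and destroy() as described above, and the exception class is passed in by the caller so the sketch makes no assumption about how the library exports it.

```javascript
// Hypothetical helper, not a pdf-parse API.
// parser: any object with async getText() and async destroy() (assumption).
// PasswordException: the exception class to match, supplied by the caller.
async function extractText(parser, PasswordException) {
  try {
    // May throw any of the PDF-level exceptions listed above.
    const result = await parser.getText();
    return { ok: true, result };
  } catch (error) {
    // Differentiate password-protected files so the caller can show
    // a user-friendly message instead of a generic 500 error.
    if (error instanceof PasswordException) {
      return { ok: false, reason: 'password-protected' };
    }
    // Everything else is rethrown unchanged for upstream handling.
    throw error;
  } finally {
    // Runs on success, on the password branch, and on rethrow.
    await parser.destroy();
  }
}
```

Returning a tagged result for the password case while rethrowing everything else keeps the one expected user error separate from genuine failures; a caller could map the 'password-protected' reason to a 4xx response and let other errors propagate.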
Sources
Every postcondition cites at least one of these. Numbered to match the footnotes above.
- [1] raw.githubusercontent.com/mehmet-kozan/pdf-parse: https://raw.githubusercontent.com/mehmet-kozan/pdf-parse/main/README.md
- [2] github.com/mehmet-kozan/pdf-parse: https://github.com/mehmet-kozan/pdf-parse/blob/main/README.md
- [3] github.com/mehmet-kozan/pdf-parse: https://github.com/mehmet-kozan/pdf-parse/blob/main/src/node/getHeader.ts