pdf-parse

semver: >=2.0.0 <3.0.0 · postconditions: 8 · functions: 7 · last verified: 2026-04-16 · coverage score: 88%

Postconditions — what we check

  • pdf · pdf-func-no-try-catch
    error
    When: pdf(buffer) is called without a try-catch or .catch() handler
    Throws: InvalidPDFException | PasswordException | FormatError | ResponseException | AbortException | UnknownErrorException
    Required handling: Caller MUST wrap await pdf(buffer) in try-catch. The function can throw several documented exception types, covering malformed PDFs, password-protected files, network failures (when loading from a URL), and other pdfjs errors. Swallowing these errors silently hides corrupted input from the caller.
    Cost: medium · In prod: immediate exception · Users see: service unavailable · Visibility: visible
    Sources: [1]
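
A minimal sketch of the handling described for the functional API. The `parse` argument stands in for the package's `pdf` function so the example runs without the package installed; `safeParse` and the shape of its return value are hypothetical names chosen for illustration, not part of pdf-parse.

```javascript
// Hypothetical helper: wraps a pdf(buffer)-style call so a rejection never
// escapes unhandled. `parse` is injected; in real code it would be the
// package's pdf function (an assumption based on the entry above).
async function safeParse(parse, buffer) {
  try {
    const result = await parse(buffer);
    return { ok: true, result };
  } catch (error) {
    // Keep whatever name the thrown error carries, for logging or for
    // mapping to a user-facing message later.
    const name = error instanceof Error ? error.name : 'UnknownError';
    return { ok: false, name, error };
  }
}
```

Returning a tagged result rather than rethrowing is one design choice among several; rethrowing a wrapped error from the catch block would satisfy the same postcondition.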
  • PDFParse.getText · get-text-no-try-catch
    error
    When: parser.getText() is called without a try-catch or .catch() handler
    Throws: InvalidPDFException | PasswordException | FormatError | ResponseException | AbortException | UnknownErrorException
    Required handling: Caller MUST wrap await parser.getText() in try-catch. The same exception types apply as for the functional API. Additionally, destroy() should be called in a finally block to free memory.
    Cost: medium · In prod: immediate exception · Users see: service unavailable · Visibility: visible
    Sources: [1]
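
The try-catch plus finally pattern for the class API can be sketched as follows. `extractText` is a hypothetical helper; it accepts any object exposing `getText()` and `destroy()`, the two method names used in this profile, so the sketch runs without the package installed.

```javascript
// Hypothetical helper: `parser` is assumed to expose async getText() and
// destroy(), as the PDFParse entries in this profile describe.
async function extractText(parser) {
  try {
    const result = await parser.getText();
    return { ok: true, result };
  } catch (error) {
    // Load and parse failures land here instead of escaping unhandled.
    return { ok: false, error };
  } finally {
    // Runs on both success and failure, so the parser is always released.
    await parser.destroy();
  }
}
```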
  • PDFParse.getText · get-text-generic-catch
    warning
    When: parser.getText() is inside a try-catch but the catch block doesn't differentiate PasswordException
    Throws: PasswordException
    Required handling: When handling user-uploaded PDFs, caller SHOULD check for PasswordException specifically and return a user-friendly error (e.g. "PDF is password-protected") rather than a generic 500 error.
    Cost: low · In prod: degraded service · Users see: service unavailable · Visibility: visible
    Sources: [1]
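
One way to differentiate the password case inside a catch block is a small mapper from the caught error to a response. In this sketch the `PasswordException` class is a local stand-in so the code runs on its own; real code would use the library's class instead. `toHttpError`, the 422 status, and the generic 500 message are illustrative assumptions, not something the package prescribes.

```javascript
// Stand-in for the library's exception class (assumption: real code would
// use the class provided by pdf-parse rather than defining one).
class PasswordException extends Error {
  constructor(message) {
    super(message);
    this.name = 'PasswordException';
  }
}

// Hypothetical mapper from a caught error to an HTTP-style response.
function toHttpError(error) {
  if (error instanceof PasswordException) {
    // Client-side problem: tell the user what to fix.
    return { status: 422, message: 'PDF is password-protected' };
  }
  // Anything else stays a generic server error.
  return { status: 500, message: 'Failed to process PDF' };
}
```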
  • PDFParse.getInfo · get-info-no-try-catch
    error
    When: parser.getInfo() is called without a try-catch or .catch() handler
    Throws: InvalidPDFException | PasswordException | FormatError
    Required handling: Caller MUST wrap await parser.getInfo() in try-catch. Can throw the same exception types as getText(). Always call destroy() in a finally block.
    Cost: medium · In prod: immediate exception · Users see: service unavailable · Visibility: visible
    Sources: [1]
  • PDFParse.getImage · get-image-no-try-catch
    error
    When: parser.getImage() is called without a try-catch or .catch() handler
    Throws: InvalidPDFException | PasswordException | FormatError | ResponseException | AbortException | UnknownErrorException (all via load() → getException()). Also throws a generic Error for unsupported or unrecognizable image pixel formats (convertToRGBA: Unsupported image kind / Cannot infer image format).
    Required handling: Caller MUST wrap await parser.getImage() in try-catch. The method calls load() internally, so all PDF-level exceptions apply. Additionally, malformed embedded image data in otherwise-valid PDFs can trigger internal canvas errors that are distinct from PDF parse errors. Always call destroy() in a finally block to prevent memory leaks.

        try {
          const images = await parser.getImage({ imageDataUrl: true });
          return images.pages;
        } catch (error) {
          if (error instanceof InvalidPDFException) {
            throw new Error('Document is not a valid PDF');
          }
          if (error instanceof PasswordException) {
            throw new Error('PDF is password-protected');
          }
          throw error;
        } finally {
          await parser.destroy();
        }

    Cost: medium · In prod: immediate exception · Users see: service unavailable · Visibility: visible
    Sources: [2]
  • PDFParse.getScreenshot · get-screenshot-no-try-catch
    error
    When: parser.getScreenshot() is called without a try-catch or .catch() handler
    Throws: InvalidPDFException | PasswordException | FormatError | ResponseException | AbortException | UnknownErrorException (via load()). Also throws a generic Error('PDF document not loaded') if load fails silently, and canvas-related errors if node-canvas is not installed in Node.js (TypeError: Cannot read properties of undefined, when canvasFactory is not configured).
    Required handling: Caller MUST wrap await parser.getScreenshot() in try-catch. This method is particularly sensitive to the runtime environment: in Node.js, it requires node-canvas to be installed and configured. Missing canvas support causes runtime errors that are distinct from PDF parse errors. Always call destroy() in a finally block.

        try {
          const screenshots = await parser.getScreenshot({ scale: 1.5 });
          return screenshots.pages;
        } catch (error) {
          if (error instanceof InvalidPDFException) {
            throw new Error('Cannot render: invalid PDF document');
          }
          throw error;
        } finally {
          await parser.destroy();
        }

    Cost: medium · In prod: immediate exception · Users see: service unavailable · Visibility: visible
    Sources: [2]
  • PDFParse.getTable · get-table-no-try-catch
    error
    When: parser.getTable() is called without a try-catch or .catch() handler
    Throws: InvalidPDFException | PasswordException | FormatError | ResponseException | AbortException | UnknownErrorException (via load()). Also throws a generic Error('PDF document not loaded') on invalid state.
    Required handling: Caller MUST wrap await parser.getTable() in try-catch. PDFs without vector drawing operators return empty table arrays (no error), but malformed or password-protected PDFs throw on load. Always call destroy() in a finally block.

        try {
          const tables = await parser.getTable({ partial: [1, 2, 3] });
          return tables.pages;
        } catch (error) {
          if (error instanceof PasswordException) {
            throw new Error('PDF is password-protected: cannot extract tables');
          }
          throw error;
        } finally {
          await parser.destroy();
        }

    Cost: medium · In prod: immediate exception · Users see: service unavailable · Visibility: visible
    Sources: [2]
  • getHeader · get-header-ok-not-checked
    warning
    When: the getHeader() return value is used without checking result.ok before accessing result.status, result.size, result.headers, or result.magic
    Throws: Never throws; all errors are caught internally. Returns { ok: false, status: undefined, size: undefined, magic: false, headers: {}, error: Error } on any network failure (ECONNREFUSED, ENOTFOUND, ETIMEDOUT, invalid URL, HTTP error status).
    Required handling: Caller MUST check result.ok before treating the result as valid. Skipping this check means network failures are silently ignored: the PDF URL is treated as valid even when unreachable, causing downstream load() calls to fail with a harder-to-diagnose error.

        // WRONG: silent failure on network error
        const result = await getHeader(url, true);
        if (result.magic) { /* assumes ok */ }

        // CORRECT: check ok first
        const result = await getHeader(url, true);
        if (!result.ok) {
          throw new Error(`PDF URL unreachable: ${result.error?.message}`);
        }
        if (!result.magic) {
          throw new Error('URL does not point to a valid PDF file');
        }

    Cost: low · In prod: silent failure · Users see: lost data · Visibility: silent
    Sources: [3]

Sources

Every postcondition cites at least one of these. Numbered to match the footnotes above.

  [1] raw.githubusercontent.com/mehmet-kozan/pdf-parse: https://raw.githubusercontent.com/mehmet-kozan/pdf-parse/main/README.md
  [2] github.com/mehmet-kozan/pdf-parse: https://github.com/mehmet-kozan/pdf-parse/blob/main/README.md
  [3] github.com/mehmet-kozan/pdf-parse: https://github.com/mehmet-kozan/pdf-parse/blob/main/src/node/getHeader.ts