Class PdfPigParser
Parses PDF files using PdfPig into a structured DoclingDocument. Extracts headings (via font-size analysis), lists (via prefix detection), and paragraphs (via spatial line grouping).
public class PdfPigParser : IDocumentParser
- Inheritance
- object
PdfPigParser
- Implements
- IDocumentParser
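The summary above names two of the heuristics, font-size analysis for headings and prefix detection for lists. The following is a minimal, self-contained sketch of how such heuristics could work. It is not the implementation used by PdfPigParser. `TextLine`, `BlockKind`, `BlockClassifier`, the `headingRatio` threshold of 1.2, and the prefix pattern are all hypothetical names and values chosen for illustration. Spatial line grouping is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public enum BlockKind { Heading, ListItem, Paragraph }

// Hypothetical input: one line of text with its dominant font size in points.
public readonly record struct TextLine(string Text, double FontSize);

public static class BlockClassifier
{
    // Assumed prefix pattern: "-", "*", "•", or a number followed by "." or ")".
    private static readonly Regex ListPrefix =
        new(@"^\s*(?:[-*•]|\d+[.)])\s+", RegexOptions.Compiled);

    public static IReadOnlyList<BlockKind> Classify(
        IReadOnlyList<TextLine> lines, double headingRatio = 1.2)
    {
        if (lines.Count == 0) return Array.Empty<BlockKind>();

        // Treat the most frequent font size as the body text size.
        double bodySize = lines
            .GroupBy(l => Math.Round(l.FontSize, 1))
            .OrderByDescending(g => g.Count())
            .ThenBy(g => g.Key)
            .First().Key;

        // A line clearly larger than body text is a heading; otherwise a
        // recognised prefix marks a list item; everything else is a paragraph.
        return lines.Select(l =>
            l.FontSize >= bodySize * headingRatio ? BlockKind.Heading
            : ListPrefix.IsMatch(l.Text) ? BlockKind.ListItem
            : BlockKind.Paragraph).ToList();
    }
}
```

Using the most common font size as the baseline avoids hard-coding a point size, which varies from one PDF to another.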
Constructors
PdfPigParser(PdfParserOptions?)
public PdfPigParser(PdfParserOptions? options = null)
Parameters
options PdfParserOptions
Methods
CanParse(string)
Returns true if the parser can handle the given source.
public bool CanParse(string source)
Parameters
source string
Returns
bool
ParseAsync(string, CancellationToken)
Parses the document and returns a structured DoclingDocument.
public Task<DoclingDocument> ParseAsync(string source, CancellationToken ct = default)
Parameters
source string
ct CancellationToken
Returns
Task<DoclingDocument>
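A caller would typically check CanParse before calling ParseAsync and pass a cancellation token to bound the work. The sketch below shows that pattern under stated assumptions. The `DoclingDocument` and `IDocumentParser` declarations are stand-ins so the snippet compiles on its own; in real code they come from the library, and this page does not confirm that `IDocumentParser` declares exactly these two members. `ParserUsage`, `TryParseAsync`, and the timeout parameter are hypothetical helpers.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Stand-in declarations for illustration only; use the library's types instead.
public sealed class DoclingDocument { }

public interface IDocumentParser
{
    bool CanParse(string source);
    Task<DoclingDocument> ParseAsync(string source, CancellationToken ct = default);
}

public static class ParserUsage
{
    // Returns null when the parser reports that it cannot handle the source.
    public static async Task<DoclingDocument?> TryParseAsync(
        IDocumentParser parser, string source, TimeSpan timeout)
    {
        if (!parser.CanParse(source))
            return null;

        // Request cancellation once the timeout elapses.
        using var cts = new CancellationTokenSource(timeout);
        return await parser.ParseAsync(source, cts.Token);
    }
}
```

With the real library this might be called as `await ParserUsage.TryParseAsync(new PdfPigParser(), "report.pdf", TimeSpan.FromSeconds(30))`, where `report.pdf` is a placeholder and passing a file path as the source is an assumption based on the class summary.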