Class DoclingDocument

Namespace
Mythosia.Documents
Assembly
Mythosia.Documents.Abstractions.dll

Unified document representation that follows Docling's DoclingDocument convention. Content items are stored in flat lists (Texts, Tables, Pictures, Groups); the tree structure is maintained by the Body and Furniture root nodes, which reference those items through RefItem pointers.
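Because content lives in flat per-type lists, a caller can take a quick inventory without walking the Body tree. The following is a minimal sketch that uses only the properties documented on this page. The DocumentInventory helper is hypothetical and not part of the library, and the null checks are a precaution because this page does not state whether the lists are initialized by default.

```csharp
using Mythosia.Documents;

// Hypothetical helper for illustration; not part of Mythosia.Documents.
public static class DocumentInventory
{
    // Counts the items in each flat list without traversing the Body tree.
    public static string Summarize(DoclingDocument doc)
    {
        int texts = doc.Texts?.Count ?? 0;
        int tables = doc.Tables?.Count ?? 0;
        int pictures = doc.Pictures?.Count ?? 0;
        int groups = doc.Groups?.Count ?? 0;

        return $"{doc.Name}: {texts} text item(s), {tables} table(s), " +
               $"{pictures} picture(s), {groups} group(s)";
    }
}
```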

public class DoclingDocument
Inheritance
DoclingDocument
Properties

Body

Root node of the main document body tree.

public GroupItem Body { get; set; }

Property Value

GroupItem

Furniture

Root node of the furniture tree, which holds page furniture such as headers and footers.

public GroupItem Furniture { get; set; }

Property Value

GroupItem

Groups

Group containers (list groups, chapters, sections, slides, sheets).

public List<GroupItem> Groups { get; set; }

Property Value

List<GroupItem>

Metadata

Document-level metadata extracted by the parser or loader (e.g., type, filename, extension, title, author, page_count).

public Dictionary<string, string> Metadata { get; set; }

Property Value

Dictionary<string, string>
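Since every metadata value is a string, numeric entries such as page_count need to be parsed by the caller. A minimal sketch, assuming the page_count key named above is present only for some loaders; the MetadataReader helper is hypothetical and not part of the library.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of Mythosia.Documents.
public static class MetadataReader
{
    // Returns the "page_count" entry as an int, or null when the key is
    // missing or its value is not a valid integer.
    public static int? GetPageCount(Dictionary<string, string> metadata)
    {
        if (metadata != null
            && metadata.TryGetValue("page_count", out var raw)
            && int.TryParse(raw, out var pages))
        {
            return pages;
        }

        return null;
    }
}
```

A caller would pass the document's dictionary directly, for example MetadataReader.GetPageCount(doc.Metadata).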

Name

The working name of this document, without the file extension.

public string Name { get; set; }

Property Value

string

Pictures

All picture items.

public List<PictureItem> Pictures { get; set; }

Property Value

List<PictureItem>

RawContent

Optional raw content that bypasses the body tree serialization. When set, ToMarkdown() returns this value directly instead of serializing the body tree via Document.MarkdownSerializer. Useful for plain-text loaders that should preserve content as-is.

public string? RawContent { get; set; }

Property Value

string
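The bypass described above might be used by a plain-text loader roughly as follows. This is a sketch, not the library's own loader code: it assumes DoclingDocument has an accessible parameterless constructor (none is listed on this page), and the file name is a made-up placeholder.

```csharp
using Mythosia.Documents;

// Assumption: a public parameterless constructor exists.
var doc = new DoclingDocument
{
    Name = "notes",
    Source = "notes.txt", // hypothetical source path
    RawContent = "Line one\nLine two"
};

// With RawContent set, ToMarkdown() is documented to return it directly
// instead of serializing the body tree.
string markdown = doc.ToMarkdown();
```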

Source

The source path, URL, or identifier from which this document was loaded. Set by the loader to carry provenance through the pipeline.

public string Source { get; set; }

Property Value

string

Tables

All table items.

public List<TableItem> Tables { get; set; }

Property Value

List<TableItem>

Texts

All text-based content items (paragraphs, headings, list items, code, formulas).

public List<TextItem> Texts { get; set; }

Property Value

List<TextItem>

Methods

AddCode(string, string, NodeItem?)

Adds a code block to the document body.

public CodeItem AddCode(string text, string language = "", NodeItem? parent = null)

Parameters

text string
language string
parent NodeItem

Returns

CodeItem

AddGroup(string, GroupLabel, NodeItem?)

Adds a group container to the document body.

public GroupItem AddGroup(string name = "group", GroupLabel label = GroupLabel.Unspecified, NodeItem? parent = null)

Parameters

name string
label GroupLabel
parent NodeItem

Returns

GroupItem

AddHeading(string, int, NodeItem?)

Adds a section heading to the document body.

public SectionHeaderItem AddHeading(string text, int level = 1, NodeItem? parent = null)

Parameters

text string
level int
parent NodeItem

Returns

SectionHeaderItem

AddListItem(string, bool, string, NodeItem?)

Adds a list item to the document body.

public DocListItem AddListItem(string text, bool enumerated = false, string marker = "-", NodeItem? parent = null)

Parameters

text string
enumerated bool
marker string
parent NodeItem

Returns

DocListItem
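The parent parameter suggests that list items can be nested under a group created with AddGroup. A minimal sketch under two assumptions that this page does not confirm: that DoclingDocument has an accessible parameterless constructor, and that GroupItem derives from NodeItem so it can be passed as parent. The group name and item texts are illustrative only.

```csharp
using Mythosia.Documents;

// Assumption: a public parameterless constructor exists.
var doc = new DoclingDocument { Name = "shopping" };

// Assumption: GroupItem is a NodeItem, so it is a valid parent.
GroupItem list = doc.AddGroup("list");

// Bulleted item using the default "-" marker.
DocListItem first = doc.AddListItem("Apples", parent: list);

// Enumerated item with an explicit marker.
DocListItem second = doc.AddListItem("Pears", enumerated: true, marker: "1.", parent: list);
```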

AddParagraph(string, NodeItem?)

Adds a paragraph to the document body.

public TextItem AddParagraph(string text, NodeItem? parent = null)

Parameters

text string
parent NodeItem

Returns

TextItem

AddPicture(NodeItem?)

Adds a picture to the document body.

public PictureItem AddPicture(NodeItem? parent = null)

Parameters

parent NodeItem

Returns

PictureItem

AddTable(TableData, NodeItem?)

Adds a table to the document body.

public TableItem AddTable(TableData data, NodeItem? parent = null)

Parameters

data TableData
parent NodeItem

Returns

TableItem

AddText(string, DocItemLabel, NodeItem?)

Adds a generic text item to the document body.

public TextItem AddText(string text, DocItemLabel label = DocItemLabel.Text, NodeItem? parent = null)

Parameters

text string
label DocItemLabel
parent NodeItem

Returns

TextItem

AddTitle(string, NodeItem?)

Adds a title item to the document body.

public TitleItem AddTitle(string text, NodeItem? parent = null)

Parameters

text string
parent NodeItem

Returns

TitleItem

ToMarkdown()

Serializes this document to Markdown format.

public string ToMarkdown()

Returns

string
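Taken together, the Add methods and ToMarkdown() allow a document to be built in code and then serialized. The sketch below uses only the signatures listed on this page and assumes an accessible parameterless constructor that leaves Body ready to accept items; the exact Markdown produced depends on the serializer, so no output is shown.

```csharp
using System;
using Mythosia.Documents;

// Assumption: a public parameterless constructor exists and initializes Body.
var doc = new DoclingDocument { Name = "guide" };

// Each call omits parent, so the default (null) is used.
doc.AddTitle("Getting Started");
doc.AddHeading("Installation", level: 2);
doc.AddParagraph("Install the package, then run the sample.");
doc.AddCode("dotnet run", language: "shell");

// RawContent is not set, so the body tree is serialized.
string markdown = doc.ToMarkdown();
Console.WriteLine(markdown);
```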