Class MarkdownTextSplitter

Namespace
Mythosia.AI.Rag.Splitters
Assembly
Mythosia.AI.Rag.dll

Structure-aware Markdown splitter that understands heading hierarchy (H1–H6), preserves atomic blocks (code fences, tables), and prepends heading breadcrumbs to each chunk so that vector search retrieves contextually rich fragments.

public class MarkdownTextSplitter : ITextSplitter
Inheritance
MarkdownTextSplitter
Implements
ITextSplitter

Constructors

MarkdownTextSplitter()

public MarkdownTextSplitter()

MarkdownTextSplitter(int, int)

public MarkdownTextSplitter(int chunkSize, int chunkOverlap = 200)

Parameters

chunkSize int
chunkOverlap int

Properties

ChunkOverlap

Number of overlapping characters carried from the previous chunk.

public int ChunkOverlap { get; set; }

Property Value

int

ChunkSize

Maximum characters per chunk (excluding the prepended breadcrumb).

public int ChunkSize { get; set; }

Property Value

int

IncludeHeadingBreadcrumb

When true, each chunk is prefixed with the heading path that leads to its content (e.g. "# Doc Title\n## Section\n### Sub-section\n\n"). Each chunk then carries the titles of its enclosing sections, which improves retrieval relevance for fragments that would otherwise lack context. Default is true.

public bool IncludeHeadingBreadcrumb { get; set; }

Property Value

bool
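
The breadcrumb format described above can be illustrated with a small standalone sketch. This is not the library's implementation: BreadcrumbSketch, HeadingLevel, and BuildBreadcrumb are hypothetical names, and the sketch assumes ATX-style headings ("#" followed by a space) and does not skip heading-like lines inside code fences.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Mythosia.AI.Rag.
public static class BreadcrumbSketch
{
    // Returns the level (1-6) of an ATX heading line, or 0 if the line is not a heading.
    public static int HeadingLevel(string line)
    {
        int hashes = 0;
        while (hashes < line.Length && line[hashes] == '#') hashes++;
        bool isHeading = hashes >= 1 && hashes <= 6
            && hashes < line.Length && line[hashes] == ' ';
        return isHeading ? hashes : 0;
    }

    // Builds the heading path that is in effect after reading all the given lines.
    public static string BuildBreadcrumb(IEnumerable<string> lines)
    {
        var path = new List<(int Level, string Text)>();
        foreach (var line in lines)
        {
            int level = HeadingLevel(line);
            if (level == 0) continue;

            // A new heading replaces any heading at the same or a deeper level.
            path.RemoveAll(h => h.Level >= level);
            path.Add((level, line.TrimEnd()));
        }

        if (path.Count == 0) return "";
        // One heading per line, followed by a blank line before the chunk content.
        return string.Join("\n", path.Select(h => h.Text)) + "\n\n";
    }
}
```

Under these assumptions, content that sits beneath "### Sub-section" inside "## Section" inside "# Doc Title" gets the three-line prefix shown in the property description, while a later sibling heading such as "## Other" resets the path to "# Doc Title" and "## Other".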

MinSplitHeadingLevel

Minimum heading level that triggers a new section split. 1 = split on all headings (#–######), 2 = ignore H1, etc. Default: 1.

public int MinSplitHeadingLevel { get; set; }

Property Value

int
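
One reading of this setting is that a heading starts a new section only when its level is greater than or equal to MinSplitHeadingLevel. The standalone sketch below illustrates that interpretation; it is an assumption about the behavior, not the library's code, and SplitLevelSketch, HeadingLevel, and SplitSections are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Mythosia.AI.Rag.
public static class SplitLevelSketch
{
    // Returns the level (1-6) of an ATX heading line, or 0 if the line is not a heading.
    public static int HeadingLevel(string line)
    {
        int hashes = 0;
        while (hashes < line.Length && line[hashes] == '#') hashes++;
        bool isHeading = hashes >= 1 && hashes <= 6
            && hashes < line.Length && line[hashes] == ' ';
        return isHeading ? hashes : 0;
    }

    // Groups lines into sections, starting a new section at every heading
    // whose level is at least minSplitHeadingLevel (assumed semantics).
    public static List<List<string>> SplitSections(IEnumerable<string> lines, int minSplitHeadingLevel)
    {
        var sections = new List<List<string>>();
        var current = new List<string>();
        foreach (var line in lines)
        {
            int level = HeadingLevel(line);
            bool startsSection = level != 0 && level >= minSplitHeadingLevel;
            if (startsSection && current.Count > 0)
            {
                sections.Add(current);
                current = new List<string>();
            }
            current.Add(line);
        }
        if (current.Count > 0) sections.Add(current);
        return sections;
    }
}
```

With the lines "## A", "a", "# Part", "p", "## B", "b", this sketch yields three sections for a minimum level of 1 and two sections for a minimum level of 2, because the H1 line no longer triggers a split.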

Methods

Split(RagDocument)

Splits a document into chunks. Implementations may split by character count, token count, sentence boundary, etc.

public IReadOnlyList<RagChunk> Split(RagDocument document)

Parameters

document RagDocument

The document to split.

Returns

IReadOnlyList<RagChunk>

An ordered list of chunks.
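
A minimal usage sketch, assuming the Mythosia.AI.Rag assembly is referenced. It uses only the constructor, properties, and Split method documented on this page. The using directive for RagDocument and RagChunk is an assumption, since their namespace is not stated here, and MarkdownSplitExample and SplitMarkdown are hypothetical names. The numeric values are illustrative, not recommendations.

```csharp
using System.Collections.Generic;
using Mythosia.AI.Rag;            // assumption: namespace that declares RagDocument and RagChunk
using Mythosia.AI.Rag.Splitters;

// Hypothetical wrapper showing how the documented members fit together.
public static class MarkdownSplitExample
{
    public static IReadOnlyList<RagChunk> SplitMarkdown(RagDocument document)
    {
        // chunkOverlap is optional; the declared default is 200.
        var splitter = new MarkdownTextSplitter(chunkSize: 1000, chunkOverlap: 100)
        {
            IncludeHeadingBreadcrumb = true, // prefix each chunk with its heading path
            MinSplitHeadingLevel = 2         // H1 headings do not start a new section
        };

        // The result is ordered, so chunk positions follow the source document.
        return splitter.Split(document);
    }
}
```

Because ChunkSize excludes the prepended breadcrumb, the text of a chunk can be longer than chunkSize when IncludeHeadingBreadcrumb is true, which is worth accounting for if a downstream embedding model has a strict input limit.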