Namespace Mythosia.AI.Rag.Splitters

Classes

CharacterTextSplitter

Splits documents into chunks based on character count with configurable overlap.
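The idea of fixed-size chunks with overlap can be sketched in a few lines. This is a language-neutral illustration in Python, not the library's implementation: the function name `split_by_characters` and the parameters `chunk_size` and `chunk_overlap` are assumptions made for the example.

```python
def split_by_characters(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Cut text into windows of chunk_size characters; neighbors share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks
```

With `chunk_size=4` and `chunk_overlap=1`, each chunk repeats the last character of the previous one, which is the kind of shared context that overlap is meant to provide.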

MarkdownTextSplitter

Structure-aware Markdown splitter that understands heading hierarchy (H1–H6), preserves atomic blocks (code fences, tables), and prepends heading breadcrumbs to each chunk so that vector search retrieves contextually rich fragments.
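A minimal sketch of heading breadcrumbs and atomic code fences, written in Python for illustration only. It is not the library's code: the name `split_markdown`, the ` > ` breadcrumb separator, and the one-chunk-per-section policy are assumptions, and table handling and size limits are left out.

```python
import re

FENCE = "`" * 3  # Markdown code-fence marker
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_markdown(text: str) -> list[str]:
    """Emit one chunk per heading section, prefixed with its heading trail."""
    chunks: list[str] = []
    trail: list[tuple[int, str]] = []  # (level, title) of the enclosing headings
    body: list[str] = []
    in_fence = False

    def flush() -> None:
        content = "\n".join(body).strip()
        if content:
            breadcrumb = " > ".join(title for _, title in trail)
            chunks.append(f"{breadcrumb}\n{content}" if breadcrumb else content)
        body.clear()

    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # entering or leaving a code fence
            body.append(line)
            continue
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()  # close the previous section under its own trail
            level = len(match.group(1))
            while trail and trail[-1][0] >= level:
                trail.pop()  # drop sibling and deeper headings
            trail.append((level, match.group(2).strip()))
        else:
            body.append(line)
    flush()
    return chunks
```

Lines inside a fence are never tested against the heading pattern, so a `#` comment in a code sample stays with its block instead of starting a new section.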

RecursiveTextSplitter

Recursively splits text using an ordered list of separators (LangChain-style). At each level the best separator is chosen, small pieces are merged up to ChunkSize, and only oversized pieces recurse to the next separator.
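The recursive strategy can be sketched as follows. This Python version is illustrative and not the library's implementation: `recursive_split`, the default separator list, and the hard character cut used when no separator remains are assumptions, and "best separator" is interpreted here as the first separator in the list that occurs in the text.

```python
def recursive_split(
    text: str,
    chunk_size: int,
    separators: tuple[str, ...] = ("\n\n", "\n", " "),
) -> list[str]:
    """Split on the first usable separator, merge small pieces, recurse on oversized ones."""
    if len(text) <= chunk_size:
        return [text] if text else []

    # Assumed selection rule: first separator in the list that appears in the text.
    index = next((i for i, sep in enumerate(separators) if sep in text), None)
    if index is None:
        # No separator applies: fall back to a hard cut (an assumption of this sketch).
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, remaining = separators[index], separators[index + 1:]
    chunks: list[str] = []
    current = ""
    for piece in text.split(sep):
        if not piece:
            continue
        if len(piece) > chunk_size:
            if current:
                chunks.append(current)
                current = ""
            # Only oversized pieces move on to the finer separators.
            chunks.extend(recursive_split(piece, chunk_size, remaining))
            continue
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # merge small pieces up to chunk_size
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks
```

For example, with `chunk_size=7` the text `"aaa bbb\n\nccc ddd eee"` is first split on the blank line; the first paragraph fits as is, and only the second one is split again on spaces.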

TokenTextSplitter

Splits documents into chunks based on approximate token count using whitespace tokenization. For precise token counting with a specific model's tokenizer, consider extending this class.
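Whitespace tokenization only approximates what a model tokenizer would count, as the short Python sketch below shows. It is not the library's code; `split_by_tokens` and `chunk_size` are hypothetical names chosen for the example.

```python
def split_by_tokens(text: str, chunk_size: int = 256) -> list[str]:
    """Group whitespace-separated tokens into chunks of at most chunk_size tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    tokens = text.split()  # any run of whitespace separates tokens
    return [
        " ".join(tokens[start:start + chunk_size])
        for start in range(0, len(tokens), chunk_size)
    ]
```

A subword tokenizer typically produces more tokens than this count for the same text, which is why a model-specific tokenizer is the better choice when a hard token limit must be respected.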