Class CharacterTextSplitter

Namespace
Mythosia.AI.Rag.Splitters
Assembly
Mythosia.AI.Rag.dll

Splits documents into chunks based on character count with configurable overlap.

public class CharacterTextSplitter : ITextSplitter
Inheritance
CharacterTextSplitter
Implements
ITextSplitter

Constructors

CharacterTextSplitter()

public CharacterTextSplitter()

CharacterTextSplitter(int, int, string?)

public CharacterTextSplitter(int chunkSize, int chunkOverlap = 200, string? separator = "\n\n")

Parameters

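chunkSize int

Maximum number of characters per chunk.

chunkOverlap int

Number of overlapping characters between consecutive chunks. Defaults to 200.

separator string

Separator string to attempt to split on. Defaults to "\n\n". If null, splits at exact character boundaries.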
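
As a rough usage sketch, the constructor can be called with named arguments that mirror the signature above, and the public setters allow adjustments afterwards. The values 500, 50, and 100 are arbitrary illustration values, not recommendations from the library, and the snippet assumes the Mythosia.AI.Rag assembly is referenced by the project.

```csharp
using Mythosia.AI.Rag.Splitters;

// Named arguments follow the documented parameter names.
var splitter = new CharacterTextSplitter(chunkSize: 500, chunkOverlap: 50, separator: "\n");

// The properties expose public setters, so they can be changed after construction.
splitter.ChunkOverlap = 100;

// Per the Separator documentation, null means splitting at exact character boundaries.
splitter.Separator = null;
```

Named arguments keep the two int parameters from being swapped by accident, which is easy to do when chunkSize and chunkOverlap are passed positionally.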

Properties

ChunkOverlap

Number of overlapping characters between consecutive chunks.

public int ChunkOverlap { get; set; }

Property Value

int

ChunkSize

Maximum number of characters per chunk.

public int ChunkSize { get; set; }

Property Value

int

Separator

String at which the splitter tries to break the text (e.g., "\n\n", "\n", " "). If null, the text is split at exact character boundaries.

public string? Separator { get; set; }

Property Value

string
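
To make the interaction between ChunkSize and ChunkOverlap concrete, the following is a minimal, library-independent sketch of fixed-width windows, which corresponds to the exact-character-boundary case described for a null Separator. OverlapSketch and FixedWindows are hypothetical names introduced for illustration. The actual implementation in Mythosia.AI.Rag may differ, for example in argument validation or in how it treats the final chunk.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of Mythosia.AI.Rag.
public static class OverlapSketch
{
    // Cuts text into windows of at most chunkSize characters. Each window
    // starts (chunkSize - chunkOverlap) characters after the previous one,
    // so consecutive windows share chunkOverlap characters.
    public static List<string> FixedWindows(string text, int chunkSize, int chunkOverlap)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        // Assumption of this sketch: the overlap must be smaller than the chunk
        // size, otherwise the window would never advance.
        if (chunkOverlap < 0 || chunkOverlap >= chunkSize)
            throw new ArgumentOutOfRangeException(nameof(chunkOverlap));

        var chunks = new List<string>();
        int step = chunkSize - chunkOverlap;

        for (int start = 0; start < text.Length; start += step)
        {
            int length = Math.Min(chunkSize, text.Length - start);
            chunks.Add(text.Substring(start, length));

            // Stop once a window has reached the end of the text, so no
            // trailing chunk made up only of overlap is emitted.
            if (start + length >= text.Length) break;
        }

        return chunks;
    }
}
```

For instance, calling OverlapSketch.FixedWindows("abcdefghij", 4, 1) advances three characters per window, so each chunk begins with the last character of the previous one.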

Methods

Split(RagDocument)

Splits a document into chunks. This implementation splits by character count with configurable overlap.

public IReadOnlyList<RagChunk> Split(RagDocument document)

Parameters

document RagDocument

The document to split.

Returns

IReadOnlyList<RagChunk>

An ordered list of chunks.