
model_tokenizer

Shared tokenizer utilities aligned with the embedding model.

Uses the same tokenizer as the Granite embedding model to ensure accurate token counting for chunk sizing. The tokenizer is downloaded from Hugging Face and cached locally for fast, in-process tokenization. The tokenizer files are also saved to and loaded from an infrastructure bucket, so new service instances start faster.
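The loading path described above can be sketched as a lazily initialized, process-wide singleton. The model id, function name, and caching strategy below are assumptions for illustration, not confirmed details of this module:

```python
from functools import lru_cache

# Assumed model id; the source only says "the Granite embedding model".
GRANITE_MODEL_ID = "ibm-granite/granite-embedding-125m-english"


@lru_cache(maxsize=1)
def get_tokenizer():
    """Download the tokenizer once, then serve it from the in-process cache."""
    # Imported lazily so that merely importing this module stays cheap.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(GRANITE_MODEL_ID)
```

The bucket save/load step would wrap `from_pretrained` with a check of the shared bucket before falling back to Hugging Face; that logic is omitted here.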

Functions

count_tokens

def count_tokens(text: str) -> int

Return token count using the model's tokenizer.

encode

def encode(text: str) -> list[int]

Encode text into token ids without adding special tokens.

decode

def decode(tokens: Iterable[int]) -> str

Decode token ids back into text, skipping the cleanup step that would strip tokenization spaces.
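Together, encode and decode form a lossless round trip: no special tokens are inserted on the way in and no spaces are stripped on the way out. The sketch below illustrates that contract with a trivial whitespace tokenizer standing in for the real Granite tokenizer (an assumption made only so the example runs without a model download):

```python
# Stand-in vocabulary: each whitespace-delimited word maps to an integer id.
_vocab: dict[str, int] = {}
_words: list[str] = []


def encode(text: str) -> list[int]:
    """Encode text into token ids; no special tokens are added."""
    ids = []
    for word in text.split():
        if word not in _vocab:
            _vocab[word] = len(_words)
            _words.append(word)
        ids.append(_vocab[word])
    return ids


def decode(tokens) -> str:
    """Decode ids back to text; spacing is preserved, not cleaned up."""
    return " ".join(_words[t] for t in tokens)


ids = encode("hello tokenizer world")
assert decode(ids) == "hello tokenizer world"  # round trip is lossless
```

With the real Hugging Face tokenizer, the same properties come from passing `add_special_tokens=False` to encode and `clean_up_tokenization_spaces=False` to decode.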

split_by_tokens

def split_by_tokens(text: str, max_tokens: int) -> list[str]

Split text into segments that are each <= max_tokens.