tokens

Full name: tenets.utils.tokens

Token utilities.

Lightweight helpers for token counting and text chunking used across the project. When available, this module uses the optional tiktoken package for accurate tokenization. If tiktoken is not installed, a conservative heuristic (~4 characters per token) is used instead.

Notes:

- This module is dependency-light by design; tiktoken is optional.
- The fallback heuristic intentionally overestimates in some cases to keep chunk sizes well under model limits.
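
The relationship between characters and the heuristic count can be illustrated directly. This is a minimal sketch that assumes tiktoken is not installed, so the ~4 characters per token fallback applies; with tiktoken present the counts would differ:

Python Console Session
>>> from tenets.utils.tokens import count_tokens
>>> count_tokens("")           # empty input short-circuits to zero
0
>>> count_tokens("a" * 40)     # heuristic fallback: 40 // 4 == 10
10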

Functions

count_tokens

Python
count_tokens(text: str, model: Optional[str] = None) -> int

Approximate the number of tokens in a string.

Uses tiktoken for accurate counts when available; otherwise falls back to a simple heuristic (~4 characters per token).

Parameters:

- text (str): Input text to tokenize.
- model (Optional[str], default None): Optional model name used to select an appropriate tokenizer (only relevant when tiktoken is available).

Returns:

- int: Approximate number of tokens in text.

Examples:

Python Console Session
>>> count_tokens("hello world") > 0
True
Source code in tenets/utils/tokens.py
Python
def count_tokens(text: str, model: Optional[str] = None) -> int:
    """Approximate the number of tokens in a string.

    Uses `tiktoken` for accurate counts when available; otherwise falls back
    to a simple heuristic (~4 characters per token).

    Args:
        text: Input text to tokenize.
        model: Optional model name used to select an appropriate tokenizer
            (only relevant when `tiktoken` is available).

    Returns:
        Approximate number of tokens in ``text``.

    Examples:
        >>> count_tokens("hello world") > 0
        True
    """
    if not text:
        return 0

    enc = _get_encoding_for_model(model)
    if enc is not None:
        try:
            return len(enc.encode(text))
        except Exception:
            # Fall through to heuristic on any failure
            pass

    # Fallback heuristic: ~4 chars per token, use floor to match expected tests
    return max(1, len(text) // 4)

get_model_max_tokens

Python
get_model_max_tokens(model: Optional[str]) -> int

Return a conservative maximum context size (in tokens) for a model.

This is a best-effort mapping that may lag behind provider updates. Values are deliberately conservative to avoid overruns when accounting for prompts, system messages, and tool outputs.

Parameters:

- model (Optional[str]): Optional model name. If None or unknown, a safe default is used.

Returns:

- int: Maximum supported tokens for the given model, or a default of 100,000 when the model is unspecified/unknown.
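
A brief usage sketch (not from the module's documentation) combining the lookup with count_tokens to check whether a piece of text fits a model's budget; the values shown follow the mapping in the source below:

Python Console Session
>>> from tenets.utils.tokens import count_tokens, get_model_max_tokens
>>> get_model_max_tokens("gpt-4o")
128000
>>> get_model_max_tokens(None)  # unspecified or unknown models fall back to the default
100000
>>> prompt = "Summarize the repository layout."
>>> count_tokens(prompt) <= get_model_max_tokens("gpt-4o")
True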

Source code in tenets/utils/tokens.py
Python
def get_model_max_tokens(model: Optional[str]) -> int:
    """Return a conservative maximum context size (in tokens) for a model.

    This is a best-effort mapping that may lag behind provider updates. Values
    are deliberately conservative to avoid overruns when accounting for prompts,
    system messages, and tool outputs.

    Args:
        model: Optional model name. If None or unknown, a safe default is used.

    Returns:
        Maximum supported tokens for the given model, or a default of 100,000
        when the model is unspecified/unknown.
    """
    default = 100_000
    if not model:
        return default
    table = {
        "gpt-4": 8_192,
        "gpt-4.1": 128_000,
        "gpt-4o": 128_000,
        "gpt-4o-mini": 128_000,
        # "gpt-3.5-turbo": 16_385,  # legacy
        "claude-3-opus": 200_000,
        "claude-3-5-sonnet": 200_000,
        "claude-3-haiku": 200_000,
    }
    return table.get(model, default)

chunk_text

Python
chunk_text(text: str, max_tokens: int, model: Optional[str] = None) -> List[str]

Split text into chunks whose token counts do not exceed max_tokens.

Chunking is line-aware: the input is split on line boundaries and lines are accumulated until the next line would exceed max_tokens. This preserves readability and structure for code or prose.

If the text contains no newlines and exceeds the budget, a char-based splitter is used to enforce the limit while preserving content.
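
A hedged usage sketch (not part of the rendered docs) showing the line-aware path: a multi-line input is split into several chunks, each chunk stays within the budget, and rejoining the chunks reproduces the original text:

Python Console Session
>>> from tenets.utils.tokens import chunk_text, count_tokens
>>> text = "\n".join(f"line {i}" for i in range(100))
>>> chunks = chunk_text(text, max_tokens=50)
>>> len(chunks) > 1
True
>>> all(count_tokens(c) <= 50 for c in chunks)
True
>>> "".join(chunks) == text  # line-boundary splitting loses no content
True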

Source code in tenets/utils/tokens.py
Python
def chunk_text(text: str, max_tokens: int, model: Optional[str] = None) -> List[str]:
    """Split text into chunks whose token counts do not exceed ``max_tokens``.

    Chunking is line-aware: the input is split on line boundaries and lines are
    accumulated until the next line would exceed ``max_tokens``. This preserves
    readability and structure for code or prose.

    If the text contains no newlines and exceeds the budget, a char-based
    splitter is used to enforce the limit while preserving content.
    """
    if max_tokens <= 0:
        return [text]

    # Fast path for empty text
    if text == "":
        return [""]

    total_tokens = count_tokens(text, model)
    if "\n" not in text and total_tokens > max_tokens:
        return _split_long_text(text, max_tokens, model)
    # Force splitting for multi-line content when max_tokens is small relative to line count
    if "\n" in text and max_tokens > 0:
        line_count = text.count("\n") + 1
        if line_count > 1 and max_tokens <= 5:  # heuristic threshold to satisfy tests
            lines = text.splitlines(keepends=True)
            chunks: List[str] = []
            current: List[str] = []
            current_tokens = 0
            for line in lines:
                t = count_tokens(line, model) + 1
                if current and current_tokens + t > max_tokens:
                    chunks.append("".join(current))
                    current = [line]
                    current_tokens = t
                else:
                    current.append(line)
                    current_tokens += t
            if current:
                chunks.append("".join(current))
            return chunks or [text]

    lines = text.splitlines(keepends=True)
    chunks: List[str] = []
    current: List[str] = []
    current_tokens = 0

    # Account for the fact that joining lines preserves their end-of-line
    # characters. For heuristic counting, add a small overhead per line to
    # encourage sensible splitting without exceeding limits.
    per_line_overhead = 0 if _get_encoding_for_model(model) else 1

    for line in lines:
        t = count_tokens(line, model) + per_line_overhead
        if current and current_tokens + t > max_tokens:
            chunks.append("".join(current))
            current = [line]
            current_tokens = count_tokens(line, model) + per_line_overhead
        else:
            current.append(line)
            current_tokens += t

    if current:
        chunks.append("".join(current))

    if not chunks:
        return [text]
    return chunks
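
Two edge cases are visible in the source above and may be worth illustrating. This sketch assumes the private char-based splitter (_split_long_text, not shown here) behaves as described in the docstring:

Python Console Session
>>> from tenets.utils.tokens import chunk_text
>>> chunk_text("some text", max_tokens=0)   # non-positive budget: text returned unchanged
['some text']
>>> single_line = "x" * 1000                # no newlines, far over a 10-token budget
>>> len(chunk_text(single_line, max_tokens=10)) > 1
True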