tenets.core.distiller Package

Distiller module - Extract and aggregate relevant context from codebases.

The distiller is responsible for the main 'distill' command functionality:

1. Understanding what the user wants (prompt parsing)
2. Finding relevant files (discovery)
3. Ranking by importance (intelligence)
4. Packing within token limits (optimization)
5. Formatting for output (presentation)
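
As a rough orientation, the sketch below maps these five stages onto the components documented on this page. It is illustrative only; the import paths and the explicit stage-to-attribute mapping are assumptions, and distill() drives all of the stages internally.

Python
# Illustrative sketch, not the actual implementation. Import paths are assumptions.
from tenets.config import TenetsConfig
from tenets.core.distiller import Distiller

config = TenetsConfig()
distiller = Distiller(config)

# 1. prompt parsing    -> distiller.parser
# 2. file discovery    -> distiller.scanner
# 3. relevance ranking -> distiller.ranker
# 4. token packing     -> distiller.aggregator / distiller.optimizer
# 5. output formatting -> distiller.formatter
result = distiller.distill("implement OAuth2 authentication", paths="./src")
print(result.context)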

Classes

ContextAggregator

Python
ContextAggregator(config: TenetsConfig)

Aggregates files intelligently within token constraints.

Initialize the aggregator.

PARAMETER    DESCRIPTION
config       Tenets configuration
             TYPE: TenetsConfig

Attributes

config (instance-attribute)
Python
config = config
logger (instance-attribute)
Python
logger = get_logger(__name__)
strategies (instance-attribute)
Python
strategies = {
    'greedy': AggregationStrategy(name='greedy', max_full_files=20, summarize_threshold=0.6, min_relevance=0.05),
    'balanced': AggregationStrategy(name='balanced', max_full_files=10, summarize_threshold=0.7, min_relevance=0.08),
    'conservative': AggregationStrategy(name='conservative', max_full_files=5, summarize_threshold=0.8, min_relevance=0.15),
}
summarizer (property)
Python
summarizer

Lazy load summarizer when needed.
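
The snippet below is a minimal sketch of the lazy-loading property pattern this attribute suggests; the actual implementation and the summarizer's import path are assumptions.

Python
# Illustrative lazy-loading property; the Summarizer import path is hypothetical.
@property
def summarizer(self):
    if self._summarizer is None:
        from tenets.core.summarizer import Summarizer  # assumed location
        self._summarizer = Summarizer(self.config)
    return self._summarizer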

Functions

aggregate
Python
aggregate(
    files: List[FileAnalysis],
    prompt_context: PromptContext,
    max_tokens: int,
    model: Optional[str] = None,
    git_context: Optional[Dict[str, Any]] = None,
    strategy: str = 'balanced',
    full: bool = False,
    condense: bool = False,
    remove_comments: bool = False,
    docstring_weight: Optional[float] = None,
    summarize_imports: bool = True,
) -> Dict[str, Any]

Aggregate files within token budget.

PARAMETER       DESCRIPTION
files           Ranked files to aggregate
                TYPE: List[FileAnalysis]
prompt_context  Context about the prompt
                TYPE: PromptContext
max_tokens      Maximum token budget
                TYPE: int
model           Target model for token counting
                TYPE: Optional[str]  DEFAULT: None
git_context     Optional git context to include
                TYPE: Optional[Dict[str, Any]]  DEFAULT: None
strategy        Aggregation strategy to use
                TYPE: str  DEFAULT: 'balanced'

RETURNS         DESCRIPTION
Dict[str, Any]  Dictionary with aggregated content and metadata
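
A hedged usage sketch of aggregate(); ranked_files and prompt_context are assumed to come from the ranking and prompt-parsing stages, and the model name is an example value.

Python
# Illustrative call; exact result keys may differ.
aggregator = ContextAggregator(config)
aggregated = aggregator.aggregate(
    files=ranked_files,             # List[FileAnalysis] from the ranker
    prompt_context=prompt_context,  # PromptContext from the prompt parser
    max_tokens=50000,
    model="gpt-4o",                 # assumed model name for token counting
    strategy="greedy",              # 'greedy', 'balanced', or 'conservative'
)
print(aggregated.keys())            # inspect the aggregated content and metadata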

optimize_packing
Python
optimize_packing(files: List[FileAnalysis], max_tokens: int, model: Optional[str] = None) -> List[Tuple[FileAnalysis, bool]]

Optimize file packing using dynamic programming.

This is a more sophisticated packing algorithm that tries to maximize total relevance score within token constraints.

PARAMETER   DESCRIPTION
files       Files to pack
            TYPE: List[FileAnalysis]
max_tokens  Token budget
            TYPE: int
model       Model for token counting
            TYPE: Optional[str]  DEFAULT: None

RETURNS                          DESCRIPTION
List[Tuple[FileAnalysis, bool]]  List of (file, should_summarize) tuples
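
The sketch below illustrates the 0/1-knapsack idea behind dynamic-programming packing: maximize total relevance score subject to a token budget. It is not the library's implementation, only a self-contained example of the technique.

Python
# Illustrative 0/1 knapsack over (token_cost, relevance_score) pairs.
def knapsack_pack(items, budget):
    best = [0.0] * (budget + 1)
    keep = [[False] * (budget + 1) for _ in items]
    for i, (cost, score) in enumerate(items):
        for b in range(budget, cost - 1, -1):
            if best[b - cost] + score > best[b]:
                best[b] = best[b - cost] + score
                keep[i][b] = True
    # Walk back through the keep table to recover the chosen items.
    chosen, b = [], budget
    for i in range(len(items) - 1, -1, -1):
        if keep[i][b]:
            chosen.append(i)
            b -= items[i][0]
    return list(reversed(chosen))

print(knapsack_pack([(300, 0.9), (500, 0.6), (200, 0.7)], budget=600))  # -> [0, 2]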

Distiller

Python
Distiller(config: TenetsConfig)

Orchestrates context extraction from codebases.

The Distiller is the main engine that powers the 'distill' command. It coordinates all the components to extract the most relevant context based on a user's prompt.

Initialize the distiller with configuration.

PARAMETER    DESCRIPTION
config       Tenets configuration
             TYPE: TenetsConfig

Attributes

config (instance-attribute)
Python
config = config
logger (instance-attribute)
Python
logger = get_logger(__name__)
scanner (instance-attribute)
Python
scanner = FileScanner(config)
analyzer (instance-attribute)
Python
analyzer = CodeAnalyzer(config)
ranker (instance-attribute)
Python
ranker = RelevanceRanker(config)
parser (instance-attribute)
Python
parser = PromptParser(config)
git (instance-attribute)
Python
git = GitAnalyzer(config)
aggregator (instance-attribute)
Python
aggregator = ContextAggregator(config)
optimizer (instance-attribute)
Python
optimizer = TokenOptimizer(config)
formatter (instance-attribute)
Python
formatter = ContextFormatter(config)

Functions

distill
Python
distill(
    prompt: str,
    paths: Optional[Union[str, Path, List[Path]]] = None,
    *,
    format: str = 'markdown',
    model: Optional[str] = None,
    max_tokens: Optional[int] = None,
    mode: str = 'balanced',
    include_git: bool = True,
    session_name: Optional[str] = None,
    include_patterns: Optional[List[str]] = None,
    exclude_patterns: Optional[List[str]] = None,
    full: bool = False,
    condense: bool = False,
    remove_comments: bool = False,
    pinned_files: Optional[List[Path]] = None,
    include_tests: Optional[bool] = None,
    docstring_weight: Optional[float] = None,
    summarize_imports: bool = True,
) -> ContextResult

Distill relevant context from codebase based on prompt.

This is the main method that extracts, ranks, and aggregates the most relevant files and information for a given prompt.

PARAMETER         DESCRIPTION
prompt            The user's query or task description
                  TYPE: str
paths             Paths to analyze (default: current directory)
                  TYPE: Optional[Union[str, Path, List[Path]]]  DEFAULT: None
format            Output format (markdown, xml, json)
                  TYPE: str  DEFAULT: 'markdown'
model             Target LLM model for token counting
                  TYPE: Optional[str]  DEFAULT: None
max_tokens        Maximum tokens for context
                  TYPE: Optional[int]  DEFAULT: None
mode              Analysis mode (fast, balanced, thorough)
                  TYPE: str  DEFAULT: 'balanced'
include_git       Whether to include git context
                  TYPE: bool  DEFAULT: True
session_name      Session name for stateful context
                  TYPE: Optional[str]  DEFAULT: None
include_patterns  File patterns to include
                  TYPE: Optional[List[str]]  DEFAULT: None
exclude_patterns  File patterns to exclude
                  TYPE: Optional[List[str]]  DEFAULT: None

RETURNS        DESCRIPTION
ContextResult  ContextResult with the distilled context

Example

>>> distiller = Distiller(config)
>>> result = distiller.distill(
...     "implement OAuth2 authentication",
...     paths="./src",
...     mode="thorough",
...     max_tokens=50000
... )
>>> print(result.context)

ContextFormatter

Python
ContextFormatter(config: TenetsConfig)

Formats aggregated context for output.

Initialize the formatter.

PARAMETER    DESCRIPTION
config       Tenets configuration
             TYPE: TenetsConfig

Attributes

config (instance-attribute)
Python
config = config
logger (instance-attribute)
Python
logger = get_logger(__name__)

Functions

format
Python
format(aggregated: Dict[str, Any], format: str, prompt_context: PromptContext, session_name: Optional[str] = None) -> str

Format aggregated context for output.

PARAMETER       DESCRIPTION
aggregated      Aggregated context data containing files and statistics.
                TYPE: Dict[str, Any]
format          Output format (markdown, xml, json, html).
                TYPE: str
prompt_context  Original prompt context with task analysis.
                TYPE: PromptContext
session_name    Optional session name for context tracking.
                TYPE: Optional[str]  DEFAULT: None

RETURNS  DESCRIPTION
str      Formatted context string in the requested format.

RAISES      DESCRIPTION
ValueError  If format is not supported.
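
A hedged usage sketch of format(), falling back to markdown when an unsupported format is requested; the aggregated dict and prompt_context are assumed to come from the aggregation and prompt-parsing steps, and the session name is an arbitrary example.

Python
# Illustrative call; session_name is an example value.
formatter = ContextFormatter(config)
try:
    output = formatter.format(
        aggregated,
        format="html",
        prompt_context=prompt_context,
        session_name="oauth-work",
    )
except ValueError:
    # Raised when the requested format is not supported.
    output = formatter.format(aggregated, format="markdown", prompt_context=prompt_context)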

TokenOptimizer

Python
TokenOptimizer(config: TenetsConfig)

Optimizes token usage for maximum context value.

Initialize the optimizer.

PARAMETER    DESCRIPTION
config       Tenets configuration
             TYPE: TenetsConfig

Attributes

config (instance-attribute)
Python
config = config
logger (instance-attribute)
Python
logger = get_logger(__name__)

Functions

create_budget
Python
create_budget(model: Optional[str], max_tokens: Optional[int], prompt_tokens: int, has_git_context: bool = False, has_tenets: bool = False) -> TokenBudget

Create a token budget for context generation.

PARAMETER        DESCRIPTION
model            Target model name.
                 TYPE: Optional[str]
max_tokens       Optional hard cap on total tokens; overrides model default.
                 TYPE: Optional[int]
prompt_tokens    Tokens used by the prompt/instructions.
                 TYPE: int
has_git_context  Whether git context will be included.
                 TYPE: bool  DEFAULT: False
has_tenets       Whether tenets will be injected.
                 TYPE: bool  DEFAULT: False

RETURNS      DESCRIPTION
TokenBudget  Configured budget with reserves.
             TYPE: TokenBudget
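
A hedged usage sketch of create_budget(); the model name and token figures are illustrative values only.

Python
# Illustrative call; "gpt-4o" and the token counts are example values.
optimizer = TokenOptimizer(config)
budget = optimizer.create_budget(
    model="gpt-4o",
    max_tokens=50000,      # hard cap; overrides the model's default window
    prompt_tokens=350,     # tokens already consumed by the prompt/instructions
    has_git_context=True,  # reserve room for git context
    has_tenets=False,
)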

optimize_file_selection
Python
optimize_file_selection(files: List[FileAnalysis], budget: TokenBudget, strategy: str = 'balanced') -> List[Tuple[FileAnalysis, str]]

Optimize file selection within budget.

Uses different strategies to select which files to include and whether to summarize them.

PARAMETER  DESCRIPTION
files      Ranked files to consider
           TYPE: List[FileAnalysis]
budget     Token budget to work within
           TYPE: TokenBudget
strategy   Selection strategy (greedy, balanced, diverse)
           TYPE: str  DEFAULT: 'balanced'

RETURNS                         DESCRIPTION
List[Tuple[FileAnalysis, str]]  List of (file, action) tuples where action is 'full' or 'summary'
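
A hedged usage sketch of optimize_file_selection(), building on a budget created with create_budget(); ranked_files is assumed to be the ranker's output, and the .path attribute on FileAnalysis is an assumption.

Python
# Illustrative call over a ranked file list and a previously created budget.
selections = optimizer.optimize_file_selection(
    files=ranked_files,
    budget=budget,
    strategy="diverse",   # 'greedy', 'balanced', or 'diverse'
)
for file, action in selections:
    # action is either 'full' or 'summary'
    print(file.path, action)  # .path is an assumed FileAnalysis attribute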

estimate_tokens_for_git
Python
estimate_tokens_for_git(git_context: Optional[Dict[str, Any]]) -> int

Estimate tokens needed for git context.

estimate_tokens_for_tenets
Python
estimate_tokens_for_tenets(tenet_count: int, with_reinforcement: bool = False) -> int

Estimate tokens needed for tenet injection.
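
A hedged sketch of how the two estimators might be used when sizing a budget; git_context, the tenet count, and the surrounding arithmetic are illustrative assumptions, not documented library behavior.

Python
# Illustrative calls; the reserve calculation is an assumption.
git_tokens = optimizer.estimate_tokens_for_git(git_context)   # presumably 0 when git_context is None
tenet_tokens = optimizer.estimate_tokens_for_tenets(tenet_count=5, with_reinforcement=True)
reserved = git_tokens + tenet_tokens
print(f"Reserve roughly {reserved} tokens before packing files.")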

Modules