
config

Full name: tenets.config


Configuration management for Tenets with enhanced LLM and NLP support.

This module handles all configuration for the Tenets system, including loading from files and environment variables and providing defaults. Configuration can be specified at multiple levels with well-defined precedence.

Configuration precedence (highest to lowest):

1. Runtime parameters (passed to methods)
2. Environment variables (TENETS_*)
3. Project config file (.tenets.yml in project)
4. User config file (~/.config/tenets/config.yml)
5. Default values

The configuration system is designed to work with zero configuration (sensible defaults) while allowing full customization when needed.

The module also provides comprehensive LLM provider support for optional AI-powered features, plus centralized NLP configuration for all text processing operations.
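A minimal sketch of the precedence in practice (constructor arguments stand in for runtime parameters; the exact TENETS_* variable names are defined by the loader and not listed on this page):

```python
from pathlib import Path

from tenets.config import TenetsConfig

# Zero configuration: every subsystem falls back to its defaults
# (assuming no project or user config file is discovered).
config = TenetsConfig()
assert config.max_tokens == 100000

# An explicit project config file takes precedence over user-level config.
config = TenetsConfig(config_file=Path(".tenets.yml"))

# Values passed at construction time outrank file and environment sources.
config = TenetsConfig(max_tokens=50000, debug=True)
```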

Classes

NLPConfig (dataclass)

Python
NLPConfig(enabled: bool = True, stopwords_enabled: bool = True, code_stopword_set: str = 'minimal', prompt_stopword_set: str = 'aggressive', custom_stopword_files: List[str] = list(), tokenization_mode: str = 'auto', preserve_original_tokens: bool = True, split_camelcase: bool = True, split_snakecase: bool = True, min_token_length: int = 2, keyword_extraction_method: str = 'auto', max_keywords: int = 30, ngram_size: int = 3, yake_dedup_threshold: float = 0.7, tfidf_use_sublinear: bool = True, tfidf_use_idf: bool = True, tfidf_norm: str = 'l2', bm25_k1: float = 1.2, bm25_b: float = 0.75, embeddings_enabled: bool = False, embeddings_model: str = 'all-MiniLM-L6-v2', embeddings_device: str = 'auto', embeddings_cache: bool = True, embeddings_batch_size: int = 32, similarity_metric: str = 'cosine', similarity_threshold: float = 0.7, cache_embeddings_ttl_days: int = 30, cache_tfidf_ttl_days: int = 7, cache_keywords_ttl_days: int = 7, multiprocessing_enabled: bool = True, multiprocessing_workers: Optional[int] = None, multiprocessing_chunk_size: int = 100)

Configuration for centralized NLP (Natural Language Processing) system.

Controls all text processing operations including tokenization, keyword extraction, stopword filtering, embeddings, and similarity computation. All NLP operations are centralized in the tenets.core.nlp package.

| Attribute | Type | Description |
| --- | --- | --- |
| enabled | bool | Whether NLP features are enabled globally |
| stopwords_enabled | bool | Whether to use stopword filtering |
| code_stopword_set | str | Stopword set for code search (minimal) |
| prompt_stopword_set | str | Stopword set for prompt parsing (aggressive) |
| custom_stopword_files | List[str] | Additional custom stopword files |
| tokenization_mode | str | Tokenization mode ('code', 'text', 'auto') |
| preserve_original_tokens | bool | Keep original tokens for exact matching |
| split_camelcase | bool | Split camelCase and PascalCase identifiers |
| split_snakecase | bool | Split snake_case identifiers |
| min_token_length | int | Minimum token length to keep |
| keyword_extraction_method | str | Method for keyword extraction |
| max_keywords | int | Maximum keywords to extract |
| ngram_size | int | Maximum n-gram size for extraction |
| yake_dedup_threshold | float | YAKE deduplication threshold |
| tfidf_use_sublinear | bool | Use log scaling for term frequency |
| tfidf_use_idf | bool | Use inverse document frequency |
| tfidf_norm | str | Normalization method for TF-IDF |
| bm25_k1 | float | BM25 term frequency saturation parameter |
| bm25_b | float | BM25 length normalization parameter |
| embeddings_enabled | bool | Whether to use embeddings (requires ML) |
| embeddings_model | str | Default embedding model |
| embeddings_device | str | Device for embeddings ('auto', 'cpu', 'cuda') |
| embeddings_cache | bool | Whether to cache embeddings |
| embeddings_batch_size | int | Batch size for embedding generation |
| similarity_metric | str | Default similarity metric |
| similarity_threshold | float | Default similarity threshold |
| cache_embeddings_ttl_days | int | TTL for embedding cache in days |
| cache_tfidf_ttl_days | int | TTL for TF-IDF cache in days |
| cache_keywords_ttl_days | int | TTL for keyword cache in days |
| multiprocessing_enabled | bool | Enable multiprocessing for NLP operations |
| multiprocessing_workers | Optional[int] | Number of workers (None = cpu_count) |
| multiprocessing_chunk_size | int | Chunk size for parallel processing |
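A construction sketch (keyword names come from the signature above; 'yake' as an extraction method is an assumption suggested by yake_dedup_threshold, since the default is 'auto'):

```python
from tenets.config import NLPConfig

# Tune keyword extraction while keeping every other default.
nlp = NLPConfig(
    keyword_extraction_method="yake",  # assumed valid; default is "auto"
    max_keywords=50,
    ngram_size=2,
)

# Opt in to embeddings; this requires the optional ML dependencies.
nlp_ml = NLPConfig(embeddings_enabled=True, embeddings_device="cpu")
```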

LLMConfig (dataclass)

Python
LLMConfig(enabled: bool = False, provider: str = 'openai', fallback_providers: List[str] = (lambda: ['anthropic', 'openrouter'])(), api_keys: Dict[str, str] = (lambda: {'openai': '${OPENAI_API_KEY}', 'anthropic': '${ANTHROPIC_API_KEY}', 'openrouter': '${OPENROUTER_API_KEY}', 'cohere': '${COHERE_API_KEY}', 'together': '${TOGETHER_API_KEY}', 'huggingface': '${HUGGINGFACE_API_KEY}', 'replicate': '${REPLICATE_API_KEY}', 'ollama': ''})(), api_base_urls: Dict[str, str] = (lambda: {'openai': 'https://api.openai.com/v1', 'anthropic': 'https://api.anthropic.com/v1', 'openrouter': 'https://openrouter.ai/api/v1', 'ollama': 'http://localhost:11434'})(), models: Dict[str, str] = (lambda: {'default': 'gpt-4o-mini', 'summarization': 'gpt-3.5-turbo', 'analysis': 'gpt-4o', 'embeddings': 'text-embedding-3-small', 'code_generation': 'gpt-4o', 'semantic_search': 'text-embedding-3-small', 'anthropic_default': 'claude-3-haiku-20240307', 'anthropic_analysis': 'claude-3-sonnet-20240229', 'anthropic_code': 'claude-3-opus-20240229', 'ollama_default': 'llama2', 'ollama_code': 'codellama', 'ollama_embeddings': 'nomic-embed-text'})(), max_cost_per_run: float = 0.1, max_cost_per_day: float = 10.0, max_tokens_per_request: int = 4000, max_context_length: int = 100000, temperature: float = 0.3, top_p: float = 0.95, frequency_penalty: float = 0.0, presence_penalty: float = 0.0, requests_per_minute: int = 60, retry_on_error: bool = True, max_retries: int = 3, retry_delay: float = 1.0, retry_backoff: float = 2.0, timeout: int = 30, stream: bool = False, cache_responses: bool = True, cache_ttl_hours: int = 24, log_requests: bool = False, log_responses: bool = False, custom_headers: Dict[str, str] = dict(), organization_id: Optional[str] = None, project_id: Optional[str] = None)

Configuration for LLM (Large Language Model) integration.

Supports multiple providers and models with comprehensive cost controls, rate limiting, and fallback strategies. All LLM features are optional and disabled by default.

| Attribute | Type | Description |
| --- | --- | --- |
| enabled | bool | Whether LLM features are enabled globally |
| provider | str | Primary LLM provider (openai, anthropic, openrouter, litellm, ollama) |
| fallback_providers | List[str] | Ordered list of fallback providers if the primary fails |
| api_keys | Dict[str, str] | Mapping of provider to API key (may reference env vars) |
| api_base_urls | Dict[str, str] | Custom API endpoints for providers (e.g., for proxies) |
| models | Dict[str, str] | Model selection for different tasks |
| max_cost_per_run | float | Maximum cost in USD per execution run |
| max_cost_per_day | float | Maximum cost in USD per day |
| max_tokens_per_request | int | Maximum tokens per single request |
| max_context_length | int | Maximum context window to use |
| temperature | float | Sampling temperature (0.0-2.0; lower = more deterministic) |
| top_p | float | Nucleus sampling parameter |
| frequency_penalty | float | Frequency penalty for token repetition |
| presence_penalty | float | Presence penalty for topic repetition |
| requests_per_minute | int | Rate limit for API requests |
| retry_on_error | bool | Whether to retry failed requests |
| max_retries | int | Maximum number of retry attempts |
| retry_delay | float | Initial delay between retries in seconds |
| retry_backoff | float | Backoff multiplier for retry delays |
| timeout | int | Request timeout in seconds |
| stream | bool | Whether to stream responses |
| cache_responses | bool | Whether to cache LLM responses |
| cache_ttl_hours | int | Cache time-to-live in hours |
| log_requests | bool | Whether to log all LLM requests |
| log_responses | bool | Whether to log all LLM responses |
| custom_headers | Dict[str, str] | Additional headers for API requests |
| organization_id | Optional[str] | Organization ID for providers that support it |
| project_id | Optional[str] | Project ID for providers that support it |
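A sketch of enabling LLM support with tightened cost controls (values are illustrative):

```python
from tenets.config import LLMConfig

llm = LLMConfig(
    enabled=True,            # LLM features are disabled by default
    provider="anthropic",
    max_cost_per_run=0.05,   # per-run USD budget
    max_cost_per_day=2.0,    # daily USD budget
    temperature=0.0,         # deterministic sampling
)
```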

Functions
get_api_key
Python
get_api_key(provider: Optional[str] = None) -> Optional[str]

Get API key for a specific provider.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| provider | Optional[str] | None | Provider name (uses the default provider if not specified) |

Returns:

| Type | Description |
| --- | --- |
| Optional[str] | API key string, or None if not configured |

Source code in tenets/config.py
Python
def get_api_key(self, provider: Optional[str] = None) -> Optional[str]:
    """Get API key for a specific provider.

    Args:
        provider: Provider name (uses default if not specified)

    Returns:
        API key string or None if not configured
    """
    provider = provider or self.provider
    key = self.api_keys.get(provider)

    # Don't return placeholder values
    if key and key.startswith("${") and key.endswith("}"):
        return None

    return key
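The placeholder handling follows directly from the source above, assuming a bare dataclass construction with no environment expansion applied:

```python
from tenets.config import LLMConfig

llm = LLMConfig()

# The default entry is the literal placeholder "${OPENAI_API_KEY}",
# so no real key is returned.
assert llm.get_api_key("openai") is None

# A concrete key is returned as-is ("sk-example" is a dummy value).
llm.api_keys["openai"] = "sk-example"
assert llm.get_api_key("openai") == "sk-example"
```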
get_model
Python
get_model(task: str = 'default', provider: Optional[str] = None) -> str

Get model name for a specific task and provider.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| task | str | 'default' | Task type (default, summarization, analysis, etc.) |
| provider | Optional[str] | None | Provider name (uses the default provider if not specified) |

Returns:

| Type | Description |
| --- | --- |
| str | Model name string |

Source code in tenets/config.py
Python
def get_model(self, task: str = "default", provider: Optional[str] = None) -> str:
    """Get model name for a specific task and provider.

    Args:
        task: Task type (default, summarization, analysis, etc.)
        provider: Provider name (uses default if not specified)

    Returns:
        Model name string
    """
    provider = provider or self.provider

    # Try provider-specific model first
    provider_task = f"{provider}_{task}"
    if provider_task in self.models:
        return self.models[provider_task]

    # Fall back to general task model
    return self.models.get(task, self.models["default"])
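The lookup order (provider-specific entry, then task entry, then "default") can be traced against the default model table:

```python
from tenets.config import LLMConfig

llm = LLMConfig(provider="anthropic")

# 1. Provider-specific entry: "anthropic_default" exists.
assert llm.get_model() == "claude-3-haiku-20240307"

# 2. No "anthropic_summarization" entry, so the general task entry is used.
assert llm.get_model("summarization") == "gpt-3.5-turbo"

# 3. Unknown task: falls back to models["default"].
assert llm.get_model("translation") == "gpt-4o-mini"
```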
to_litellm_params
Python
to_litellm_params() -> Dict[str, Any]

Convert to parameters for LiteLLM library.

Returns:

| Type | Description |
| --- | --- |
| Dict[str, Any] | Dictionary of parameters compatible with LiteLLM |

Source code in tenets/config.py
Python
def to_litellm_params(self) -> Dict[str, Any]:
    """Convert to parameters for LiteLLM library.

    Returns:
        Dictionary of parameters compatible with LiteLLM
    """
    params = {
        "temperature": self.temperature,
        "top_p": self.top_p,
        "frequency_penalty": self.frequency_penalty,
        "presence_penalty": self.presence_penalty,
        "max_tokens": self.max_tokens_per_request,
        "timeout": self.timeout,
        "stream": self.stream,
    }

    # Add API key if available
    api_key = self.get_api_key()
    if api_key:
        params["api_key"] = api_key

    # Add custom base URL if specified
    if self.provider in self.api_base_urls:
        params["api_base"] = self.api_base_urls[self.provider]

    # Add organization/project IDs if specified
    if self.organization_id:
        params["organization"] = self.organization_id
    if self.project_id:
        params["project"] = self.project_id

    # Add custom headers
    if self.custom_headers:
        params["extra_headers"] = self.custom_headers

    return params
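A hand-off sketch (the litellm call shape in the comment is an assumption, not defined in this module):

```python
from tenets.config import LLMConfig

llm = LLMConfig(provider="ollama")
params = llm.to_litellm_params()

assert params["max_tokens"] == 4000
assert params["api_base"] == "http://localhost:11434"  # ollama default URL
assert "api_key" not in params  # the default ollama key is empty, so none is set

# Typical usage would then be roughly:
#   litellm.completion(model=llm.get_model(), messages=..., **params)
```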

ScannerConfig (dataclass)

Python
ScannerConfig(respect_gitignore: bool = True, follow_symlinks: bool = False, max_file_size: int = 5000000, max_files: int = 10000, binary_check: bool = True, encoding: str = 'utf-8', additional_ignore_patterns: List[str] = (lambda: ['*.pyc', '*.pyo', '__pycache__', '*.so', '*.dylib', '*.dll', '*.egg-info', '*.dist-info', '.tox', '.nox', '.coverage', '.hypothesis', '.pytest_cache', '.mypy_cache', '.ruff_cache'])(), additional_include_patterns: List[str] = list(), workers: int = 4, parallel_mode: str = 'auto', timeout: float = 5.0, exclude_minified: bool = True, minified_patterns: List[str] = (lambda: ['*.min.js', '*.min.css', 'bundle.js', '*.bundle.js', '*.bundle.css', '*.production.js', '*.prod.js', 'vendor.prod.js', '*.dist.js', '*.compiled.js', '*.minified.*', '*.uglified.*'])(), build_directory_patterns: List[str] = (lambda: ['dist/', 'build/', 'out/', 'output/', 'public/', 'static/generated/', '.next/', '_next/', 'node_modules/'])(), exclude_tests_by_default: bool = True, test_patterns: List[str] = (lambda: ['test_*.py', '*_test.py', 'test*.py', '*.test.js', '*.spec.js', '*.test.ts', '*.spec.ts', '*.test.jsx', '*.spec.jsx', '*.test.tsx', '*.spec.tsx', '*Test.java', '*Tests.java', '*TestCase.java', '*Test.cs', '*Tests.cs', '*TestCase.cs', '*_test.go', 'test_*.go', '*_test.rb', '*_spec.rb', 'test_*.rb', '*Test.php', '*_test.php', 'test_*.php', '*_test.rs', 'test_*.rs', '**/test/**', '**/tests/**', '**/*test*/**'])(), test_directories: List[str] = (lambda: ['test', 'tests', '__tests__', 'spec', 'specs', 'testing', 'test_*', '*_test', '*_tests', 'unit_tests', 'integration_tests', 'e2e', 'e2e_tests', 'functional_tests', 'acceptance_tests', 'regression_tests'])())

Configuration for file scanning subsystem.

Controls how tenets discovers and filters files in a codebase.

| Attribute | Type | Description |
| --- | --- | --- |
| respect_gitignore | bool | Whether to respect .gitignore files |
| follow_symlinks | bool | Whether to follow symbolic links |
| max_file_size | int | Maximum file size in bytes to analyze |
| max_files | int | Maximum number of files to scan |
| binary_check | bool | Whether to check for and skip binary files |
| encoding | str | Default file encoding |
| additional_ignore_patterns | List[str] | Extra patterns to ignore |
| additional_include_patterns | List[str] | Extra patterns to include |
| workers | int | Number of parallel workers for scanning |
| parallel_mode | str | Parallel execution mode ('thread', 'process', or 'auto') |
| timeout | float | Per-file analysis timeout used in parallel execution (seconds) |
| exclude_minified | bool | Whether to exclude minified/bundled files |
| minified_patterns | List[str] | Patterns that identify minified or bundled files |
| build_directory_patterns | List[str] | Patterns that identify build output directories |
| exclude_tests_by_default | bool | Whether to exclude test files by default |
| test_patterns | List[str] | Filename patterns that identify test files |
| test_directories | List[str] | Directory names that identify test code |
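A sketch of a stricter scanner setup (the patterns shown are illustrative):

```python
from tenets.config import ScannerConfig

scanner = ScannerConfig(
    max_file_size=1_000_000,          # skip files over ~1 MB
    exclude_tests_by_default=False,   # keep test files in scope
    additional_ignore_patterns=["*.generated.ts", "fixtures/"],
)
```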

RankingConfig (dataclass)

Python
RankingConfig(algorithm: str = 'balanced', threshold: float = 0.1, text_similarity_algorithm: str = 'bm25', use_tfidf: bool = True, use_stopwords: bool = False, use_embeddings: bool = False, use_git: bool = True, use_ml: bool = False, embedding_model: str = 'all-MiniLM-L6-v2', custom_weights: Dict[str, float] = (lambda: {'keyword_match': 0.25, 'path_relevance': 0.2, 'import_graph': 0.2, 'git_activity': 0.15, 'file_type': 0.1, 'complexity': 0.1})(), workers: int = 2, parallel_mode: str = 'auto', batch_size: int = 100)

Configuration for relevance ranking system.

Controls how files are scored and ranked for relevance to prompts. Uses centralized NLP components for all text processing.

| Attribute | Type | Description |
| --- | --- | --- |
| algorithm | str | Default ranking algorithm (fast, balanced, thorough, ml) |
| threshold | float | Minimum relevance score to include a file |
| text_similarity_algorithm | str | Text similarity algorithm ('bm25' or 'tfidf'; default 'bm25') |
| use_tfidf | bool | Whether to use TF-IDF for keyword matching (deprecated; use text_similarity_algorithm) |
| use_stopwords | bool | Whether to apply stopword filtering |
| use_embeddings | bool | Whether to use semantic embeddings (requires ML) |
| use_git | bool | Whether to include git signals in ranking |
| use_ml | bool | Whether to enable ML features (uses NLP embeddings) |
| embedding_model | str | Which embedding model to use |
| custom_weights | Dict[str, float] | Custom weights for ranking factors |
| workers | int | Number of parallel workers for ranking |
| parallel_mode | str | Parallel execution mode ('thread', 'process', or 'auto') |
| batch_size | int | Batch size for ML operations |
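A sketch of a custom weighting scheme (factor names are taken from the default custom_weights mapping; the values are illustrative):

```python
from tenets.config import RankingConfig

ranking = RankingConfig(
    algorithm="thorough",
    text_similarity_algorithm="bm25",
    custom_weights={
        "keyword_match": 0.4,   # emphasize direct keyword hits
        "path_relevance": 0.3,
        "import_graph": 0.1,
        "git_activity": 0.1,
        "file_type": 0.05,
        "complexity": 0.05,
    },
)
```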

SummarizerConfig (dataclass)

Python
SummarizerConfig(default_mode: str = 'auto', target_ratio: float = 0.3, enable_cache: bool = True, preserve_code_structure: bool = True, summarize_imports: bool = True, import_summary_threshold: int = 5, max_cache_size: int = 100, llm_provider: Optional[str] = None, llm_model: Optional[str] = None, llm_temperature: float = 0.3, llm_max_tokens: int = 500, enable_ml_strategies: bool = True, quality_threshold: str = 'medium', batch_size: int = 10, docs_context_aware: bool = True, docs_show_in_place_context: bool = True, docs_context_search_depth: int = 2, docs_context_min_confidence: float = 0.6, docs_context_max_sections: int = 10, docs_context_preserve_examples: bool = True, docstring_weight: float = 0.5, include_all_signatures: bool = True)

Configuration for content summarization system.

Controls how text and code are compressed to fit within token limits.

| Attribute | Type | Description |
| --- | --- | --- |
| default_mode | str | Default summarization mode (extractive, compressive, textrank, transformer, llm, auto) |
| target_ratio | float | Default target compression ratio (0.3 = 30% of original) |
| enable_cache | bool | Whether to cache summaries |
| preserve_code_structure | bool | Whether to preserve imports/signatures in code |
| summarize_imports | bool | Whether to condense imports into a summary (default: True) |
| import_summary_threshold | int | Number of imports that triggers summarization (default: 5) |
| max_cache_size | int | Maximum number of cached summaries |
| llm_provider | Optional[str] | LLM provider for LLM mode (falls back to global LLM config) |
| llm_model | Optional[str] | LLM model to use (falls back to global LLM config) |
| llm_temperature | float | LLM sampling temperature |
| llm_max_tokens | int | Maximum tokens for the LLM response |
| enable_ml_strategies | bool | Whether to enable ML-based strategies |
| quality_threshold | str | Quality threshold for auto mode selection |
| batch_size | int | Batch size for parallel processing |
| docs_context_aware | bool | Whether to enable context-aware summarization for documentation files |
| docs_show_in_place_context | bool | When enabled, preserves and highlights relevant context in documentation summaries instead of generic structure |
| docs_context_search_depth | int | How deep to search for contextual references (1 = direct mentions, 2 = semantic similarity, 3 = deep analysis) |
| docs_context_min_confidence | float | Minimum confidence threshold for context relevance (0.0-1.0) |
| docs_context_max_sections | int | Maximum number of contextual sections to preserve per document |
| docs_context_preserve_examples | bool | Whether to always preserve code examples and snippets in documentation |
| docstring_weight | float | Relative weight given to docstrings when summarizing code |
| include_all_signatures | bool | Whether to include all code signatures in summaries |
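A construction sketch favoring cheap, deterministic summarization (values are illustrative):

```python
from tenets.config import SummarizerConfig

summarizer = SummarizerConfig(
    default_mode="extractive",        # avoid ML/LLM strategies entirely
    target_ratio=0.2,                 # aim for ~20% of the original size
    docs_context_aware=True,
    docs_context_min_confidence=0.8,  # keep only high-confidence context
)
```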

TenetConfig (dataclass)

Python
TenetConfig(auto_instill: bool = True, max_per_context: int = 5, reinforcement: bool = True, injection_strategy: str = 'strategic', min_distance_between: int = 1000, prefer_natural_breaks: bool = True, storage_path: Optional[Path] = None, collections_enabled: bool = True, injection_frequency: str = 'adaptive', injection_interval: int = 3, session_complexity_threshold: float = 0.7, min_session_length: int = 1, adaptive_injection: bool = True, track_injection_history: bool = True, decay_rate: float = 0.1, reinforcement_interval: int = 10, session_aware: bool = True, session_memory_limit: int = 100, persist_session_history: bool = True, complexity_weight: float = 0.5, priority_boost_critical: float = 2.0, priority_boost_high: float = 1.5, skip_low_priority_on_complex: bool = True, track_effectiveness: bool = True, effectiveness_window_days: int = 30, min_compliance_score: float = 0.6, system_instruction: Optional[str] = None, system_instruction_enabled: bool = False, system_instruction_position: str = 'top', system_instruction_format: str = 'markdown', system_instruction_once_per_session: bool = True)

Configuration for the tenet (guiding principles) system.

Controls how tenets are managed and injected into context, including smart injection frequency, session tracking, and adaptive behavior.

| Attribute | Type | Description |
| --- | --- | --- |
| auto_instill | bool | Whether to automatically apply tenets to context |
| max_per_context | int | Maximum tenets to inject per context |
| reinforcement | bool | Whether to reinforce critical tenets |
| injection_strategy | str | Default injection strategy ('strategic', 'top', 'distributed') |
| min_distance_between | int | Minimum character distance between injections |
| prefer_natural_breaks | bool | Whether to inject at natural break points |
| storage_path | Optional[Path] | Where to store the tenet database |
| collections_enabled | bool | Whether to enable tenet collections |
| injection_frequency | str | How often to inject tenets ('always', 'periodic', 'adaptive', 'manual') |
| injection_interval | int | Numeric interval for periodic injection (e.g., every 3rd distill) |
| session_complexity_threshold | float | Complexity threshold for smart injection (0-1) |
| min_session_length | int | Minimum session length before first injection |
| adaptive_injection | bool | Enable adaptive injection based on context analysis |
| track_injection_history | bool | Track injection history per session for smarter decisions |
| decay_rate | float | How quickly tenet importance decays (0-1; higher = faster decay) |
| reinforcement_interval | int | How often to reinforce critical tenets (every N injections) |
| session_aware | bool | Enable session-aware injection patterns |
| session_memory_limit | int | Maximum sessions to track in memory |
| persist_session_history | bool | Save session histories to disk |
| complexity_weight | float | Weight given to complexity in injection decisions (0-1) |
| priority_boost_critical | float | Boost factor for critical-priority tenets |
| priority_boost_high | float | Boost factor for high-priority tenets |
| skip_low_priority_on_complex | bool | Skip low-priority tenets when complexity exceeds the threshold |
| track_effectiveness | bool | Track tenet effectiveness metrics |
| effectiveness_window_days | int | Days to consider for effectiveness analysis |
| min_compliance_score | float | Minimum compliance score before reinforcement |

System instruction (system prompt) configuration:

| Attribute | Type | Description |
| --- | --- | --- |
| system_instruction | Optional[str] | Optional text to inject as foundational context |
| system_instruction_enabled | bool | Enable auto-injection when an instruction exists |
| system_instruction_position | str | Where to inject ('top', 'after_header', 'before_content') |
| system_instruction_format | str | Format of the instruction ('markdown', 'xml', 'comment', 'plain') |
| system_instruction_once_per_session | bool | Inject once per session; if no session, inject on every distill |

Attributes
injection_config (property)
Python
injection_config: Dict[str, Any]

Get injection configuration as dictionary for TenetInjector.
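A sketch of periodic injection with a system instruction (the instruction text is illustrative):

```python
from tenets.config import TenetConfig

tenet = TenetConfig(
    injection_frequency="periodic",
    injection_interval=5,  # inject on every 5th distill
    system_instruction="Prefer small, reviewable changes.",  # illustrative
    system_instruction_enabled=True,
)

# Dictionary form handed to TenetInjector.
injection = tenet.injection_config
```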

CacheConfig (dataclass)

Python
CacheConfig(enabled: bool = True, directory: Optional[Path] = None, ttl_days: int = 7, max_size_mb: int = 500, compression: bool = False, memory_cache_size: int = 1000, sqlite_pragmas: Dict[str, str] = (lambda: {'journal_mode': 'WAL', 'synchronous': 'NORMAL', 'cache_size': '-64000', 'temp_store': 'MEMORY'})(), max_age_hours: int = 24, llm_cache_enabled: bool = True, llm_cache_ttl_hours: int = 24)

Configuration for caching system.

Controls cache behavior for analysis results and other expensive operations.

| Attribute | Type | Description |
| --- | --- | --- |
| enabled | bool | Whether caching is enabled |
| directory | Optional[Path] | Cache directory path |
| ttl_days | int | Time-to-live for cache entries in days |
| max_size_mb | int | Maximum cache size in megabytes |
| compression | bool | Whether to compress cached data |
| memory_cache_size | int | Number of items in the in-memory cache |
| sqlite_pragmas | Dict[str, str] | SQLite performance settings |
| max_age_hours | int | Maximum age for certain cached entries (used by the analyzer) |
| llm_cache_enabled | bool | Whether to cache LLM responses |
| llm_cache_ttl_hours | int | TTL for the LLM response cache |
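A construction sketch (the directory shown is an illustrative choice; the pragma assertion reflects the default in the signature above):

```python
from pathlib import Path

from tenets.config import CacheConfig

cache = CacheConfig(
    directory=Path.home() / ".cache" / "tenets",  # illustrative location
    ttl_days=14,
    max_size_mb=1024,
)

# The default pragmas enable WAL journaling for concurrent readers.
assert cache.sqlite_pragmas["journal_mode"] == "WAL"
```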

OutputConfig (dataclass)

Python
OutputConfig(default_format: str = 'markdown', syntax_highlighting: bool = True, line_numbers: bool = False, max_line_length: int = 120, include_metadata: bool = True, compression_threshold: int = 10000, summary_ratio: float = 0.25, copy_on_distill: bool = False, show_token_usage: bool = True, show_cost_estimate: bool = True)

Configuration for output formatting.

Controls how context and analysis results are formatted.

| Attribute | Type | Description |
| --- | --- | --- |
| default_format | str | Default output format (markdown, xml, json) |
| syntax_highlighting | bool | Whether to enable syntax highlighting |
| line_numbers | bool | Whether to include line numbers |
| max_line_length | int | Maximum line length before wrapping |
| include_metadata | bool | Whether to include metadata in output |
| compression_threshold | int | File size threshold for summarization |
| summary_ratio | float | Target compression ratio for summaries |
| copy_on_distill | bool | Automatically copy distill output to the clipboard when true |
| show_token_usage | bool | Whether to show token usage statistics |
| show_cost_estimate | bool | Whether to show cost estimates for LLM operations |
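A minimal construction sketch using fields from the signature above:

```python
from tenets.config import OutputConfig

output = OutputConfig(
    default_format="json",
    line_numbers=True,
    copy_on_distill=True,  # copy distill output straight to the clipboard
)
```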

GitConfig (dataclass)

Python
GitConfig(enabled: bool = True, include_history: bool = True, history_limit: int = 100, include_blame: bool = False, include_stats: bool = True, ignore_authors: List[str] = (lambda: ['dependabot[bot]', 'github-actions[bot]', 'renovate[bot]'])(), main_branches: List[str] = (lambda: ['main', 'master', 'develop', 'trunk'])())

Configuration for git integration.

Controls how git information is gathered and used.

| Attribute | Type | Description |
| --- | --- | --- |
| enabled | bool | Whether git integration is enabled |
| include_history | bool | Whether to include commit history |
| history_limit | int | Maximum number of commits to include |
| include_blame | bool | Whether to include git blame info |
| include_stats | bool | Whether to include statistics |
| ignore_authors | List[str] | Authors to ignore in analysis |
| main_branches | List[str] | Branch names considered "main" |
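A construction sketch ("release-bot" is an illustrative author name, not one of the defaults):

```python
from tenets.config import GitConfig

git = GitConfig(
    history_limit=50,
    include_blame=True,
    ignore_authors=["dependabot[bot]", "release-bot"],  # second entry illustrative
)
```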

TenetsConfig (dataclass)

Python
TenetsConfig(config_file: Optional[Path] = None, project_root: Optional[Path] = None, max_tokens: int = 100000, version: str = '0.1.0', debug: bool = False, quiet: bool = False, scanner: ScannerConfig = ScannerConfig(), ranking: RankingConfig = RankingConfig(), summarizer: SummarizerConfig = SummarizerConfig(), tenet: TenetConfig = TenetConfig(), cache: CacheConfig = CacheConfig(), output: OutputConfig = OutputConfig(), git: GitConfig = GitConfig(), llm: LLMConfig = LLMConfig(), nlp: NLPConfig = NLPConfig(), custom: Dict[str, Any] = dict())

Main configuration for the Tenets system with LLM and NLP support.

This is the root configuration object that contains all subsystem configs and global settings. It handles loading from files, environment variables, and provides sensible defaults.

| Attribute | Type | Description |
| --- | --- | --- |
| config_file | Optional[Path] | Path to configuration file (if any) |
| project_root | Optional[Path] | Root directory of the project |
| max_tokens | int | Default maximum tokens for context |
| version | str | Tenets version (for compatibility checking) |
| debug | bool | Enable debug mode |
| quiet | bool | Suppress non-essential output |
| scanner | ScannerConfig | Scanner subsystem configuration |
| ranking | RankingConfig | Ranking subsystem configuration |
| summarizer | SummarizerConfig | Summarizer subsystem configuration |
| tenet | TenetConfig | Tenet subsystem configuration |
| cache | CacheConfig | Cache subsystem configuration |
| output | OutputConfig | Output formatting configuration |
| git | GitConfig | Git integration configuration |
| llm | LLMConfig | LLM integration configuration |
| nlp | NLPConfig | NLP system configuration |
| custom | Dict[str, Any] | Custom user configuration |

Attributes
exclude_minified (property, writable)
Python
exclude_minified: bool

Get exclude_minified setting from scanner config.

minified_patterns (property, writable)
Python
minified_patterns: List[str]

Get minified patterns from scanner config.

build_directory_patterns (property, writable)
Python
build_directory_patterns: List[str]

Get build directory patterns from scanner config.

cache_dir (property, writable)
Python
cache_dir: Path

Get the cache directory path.

scanner_workers (property)
Python
scanner_workers: int

Get number of scanner workers.

ranking_workers (property)
Python
ranking_workers: int

Get number of ranking workers.

ranking_algorithm (property)
Python
ranking_algorithm: str

Get the ranking algorithm.

summarizer_mode (property)
Python
summarizer_mode: str

Get the default summarizer mode.

summarizer_ratio (property)
Python
summarizer_ratio: float

Get the default summarization target ratio.

respect_gitignore (property, writable)
Python
respect_gitignore: bool

Whether to respect .gitignore files.

follow_symlinks (property, writable)

Python
follow_symlinks: bool

Whether to follow symbolic links.

additional_ignore_patterns (property, writable)
Python
additional_ignore_patterns: List[str]

Get additional ignore patterns.

auto_instill_tenets (property, writable)
Python
auto_instill_tenets: bool

Whether to automatically instill tenets.

max_tenets_per_context (property, writable)
Python
max_tenets_per_context: int

Maximum tenets to inject per context.

tenet_injection_config (property)
Python
tenet_injection_config: Dict[str, Any]

Get tenet injection configuration.

cache_ttl_days (property, writable)
Python
cache_ttl_days: int

Cache time-to-live in days.

max_cache_size_mb (property, writable)
Python
max_cache_size_mb: int

Maximum cache size in megabytes.

llm_enabled (property, writable)
Python
llm_enabled: bool

Whether LLM features are enabled.

llm_provider (property, writable)
Python
llm_provider: str

Get the current LLM provider.

nlp_enabled (property, writable)
Python
nlp_enabled: bool

Whether NLP features are enabled.

nlp_embeddings_enabled (property, writable)
Python
nlp_embeddings_enabled: bool

Whether NLP embeddings are enabled.
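A sketch of the convenience properties (the writable ones are assumed to proxy assignments back into the subsystem configs, consistent with their descriptions above):

```python
from tenets.config import TenetsConfig

config = TenetsConfig()

# Read-only shortcuts mirror the subsystem values.
assert config.ranking_algorithm == config.ranking.algorithm
assert config.scanner_workers == config.scanner.workers

# Writable properties update the underlying subsystem config (assumed).
config.respect_gitignore = False
assert config.scanner.respect_gitignore is False
```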

Functions
to_dict
Python
to_dict() -> Dict[str, Any]

Convert configuration to dictionary.

Returns:

| Type | Description |
| --- | --- |
| Dict[str, Any] | Dictionary representation of configuration |

Source code in tenets/config.py
Python
def to_dict(self) -> Dict[str, Any]:
    """Convert configuration to dictionary.

    Returns:
        Dictionary representation of configuration
    """

    def _as_serializable(obj):
        if isinstance(obj, Path):
            return str(obj)
        if isinstance(obj, dict):
            return {k: _as_serializable(v) for k, v in obj.items()}
        if isinstance(obj, list):
            return [_as_serializable(v) for v in obj]
        return obj

    data = {
        "max_tokens": self.max_tokens,
        "version": self.version,
        "debug": self.debug,
        "quiet": self.quiet,
        "scanner": asdict(self.scanner),
        "ranking": asdict(self.ranking),
        "summarizer": asdict(self.summarizer),
        "tenet": asdict(self.tenet),
        "cache": asdict(self.cache),
        "output": asdict(self.output),
        "git": asdict(self.git),
        "llm": asdict(self.llm),
        "nlp": asdict(self.nlp),
        "custom": self.custom,
    }
    return _as_serializable(data)
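A usage sketch; the expected values follow from the dataclass defaults shown earlier:

```python
from tenets.config import TenetsConfig

data = TenetsConfig().to_dict()

# Subsystem configs become nested dictionaries; Path values become strings.
assert data["scanner"]["respect_gitignore"] is True
assert data["ranking"]["algorithm"] == "balanced"
```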
save
Python
save(path: Optional[Path] = None)

Save configuration to file.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| path | Optional[Path] | None | Path to save to (uses config_file if not specified) |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If no path is specified and config_file is not set |

Source code in tenets/config.py
Python
def save(self, path: Optional[Path] = None):
    """Save configuration to file.

    Args:
        path: Path to save to (uses config_file if not specified)

    Raises:
        ValueError: If no path specified and config_file not set
    """
    # Only allow implicit save to config_file if it was explicitly provided
    if path is None:
        if not self.config_file or self._config_file_discovered:
            raise ValueError("No path specified for saving configuration")
        save_path = self.config_file
    else:
        save_path = path

    save_path = Path(save_path)
    config_dict = self.to_dict()

    # Remove version from saved config (managed by package)
    config_dict.pop("version", None)

    with open(save_path, "w") as f:
        if save_path.suffix == ".json":
            json.dump(config_dict, f, indent=2)
        else:
            _ensure_yaml_imported()  # Import yaml when needed
            yaml.dump(config_dict, f, default_flow_style=False, sort_keys=False)

    self._logger.info(f"Configuration saved to {save_path}")
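A usage sketch (the file names are illustrative):

```python
from pathlib import Path

from tenets.config import TenetsConfig

config = TenetsConfig()
config.save(Path("tenets-backup.json"))  # JSON when the suffix is .json
config.save(Path(".tenets.yml"))         # YAML for any other suffix

# Calling save() with no path and no explicitly provided config_file
# raises ValueError, as shown in the source above.
```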
get_llm_api_key
Python
get_llm_api_key(provider: Optional[str] = None) -> Optional[str]

Get LLM API key for a provider.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| provider | Optional[str] | None | Provider name (uses the default provider if not specified) |

Returns:

| Type | Description |
| --- | --- |
| Optional[str] | API key or None |

Source code in tenets/config.py
Python
def get_llm_api_key(self, provider: Optional[str] = None) -> Optional[str]:
    """Get LLM API key for a provider.

    Args:
        provider: Provider name (uses default if not specified)

    Returns:
        API key or None
    """
    return self.llm.get_api_key(provider)
get_llm_model
Python
get_llm_model(task: str = 'default', provider: Optional[str] = None) -> str

Get LLM model for a specific task.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| task | str | 'default' | Task type |
| provider | Optional[str] | None | Provider name (uses the default provider if not specified) |

Returns:

| Type | Description |
| --- | --- |
| str | Model name |

Source code in tenets/config.py
Python
def get_llm_model(self, task: str = "default", provider: Optional[str] = None) -> str:
    """Get LLM model for a specific task.

    Args:
        task: Task type
        provider: Provider name (uses default if not specified)

    Returns:
        Model name
    """
    return self.llm.get_model(task, provider)
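
Both helpers simply delegate to the nested LLM config, as their sources show:

```python
from tenets.config import TenetsConfig

config = TenetsConfig()

# Thin wrappers over config.llm:
assert config.get_llm_model("analysis") == config.llm.get_model("analysis")
assert config.get_llm_api_key() == config.llm.get_api_key()
```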