config

Full name: tenets.config
Configuration management for Tenets with enhanced LLM and NLP support.
This module handles all configuration for the Tenets system: loading settings from files and environment variables, and providing sensible defaults. Configuration can be specified at multiple levels with well-defined precedence.

Configuration precedence (highest to lowest):

1. Runtime parameters (passed to methods)
2. Environment variables (`TENETS_*`)
3. Project config file (`.tenets.yml` in the project)
4. User config file (`~/.config/tenets/config.yml`)
5. Default values
The configuration system is designed to work with zero configuration (sensible defaults) while allowing full customization when needed.
Enhanced with comprehensive LLM provider support for optional AI-powered features and centralized NLP configuration for all text processing operations.
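As a rough illustration of zero configuration plus overrides (the exact `TENETS_*` variable name below is an assumption, not confirmed by this page):

```python
import os
from pathlib import Path

from tenets.config import TenetsConfig

# A TENETS_* environment variable (assumed naming) overrides values
# loaded from project and user config files:
os.environ["TENETS_MAX_TOKENS"] = "80000"

# Zero configuration otherwise: every subsystem falls back to its defaults.
config = TenetsConfig()

# An explicit config file can also be supplied at construction time:
config = TenetsConfig(config_file=Path("my-tenets.yml"))  # hypothetical file
```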
Classes
NLPConfig dataclass

```python
NLPConfig(
    enabled: bool = True,
    stopwords_enabled: bool = True,
    code_stopword_set: str = 'minimal',
    prompt_stopword_set: str = 'aggressive',
    custom_stopword_files: List[str] = list(),
    tokenization_mode: str = 'auto',
    preserve_original_tokens: bool = True,
    split_camelcase: bool = True,
    split_snakecase: bool = True,
    min_token_length: int = 2,
    keyword_extraction_method: str = 'auto',
    max_keywords: int = 30,
    ngram_size: int = 3,
    yake_dedup_threshold: float = 0.7,
    tfidf_use_sublinear: bool = True,
    tfidf_use_idf: bool = True,
    tfidf_norm: str = 'l2',
    bm25_k1: float = 1.2,
    bm25_b: float = 0.75,
    embeddings_enabled: bool = False,
    embeddings_model: str = 'all-MiniLM-L6-v2',
    embeddings_device: str = 'auto',
    embeddings_cache: bool = True,
    embeddings_batch_size: int = 32,
    similarity_metric: str = 'cosine',
    similarity_threshold: float = 0.7,
    cache_embeddings_ttl_days: int = 30,
    cache_tfidf_ttl_days: int = 7,
    cache_keywords_ttl_days: int = 7,
    multiprocessing_enabled: bool = True,
    multiprocessing_workers: Optional[int] = None,
    multiprocessing_chunk_size: int = 100,
)
```
Configuration for centralized NLP (Natural Language Processing) system.
Controls all text processing operations including tokenization, keyword extraction, stopword filtering, embeddings, and similarity computation. All NLP operations are centralized in the tenets.core.nlp package.
| ATTRIBUTE | DESCRIPTION | TYPE |
| --- | --- | --- |
| `enabled` | Whether NLP features are enabled globally | `bool` |
| `stopwords_enabled` | Whether to use stopword filtering | `bool` |
| `code_stopword_set` | Stopword set for code search (minimal) | `str` |
| `prompt_stopword_set` | Stopword set for prompt parsing (aggressive) | `str` |
| `custom_stopword_files` | Additional custom stopword files | `List[str]` |
| `tokenization_mode` | Tokenization mode ('code', 'text', 'auto') | `str` |
| `preserve_original_tokens` | Keep original tokens for exact matching | `bool` |
| `split_camelcase` | Split camelCase and PascalCase | `bool` |
| `split_snakecase` | Split snake_case | `bool` |
| `min_token_length` | Minimum token length to keep | `int` |
| `keyword_extraction_method` | Method for keyword extraction | `str` |
| `max_keywords` | Maximum keywords to extract | `int` |
| `ngram_size` | Maximum n-gram size for extraction | `int` |
| `yake_dedup_threshold` | YAKE deduplication threshold | `float` |
| `tfidf_use_sublinear` | Use log scaling for term frequency | `bool` |
| `tfidf_use_idf` | Use inverse document frequency | `bool` |
| `tfidf_norm` | Normalization method for TF-IDF | `str` |
| `bm25_k1` | BM25 term frequency saturation parameter | `float` |
| `bm25_b` | BM25 length normalization parameter | `float` |
| `embeddings_enabled` | Whether to use embeddings (requires ML) | `bool` |
| `embeddings_model` | Default embedding model | `str` |
| `embeddings_device` | Device for embeddings ('auto', 'cpu', 'cuda') | `str` |
| `embeddings_cache` | Whether to cache embeddings | `bool` |
| `embeddings_batch_size` | Batch size for embedding generation | `int` |
| `similarity_metric` | Default similarity metric | `str` |
| `similarity_threshold` | Default similarity threshold | `float` |
| `cache_embeddings_ttl_days` | TTL for embedding cache | `int` |
| `cache_tfidf_ttl_days` | TTL for TF-IDF cache | `int` |
| `cache_keywords_ttl_days` | TTL for keyword cache | `int` |
| `multiprocessing_enabled` | Enable multiprocessing for NLP operations | `bool` |
| `multiprocessing_workers` | Number of workers (None = cpu_count) | `Optional[int]` |
| `multiprocessing_chunk_size` | Chunk size for parallel processing | `int` |
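As a minimal sketch (assuming the optional ML extras are installed for embeddings), a customized NLP configuration might look like:

```python
from tenets.config import NLPConfig

nlp = NLPConfig(
    embeddings_enabled=True,  # requires the optional ML dependencies
    embeddings_device="cpu",  # 'auto', 'cpu', or 'cuda'
    max_keywords=20,
    min_token_length=3,
)
```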
LLMConfig dataclass

```python
LLMConfig(
    enabled: bool = False,
    provider: str = 'openai',
    fallback_providers: List[str] = (lambda: ['anthropic', 'openrouter'])(),
    api_keys: Dict[str, str] = (lambda: {
        'openai': '${OPENAI_API_KEY}',
        'anthropic': '${ANTHROPIC_API_KEY}',
        'openrouter': '${OPENROUTER_API_KEY}',
        'cohere': '${COHERE_API_KEY}',
        'together': '${TOGETHER_API_KEY}',
        'huggingface': '${HUGGINGFACE_API_KEY}',
        'replicate': '${REPLICATE_API_KEY}',
        'ollama': '',
    })(),
    api_base_urls: Dict[str, str] = (lambda: {
        'openai': 'https://api.openai.com/v1',
        'anthropic': 'https://api.anthropic.com/v1',
        'openrouter': 'https://openrouter.ai/api/v1',
        'ollama': 'http://localhost:11434',
    })(),
    models: Dict[str, str] = (lambda: {
        'default': 'gpt-4o-mini',
        'summarization': 'gpt-3.5-turbo',
        'analysis': 'gpt-4o',
        'embeddings': 'text-embedding-3-small',
        'code_generation': 'gpt-4o',
        'semantic_search': 'text-embedding-3-small',
        'anthropic_default': 'claude-3-haiku-20240307',
        'anthropic_analysis': 'claude-3-sonnet-20240229',
        'anthropic_code': 'claude-3-opus-20240229',
        'ollama_default': 'llama2',
        'ollama_code': 'codellama',
        'ollama_embeddings': 'nomic-embed-text',
    })(),
    max_cost_per_run: float = 0.1,
    max_cost_per_day: float = 10.0,
    max_tokens_per_request: int = 4000,
    max_context_length: int = 100000,
    temperature: float = 0.3,
    top_p: float = 0.95,
    frequency_penalty: float = 0.0,
    presence_penalty: float = 0.0,
    requests_per_minute: int = 60,
    retry_on_error: bool = True,
    max_retries: int = 3,
    retry_delay: float = 1.0,
    retry_backoff: float = 2.0,
    timeout: int = 30,
    stream: bool = False,
    cache_responses: bool = True,
    cache_ttl_hours: int = 24,
    log_requests: bool = False,
    log_responses: bool = False,
    custom_headers: Dict[str, str] = dict(),
    organization_id: Optional[str] = None,
    project_id: Optional[str] = None,
)
```
Configuration for LLM (Large Language Model) integration.
Supports multiple providers and models with comprehensive cost controls, rate limiting, and fallback strategies. All LLM features are optional and disabled by default.
| ATTRIBUTE | DESCRIPTION | TYPE |
| --- | --- | --- |
| `enabled` | Whether LLM features are enabled globally | `bool` |
| `provider` | Primary LLM provider (openai, anthropic, openrouter, litellm, ollama) | `str` |
| `fallback_providers` | Ordered list of fallback providers if primary fails | `List[str]` |
| `api_keys` | Dictionary of provider -> API key (can use env vars) | `Dict[str, str]` |
| `api_base_urls` | Custom API endpoints for providers (e.g., for proxies) | `Dict[str, str]` |
| `models` | Model selection for different tasks | `Dict[str, str]` |
| `max_cost_per_run` | Maximum cost in USD per execution run | `float` |
| `max_cost_per_day` | Maximum cost in USD per day | `float` |
| `max_tokens_per_request` | Maximum tokens per single request | `int` |
| `max_context_length` | Maximum context window to use | `int` |
| `temperature` | Sampling temperature (0.0-2.0, lower = more deterministic) | `float` |
| `top_p` | Nucleus sampling parameter | `float` |
| `frequency_penalty` | Frequency penalty for token repetition | `float` |
| `presence_penalty` | Presence penalty for topic repetition | `float` |
| `requests_per_minute` | Rate limit for API requests | `int` |
| `retry_on_error` | Whether to retry failed requests | `bool` |
| `max_retries` | Maximum number of retry attempts | `int` |
| `retry_delay` | Initial delay between retries in seconds | `float` |
| `retry_backoff` | Backoff multiplier for retry delays | `float` |
| `timeout` | Request timeout in seconds | `int` |
| `stream` | Whether to stream responses | `bool` |
| `cache_responses` | Whether to cache LLM responses | `bool` |
| `cache_ttl_hours` | Cache time-to-live in hours | `int` |
| `log_requests` | Whether to log all LLM requests | `bool` |
| `log_responses` | Whether to log all LLM responses | `bool` |
| `custom_headers` | Additional headers for API requests | `Dict[str, str]` |
| `organization_id` | Organization ID for providers that support it | `Optional[str]` |
| `project_id` | Project ID for providers that support it | `Optional[str]` |
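A minimal sketch of opting into LLM features with tighter cost ceilings (API keys default to `${ENV_VAR}` placeholders, which `get_api_key()` treats as unset):

```python
from tenets.config import LLMConfig

llm = LLMConfig(
    enabled=True,  # LLM features are off by default
    provider="openai",
    fallback_providers=["anthropic", "ollama"],
    max_cost_per_run=0.05,  # USD
    max_cost_per_day=2.00,  # USD
    requests_per_minute=30,
)
```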
Functions

get_api_key
Get API key for a specific provider.
| PARAMETER | DESCRIPTION | TYPE | DEFAULT |
| --- | --- | --- | --- |
| `provider` | Provider name (uses default if not specified) | `Optional[str]` | `None` |

| RETURNS | DESCRIPTION |
| --- | --- |
| `Optional[str]` | API key string or None if not configured |
Source code in tenets/config.py
```python
def get_api_key(self, provider: Optional[str] = None) -> Optional[str]:
    """Get API key for a specific provider.

    Args:
        provider: Provider name (uses default if not specified)

    Returns:
        API key string or None if not configured
    """
    provider = provider or self.provider
    key = self.api_keys.get(provider)

    # Don't return placeholder values
    if key and key.startswith("${") and key.endswith("}"):
        return None

    return key
```
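For instance, unexpanded placeholder values are filtered out rather than returned (the key value here is a placeholder):

```python
from tenets.config import LLMConfig

llm = LLMConfig(api_keys={"openai": "sk-test-123", "anthropic": "${ANTHROPIC_API_KEY}"})

llm.get_api_key("openai")     # "sk-test-123"
llm.get_api_key("anthropic")  # None -- "${...}" placeholders are never returned
llm.get_api_key()             # uses the default provider ("openai" here)
```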
get_model
Get model name for a specific task and provider.
| PARAMETER | DESCRIPTION | TYPE | DEFAULT |
| --- | --- | --- | --- |
| `task` | Task type (default, summarization, analysis, etc.) | `str` | `'default'` |
| `provider` | Provider name (uses default if not specified) | `Optional[str]` | `None` |

| RETURNS | DESCRIPTION |
| --- | --- |
| `str` | Model name string |
Source code in tenets/config.py
```python
def get_model(self, task: str = "default", provider: Optional[str] = None) -> str:
    """Get model name for a specific task and provider.

    Args:
        task: Task type (default, summarization, analysis, etc.)
        provider: Provider name (uses default if not specified)

    Returns:
        Model name string
    """
    provider = provider or self.provider

    # Try provider-specific model first
    provider_task = f"{provider}_{task}"
    if provider_task in self.models:
        return self.models[provider_task]

    # Fall back to general task model
    return self.models.get(task, self.models["default"])
```
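Given the default model map shown above, the provider-specific lookup resolves like this:

```python
from tenets.config import LLMConfig

llm = LLMConfig(provider="anthropic")

llm.get_model("analysis")       # "claude-3-sonnet-20240229" (via "anthropic_analysis")
llm.get_model("summarization")  # "gpt-3.5-turbo" (no "anthropic_summarization" entry; falls back)
llm.get_model("nonexistent")    # "gpt-4o-mini" (final fallback to "default")
```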
to_litellm_params
Convert to parameters for LiteLLM library.
| RETURNS | DESCRIPTION |
| --- | --- |
| `Dict[str, Any]` | Dictionary of parameters compatible with LiteLLM |
Source code in tenets/config.py
```python
def to_litellm_params(self) -> Dict[str, Any]:
    """Convert to parameters for LiteLLM library.

    Returns:
        Dictionary of parameters compatible with LiteLLM
    """
    params = {
        "temperature": self.temperature,
        "top_p": self.top_p,
        "frequency_penalty": self.frequency_penalty,
        "presence_penalty": self.presence_penalty,
        "max_tokens": self.max_tokens_per_request,
        "timeout": self.timeout,
        "stream": self.stream,
    }

    # Add API key if available
    api_key = self.get_api_key()
    if api_key:
        params["api_key"] = api_key

    # Add custom base URL if specified
    if self.provider in self.api_base_urls:
        params["api_base"] = self.api_base_urls[self.provider]

    # Add organization/project IDs if specified
    if self.organization_id:
        params["organization"] = self.organization_id
    if self.project_id:
        params["project"] = self.project_id

    # Add custom headers
    if self.custom_headers:
        params["extra_headers"] = self.custom_headers

    return params
```
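A rough usage sketch with the LiteLLM client (an optional dependency; the key value here is a placeholder):

```python
import litellm  # optional dependency

from tenets.config import LLMConfig

llm = LLMConfig(enabled=True, provider="openai", api_keys={"openai": "sk-test-123"})
params = llm.to_litellm_params()  # temperature, top_p, max_tokens, api_key, api_base, ...

response = litellm.completion(
    model=llm.get_model("summarization"),
    messages=[{"role": "user", "content": "Summarize this module."}],
    **params,
)
```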
ScannerConfig dataclass

```python
ScannerConfig(
    respect_gitignore: bool = True,
    follow_symlinks: bool = False,
    max_file_size: int = 5000000,
    max_files: int = 10000,
    binary_check: bool = True,
    encoding: str = 'utf-8',
    additional_ignore_patterns: List[str] = (lambda: [
        '*.pyc', '*.pyo', '__pycache__', '*.so', '*.dylib', '*.dll',
        '*.egg-info', '*.dist-info', '.tox', '.nox', '.coverage',
        '.hypothesis', '.pytest_cache', '.mypy_cache', '.ruff_cache',
    ])(),
    additional_include_patterns: List[str] = list(),
    workers: int = 4,
    parallel_mode: str = 'auto',
    timeout: float = 5.0,
    exclude_minified: bool = True,
    minified_patterns: List[str] = (lambda: [
        '*.min.js', '*.min.css', 'bundle.js', '*.bundle.js', '*.bundle.css',
        '*.production.js', '*.prod.js', 'vendor.prod.js', '*.dist.js',
        '*.compiled.js', '*.minified.*', '*.uglified.*',
    ])(),
    build_directory_patterns: List[str] = (lambda: [
        'dist/', 'build/', 'out/', 'output/', 'public/',
        'static/generated/', '.next/', '_next/', 'node_modules/',
    ])(),
    exclude_tests_by_default: bool = True,
    test_patterns: List[str] = (lambda: [
        'test_*.py', '*_test.py', 'test*.py',
        '*.test.js', '*.spec.js', '*.test.ts', '*.spec.ts',
        '*.test.jsx', '*.spec.jsx', '*.test.tsx', '*.spec.tsx',
        '*Test.java', '*Tests.java', '*TestCase.java',
        '*Test.cs', '*Tests.cs', '*TestCase.cs',
        '*_test.go', 'test_*.go', '*_test.rb', '*_spec.rb', 'test_*.rb',
        '*Test.php', '*_test.php', 'test_*.php', '*_test.rs', 'test_*.rs',
        '**/test/**', '**/tests/**', '**/*test*/**',
    ])(),
    test_directories: List[str] = (lambda: [
        'test', 'tests', '__tests__', 'spec', 'specs', 'testing',
        'test_*', '*_test', '*_tests', 'unit_tests', 'integration_tests',
        'e2e', 'e2e_tests', 'functional_tests', 'acceptance_tests',
        'regression_tests',
    ])(),
)
```
Configuration for file scanning subsystem.
Controls how tenets discovers and filters files in a codebase.
| ATTRIBUTE | DESCRIPTION | TYPE |
| --- | --- | --- |
| `respect_gitignore` | Whether to respect .gitignore files | `bool` |
| `follow_symlinks` | Whether to follow symbolic links | `bool` |
| `max_file_size` | Maximum file size in bytes to analyze | `int` |
| `max_files` | Maximum number of files to scan | `int` |
| `binary_check` | Whether to check for and skip binary files | `bool` |
| `encoding` | Default file encoding | `str` |
| `additional_ignore_patterns` | Extra patterns to ignore | `List[str]` |
| `additional_include_patterns` | Extra patterns to include | `List[str]` |
| `workers` | Number of parallel workers for scanning | `int` |
| `parallel_mode` | Parallel execution mode ("thread", "process", or "auto") | `str` |
| `timeout` | Per-file analysis timeout used in parallel execution (seconds) | `float` |
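A sketch of loosening the default filters, for example to scan generated bundles and test files that would otherwise be skipped:

```python
from tenets.config import ScannerConfig

scanner = ScannerConfig(
    exclude_minified=False,          # keep *.min.js, bundles, etc.
    exclude_tests_by_default=False,  # include test files and directories
    max_file_size=10_000_000,        # bytes
    additional_ignore_patterns=["*.generated.ts"],  # hypothetical extra pattern
)
```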
RankingConfig dataclass

```python
RankingConfig(
    algorithm: str = 'balanced',
    threshold: float = 0.1,
    text_similarity_algorithm: str = 'bm25',
    use_tfidf: bool = True,
    use_stopwords: bool = False,
    use_embeddings: bool = False,
    use_git: bool = True,
    use_ml: bool = False,
    embedding_model: str = 'all-MiniLM-L6-v2',
    custom_weights: Dict[str, float] = (lambda: {
        'keyword_match': 0.25,
        'path_relevance': 0.2,
        'import_graph': 0.2,
        'git_activity': 0.15,
        'file_type': 0.1,
        'complexity': 0.1,
    })(),
    workers: int = 2,
    parallel_mode: str = 'auto',
    batch_size: int = 100,
)
```
Configuration for relevance ranking system.
Controls how files are scored and ranked for relevance to prompts. Uses centralized NLP components for all text processing.
| ATTRIBUTE | DESCRIPTION | TYPE |
| --- | --- | --- |
| `algorithm` | Default ranking algorithm (fast, balanced, thorough, ml) | `str` |
| `threshold` | Minimum relevance score to include a file | `float` |
| `text_similarity_algorithm` | Text similarity algorithm ('bm25' or 'tfidf', default: 'bm25') | `str` |
| `use_tfidf` | Whether to use TF-IDF for keyword matching (deprecated, use text_similarity_algorithm) | `bool` |
| `use_stopwords` | Whether to use stopword filtering | `bool` |
| `use_embeddings` | Whether to use semantic embeddings (requires ML) | `bool` |
| `use_git` | Whether to include git signals in ranking | `bool` |
| `use_ml` | Whether to enable ML features (uses NLP embeddings) | `bool` |
| `embedding_model` | Which embedding model to use | `str` |
| `custom_weights` | Custom weights for ranking factors | `Dict[str, float]` |
| `workers` | Number of parallel workers for ranking | `int` |
| `parallel_mode` | Parallel execution mode ("thread", "process", or "auto") | `str` |
| `batch_size` | Batch size for ML operations | `int` |
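A sketch of a thorough ranking profile that leans more heavily on keyword matching (the weight keys come from the defaults above):

```python
from tenets.config import RankingConfig

ranking = RankingConfig(
    algorithm="thorough",
    threshold=0.15,
    custom_weights={
        "keyword_match": 0.35,
        "path_relevance": 0.20,
        "import_graph": 0.20,
        "git_activity": 0.10,
        "file_type": 0.10,
        "complexity": 0.05,
    },
)
```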
SummarizerConfig dataclass

```python
SummarizerConfig(
    default_mode: str = 'auto',
    target_ratio: float = 0.3,
    enable_cache: bool = True,
    preserve_code_structure: bool = True,
    summarize_imports: bool = True,
    import_summary_threshold: int = 5,
    max_cache_size: int = 100,
    llm_provider: Optional[str] = None,
    llm_model: Optional[str] = None,
    llm_temperature: float = 0.3,
    llm_max_tokens: int = 500,
    enable_ml_strategies: bool = True,
    quality_threshold: str = 'medium',
    batch_size: int = 10,
    docs_context_aware: bool = True,
    docs_show_in_place_context: bool = True,
    docs_context_search_depth: int = 2,
    docs_context_min_confidence: float = 0.6,
    docs_context_max_sections: int = 10,
    docs_context_preserve_examples: bool = True,
    docstring_weight: float = 0.5,
    include_all_signatures: bool = True,
)
```
Configuration for content summarization system.
Controls how text and code are compressed to fit within token limits.
| ATTRIBUTE | DESCRIPTION | TYPE |
| --- | --- | --- |
| `default_mode` | Default summarization mode (extractive, compressive, textrank, transformer, llm, auto) | `str` |
| `target_ratio` | Default target compression ratio (0.3 = 30% of original) | `float` |
| `enable_cache` | Whether to cache summaries | `bool` |
| `preserve_code_structure` | Whether to preserve imports/signatures in code | `bool` |
| `summarize_imports` | Whether to condense imports into a summary (default: True) | `bool` |
| `import_summary_threshold` | Number of imports to trigger summarization (default: 5) | `int` |
| `max_cache_size` | Maximum number of cached summaries | `int` |
| `llm_provider` | LLM provider for LLM mode (uses global LLM config) | `Optional[str]` |
| `llm_model` | LLM model to use (uses global LLM config) | `Optional[str]` |
| `llm_temperature` | LLM sampling temperature | `float` |
| `llm_max_tokens` | Maximum tokens for LLM response | `int` |
| `enable_ml_strategies` | Whether to enable ML-based strategies | `bool` |
| `quality_threshold` | Quality threshold for auto mode selection | `str` |
| `batch_size` | Batch size for parallel processing | `int` |
| `docs_context_aware` | Whether to enable context-aware summarization for documentation files | `bool` |
| `docs_show_in_place_context` | When enabled, preserves and highlights relevant context in documentation summaries instead of generic structure | `bool` |
| `docs_context_search_depth` | How deep to search for contextual references (1=direct mentions, 2=semantic similarity, 3=deep analysis) | `int` |
| `docs_context_min_confidence` | Minimum confidence threshold for context relevance (0.0-1.0) | `float` |
| `docs_context_max_sections` | Maximum number of contextual sections to preserve per document | `int` |
| `docs_context_preserve_examples` | Whether to always preserve code examples and snippets in documentation | `bool` |
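A minimal sketch of a tighter compression profile, leaving the documentation-context handling at its defaults:

```python
from tenets.config import SummarizerConfig

summarizer = SummarizerConfig(
    default_mode="extractive",
    target_ratio=0.2,             # keep roughly 20% of the original content
    import_summary_threshold=10,  # only condense imports past 10 entries
)
```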
TenetConfig dataclass

```python
TenetConfig(
    auto_instill: bool = True,
    max_per_context: int = 5,
    reinforcement: bool = True,
    injection_strategy: str = 'strategic',
    min_distance_between: int = 1000,
    prefer_natural_breaks: bool = True,
    storage_path: Optional[Path] = None,
    collections_enabled: bool = True,
    injection_frequency: str = 'adaptive',
    injection_interval: int = 3,
    session_complexity_threshold: float = 0.7,
    min_session_length: int = 1,
    adaptive_injection: bool = True,
    track_injection_history: bool = True,
    decay_rate: float = 0.1,
    reinforcement_interval: int = 10,
    session_aware: bool = True,
    session_memory_limit: int = 100,
    persist_session_history: bool = True,
    complexity_weight: float = 0.5,
    priority_boost_critical: float = 2.0,
    priority_boost_high: float = 1.5,
    skip_low_priority_on_complex: bool = True,
    track_effectiveness: bool = True,
    effectiveness_window_days: int = 30,
    min_compliance_score: float = 0.6,
    system_instruction: Optional[str] = None,
    system_instruction_enabled: bool = False,
    system_instruction_position: str = 'top',
    system_instruction_format: str = 'markdown',
    system_instruction_once_per_session: bool = True,
)
```
Configuration for the tenet (guiding principles) system.
Controls how tenets are managed and injected into context, including smart injection frequency, session tracking, and adaptive behavior.
| ATTRIBUTE | DESCRIPTION | TYPE |
| --- | --- | --- |
| `auto_instill` | Whether to automatically apply tenets to context | `bool` |
| `max_per_context` | Maximum tenets to inject per context | `int` |
| `reinforcement` | Whether to reinforce critical tenets | `bool` |
| `injection_strategy` | Default injection strategy ('strategic', 'top', 'distributed') | `str` |
| `min_distance_between` | Minimum character distance between injections | `int` |
| `prefer_natural_breaks` | Whether to inject at natural break points | `bool` |
| `storage_path` | Where to store tenet database | `Optional[Path]` |
| `collections_enabled` | Whether to enable tenet collections | `bool` |
| `injection_frequency` | How often to inject tenets ('always', 'periodic', 'adaptive', 'manual') | `str` |
| `injection_interval` | Numeric interval for periodic injection (e.g., every 3rd distill) | `int` |
| `session_complexity_threshold` | Complexity threshold for smart injection (0-1) | `float` |
| `min_session_length` | Minimum session length before first injection | `int` |
| `adaptive_injection` | Enable adaptive injection based on context analysis | `bool` |
| `track_injection_history` | Track injection history per session for smarter decisions | `bool` |
| `decay_rate` | How quickly tenet importance decays (0-1, higher = faster decay) | `float` |
| `reinforcement_interval` | How often to reinforce critical tenets (every N injections) | `int` |
| `session_aware` | Enable session-aware injection patterns | `bool` |
| `session_memory_limit` | Max sessions to track in memory | `int` |
| `persist_session_history` | Save session histories to disk | `bool` |
| `complexity_weight` | Weight given to complexity in injection decisions (0-1) | `float` |
| `priority_boost_critical` | Boost factor for critical-priority tenets | `float` |
| `priority_boost_high` | Boost factor for high-priority tenets | `float` |
| `skip_low_priority_on_complex` | Skip low-priority tenets when complexity > threshold | `bool` |
| `track_effectiveness` | Track tenet effectiveness metrics | `bool` |
| `effectiveness_window_days` | Days to consider for effectiveness analysis | `int` |
| `min_compliance_score` | Minimum compliance score before reinforcement | `float` |
System instruction (system prompt) configuration

- `system_instruction`: Optional text to inject as foundational context
- `system_instruction_enabled`: Enable auto-injection when an instruction exists
- `system_instruction_position`: Where to inject (top, after_header, before_content)
- `system_instruction_format`: Format of the instruction (markdown, xml, comment, plain)
- `system_instruction_once_per_session`: Inject once per session; if no session, inject on every distill
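A sketch of a periodic-injection setup with a system instruction (the instruction text is illustrative):

```python
from tenets.config import TenetConfig

tenet = TenetConfig(
    injection_frequency="periodic",
    injection_interval=3,  # inject on every 3rd distill
    max_per_context=3,
    system_instruction="Prefer small, well-tested changes.",
    system_instruction_enabled=True,
    system_instruction_format="markdown",
)
```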
CacheConfig dataclass

```python
CacheConfig(
    enabled: bool = True,
    directory: Optional[Path] = None,
    ttl_days: int = 7,
    max_size_mb: int = 500,
    compression: bool = False,
    memory_cache_size: int = 1000,
    sqlite_pragmas: Dict[str, str] = (lambda: {
        'journal_mode': 'WAL',
        'synchronous': 'NORMAL',
        'cache_size': '-64000',
        'temp_store': 'MEMORY',
    })(),
    max_age_hours: int = 24,
    llm_cache_enabled: bool = True,
    llm_cache_ttl_hours: int = 24,
)
```
Configuration for caching system.
Controls cache behavior for analysis results and other expensive operations.
| ATTRIBUTE | DESCRIPTION | TYPE |
| --- | --- | --- |
| `enabled` | Whether caching is enabled | `bool` |
| `directory` | Cache directory path | `Optional[Path]` |
| `ttl_days` | Time-to-live for cache entries in days | `int` |
| `max_size_mb` | Maximum cache size in megabytes | `int` |
| `compression` | Whether to compress cached data | `bool` |
| `memory_cache_size` | Number of items in memory cache | `int` |
| `sqlite_pragmas` | SQLite performance settings | `Dict[str, str]` |
| `max_age_hours` | Max age for certain cached entries (used by analyzer) | `int` |
| `llm_cache_enabled` | Whether to cache LLM responses | `bool` |
| `llm_cache_ttl_hours` | TTL for LLM response cache | `int` |
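For example, a larger, longer-lived cache might be sketched as:

```python
from tenets.config import CacheConfig

cache = CacheConfig(
    ttl_days=14,
    max_size_mb=1000,
    compression=True,
    llm_cache_ttl_hours=48,
)
```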
OutputConfig dataclass

```python
OutputConfig(
    default_format: str = 'markdown',
    syntax_highlighting: bool = True,
    line_numbers: bool = False,
    max_line_length: int = 120,
    include_metadata: bool = True,
    compression_threshold: int = 10000,
    summary_ratio: float = 0.25,
    copy_on_distill: bool = False,
    show_token_usage: bool = True,
    show_cost_estimate: bool = True,
)
```
Configuration for output formatting.
Controls how context and analysis results are formatted.
| ATTRIBUTE | DESCRIPTION | TYPE |
| --- | --- | --- |
| `default_format` | Default output format (markdown, xml, json) | `str` |
| `syntax_highlighting` | Whether to enable syntax highlighting | `bool` |
| `line_numbers` | Whether to include line numbers | `bool` |
| `max_line_length` | Maximum line length before wrapping | `int` |
| `include_metadata` | Whether to include metadata in output | `bool` |
| `compression_threshold` | File size threshold for summarization | `int` |
| `summary_ratio` | Target compression ratio for summaries | `float` |
| `copy_on_distill` | Automatically copy distill output to clipboard when true | `bool` |
| `show_token_usage` | Whether to show token usage statistics | `bool` |
| `show_cost_estimate` | Whether to show cost estimates for LLM operations | `bool` |
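For example, JSON output with clipboard copying enabled:

```python
from tenets.config import OutputConfig

output = OutputConfig(
    default_format="json",  # markdown, xml, or json
    copy_on_distill=True,   # copy distill output to the clipboard
    line_numbers=True,
)
```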
GitConfig dataclass

```python
GitConfig(
    enabled: bool = True,
    include_history: bool = True,
    history_limit: int = 100,
    include_blame: bool = False,
    include_stats: bool = True,
    ignore_authors: List[str] = (lambda: [
        'dependabot[bot]', 'github-actions[bot]', 'renovate[bot]',
    ])(),
    main_branches: List[str] = (lambda: [
        'main', 'master', 'develop', 'trunk',
    ])(),
)
```
Configuration for git integration.
Controls how git information is gathered and used.
| ATTRIBUTE | DESCRIPTION | TYPE |
| --- | --- | --- |
| `enabled` | Whether git integration is enabled | `bool` |
| `include_history` | Whether to include commit history | `bool` |
| `history_limit` | Maximum number of commits to include | `int` |
| `include_blame` | Whether to include git blame info | `bool` |
| `include_stats` | Whether to include statistics | `bool` |
| `ignore_authors` | Authors to ignore in analysis | `List[str]` |
| `main_branches` | Branch names considered "main" | `List[str]` |
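For example, deepening the history window and ignoring an extra automation author (the last entry is hypothetical):

```python
from tenets.config import GitConfig

git = GitConfig(
    history_limit=500,
    include_blame=True,
    ignore_authors=[
        "dependabot[bot]",
        "github-actions[bot]",
        "renovate[bot]",
        "my-ci-bot",  # hypothetical extra author
    ],
)
```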
TenetsConfig dataclass

```python
TenetsConfig(
    config_file: Optional[Path] = None,
    project_root: Optional[Path] = None,
    max_tokens: int = 100000,
    version: str = '0.1.0',
    debug: bool = False,
    quiet: bool = False,
    scanner: ScannerConfig = ScannerConfig(),
    ranking: RankingConfig = RankingConfig(),
    summarizer: SummarizerConfig = SummarizerConfig(),
    tenet: TenetConfig = TenetConfig(),
    cache: CacheConfig = CacheConfig(),
    output: OutputConfig = OutputConfig(),
    git: GitConfig = GitConfig(),
    llm: LLMConfig = LLMConfig(),
    nlp: NLPConfig = NLPConfig(),
    custom: Dict[str, Any] = dict(),
)
```
Main configuration for the Tenets system with LLM and NLP support.
This is the root configuration object that contains all subsystem configs and global settings. It handles loading from files, environment variables, and provides sensible defaults.
| ATTRIBUTE | DESCRIPTION | TYPE |
| --- | --- | --- |
| `config_file` | Path to configuration file (if any) | `Optional[Path]` |
| `project_root` | Root directory of the project | `Optional[Path]` |
| `max_tokens` | Default maximum tokens for context | `int` |
| `version` | Tenets version (for compatibility checking) | `str` |
| `debug` | Enable debug mode | `bool` |
| `quiet` | Suppress non-essential output | `bool` |
| `scanner` | Scanner subsystem configuration | `ScannerConfig` |
| `ranking` | Ranking subsystem configuration | `RankingConfig` |
| `summarizer` | Summarizer subsystem configuration | `SummarizerConfig` |
| `tenet` | Tenet subsystem configuration | `TenetConfig` |
| `cache` | Cache subsystem configuration | `CacheConfig` |
| `output` | Output formatting configuration | `OutputConfig` |
| `git` | Git integration configuration | `GitConfig` |
| `llm` | LLM integration configuration | `LLMConfig` |
| `nlp` | NLP system configuration | `NLPConfig` |
| `custom` | Custom user configuration | `Dict[str, Any]` |
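A sketch of composing the root config from customized subsystems:

```python
from tenets.config import LLMConfig, RankingConfig, TenetsConfig

config = TenetsConfig(
    max_tokens=150000,
    ranking=RankingConfig(algorithm="fast"),
    llm=LLMConfig(enabled=True, provider="ollama"),
)
config.debug = True  # subsystem configs are plain attributes, mutable after construction
```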
Attributes

exclude_minified property writable

Get exclude_minified setting from scanner config.

minified_patterns property writable

Get minified patterns from scanner config.

build_directory_patterns property writable

Get build directory patterns from scanner config.

respect_gitignore property writable

Whether to respect .gitignore files.

additional_ignore_patterns property writable

Get additional ignore patterns.

auto_instill_tenets property writable

Whether to automatically instill tenets.

max_tenets_per_context property writable

Maximum tenets to inject per context.

tenet_injection_config property

Get tenet injection configuration.

nlp_embeddings_enabled property writable

Whether NLP embeddings are enabled.
Functions

to_dict
Convert configuration to dictionary.
| RETURNS | DESCRIPTION |
| --- | --- |
| `Dict[str, Any]` | Dictionary representation of configuration |
Source code in tenets/config.py
```python
def to_dict(self) -> Dict[str, Any]:
    """Convert configuration to dictionary.

    Returns:
        Dictionary representation of configuration
    """

    def _as_serializable(obj):
        if isinstance(obj, Path):
            return str(obj)
        if isinstance(obj, dict):
            return {k: _as_serializable(v) for k, v in obj.items()}
        if isinstance(obj, list):
            return [_as_serializable(v) for v in obj]
        return obj

    data = {
        "max_tokens": self.max_tokens,
        "version": self.version,
        "debug": self.debug,
        "quiet": self.quiet,
        "scanner": asdict(self.scanner),
        "ranking": asdict(self.ranking),
        "summarizer": asdict(self.summarizer),
        "tenet": asdict(self.tenet),
        "cache": asdict(self.cache),
        "output": asdict(self.output),
        "git": asdict(self.git),
        "llm": asdict(self.llm),
        "nlp": asdict(self.nlp),
        "custom": self.custom,
    }
    return _as_serializable(data)
```
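For instance, `Path` values come back as plain strings, so the result is safe to serialize:

```python
import json

from tenets.config import TenetsConfig

config = TenetsConfig()
data = config.to_dict()

data["scanner"]["max_files"]  # 10000 by default
json.dumps(data)              # works: Path objects were converted to str
```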
save
Save configuration to file.
| PARAMETER | DESCRIPTION | TYPE | DEFAULT |
| --- | --- | --- | --- |
| `path` | Path to save to (uses config_file if not specified) | `Optional[Path]` | `None` |

| RAISES | DESCRIPTION |
| --- | --- |
| `ValueError` | If no path specified and config_file not set |
Source code in tenets/config.py
```python
def save(self, path: Optional[Path] = None):
    """Save configuration to file.

    Args:
        path: Path to save to (uses config_file if not specified)

    Raises:
        ValueError: If no path specified and config_file not set
    """
    # Only allow implicit save to config_file if it was explicitly provided
    if path is None:
        if not self.config_file or self._config_file_discovered:
            raise ValueError("No path specified for saving configuration")
        save_path = self.config_file
    else:
        save_path = path

    save_path = Path(save_path)
    config_dict = self.to_dict()

    # Remove version from saved config (managed by package)
    config_dict.pop("version", None)

    with open(save_path, "w") as f:
        if save_path.suffix == ".json":
            json.dump(config_dict, f, indent=2)
        else:
            _ensure_yaml_imported()  # Import yaml when needed
            yaml.dump(config_dict, f, default_flow_style=False, sort_keys=False)

    self._logger.info(f"Configuration saved to {save_path}")
```
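A usage sketch: the file suffix picks the format, and implicit saves require a config_file that was supplied explicitly at construction:

```python
from pathlib import Path

from tenets.config import TenetsConfig

config = TenetsConfig()
config.save(Path("tenets.yml"))   # YAML for any non-.json suffix
config.save(Path("tenets.json"))  # JSON when the suffix is .json

try:
    config.save()  # no explicit config_file was provided at construction
except ValueError:
    pass  # implicit saves require an explicitly supplied config_file
```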