tenets.core.prompt Package¶
Prompt parsing and understanding system.
This package provides intelligent prompt analysis to extract intent, keywords, entities, temporal context, and external references from user queries. The parser supports various input formats including plain text, URLs (GitHub issues, JIRA tickets, Linear, Notion, etc.), and structured queries.
Core features:

- Intent detection (implement, debug, test, refactor, etc.)
- Keyword extraction using multiple algorithms (YAKE, TF-IDF, frequency)
- Entity recognition (classes, functions, files, APIs, databases)
- Temporal parsing (dates, ranges, recurring patterns)
- External source integration (GitHub, GitLab, JIRA, Linear, Asana, Notion)
- Intelligent caching with TTL management
- Programming pattern recognition
- Scope and focus area detection
The parser leverages centralized NLP components for:

- Keyword extraction via nlp.keyword_extractor
- Tokenization via nlp.tokenizer
- Stopword filtering via nlp.stopwords
- Programming patterns via nlp.programming_patterns
Example

```python
from tenets.core.prompt import PromptParser
from tenets.config import TenetsConfig

# Create parser with config
config = TenetsConfig()
parser = PromptParser(config)

# Parse a prompt
context = parser.parse("implement OAuth2 authentication for the API")
print(f"Intent: {context.intent}")
print(f"Keywords: {context.keywords}")
print(f"Task type: {context.task_type}")

# Parse from GitHub issue
context = parser.parse("https://github.com/org/repo/issues/123")
print(f"External source: {context.external_context['source']}")
print(f"Issue title: {context.text}")
```
Classes¶
AsanaHandler¶
Bases: ExternalSourceHandler
Handler for Asana tasks.
ExternalContent dataclass¶
ExternalContent(title: str, body: str, metadata: Dict[str, Any], source_type: str, url: str, cached_at: Optional[datetime] = None, ttl_hours: int = 24)
ExternalSourceHandler¶
Bases: ABC
Base class for external source handlers.
Initialize handler with optional cache.
PARAMETER | DESCRIPTION |
---|---|
cache_manager | Optional cache manager for caching fetched content |
Attributes¶
logger instance-attribute¶

cache instance-attribute¶
Functions¶
can_handle abstractmethod¶
Check if this handler can process the given URL.
extract_identifier abstractmethod¶

Extract the source-specific identifier from the URL.

fetch_content abstractmethod¶
Fetch content from the external source.
get_cached_content¶
Get cached content if available and valid.
PARAMETER | DESCRIPTION |
---|---|
url | URL to check cache for TYPE: str |

RETURNS | DESCRIPTION |
---|---|
Optional[ExternalContent] | Cached content or None if not cached/expired |
cache_content¶
Cache fetched content.
PARAMETER | DESCRIPTION |
---|---|
url | URL as cache key TYPE: str |
content | Content to cache TYPE: ExternalContent |
process¶
Process URL with caching support.
PARAMETER | DESCRIPTION |
---|---|
url | URL to process TYPE: str |

RETURNS | DESCRIPTION |
---|---|
Optional[ExternalContent] | External content or None if failed |
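For illustration, here is a minimal sketch of a custom handler built on this interface. The MyTrackerHandler class, its URL scheme, the import path, and the exact abstract-method signatures are assumptions inferred from the method names and the process flow above, not part of the package.

```python
import re
from typing import Optional

# Import path assumed from the external_sources module listed at the end of
# this page; the abstract method signatures below are likewise assumptions.
from tenets.core.prompt.external_sources import (
    ExternalContent,
    ExternalSourceHandler,
)


class MyTrackerHandler(ExternalSourceHandler):
    """Hypothetical handler for an imaginary issue tracker."""

    URL_PATTERN = re.compile(r"https://tracker\.example\.com/ticket/(\d+)")

    def can_handle(self, url: str) -> bool:
        # Claim only URLs matching this tracker's ticket scheme.
        return bool(self.URL_PATTERN.match(url))

    def extract_identifier(self, url: str) -> Optional[str]:
        match = self.URL_PATTERN.match(url)
        return match.group(1) if match else None

    def fetch_content(self, url: str) -> Optional[ExternalContent]:
        ticket_id = self.extract_identifier(url)
        if ticket_id is None:
            return None
        # A real handler would call the tracker's API here.
        return ExternalContent(
            title=f"Ticket {ticket_id}",
            body="(fetched ticket body)",
            metadata={"ticket_id": ticket_id},
            source_type="mytracker",
            url=url,
        )
```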
ExternalSourceManager¶
Manages all external source handlers.
Initialize with all available handlers.
PARAMETER | DESCRIPTION |
---|---|
cache_manager | Optional cache manager for handlers |
Attributes¶
logger instance-attribute¶

cache_manager instance-attribute¶

handlers instance-attribute¶
Functions¶
process_url¶
Process a URL with the appropriate handler.
PARAMETER | DESCRIPTION |
---|---|
url | URL to process TYPE: str |

RETURNS | DESCRIPTION |
---|---|
Optional[ExternalContent] | External content or None if no handler can process it |
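A brief usage sketch; the import path is assumed from the external_sources module listed at the end of this page.

```python
from tenets.core.prompt.external_sources import ExternalSourceManager  # path assumed

manager = ExternalSourceManager()  # cache_manager is optional
content = manager.process_url("https://github.com/org/repo/issues/123")
if content is not None:
    print(content.title, content.source_type)
```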
GitHubHandler¶
Bases: ExternalSourceHandler
Handler for GitHub issues, PRs, discussions, and gists.
GitLabHandler¶
Bases: ExternalSourceHandler
Handler for GitLab issues, MRs, and snippets.
JiraHandler¶
Bases: ExternalSourceHandler
Handler for JIRA tickets.
LinearHandler¶
Bases: ExternalSourceHandler
Handler for Linear issues.
NotionHandler¶
Bases: ExternalSourceHandler
Handler for Notion pages and databases.
CacheEntry dataclass¶
CacheEntry(key: str, value: Any, created_at: datetime, accessed_at: datetime, ttl_seconds: int, hit_count: int = 0, metadata: Dict[str, Any] = None)
PromptCache¶
PromptCache(cache_manager: Optional[Any] = None, enable_memory_cache: bool = True, enable_disk_cache: bool = True, memory_cache_size: int = 100)
Intelligent caching for prompt parsing operations.
Initialize prompt cache.
PARAMETER | DESCRIPTION |
---|---|
cache_manager | External cache manager to use |
enable_memory_cache | Whether to use in-memory caching TYPE: bool |
enable_disk_cache | Whether to use disk caching TYPE: bool |
memory_cache_size | Maximum items in memory cache TYPE: int |
Attributes¶
DEFAULT_TTLS class-attribute instance-attribute¶

TTL_MODIFIERS class-attribute instance-attribute¶

logger instance-attribute¶

cache_manager instance-attribute¶

enable_memory instance-attribute¶

enable_disk instance-attribute¶

memory_cache instance-attribute¶

memory_cache_size instance-attribute¶

stats instance-attribute¶
Functions¶
get¶
put¶
put(key: str, value: Any, ttl_seconds: Optional[int] = None, metadata: Optional[Dict[str, Any]] = None, write_disk: bool = True) -> None
cache_parsed_prompt¶
get_parsed_prompt¶
cache_external_content¶
get_external_content¶
cache_entities¶
get_entities¶
cache_intent¶
get_intent¶
invalidate¶
get_stats¶
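A small usage sketch of the cache. The put signature is documented above; get(key) and the import path are assumptions.

```python
from tenets.core.prompt.cache import PromptCache  # import path assumed

cache = PromptCache(enable_disk_cache=False, memory_cache_size=50)
cache.put("intent:fix-login-bug", "debug", ttl_seconds=3600)

value = cache.get("intent:fix-login-bug")  # get(key) signature assumed
print(value)              # 'debug' until the TTL expires
print(cache.get_stats())  # hit/miss statistics
```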
Entity dataclass¶
Entity(name: str, type: str, confidence: float, context: str = '', start_pos: int = -1, end_pos: int = -1, source: str = 'regex', metadata: Dict[str, Any] = dict())
Recognized entity with confidence and context.
Attributes¶
name instance-attribute¶

type instance-attribute¶

confidence instance-attribute¶

context class-attribute instance-attribute¶

start_pos class-attribute instance-attribute¶

end_pos class-attribute instance-attribute¶

source class-attribute instance-attribute¶

metadata class-attribute instance-attribute¶
EntityPatternMatcher¶
FuzzyEntityMatcher¶
HybridEntityRecognizer¶
HybridEntityRecognizer(use_nlp: bool = True, use_fuzzy: bool = True, patterns_file: Optional[Path] = None, spacy_model: str = 'en_core_web_sm', known_entities: Optional[Dict[str, List[str]]] = None)
Main entity recognizer combining all approaches.
Initialize hybrid entity recognizer.
PARAMETER | DESCRIPTION |
---|---|
use_nlp | Whether to use NLP-based NER TYPE: bool |
use_fuzzy | Whether to use fuzzy matching TYPE: bool |
patterns_file | Path to entity patterns JSON |
spacy_model | spaCy model name TYPE: str |
known_entities | Known entities for fuzzy matching |
Attributes¶
logger instance-attribute¶

pattern_matcher instance-attribute¶

nlp_recognizer instance-attribute¶

fuzzy_matcher instance-attribute¶

keyword_extractor instance-attribute¶
Functions¶
recognize¶
recognize(text: str, merge_overlapping: bool = True, min_confidence: float = 0.5) -> List[Entity]
Recognize entities using all available methods.
PARAMETER | DESCRIPTION |
---|---|
text | Text to extract entities from TYPE: str |
merge_overlapping | Whether to merge overlapping entities TYPE: bool |
min_confidence | Minimum confidence threshold TYPE: float |

RETURNS | DESCRIPTION |
---|---|
List[Entity] | List of recognized entities |
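Putting the documented signatures together, a usage sketch (import path assumed):

```python
from tenets.core.prompt.entity_recognizer import HybridEntityRecognizer  # path assumed

recognizer = HybridEntityRecognizer(use_nlp=False)  # patterns + fuzzy only
entities = recognizer.recognize(
    "refactor the UserAuth class in auth.py",
    min_confidence=0.6,
)
for entity in entities:
    print(f"{entity.type}: {entity.name} ({entity.confidence:.2f})")
```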
NLPEntityRecognizer¶
HybridIntentDetector¶
HybridIntentDetector(use_ml: bool = True, patterns_file: Optional[Path] = None, model_name: str = 'all-MiniLM-L6-v2')
Main intent detector combining pattern and ML approaches.
Initialize hybrid intent detector.
PARAMETER | DESCRIPTION |
---|---|
use_ml | Whether to use ML-based detection TYPE: bool |
patterns_file | Path to intent patterns JSON |
model_name | Embedding model name for ML TYPE: str |
Attributes¶
logger instance-attribute¶

pattern_detector instance-attribute¶

semantic_detector instance-attribute¶

keyword_extractor instance-attribute¶
Functions¶
detect¶
detect(text: str, combine_method: str = 'weighted', pattern_weight: float = 0.75, ml_weight: float = 0.25, min_confidence: float = 0.3) -> Intent
Detect the primary intent from text.
PARAMETER | DESCRIPTION |
---|---|
text | Text to analyze TYPE: str |
combine_method | How to combine results ('weighted', 'max', 'vote') TYPE: str |
pattern_weight | Weight for pattern-based detection TYPE: float |
ml_weight | Weight for ML-based detection TYPE: float |
min_confidence | Minimum confidence threshold TYPE: float |

RETURNS | DESCRIPTION |
---|---|
Intent | Primary intent detected |
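A usage sketch based on the signatures above (import path assumed):

```python
from tenets.core.prompt.intent_detector import HybridIntentDetector  # path assumed

detector = HybridIntentDetector(use_ml=False)  # pattern matching only
intent = detector.detect("fix the flaky login test", combine_method="max")
print(intent.type, intent.confidence)  # e.g. a 'debug' or 'test' intent with a score
```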
detect_multiple¶
Intent dataclass¶
Intent(type: str, confidence: float, evidence: List[str], keywords: List[str], metadata: Dict[str, Any], source: str)
PatternBasedDetector¶
SemanticIntentDetector¶
ML-based semantic intent detection using embeddings.
Initialize semantic intent detector.
PARAMETER | DESCRIPTION |
---|---|
model_name | Embedding model name TYPE: str |
PromptParser¶
PromptParser(config: TenetsConfig, cache_manager: Optional[Any] = None, use_cache: bool = True, use_ml: bool = None, use_nlp_ner: bool = None, use_fuzzy_matching: bool = True)
Comprehensive prompt parser with modular components and caching.
Attributes¶
config instance-attribute¶

logger instance-attribute¶

cache instance-attribute¶
Functions¶
parse¶
get_cache_stats¶
clear_cache¶
Clear all cached data.
This removes all cached parsing results, external content, entities, and intents from both memory and disk cache.
Example

```python
parser.clear_cache()
print("Cache cleared")
```
warm_cache¶
Pre-warm cache with common prompts.
This method pre-parses a list of common prompts to populate the cache, improving performance for frequently used queries.
PARAMETER | DESCRIPTION |
---|---|
common_prompts | List of common prompts to pre-parse |
Example

```python
common = [
    "implement authentication",
    "fix bug",
    "understand architecture",
]
parser.warm_cache(common)
```
TemporalExpression dataclass¶
TemporalExpression(text: str, type: str, start_date: Optional[datetime], end_date: Optional[datetime], is_relative: bool, is_recurring: bool, recurrence_pattern: Optional[str], confidence: float, metadata: Dict[str, Any])
Parsed temporal expression with metadata.
TemporalParser¶
Main temporal parser combining all approaches.
Initialize temporal parser.
PARAMETER | DESCRIPTION |
---|---|
patterns_file | Path to temporal patterns JSON file |
Attributes¶
logger instance-attribute¶

pattern_matcher instance-attribute¶
Functions¶
parse¶
Parse temporal expressions from text.
PARAMETER | DESCRIPTION |
---|---|
text | Text to parse TYPE: str |

RETURNS | DESCRIPTION |
---|---|
List[TemporalExpression] | List of temporal expressions |
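A usage sketch based on the documented parse() signature and the TemporalExpression fields (import path assumed):

```python
from tenets.core.prompt.temporal_parser import TemporalParser  # path assumed

parser = TemporalParser()
for expr in parser.parse("show changes from last week and the daily standup notes"):
    print(expr.text, expr.type, expr.is_recurring)
```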
TemporalPatternMatcher¶
Pattern-based temporal expression matching.
Initialize with temporal patterns.
PARAMETER | DESCRIPTION |
---|---|
patterns_file | Path to temporal patterns JSON file |
PromptContext dataclass¶
PromptContext(text: str, original: Optional[str] = None, keywords: list[str] = list(), task_type: str = 'general', intent: str = 'understand', entities: list[dict[str, Any]] = list(), file_patterns: list[str] = list(), focus_areas: list[str] = list(), temporal_context: Optional[dict[str, Any]] = None, scope: dict[str, Any] = dict(), external_context: Optional[dict[str, Any]] = None, metadata: dict[str, Any] = dict(), confidence_scores: dict[str, float] = dict(), session_id: Optional[str] = None, timestamp: datetime = datetime.now(), include_tests: bool = False)
Context extracted from user prompt.
Contains all information parsed from the prompt to guide file selection and ranking. This is the primary data structure that flows through the system after prompt parsing.
ATTRIBUTE | DESCRIPTION |
---|---|
text | The processed prompt text (cleaned and normalized) TYPE: str |
original | Original input (may be URL or raw text) |
keywords | Extracted keywords for searching |
task_type | Type of task detected TYPE: str |
intent | User intent classification TYPE: str |
entities | Named entities found (classes, functions, modules) |
file_patterns | File patterns to match (.py, test_, etc) |
focus_areas | Areas to focus on (auth, api, database, etc) |
temporal_context | Time-related context (recent, yesterday, etc) |
scope | Scope indicators (modules, directories, exclusions) |
external_context | Context from external sources (GitHub, JIRA) |
metadata | Additional metadata for processing |
confidence_scores | Confidence scores for various extractions |
session_id | Associated session if any |
timestamp | When context was created TYPE: datetime |
Attributes¶
text instance-attribute¶

original class-attribute instance-attribute¶

keywords class-attribute instance-attribute¶

task_type class-attribute instance-attribute¶

intent class-attribute instance-attribute¶

entities class-attribute instance-attribute¶

file_patterns class-attribute instance-attribute¶

focus_areas class-attribute instance-attribute¶

temporal_context class-attribute instance-attribute¶

scope class-attribute instance-attribute¶

external_context class-attribute instance-attribute¶

metadata class-attribute instance-attribute¶

confidence_scores class-attribute instance-attribute¶

session_id class-attribute instance-attribute¶

timestamp class-attribute instance-attribute¶

include_tests class-attribute instance-attribute¶
Functions¶
add_keyword¶
Add a keyword with confidence score.
add_entity¶
Add an entity with type and confidence.
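A brief sketch of building a context by hand. The positional argument order for add_keyword and add_entity is an assumption inferred from their descriptions, as is the import path.

```python
from tenets.core.prompt import PromptContext  # import path assumed

ctx = PromptContext(text="implement oauth2 login")

# Argument order below is assumed: (value, confidence) and (name, type, confidence).
ctx.add_keyword("oauth2", 0.9)
ctx.add_entity("UserAuth", "class", 0.8)
```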
from_dict classmethod¶
Create PromptContext from dictionary.
get_hash¶
Compute a deterministic cache key for this prompt context.
The hash incorporates the normalized prompt text, task type, and the ordered list of unique keywords. MD5 is chosen (with usedforsecurity=False) for speed; collision risk is acceptable for internal memoization.

RETURNS | DESCRIPTION |
---|---|
str | Hex digest suitable for use as an internal cache key. |
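As an illustration of the documented scheme (not the package's exact implementation), a standalone function computing such a key might look like this:

```python
import hashlib

def prompt_hash(text: str, task_type: str, keywords: list[str]) -> str:
    """Sketch of the documented scheme: normalized text + task type +
    ordered unique keywords, hashed with MD5 (usedforsecurity=False)."""
    unique = list(dict.fromkeys(keywords))  # de-duplicate, preserve order
    payload = "\x1f".join([text.strip().lower(), task_type, *unique])
    return hashlib.md5(payload.encode("utf-8"), usedforsecurity=False).hexdigest()
```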
TaskType¶
Bases: Enum
Types of tasks detected in prompts.
Attributes¶
FEATURE class-attribute instance-attribute¶

DEBUG class-attribute instance-attribute¶

TEST class-attribute instance-attribute¶

REFACTOR class-attribute instance-attribute¶

UNDERSTAND class-attribute instance-attribute¶

REVIEW class-attribute instance-attribute¶

DOCUMENT class-attribute instance-attribute¶

OPTIMIZE class-attribute instance-attribute¶

SECURITY class-attribute instance-attribute¶

ARCHITECTURE class-attribute instance-attribute¶

MIGRATION class-attribute instance-attribute¶

GENERAL class-attribute instance-attribute¶
Functions¶
create_parser¶
create_parser(config=None, use_cache: bool = True, use_ml: bool = None, cache_manager=None) -> PromptParser
Create a configured prompt parser.
Convenience function to quickly create a parser with sensible defaults. Uses centralized NLP components for all text processing.
PARAMETER | DESCRIPTION |
---|---|
config | Optional TenetsConfig instance (creates default if None) DEFAULT: None |
use_cache | Whether to enable caching (default: True) TYPE: bool |
use_ml | Whether to use ML features (None = auto-detect from config) TYPE: bool |
cache_manager | Optional cache manager for persistence DEFAULT: None |
RETURNS | DESCRIPTION |
---|---|
PromptParser | Configured PromptParser instance |
Example

```python
parser = create_parser()
context = parser.parse("add user authentication")
print(context.intent)
```
parse_prompt¶
parse_prompt(prompt: str, config=None, fetch_external: bool = True, use_cache: bool = False) -> Any
Parse a prompt without managing parser instances.
Convenience function for one-off prompt parsing. Uses centralized NLP components including keyword extraction and tokenization.
PARAMETER | DESCRIPTION |
---|---|
prompt | The prompt text or URL to parse TYPE: str |
config | Optional TenetsConfig instance DEFAULT: None |
fetch_external | Whether to fetch external content (default: True) TYPE: bool |
use_cache | Whether to use caching (default: False for one-off) TYPE: bool |
RETURNS | DESCRIPTION |
---|---|
Any | PromptContext with extracted information |
Example

```python
context = parse_prompt("implement caching layer")
print(f"Keywords: {context.keywords}")
print(f"Intent: {context.intent}")
```
extract_keywords¶
Extract keywords from text using NLP components.
Uses the centralized keyword extractor with YAKE/TF-IDF/frequency fallback chain for robust keyword extraction.
PARAMETER | DESCRIPTION |
---|---|
text | Input text to analyze TYPE: str |
max_keywords | Maximum number of keywords to extract TYPE: int |
RETURNS | DESCRIPTION |
---|---|
List[str] | List of extracted keywords |
Example

```python
keywords = extract_keywords("implement OAuth2 authentication")
print(keywords)  # ['oauth2', 'authentication', 'implement']
```
detect_intent¶
Detect user intent from prompt text.
PARAMETER | DESCRIPTION |
---|---|
prompt | The prompt text to analyze TYPE: str |
use_ml | Whether to use ML-based detection (requires ML dependencies) TYPE: bool |
RETURNS | DESCRIPTION |
---|---|
str | Intent type string (implement, debug, understand, etc.) |
Example

```python
intent = detect_intent("fix the authentication bug")
print(intent)  # 'debug'
```
extract_entities¶
extract_entities(text: str, min_confidence: float = 0.5, use_nlp: bool = False, use_fuzzy: bool = True) -> List[Dict[str, Any]]
Extract named entities from text.
Identifies classes, functions, files, modules, and other programming entities mentioned in the text.
PARAMETER | DESCRIPTION |
---|---|
text | Input text to analyze TYPE: str |
min_confidence | Minimum confidence threshold TYPE: float |
use_nlp | Whether to use NLP-based NER (requires spaCy) TYPE: bool |
use_fuzzy | Whether to use fuzzy matching TYPE: bool |
RETURNS | DESCRIPTION |
---|---|
List[Dict[str, Any]] | List of entity dictionaries with name, type, and confidence |
Example

```python
entities = extract_entities("update the UserAuth class in auth.py")
for entity in entities:
    print(f"{entity['type']}: {entity['name']}")
```
parse_external_reference¶
Parse an external reference URL.
Extracts information from GitHub issues, JIRA tickets, GitLab MRs, Linear issues, Asana tasks, Notion pages, and other external references.
PARAMETER | DESCRIPTION |
---|---|
url | URL to parse TYPE: str |
RETURNS | DESCRIPTION |
---|---|
Optional[Dict[str, Any]] | Dictionary with reference information or None if not recognized |
Example

```python
ref = parse_external_reference("https://github.com/org/repo/issues/123")
print(ref['type'])        # 'github'
print(ref['identifier'])  # 'org/repo#123'
```
extract_temporal¶
Extract temporal expressions from text.
Identifies dates, time ranges, relative dates, and recurring patterns.
PARAMETER | DESCRIPTION |
---|---|
text | Input text to analyze TYPE: str |
RETURNS | DESCRIPTION |
---|---|
List[Dict[str, Any]] | List of temporal expression dictionaries |
Example

```python
temporal = extract_temporal("changes from last week")
for expr in temporal:
    print(f"{expr['text']}: {expr['type']}")
```
Modules¶
- cache: Cache module
- entity_recognizer: Entity Recognizer module
- external_sources: External Sources module
- intent_detector: Intent Detector module
- normalizer: Normalizer module
- parser: Parser module
- temporal_parser: Temporal Parser module