programming_patterns¶
Full name: tenets.core.nlp.programming_patterns
Centralized programming patterns loader for NLP.
This module loads programming patterns from a JSON file and provides utilities for pattern matching, consolidating the duplicate logic from parser.py and strategies.py.
Classes¶
ProgrammingPatterns¶
Loads and manages programming patterns from JSON.
This class provides centralized access to programming patterns, eliminating duplication between parser.py and strategies.py.
| ATTRIBUTE | DESCRIPTION |
| --- | --- |
| `patterns` | Dictionary of pattern categories loaded from JSON |
| `logger` | Logger instance |
| `compiled_patterns` | Cache of compiled regex patterns |
Initialize programming patterns from JSON file.
| PARAMETER | DESCRIPTION |
| --- | --- |
| `patterns_file` | Path to patterns JSON file (uses default if None). TYPE: `Optional[Path]` DEFAULT: `None` |
Source code in tenets/core/nlp/programming_patterns.py
```python
def __init__(self, patterns_file: Optional[Path] = None):
    """Initialize programming patterns from JSON file.

    Args:
        patterns_file: Path to patterns JSON file (uses default if None)
    """
    self.logger = get_logger(__name__)

    # Default patterns file location
    if patterns_file is None:
        patterns_file = (
            Path(__file__).parent.parent.parent
            / "data"
            / "patterns"
            / "programming_patterns.json"
        )

    self.patterns = self._load_patterns(patterns_file)
    self.compiled_patterns = {}
    self._compile_all_patterns()
```
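The default-path arithmetic above can be checked in isolation: three `.parent` hops from the module file climb from `tenets/core/nlp/` to the package root before descending into `data/patterns/`. A minimal sketch (the literal path here is illustrative, not read from disk):

```python
from pathlib import Path

# Stand-in for __file__ inside tenets/core/nlp/programming_patterns.py.
module_file = Path("tenets") / "core" / "nlp" / "programming_patterns.py"

# Same expression as in __init__: three .parent hops reach the package root.
default = (
    module_file.parent.parent.parent
    / "data"
    / "patterns"
    / "programming_patterns.json"
)
print(default.as_posix())  # → tenets/data/patterns/programming_patterns.json
```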
Functions¶
extract_programming_keywords¶
Extract programming-specific keywords from text.
This replaces the duplicate methods in parser.py and strategies.py.
| PARAMETER | DESCRIPTION |
| --- | --- |
| `text` | Input text to extract keywords from. TYPE: `str` |

| RETURNS | DESCRIPTION |
| --- | --- |
| `List[str]` | List of unique programming keywords found |
Source code in tenets/core/nlp/programming_patterns.py
```python
def extract_programming_keywords(self, text: str) -> List[str]:
    """Extract programming-specific keywords from text.

    This replaces the duplicate methods in parser.py and strategies.py.

    Args:
        text: Input text to extract keywords from

    Returns:
        List of unique programming keywords found
    """
    keywords = set()
    text_lower = text.lower()

    # Check each category
    for category, config in self.patterns.items():
        # Check if any category keywords appear in text
        category_keywords = config.get("keywords", [])
        for keyword in category_keywords:
            # Check if keyword appears as a substring in text
            if keyword.lower() in text_lower:
                keywords.add(keyword)

        # Check regex patterns
        if category in self.compiled_patterns:
            for pattern in self.compiled_patterns[category]:
                if pattern.search(text):
                    # Add the category name as a keyword
                    keywords.add(category)
                    # Also add any matched keywords from this category
                    for keyword in category_keywords[:3]:  # Top 3 keywords
                        keywords.add(keyword)
                    break

    return sorted(list(keywords))
```
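To see the extraction logic on its own, here is a self-contained sketch with a hypothetical two-category patterns dictionary standing in for programming_patterns.json (the categories, keywords, and regexes are invented for illustration):

```python
import re
from typing import List

# Hypothetical patterns data; the real values come from programming_patterns.json.
patterns = {
    "authentication": {
        "keywords": ["login", "token", "oauth"],
        "patterns": [r"\bverify_\w+\b"],
    },
    "database": {
        "keywords": ["query", "schema"],
        "patterns": [r"\bSELECT\b"],
    },
}
compiled = {
    cat: [re.compile(p, re.IGNORECASE) for p in cfg["patterns"]]
    for cat, cfg in patterns.items()
}

def extract_programming_keywords(text: str) -> List[str]:
    """Same shape as the method above: substring hits plus regex hits."""
    keywords = set()
    text_lower = text.lower()
    for category, cfg in patterns.items():
        category_keywords = cfg["keywords"]
        # Substring matching on category keywords
        for kw in category_keywords:
            if kw.lower() in text_lower:
                keywords.add(kw)
        # First matching regex adds the category name plus its top keywords
        for pattern in compiled[category]:
            if pattern.search(text):
                keywords.add(category)
                keywords.update(category_keywords[:3])
                break
    return sorted(keywords)

print(extract_programming_keywords("verify_token called after login"))
# → ['authentication', 'login', 'oauth', 'token']
```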
analyze_code_patterns¶
Analyze code for pattern matches and scoring.
| PARAMETER | DESCRIPTION |
| --- | --- |
| `content` | File content to analyze. TYPE: `str` |
| `keywords` | Keywords from prompt for relevance checking. TYPE: `List[str]` |

| RETURNS | DESCRIPTION |
| --- | --- |
| `Dict[str, float]` | Dictionary of pattern scores by category |
Source code in tenets/core/nlp/programming_patterns.py
```python
def analyze_code_patterns(self, content: str, keywords: List[str]) -> Dict[str, float]:
    """Analyze code for pattern matches and scoring.

    Args:
        content: File content to analyze
        keywords: Keywords from prompt for relevance checking

    Returns:
        Dictionary of pattern scores by category
    """
    scores = {}

    # Lower case keywords for comparison
    keywords_lower = [kw.lower() for kw in keywords]

    for category, config in self.patterns.items():
        # Check if category is relevant to keywords
        category_keywords = config.get("keywords", [])

        # More sophisticated relevance check
        relevance_score = self._calculate_relevance(category_keywords, keywords_lower)

        if relevance_score > 0 and category in self.compiled_patterns:
            category_score = 0.0
            patterns = self.compiled_patterns[category]

            # Count pattern matches with better scoring
            for pattern in patterns:
                matches = pattern.findall(content)
                if matches:
                    # Logarithmic scaling with base 2 for a smoother curve;
                    # normalized to ~1.0 at 10 matches
                    match_score = math.log2(len(matches) + 1) / math.log2(11)
                    category_score += min(1.0, match_score)

            # Normalize and apply importance and relevance
            if patterns:
                normalized_score = category_score / len(patterns)
                importance = config.get("importance", 0.5)
                # Include relevance in final score
                scores[category] = normalized_score * importance * (0.5 + 0.5 * relevance_score)

    # Calculate overall pattern score as weighted average
    if scores:
        total_weight = sum(self.patterns[cat].get("importance", 0.5) for cat in scores)
        scores["overall"] = sum(
            scores[cat] * self.patterns[cat].get("importance", 0.5) / total_weight
            for cat in scores
            if cat != "overall"
        )
    else:
        scores["overall"] = 0.0

    return scores
```
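The logarithmic scaling can be sanity-checked on its own: `log2(n + 1) / log2(11)` rises steeply over the first few matches and reaches exactly 1.0 at ten, after which `min(1.0, …)` caps it. A small sketch:

```python
import math

def match_score(n_matches: int) -> float:
    """Per-pattern score used above: 0.0 at no matches, capped at 1.0."""
    return min(1.0, math.log2(n_matches + 1) / math.log2(11))

# Diminishing returns: one match already earns ~29% of the cap.
for n in (0, 1, 3, 10, 100):
    print(n, round(match_score(n), 3))
```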
get_pattern_categories¶
get_category_keywords¶
Get keywords for a specific category.
| PARAMETER | DESCRIPTION |
| --- | --- |
| `category` | Category name. TYPE: `str` |

| RETURNS | DESCRIPTION |
| --- | --- |
| `List[str]` | List of keywords for the category |
Source code in tenets/core/nlp/programming_patterns.py
```python
def get_category_keywords(self, category: str) -> List[str]:
    """Get keywords for a specific category.

    Args:
        category: Category name

    Returns:
        List of keywords for the category
    """
    # Handle common aliases
    category_map = {
        "auth": "authentication",
        "config": "configuration",
        "db": "database",
    }
    actual_category = category_map.get(category, category)

    if actual_category in self.patterns:
        return self.patterns[actual_category].get("keywords", [])
    return []
```
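The alias handling shared by these helpers is a plain dictionary lookup that falls back to the input unchanged; a minimal sketch:

```python
# The alias map used by the category helpers above.
category_map = {
    "auth": "authentication",
    "config": "configuration",
    "db": "database",
}

def resolve(category: str) -> str:
    """Map a short alias to its canonical category, or pass it through."""
    return category_map.get(category, category)

print(resolve("auth"), resolve("database"))  # → authentication database
```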
get_category_importance¶
Get importance score for a category.
| PARAMETER | DESCRIPTION |
| --- | --- |
| `category` | Category name. TYPE: `str` |

| RETURNS | DESCRIPTION |
| --- | --- |
| `float` | Importance score (0-1) |
Source code in tenets/core/nlp/programming_patterns.py
```python
def get_category_importance(self, category: str) -> float:
    """Get importance score for a category.

    Args:
        category: Category name

    Returns:
        Importance score (0-1)
    """
    # Handle common aliases
    category_map = {
        "auth": "authentication",
        "config": "configuration",
        "db": "database",
    }
    actual_category = category_map.get(category, category)

    if actual_category in self.patterns:
        return self.patterns[actual_category].get("importance", 0.5)
    return 0.5
```
match_patterns¶
Find all pattern matches in text for a category.
| PARAMETER | DESCRIPTION |
| --- | --- |
| `text` | Text to search. TYPE: `str` |
| `category` | Pattern category. TYPE: `str` |

| RETURNS | DESCRIPTION |
| --- | --- |
| `List[Tuple[str, int, int]]` | List of (matched_text, start_pos, end_pos) tuples |
Source code in tenets/core/nlp/programming_patterns.py
```python
def match_patterns(self, text: str, category: str) -> List[Tuple[str, int, int]]:
    """Find all pattern matches in text for a category.

    Args:
        text: Text to search
        category: Pattern category

    Returns:
        List of (matched_text, start_pos, end_pos) tuples
    """
    matches = []

    # Handle common aliases
    category_map = {
        "auth": "authentication",
        "config": "configuration",
        "db": "database",
    }
    actual_category = category_map.get(category, category)

    if actual_category in self.compiled_patterns:
        for pattern in self.compiled_patterns[actual_category]:
            for match in pattern.finditer(text):
                matches.append((match.group(), match.start(), match.end()))

    return matches
```
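The tuple shape returned by `match_patterns` is just `re.finditer` output unpacked into (text, start, end); a self-contained sketch with a hypothetical pattern:

```python
import re

# A hypothetical compiled pattern, standing in for one "authentication" regex.
pattern = re.compile(r"\blogin\b")
text = "login form posts to /login endpoint"

# Same unpacking as in match_patterns: matched text plus span positions.
matches = [(m.group(), m.start(), m.end()) for m in pattern.finditer(text)]
print(matches)  # → [('login', 0, 5), ('login', 21, 26)]
```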
Functions¶
get_programming_patterns¶
Get singleton instance of programming patterns.
| RETURNS | DESCRIPTION |
| --- | --- |
| `ProgrammingPatterns` | ProgrammingPatterns instance |
Source code in tenets/core/nlp/programming_patterns.py
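The source listing for this function is not reproduced above. A typical module-level singleton accessor might look like the following sketch; this is a hypothetical reconstruction, not the actual source (the `_instance` name and the stand-in class body are assumptions):

```python
from typing import Optional

class ProgrammingPatterns:  # stand-in for the real class, for illustration only
    pass

_instance: Optional[ProgrammingPatterns] = None  # hypothetical module-level cache

def get_programming_patterns() -> ProgrammingPatterns:
    """Lazily create one shared instance and reuse it on later calls."""
    global _instance
    if _instance is None:
        _instance = ProgrammingPatterns()
    return _instance

print(get_programming_patterns() is get_programming_patterns())  # → True
```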
extract_programming_keywords¶
analyze_code_patterns¶
Convenience function to analyze code patterns.
| PARAMETER | DESCRIPTION |
| --- | --- |
| `content` | File content. TYPE: `str` |
| `keywords` | Prompt keywords. TYPE: `List[str]` |

| RETURNS | DESCRIPTION |
| --- | --- |
| `Dict[str, float]` | Dictionary of pattern scores |
Source code in tenets/core/nlp/programming_patterns.py
```python
def analyze_code_patterns(content: str, keywords: List[str]) -> Dict[str, float]:
    """Convenience function to analyze code patterns.

    Args:
        content: File content
        keywords: Prompt keywords

    Returns:
        Dictionary of pattern scores
    """
    patterns = get_programming_patterns()
    return patterns.analyze_code_patterns(content, keywords)
```