factors¶
Full name: tenets.core.ranking.factors
Ranking factors and scored file models.
This module defines the data structures for ranking factors and scored files. It provides a comprehensive set of factors that contribute to relevance scoring, along with utilities for calculating weighted scores and generating explanations.
The ranking system uses multiple orthogonal factors to determine file relevance, allowing for flexible and accurate scoring across different use cases.
Classes¶
FactorWeight¶
Bases: Enum
Standard weight presets for ranking factors.
These presets provide balanced weights for different use cases. Can be overridden with custom weights in configuration.
RankingFactors dataclass¶
RankingFactors(keyword_match: float = 0.0, tfidf_similarity: float = 0.0, bm25_score: float = 0.0, path_relevance: float = 0.0, import_centrality: float = 0.0, dependency_depth: float = 0.0, git_recency: float = 0.0, git_frequency: float = 0.0, git_author_relevance: float = 0.0, complexity_relevance: float = 0.0, maintainability_score: float = 0.0, semantic_similarity: float = 0.0, type_relevance: float = 0.0, code_patterns: float = 0.0, ast_relevance: float = 0.0, test_coverage: float = 0.0, documentation_score: float = 0.0, custom_scores: Dict[str, float] = dict(), metadata: Dict[str, Any] = dict())
Comprehensive ranking factors for a file.
Each factor represents a different dimension of relevance. The final relevance score is computed as a weighted sum of these factors.
Factors are grouped into categories:

- Text-based: keyword_match, tfidf_similarity, bm25_score
- Structure-based: path_relevance, import_centrality, dependency_depth
- Git-based: git_recency, git_frequency, git_author_relevance
- Complexity-based: complexity_relevance, maintainability_score
- Semantic: semantic_similarity (requires ML)
- Pattern-based: code_patterns, ast_relevance
- Custom: custom_scores for project-specific factors
| ATTRIBUTE | DESCRIPTION |
|---|---|
| keyword_match | Direct keyword matching score (0-1). TYPE: float |
| tfidf_similarity | TF-IDF cosine similarity score (0-1). TYPE: float |
| bm25_score | BM25 relevance score (0-1). TYPE: float |
| path_relevance | File path relevance to query (0-1). TYPE: float |
| import_centrality | How central the file is in the import graph (0-1). TYPE: float |
| git_recency | How recently the file was modified (0-1). TYPE: float |
| git_frequency | How frequently the file changes (0-1). TYPE: float |
| git_author_relevance | Relevance based on commit authors (0-1). TYPE: float |
| complexity_relevance | Relevance based on code complexity (0-1). TYPE: float |
| maintainability_score | Code maintainability score (0-1). TYPE: float |
| semantic_similarity | ML-based semantic similarity (0-1). TYPE: float |
| type_relevance | Relevance based on file type (0-1). TYPE: float |
| code_patterns | Pattern matching score (0-1). TYPE: float |
| ast_relevance | AST structure relevance (0-1). TYPE: float |
| dependency_depth | Dependency tree depth score (0-1). TYPE: float |
| test_coverage | Test coverage relevance (0-1). TYPE: float |
| documentation_score | Documentation quality score (0-1). TYPE: float |
| custom_scores | Dictionary of custom factor scores. TYPE: Dict[str, float] |
| metadata | Additional metadata about factor calculation. TYPE: Dict[str, Any] |
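A minimal construction sketch based on the signature above; only the factors you have computed need to be passed, and everything else defaults to 0.0 (the import path follows the full module name shown at the top of this page). The custom factor name is hypothetical.

```python
from tenets.core.ranking.factors import RankingFactors

factors = RankingFactors(
    keyword_match=0.8,
    path_relevance=0.6,
    git_recency=0.4,
    custom_scores={"team_priority": 0.9},  # hypothetical project-specific factor
)
```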
Functions¶
get_weighted_score¶
Calculate weighted relevance score.
| PARAMETER | DESCRIPTION |
|---|---|
| weights | Dictionary mapping factor names to weights. TYPE: Dict[str, float] |
| normalize | Whether to normalize final score to [0, 1]. TYPE: bool. DEFAULT: True |

| RETURNS | DESCRIPTION |
|---|---|
| float | Weighted relevance score |
Source code in tenets/core/ranking/factors.py
```python
def get_weighted_score(self, weights: Dict[str, float], normalize: bool = True) -> float:
    """Calculate weighted relevance score.

    Args:
        weights: Dictionary mapping factor names to weights
        normalize: Whether to normalize final score to [0, 1]

    Returns:
        Weighted relevance score
    """
    score = 0.0
    total_weight = 0.0

    # Map attribute names to values
    factor_values = {
        "keyword_match": self.keyword_match,
        "tfidf_similarity": self.tfidf_similarity,
        "bm25_score": self.bm25_score,
        "path_relevance": self.path_relevance,
        "import_centrality": self.import_centrality,
        "dependency_depth": self.dependency_depth,
        "git_recency": self.git_recency,
        "git_frequency": self.git_frequency,
        "git_author_relevance": self.git_author_relevance,
        "complexity_relevance": self.complexity_relevance,
        "maintainability_score": self.maintainability_score,
        "semantic_similarity": self.semantic_similarity,
        "type_relevance": self.type_relevance,
        "code_patterns": self.code_patterns,
        "ast_relevance": self.ast_relevance,
        "test_coverage": self.test_coverage,
        "documentation_score": self.documentation_score,
    }

    # Add standard factors
    for factor_name, factor_value in factor_values.items():
        if factor_name in weights:
            weight = weights[factor_name]
            score += factor_value * weight
            total_weight += weight

    # Add custom factors
    for custom_name, custom_value in self.custom_scores.items():
        if custom_name in weights:
            weight = weights[custom_name]
            score += custom_value * weight
            total_weight += weight

    # Normalize if requested and weights exist
    if normalize and total_weight > 0:
        score = score / total_weight

    return max(0.0, min(1.0, score))
```
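A usage sketch for get_weighted_score, continuing the construction example above; the weight values are illustrative, not one of the shipped presets.

```python
weights = {
    "keyword_match": 0.3,
    "path_relevance": 0.2,
    "git_recency": 0.1,
    "team_priority": 0.4,  # custom factors are weighted through the same mapping
}

# With normalize=True the sum is divided by the total weight of the matched
# factors, then clamped to [0, 1].
score = factors.get_weighted_score(weights, normalize=True)
print(f"relevance: {score:.3f}")
```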
get_top_factors¶
Get the top contributing factors.
| PARAMETER | DESCRIPTION |
|---|---|
| weights | Factor weights. TYPE: Dict[str, float] |
| n | Number of top factors to return. TYPE: int. DEFAULT: 5 |

| RETURNS | DESCRIPTION |
|---|---|
| List[Tuple[str, float, float]] | List of (factor_name, value, contribution) tuples |
Source code in tenets/core/ranking/factors.py
```python
def get_top_factors(
    self, weights: Dict[str, float], n: int = 5
) -> List[Tuple[str, float, float]]:
    """Get the top contributing factors.

    Args:
        weights: Factor weights
        n: Number of top factors to return

    Returns:
        List of (factor_name, value, contribution) tuples
    """
    contributions = []

    # Calculate contributions for all factors
    factor_values = {
        "keyword_match": self.keyword_match,
        "tfidf_similarity": self.tfidf_similarity,
        "bm25_score": self.bm25_score,
        "path_relevance": self.path_relevance,
        "import_centrality": self.import_centrality,
        "dependency_depth": self.dependency_depth,
        "git_recency": self.git_recency,
        "git_frequency": self.git_frequency,
        "git_author_relevance": self.git_author_relevance,
        "complexity_relevance": self.complexity_relevance,
        "maintainability_score": self.maintainability_score,
        "semantic_similarity": self.semantic_similarity,
        "type_relevance": self.type_relevance,
        "code_patterns": self.code_patterns,
        "ast_relevance": self.ast_relevance,
        "test_coverage": self.test_coverage,
        "documentation_score": self.documentation_score,
    }

    for factor_name, factor_value in factor_values.items():
        if factor_name in weights and factor_value > 0:
            contribution = factor_value * weights[factor_name]
            contributions.append((factor_name, factor_value, contribution))

    # Add custom factors
    for custom_name, custom_value in self.custom_scores.items():
        if custom_name in weights and custom_value > 0:
            contribution = custom_value * weights[custom_name]
            contributions.append((custom_name, custom_value, contribution))

    # Sort by contribution
    contributions.sort(key=lambda x: x[2], reverse=True)

    return contributions[:n]
```
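Continuing the same sketch, the top contributors can be listed for display or debugging:

```python
for name, value, contribution in factors.get_top_factors(weights, n=3):
    print(f"{name}: value={value:.2f}, contribution={contribution:.3f}")
```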
to_dict¶
Convert factors to dictionary representation.
| RETURNS | DESCRIPTION |
|---|---|
| Dict[str, Any] | Dictionary with all factor values |
Source code in tenets/core/ranking/factors.py
```python
def to_dict(self) -> Dict[str, Any]:
    """Convert factors to dictionary representation.

    Returns:
        Dictionary with all factor values
    """
    return {
        "keyword_match": self.keyword_match,
        "tfidf_similarity": self.tfidf_similarity,
        "bm25_score": self.bm25_score,
        "path_relevance": self.path_relevance,
        "import_centrality": self.import_centrality,
        "dependency_depth": self.dependency_depth,
        "git_recency": self.git_recency,
        "git_frequency": self.git_frequency,
        "git_author_relevance": self.git_author_relevance,
        "complexity_relevance": self.complexity_relevance,
        "maintainability_score": self.maintainability_score,
        "semantic_similarity": self.semantic_similarity,
        "type_relevance": self.type_relevance,
        "code_patterns": self.code_patterns,
        "ast_relevance": self.ast_relevance,
        "test_coverage": self.test_coverage,
        "documentation_score": self.documentation_score,
        "custom_scores": self.custom_scores,
        "metadata": self.metadata,
    }
```
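A small serialization sketch; to_dict returns plain values, so the result is JSON-friendly as long as custom_scores and metadata hold JSON-serializable entries.

```python
import json

# Dump the factor breakdown, e.g. for logging or a report artifact.
print(json.dumps(factors.to_dict(), indent=2))
```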
RankedFile dataclass¶
RankedFile(analysis: FileAnalysis, score: float, factors: RankingFactors, explanation: str = '', confidence: float = 1.0, rank: Optional[int] = None, metadata: Dict[str, Any] = dict())
A file with its relevance ranking.
Combines a FileAnalysis with ranking scores and metadata. Provides utilities for comparison, explanation generation, and result formatting.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| analysis | The FileAnalysis object. TYPE: FileAnalysis |
| score | Overall relevance score (0-1). TYPE: float |
| factors | Detailed ranking factors. TYPE: RankingFactors |
| explanation | Human-readable ranking explanation. TYPE: str |
| confidence | Confidence in the ranking (0-1). TYPE: float |
| rank | Position in ranked list (1-based). TYPE: Optional[int] |
| metadata | Additional ranking metadata. TYPE: Dict[str, Any] |
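A construction sketch, assuming `analysis` is a FileAnalysis instance produced elsewhere in tenets (its constructor is not documented on this page) and reusing the factors and weights from the examples above.

```python
from tenets.core.ranking.factors import RankedFile

ranked = RankedFile(
    analysis=analysis,  # assumed: a FileAnalysis produced by the analyzer
    score=factors.get_weighted_score(weights),
    factors=factors,
    confidence=0.9,
    rank=1,
)
```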
Functions¶
generate_explanation¶
Generate human-readable explanation of ranking.
| PARAMETER | DESCRIPTION |
|---|---|
| weights | Factor weights used for ranking. TYPE: Dict[str, float] |
| verbose | Include detailed factor breakdown. TYPE: bool. DEFAULT: False |

| RETURNS | DESCRIPTION |
|---|---|
| str | Explanation string |
Source code in tenets/core/ranking/factors.py
```python
def generate_explanation(self, weights: Dict[str, float], verbose: bool = False) -> str:
    """Generate human-readable explanation of ranking.

    Args:
        weights: Factor weights used for ranking
        verbose: Include detailed factor breakdown

    Returns:
        Explanation string
    """
    if self.explanation and not verbose:
        return self.explanation

    # Get top contributing factors
    top_factors = self.factors.get_top_factors(weights, n=3)

    if not top_factors:
        return "Low relevance (no significant factors)"

    # Build explanation
    explanations = []

    for factor_name, value, contribution in top_factors:
        # Generate human-readable factor description
        if factor_name == "keyword_match":
            explanations.append(f"Strong keyword match ({value:.2f})")
        elif factor_name == "tfidf_similarity":
            explanations.append(f"High TF-IDF similarity ({value:.2f})")
        elif factor_name == "bm25_score":
            explanations.append(f"High BM25 relevance ({value:.2f})")
        elif factor_name == "semantic_similarity":
            explanations.append(f"High semantic similarity ({value:.2f})")
        elif factor_name == "path_relevance":
            explanations.append(f"Relevant file path ({value:.2f})")
        elif factor_name == "import_centrality":
            explanations.append(f"Central to import graph ({value:.2f})")
        elif factor_name == "git_recency":
            explanations.append(f"Recently modified ({value:.2f})")
        elif factor_name == "git_frequency":
            explanations.append(f"Frequently changed ({value:.2f})")
        elif factor_name == "complexity_relevance":
            explanations.append(f"Relevant complexity ({value:.2f})")
        elif factor_name == "code_patterns":
            explanations.append(f"Matching code patterns ({value:.2f})")
        elif factor_name == "type_relevance":
            explanations.append(f"Relevant file type ({value:.2f})")
        else:
            explanations.append(f"{factor_name.replace('_', ' ').title()} ({value:.2f})")

    if verbose:
        # Add confidence and rank info
        if self.rank:
            explanations.append(f"Rank: #{self.rank}")
        explanations.append(f"Confidence: {self.confidence:.2f}")

    explanation = "; ".join(explanations)
    self.explanation = explanation

    return explanation
```
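A short usage sketch, continuing the RankedFile example above; weights is the same mapping used for scoring, and the result is cached in ranked.explanation after a non-verbose call.

```python
# Prints a "; "-joined summary of the top factors, plus rank and confidence
# when verbose=True.
print(ranked.generate_explanation(weights, verbose=True))
```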
to_dict¶
Convert to dictionary representation.
| RETURNS | DESCRIPTION |
|---|---|
| Dict[str, Any] | Dictionary with all ranking information |
Source code in tenets/core/ranking/factors.py
```python
def to_dict(self) -> Dict[str, Any]:
    """Convert to dictionary representation.

    Returns:
        Dictionary with all ranking information
    """
    return {
        "path": self.analysis.path,
        "score": self.score,
        "rank": self.rank,
        "confidence": self.confidence,
        "explanation": self.explanation,
        "factors": self.factors.to_dict(),
        "metadata": self.metadata,
        "file_info": {
            "name": self.file_name,
            "language": self.language,
            "size": self.analysis.size,
            "lines": self.analysis.lines,
        },
    }
```
RankingExplainer¶
Utility class for generating ranking explanations.
Provides detailed explanations of why files ranked the way they did, useful for debugging and understanding ranking behavior.
Supports multiple output formats:

- text: Human-readable text for CLI output
- json: Structured JSON for programmatic use
- markdown: Formatted markdown for documentation
Initialize the explainer.
Functions¶
explain_ranking¶
explain_ranking(ranked_files: List[RankedFile], weights: Dict[str, float], top_n: int = 10, include_factors: bool = True) -> str
Generate comprehensive ranking explanation.
| PARAMETER | DESCRIPTION |
|---|---|
| ranked_files | List of ranked files. TYPE: List[RankedFile] |
| weights | Factor weights used. TYPE: Dict[str, float] |
| top_n | Number of top files to explain. TYPE: int. DEFAULT: 10 |
| include_factors | Include factor breakdown. TYPE: bool. DEFAULT: True |

| RETURNS | DESCRIPTION |
|---|---|
| str | Formatted explanation string |
Source code in tenets/core/ranking/factors.py
```python
def explain_ranking(
    self,
    ranked_files: List[RankedFile],
    weights: Dict[str, float],
    top_n: int = 10,
    include_factors: bool = True,
) -> str:
    """Generate comprehensive ranking explanation.

    Args:
        ranked_files: List of ranked files
        weights: Factor weights used
        top_n: Number of top files to explain
        include_factors: Include factor breakdown

    Returns:
        Formatted explanation string
    """
    lines = []
    lines.append("=" * 80)
    lines.append("RANKING EXPLANATION")
    lines.append("=" * 80)
    lines.append("")

    # Summary statistics
    lines.append(f"Total files ranked: {len(ranked_files)}")
    if ranked_files:
        lines.append(f"Score range: {ranked_files[0].score:.3f} - {ranked_files[-1].score:.3f}")
        avg_score = sum(f.score for f in ranked_files) / len(ranked_files)
        lines.append(f"Average score: {avg_score:.3f}")
    lines.append("")

    # Weight configuration
    lines.append("Factor Weights:")
    sorted_weights = sorted(weights.items(), key=lambda x: x[1], reverse=True)
    for factor, weight in sorted_weights:
        if weight > 0:
            lines.append(f" {factor:25s}: {weight:.2f}")
    lines.append("")

    # Top files explanation
    lines.append(f"Top {min(top_n, len(ranked_files))} Files:")
    lines.append("-" * 80)

    for i, ranked_file in enumerate(ranked_files[:top_n], 1):
        lines.append(f"\n{i}. {ranked_file.path}")
        lines.append(f" Score: {ranked_file.score:.3f}")
        lines.append(f" {ranked_file.generate_explanation(weights, verbose=False)}")

        if include_factors:
            lines.append(" Factor Breakdown:")
            top_factors = ranked_file.factors.get_top_factors(weights, n=5)
            for factor_name, value, contribution in top_factors:
                lines.append(
                    f"    - {factor_name:20s}: {value:.3f} × {weights.get(factor_name, 0):.2f} = {contribution:.3f}"
                )

    return "\n".join(lines)
```
compare_rankings¶
compare_rankings(rankings1: List[RankedFile], rankings2: List[RankedFile], labels: Tuple[str, str] = ('Ranking 1', 'Ranking 2')) -> str
Compare two different rankings.
Useful for understanding how different algorithms or weights affect ranking results.
| PARAMETER | DESCRIPTION |
|---|---|
| rankings1 | First ranking. TYPE: List[RankedFile] |
| rankings2 | Second ranking. TYPE: List[RankedFile] |
| labels | Labels for the two rankings. TYPE: Tuple[str, str]. DEFAULT: ('Ranking 1', 'Ranking 2') |

| RETURNS | DESCRIPTION |
|---|---|
| str | Comparison report |
Source code in tenets/core/ranking/factors.py
```python
def compare_rankings(
    self,
    rankings1: List[RankedFile],
    rankings2: List[RankedFile],
    labels: Tuple[str, str] = ("Ranking 1", "Ranking 2"),
) -> str:
    """Compare two different rankings.

    Useful for understanding how different algorithms or weights
    affect ranking results.

    Args:
        rankings1: First ranking
        rankings2: Second ranking
        labels: Labels for the two rankings

    Returns:
        Comparison report
    """
    lines = []
    lines.append("=" * 80)
    lines.append("RANKING COMPARISON")
    lines.append("=" * 80)
    lines.append("")

    # Create path to rank mappings
    rank1_map = {r.path: i + 1 for i, r in enumerate(rankings1)}
    rank2_map = {r.path: i + 1 for i, r in enumerate(rankings2)}

    # Find differences
    all_paths = set(rank1_map.keys()) | set(rank2_map.keys())

    differences = []
    for path in all_paths:
        rank1 = rank1_map.get(path, len(rankings1) + 1)
        rank2 = rank2_map.get(path, len(rankings2) + 1)
        diff = abs(rank1 - rank2)
        differences.append((path, rank1, rank2, diff))

    # Sort by difference
    differences.sort(key=lambda x: x[3], reverse=True)

    # Report
    lines.append(f"{labels[0]}: {len(rankings1)} files")
    lines.append(f"{labels[1]}: {len(rankings2)} files")
    lines.append("")
    lines.append("Largest Rank Differences:")
    lines.append("-" * 80)

    for path, rank1, rank2, diff in differences[:10]:
        if diff > 0:
            direction = "↑" if rank2 < rank1 else "↓"
            lines.append(
                f"{Path(path).name:30s}: #{rank1:3d} → #{rank2:3d} ({direction}{diff:3d})"
            )

    return "\n".join(lines)
```
explain_file_ranking¶
explain_file_ranking(ranked_file: RankedFile, weights: Dict[str, float], format: str = 'text') -> Dict[str, Any]
Generate detailed explanation for a single file's ranking.
Provides comprehensive breakdown of why a file received its score, useful for debugging relevance issues.
| PARAMETER | DESCRIPTION |
|---|---|
| ranked_file | The ranked file to explain. TYPE: RankedFile |
| weights | Factor weights used for ranking. TYPE: Dict[str, float] |
| format | Output format ('text', 'json', 'markdown'). TYPE: str. DEFAULT: 'text' |

| RETURNS | DESCRIPTION |
|---|---|
| Dict[str, Any] | Dictionary with explanation data and formatted output |
Source code in tenets/core/ranking/factors.py
```python
def explain_file_ranking(
    self,
    ranked_file: RankedFile,
    weights: Dict[str, float],
    format: str = "text",
) -> Dict[str, Any]:
    """Generate detailed explanation for a single file's ranking.

    Provides comprehensive breakdown of why a file received its score,
    useful for debugging relevance issues.

    Args:
        ranked_file: The ranked file to explain
        weights: Factor weights used for ranking
        format: Output format ('text', 'json', 'markdown')

    Returns:
        Dictionary with explanation data and formatted output
    """
    factors = ranked_file.factors
    all_factors = factors.to_dict()

    # Calculate contribution for each factor
    contributions = []
    total_weight = sum(weights.values())

    for factor_name, value in all_factors.items():
        if factor_name in ("custom_scores", "metadata"):
            continue

        weight = weights.get(factor_name, 0)
        contribution = value * weight
        normalized = contribution / total_weight if total_weight > 0 else 0

        if value > 0 or weight > 0:
            contributions.append(
                {
                    "factor": factor_name,
                    "display_name": self.FACTOR_NAMES.get(factor_name, factor_name),
                    "description": self.FACTOR_DESCRIPTIONS.get(factor_name, ""),
                    "value": round(value, 4),
                    "weight": round(weight, 4),
                    "contribution": round(contribution, 4),
                    "normalized_contribution": round(normalized, 4),
                    "percentage": round(normalized * 100, 2) if total_weight > 0 else 0,
                }
            )

    # Sort by contribution
    contributions.sort(key=lambda x: x["contribution"], reverse=True)

    # Build result
    result = {
        "file": ranked_file.path,
        "score": round(ranked_file.score, 4),
        "rank": ranked_file.rank,
        "confidence": round(ranked_file.confidence, 4),
        "factors": contributions,
        "top_contributors": [c for c in contributions[:5] if c["contribution"] > 0],
        "zero_factors": [c["factor"] for c in contributions if c["value"] == 0],
        "summary": ranked_file.generate_explanation(weights, verbose=True),
    }

    # Format output based on requested format
    if format == "markdown":
        result["formatted"] = self._format_markdown(result)
    elif format == "text":
        result["formatted"] = self._format_text(result)
    else:
        result["formatted"] = None  # JSON format doesn't need extra formatting

    return result
```
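A usage sketch for inspecting a single result, continuing the earlier examples; the 'formatted' entry is populated for the text and markdown formats and left as None for json.

```python
detail = explainer.explain_file_ranking(ranked_files[0], weights, format="markdown")
print(detail["formatted"])

# Per-factor breakdown of the largest positive contributors.
for item in detail["top_contributors"]:
    print(item["factor"], item["percentage"], "%")
```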
debug_ranking¶
debug_ranking(ranked_files: List[RankedFile], weights: Dict[str, float], query: str = '') -> Dict[str, Any]
Generate comprehensive debug information for ranking results.
Provides detailed analysis of the ranking process, useful for understanding and improving ranking quality.
| PARAMETER | DESCRIPTION |
|---|---|
| ranked_files | List of ranked files. TYPE: List[RankedFile] |
| weights | Factor weights used. TYPE: Dict[str, float] |
| query | Original query string. TYPE: str. DEFAULT: '' |

| RETURNS | DESCRIPTION |
|---|---|
| Dict[str, Any] | Dictionary with comprehensive debug information |
Source code in tenets/core/ranking/factors.py
```python
def debug_ranking(
    self,
    ranked_files: List[RankedFile],
    weights: Dict[str, float],
    query: str = "",
) -> Dict[str, Any]:
    """Generate comprehensive debug information for ranking results.

    Provides detailed analysis of the ranking process, useful for
    understanding and improving ranking quality.

    Args:
        ranked_files: List of ranked files
        weights: Factor weights used
        query: Original query string

    Returns:
        Dictionary with comprehensive debug information
    """
    if not ranked_files:
        return {
            "error": "No files to analyze",
            "query": query,
        }

    # Calculate statistics
    scores = [rf.score for rf in ranked_files]
    avg_score = sum(scores) / len(scores)
    score_variance = sum((s - avg_score) ** 2 for s in scores) / len(scores)

    # Analyze factor usage
    factor_usage = {}
    for rf in ranked_files:
        factors_dict = rf.factors.to_dict()
        for factor, value in factors_dict.items():
            if factor in ("custom_scores", "metadata"):
                continue

            if factor not in factor_usage:
                factor_usage[factor] = {
                    "count_nonzero": 0,
                    "sum": 0,
                    "max": 0,
                    "values": [],
                }

            if value > 0:
                factor_usage[factor]["count_nonzero"] += 1
                factor_usage[factor]["sum"] += value
                factor_usage[factor]["max"] = max(factor_usage[factor]["max"], value)
                factor_usage[factor]["values"].append(value)

    # Calculate factor averages
    for factor, stats in factor_usage.items():
        if stats["values"]:
            stats["avg"] = sum(stats["values"]) / len(stats["values"])
        else:
            stats["avg"] = 0
        del stats["values"]  # Remove raw values from output

    # Identify problematic patterns
    issues = []

    # Check for factors with zero contribution
    for factor, stats in factor_usage.items():
        weight = weights.get(factor, 0)
        if weight > 0 and stats["count_nonzero"] == 0:
            issues.append(
                {
                    "type": "unused_factor",
                    "factor": factor,
                    "message": f"Factor '{factor}' has weight {weight} but all values are 0",
                }
            )

    # Check for score clustering
    if score_variance < 0.01:
        issues.append(
            {
                "type": "low_variance",
                "message": "Score variance is very low - files may not be well differentiated",
                "variance": score_variance,
            }
        )

    # Check for threshold issues
    threshold_count = sum(1 for s in scores if s > 0.5)
    if threshold_count == 0:
        issues.append(
            {
                "type": "low_scores",
                "message": "No files scored above 0.5 - query may be too specific",
            }
        )
    elif threshold_count == len(scores):
        issues.append(
            {
                "type": "high_scores",
                "message": "All files scored above 0.5 - query may be too broad",
            }
        )

    return {
        "query": query,
        "total_files": len(ranked_files),
        "statistics": {
            "min_score": round(min(scores), 4),
            "max_score": round(max(scores), 4),
            "avg_score": round(avg_score, 4),
            "variance": round(score_variance, 4),
            "median_score": round(sorted(scores)[len(scores) // 2], 4),
        },
        "weights": {k: round(v, 4) for k, v in weights.items() if v > 0},
        "factor_usage": {
            k: {kk: round(vv, 4) if isinstance(vv, float) else vv for kk, vv in v.items()}
            for k, v in factor_usage.items()
        },
        "issues": issues,
        "top_files": [
            {
                "path": rf.path,
                "score": round(rf.score, 4),
                "top_factor": (
                    rf.factors.get_top_factors(weights, n=1)[0][0]
                    if rf.factors.get_top_factors(weights, n=1)
                    else None
                ),
            }
            for rf in ranked_files[:10]
        ],
    }
```
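A debugging sketch, continuing the earlier examples; the query string is only echoed back in the report, so any descriptive text works.

```python
debug = explainer.debug_ranking(ranked_files, weights, query="authentication flow")

# Score distribution summary and any detected issues (unused factors,
# low variance, scores clustered above or below 0.5).
print(debug["statistics"])
for issue in debug["issues"]:
    print(f"[{issue['type']}] {issue.get('message', '')}")
```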