ranker¶
Full name: tenets.core.ranking.ranker
Main relevance ranking orchestrator.
This module provides the main RelevanceRanker class that coordinates different ranking strategies, manages corpus analysis, and produces ranked results. It supports multiple algorithms, parallel processing, and custom ranking extensions.
The ranker is designed to be efficient, scalable, and extensible while providing high-quality relevance scoring for code search and context generation.
Classes¶
RankingAlgorithm¶
Bases: Enum
Available ranking algorithms.
Each algorithm provides different trade-offs between speed and accuracy.
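As a quick usage sketch (sticking to the one member this page confirms, BALANCED, and its string value "balanced"):
from tenets.core.ranking.ranker import RankingAlgorithm

# "balanced" is the documented default value; unknown names raise ValueError,
# which RelevanceRanker catches and maps to RankingAlgorithm.BALANCED.
algo = RankingAlgorithm("balanced")
assert algo is RankingAlgorithm.BALANCED

try:
    RankingAlgorithm("definitely-not-an-algorithm")
except ValueError:
    print("unknown algorithm name -> fall back to BALANCED")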
RankingStats dataclass¶
RankingStats(total_files: int = 0, files_ranked: int = 0, files_failed: int = 0, time_elapsed: float = 0.0, algorithm_used: str = '', threshold_applied: float = 0.0, files_above_threshold: int = 0, average_score: float = 0.0, max_score: float = 0.0, min_score: float = 0.0, corpus_stats: Dict[str, Any] = None)
Statistics from ranking operation.
Tracks performance metrics and diagnostic information about the ranking process for monitoring and optimization.
ATTRIBUTE | DESCRIPTION |
---|---|
total_files | Total number of files processed. TYPE: int |
files_ranked | Number of files successfully ranked. TYPE: int |
files_failed | Number of files that failed ranking. TYPE: int |
time_elapsed | Total time in seconds. TYPE: float |
algorithm_used | Which algorithm was used. TYPE: str |
threshold_applied | Relevance threshold used. TYPE: float |
files_above_threshold | Number of files above threshold. TYPE: int |
average_score | Average relevance score. TYPE: float |
max_score | Maximum relevance score. TYPE: float |
min_score | Minimum relevance score. TYPE: float |
corpus_stats | Dictionary of corpus statistics. TYPE: Dict[str, Any] |
Functions¶
to_dict¶
Convert to dictionary representation.
RETURNS | DESCRIPTION |
---|---|
Dict[str, Any] | Dictionary with all statistics |
Source code in tenets/core/ranking/ranker.py
def to_dict(self) -> Dict[str, Any]:
"""Convert to dictionary representation.
Returns:
Dictionary with all statistics
"""
return {
"total_files": self.total_files,
"files_ranked": self.files_ranked,
"files_failed": self.files_failed,
"time_elapsed": self.time_elapsed,
"algorithm_used": self.algorithm_used,
"threshold_applied": self.threshold_applied,
"files_above_threshold": self.files_above_threshold,
"average_score": self.average_score,
"max_score": self.max_score,
"min_score": self.min_score,
"corpus_stats": self.corpus_stats,
}
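A minimal sketch of exporting these statistics for logging; the numeric values below are illustrative only:
import json

from tenets.core.ranking.ranker import RankingStats

# Illustrative values only; in practice RelevanceRanker fills ranker.stats.
stats = RankingStats(
    total_files=120,
    files_ranked=118,
    files_failed=2,
    time_elapsed=1.42,
    algorithm_used="balanced",
    threshold_applied=0.10,
    files_above_threshold=37,
    average_score=0.31,
    max_score=0.92,
    min_score=0.02,
)

# to_dict() flattens everything for logging or JSON export.
print(json.dumps(stats.to_dict(), indent=2))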
RelevanceRanker¶
RelevanceRanker(config: TenetsConfig, algorithm: Optional[str] = None, use_stopwords: Optional[bool] = None)
Main relevance ranking system.
Orchestrates the ranking process by analyzing the corpus, selecting appropriate strategies, and producing ranked results. Supports multiple algorithms, parallel processing, and custom ranking extensions.
The ranker follows a multi-stage process:

1. Corpus analysis (TF-IDF, import graph, statistics)
2. Strategy selection based on algorithm
3. Parallel factor calculation
4. Score aggregation and weighting
5. Filtering and sorting
ATTRIBUTE | DESCRIPTION |
---|---|
config | TenetsConfig instance |
logger | Logger instance |
strategies | Available ranking strategies |
custom_rankers | Custom ranking functions |
executor | Thread pool for parallel processing |
stats | Latest ranking statistics |
cache | Internal cache for optimizations |
Initialize the relevance ranker.
PARAMETER | DESCRIPTION |
---|---|
config | Tenets configuration. TYPE: TenetsConfig |
algorithm | Override default algorithm. TYPE: Optional[str], DEFAULT: None |
use_stopwords | Override stopword filtering setting. TYPE: Optional[bool], DEFAULT: None |
Source code in tenets/core/ranking/ranker.py
def __init__(
self,
config: TenetsConfig,
algorithm: Optional[str] = None,
use_stopwords: Optional[bool] = None,
):
"""Initialize the relevance ranker.
Args:
config: Tenets configuration
algorithm: Override default algorithm
use_stopwords: Override stopword filtering setting
"""
self.config = config
self.logger = get_logger(__name__)
# Determine algorithm
algo_str = algorithm or config.ranking.algorithm
try:
self.algorithm = RankingAlgorithm(algo_str)
except ValueError:
self.logger.warning(f"Unknown algorithm '{algo_str}', using balanced")
self.algorithm = RankingAlgorithm.BALANCED
# Stopword configuration
self.use_stopwords = (
use_stopwords if use_stopwords is not None else config.ranking.use_stopwords
)
# ML configuration
self.use_ml = (
config.ranking.use_ml if config and hasattr(config.ranking, "use_ml") else False
)
# Initialize strategies lazily to avoid loading unnecessary models
self._strategies_cache: Dict[RankingAlgorithm, RankingStrategy] = {}
self.strategies = self._strategies_cache # Alias for compatibility
# Pre-populate core strategies for tests that expect them
# These are lightweight and don't load ML models until actually used
self._init_core_strategies()
# Custom rankers list (keep public and test-expected private alias)
self.custom_rankers: List[Callable] = []
self._custom_rankers: List[Callable] = self.custom_rankers
# Thread pool for parallel ranking (lazy initialization to avoid Windows issues)
from tenets.utils.multiprocessing import get_ranking_workers, log_worker_info
max_workers = get_ranking_workers(config)
self.max_workers = max_workers # Store for logging
self._executor_instance = None # Will be created lazily
# Backwards-compat alias expected by some tests
self._executor = None
# Statistics and cache
self.stats = RankingStats()
self.cache = {}
# ML model (loaded lazily)
self._ml_model = None
# Optional ML embedding model placeholder for tests that patch it
# Also expose module-level symbol on instance for convenience
self.SentenceTransformer = SentenceTransformer
# Log worker configuration
log_worker_info(self.logger, "RelevanceRanker", max_workers)
self.logger.info(
f"RelevanceRanker initialized: algorithm={self.algorithm.value}, "
f"use_stopwords={self.use_stopwords}, use_ml={self.use_ml}"
)
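A minimal construction sketch; the TenetsConfig import path is an assumption, since only this module's path appears on this page:
from tenets.config import TenetsConfig  # assumed location; not documented here
from tenets.core.ranking.ranker import RelevanceRanker

config = TenetsConfig()  # default configuration, as create_ranker() does below
ranker = RelevanceRanker(config, algorithm="balanced", use_stopwords=False)
print(ranker.algorithm.value)  # "balanced"
print(ranker.max_workers)      # worker count derived from the config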
Attributes¶
executor property¶
Lazy initialization of ThreadPoolExecutor to avoid Windows import issues.
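The general pattern this property describes looks roughly like the following hypothetical sketch (not the ranker's actual code):
from concurrent.futures import ThreadPoolExecutor

class LazyExecutorSketch:
    """Illustrative pattern only; not the ranker's actual implementation."""

    def __init__(self, max_workers: int):
        self.max_workers = max_workers
        self._executor_instance = None

    @property
    def executor(self) -> ThreadPoolExecutor:
        # Create the pool on first access so that importing the module never
        # spins up threads as a side effect (the Windows issue noted above).
        if self._executor_instance is None:
            self._executor_instance = ThreadPoolExecutor(max_workers=self.max_workers)
        return self._executor_instance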
Functions¶
rank_files¶
rank_files(files: List[FileAnalysis], prompt_context: PromptContext, algorithm: Optional[str] = None, parallel: bool = True, explain: bool = False) -> List[FileAnalysis]
Rank files by relevance to prompt.
This is the main entry point for ranking files. It analyzes the corpus, applies the selected ranking strategy, and returns files sorted by relevance above the configured threshold.
PARAMETER | DESCRIPTION |
---|---|
files | List of files to rank. TYPE: List[FileAnalysis] |
prompt_context | Parsed prompt information. TYPE: PromptContext |
algorithm | Override algorithm for this ranking. TYPE: Optional[str], DEFAULT: None |
parallel | Whether to rank files in parallel. TYPE: bool, DEFAULT: True |
explain | Whether to generate ranking explanations. TYPE: bool, DEFAULT: False |
RETURNS | DESCRIPTION |
---|---|
List[FileAnalysis] | List of FileAnalysis objects sorted by relevance (highest first) and filtered by threshold |
RAISES | DESCRIPTION |
---|---|
ValueError | If algorithm is invalid |
Source code in tenets/core/ranking/ranker.py
def rank_files(
self,
files: List[FileAnalysis],
prompt_context: PromptContext,
algorithm: Optional[str] = None,
parallel: bool = True,
explain: bool = False,
) -> List[FileAnalysis]:
"""Rank files by relevance to prompt.
This is the main entry point for ranking files. It analyzes the corpus,
applies the selected ranking strategy, and returns files sorted by
relevance above the configured threshold.
Args:
files: List of files to rank
prompt_context: Parsed prompt information
algorithm: Override algorithm for this ranking
parallel: Whether to rank files in parallel
explain: Whether to generate ranking explanations
Returns:
List of FileAnalysis objects sorted by relevance (highest first)
and filtered by threshold
Raises:
ValueError: If algorithm is invalid
"""
if not files:
return []
start_time = time.time()
# Reset statistics
self.stats = RankingStats(
total_files=len(files),
algorithm_used=algorithm or self.algorithm.value,
threshold_applied=self.config.ranking.threshold,
)
# Check if we need to disable parallel on Windows Python 3.13+
import sys
if sys.platform == "win32" and sys.version_info >= (3, 13) and parallel:
self.logger.warning(
"Disabling parallel ranking on Windows with Python 3.13+ due to compatibility issues"
)
parallel = False
self.logger.info(
f"Ranking {len(files)} files using {self.stats.algorithm_used} algorithm "
f"(parallel={parallel}, workers={self.max_workers if parallel else 1})"
)
# Select strategy
if algorithm:
try:
strategy = self._get_strategy(algorithm)
except ValueError:
raise ValueError(f"Unknown ranking algorithm: {algorithm}")
else:
strategy = self._get_strategy(self.algorithm.value)
if not strategy:
raise ValueError(f"No strategy for algorithm: {self.algorithm}")
# Analyze corpus
corpus_stats = self._analyze_corpus(files, prompt_context)
self.stats.corpus_stats = corpus_stats
# Rank files
ranked_files = self._rank_with_strategy(
files, prompt_context, corpus_stats, strategy, parallel
)
# Apply custom rankers
for custom_ranker in self.custom_rankers:
try:
ranked_files = custom_ranker(ranked_files, prompt_context)
except Exception as e:
self.logger.warning(f"Custom ranker failed: {e}")
# Sort by score
ranked_files.sort(reverse=True)
# Filter by threshold and update statistics
threshold = self.config.ranking.threshold
filtered_files = []
scores = []
for i, rf in enumerate(ranked_files):
scores.append(rf.score)
if rf.score >= threshold:
# Update FileAnalysis with ranking info
rf.analysis.relevance_score = rf.score
rf.analysis.relevance_rank = i + 1
# Generate explanation if requested
if explain:
rf.explanation = rf.generate_explanation(strategy.get_weights(), verbose=True)
filtered_files.append(rf.analysis)
# Update statistics
self.stats.files_ranked = len(ranked_files)
self.stats.files_above_threshold = len(filtered_files)
self.stats.time_elapsed = time.time() - start_time
if scores:
self.stats.average_score = sum(scores) / len(scores)
self.stats.max_score = max(scores)
self.stats.min_score = min(scores)
# If nothing passed threshold, fall back to returning top 1-3 files
if not filtered_files and ranked_files:
top_k = min(3, len(ranked_files))
fallback = [rf.analysis for rf in ranked_files[:top_k]]
for i, a in enumerate(fallback, 1):
a.relevance_score = ranked_files[i - 1].score
a.relevance_rank = i
filtered_files = fallback
self.logger.info(
f"Ranking complete: {len(filtered_files)}/{len(files)} files "
f"above threshold ({threshold:.2f}) in {self.stats.time_elapsed:.2f}s"
)
# Generate explanation report if requested
if explain and ranked_files:
explainer = RankingExplainer()
explanation = explainer.explain_ranking(ranked_files[:20], strategy.get_weights())
self.logger.info(f"Ranking Explanation:\n{explanation}")
return filtered_files
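A usage sketch against an already-constructed ranker; files, prompt_context, and the path attribute used for display are assumptions drawn from the surrounding pipeline:
# `files` and `prompt_context` are assumed to come from the analysis and
# prompt-parsing stages; the `path` attribute on FileAnalysis is assumed here
# purely for display.
ranked = ranker.rank_files(
    files=files,
    prompt_context=prompt_context,
    parallel=True,
    explain=False,
)

for fa in ranked[:5]:
    # Returned FileAnalysis objects carry relevance_score and relevance_rank.
    print(fa.relevance_rank, f"{fa.relevance_score:.3f}", fa.path)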
register_custom_ranker¶
register_custom_ranker(ranker_func: Callable[[List[RankedFile], PromptContext], List[RankedFile]])
Register a custom ranking function.
Custom rankers are applied after the main ranking strategy and can adjust scores based on project-specific logic.
PARAMETER | DESCRIPTION |
---|---|
ranker_func | Function that takes ranked files and returns modified list. TYPE: Callable[[List[RankedFile], PromptContext], List[RankedFile]] |
Example
>>> def boost_tests(ranked_files, prompt_context):
...     if 'test' in prompt_context.text:
...         for rf in ranked_files:
...             if 'test' in rf.path:
...                 rf.score *= 1.5
...     return ranked_files
>>> ranker.register_custom_ranker(boost_tests)
Source code in tenets/core/ranking/ranker.py
def register_custom_ranker(
self, ranker_func: Callable[[List[RankedFile], PromptContext], List[RankedFile]]
):
"""Register a custom ranking function.
Custom rankers are applied after the main ranking strategy and can
adjust scores based on project-specific logic.
Args:
ranker_func: Function that takes ranked files and returns modified list
Example:
>>> def boost_tests(ranked_files, prompt_context):
... if 'test' in prompt_context.text:
... for rf in ranked_files:
... if 'test' in rf.path:
... rf.score *= 1.5
... return ranked_files
>>> ranker.register_custom_ranker(boost_tests)
"""
self.custom_rankers.append(ranker_func)
# Keep alias updated
self._custom_rankers = self.custom_rankers
self.logger.info(f"Registered custom ranker: {ranker_func.__name__}")
get_ranking_explanation¶
Get detailed explanation of ranking results.
PARAMETER | DESCRIPTION |
---|---|
ranked_files | List of ranked files. TYPE: List[RankedFile] |
top_n | Number of top files to explain. TYPE: int, DEFAULT: 10 |
RETURNS | DESCRIPTION |
---|---|
str | Formatted explanation string |
Source code in tenets/core/ranking/ranker.py
def get_ranking_explanation(self, ranked_files: List[RankedFile], top_n: int = 10) -> str:
"""Get detailed explanation of ranking results.
Args:
ranked_files: List of ranked files
top_n: Number of top files to explain
Returns:
Formatted explanation string
"""
explainer = RankingExplainer()
strategy = self.strategies.get(self.algorithm)
weights = strategy.get_weights() if strategy else {}
return explainer.explain_ranking(ranked_files[:top_n], weights, top_n=top_n)
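A hedged call sketch; note that this method expects the internal RankedFile objects, not the filtered FileAnalysis list that rank_files returns, and internal_ranked below is hypothetical:
# Hypothetical: `internal_ranked` is a List[RankedFile] -- the ranker's
# internal scored objects, not the FileAnalysis list that rank_files() returns.
report = ranker.get_ranking_explanation(internal_ranked, top_n=10)
print(report)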
get_stats¶
Get latest ranking statistics.
RETURNS | DESCRIPTION |
---|---|
RankingStats | RankingStats object |
shutdown¶
Shutdown the ranker and clean up resources.
Functions¶
create_ranker¶
create_ranker(config: Optional[TenetsConfig] = None, algorithm: str = 'balanced', use_stopwords: bool = False) -> RelevanceRanker
Create a configured relevance ranker.
PARAMETER | DESCRIPTION |
---|---|
config | Configuration (uses default if None). TYPE: Optional[TenetsConfig], DEFAULT: None |
algorithm | Ranking algorithm to use. TYPE: str, DEFAULT: 'balanced' |
use_stopwords | Whether to filter stopwords. TYPE: bool, DEFAULT: False |
RETURNS | DESCRIPTION |
---|---|
RelevanceRanker | Configured RelevanceRanker instance |
Source code in tenets/core/ranking/ranker.py
def create_ranker(
config: Optional[TenetsConfig] = None, algorithm: str = "balanced", use_stopwords: bool = False
) -> RelevanceRanker:
"""Create a configured relevance ranker.
Args:
config: Configuration (uses default if None)
algorithm: Ranking algorithm to use
use_stopwords: Whether to filter stopwords
Returns:
Configured RelevanceRanker instance
"""
if config is None:
config = TenetsConfig()
return RelevanceRanker(config, algorithm=algorithm, use_stopwords=use_stopwords)
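A brief end-to-end sketch, assuming files and prompt_context already exist from earlier pipeline stages:
from tenets.core.ranking.ranker import create_ranker

# Uses a default TenetsConfig when none is supplied.
ranker = create_ranker(algorithm="balanced", use_stopwords=False)

results = ranker.rank_files(files, prompt_context)  # inputs assumed to exist
print(ranker.get_stats().to_dict())
ranker.shutdown()  # clean up resources when done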