llm

Full name: tenets.core.summarizer.llm

LLM-based summarization strategies.

This module integrates Large Language Models (LLMs) for high-quality summarization. It supports the OpenAI, Anthropic, and OpenRouter APIs.

NOTE: These strategies incur API costs. Use with caution and appropriate rate limiting. Always check pricing before using in production.

Classes

LLMProvider

Bases: Enum

Supported LLM providers.

LLMConfig dataclass

Python
LLMConfig(provider: LLMProvider = LLMProvider.OPENAI, model: str = 'gpt-4o-mini', api_key: Optional[str] = None, base_url: Optional[str] = None, temperature: float = 0.3, max_tokens: int = 500, system_prompt: str = 'You are an expert at summarizing code and technical documentation. \nYour summaries are concise, accurate, and preserve critical technical details.', user_prompt: str = 'Summarize the following text to approximately {target_percent}% of its original length. \nFocus on the most important information and maintain technical accuracy.\n\nText to summarize:\n{text}\n\nSummary:', retry_attempts: int = 3, retry_delay: float = 1.0, timeout: float = 30.0)

Configuration for LLM summarization.

ATTRIBUTE DESCRIPTION
provider

LLM provider to use

TYPE:LLMProvider

model

Model name/ID

TYPE:str

api_key

API key; when unset, the key is read from the provider's environment variable

TYPE:Optional[str]

base_url

Base URL for API (for custom endpoints)

TYPE:Optional[str]

temperature

Sampling temperature (0-1)

TYPE:float

max_tokens

Maximum tokens in response

TYPE:int

system_prompt

System prompt template

TYPE:str

user_prompt

User prompt template; formatted with the {text} and {target_percent} placeholders

TYPE:str

retry_attempts

Total number of API call attempts before giving up

TYPE:int

retry_delay

Base delay between retries in seconds; doubled after each failed attempt (exponential backoff)

TYPE:float

timeout

Request timeout in seconds

TYPE:float
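
The default `user_prompt` is a `str.format` template. The following is a minimal sketch of how it gets filled in, using the default template text from the signature above. The helper name `build_user_prompt` is hypothetical and not part of the module.

```python
# Default user prompt template, copied from the LLMConfig signature.
USER_PROMPT = (
    "Summarize the following text to approximately {target_percent}% of its original length. \n"
    "Focus on the most important information and maintain technical accuracy.\n\n"
    "Text to summarize:\n{text}\n\nSummary:"
)


def build_user_prompt(text: str, target_ratio: float = 0.3, template: str = USER_PROMPT) -> str:
    """Hypothetical helper: fill the template with the text and a whole-number percentage."""
    target_percent = int(target_ratio * 100)  # e.g. 0.25 -> 25
    return template.format(text=text, target_percent=target_percent)


prompt = build_user_prompt("def add(a, b): return a + b", target_ratio=0.25)
print(prompt)
```

Braces inside the text being summarized are safe, because `str.format` only interprets braces in the template itself. A custom template that needs literal braces has to escape them as `{{` and `}}`.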

Functions
get_api_key
Python
get_api_key() -> Optional[str]

Get API key from config or environment.

RETURNS DESCRIPTION
Optional[str]

API key or None

Source code in tenets/core/summarizer/llm.py
Python
def get_api_key(self) -> Optional[str]:
    """Get API key from config or environment.

    Returns:
        API key or None
    """
    if self.api_key:
        return self.api_key

    # Check environment variables
    env_vars = {
        LLMProvider.OPENAI: "OPENAI_API_KEY",
        LLMProvider.ANTHROPIC: "ANTHROPIC_API_KEY",
        LLMProvider.OPENROUTER: "OPENROUTER_API_KEY",
    }

    env_var = env_vars.get(self.provider)
    if env_var:
        return os.getenv(env_var)

    return None
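
The lookup order above (explicit key first, then the provider's environment variable) can be sketched without the library. In this sketch the provider is a plain string and the environment is passed in as a mapping so the behavior is easy to check. `resolve_api_key` and the lowercase provider names are assumptions for illustration, not part of the module.

```python
import os
from typing import Mapping, Optional

# Environment variable per provider, as listed in get_api_key above.
# The lowercase string keys are an assumption; the module keys on LLMProvider members.
ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def resolve_api_key(
    provider: str,
    explicit_key: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Hypothetical stand-in for the lookup order used by get_api_key."""
    if explicit_key:  # a non-empty configured key always wins
        return explicit_key
    env_var = ENV_VARS.get(provider)
    return environ.get(env_var) if env_var else None


fake_env = {"ANTHROPIC_API_KEY": "sk-from-env"}
print(resolve_api_key("anthropic", environ=fake_env))         # sk-from-env
print(resolve_api_key("anthropic", "sk-explicit", fake_env))  # sk-explicit
print(resolve_api_key("openai", environ=fake_env))            # None
```

Because the check is a truthiness test, an empty-string key is treated the same as no key and falls through to the environment.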

LLMSummarizer

Python
LLMSummarizer(config: Optional[LLMConfig] = None)

Base class for LLM-based summarization.

Provides common functionality for different LLM providers. Handles API calls, retries, and error handling.

Initialize LLM summarizer.

PARAMETER DESCRIPTION
config

LLM configuration; a default LLMConfig() is used when None

TYPE:Optional[LLMConfig] DEFAULT:None

Source code in tenets/core/summarizer/llm.py
Python
def __init__(self, config: Optional[LLMConfig] = None):
    """Initialize LLM summarizer.

    Args:
        config: LLM configuration
    """
    self.config = config or LLMConfig()
    self.logger = get_logger(__name__)
    self.client = None
    self._initialize_client()
Functions
summarize
Python
summarize(text: str, target_ratio: float = 0.3, max_length: Optional[int] = None, min_length: Optional[int] = None, custom_prompt: Optional[str] = None) -> str

Summarize text using LLM.

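PARAMETER DESCRIPTION
text

Text to summarize

TYPE:str

target_ratio

Target summary length as a fraction of the original; 0.3 asks for roughly 30%

TYPE:float DEFAULT:0.3

max_length

Maximum summary length in characters; longer responses are cut at a word boundary and suffixed with "..."

TYPE:Optional[int] DEFAULT:None

min_length

Minimum summary length in characters; shorter responses trigger another request asking for more detail

TYPE:Optional[int] DEFAULT:None

custom_prompt

Custom user prompt template that replaces the configured one; it may use the {text}, {target_percent}, {max_length}, and {min_length} placeholders

TYPE:Optional[str] DEFAULT:None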

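RETURNS DESCRIPTION
str

Summarized text. If every attempt returns a summary shorter than min_length, the original text is returned instead (cut to max_length when that is set)

RAISES DESCRIPTION
RuntimeError

If no client is initialized for the provider, or if the API call still fails after all retry attempts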
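
With the defaults (`retry_attempts=3`, `retry_delay=1.0`), a failing call waits 1 second, then 2 seconds, and the third failure raises `RuntimeError`. The sketch below computes that schedule from the `retry_delay * (2**attempt)` expression in this method's source. `backoff_delays` is a hypothetical helper, not part of the module.

```python
from typing import List


def backoff_delays(retry_attempts: int = 3, retry_delay: float = 1.0) -> List[float]:
    """Hypothetical helper: sleep times between attempts under exponential backoff."""
    # A sleep follows every failed attempt except the last one,
    # which raises RuntimeError instead of sleeping.
    return [retry_delay * (2**attempt) for attempt in range(retry_attempts - 1)]


print(backoff_delays())        # [1.0, 2.0]
print(backoff_delays(5, 0.5))  # [0.5, 1.0, 2.0, 4.0]
```

The sleeps only cover the time between attempts. Each attempt can additionally take up to the configured `timeout`, so the worst-case wall time is longer than the sum of these delays.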

Source code in tenets/core/summarizer/llm.py
Python
def summarize(
    self,
    text: str,
    target_ratio: float = 0.3,
    max_length: Optional[int] = None,
    min_length: Optional[int] = None,
    custom_prompt: Optional[str] = None,
) -> str:
    """Summarize text using LLM.

    Args:
        text: Text to summarize
        target_ratio: Target compression ratio
        max_length: Maximum summary length
        min_length: Minimum summary length
        custom_prompt: Custom prompt override

    Returns:
        Summarized text

    Raises:
        RuntimeError: If API call fails after retries
    """
    if not self.client:
        raise RuntimeError(f"No client initialized for {self.config.provider.value}")

    # Prepare prompt
    target_percent = int(target_ratio * 100)

    if custom_prompt:
        user_prompt = custom_prompt.format(
            text=text,
            target_percent=target_percent,
            max_length=max_length,
            min_length=min_length,
        )
    else:
        user_prompt = self.config.user_prompt.format(text=text, target_percent=target_percent)

    # Add length constraints to prompt if specified
    if max_length:
        user_prompt += f"\nMaximum length: {max_length} characters"
    if min_length:
        user_prompt += f"\nMinimum length: {min_length} characters"

    # Make API call with retries
    for attempt in range(self.config.retry_attempts):
        try:
            summary = self._call_api(user_prompt)

            # Validate length constraints
            if max_length and len(summary) > max_length:
                summary = summary[:max_length].rsplit(" ", 1)[0] + "..."
            elif min_length and len(summary) < min_length:
                # Request longer summary
                user_prompt += "\n\nThe summary is too short. Please provide more detail."
                continue

            return summary

        except Exception as e:
            self.logger.warning(
                f"API call failed (attempt {attempt + 1}/{self.config.retry_attempts}): {e}"
            )
            if attempt < self.config.retry_attempts - 1:
                time.sleep(self.config.retry_delay * (2**attempt))  # Exponential backoff
            else:
                raise RuntimeError(
                    f"Failed to summarize after {self.config.retry_attempts} attempts: {e}"
                ) from e

    return text[:max_length] if max_length else text  # Fallback
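
The `max_length` branch above cuts the response at the limit, drops the trailing partial word, and appends `"..."`. Here is a small sketch of that expression in isolation. `truncate_summary` is a hypothetical helper that mirrors the listing; it is not a function the module exposes.

```python
def truncate_summary(summary: str, max_length: int) -> str:
    """Hypothetical helper mirroring the max_length branch of summarize()."""
    if len(summary) <= max_length:
        return summary
    # Cut at max_length, drop the trailing partial word, then mark the cut.
    return summary[:max_length].rsplit(" ", 1)[0] + "..."


print(truncate_summary("Parses config files and validates schema", 30))
# Parses config files and...
print(truncate_summary("Supercalifragilistic", 5))
# Super...
```

The second call shows an edge case: when the cut contains no space, nothing is dropped before the ellipsis is added, so the result is three characters longer than `max_length`. Callers with a hard limit may want to leave some headroom.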
estimate_cost
Python
estimate_cost(text: str) -> Dict[str, float]

Estimate cost of summarization.

PARAMETER DESCRIPTION
text

Text to summarize

TYPE:str

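RETURNS DESCRIPTION
Dict[str, float]

Dictionary with estimated token counts (input_tokens, output_tokens), costs in USD (input_cost, output_cost, total_cost), and a currency entry. The token counts are integers and currency is the string "USD", despite the Dict[str, float] annotation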

Source code in tenets/core/summarizer/llm.py
Python
def estimate_cost(self, text: str) -> Dict[str, float]:
    """Estimate cost of summarization.

    Args:
        text: Text to summarize

    Returns:
        Dictionary with cost estimates
    """
    # Rough token estimation (1 token ≈ 4 characters)
    input_tokens = len(text) // 4
    output_tokens = int(input_tokens * 0.3)  # Assume 30% compression

    # Pricing per 1K tokens (as of 2024)
    pricing = {
        "gpt-4o-mini": {"input": 0.00015, "output": 0.0006},
        "gpt-4o": {"input": 0.005, "output": 0.015},
        "gpt-4": {"input": 0.03, "output": 0.06},
        "gpt-4-turbo": {"input": 0.01, "output": 0.03},
        "claude-3-opus": {"input": 0.015, "output": 0.075},
        "claude-3-sonnet": {"input": 0.003, "output": 0.015},
        "claude-3-haiku": {"input": 0.00025, "output": 0.00125},
    }

    model_pricing = pricing.get(self.config.model, {"input": 0.001, "output": 0.002})

    input_cost = (input_tokens / 1000) * model_pricing["input"]
    output_cost = (output_tokens / 1000) * model_pricing["output"]
    total_cost = input_cost + output_cost

    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total_cost": total_cost,
        "currency": "USD",
    }
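
As a worked example of the arithmetic above: 40,000 characters become about 10,000 input tokens and 3,000 output tokens, which at the listed `gpt-4o-mini` rates comes to roughly $0.0033. The sketch below restates the calculation as a stand-alone function. `sketch_estimate` is a hypothetical name, and the prices are passed in rather than looked up.

```python
from typing import Dict, Union


def sketch_estimate(
    text: str, input_price: float, output_price: float
) -> Dict[str, Union[int, float]]:
    """Hypothetical re-statement of the estimate; prices are USD per 1K tokens."""
    input_tokens = len(text) // 4            # rough rule: 1 token per 4 characters
    output_tokens = int(input_tokens * 0.3)  # fixed 30% output assumption
    input_cost = (input_tokens / 1000) * input_price
    output_cost = (output_tokens / 1000) * output_price
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_cost": input_cost + output_cost,
    }


# 40,000 characters at the gpt-4o-mini rates from the table above.
estimate = sketch_estimate("x" * 40_000, input_price=0.00015, output_price=0.0006)
print(estimate["input_tokens"], estimate["output_tokens"])  # 10000 3000
print(f"${estimate['total_cost']:.4f}")                      # $0.0033
```

The estimate is a lower bound in practice. It counts only the input text, not the system prompt or the prompt template, and it assumes 30% output regardless of the `target_ratio` passed to `summarize`. The hard-coded prices date from 2024, and models missing from the table fall back to 0.001 and 0.002 per 1K tokens.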

LLMSummaryStrategy

Python
LLMSummaryStrategy(provider: Union[str, LLMProvider] = LLMProvider.OPENAI, model: str = 'gpt-4o-mini', api_key: Optional[str] = None)

LLM-based summarization strategy for use with Summarizer.

Wraps LLMSummarizer to match the SummarizationStrategy interface.

WARNING: This strategy incurs API costs. Always estimate costs before use.

Initialize LLM strategy.

PARAMETER DESCRIPTION
provider

LLM provider name or enum

TYPE:Union[str, LLMProvider] DEFAULT:OPENAI

model

Model to use

TYPE:str DEFAULT:'gpt-4o-mini'

api_key

API key (if not in environment)

TYPE:Optional[str] DEFAULT:None

Source code in tenets/core/summarizer/llm.py
Python
def __init__(
    self,
    provider: Union[str, LLMProvider] = LLMProvider.OPENAI,
    model: str = "gpt-4o-mini",
    api_key: Optional[str] = None,
):
    """Initialize LLM strategy.

    Args:
        provider: LLM provider name or enum
        model: Model to use
        api_key: API key (if not in environment)
    """
    self.logger = get_logger(__name__)

    # Convert string to enum if needed
    if isinstance(provider, str):
        provider = LLMProvider(provider.lower())

    # Create config
    config = LLMConfig(provider=provider, model=model, api_key=api_key)

    # Initialize summarizer
    self.summarizer = LLMSummarizer(config)

    # Warn about costs
    self.logger.warning(
        f"LLM summarization enabled with {provider.value}/{model}. "
        f"This will incur API costs. Use estimate_cost() to check pricing."
    )
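
The string-to-enum step above lowercases the name and looks it up by value, so `"OpenAI"` and `"openai"` resolve to the same member while an unknown name raises `ValueError`. The sketch below shows this with a stand-in enum. The `Provider` class and its member values are assumptions; this page does not list the actual `LLMProvider` values.

```python
from enum import Enum
from typing import Union


class Provider(Enum):
    """Stand-in for LLMProvider; the member values are assumed, not confirmed."""

    OPENAI = "openai"
    ANTHROPIC = "anthropic"
    OPENROUTER = "openrouter"


def to_provider(provider: Union[str, Provider]) -> Provider:
    """Normalize a provider name: lowercase it, then look it up by value."""
    if isinstance(provider, str):
        return Provider(provider.lower())  # raises ValueError for unknown names
    return provider


print(to_provider("OpenAI"))            # Provider.OPENAI
print(to_provider(Provider.ANTHROPIC))  # Provider.ANTHROPIC
try:
    to_provider("mistral")
except ValueError as exc:
    print(exc)                          # 'mistral' is not a valid Provider
```

Since the lookup happens in the constructor, a misspelled provider fails immediately rather than at the first `summarize` call.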
Functions
summarize
Python
summarize(text: str, target_ratio: float = 0.3, max_length: Optional[int] = None, min_length: Optional[int] = None) -> str

Summarize text using LLM.

PARAMETER DESCRIPTION
text

Input text

TYPE:str

target_ratio

Target summary length as a fraction of the original; 0.3 asks for roughly 30%

TYPE:float DEFAULT:0.3

max_length

Maximum summary length in characters

TYPE:Optional[int] DEFAULT:None

min_length

Minimum summary length in characters

TYPE:Optional[int] DEFAULT:None

RETURNS DESCRIPTION
str

LLM-generated summary

Source code in tenets/core/summarizer/llm.py
Python
def summarize(
    self,
    text: str,
    target_ratio: float = 0.3,
    max_length: Optional[int] = None,
    min_length: Optional[int] = None,
) -> str:
    """Summarize text using LLM.

    Args:
        text: Input text
        target_ratio: Target compression ratio
        max_length: Maximum summary length
        min_length: Minimum summary length

    Returns:
        LLM-generated summary
    """
    return self.summarizer.summarize(
        text, target_ratio=target_ratio, max_length=max_length, min_length=min_length
    )
estimate_cost
Python
estimate_cost(text: str) -> Dict[str, float]

Estimate cost for summarizing text.

PARAMETER DESCRIPTION
text

Text to summarize

TYPE:str

RETURNS DESCRIPTION
Dict[str, float]

Cost estimate dictionary

Source code in tenets/core/summarizer/llm.py
Python
def estimate_cost(self, text: str) -> Dict[str, float]:
    """Estimate cost for summarizing text.

    Args:
        text: Text to summarize

    Returns:
        Cost estimate dictionary
    """
    return self.summarizer.estimate_cost(text)

Functions

create_llm_summarizer

Python
create_llm_summarizer(provider: str = 'openai', model: Optional[str] = None, api_key: Optional[str] = None) -> LLMSummaryStrategy

Create an LLM summarizer with defaults.

PARAMETER DESCRIPTION
provider

Provider name (openai, anthropic, openrouter)

TYPE:str DEFAULT:'openai'

model

Model name (uses provider default if None)

TYPE:Optional[str] DEFAULT:None

api_key

API key (uses environment if None)

TYPE:Optional[str] DEFAULT:None

RETURNS DESCRIPTION
LLMSummaryStrategy

Configured LLMSummaryStrategy

Example:

>>> summarizer = create_llm_summarizer("openai", "gpt-4o-mini")
>>> summary = summarizer.summarize(long_text, target_ratio=0.2)

Source code in tenets/core/summarizer/llm.py
Python
def create_llm_summarizer(
    provider: str = "openai", model: Optional[str] = None, api_key: Optional[str] = None
) -> LLMSummaryStrategy:
    """Create an LLM summarizer with defaults.

    Args:
        provider: Provider name (openai, anthropic, openrouter)
        model: Model name (uses provider default if None)
        api_key: API key (uses environment if None)

    Returns:
        Configured LLMSummaryStrategy

    Example:
        >>> summarizer = create_llm_summarizer("openai", "gpt-4o-mini")
        >>> summary = summarizer.summarize(long_text, target_ratio=0.2)
    """
    # Default models for each provider
    default_models = {
        "openai": "gpt-4o-mini",
        "anthropic": "claude-3-haiku-20240307",
        "openrouter": "openai/gpt-4o-mini",
        "local": "llama2",
    }

    if model is None:
        model = default_models.get(provider.lower(), "gpt-4o-mini")

    return LLMSummaryStrategy(provider=provider, model=model, api_key=api_key)
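
The model fallback above is a plain dictionary lookup. A minimal sketch of which model ends up being used for a given call follows; `resolve_model` is a hypothetical helper that reuses the defaults from the listing.

```python
from typing import Optional

# Per-provider defaults, copied from create_llm_summarizer above.
DEFAULT_MODELS = {
    "openai": "gpt-4o-mini",
    "anthropic": "claude-3-haiku-20240307",
    "openrouter": "openai/gpt-4o-mini",
    "local": "llama2",
}


def resolve_model(provider: str, model: Optional[str] = None) -> str:
    """Hypothetical helper: the model name that would be passed to LLMSummaryStrategy."""
    if model is None:
        # Unknown providers also fall back to gpt-4o-mini.
        return DEFAULT_MODELS.get(provider.lower(), "gpt-4o-mini")
    return model


print(resolve_model("Anthropic"))         # claude-3-haiku-20240307
print(resolve_model("openai", "gpt-4o"))  # gpt-4o
print(resolve_model("something-else"))    # gpt-4o-mini
```

Judging by the listings on this page, the Anthropic and OpenRouter defaults (`claude-3-haiku-20240307`, `openai/gpt-4o-mini`) do not match any key in the `estimate_cost` pricing table, so cost estimates for them use the generic fallback rates rather than model-specific ones.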