
Why Context Is Everything in AI Coding

Author: Johnny Dunn | Date: September 15, 2024


The Context Problem

When you ask an LLM to write code, it generates tokens based on probability distributions conditioned on its input. The model has no access to your filesystem, your git history, or your architectural decisions—it only sees what's in the context window.

This creates two problems:

  1. Knowledge gaps: The model fills missing information with generic training data patterns
  2. Context drift: In long conversations, the model forgets earlier instructions and coding standards

Tenets addresses both: intelligent code context (finding relevant files) and automatic tenets injection (your guiding principles in every prompt).

Python
# You ask: "Add a user authentication endpoint"

# Without context, the LLM might generate:
from flask import Flask  # You use FastAPI
import bcrypt  # You use argon2
from sqlalchemy import create_engine  # You use async SQLAlchemy

# With proper context, it generates:
from fastapi import APIRouter, Depends  # Matches your stack
from app.auth.argon2 import hash_password  # Uses your utility
from app.db.session import get_async_session  # Your DB pattern

The difference isn't AI capability—it's input quality.


The Math: Why Random File Selection Fails

Consider a 50,000-file codebase with a 128k token context window. Naive approaches fail mathematically:

Random Selection

If you randomly select files until hitting the token budget:

  • P(relevant file) ≈ 50/50,000 = 0.1%
  • Expected relevant files in 100 selected: ~0.1 files
  • Result: Context filled with irrelevant code
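
A quick back-of-the-envelope check (plain probability, no Tenets code) makes the failure concrete: a random 100-file sample contains, on average, a tenth of a relevant file, and roughly 90% of the time it contains none at all.

Python
from math import comb

total_files, relevant, selected = 50_000, 50, 100

# Expected number of relevant files in a random 100-file sample
expected = selected * relevant / total_files          # 0.1

# Probability the sample contains no relevant file at all (hypergeometric)
p_none = comb(total_files - relevant, selected) / comb(total_files, selected)

print(f"Expected relevant files: {expected:.2f}")     # ~0.10
print(f"P(zero relevant files):  {p_none:.1%}")       # ~90.5%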

Keyword Matching Only

Simple substring matching has precision problems:

Python
# Query: "authentication"
# Matches:
"authentication.py"           # ✓ Relevant
"test_authentication.py"       # Maybe relevant
"old_authentication_backup.py" # ✗ Deprecated
"docs/authentication.md"       # ✗ Wrong type
"// TODO: add authentication"  # ✗ Comment noise

False positives dilute context quality.

Why Multi-Factor Ranking Works

Tenets uses BM25 + structural analysis + git signals to compute relevance:

Text Only
score(file) = 
    0.25 × BM25(query, file_content) +      # Statistical text relevance
    0.20 × keyword_match(query, file) +     # Direct term matching
    0.15 × path_relevance(query, file) +    # Directory structure signals
    0.10 × tfidf_similarity(query, file) +  # Term frequency analysis
    0.10 × import_centrality(file) +        # Dependency importance
    0.10 × git_signals(file) +              # Recency + frequency
    0.05 × complexity_relevance(file) +     # Code complexity
    0.05 × type_relevance(query, file)      # File type matching

This multi-factor approach:

  1. Prevents repetition bias: BM25 penalizes files that repeat terms without adding information
  2. Captures structure: Import centrality finds files that are "hubs" in your dependency graph
  3. Prioritizes freshness: Git signals weight recently-modified, frequently-changed files
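
To make the weighting concrete, here is a minimal sketch of the combination step. The factor functions are hypothetical placeholders assumed to return normalized scores in [0, 1]; this illustrates the weighted-sum idea, not Tenets' actual ranking code.

Python
# Hypothetical sketch of the weighted-sum combination; the factor functions
# are placeholders, not Tenets' real implementation.
WEIGHTS = {
    "bm25": 0.25, "keyword": 0.20, "path": 0.15, "tfidf": 0.10,
    "centrality": 0.10, "git": 0.10, "complexity": 0.05, "file_type": 0.05,
}

def rank_files(query, files, factors):
    """factors maps each weight name to a callable(query, file) -> float in [0, 1]."""
    scored = [
        (sum(WEIGHTS[name] * fn(query, f) for name, fn in factors.items()), f)
        for f in files
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

Files scoring below the configured relevance threshold (see the ranking section of .tenets.yml later in this post) can then be dropped before the context is assembled.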

Token Budgets: The Constraint That Shapes Everything

LLMs have hard context limits. Even GPT-4's 128k-token window fills up fast:

Content Type        Tokens/File (avg)   Files in 100k
Python modules      800-1500            65-125
TypeScript files    600-1200            80-165
Config files        100-300             300+
Test files          1000-2000           50-100

The problem: You can't include everything. You must rank and select.
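
As a simplified sketch of what "rank and select" means in practice (not the actual distill pipeline), assume a ranked file list and a crude characters-per-token estimate, then greedily pack files until the budget, minus a reserve for the model's response, is spent:

Python
# Simplified greedy budget packing, for illustration only.
# Token counts use a rough chars/4 heuristic instead of a real tokenizer.
def estimate_tokens(text: str) -> int:
    return len(text) // 4

def select_within_budget(ranked_files, max_tokens=100_000, reserve_tokens=10_000):
    """ranked_files: (score, path, content) tuples, sorted by score descending."""
    budget = max_tokens - reserve_tokens   # leave room for the model's response
    selected, used = [], 0
    for score, path, content in ranked_files:
        cost = estimate_tokens(content)
        if used + cost > budget:
            continue   # doesn't fit whole; a smarter pipeline summarizes it instead
        selected.append(path)
        used += cost
    return selected, used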

Intelligent Truncation

When a file is relevant but too large, naive truncation loses signal:

Python
# Bad: First N tokens (loses the important parts)
def helper_one():
    pass

def helper_two():
    pass

# ... truncated at 500 tokens ...
# MISSED: The actual authenticate() function at line 400

Tenets preserves structure through intelligent summarization:

Python
# Good: Signature + docstring + key blocks
def authenticate(username: str, password: str) -> AuthResult:
    """
    Authenticate user credentials against database.
    Returns AuthResult with user data or error details.
    """
    # Implementation: 45 lines - validates credentials
    # Uses: app.auth.argon2.verify_password
    # Raises: AuthenticationError, RateLimitError
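
A minimal sketch of this kind of structure-preserving summarization, using Python's standard ast module (an illustration of the idea, not Tenets' summarizer; it handles top-level functions only):

Python
import ast

def summarize_module(source: str) -> str:
    """Keep each top-level function's signature and docstring; elide the body."""
    out = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            out.append(f"def {node.name}({args}):")
            doc = ast.get_docstring(node)
            if doc:
                out.append(f'    """{doc}"""')
            span = node.end_lineno - node.lineno + 1
            out.append(f"    # Implementation elided (~{span} lines total)")
    return "\n".join(out)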

Real Example: Debugging Authentication

Let's trace how context quality affects a real task.

Task: "Fix the bug where users can't reset passwords"

Without Intelligent Context

The LLM sees random files or keyword matches:

Text Only
Context (keyword "password"):
- docs/security-policy.md (mentions "password" 20x)
- scripts/generate_test_passwords.py
- migrations/0001_add_password_hash.sql
- tests/test_password_validation.py

Result: The LLM hallucinates a solution based on generic patterns.

With Tenets Context

Bash
tenets distill "fix password reset bug" --mode balanced

Tenets analyzes:

  1. Query understanding: Extracts keywords password, reset, bug
  2. BM25 ranking: Scores files by statistical relevance
  3. Import graph: Finds files that import/export password utilities
  4. Git signals: Prioritizes recently-modified auth files

Text Only
Context (ranked by relevance):
1. app/auth/password_reset.py      (0.89) - Reset logic
2. app/auth/token_manager.py       (0.76) - Token generation
3. app/models/user.py              (0.71) - User model
4. app/email/templates/reset.html  (0.65) - Email template
5. tests/auth/test_reset.py        (0.61) - Existing tests

The LLM now sees exactly the code it needs.


The Import Centrality Signal

One of Tenets' most powerful signals is import centrality—measuring how "central" a file is in your dependency graph.

graph TD
    A[app/main.py] --> B[app/auth/router.py]
    A --> C[app/users/router.py]
    B --> D[app/auth/service.py]
    B --> E[app/auth/models.py]
    C --> E
    D --> E
    D --> F[app/auth/utils.py]

Files imported by many others (like app/auth/models.py) are architectural keystones. Including them gives the LLM understanding of shared data structures.

Centrality calculation:

Python
def import_centrality(file: Path, graph: ImportGraph) -> float:
    """Degree-based centrality over the import graph (a cheap PageRank-style proxy)."""
    in_degree = len(graph.importers_of(file))   # files that import this one
    out_degree = len(graph.imports_of(file))    # files this one imports
    total_files = len(graph.all_files())
    if total_files == 0:
        return 0.0

    # Being imported counts for more than importing, hence the 0.5 weight.
    return (in_degree + 0.5 * out_degree) / total_files
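
Applied to the toy dependency graph above, with a plain dictionary standing in for the ImportGraph object (illustrative only), app/auth/models.py comes out on top:

Python
# Toy graph from the diagram above: importer -> modules it imports.
imports = {
    "app/main.py":         ["app/auth/router.py", "app/users/router.py"],
    "app/auth/router.py":  ["app/auth/service.py", "app/auth/models.py"],
    "app/users/router.py": ["app/auth/models.py"],
    "app/auth/service.py": ["app/auth/models.py", "app/auth/utils.py"],
    "app/auth/models.py":  [],
    "app/auth/utils.py":   [],
}

def centrality(file: str) -> float:
    in_degree = sum(file in deps for deps in imports.values())
    out_degree = len(imports[file])
    return (in_degree + 0.5 * out_degree) / len(imports)

for f in sorted(imports, key=centrality, reverse=True):
    print(f"{centrality(f):.2f}  {f}")   # app/auth/models.py scores 0.50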

Session State: Context Across Interactions

Coding isn't one-shot—it's iterative. Tenets maintains session state:

Bash
# Create a session for your feature work
tenets session create auth-refactor

# Pin files you'll reference repeatedly
tenets session pin src/auth/ --session auth-refactor

# Add guiding principles
tenets tenet add "Use argon2 for all password hashing" --session auth-refactor
tenets tenet add "All endpoints require rate limiting" --priority high

# Every distill now includes pinned files + tenets
tenets distill "add MFA support" --session auth-refactor

Your context compounds across the session.


Practical Configuration

Mode Selection

Task                  Mode       Why
Quick question        fast       Keyword + path only, <1s
Feature development   balanced   Full NLP pipeline, ~3s
Major refactoring     thorough   Deep analysis + ML, ~10s

Token Budget Tuning

YAML
# .tenets.yml
context:
  max_tokens: 100000  # Default: fits most models
  reserve_tokens: 10000  # Leave room for response

ranking:
  algorithm: balanced
  threshold: 0.1  # Minimum relevance score
  use_git: true  # Include git signals

Key Takeaways

  1. Context quality determines output quality—not model capability alone
  2. Multi-factor ranking outperforms keyword matching for code relevance
  3. Token budgets require intelligent selection, not random sampling
  4. Import centrality identifies architectural keystones
  5. Automatic tenets injection prevents context drift in long conversations
  6. Session state compounds context across interactions

Ready to try intelligent context?

Bash
pip install tenets[mcp]
tenets distill "your task" --copy

See the Architecture Documentation for the full technical breakdown.