stats
Full name: tenets.core.git.stats
Git statistics module.
This module computes statistics over a git repository's history: commit patterns, contributor metrics, file statistics, and repository growth.
The results are aggregated into a RepositoryStats object that also carries a health score, identified risk factors, and identified strengths, intended to inform project management and technical decisions.
Classes
CommitStats (dataclass)
CommitStats(total_commits: int = 0, commits_per_day: float = 0.0, commits_per_week: float = 0.0, commits_per_month: float = 0.0, commit_size_avg: float = 0.0, commit_size_median: float = 0.0, commit_size_std: float = 0.0, largest_commit: Dict[str, Any] = dict(), smallest_commit: Dict[str, Any] = dict(), merge_commits: int = 0, revert_commits: int = 0, fix_commits: int = 0, feature_commits: int = 0, hourly_distribution: List[int] = (lambda: [0] * 24)(), daily_distribution: List[int] = (lambda: [0] * 7)(), monthly_distribution: List[int] = (lambda: [0] * 12)())
Statistics for commits.
Provides detailed statistical analysis of commit patterns including frequency, size, timing, and distribution metrics.
ATTRIBUTE | TYPE | DESCRIPTION |
---|---|---|
total_commits | int | Total number of commits |
commits_per_day | float | Average commits per day |
commits_per_week | float | Average commits per week |
commits_per_month | float | Average commits per month |
commit_size_avg | float | Average commit size (lines changed) |
commit_size_median | float | Median commit size |
commit_size_std | float | Standard deviation of commit size |
largest_commit | Dict[str, Any] | Largest single commit |
smallest_commit | Dict[str, Any] | Smallest single commit |
merge_commits | int | Number of merge commits |
revert_commits | int | Number of revert commits |
fix_commits | int | Number of fix commits |
feature_commits | int | Number of feature commits |
hourly_distribution | List[int] | Commits by hour of day (24 bins) |
daily_distribution | List[int] | Commits by day of week (7 bins) |
monthly_distribution | List[int] | Commits by month (12 bins) |
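The three distribution fields are fixed-size histograms. The following is a minimal sketch of how such bins could be filled from commit timestamps; it is not the module's own implementation. The function name `bin_commit_times`, the sample timestamps, and the Monday-first weekday ordering are assumptions made for illustration.

```python
from datetime import datetime
from typing import Iterable, List, Tuple


def bin_commit_times(
    timestamps: Iterable[datetime],
) -> Tuple[List[int], List[int], List[int]]:
    """Count commits per hour (24 bins), weekday (7 bins), and month (12 bins)."""
    hourly = [0] * 24
    daily = [0] * 7     # assumption: index 0 is Monday, as in datetime.weekday()
    monthly = [0] * 12  # index 0 is January
    for ts in timestamps:
        hourly[ts.hour] += 1
        daily[ts.weekday()] += 1
        monthly[ts.month - 1] += 1
    return hourly, daily, monthly


# Hypothetical commit timestamps, for illustration only.
sample = [
    datetime(2024, 1, 1, 9, 30),  # a Monday
    datetime(2024, 1, 1, 14, 5),  # the same Monday
    datetime(2024, 2, 7, 9, 0),   # a Wednesday
]
hourly, daily, monthly = bin_commit_times(sample)

# One plausible way to derive a peak hour from the histogram.
peak_hour = hourly.index(max(hourly))
```

Keeping the bins as plain lists of fixed length means an empty history still yields well-formed, all-zero distributions.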
Attributes
merge_ratio (property)
Calculate merge commit ratio.
RETURNS | DESCRIPTION |
---|---|
float | Ratio of merge commits to total commits |
fix_ratio (property)
Calculate fix commit ratio.
RETURNS | DESCRIPTION |
---|---|
float | Ratio of fix commits to total commits |
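As a rough illustration of the two ratio properties, the sketch below uses a hypothetical stand-in class, `CommitCounts`, rather than the real `CommitStats`. Returning 0.0 when there are no commits is an assumption; the source shown here does not state how the empty case is handled.

```python
from dataclasses import dataclass


@dataclass
class CommitCounts:
    """Hypothetical stand-in holding only the counters the ratios need."""

    total_commits: int = 0
    merge_commits: int = 0
    fix_commits: int = 0

    @property
    def merge_ratio(self) -> float:
        # Assumption: return 0.0 instead of raising when there are no commits.
        if self.total_commits == 0:
            return 0.0
        return self.merge_commits / self.total_commits

    @property
    def fix_ratio(self) -> float:
        # Same zero guard as merge_ratio.
        if self.total_commits == 0:
            return 0.0
        return self.fix_commits / self.total_commits


counts = CommitCounts(total_commits=200, merge_commits=30, fix_commits=50)

# to_dict reports these ratios as percentages rounded to one decimal place.
merge_pct = round(counts.merge_ratio * 100, 1)
fix_pct = round(counts.fix_ratio * 100, 1)
```

The ratios are fractions between 0 and 1; the multiplication by 100 happens only at reporting time.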
ContributorStats (dataclass)
ContributorStats(total_contributors: int = 0, active_contributors: int = 0, new_contributors: int = 0, contributor_commits: Dict[str, int] = dict(), contributor_lines: Dict[str, int] = dict(), contributor_files: Dict[str, Set[str]] = dict(), top_contributors: List[Tuple[str, int]] = list(), contribution_inequality: float = 0.0, collaboration_graph: Dict[Tuple[str, str], int] = dict(), timezone_distribution: Dict[str, int] = dict(), retention_rate: float = 0.0, churn_rate: float = 0.0)
Statistics for contributors.
Provides analysis of contributor patterns, productivity metrics, and team dynamics based on git history.
ATTRIBUTE | TYPE | DESCRIPTION |
---|---|---|
total_contributors | int | Total unique contributors |
active_contributors | int | Contributors active in the last 30 days |
new_contributors | int | New contributors in the analyzed period |
contributor_commits | Dict[str, int] | Commits per contributor |
contributor_lines | Dict[str, int] | Lines changed per contributor |
contributor_files | Dict[str, Set[str]] | Files touched per contributor |
top_contributors | List[Tuple[str, int]] | Most active contributors |
contribution_inequality | float | Gini coefficient of contributions |
collaboration_graph | Dict[Tuple[str, str], int] | Who works with whom, keyed by contributor pair |
timezone_distribution | Dict[str, int] | Contributors by timezone |
retention_rate | float | Contributor retention rate |
churn_rate | float | Contributor churn rate |
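`contribution_inequality` is described as a Gini coefficient. The sketch below shows one standard way to compute a Gini coefficient from per-contributor commit counts, using the rank-weighted formula over sorted values. It is an illustration only: the function name `gini` and the sample data are hypothetical, and the module may compute the value differently (for example from lines changed instead of commits).

```python
from typing import Dict


def gini(commit_counts: Dict[str, int]) -> float:
    """Gini coefficient of non-negative counts (0.0 means perfectly equal)."""
    values = sorted(commit_counts.values())
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form with values sorted ascending and ranks starting at 1:
    #   G = 2 * sum(rank * value) / (n * total) - (n + 1) / n
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical commit counts, for illustration only.
even_team = {"alice": 10, "bob": 10, "carol": 10, "dave": 10}
solo_team = {"alice": 40, "bob": 0, "carol": 0, "dave": 0}
mixed_team = {"alice": 70, "bob": 20, "carol": 10}

even_score = gini(even_team)
solo_score = gini(solo_team)
mixed_score = gini(mixed_team)
```

With this formula the maximum for n contributors is (n - 1) / n, so a repository where one of four people made every commit scores 0.75 rather than 1.0.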
FileStats (dataclass)
FileStats(total_files: int = 0, active_files: int = 0, new_files: int = 0, deleted_files: int = 0, file_changes: Dict[str, int] = dict(), file_sizes: Dict[str, int] = dict(), largest_files: List[Tuple[str, int]] = list(), most_changed: List[Tuple[str, int]] = list(), file_age: Dict[str, int] = dict(), file_churn: Dict[str, float] = dict(), hot_files: List[str] = list(), stable_files: List[str] = list(), file_types: Dict[str, int] = dict())
Statistics for files.
Provides analysis of file-level metrics including change frequency, size distribution, and file lifecycle patterns.
ATTRIBUTE | TYPE | DESCRIPTION |
---|---|---|
total_files | int | Total files in repository |
active_files | int | Files changed in period |
new_files | int | Files added in period |
deleted_files | int | Files deleted in period |
file_changes | Dict[str, int] | Number of changes per file |
file_sizes | Dict[str, int] | Size distribution of files |
largest_files | List[Tuple[str, int]] | Largest files by line count |
most_changed | List[Tuple[str, int]] | Most frequently changed files |
file_age | Dict[str, int] | Age distribution of files |
file_churn | Dict[str, float] | Churn rate per file |
hot_files | List[str] | Files with high activity |
stable_files | List[str] | Files with low activity |
file_types | Dict[str, int] | Distribution by file type |
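To make the relationship between `file_changes`, `most_changed`, `hot_files`, and `stable_files` concrete, here is a small sketch that derives the latter three from a change-count mapping. The helper names and the thresholds (10 or more changes is "hot", 2 or fewer is "stable") are hypothetical; the module's actual cutoffs are not documented here.

```python
from collections import Counter
from typing import Dict, List, Tuple


def rank_changed_files(
    file_changes: Dict[str, int], top_n: int = 3
) -> List[Tuple[str, int]]:
    """Return the top_n (path, change_count) pairs, most changed first."""
    return Counter(file_changes).most_common(top_n)


def split_hot_and_stable(
    file_changes: Dict[str, int],
    hot_threshold: int = 10,    # hypothetical cutoff
    stable_threshold: int = 2,  # hypothetical cutoff
) -> Tuple[List[str], List[str]]:
    """Partition paths into high-activity and low-activity lists."""
    hot = sorted(p for p, n in file_changes.items() if n >= hot_threshold)
    stable = sorted(p for p, n in file_changes.items() if n <= stable_threshold)
    return hot, stable


# Hypothetical change counts, for illustration only.
file_changes = {
    "src/app.py": 42,
    "README.md": 2,
    "src/utils.py": 11,
    "setup.cfg": 1,
    "docs/index.md": 5,
}
most_changed = rank_changed_files(file_changes)
hot_files, stable_files = split_hot_and_stable(file_changes)
```

Files that fall between the two thresholds, such as `docs/index.md` here, land in neither list.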
RepositoryStats (dataclass)
RepositoryStats(repo_age_days: int = 0, total_commits: int = 0, total_contributors: int = 0, total_files: int = 0, total_lines: int = 0, languages: Dict[str, int] = dict(), commit_stats: CommitStats = CommitStats(), contributor_stats: ContributorStats = ContributorStats(), file_stats: FileStats = FileStats(), growth_rate: float = 0.0, activity_trend: str = 'stable', health_score: float = 0.0, risk_factors: List[str] = list(), strengths: List[str] = list())
Overall repository statistics.
Aggregates various statistical analyses to provide comprehensive insights into repository health and development patterns.
ATTRIBUTE | TYPE | DESCRIPTION |
---|---|---|
repo_age_days | int | Age of repository in days |
total_commits | int | Total commits |
total_contributors | int | Total contributors |
total_files | int | Total files |
total_lines | int | Total lines of code |
languages | Dict[str, int] | Programming languages used |
commit_stats | CommitStats | Commit statistics |
contributor_stats | ContributorStats | Contributor statistics |
file_stats | FileStats | File statistics |
growth_rate | float | Repository growth rate |
activity_trend | str | Recent activity trend (defaults to 'stable') |
health_score | float | Overall health score |
risk_factors | List[str] | Identified risk factors |
strengths | List[str] | Identified strengths |
Functions
to_dict
Convert to dictionary representation.
RETURNS | DESCRIPTION |
---|---|
Dict[str, Any] | Dictionary with the keys overview, languages, commit_metrics, contributor_metrics, file_metrics, trends, risk_factors, and strengths |
Source code in tenets/core/git/stats.py
    def to_dict(self) -> Dict[str, Any]:
        """Convert to dictionary representation.

        Returns:
            Dict[str, Any]: Dictionary representation
        """
        return {
            "overview": {
                "repo_age_days": self.repo_age_days,
                "total_commits": self.total_commits,
                "total_contributors": self.total_contributors,
                "total_files": self.total_files,
                "total_lines": self.total_lines,
                "health_score": round(self.health_score, 1),
            },
            "languages": dict(
                sorted(self.languages.items(), key=lambda x: x[1], reverse=True)[:10]
            ),
            "commit_metrics": {
                "total": self.commit_stats.total_commits,
                "per_day": round(self.commit_stats.commits_per_day, 2),
                "merge_ratio": round(self.commit_stats.merge_ratio * 100, 1),
                "fix_ratio": round(self.commit_stats.fix_ratio * 100, 1),
                "peak_hour": self.commit_stats.peak_hour,
                "peak_day": self.commit_stats.peak_day,
            },
            "contributor_metrics": {
                "total": self.contributor_stats.total_contributors,
                "active": self.contributor_stats.active_contributors,
                "bus_factor": self.contributor_stats.bus_factor,
                "collaboration_score": round(self.contributor_stats.collaboration_score, 1),
                "top_contributors": self.contributor_stats.top_contributors[:5],
            },
            "file_metrics": {
                "total": self.file_stats.total_files,
                "active": self.file_stats.active_files,
                "stability": round(self.file_stats.file_stability, 1),
                "churn_rate": round(self.file_stats.churn_rate, 2),
                "hot_files": len(self.file_stats.hot_files),
            },
            "trends": {
                "growth_rate": round(self.growth_rate, 2),
                "activity_trend": self.activity_trend,
            },
            "risk_factors": self.risk_factors,
            "strengths": self.strengths,
        }
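The `languages` entry in `to_dict` keeps only the ten largest values, ordered from largest to smallest. The sketch below isolates that idiom in a hypothetical helper, `top_languages`, and shows that the result serializes cleanly with the standard `json` module. The language counts are made-up sample data.

```python
import json
from typing import Dict


def top_languages(languages: Dict[str, int], limit: int = 10) -> Dict[str, int]:
    """Keep the `limit` largest entries, ordered from largest to smallest."""
    ranked = sorted(languages.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:limit])


# Hypothetical per-language counts, for illustration only.
languages = {"Python": 120, "Markdown": 15, "YAML": 40, "Shell": 3}

summary = {"languages": top_languages(languages, limit=3)}
encoded = json.dumps(summary)
```

Because Python dictionaries preserve insertion order, the sorted order survives the conversion back to a `dict` and the JSON round trip.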
GitStatsAnalyzer
Analyzer for git repository statistics.
Provides comprehensive statistical analysis of git repositories to understand development patterns, team dynamics, and code health.
ATTRIBUTE | DESCRIPTION |
---|---|
config | Configuration object |
logger | Logger instance |
git_analyzer | Git analyzer instance |
Initialize statistics analyzer.
PARAMETER | DESCRIPTION |
---|---|
config | TenetsConfig instance |
Functions
analyze
analyze(repo_path: Path, since: Optional[str] = None, until: Optional[str] = None, branch: Optional[str] = None, include_files: bool = True, include_languages: bool = True, max_commits: int = 10000) -> RepositoryStats
Analyze repository statistics.
Performs comprehensive statistical analysis of a git repository to provide insights into development patterns and health.
PARAMETER | TYPE | DESCRIPTION |
---|---|---|
repo_path | Path | Path to git repository |
since | Optional[str] | Start date or relative time |
until | Optional[str] | End date or relative time |
branch | Optional[str] | Specific branch to analyze |
include_files | bool | Whether to include file statistics |
include_languages | bool | Whether to analyze languages |
max_commits | int | Maximum commits to analyze |
RETURNS | DESCRIPTION |
---|---|
RepositoryStats | Comprehensive statistics |
Example
    analyzer = GitStatsAnalyzer(config)
    stats = analyzer.analyze(Path("."))
    print(f"Health score: {stats.health_score}")
Source code in tenets/core/git/stats.py
    def analyze(
        self,
        repo_path: Path,
        since: Optional[str] = None,
        until: Optional[str] = None,
        branch: Optional[str] = None,
        include_files: bool = True,
        include_languages: bool = True,
        max_commits: int = 10000,
    ) -> RepositoryStats:
        """Analyze repository statistics.

        Performs comprehensive statistical analysis of a git repository
        to provide insights into development patterns and health.

        Args:
            repo_path: Path to git repository
            since: Start date or relative time
            until: End date or relative time
            branch: Specific branch to analyze
            include_files: Whether to include file statistics
            include_languages: Whether to analyze languages
            max_commits: Maximum commits to analyze

        Returns:
            RepositoryStats: Comprehensive statistics

        Example:
            >>> analyzer = GitStatsAnalyzer(config)
            >>> stats = analyzer.analyze(Path("."))
            >>> print(f"Health score: {stats.health_score}")
        """
        self.logger.debug(f"Analyzing statistics for {repo_path}")
        # Initialize git analyzer
        self.git_analyzer = GitAnalyzer(repo_path)
        if not self.git_analyzer.is_repo():
            self.logger.warning(f"Not a git repository: {repo_path}")
            return RepositoryStats()
        # Initialize stats
        stats = RepositoryStats()
        # Get time period
        start_date, end_date = self._parse_time_period(since, until)
        # Get commits
        commits = self._get_commits(start_date, end_date, branch, max_commits)
        if not commits:
            self.logger.info("No commits found in specified period")
            return stats
        # Calculate basic metrics
        stats.total_commits = len(commits)
        stats.repo_age_days = (end_date - start_date).days
        # Analyze commits
        stats.commit_stats = self._analyze_commits(commits, start_date, end_date)
        # Analyze contributors
        stats.contributor_stats = self._analyze_contributors(commits, end_date)
        stats.total_contributors = stats.contributor_stats.total_contributors
        # Analyze files if requested
        if include_files:
            stats.file_stats = self._analyze_files(commits, repo_path)
            stats.total_files = stats.file_stats.total_files
            # Get total lines
            stats.total_lines = sum(stats.file_stats.file_sizes.values())
        # Analyze languages if requested
        if include_languages:
            stats.languages = self._analyze_languages(repo_path)
        # Calculate trends
        stats.growth_rate = self._calculate_growth_rate(commits)
        stats.activity_trend = self._determine_activity_trend(commits)
        # Calculate health score
        stats.health_score = self._calculate_health_score(stats)
        # Identify risks and strengths
        stats.risk_factors = self._identify_risks(stats)
        stats.strengths = self._identify_strengths(stats)
        self.logger.debug(
            f"Statistics analysis complete: {stats.total_commits} commits, "
            f"{stats.total_contributors} contributors"
        )
        return stats
Functions
analyze_git_stats
analyze_git_stats(repo_path: Path, config: Optional[TenetsConfig] = None, **kwargs: Any) -> RepositoryStats
Convenience function to analyze git statistics.
PARAMETER | TYPE | DESCRIPTION |
---|---|---|
repo_path | Path | Path to repository |
config | Optional[TenetsConfig] | Optional configuration; a default TenetsConfig() is created when None |
**kwargs | Any | Additional keyword arguments, forwarded to GitStatsAnalyzer.analyze |
RETURNS | DESCRIPTION |
---|---|
RepositoryStats | Repository statistics |
Example
    from pathlib import Path
    from tenets.core.git.stats import analyze_git_stats
    stats = analyze_git_stats(Path("."))
    print(f"Health score: {stats.health_score}")
Source code in tenets/core/git/stats.py
    def analyze_git_stats(
        repo_path: Path, config: Optional[TenetsConfig] = None, **kwargs: Any
    ) -> RepositoryStats:
        """Convenience function to analyze git statistics.

        Args:
            repo_path: Path to repository
            config: Optional configuration
            **kwargs: Additional arguments

        Returns:
            RepositoryStats: Repository statistics

        Example:
            >>> from tenets.core.git.stats import analyze_git_stats
            >>> stats = analyze_git_stats(Path("."))
            >>> print(f"Health score: {stats.health_score}")
        """
        if config is None:
            config = TenetsConfig()
        analyzer = GitStatsAnalyzer(config)
        return analyzer.analyze(repo_path, **kwargs)
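The convenience function follows a common wrapper pattern: build a default configuration when none is given, then forward any extra keyword arguments to the analyzer. The sketch below reproduces that pattern with hypothetical stand-ins (`StubConfig`, `StubAnalyzer`, `analyze_with_defaults`) so it can run without the tenets package installed; none of these names exist in the library.

```python
from pathlib import Path
from typing import Any, Dict, Optional


class StubConfig:
    """Hypothetical stand-in for TenetsConfig."""


class StubAnalyzer:
    """Hypothetical stand-in for GitStatsAnalyzer that echoes its inputs."""

    def __init__(self, config: StubConfig) -> None:
        self.config = config

    def analyze(
        self,
        repo_path: Path,
        branch: Optional[str] = None,
        max_commits: int = 10000,
    ) -> Dict[str, Any]:
        return {"repo_path": repo_path, "branch": branch, "max_commits": max_commits}


def analyze_with_defaults(
    repo_path: Path, config: Optional[StubConfig] = None, **kwargs: Any
) -> Dict[str, Any]:
    """Create a default config if needed, then forward kwargs unchanged."""
    if config is None:
        config = StubConfig()
    return StubAnalyzer(config).analyze(repo_path, **kwargs)


# Keyword arguments pass straight through to analyze().
result = analyze_with_defaults(Path("."), branch="main", max_commits=500)
```

Since the keyword arguments are forwarded as-is, a name that `analyze` does not accept raises `TypeError` at call time instead of being silently ignored.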