`html_analyzer`¶

Full name: tenets.core.analysis.implementations.html_analyzer

HTML code analyzer with modern web framework support.

This module provides comprehensive analysis for HTML files, including support for HTML5, accessibility features, web components, and modern framework patterns.

Classes¶

HTMLStructureParser¶

Python

HTMLStructureParser()

Bases: HTMLParser

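Custom HTML parser that extracts structure information: each start tag with its attributes, nesting depth, and line number, plus separate lists of scripts, styles, links, meta tags, and forms.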

Source code in tenets/core/analysis/implementations/html_analyzer.py

Python

def __init__(self):
    super().__init__()
    self.elements = []
    self.current_depth = 0
    self.max_depth = 0
    self.element_stack = []
    self.scripts = []
    self.styles = []
    self.links = []
    self.meta_tags = []
    self.forms = []
    self.current_form = None
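
Because the class derives from the standard library's `html.parser.HTMLParser`, it is presumably driven through `feed()` and then read back through the attributes initialized above. The following is a minimal usage sketch under that assumption. To stay self-contained it uses a hypothetical stand-in, `MiniStructureParser`, that mirrors only a few of those attributes; it is not the tenets class itself.

```python
from html.parser import HTMLParser


class MiniStructureParser(HTMLParser):
    """Hypothetical stand-in mirroring a subset of HTMLStructureParser."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self.scripts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs; valueless
        # attributes such as "defer" carry the value None.
        attr_dict = dict(attrs)
        self.elements.append(
            {"tag": tag, "attrs": attr_dict, "line": self.getpos()[0]}
        )
        if tag == "script":
            self.scripts.append(attr_dict)
        elif tag == "link":
            self.links.append(attr_dict)


SAMPLE = """<html>
<head>
<link rel="stylesheet" href="site.css">
<script src="app.js" defer></script>
</head>
</html>"""

parser = MiniStructureParser()
parser.feed(SAMPLE)
parser.close()

tags = [e["tag"] for e in parser.elements]  # document order of start tags
link_line = parser.elements[2]["line"]      # 1-based line of the <link> tag
```

After parsing, `parser.links` and `parser.scripts` hold one attribute dictionary per matching tag, which is the shape a downstream import extractor could consume.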

Functions¶

handle_starttag¶

Python

handle_starttag(tag, attrs)

Handle opening tags.

Source code in tenets/core/analysis/implementations/html_analyzer.py

Python

def handle_starttag(self, tag, attrs):
    """Handle opening tags."""
    self.current_depth += 1
    self.max_depth = max(self.max_depth, self.current_depth)
    self.element_stack.append(tag)

    attr_dict = dict(attrs)
    element_info = {
        "tag": tag,
        "attrs": attr_dict,
        "depth": self.current_depth,
        "line": self.getpos()[0],
    }
    self.elements.append(element_info)

    # Track specific elements
    if tag == "script":
        self.scripts.append(attr_dict)
    elif tag == "style":
        self.styles.append(attr_dict)
    elif tag == "link":
        self.links.append(attr_dict)
    elif tag == "meta":
        self.meta_tags.append(attr_dict)
    elif tag == "form":
        self.current_form = {
            "attrs": attr_dict,
            "inputs": [],
            "line": self.getpos()[0],
        }
        self.forms.append(self.current_form)
    elif tag in ["input", "textarea", "select", "button"] and self.current_form:
        self.current_form["inputs"].append(
            {
                "tag": tag,
                "attrs": attr_dict,
            }
        )
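
The form-tracking branch above attaches each form control to whichever `<form>` is currently open and ignores controls outside any form. The sketch below isolates that behavior in a hypothetical `FormCollector` class (not part of tenets), resetting the open form on `</form>` as `handle_endtag` does.

```python
from html.parser import HTMLParser

FORM_CONTROLS = {"input", "textarea", "select", "button"}


class FormCollector(HTMLParser):
    """Hypothetical reduction of the form-tracking logic shown above."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self.current_form = None

    def handle_starttag(self, tag, attrs):
        attr_dict = dict(attrs)
        if tag == "form":
            # Open a new form record and make it the target for controls.
            self.current_form = {"attrs": attr_dict, "inputs": []}
            self.forms.append(self.current_form)
        elif tag in FORM_CONTROLS and self.current_form is not None:
            self.current_form["inputs"].append({"tag": tag, "attrs": attr_dict})

    def handle_endtag(self, tag):
        if tag == "form":
            self.current_form = None


collector = FormCollector()
collector.feed(
    '<input name="orphan">'
    '<form action="/login">'
    '<input name="user"><button type="submit">Go</button>'
    "</form>"
)
collector.close()

# The orphan input precedes any <form>, so it is not recorded.
input_tags = [i["tag"] for i in collector.forms[0]["inputs"]]
```

One consequence of this design is that controls associated with a form only through the `form="..."` attribute, rather than by nesting, would not be grouped with it.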

handle_endtag¶

Python

handle_endtag(tag)

Handle closing tags.

Source code in tenets/core/analysis/implementations/html_analyzer.py

Python

def handle_endtag(self, tag):
    """Handle closing tags."""
    if self.element_stack and self.element_stack[-1] == tag:
        self.element_stack.pop()
        self.current_depth -= 1
    if tag == "form":
        self.current_form = None
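
Reading the code as shown, an end tag only pops the stack when it matches the top entry. Void elements such as `<br>` or `<meta>` have no end tag, so they appear to stay on the stack and keep `current_depth` raised for the rest of the document. The sketch below reproduces the depth logic in a hypothetical `DepthTracker` and compares it with a variant that skips void elements. The `skip_void` flag is an illustration only, not an option tenets provides.

```python
from html.parser import HTMLParser

# Void elements per the HTML Living Standard; they never take an end tag.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class DepthTracker(HTMLParser):
    """Hypothetical copy of the depth bookkeeping, with an optional tweak."""

    def __init__(self, skip_void=False):
        super().__init__()
        self.skip_void = skip_void
        self.current_depth = 0
        self.max_depth = 0
        self.element_stack = []

    def handle_starttag(self, tag, attrs):
        if self.skip_void and tag in VOID_ELEMENTS:
            return  # assumption: void elements should not add nesting
        self.current_depth += 1
        self.max_depth = max(self.max_depth, self.current_depth)
        self.element_stack.append(tag)

    def handle_endtag(self, tag):
        # Same rule as the documented method: pop only on an exact match.
        if self.element_stack and self.element_stack[-1] == tag:
            self.element_stack.pop()
            self.current_depth -= 1


def max_depth(html, skip_void):
    tracker = DepthTracker(skip_void=skip_void)
    tracker.feed(html)
    tracker.close()
    return tracker.max_depth


SAMPLE = "<div><br><br><p>text</p></div>"
documented = max_depth(SAMPLE, skip_void=False)  # 4: each <br> adds a level
adjusted = max_depth(SAMPLE, skip_void=True)     # 2: div > p
```

Whether the higher figure matters depends on how `max_depth` is consumed downstream; the comparison is only meant to make the bookkeeping visible.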

handle_data¶

Python

handle_data(data)

Handle text content.

Source code in tenets/core/analysis/implementations/html_analyzer.py

Python

def handle_data(self, data):
    """Handle text content."""
    pass  # We're mainly interested in structure

HTMLAnalyzer¶

Python

HTMLAnalyzer()

Bases: LanguageAnalyzer

HTML code analyzer with modern web framework support.

Provides comprehensive analysis for HTML files, including:

- HTML5 semantic elements
- CSS and JavaScript imports
- Meta tags and SEO elements
- Forms and input validation
- Accessibility features (ARIA, alt text, etc.)
- Web components and custom elements
- Framework-specific patterns (React, Vue, Angular)
- Microdata and structured data
- DOM complexity and nesting depth
- Performance hints (lazy loading, async/defer scripts)
- Security considerations (CSP, integrity checks)

Supports HTML5 and modern web development practices.
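
The performance and security items in the feature list can be illustrated with the attribute dictionaries that `HTMLStructureParser` collects in `scripts`. The helper below, `script_hints`, is hypothetical and only sketches what such checks might look like; the hint wording and the rules themselves are assumptions, not the analyzer's actual output.

```python
def script_hints(scripts):
    """Return (src, hint) pairs for external scripts. Hypothetical helper."""
    hints = []
    for attrs in scripts:
        src = attrs.get("src")
        if src is None:
            continue  # inline script: nothing is fetched
        # Valueless attributes are stored with the value None, so test for
        # key presence rather than truthiness.
        if "async" not in attrs and "defer" not in attrs:
            hints.append((src, "blocks parsing: no async or defer"))
        # Assumption: only absolute or protocol-relative URLs count as
        # third-party candidates for Subresource Integrity.
        is_remote = src.startswith(("http://", "https://", "//"))
        if is_remote and "integrity" not in attrs:
            hints.append((src, "no integrity attribute"))
    return hints


scripts = [
    {"src": "app.js", "defer": None},           # local and deferred: no hints
    {"src": "https://cdn.example.com/lib.js"},  # remote, blocking, unverified
    {},                                         # inline script: skipped
]
hints = script_hints(scripts)
```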

Initialize the HTML analyzer with logger.

Source code in tenets/core/analysis/implementations/html_analyzer.py

Python

def __init__(self):
    """Initialize the HTML analyzer with logger."""
    self.logger = get_logger(__name__)

Functions¶

extract_imports¶

Python

extract_imports(content: str, file_path: Path) -> List[ImportInfo]

Extract external resource imports from HTML.

Handles: - tags for CSS -