html_analyzer

Full name: tenets.core.analysis.implementations.html_analyzer

HTML code analyzer with modern web framework support.

This module provides comprehensive analysis for HTML files, including support for HTML5, accessibility features, web components, and modern framework patterns.

Classes

HTMLStructureParser

Python
HTMLStructureParser()

Bases: HTMLParser

Custom HTML parser to extract structure information.

Source code in tenets/core/analysis/implementations/html_analyzer.py
Python
def __init__(self):
    super().__init__()
    self.elements = []
    self.current_depth = 0
    self.max_depth = 0
    self.element_stack = []
    self.scripts = []
    self.styles = []
    self.links = []
    self.meta_tags = []
    self.forms = []
    self.current_form = None
Functions
handle_starttag
Python
handle_starttag(tag, attrs)

Handle opening tags.

Source code in tenets/core/analysis/implementations/html_analyzer.py
Python
def handle_starttag(self, tag, attrs):
    """Handle opening tags."""
    self.current_depth += 1
    self.max_depth = max(self.max_depth, self.current_depth)
    self.element_stack.append(tag)

    attr_dict = dict(attrs)
    element_info = {
        "tag": tag,
        "attrs": attr_dict,
        "depth": self.current_depth,
        "line": self.getpos()[0],
    }
    self.elements.append(element_info)

    # Track specific elements
    if tag == "script":
        self.scripts.append(attr_dict)
    elif tag == "style":
        self.styles.append(attr_dict)
    elif tag == "link":
        self.links.append(attr_dict)
    elif tag == "meta":
        self.meta_tags.append(attr_dict)
    elif tag == "form":
        self.current_form = {
            "attrs": attr_dict,
            "inputs": [],
            "line": self.getpos()[0],
        }
        self.forms.append(self.current_form)
    elif tag in ["input", "textarea", "select", "button"] and self.current_form:
        self.current_form["inputs"].append(
            {
                "tag": tag,
                "attrs": attr_dict,
            }
        )
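
To make the record-building step concrete, here is a minimal sketch of what the handler collects for a small document. `StartTagRecorder` is a hypothetical stand-in written for this example, not the library class: it keeps only the `elements`, `scripts`, `links`, and `meta_tags` bookkeeping and omits depth and form tracking. The sample markup and file names are also assumptions.

```python
from html.parser import HTMLParser


class StartTagRecorder(HTMLParser):
    """Hypothetical stand-in: keeps only the record-building part of handle_starttag."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self.scripts = []
        self.links = []
        self.meta_tags = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs; dict() flattens it.
        attr_dict = dict(attrs)
        self.elements.append(
            {"tag": tag, "attrs": attr_dict, "line": self.getpos()[0]}
        )
        if tag == "script":
            self.scripts.append(attr_dict)
        elif tag == "link":
            self.links.append(attr_dict)
        elif tag == "meta":
            self.meta_tags.append(attr_dict)


# Sample markup invented for this example.
SAMPLE = (
    "<head>\n"
    '  <meta charset="utf-8">\n'
    '  <link rel="stylesheet" href="site.css">\n'
    '  <script src="app.js" defer></script>\n'
    "</head>\n"
)

recorder = StartTagRecorder()
recorder.feed(SAMPLE)
recorder.close()

# Pair each recorded element with the line its start tag appeared on.
tags_by_line = [(e["line"], e["tag"]) for e in recorder.elements]
print(tags_by_line)
print(recorder.scripts)
```

Because the attribute list is passed through `dict()`, a valueless attribute such as `defer` is stored with the value `None`, so a presence check should use `"defer" in attrs` rather than testing the value for truthiness.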
handle_endtag
Python
handle_endtag(tag)

Handle closing tags.

Source code in tenets/core/analysis/implementations/html_analyzer.py
Python
def handle_endtag(self, tag):
    """Handle closing tags."""
    if self.element_stack and self.element_stack[-1] == tag:
        self.element_stack.pop()
        self.current_depth -= 1
    if tag == "form":
        self.current_form = None
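
One consequence of this matching rule is worth illustrating. A void element such as `<meta>` or `<input>` written without a trailing slash triggers `handle_starttag` but never `handle_endtag`, so it stays on top of the stack and later closing tags no longer match. The sketch below reproduces only the depth bookkeeping in a hypothetical `DepthTracker` class. The `skip_void` flag is an assumption added for comparison, not an option of the library class.

```python
from html.parser import HTMLParser

# Void elements per the HTML standard: they never take a closing tag.
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "source", "track", "wbr",
})


class DepthTracker(HTMLParser):
    """Hypothetical stand-in reproducing only the depth bookkeeping shown above."""

    def __init__(self, skip_void=False):
        super().__init__()
        self.skip_void = skip_void  # assumption: not part of the real class
        self.current_depth = 0
        self.max_depth = 0
        self.element_stack = []

    def handle_starttag(self, tag, attrs):
        if self.skip_void and tag in VOID_ELEMENTS:
            # Count the element one level below its parent, but do not push it.
            self.max_depth = max(self.max_depth, self.current_depth + 1)
            return
        self.current_depth += 1
        self.max_depth = max(self.max_depth, self.current_depth)
        self.element_stack.append(tag)

    def handle_endtag(self, tag):
        # Same rule as above: pop only when the tag matches the top of the stack.
        if self.element_stack and self.element_stack[-1] == tag:
            self.element_stack.pop()
            self.current_depth -= 1


# Sample markup invented for this example; its true nesting depth is 3.
SAMPLE = (
    '<html><head><meta charset="utf-8"><title>t</title></head>'
    "<body><p>hi</p></body></html>"
)

as_shown = DepthTracker()
as_shown.feed(SAMPLE)
as_shown.close()

variant = DepthTracker(skip_void=True)
variant.feed(SAMPLE)
variant.close()

print(as_shown.max_depth, as_shown.current_depth)
print(variant.max_depth, variant.current_depth)
```

With the logic as shown, the unclosed `<meta>` blocks the pop for `</head>` and `</html>`, so the reported maximum depth exceeds the real nesting and the final depth does not return to zero. Treat depth figures for documents with unslashed void elements as an upper bound.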
handle_data
Python
handle_data(data)

Handle text content.

Source code in tenets/core/analysis/implementations/html_analyzer.py
Python
def handle_data(self, data):
    """Handle text content."""
    pass  # We're mainly interested in structure
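
Taken together, the start and end handlers group form controls under their enclosing form: a control is attached only while `current_form` is set, and `</form>` clears it. The following sketch isolates that behaviour in a hypothetical `FormTracker` class; the sample markup is an assumption made for illustration.

```python
from html.parser import HTMLParser


class FormTracker(HTMLParser):
    """Hypothetical stand-in that keeps only the form-tracking branches shown above."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self.current_form = None

    def handle_starttag(self, tag, attrs):
        attr_dict = dict(attrs)
        if tag == "form":
            self.current_form = {
                "attrs": attr_dict,
                "inputs": [],
                "line": self.getpos()[0],
            }
            self.forms.append(self.current_form)
        elif tag in ["input", "textarea", "select", "button"] and self.current_form:
            self.current_form["inputs"].append({"tag": tag, "attrs": attr_dict})

    def handle_endtag(self, tag):
        if tag == "form":
            self.current_form = None


# Sample markup invented for this example: one form, plus a stray input after it.
SAMPLE = """<form action="/login" method="post">
  <input type="text" name="user">
  <button type="submit">Go</button>
</form>
<input type="search" name="q">
"""

tracker = FormTracker()
tracker.feed(SAMPLE)
tracker.close()

form = tracker.forms[0]
form_summary = (form["attrs"]["action"], [i["tag"] for i in form["inputs"]])
print(form_summary)
```

The search input after `</form>` is not attached to any form, because `current_form` has already been reset. Controls linked to a form only through the `form` attribute would likewise be missed by this grouping.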

HTMLAnalyzer

Python
HTMLAnalyzer()

Bases: LanguageAnalyzer

HTML code analyzer with modern web framework support.

Provides comprehensive analysis for HTML files, including:

- HTML5 semantic elements
- CSS and JavaScript imports
- Meta tags and SEO elements
- Forms and input validation
- Accessibility features (ARIA, alt text, etc.)
- Web components and custom elements
- Framework-specific patterns (React, Vue, Angular)
- Microdata and structured data
- DOM complexity and nesting depth
- Performance hints (lazy loading, async/defer scripts)
- Security considerations (CSP, integrity checks)

Supports HTML5 and modern web development practices.

Initialize the HTML analyzer with a logger.

Source code in tenets/core/analysis/implementations/html_analyzer.py
Python
def __init__(self):
    """Initialize the HTML analyzer with logger."""
    self.logger = get_logger(__name__)
Functions
extract_imports
Python
extract_imports(content: str, file_path: Path) -> List[ImportInfo]

Extract external resource imports from HTML.

Handles: - tags for CSS -