feat(maven): add Maven tools for dependency and security scanning

  • Summary of changes:

    • Introduced a Maven tooling stack for the Agent covering API access, caching, version handling, and security scanning. Added the MavenTools toolset and registered it in the Agent’s toolset. Created a new services/maven package with an API client, an in-memory TTL cache, and version utilities, plus data models for Maven coordinates, metadata, and vulnerabilities. Added unit tests for the new Maven components.
  • Key modifications and their purpose:

    • src/agent/agent.py
      • Added MavenTools to the AgentToolset initialization to expose Maven-related capabilities to the agent.
    • src/agent/config/schema.py
      • Extended AgentConfig with Maven-specific configuration fields:
        • maven_cache_ttl: TTL for Maven API response caching (default 3600s)
        • maven_timeout: Timeout for Maven API requests (default 30s)
    • src/agent/services/__init__.py
      • Introduced a new services package export for Maven services (MavenApiService, MavenCacheService, VersionService).
    • src/agent/services/maven/__init__.py, src/agent/services/maven/api.py, src/agent/services/maven/cache.py, src/agent/services/maven/types.py, src/agent/services/maven/version.py
      • Implemented Maven API client with caching, metadata/version retrieval, existence checks, search, and vulnerability-related endpoints.
      • Implemented a TTL-based in-memory cache tailored for Maven API responses with dedicated keys for metadata, versions, existence checks, and search results.
      • Added comprehensive type definitions (MavenCoordinate, MavenMetadata, Vulnerability, SecurityScanResult, etc.) to standardize data across Maven services.
      • Implemented VersionService for parsing, comparing, sorting, and detecting updates across Maven versions.
    • src/agent/tools/maven.py
      • Implemented MavenTools, a concrete Toolset for Maven operations:
        • check_version: verify a specific version’s existence and update availability
        • check_versions_batch: batch-check multiple dependencies
        • scan_security: run Trivy-based vulnerability scans on a project or POM
        • analyze_pom: analyze a pom.xml for dependencies, properties, and modules, with optional version checks
        • Internal support for Trivy availability checks and path validation
    • src/agent/services/maven/api.py (and related classes)
      • Added MavenApiService with robust error handling (MavenApiError, MAVEN_API_ERROR, DEPENDENCY_NOT_FOUND, etc.) and async HTTP interactions.
    • src/agent/tools/__init__.py
      • Exported MavenTools as part of the public toolset API.
    • Tests (unit and tests scaffolding)
      • Added unit tests for Maven API, cache, version service, Maven coordinate parsing, and MavenTools behavior:
        • tests/unit/services/maven/test_api.py
        • tests/unit/services/maven/test_cache.py
        • tests/unit/services/maven/test_version.py
        • tests/unit/tools/test_maven.py
        • Additional test scaffolding under tests/unit/services/__init__.py and tests/unit/services/maven/__init__.py
      • Included tests covering parsing, caching TTL behavior, version parsing/comparison, and MavenPom analysis integration.
    • Unit tests for vulnerability and security scanning wiring
      • Introduced data models for vulnerability (Vulnerability, VulnerabilitySeverity) and security scan results (SecurityScanResult) to support Trivy-based findings and reporting.
  • Notable technical details:

    • Caching strategy:
      • MavenCacheService provides TTL-based caching with distinct keys for metadata, versions, existence checks, and search results to optimize repeated Maven Central queries.
      • Default TTLs are defined per data type (metadata/versions: DEFAULT_METADATA_TTL, search results: DEFAULT_SEARCH_TTL, existence checks: DEFAULT_EXISTS_TTL).
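
The caching idea above can be sketched roughly as follows. This is a minimal illustration, not the actual MavenCacheService: the class name `TtlCache`, the `metadata_key` helper, and the key format are hypothetical, and the real DEFAULT_*_TTL values are not stated in this description.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TtlCache:
    """Minimal in-memory TTL cache (hypothetical stand-in for MavenCacheService)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so expiry can be tested without sleeping.
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, ttl: float) -> None:
        # Store the absolute expiry time alongside the value.
        self._entries[key] = (self._clock() + ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are evicted lazily on read.
            del self._entries[key]
            return None
        return value


def metadata_key(group_id: str, artifact_id: str) -> str:
    """Build a per-data-type key (assumed format) so entry kinds never collide."""
    return f"metadata:{group_id}:{artifact_id}"
```

Prefixing keys with the data type lets metadata, version lists, existence checks, and search results share one store while carrying different TTLs.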
    • Maven data models:
      • MavenCoordinate parsing and validation with strict "groupId:artifactId" format enforcement.
      • MavenMetadata supports latest, release, and list of versions parsed from maven-metadata.xml.
      • Vulnerability and SecurityScanResult models enable structured security findings with severity levels (CRITICAL, HIGH, MEDIUM, LOW, UNKNOWN).
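
As an illustration of strict "groupId:artifactId" enforcement, a coordinate parser might look like the sketch below. The class name `Coordinate` and its `parse` method are hypothetical; the actual MavenCoordinate API may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinate:
    """Hypothetical stand-in for MavenCoordinate."""

    group_id: str
    artifact_id: str

    @classmethod
    def parse(cls, text: str) -> "Coordinate":
        # Exactly two non-empty parts are accepted; anything else is rejected.
        parts = [part.strip() for part in text.split(":")]
        if len(parts) != 2 or not all(parts):
            raise ValueError(
                f"Expected 'groupId:artifactId', got {text!r}"
            )
        return cls(group_id=parts[0], artifact_id=parts[1])
```

Rejecting three-part strings such as "group:artifact:version" keeps the version out of the coordinate, so it can be passed and validated separately.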
    • Version handling:
      • VersionService provides robust parsing (semver, calendar, simple numeric), comparison, sorting, and update-detection logic (major/minor/patch).
      • find_latest_versions and is_update_available enable clear determination of available upgrades.
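
To make the major/minor/patch distinction concrete, here is a simplified sketch. It handles plain numeric versions only and ignores qualifiers such as "-SNAPSHOT", so it is narrower than what VersionService is described as supporting. The function names `parse_version` and `update_type` are hypothetical.

```python
import re
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'major[.minor[.patch]]'; missing parts default to 0."""
    match = re.match(r"^(\d+)(?:\.(\d+))?(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"Unsupported version string: {version!r}")
    major, minor, patch = (
        int(group) if group is not None else 0 for group in match.groups()
    )
    return major, minor, patch


def update_type(current: str, candidate: str) -> Optional[str]:
    """Return 'major', 'minor', or 'patch' if candidate is newer, else None."""
    cur, new = parse_version(current), parse_version(candidate)
    if new <= cur:
        return None
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    return "patch"
```

Comparing tuples of integers avoids the string-ordering trap in which "1.10.0" would sort before "1.9.0".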
    • Security scanning:
      • scan_security integrates with Trivy to perform file-system scans (fs) with severity filtering and JSON output for vulnerability reporting.
      • Vulnerability parsing maps Trivy results to the Vulnerability model, including CVE IDs, package info, and fixed versions.
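
The mapping from Trivy's JSON report to a structured model could be sketched as below. The `Finding` dataclass, `Severity` enum, and `parse_trivy_report` function are hypothetical names, not the actual Vulnerability model. The JSON keys (`Results`, `Vulnerabilities`, `VulnerabilityID`, `PkgName`, `InstalledVersion`, `FixedVersion`, `Severity`) follow Trivy's JSON report format.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Severity(str, Enum):
    CRITICAL = "CRITICAL"
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"
    UNKNOWN = "UNKNOWN"


@dataclass
class Finding:
    """Hypothetical stand-in for the Vulnerability model."""

    cve_id: str
    package: str
    installed_version: str
    fixed_version: Optional[str]
    severity: Severity


def parse_trivy_report(raw: str) -> List[Finding]:
    """Flatten a Trivy JSON report into a list of findings."""
    report = json.loads(raw)
    findings: List[Finding] = []
    # "Results" and "Vulnerabilities" may be absent or null when nothing is found.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            try:
                severity = Severity(vuln.get("Severity", "UNKNOWN"))
            except ValueError:
                # Unrecognized severity labels fall back to UNKNOWN.
                severity = Severity.UNKNOWN
            findings.append(
                Finding(
                    cve_id=vuln.get("VulnerabilityID", ""),
                    package=vuln.get("PkgName", ""),
                    installed_version=vuln.get("InstalledVersion", ""),
                    fixed_version=vuln.get("FixedVersion") or None,
                    severity=severity,
                )
            )
    return findings
```

Treating a missing or empty `FixedVersion` as `None` lets callers distinguish "no fix available yet" from a concrete upgrade target.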
    • POM analysis:
      • analyze_pom parses dependencies, dependencyManagement, properties, and modules from POMs, with support for namespace-aware XML parsing.
      • Version checks can be enabled to attach version-update insights to each dependency found in the POM.
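
Namespace-aware dependency extraction can be sketched with the standard library as follows. This is an illustration only: `list_dependencies` is a hypothetical name, it reads only direct `<dependencies>` entries (not dependencyManagement, properties, or modules), and it leaves `${...}` property references unresolved.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Standard namespace declared by Maven 4.0.0 POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def list_dependencies(pom_xml: str) -> List[Dict[str, str]]:
    """Return groupId/artifactId/version for each direct dependency."""
    root = ET.fromstring(pom_xml)
    dependencies: List[Dict[str, str]] = []
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        dependencies.append(
            {
                # findtext returns the default when the child element is absent.
                field: dep.findtext(f"m:{field}", default="", namespaces=POM_NS).strip()
                for field in ("groupId", "artifactId", "version")
            }
        )
    return dependencies
```

A dependency whose version is inherited from dependencyManagement has no `<version>` child, which this sketch reports as an empty string.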
  • Security impact analysis (vulnerability data provided):

    • Introduced a structured vulnerability model (Vulnerability and VulnerabilitySeverity) and a SecurityScanResult type to represent vulnerabilities discovered by Trivy and their severities.
    • MavenTools exposes a scan_security function that uses Trivy to detect vulnerabilities, returning detailed vulnerability data including CVE IDs, affected packages, installed and fixed versions, and severity levels.
    • New capabilities enable:
      • Identification and reporting of critical- and high-severity vulnerabilities in dependencies.
      • Version-check based remediation guidance via VersionService-derived data (latest_versions, has_major_update, has_minor_update, has_patch_update).
    • No vulnerabilities were fixed in this change set; vulnerability data structures are introduced to surface and manage vulnerabilities going forward.
    • The updated codebase also ensures that vulnerability reporting is structured and testable, with unit tests validating parsing and handling of vulnerability data and Trivy output.
  • Last specific change or security finding discussed:

    • Test for _find_text with and without namespace, including a missing-text scenario, added to validate robust XML parsing in POM analysis.
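
The behavior that test exercises might look like the sketch below. The real `_find_text` signature is not shown in this description, so `find_text_sketch` is a hypothetical helper that only illustrates the three cases: a namespaced tag, a plain tag, and missing text.

```python
import xml.etree.ElementTree as ET
from typing import Optional

POM_NAMESPACE = "http://maven.apache.org/POM/4.0.0"


def find_text_sketch(element: ET.Element, tag: str) -> Optional[str]:
    """Look up a child's text, trying the POM namespace first, then no namespace."""
    node = element.find(f"{{{POM_NAMESPACE}}}{tag}")
    if node is None:
        # Explicit None check: an Element without children is falsy,
        # so `a or b` would wrongly discard a found node.
        node = element.find(tag)
    if node is None or node.text is None:
        return None
    text = node.text.strip()
    return text or None


namespaced = ET.fromstring(
    f'<project xmlns="{POM_NAMESPACE}"><artifactId>demo</artifactId></project>'
)
plain = ET.fromstring("<project><artifactId>demo</artifactId><name/></project>")
```

With these fixtures, the helper finds "demo" in both documents and returns `None` for the empty `<name/>` element or for any tag that is not present.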
