feat(maven): add Maven tools for dependency and security scanning
-
Summary of changes:
- Introduced a Maven tooling stack integrated into the Agent, covering API access, caching, version handling, and security scanning. Added the MavenTools toolset and registered it in the Agent's toolset. Created a new services/maven package containing an API client, an in-memory TTL cache, version utilities, and data models for Maven coordinates, metadata, and vulnerabilities. Added unit tests for the new Maven components.
-
Key modifications and their purpose:
- src/agent/agent.py
- Added MavenTools to the AgentToolset initialization to expose Maven-related capabilities to the agent.
- src/agent/config/schema.py
- Extended AgentConfig with Maven-specific configuration fields:
- maven_cache_ttl: TTL for Maven API response caching (default 3600s)
- maven_timeout: Timeout for Maven API requests (default 30s)
- src/agent/services/__init__.py
- Introduced a new services package export for Maven services (MavenApiService, MavenCacheService, VersionService).
- src/agent/services/maven/__init__.py, src/agent/services/maven/api.py, src/agent/services/maven/cache.py, src/agent/services/maven/types.py, src/agent/services/maven/version.py
- Implemented Maven API client with caching, metadata/version retrieval, existence checks, search, and vulnerability-related endpoints.
- Implemented a TTL-based in-memory cache tailored for Maven API responses with dedicated keys for metadata, versions, existence checks, and search results.
- Added comprehensive type definitions (MavenCoordinate, MavenMetadata, Vulnerability, SecurityScanResult, etc.) to standardize data across Maven services.
- Implemented VersionService for parsing, comparing, sorting, and detecting updates across Maven versions.
- src/agent/tools/maven.py
- Implemented MavenTools, a concrete Toolset for Maven operations:
- check_version: verify a specific version’s existence and update availability
- check_versions_batch: batch-check multiple dependencies
- scan_security: run Trivy-based vulnerability scans on a project or POM
- analyze_pom: analyze a pom.xml for dependencies, properties, and modules, with optional version checks
- Internal support for Trivy availability checks and path validation
- src/agent/services/maven/api.py (and related classes)
- Added MavenApiService with robust error handling (MavenApiError, MAVEN_API_ERROR, DEPENDENCY_NOT_FOUND, etc.) and async HTTP interactions.
- src/agent/tools/__init__.py
- Exported MavenTools as part of the public toolset API.
- Tests (unit tests and test scaffolding)
- Added unit tests for Maven API, cache, version service, Maven coordinate parsing, and MavenTools behavior:
- tests/unit/services/maven/test_api.py
- tests/unit/services/maven/test_cache.py
- tests/unit/services/maven/test_version.py
- tests/unit/tools/test_maven.py
- Additional test scaffolding under tests/unit/services/__init__.py and tests/unit/services/maven/__init__.py
- Included tests covering parsing, caching TTL behavior, version parsing/comparison, and MavenPom analysis integration.
- Added unit tests for the vulnerability and security scanning wiring.
- Introduced data models for vulnerability (Vulnerability, VulnerabilitySeverity) and security scan results (SecurityScanResult) to support Trivy-based findings and reporting.
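As an illustration of the TTL-based in-memory cache described above for src/agent/services/maven/cache.py, here is a minimal sketch. It is not the actual MavenCacheService: the class name TtlCache, the metadata_key helper, the key format, and the injectable clock are all assumptions made for the example. Only the 3600s default mirrors the maven_cache_ttl default listed above.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TtlCache:
    """Hypothetical stand-in for a TTL cache; not the real MavenCacheService."""

    def __init__(
        self,
        default_ttl: float = 3600.0,  # mirrors the maven_cache_ttl default
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
    ) -> None:
        self._default_ttl = default_ttl
        self._clock = clock
        # key -> (absolute expiry time, cached value)
        self._entries: Dict[str, Tuple[float, Any]] = {}

    @staticmethod
    def metadata_key(group_id: str, artifact_id: str) -> str:
        # Assumed key layout: one prefix per data type keeps key spaces apart.
        return f"metadata:{group_id}:{artifact_id}"

    def set(self, key: str, value: Any, ttl: Optional[float] = None) -> None:
        lifetime = self._default_ttl if ttl is None else ttl
        self._entries[key] = (self._clock() + lifetime, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are evicted lazily, on read.
            del self._entries[key]
            return None
        return value
```

Passing the clock in as a parameter is a design choice for the sketch: it lets TTL expiry be tested without sleeping.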
-
Notable technical details:
- Caching strategy:
- MavenCacheService provides TTL-based caching with distinct keys for metadata, versions, existence checks, and search results to optimize repeated Maven Central queries.
- Default TTLs are defined per data type (metadata/versions: DEFAULT_METADATA_TTL, search results: DEFAULT_SEARCH_TTL, existence checks: DEFAULT_EXISTS_TTL).
- Maven data models:
- MavenCoordinate parsing and validation with strict "groupId:artifactId" format enforcement.
- MavenMetadata supports latest, release, and list of versions parsed from maven-metadata.xml.
- Vulnerability and SecurityScanResult models enable structured security findings with severity levels (CRITICAL, HIGH, MEDIUM, LOW, UNKNOWN).
- Version handling:
- VersionService provides robust parsing (semver, calendar, simple numeric), comparison, sorting, and update-detection logic (major/minor/patch).
- find_latest_versions and is_update_available enable clear determination of available upgrades.
- Security scanning:
- scan_security integrates with Trivy to perform file-system scans (fs) with severity filtering and JSON output for vulnerability reporting.
- Vulnerability parsing maps Trivy results to the Vulnerability model, including CVE IDs, package info, and fixed versions.
- POM analysis:
- analyze_pom parses dependencies, dependencyManagement, properties, and modules from POMs, with support for namespace-aware XML parsing.
- Version checks can be enabled to attach version-update insights to each dependency found in the POM.
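To make the major/minor/patch update detection from the "Version handling" notes above concrete, here is a simplified sketch. It is not the VersionService implementation: parse_version and classify_update are hypothetical names, and the parsing deliberately ignores qualifiers such as -SNAPSHOT, so it does not reproduce Maven's full version ordering rules.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Extract up to three numeric components; missing ones default to 0.

    Simplification (assumption): anything after the first '-' is dropped,
    so '2.0.0-SNAPSHOT' and '2.0.0' parse to the same tuple.
    """
    base = version.split("-", 1)[0]
    parts = [int(n) for n in re.findall(r"\d+", base)[:3]]
    while len(parts) < 3:
        parts.append(0)
    return (parts[0], parts[1], parts[2])


def classify_update(current: str, candidate: str) -> str:
    """Return 'major', 'minor', 'patch', or 'none' for a candidate version."""
    cur = parse_version(current)
    new = parse_version(candidate)
    if new <= cur:
        return "none"  # candidate is not newer than the current version
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    return "patch"
```

For example, classify_update("1.2.3", "1.3.0") reports a minor update, while a candidate equal to or older than the current version reports none.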
-
Security impact analysis (vulnerability data provided):
- Introduced a structured vulnerability model (Vulnerability and VulnerabilitySeverity) and a SecurityScanResult type to represent vulnerabilities discovered by Trivy and their severities.
- MavenTools exposes a scan_security function that uses Trivy to detect vulnerabilities, returning detailed vulnerability data including CVE IDs, affected packages, installed and fixed versions, and severity levels.
- New capabilities enable:
- Identification and reporting of critical- and high-severity vulnerabilities in dependencies.
- Version-check based remediation guidance via VersionService-derived data (latest_versions, has_major_update, has_minor_update, has_patch_update).
- No vulnerabilities were fixed in this change set; vulnerability data structures are introduced to surface and manage vulnerabilities going forward.
- Vulnerability reporting is structured and testable: unit tests validate the parsing and handling of vulnerability data and Trivy output.
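The mapping from Trivy output to a structured vulnerability model, as described above, might look roughly like the following sketch. The Vulnerability and VulnerabilitySeverity names come from this change, but the field names and the parse_trivy_report helper are assumptions for illustration. The JSON keys (Results, Vulnerabilities, VulnerabilityID, PkgName, InstalledVersion, FixedVersion, Severity) follow Trivy's JSON report format, and the CVE IDs in the usage note are placeholders.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class VulnerabilitySeverity(str, Enum):
    CRITICAL = "CRITICAL"
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"
    UNKNOWN = "UNKNOWN"


@dataclass(frozen=True)
class Vulnerability:
    # Field names are assumptions; the real model may differ.
    cve_id: str
    package: str
    installed_version: str
    fixed_version: Optional[str]
    severity: VulnerabilitySeverity


def parse_trivy_report(raw: str) -> List[Vulnerability]:
    """Flatten a Trivy JSON report into a list of Vulnerability records."""
    report = json.loads(raw)
    findings: List[Vulnerability] = []
    # Both keys may be absent or null when nothing was found.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            label = str(vuln.get("Severity", "UNKNOWN")).upper()
            if label in VulnerabilitySeverity.__members__:
                severity = VulnerabilitySeverity[label]
            else:
                severity = VulnerabilitySeverity.UNKNOWN
            findings.append(
                Vulnerability(
                    cve_id=vuln.get("VulnerabilityID", ""),
                    package=vuln.get("PkgName", ""),
                    installed_version=vuln.get("InstalledVersion", ""),
                    fixed_version=vuln.get("FixedVersion") or None,
                    severity=severity,
                )
            )
    return findings
```

Input for such a parser could come from a command such as `trivy fs --format json --severity HIGH,CRITICAL <path>`. Unrecognized severity labels fall back to UNKNOWN rather than raising, so one malformed finding does not discard the rest of the report.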
-
Last specific change or security finding discussed:
- Test for _find_text with and without namespace, including a missing-text scenario, added to validate robust XML parsing in POM analysis.
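For context on the _find_text test mentioned above, a namespace-tolerant text lookup could be sketched as follows. This is not the actual helper: the function name find_text, its signature, and its fallback order are assumptions. The namespace URI is the standard Maven POM 4.0.0 namespace.

```python
import xml.etree.ElementTree as ET
from typing import Optional

POM_NS = "http://maven.apache.org/POM/4.0.0"


def find_text(element: ET.Element, tag: str) -> Optional[str]:
    """Return the stripped text of a direct child, or None if absent or empty.

    Tries the namespaced tag first, then falls back to the bare tag so that
    POMs with and without an xmlns declaration are both handled.
    """
    child = element.find(f"{{{POM_NS}}}{tag}")
    if child is None:
        child = element.find(tag)
    if child is None or child.text is None:
        return None
    text = child.text.strip()
    return text or None
```

The three cases named in the test description (namespaced, non-namespaced, missing text) map to the namespaced lookup, the bare-tag fallback, and the None return.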