Enrichment
Enrichment is the process of taking a raw package inventory (names and versions found in a container, binary, or SBOM) and matching it against vulnerability databases to produce actionable findings. ScanRook queries multiple sources in a defined pipeline order, merging results and deduplicating across providers.
What enrichment means
Turning a list of packages into a list of vulnerabilities.
When ScanRook scans an artifact, it first extracts a package inventory: the list of installed software with ecosystem, name, and version. This inventory by itself has no security information. Enrichment is the step that queries external vulnerability databases to determine which of those packages have known CVEs.
Each enrichment source contributes different data. OSV provides broad ecosystem coverage and affected version ranges. NVD adds authoritative CVSS scores and CPE-based matching. Distro feeds provide fix status specific to the Linux distribution. The scanner merges all of this into a single unified finding per CVE-package pair.
Enrichment pipeline
The order in which ScanRook queries vulnerability data sources.
Each active enrichment step queries an external API. Results are cached locally and, when configured, in PostgreSQL/Redis.
Enrichment sources
Detailed description of each vulnerability data source in the pipeline.
Open Source Vulnerabilities (OSV)
osv: Google's open-source vulnerability database. Covers the broadest set of ecosystems via a single batch API.
Queries packages in batches by ecosystem, name, and version. Returns matched advisories with affected ranges, severity, and fix versions.
Always queried first. Primary source for npm, PyPI, Go, Rust, Ruby, Maven, NuGet, DPKG, APK, and RPM packages.
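As a rough illustration of the batch query described above, the sketch below builds a request body in the shape accepted by OSV's public `querybatch` endpoint. The `(ecosystem, name, version)` tuple format and the `build_osv_batch` helper are assumptions made for this example, not ScanRook's internal types.

```python
import json

# Public OSV batch endpoint; the body would be sent here with an HTTP POST.
OSV_BATCH_URL = "https://api.osv.dev/v1/querybatch"


def build_osv_batch(packages):
    """Build the JSON body for one OSV batch query.

    `packages` is an iterable of (ecosystem, name, version) tuples.
    This tuple shape is hypothetical and chosen only for the example.
    """
    return {
        "queries": [
            {"package": {"ecosystem": eco, "name": name}, "version": version}
            for eco, name, version in packages
        ]
    }


# Illustrative inventory entries, not real scan output.
inventory = [("PyPI", "jinja2", "2.4.1"), ("npm", "lodash", "4.17.20")]
body = build_osv_batch(inventory)
print(json.dumps(body, indent=2))
```

Sending one body for the whole inventory keeps the number of round trips low, which is the main reason to prefer the batch endpoint over per-package queries.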
National Vulnerability Database (NVD)
nvd: NIST's authoritative CVE dictionary. Provides CVSS scores, CPE matching, and detailed advisory metadata.
Per-CVE lookup by ID, plus CPE-based product/version matching. Returns CVSS v3.1 base scores, vector strings, references, and CWE classifications.
Used as a second-pass enrichment after OSV. Adds CVSS scores to OSV findings and discovers additional CVEs via CPE matching. Requires NVD_API_KEY for higher rate limits.
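The per-CVE lookup can be sketched as follows, assuming the NVD CVE API 2.0 with its `cveId` query parameter and `apiKey` request header. The helper names are hypothetical, and the sample response is a trimmed, illustrative stand-in rather than real NVD data.

```python
import os
from urllib.parse import urlencode

NVD_CVE_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"


def build_nvd_request(cve_id, api_key=None):
    """Return (url, headers) for a per-CVE lookup. The key is optional."""
    url = f"{NVD_CVE_URL}?{urlencode({'cveId': cve_id})}"
    headers = {"apiKey": api_key} if api_key else {}
    return url, headers


def extract_cvss_v31(response):
    """Return (base_score, vector_string) from a response, or None if absent."""
    for vuln in response.get("vulnerabilities", []):
        metrics = vuln.get("cve", {}).get("metrics", {})
        for metric in metrics.get("cvssMetricV31", []):
            data = metric.get("cvssData", {})
            return data.get("baseScore"), data.get("vectorString")
    return None


# Trimmed, illustrative response shape (values are made up for the example).
sample = {
    "vulnerabilities": [
        {
            "cve": {
                "id": "CVE-2024-12345",
                "metrics": {
                    "cvssMetricV31": [
                        {
                            "cvssData": {
                                "baseScore": 7.5,
                                "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H",
                            }
                        }
                    ]
                },
            }
        }
    ]
}

url, headers = build_nvd_request("CVE-2024-12345", os.environ.get("NVD_API_KEY"))
print(url)
print(extract_cvss_v31(sample))
```

When `NVD_API_KEY` is unset the request is still valid; it is simply subject to the lower unauthenticated rate limit.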
Red Hat CSAF / Security Data API
redhat: Red Hat's security data API provides fix status, errata, and CSAF advisories for RHEL packages.
Queries per-CVE fix status for RPM packages. Returns fix state (affected, fixed, not affected), errata IDs, and fixed-in versions.
Automatically activated for RPM packages detected in RHEL-based container images. Also invoked when --oval-redhat is provided.
Distro security feeds
ubuntu, debian, alpine, amazon, oracle, wolfi, chainguard: Distribution-specific security trackers that provide precise fix status for packages in their repositories.
Maps CVEs to distro package versions with fix status (fixed, not-affected, needs-triage). Provides distro-specific severity and urgency ratings.
Activated based on the OS detected in container scans: Ubuntu CVE Tracker for Ubuntu/DPKG, Debian Security Tracker for Debian/DPKG, Alpine SecDB for Alpine/APK, the Amazon Linux feed for AL2/AL2023, and the Oracle Linux, Wolfi SecDB, and Chainguard advisory feeds.
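One plausible way to pick a feed from the detected OS is to read the `ID` field of the image's `/etc/os-release` file and look it up in a table. The sketch below assumes that approach; how ScanRook actually detects the OS is not stated here, and `OS_ID_TO_FEED`, `parse_os_release`, and `select_distro_feed` are hypothetical names.

```python
# Maps os-release ID values to the feed identifiers listed above.
# The mapping itself is an assumption made for this example.
OS_ID_TO_FEED = {
    "ubuntu": "ubuntu",
    "debian": "debian",
    "alpine": "alpine",
    "amzn": "amazon",   # Amazon Linux reports ID="amzn"
    "ol": "oracle",     # Oracle Linux reports ID="ol"
    "wolfi": "wolfi",
    "chainguard": "chainguard",
}


def parse_os_release(text):
    """Parse KEY=value lines from os-release content into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def select_distro_feed(os_release_text):
    """Return the feed identifier for the detected OS, or None if unsupported."""
    os_id = parse_os_release(os_release_text).get("ID", "").lower()
    return OS_ID_TO_FEED.get(os_id)


print(select_distro_feed('NAME="Alpine Linux"\nID=alpine\nVERSION_ID=3.19.1\n'))
```

Returning `None` for an unknown OS lets the caller skip distro enrichment and rely on the generic sources instead.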
EPSS (Exploit Prediction Scoring System)
epss: FIRST's model that predicts the probability a CVE will be exploited in the next 30 days.
Returns a probability score (0.0-1.0) and percentile for each CVE, indicating real-world exploit likelihood. Results are cached for 24 hours.
Always active. Applied to all findings after vulnerability matching. Batch queries api.first.org for all CVE IDs in the report.
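The batch lookup against api.first.org can be sketched like this, assuming the public EPSS endpoint with a comma-separated `cve` parameter and a response whose `data` rows carry `epss` and `percentile` as strings. The helper names and the sample values are illustrative only.

```python
EPSS_URL = "https://api.first.org/data/v1/epss"


def build_epss_url(cve_ids):
    """Build one request URL covering every CVE ID in the report."""
    return f"{EPSS_URL}?cve={','.join(cve_ids)}"


def parse_epss(response):
    """Map each CVE ID to (probability, percentile) as floats."""
    return {
        row["cve"]: (float(row["epss"]), float(row["percentile"]))
        for row in response.get("data", [])
    }


# Illustrative response rows; the numbers are made up for the example.
sample = {
    "data": [
        {"cve": "CVE-2024-12345", "epss": "0.00123", "percentile": "0.45"},
    ]
}

print(build_epss_url(["CVE-2024-12345", "CVE-2024-12346"]))
print(parse_epss(sample))
```

Converting the string fields to floats once, at parse time, keeps later sorting and threshold checks simple.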
CISA KEV (Known Exploited Vulnerabilities)
kev: CISA's catalog of vulnerabilities known to be actively exploited in the wild.
Boolean flag: is this CVE in the KEV catalog? Also provides the date added, required remediation date, and ransomware campaign association.
Always active. Downloads the full KEV catalog (cached as a HashSet), then flags any finding whose CVE ID appears in the catalog.
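The catalog lookup can be sketched as below, assuming the KEV JSON feed's `vulnerabilities` array with its `cveID`, `dateAdded`, `dueDate`, and `knownRansomwareCampaignUse` fields. This example indexes the catalog in a dictionary rather than a plain set so the extra metadata stays available; the finding shape and helper names are hypothetical, and the catalog entry is illustrative.

```python
def build_kev_index(catalog):
    """Index catalog entries by CVE ID for constant-time membership checks."""
    return {entry["cveID"]: entry for entry in catalog.get("vulnerabilities", [])}


def flag_kev(findings, kev_index):
    """Annotate each finding with a KEV flag and, when present, KEV metadata."""
    for finding in findings:
        entry = kev_index.get(finding["cve"])
        finding["kev"] = entry is not None
        if entry is not None:
            finding["kev_date_added"] = entry.get("dateAdded")
            finding["kev_due_date"] = entry.get("dueDate")
            finding["kev_ransomware"] = entry.get("knownRansomwareCampaignUse")
    return findings


# Illustrative catalog entry; the dates and values are made up for the example.
catalog = {
    "vulnerabilities": [
        {
            "cveID": "CVE-2024-12345",
            "dateAdded": "2024-01-15",
            "dueDate": "2024-02-05",
            "knownRansomwareCampaignUse": "Unknown",
        }
    ]
}

findings = [{"cve": "CVE-2024-12345"}, {"cve": "CVE-2024-99999"}]
flagged = flag_kev(findings, build_kev_index(catalog))
print(flagged)
```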
Deduplication and merging
How ScanRook handles overlapping results from multiple sources.
When the same CVE is reported by multiple sources (for example, both OSV and NVD report CVE-2024-12345 for the same package), ScanRook merges them into a single finding. The merge logic:
- Uses the highest CVSS score from any source
- Combines evidence items from all sources
- Prefers distro-specific fix status over generic fix versions
- Retains all references and advisory URLs
- Sets the confidence tier based on the strongest evidence available
This approach keeps findings comprehensive while ensuring that each vulnerability raises only one alert per package.
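The merge rules listed above can be sketched as a single pass over per-source records. This is a minimal illustration under stated assumptions, not ScanRook's implementation: the record fields (`cvss`, `evidence`, `references`, `fix`, `fix_is_distro`) are hypothetical, and the confidence tier is modeled as an integer where a higher value means stronger evidence.

```python
def merge_findings(records):
    """Merge per-source records into one finding per (CVE, package) pair."""
    merged = {}
    for rec in records:
        key = (rec["cve"], rec["package"])
        out = merged.setdefault(key, {
            "cve": rec["cve"], "package": rec["package"],
            "cvss": None, "evidence": [], "references": [],
            "fix": None, "fix_is_distro": False, "confidence": 0,
        })
        # The highest CVSS score from any source wins.
        score = rec.get("cvss")
        if score is not None and (out["cvss"] is None or score > out["cvss"]):
            out["cvss"] = score
        # Combine evidence; keep every reference once, in first-seen order.
        out["evidence"].extend(rec.get("evidence", []))
        for url in rec.get("references", []):
            if url not in out["references"]:
                out["references"].append(url)
        # A distro-specific fix status replaces a generic fix version.
        fix = rec.get("fix")
        if fix is not None:
            is_distro = rec.get("fix_is_distro", False)
            if out["fix"] is None or (is_distro and not out["fix_is_distro"]):
                out["fix"], out["fix_is_distro"] = fix, is_distro
        # The strongest evidence sets the confidence tier.
        out["confidence"] = max(out["confidence"], rec.get("confidence", 0))
    return list(merged.values())


# Illustrative records for one CVE-package pair from three sources.
records = [
    {"cve": "CVE-2024-12345", "package": "openssl", "cvss": 7.5,
     "evidence": ["osv:range"], "references": ["https://example.com/a"],
     "fix": "3.0.13", "confidence": 2},
    {"cve": "CVE-2024-12345", "package": "openssl", "cvss": 9.8,
     "evidence": ["nvd:cpe"],
     "references": ["https://example.com/a", "https://example.com/b"],
     "confidence": 1},
    {"cve": "CVE-2024-12345", "package": "openssl",
     "evidence": ["debian:tracker"], "fix": "3.0.11-1~deb12u2",
     "fix_is_distro": True, "confidence": 3},
]
result = merge_findings(records)
print(result)
```

Keying on the `(cve, package)` pair means the same CVE affecting two different packages still produces two findings, which matches the one-finding-per-pair rule described earlier.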