Caching
ScanRook makes live API calls to OSV, NVD, Red Hat, EPSS, and CISA to enrich findings with up-to-date vulnerability data. To avoid hitting rate limits and to make repeated scans of the same artifact fast, every API response is cached locally. This page explains how the three caching layers work and how to configure them.
Three-layer cache hierarchy
Each lookup checks the configured layers in order: in-memory (Redis) → file → PostgreSQL → live API.
File cache (~/.scanrook/cache/)
- Always active by default
- Keyed by SHA256 of request params
- Stored as raw JSON bytes per entry
- Disable with SCANNER_SKIP_CACHE=1

PostgreSQL (DATABASE_URL)
- Opt-in via DATABASE_URL env var
- Shared across multiple worker pods
- Stores OSV advisories and Red Hat CVE data
- Schema auto-initialized on first use

Redis (REDIS_URL)
- Opt-in via REDIS_URL env var
- Fastest layer for multi-worker setups
- Used for NVD CPE lookups and rate-limit coordination
- Not required for single-machine use

When a vulnerability lookup is needed, ScanRook checks each layer in order. A cache hit in any layer skips all subsequent layers including the live API call. Responses fetched from the API are written back to all configured layers so subsequent requests are served from cache.
The file cache is always active. PostgreSQL and Redis layers are additive — configuring them speeds up multi-worker deployments where multiple scanner pods share the same artifact queue but don’t share a local filesystem.
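
The lookup order and write-back behaviour described above can be sketched as a small read-through cache. This is a minimal illustration, not ScanRook's implementation: `CacheLayer`, `DictLayer`, and `lookup` are hypothetical names, and the in-process dictionaries stand in for the real Redis, file, and PostgreSQL backends.

```python
from __future__ import annotations

from typing import Callable, Optional, Protocol


class CacheLayer(Protocol):
    """Hypothetical interface shared by every cache layer."""

    def get(self, key: str) -> Optional[bytes]: ...
    def put(self, key: str, value: bytes) -> None: ...


class DictLayer:
    """In-process stand-in for a real layer (Redis, file, PostgreSQL)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self.store.get(key)

    def put(self, key: str, value: bytes) -> None:
        self.store[key] = value


def lookup(key: str, layers: list[CacheLayer], fetch_live: Callable[[], bytes]) -> bytes:
    # Check each configured layer in order; the first hit skips everything after it.
    for layer in layers:
        cached = layer.get(key)
        if cached is not None:
            return cached
    # Every layer missed: make the live API call, then write back to all layers.
    fresh = fetch_live()
    for layer in layers:
        layer.put(key, fresh)
    return fresh
```

Calling `lookup` twice with the same key invokes `fetch_live` only once; the second call is answered by the first layer in the list.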
Cache key format
Every cache entry is keyed by SHA256 of its request parameters.
Cache keys are computed by hashing a list of string parts together. For example, the Red Hat CVE API response for CVE-2024-1234 is stored under sha256("redhat_cve" + "CVE-2024-1234"). The file cache stores each entry as a single file named by the hex-encoded hash inside the cache directory. This makes cache lookups O(1) regardless of how many entries are stored.
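
As a rough sketch of the key scheme, the snippet below hashes a list of string parts and maps the result to a file path. The helper names `cache_key` and `cache_path` are hypothetical, and the exact way ScanRook joins the parts (separator, encoding) is an assumption here; plain concatenation is used only because it matches the sha256("redhat_cve" + "CVE-2024-1234") notation.

```python
import hashlib
from pathlib import Path


def cache_key(parts: list[str]) -> str:
    # Assumption: parts are fed to the hash in order, UTF-8 encoded, with no separator.
    digest = hashlib.sha256()
    for part in parts:
        digest.update(part.encode("utf-8"))
    return digest.hexdigest()


def cache_path(cache_dir: Path, parts: list[str]) -> Path:
    # One file per entry, named by the hex-encoded hash inside the cache directory.
    return cache_dir / cache_key(parts)
```

For example, `cache_path(Path.home() / ".scanrook" / "cache", ["redhat_cve", "CVE-2024-1234"])` yields a path whose file name is a 64-character hex string.
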
| Source | Key components | TTL | Layers |
|---|---|---|---|
| OSV batch query | sha256(ecosystems + package names) | 7 days | File + PG |
| OSV advisory JSON | sha256('osv_advisory' + advisory_id) | 7 days | File + PG |
| NVD CVE JSON | sha256('nvd_cve' + CVE-ID) | 30 days | File + PG |
| Red Hat CVE JSON | sha256('redhat_cve' + CVE-ID) | dynamic (30 days default) | File + PG |
| Red Hat per-package CVE list | sha256('redhat_pkg_cves' + package_name) | 30 days | File only |
| EPSS batch scores | sha256('epss_v1' + sorted CVE IDs) | 1 day | File only |
| CISA KEV catalog | sha256('kev_catalog') | 1 day | File only |
| OVAL XML auto-download | sha256('oval_auto' + distro_key) | 7 days | File only |
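EPSS chunk keys are built from the sorted CVE IDs in the batch, so the same set of CVEs always produces the same key regardless of the order in which they were found. This keeps cache hits stable across repeated scans of the same artifact.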
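
To illustrate why sorting matters for EPSS keys, here is a short sketch. The function name `epss_batch_key` is hypothetical, and concatenating the IDs without a separator is an assumption made for the example.

```python
import hashlib


def epss_batch_key(cve_ids: list[str]) -> str:
    # Sorting first makes the key independent of the order the CVEs were found in.
    # Assumption: IDs are appended to the "epss_v1" prefix without separators.
    material = "epss_v1" + "".join(sorted(cve_ids))
    return hashlib.sha256(material.encode("utf-8")).hexdigest()
```

Two scans that surface the same CVEs in a different order therefore resolve to the same cache entry.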
Dynamic TTL for Red Hat data
Recently-modified CVEs get shorter TTLs so fixes are surfaced quickly.
Red Hat CVE entries include a lastModified timestamp, which the scanner uses to adjust the cache TTL per entry. If a CVE was modified in the last 7 days, it is re-fetched after 1 day regardless of the base TTL. CVEs that haven’t changed in more than 90 days get a longer TTL of up to 90 days. Everything in between uses the base TTL (30 days by default, see SCANNER_REDHAT_TTL_DAYS). This balances freshness with API load.
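
The TTL rule can be expressed as a small function. This is a sketch under stated assumptions rather than the scanner's actual code: `redhat_ttl` is a hypothetical name, entries older than 90 days are given the full 90-day maximum (the text only says "up to 90 days"), and anything between 7 and 90 days falls back to the base TTL.

```python
from datetime import datetime, timedelta, timezone


def redhat_ttl(last_modified: datetime, now: datetime, base_days: int = 30) -> timedelta:
    age = now - last_modified
    if age <= timedelta(days=7):
        # Recently modified: re-fetch after one day regardless of the base TTL.
        return timedelta(days=1)
    if age > timedelta(days=90):
        # Assumption: stable entries get the 90-day maximum outright.
        return timedelta(days=90)
    # Assumption: everything in between keeps the base TTL.
    return timedelta(days=base_days)
```

Passing `now` explicitly, for example `datetime.now(timezone.utc)`, keeps the function deterministic and easy to test.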
Managing the cache
Use the db subcommand to inspect and refresh cached vulnerability data.
- scanrook db sources: List all configured cache sources (file cache path, PostgreSQL URL if set, Redis URL if set).
- scanrook db check: Show the number of cached entries per source, disk usage, and oldest/newest entries.
- scanrook db update: Pre-warm the cache by downloading the latest KEV catalog, EPSS scores, and any pending OVAL files.
To clear the entire file cache: rm -rf ~/.scanrook/cache/. The next scan will re-populate it from live APIs.
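
For scripted inspection outside the CLI, the file cache can also be summarised directly from disk. The sketch below is not how scanrook db check works internally; `file_cache_stats` is a hypothetical helper that assumes a flat directory with one file per entry, as described in the cache key section.

```python
from pathlib import Path


def file_cache_stats(cache_dir: Path) -> dict:
    # A missing directory (for example right after rm -rf) counts as an empty cache.
    files = [p for p in cache_dir.iterdir() if p.is_file()] if cache_dir.is_dir() else []
    mtimes = [p.stat().st_mtime for p in files]
    return {
        "entries": len(files),
        "bytes": sum(p.stat().st_size for p in files),
        "oldest": min(mtimes) if mtimes else None,  # Unix timestamps
        "newest": max(mtimes) if mtimes else None,
    }
```

A typical call would be `file_cache_stats(Path.home() / ".scanrook" / "cache")`.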
Environment variables
All caching behaviour can be tuned without changing any config file.
| Variable | Default | Description |
|---|---|---|
| SCANNER_CACHE | ~/.scanrook/cache/ | Override the file cache directory. |
| SCANNER_SKIP_CACHE | 0 | Set to 1 to bypass all file cache reads and writes. Forces fresh API calls on every scan. |
| DATABASE_URL | (unset) | PostgreSQL connection string. Enables the database cache layer for OSV advisories and Red Hat CVE data. |
| REDIS_URL | (unset) | Redis connection string (redis://host:port). Enables the in-memory cache layer for multi-worker deployments. |
| SCANNER_REDHAT_TTL_DAYS | 30 | How many days to treat Red Hat CVE API responses as fresh before re-fetching. |
| SCANNER_OSV_TTL_DAYS | 7 | How many days to treat OSV advisory responses as fresh. |
| SCANNER_EPSS_TTL_DAYS | 1 | How many days to treat EPSS scores as fresh (re-fetched daily since scores change). |
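
For wrapper scripts that need to mirror these settings, the variables and defaults in the table can be read as follows. `CacheConfig` and `load_cache_config` are hypothetical names used for illustration; ScanRook itself may parse these values differently.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class CacheConfig:
    cache_dir: Path
    skip_cache: bool
    database_url: Optional[str]
    redis_url: Optional[str]
    redhat_ttl_days: int
    osv_ttl_days: int
    epss_ttl_days: int


def load_cache_config(env: Mapping[str, str] = os.environ) -> CacheConfig:
    # Defaults follow the table above; unset or empty URLs leave that layer disabled.
    return CacheConfig(
        cache_dir=Path(env.get("SCANNER_CACHE", "~/.scanrook/cache/")).expanduser(),
        skip_cache=env.get("SCANNER_SKIP_CACHE", "0") == "1",
        database_url=env.get("DATABASE_URL") or None,
        redis_url=env.get("REDIS_URL") or None,
        redhat_ttl_days=int(env.get("SCANNER_REDHAT_TTL_DAYS", "30")),
        osv_ttl_days=int(env.get("SCANNER_OSV_TTL_DAYS", "7")),
        epss_ttl_days=int(env.get("SCANNER_EPSS_TTL_DAYS", "1")),
    )
```

Accepting the environment as a mapping makes it possible to test the defaults without touching the real process environment.
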
Cache in CI/CD pipelines
Mount the cache directory as a persistent volume or artifact to speed up pipeline scans.
In GitHub Actions or GitLab CI, persist ~/.scanrook/cache between runs using the platform's cache mechanism. On the first pipeline run, all API calls are made live. Subsequent runs for the same set of packages hit the file cache and complete much faster: typically in under 3 seconds for a fully warm cache versus 30–60 seconds for a cold scan.
When DATABASE_URL is set, the PostgreSQL cache is shared across all worker pods automatically — no volume mounting needed. This is the recommended approach for self-hosted DeltaGuard deployments where multiple worker replicas scan different jobs concurrently.