Architecture
ScanRook is a multi-service vulnerability scanning platform deployed on a bare-metal Kubernetes cluster. Three components (a Next.js web app, a Go dispatcher, and a Rust scanner) process scan jobs together, coordinating through PostgreSQL and S3.
System overview
Three services connected by PostgreSQL and S3/MinIO.
Cluster topology
Three-node bare-metal cluster with dedicated namespaces.
| Namespace | Contents |
|---|---|
| scanrook | scanrook-web (3 replicas), scanrook-dispatcher (1 replica), redis (1 replica) |
| db | CNPG pg-shared cluster (3 instances, 50 GiB PVC each) |
| storage | MinIO (1 replica, 50 GiB PVC) |
| ingress-nginx | Ingress controller (NodePort 30080/30443) |
| monitoring | Prometheus, Grafana, Loki, Promtail, node-exporter, kube-state-metrics |
| argocd | ArgoCD (GitOps deployment) |
| longhorn-system | Longhorn distributed block storage |
| cnpg-system | CloudNativePG operator |
| kube-system | Cilium CNI, CoreDNS, Hubble |
Scan pipeline
End-to-end data flow from file upload to scan results.
Registry scan flow
How container images are pulled from registries and scanned.
Public images need no credentials. For private registries, credentials are stored per organization in PostgreSQL, encrypted with AES-256-GCM, and decrypted at dispatch time.
Network flow
How traffic flows from the internet to pods and back out.
Inbound: TLS for all external HTTPS traffic terminates at the Caddy edge proxy, which forwards plain HTTP to the ingress-nginx NodePort. Ingress rules then route each request to the appropriate ClusterIP service.
Outbound: Scan pods use the HTTP_PROXY/HTTPS_PROXY environment variables, which point to a Squid proxy, for external API calls (OSV, NVD, EPSS, container registries). Routing all egress through a single proxy makes it possible to enforce network policy and to cache responses.
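As a rough illustration of the outbound path, the sketch below shows how a client could derive its proxy settings from HTTP_PROXY/HTTPS_PROXY. It is written in Python for brevity rather than in the scanner's Rust. The helper names `proxy_map` and `build_opener` and the address `squid.internal:3128` are hypothetical and not taken from the ScanRook codebase.

```python
import urllib.request


def proxy_map(env):
    """Map URL scheme -> proxy URL from HTTP_PROXY / HTTPS_PROXY in `env`."""
    proxies = {}
    for scheme in ("http", "https"):
        value = env.get(f"{scheme.upper()}_PROXY")
        if value:  # unset or empty means "connect directly"
            proxies[scheme] = value
    return proxies


def build_opener(env):
    """Build a urllib opener that sends requests through the configured proxy."""
    handler = urllib.request.ProxyHandler(proxy_map(env))
    return urllib.request.build_opener(handler)


# Hypothetical pod environment; the proxy address is an assumption.
pod_env = {
    "HTTP_PROXY": "http://squid.internal:3128",
    "HTTPS_PROXY": "http://squid.internal:3128",
}
opener = build_opener(pod_env)
# Requests made with opener.open(url) would now be sent via the proxy.
```

Passing the environment in as a plain mapping keeps the lookup easy to test. A real client would more likely rely on its HTTP library's built-in handling of these variables.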
Data stores
PostgreSQL tables, S3 buckets, and Redis usage.
PostgreSQL tables
| Table | Purpose | Key columns |
|---|---|---|
| scan_jobs | Job queue and status tracking | id (UUID), status, bucket, object_key, summary_json, scan_type |
| scan_events | Progress timeline per job | job_id, stage, detail, pct, created_at |
| scan_findings | Normalized vulnerability findings | job_id, cve_id, severity, package, version, cvss |
| scan_files | File inventory from scanned artifacts | job_id, path, entry_type, size, sha256 |
| scan_packages | Package inventory (SBOM) | job_id, ecosystem, name, version, source |
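To make the roles of scan_jobs and scan_events more concrete, here is a minimal sketch of a queue-and-claim cycle. It uses Python's built-in sqlite3 as a stand-in for PostgreSQL. The column types, the status values (`queued`, `running`), the `claimed` stage name, and the claim logic itself are assumptions for illustration; only the table and column names come from the table above.

```python
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE scan_jobs (
    id TEXT PRIMARY KEY,        -- UUID in PostgreSQL
    status TEXT NOT NULL,       -- assumed values: queued, running
    bucket TEXT NOT NULL,
    object_key TEXT NOT NULL,
    scan_type TEXT NOT NULL,
    summary_json TEXT
);
CREATE TABLE scan_events (
    job_id TEXT NOT NULL REFERENCES scan_jobs(id),
    stage TEXT NOT NULL,
    detail TEXT,
    pct INTEGER,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def enqueue(conn, bucket, object_key, scan_type):
    """Insert a new job in the assumed 'queued' state and return its id."""
    job_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO scan_jobs (id, status, bucket, object_key, scan_type) "
        "VALUES (?, 'queued', ?, ?, ?)",
        (job_id, bucket, object_key, scan_type),
    )
    return job_id


def claim_next(conn):
    """Mark the oldest queued job as running and record a progress event."""
    with conn:  # commit on success, roll back on error
        row = conn.execute(
            "SELECT id FROM scan_jobs WHERE status = 'queued' "
            "ORDER BY rowid LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE scan_jobs SET status = 'running' WHERE id = ?", (row[0],)
        )
        conn.execute(
            "INSERT INTO scan_events (job_id, stage, detail, pct) "
            "VALUES (?, 'claimed', 'picked up by dispatcher', 0)",
            (row[0],),
        )
        return row[0]
```

With several dispatchers running against PostgreSQL, the select-then-update step would need row locking (for example `SELECT ... FOR UPDATE SKIP LOCKED`) so that two workers cannot claim the same job. Whether ScanRook does this is not stated here.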
S3 buckets
| Bucket | Purpose | Lifecycle |
|---|---|---|
| deltaguard | Uploaded artifacts (tars, binaries, ISOs) | 7-day expiry |
| reports | Scan report JSON files | 90-day expiry |
| registry-pulls | Pulled container image tars | Cleaned after scan |
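The expiry rules in the table could be expressed as S3-style lifecycle configurations. The sketch below only builds the configuration dictionaries and does not call any storage API. The function name and rule IDs are hypothetical, and it is an assumption that ScanRook configures expiry through bucket lifecycle rules at all.

```python
# Expiry in days per bucket, taken from the table above.
# registry-pulls has no time-based rule: it is cleaned after each scan.
EXPIRY_DAYS = {"deltaguard": 7, "reports": 90}


def lifecycle_configuration(bucket):
    """Return an S3-style lifecycle configuration, or None if no rule applies."""
    days = EXPIRY_DAYS.get(bucket)
    if days is None:
        return None
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}d",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},       # apply to every object
                "Expiration": {"Days": days},
            }
        ]
    }
```

A dictionary of this shape is what the S3 `PutBucketLifecycleConfiguration` operation expects, so it could be passed to an S3 client pointed at MinIO.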
Redis
Single-instance Redis used for NextAuth session caching and token revocation. Session data is stored with a 7-day TTL. Token version checks enable instant session invalidation across all web replicas.
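The token version check can be sketched as follows. An in-memory class stands in for Redis so the example runs on its own. The class, its method names, and the data layout are hypothetical; only the 7-day TTL and the idea of comparing token versions come from the description above.

```python
import time

SESSION_TTL_SECONDS = 7 * 24 * 60 * 60  # 7-day TTL from the text


class SessionStore:
    """In-memory stand-in for Redis (illustrative, not ScanRook's code)."""

    def __init__(self, now=time.time):
        self._now = now
        self._sessions = {}        # session_id -> (user_id, version, expires_at)
        self._token_versions = {}  # user_id -> current token version

    def create(self, session_id, user_id):
        """Store a session stamped with the user's current token version."""
        version = self._token_versions.setdefault(user_id, 1)
        expires_at = self._now() + SESSION_TTL_SECONDS
        self._sessions[session_id] = (user_id, version, expires_at)

    def revoke_all(self, user_id):
        """Bump the user's token version; every older session becomes invalid."""
        self._token_versions[user_id] = self._token_versions.get(user_id, 1) + 1

    def is_valid(self, session_id):
        """A session is valid if it has not expired and its version is current."""
        entry = self._sessions.get(session_id)
        if entry is None:
            return False
        user_id, version, expires_at = entry
        if self._now() >= expires_at:
            return False
        return version == self._token_versions.get(user_id)
```

Because every web replica would read the same version counter from the shared store, a single increment invalidates a user's sessions everywhere at once, without having to enumerate or delete them.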