Deep Scanning
Deep scanning is ScanRook's extended analysis mode that goes beyond package inventory matching. The default light mode identifies known CVEs in installed packages. Deep mode does the same and also applies YARA pattern matching to the extracted filesystem contents, detecting embedded threats that package-level scanning cannot catch: crypto miners, web shells, reverse shells, leaked secrets, and anti-debugging techniques.
What is YARA
A pattern-matching engine built for malware research and threat hunting.
YARA is a pattern-matching tool created by Victor Alvarez at VirusTotal. It allows researchers to write rules that describe malware families, suspicious patterns, or any byte-level signatures of interest. Each rule consists of a set of strings (text, hex, or regex) and a boolean condition that determines when the rule fires.
YARA is widely used across the security industry for malware classification, incident response, threat hunting, and artifact triage. Its rule language is simple enough for analysts to write by hand yet expressive enough to describe complex binary patterns.
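To make the strings-plus-condition model concrete, here is a minimal Python sketch of how a rule can be thought of. It is not how YARA is implemented: `ToyRule`, `evaluate`, and the example patterns are hypothetical names chosen for illustration, and only literal byte strings are handled (no hex or regex strings).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class ToyRule:
    """A toy stand-in for a YARA rule: named byte patterns plus a condition."""
    name: str
    strings: Dict[str, bytes]               # identifier -> literal pattern
    condition: Callable[[Set[str]], bool]   # decides from the matched identifiers


def evaluate(rule: ToyRule, data: bytes) -> bool:
    """Return True if the rule fires on the given bytes."""
    matched = {ident for ident, pattern in rule.strings.items() if pattern in data}
    return rule.condition(matched)


# Roughly equivalent to the condition "all of them": both strings must be present.
php_eval_b64 = ToyRule(
    name="toy_php_eval_base64",
    strings={"$eval": b"eval(", "$b64": b"base64_decode("},
    condition=lambda matched: matched == {"$eval", "$b64"},
)

print(evaluate(php_eval_b64, b"<?php eval(base64_decode($_POST['x'])); ?>"))  # True
print(evaluate(php_eval_b64, b"<?php echo base64_decode($s); ?>"))            # False
```

The point of the split is that the strings say what to look for, while the condition says which combination of hits counts as a match.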
Learn more at the official YARA documentation.
How ScanRook uses YARA
Applying pattern matching to extracted filesystem contents during deep scans.
When you run ScanRook with --mode deep, the scanner performs its standard package inventory and vulnerability enrichment pipeline, then adds a second pass: YARA rule scanning against every file extracted from the target artifact.
This second pass catches threats that exist outside of package managers. A container image might have clean packages but contain a manually dropped crypto miner binary, a web shell planted in /var/www, or hardcoded AWS credentials in a configuration file. Package-level CVE matching cannot detect these because they are not part of any tracked package. YARA pattern matching fills this gap.
YARA findings appear in the report alongside CVE findings, tagged with the rule name, matched file path, and matched strings. They use the HeuristicUnverified confidence tier to distinguish them from confirmed vulnerability matches.
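As a rough illustration of how the confidence tier can be used downstream, the sketch below separates heuristic findings from the rest. The `Finding` class, its field names, and the `OtherTier` placeholder are hypothetical and do not describe ScanRook's actual report schema; only the `HeuristicUnverified` tier name comes from the text above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

HEURISTIC_TIER = "HeuristicUnverified"  # tier name taken from the documentation
OTHER_TIER = "OtherTier"                # placeholder; real tier names are not given here


@dataclass
class Finding:
    """Hypothetical finding record; field names are assumptions."""
    confidence_tier: str
    identifier: str                       # a CVE ID or a YARA rule name
    file_path: Optional[str] = None       # set for YARA findings
    matched_strings: Tuple[str, ...] = ()


def heuristic_findings(findings: List[Finding]) -> List[Finding]:
    """Keep only findings reported at the heuristic tier."""
    return [f for f in findings if f.confidence_tier == HEURISTIC_TIER]


report = [
    Finding(OTHER_TIER, "CVE-XXXX-YYYY"),
    Finding(HEURISTIC_TIER, "detect_stratum_mining",
            file_path="/tmp/miner", matched_strings=("stratum+tcp://",)),
]

for f in heuristic_findings(report):
    print(f.identifier, f.file_path, f.matched_strings)
```

Keeping pattern matches in their own tier lets a reviewer triage them separately from version-matched CVEs, since a string hit is evidence of something suspicious rather than proof.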
Bundled default rules
Out-of-the-box YARA rules that ship with ScanRook.
ScanRook ships with a curated set of default YARA rules that cover common threat categories. These rules are community-sourced and updated with each release.
- Cryptocurrency mining indicators: detects Stratum protocol strings, known miner binary signatures, and mining pool configuration patterns
- Web shell detection: identifies PHP, JSP, and ASP web shells by matching eval/exec patterns, obfuscation techniques, and known web shell families
- Reverse shell patterns: catches common reverse shell payloads in bash, Python, Perl, and compiled binaries
- Hardcoded credentials and API keys: flags AWS access keys, private keys, database connection strings, and other secrets embedded in files
- Anti-debugging and packing indicators: detects UPX packing, ptrace-based anti-debug checks, and common binary obfuscation markers
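To give a feel for the kind of byte-level pattern the credential category relies on, here is a small Python sketch that flags two common secret shapes. It is not taken from ScanRook's bundled rules: the pattern names and the `find_secrets` helper are hypothetical, and the regexes cover only long-term AWS access key IDs (the `AKIA` prefix followed by 16 uppercase letters or digits) and PEM private key headers.

```python
import re
from typing import Dict, List

# Illustrative patterns only; a production rule set would cover many more formats.
SECRET_PATTERNS: Dict[str, re.Pattern] = {
    # "AKIA" + 16 uppercase alphanumerics, not embedded in a longer token.
    "aws_access_key_id": re.compile(rb"(?<![A-Z0-9])AKIA[0-9A-Z]{16}(?![A-Z0-9])"),
    # PEM header for RSA, EC, OpenSSH, or generic PKCS#8 private keys.
    "pem_private_key": re.compile(rb"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def find_secrets(data: bytes) -> List[str]:
    """Return the names of all patterns that occur in the given bytes."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(data)]


# AKIAIOSFODNN7EXAMPLE is the example key ID used in AWS documentation.
sample = b"aws_access_key_id = AKIAIOSFODNN7EXAMPLE\n"
print(find_secrets(sample))  # ['aws_access_key_id']
```

Patterns like these match on shape alone, so a hit may be a placeholder or test fixture rather than a live credential.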
Writing custom YARA rules
Extend deep scanning with your own detection logic.
YARA rules follow a structured format with three main blocks: meta for descriptive metadata, strings for the patterns to match, and condition for the boolean logic that determines a match.
```
rule detect_stratum_mining {
    meta:
        description = "Detects Stratum mining protocol usage"
        severity = "high"
        author = "ScanRook"
    strings:
        $stratum = "stratum+tcp://" ascii
        $stratum_ssl = "stratum+ssl://" ascii
        $mining_subscribe = "mining.subscribe" ascii
    condition:
        any of them
}
```

Pass custom rules to ScanRook with the --yara flag. You can point to a single .yar file or a directory containing multiple rule files. Custom rules run alongside the bundled defaults.
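The file-or-directory behaviour can be pictured with a short Python sketch that resolves a path into a list of rule files. This is an assumption-laden illustration, not ScanRook's implementation: the `collect_rule_files` helper is hypothetical, and both the accepted suffixes (`.yar`, `.yara`) and the recursive directory walk are guesses that may differ from what the scanner actually does.

```python
import tempfile
from pathlib import Path
from typing import List

RULE_SUFFIXES = {".yar", ".yara"}  # assumption: which extensions count as rule files


def collect_rule_files(target: str) -> List[Path]:
    """Resolve a rule path: a single file is used as-is, a directory is searched."""
    path = Path(target)
    if path.is_file():
        return [path]
    if path.is_dir():
        # Assumption: subdirectories are searched too; sorted for stable ordering.
        return sorted(
            p for p in path.rglob("*") if p.is_file() and p.suffix in RULE_SUFFIXES
        )
    raise FileNotFoundError(f"no rule file or directory at {target}")


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "miners.yar").write_text("rule a { condition: true }")
    (root / "notes.txt").write_text("not a rule file")
    (root / "web").mkdir()
    (root / "web" / "shells.yara").write_text("rule b { condition: true }")
    print([p.name for p in collect_rule_files(tmp)])  # ['miners.yar', 'shells.yara']
```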
Light vs deep mode
Choosing the right scan mode for your use case.
| Aspect | Light mode (default) | Deep mode |
|---|---|---|
| What it scans | Package inventories (RPM, APK, npm, pip, Go, etc.) | Package inventories + all extracted filesystem contents |
| Detection method | Version matching against OSV, NVD, distro feeds | Version matching + YARA pattern matching |
| Speed | Fast (seconds to low minutes) | Slower (depends on artifact size and rule count) |
| CI/CD suitability | Recommended for pipeline gates | Better suited for periodic audits |
| Catches embedded threats | No | Yes (crypto miners, web shells, secrets, etc.) |
| Requirements | None beyond default install | Requires libyara installed on the system |
CLI examples
Common invocations for deep scanning.
Run a deep scan with bundled default rules:

```
scanrook scan --file image.tar --mode deep
```

Deep scan with a custom rules directory:

```
scanrook scan --file image.tar --mode deep --yara custom-rules/
```

Deep scan with a single custom rule file, outputting JSON:

```
scanrook scan --file image.tar --mode deep --yara custom.yar --format json
```

When to use deep scanning
Scenarios where deep mode provides the most value.
- Production image audits: scan images before deployment to production to catch threats that slipped past CI/CD light scans
- Incident response: when investigating a compromised container, deep scanning identifies dropped payloads, backdoors, and persistence mechanisms
- Base image validation: verify that upstream base images from public registries do not contain embedded malware or unwanted tooling
- Compliance requirements: regulatory frameworks that mandate malware scanning beyond CVE matching (e.g., FedRAMP, PCI DSS)
- Third-party artifact inspection: when accepting container images, ISOs, or binaries from vendors or open source projects, deep scanning provides an additional layer of trust verification
Further reading
Related documentation and resources.
- CLI reference — full list of flags and subcommands
- Enrichment — how ScanRook matches packages against vulnerability databases
- What is YARA? — an in-depth blog post on YARA and its role in container security