Deep Scanning

Name: ScanRook
Author: ScanRook

Deep scanning is ScanRook's extended analysis mode that goes beyond package inventory matching. While the default light mode identifies known CVEs in installed packages, deep mode applies YARA pattern matching against extracted filesystem contents to detect embedded threats that package-level scanning cannot catch: crypto miners, web shells, reverse shells, leaked secrets, and anti-debugging techniques.

What is YARA

A pattern-matching engine built for malware research and threat hunting.

YARA is a pattern-matching tool created by Victor Alvarez at VirusTotal. It allows researchers to write rules that describe malware families, suspicious patterns, or any byte-level signatures of interest. Each rule consists of a set of strings (text, hex, or regex) and a boolean condition that determines when the rule fires.
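
To make the strings-plus-condition structure concrete, here is a minimal Python sketch that mimics how such a rule could be evaluated against a byte buffer. It is not YARA's implementation or API: the Rule class, its field names, and the simplified "any"/"all" condition are hypothetical and exist only for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Hypothetical stand-in for a YARA rule: named patterns plus a condition."""
    name: str
    text: dict[str, bytes] = field(default_factory=dict)   # literal byte strings
    hex: dict[str, str] = field(default_factory=dict)      # hex such as "7f 45 4c 46"
    regex: dict[str, bytes] = field(default_factory=dict)  # byte-oriented regexes
    condition: str = "any"                                  # simplified: "any" or "all"

    def matched_ids(self, data: bytes) -> set[str]:
        """Return the identifiers of every pattern found in data."""
        hits = {k for k, v in self.text.items() if v in data}
        hits |= {k for k, v in self.hex.items() if bytes.fromhex(v) in data}
        hits |= {k for k, v in self.regex.items() if re.search(v, data)}
        return hits

    def fires(self, data: bytes) -> bool:
        """Apply the boolean condition to the set of matched patterns."""
        hits = self.matched_ids(data)
        total = len(self.text) + len(self.hex) + len(self.regex)
        if self.condition == "all":
            return total > 0 and len(hits) == total
        return bool(hits)


# Fires only when the buffer starts like an ELF binary AND mentions Stratum.
elf_miner = Rule(
    name="elf_with_stratum",
    text={"$stratum": b"stratum+tcp://"},
    hex={"$elf": "7f 45 4c 46"},
    condition="all",
)
print(elf_miner.fires(b"\x7fELF...stratum+tcp://pool.example:3333"))  # True
```

Real YARA conditions are far richer (counts, offsets, file size, modules), so treat this only as a mental model of "patterns in, boolean out".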

YARA is widely used across the security industry for malware classification, incident response, threat hunting, and artifact triage. Its rule language is simple enough for analysts to write by hand yet expressive enough to describe complex binary patterns.

Learn more at the official YARA documentation.

How ScanRook uses YARA

Applying pattern matching to extracted filesystem contents during deep scans.

When you run ScanRook with --mode deep, the scanner runs its standard package inventory and vulnerability enrichment pipeline, then adds a second pass: YARA rule scanning against every file extracted from the target artifact.

This second pass catches threats that exist outside of package managers. A container image might have clean packages but contain a manually dropped crypto miner binary, a web shell planted in /var/www, or hardcoded AWS credentials in a configuration file. Package-level CVE matching cannot detect these because they are not part of any tracked package. YARA pattern matching fills this gap.

YARA findings appear in the report alongside CVE findings, tagged with the rule name, matched file path, and matched strings. They use the HeuristicUnverified confidence tier to distinguish them from confirmed vulnerability matches.
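
As a rough illustration of how a report consumer might separate heuristic YARA matches from other findings, the Python sketch below groups findings by confidence tier. The field names ("confidence", "rule", "path", "id") and the sample records are assumptions made for this example, not ScanRook's documented report schema. Only the tier name HeuristicUnverified comes from the text above.

```python
from collections import defaultdict

HEURISTIC_TIER = "HeuristicUnverified"  # tier name taken from the documentation


def split_by_confidence(findings: list[dict]) -> dict[str, list[dict]]:
    """Group findings by confidence tier.

    "confidence" is a hypothetical key; adjust it to whatever the
    real report uses.
    """
    groups: dict[str, list[dict]] = defaultdict(list)
    for finding in findings:
        groups[finding.get("confidence", "unknown")].append(finding)
    return dict(groups)


# Made-up records in an assumed shape, for illustration only.
sample = [
    {"id": "CVE-EXAMPLE-1", "confidence": "some-other-tier"},
    {"rule": "detect_stratum_mining", "path": "usr/local/bin/kworker",
     "confidence": HEURISTIC_TIER},
]

groups = split_by_confidence(sample)
for item in groups.get(HEURISTIC_TIER, []):
    print(f"review manually: {item['rule']} in {item['path']}")
```

Keeping heuristic matches in their own bucket makes it easier to route them to manual review instead of treating them like confirmed vulnerabilities.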

Bundled default rules

Out-of-the-box YARA rules that ship with ScanRook.

ScanRook ships with a curated set of default YARA rules that cover common threat categories. These rules are community-sourced and updated with each release.

Cryptocurrency mining indicators: detects Stratum protocol strings, known miner binary signatures, and mining pool configuration patterns
Web shell detection: identifies PHP, JSP, and ASP web shells by matching eval/exec patterns, obfuscation techniques, and known web shell families
Reverse shell patterns: catches common reverse shell payloads in bash, Python, Perl, and compiled binaries
Hardcoded credentials and API keys: flags AWS access keys, private keys, database connection strings, and other secrets embedded in files
Anti-debugging and packing indicators: detects UPX packing, ptrace-based anti-debug checks, and common binary obfuscation markers
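
To give a feel for what a few of these categories look for, the following Python sketch walks an extracted filesystem tree and flags files containing some well-known indicators. The three patterns are illustrative stand-ins chosen for this example and are not ScanRook's bundled rules. The function names, the 1 MiB read limit, and the decision to skip symlinks are likewise assumptions.

```python
import os
import re

# Illustrative patterns only; these are NOT ScanRook's bundled rules.
INDICATORS = {
    "crypto_miner": re.compile(rb"stratum\+(?:tcp|ssl)://"),
    "aws_access_key": re.compile(rb"\bAKIA[0-9A-Z]{16}\b"),
    "bash_reverse_shell": re.compile(rb"bash\s+-i\s+>&\s*/dev/tcp/"),
}


def classify(data: bytes) -> list[str]:
    """Return the names of all indicators whose pattern occurs in data."""
    return [name for name, pattern in INDICATORS.items() if pattern.search(data)]


def scan_tree(root: str, max_bytes: int = 1 << 20) -> list[tuple[str, list[str]]]:
    """Walk root and return (relative path, indicator names) for flagged files."""
    flagged = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks, sockets, device nodes
            try:
                with open(path, "rb") as fh:
                    data = fh.read(max_bytes)  # only the first max_bytes are inspected
            except OSError:
                continue  # unreadable file; a real scanner would log this
            names = classify(data)
            if names:
                flagged.append((os.path.relpath(path, root), names))
    return sorted(flagged)
```

Calling scan_tree("rootfs/") on an unpacked image directory (a hypothetical path) would return the flagged files. Simple patterns like these produce false positives, which is one reason such matches are best treated as leads for review.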

Writing custom YARA rules

Extend deep scanning with your own detection logic.

YARA rules follow a structured format with three main blocks: meta for descriptive metadata, strings for the patterns to match, and condition for the boolean logic that determines a match.

rule detect_stratum_mining {
    meta:
        description = "Detects Stratum mining protocol usage"
        severity = "high"
        author = "ScanRook"
    strings:
        $stratum = "stratum+tcp://" ascii
        $stratum_ssl = "stratum+ssl://" ascii
        $mining_subscribe = "mining.subscribe" ascii
    condition:
        any of them
}

Pass custom rules to ScanRook with the --yara flag. You can point to a single .yar file or a directory containing multiple rule files. Custom rules run alongside the bundled defaults.
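
Before handing a custom rule file to the scanner, it can help to confirm that it has the expected shape. The Python sketch below is a naive, line-based outline of a rule file: it lists rule names and which of the meta, strings, and condition blocks appear. It is not a YARA parser, the outline function is hypothetical, and comments or string literals that happen to start a line with one of these keywords will confuse it.

```python
import re

# Matches "rule <name>", optionally preceded by the "private"/"global" modifiers.
RULE_RE = re.compile(r"^[ \t]*(?:(?:private|global)[ \t]+)*rule[ \t]+(\w+)", re.MULTILINE)
# Matches the three block headers at the start of a line.
SECTION_RE = re.compile(r"^[ \t]*(meta|strings|condition)[ \t]*:", re.MULTILINE)


def outline(rule_text: str) -> dict[str, list[str]]:
    """Return the rule names and block headers found in rule_text, in order."""
    return {
        "names": RULE_RE.findall(rule_text),
        "sections": SECTION_RE.findall(rule_text),
    }


EXAMPLE = '''
rule detect_stratum_mining {
    meta:
        description = "Detects Stratum mining protocol usage"
    strings:
        $stratum = "stratum+tcp://" ascii
    condition:
        any of them
}
'''

summary = outline(EXAMPLE)
print(summary["names"], summary["sections"])
if "condition" not in summary["sections"]:
    print("warning: no condition block found")
```

In YARA itself only the condition block is mandatory, so a file with no condition header is the case most worth flagging.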

Light vs deep mode

Choosing the right scan mode for your use case.

Aspect	Light mode (default)	Deep mode
What it scans	Package inventories (RPM, APK, npm, pip, Go, etc.)	Package inventories + all extracted filesystem contents
Detection method	Version matching against OSV, NVD, distro feeds	Version matching + YARA pattern matching
Speed	Fast (seconds to low minutes)	Slower (depends on artifact size and rule count)
CI/CD suitability	Recommended for pipeline gates	Better suited for periodic audits
Catches embedded threats	No	Yes (crypto miners, web shells, secrets, etc.)
Requirements	None beyond default install	Requires `libyara` installed on the system
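
Because deep mode depends on libyara, a quick pre-flight check can save a failed run. The Python sketch below asks the platform's library lookup whether a shared library named "yara" can be found. This is only an approximation: how ScanRook itself locates libyara is not described here, and find_library can miss libraries installed in non-standard locations.

```python
from ctypes.util import find_library


def libyara_available() -> bool:
    """Return True if the platform's library lookup finds a library named 'yara'."""
    return find_library("yara") is not None


if not libyara_available():
    print("libyara not found on the default search path; deep mode may be unavailable")
```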

CLI examples

Common invocations for deep scanning.

Run a deep scan with bundled default rules:

scanrook scan --file image.tar --mode deep

Deep scan with a custom rules directory:

scanrook scan --file image.tar --mode deep --yara custom-rules/

Deep scan with a single custom rule file, outputting JSON:

scanrook scan --file image.tar --mode deep --yara custom.yar --format json
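
When these invocations are driven from a script, building the argument list in one place avoids quoting mistakes. The Python sketch below assembles the same commands shown above. The helper name build_scan_command is hypothetical, and it uses only the flags that appear in this section (--file, --mode deep, --yara, --format).

```python
import shlex
from typing import Optional


def build_scan_command(artifact: str, deep: bool = True,
                       yara: Optional[str] = None,
                       fmt: Optional[str] = None) -> list[str]:
    """Assemble a scanrook argument list from the flags shown in this section."""
    cmd = ["scanrook", "scan", "--file", artifact]
    if deep:
        cmd += ["--mode", "deep"]   # omit to fall back to the default light mode
    if yara:
        cmd += ["--yara", yara]     # a single .yar file or a rules directory
    if fmt:
        cmd += ["--format", fmt]
    return cmd


cmd = build_scan_command("image.tar", yara="custom.yar", fmt="json")
print(shlex.join(cmd))
# scanrook scan --file image.tar --mode deep --yara custom.yar --format json
```

The resulting list can be passed to subprocess.run(cmd) as is. How ScanRook signals findings through its exit code is not covered here, so check that before adding check=True.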

When to use deep scanning

Scenarios where deep mode provides the most value.

Production image audits: scan images before deployment to production to catch threats that slipped past CI/CD light scans
Incident response: when investigating a compromised container, deep scanning identifies dropped payloads, backdoors, and persistence mechanisms
Base image validation: verify that upstream base images from public registries do not contain embedded malware or unwanted tooling
Compliance requirements: regulatory frameworks that mandate malware scanning beyond CVE matching (e.g., FedRAMP, PCI DSS)
Third-party artifact inspection: when accepting container images, ISOs, or binaries from vendors or open-source projects, deep scanning provides an additional layer of trust verification