Batch Mode & Per-File Detail
Lexega operates in two modes depending on the input target: single-file mode and batch mode. Understanding the differences helps you choose the right invocation for your workflow.
Single-File Mode
When you point analyze at a single file (or pipe via --stdin), Lexega produces a complete report for that file alone.
lexega-sql analyze file.sql
lexega-sql analyze --dialect bigquery query.bq.sql
cat file.sql | lexega-sql analyze --stdin
What you get:
- Full risk report for one file
- All matching signals displayed inline
- Evidence with line numbers and statement previews
- Policy decision (allow/warn/block) if --policy is set
This is the mode you're most likely to use during development or in targeted CI checks.
Batch Mode
Pass a directory (with -r for recursive scanning) to analyze an entire codebase at once. Lexega scans all .sql files, aggregates results, and prints a summary.
# Scan current directory recursively
lexega-sql analyze . -r
# Scan a specific folder
lexega-sql analyze models/ -r
# With dialect and severity filter
lexega-sql analyze . -r --dialect bigquery --min-severity medium
Default batch output is a compact summary printed to stderr:
Batch Analysis
════════════════════════════════════
Scanning: ./models
Files found: 42
Files analyzed: 38
Files skipped: 4
Statements parsed: 127
Statements analyzed: 112
Unrecognized: 15 (opaque, no analysis)
Files with signals: 11
Total signals: 27
By level: 2 CRITICAL, 5 HIGH, 12 MEDIUM, 8 LOW
Average signals per file: 2.5
Highest risk level: CRITICAL
Top risky files:
models/admin/grants.sql - 4 signal(s), max CRITICAL
models/staging/raw_load.sql - 3 signal(s), max HIGH
...
Top Matched Rules:
MASK-DROP (Masking Policy dropped) - 3 occurrence(s)
GRT-TO-PUBLIC (Grant to PUBLIC role) - 5 occurrence(s)
...
This summary tells you the overall health of the codebase without overwhelming detail.
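Because the summary is written to stderr, a plain `>` redirect will not capture it. The sketch below shows one way to save the summary to a file while leaving stdout untouched. The helper name `capture_summary` is hypothetical and not part of Lexega; it only wraps standard shell redirection.

```shell
# capture_summary OUTFILE CMD [ARGS...]
# Hypothetical helper (not part of lexega-sql): runs CMD, writes its
# stderr (where the batch summary is printed) to OUTFILE, and leaves
# stdout untouched. The return status is CMD's own exit status.
capture_summary() {
  summary_file=$1
  shift
  "$@" 2> "$summary_file"
}

# Example invocation, assuming lexega-sql is on PATH:
#   capture_summary summary.txt lexega-sql analyze models/ -r
```

The same effect is available inline by appending `2> summary.txt` to any batch command.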
The --detail Flag
When the batch summary isn't enough, add --detail to see per-file signal breakdowns after the summary:
lexega-sql analyze . -r --detail
lexega-sql analyze . -r --detail --min-severity high
Text Output (default)
With --detail, the summary is followed by a per-file section:
Per-File Signal Details
══════════════════════════════════════════════════════════════
── models/admin/grants.sql (2 signals)
[CRITICAL] Masking Policy dropped. Column data protection removed.
↳ Line 12 • `governance:masking_policy:dropped` • "DROP MASKING POLICY ema..."
[HIGH] Network Policy dropped. Access control removed.
↳ Line 18 • `security:network_policy:dropped`
── models/staging/raw_load.sql (3 signals)
[MEDIUM] SELECT * usage detected
↳ Line 4 • `query:select_statement:select_star`
[LOW] Implicit column ordering
↳ Line 4 • `query:select_statement:implicit_order`
[MEDIUM] COPY INTO without ON_ERROR handling
↳ Line 22 • `data_access:copy_into_statement:no_on_error`
Each file shows its signals with severity level, message, and evidence (line numbers, signal values, statement previews). Files with no signals are omitted.
Markdown Output
When combined with --format markdown (for PR comments), --detail adds a Per-File Signal Details section with tables:
lexega-sql analyze . -r --detail --format markdown
This produces per-file tables with level icons:
| Level | Signal | Evidence |
|---|---|---|
| 🔴 Critical | Masking Policy dropped. Column data protection removed. | Line 12: governance:masking_policy:dropped |
| 🟠 High | Network Policy dropped. Access control removed. | Line 18: security:network_policy:dropped |
Level icons: 🔴 Critical, 🟠 High, 🟡 Medium, 🟢 Low, ℹ️ Info.
JSON / YAML Output
JSON and YAML formats always include full per-file reports regardless of --detail. The flag only affects text and markdown output.
# Full per-file JSON (--detail not needed)
lexega-sql analyze . -r --format json -q > report.json
# YAML
lexega-sql analyze . -r --format yaml -q > report.yaml
SARIF Output
SARIF is inherently per-file — each result carries its physical location (file path, line number, region). The --detail flag has no effect on SARIF output since the format already includes full detail by design.
# Single file → stdout
lexega-sql analyze file.sql --format sarif
# Single file → write to directory
lexega-sql analyze file.sql --format sarif --report-out .lexega/
# Batch → produces batch_summary.sarif in the output directory
lexega-sql analyze . -r --format sarif --report-out .lexega/
In batch mode, all signals across all files are combined into a single SARIF document. Each result entry includes the originating file path via locations[].physicalLocation, so tools like GitHub Code Scanning, VS Code SARIF Viewer, and other SARIF consumers can map findings back to exact source locations.
Key Differences: Single-File vs Batch
| Aspect | Single-File | Batch |
|---|---|---|
| Input | One file or stdin | Directory with -r |
| Default output | Full report with all signals | Aggregate summary only |
| Per-file signals | Always shown | Only with --detail |
| JSON/YAML | Full report | Full per-file reports (always) |
| --min-severity | Filters signals shown | Filters signals per-file and in summary |
| Policy enforcement | Single allow/warn/block decision | Per-file decisions, exit code 2 if any blocked |
| Embedded SQL | N/A | Use --scan-embedded for .py/.ipynb files |
Common Workflows
Quick codebase health check
lexega-sql analyze . -r
Just the summary — how many files, signals, and what's the highest risk.
Pre-merge review with full detail
lexega-sql analyze models/ -r --detail --min-severity medium
See every medium-and-above signal, file by file.
CI pipeline with JSON artifact
lexega-sql analyze . -r --format json -q > report.json
Full structured data for dashboards or downstream processing.
PR comment with detail tables
lexega-sql analyze . -r --detail --format markdown --min-severity high
Markdown tables with per-file signal breakdowns, ready for GitHub/GitLab PR comments.
Policy enforcement across a repo
lexega-sql analyze . -r --policy policy.yaml --env prod
Each file gets its own allow/warn/block decision. Exit code 2 if any file is blocked.
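In a CI script, the exit code can drive the pass/fail step. The following is a minimal sketch, not an official integration. The wrapper name `policy_gate` is hypothetical, and only exit code 2 (a blocked file) is documented above. Treating 0 as "passed" and every other code as unexpected are assumptions.

```shell
# policy_gate CMD [ARGS...]
# Hypothetical wrapper: runs CMD, prints a one-line verdict based on its
# exit code, and returns that same code to the caller.
policy_gate() {
  gate_status=0
  "$@" || gate_status=$?
  case "$gate_status" in
    0) echo "policy gate: passed" ;;                     # assumption: 0 means nothing was blocked
    2) echo "policy gate: at least one file blocked" ;;  # documented: exit code 2
    *) echo "policy gate: unexpected exit code $gate_status" ;;  # assumption
  esac
  return "$gate_status"
}

# Example invocation, assuming lexega-sql is on PATH:
#   policy_gate lexega-sql analyze . -r --policy policy.yaml --env prod
```

Returning the original code lets the CI runner fail the job on a block while the printed line explains why.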