Batch Mode & Per-File Detail

Lexega operates in two modes depending on the input target: single-file mode and batch mode. Understanding the differences helps you choose the right invocation for your workflow.

Single-File Mode

When you point analyze at a single file (or pipe SQL in via --stdin), Lexega produces a complete report for that input alone.

lexega-sql analyze file.sql
lexega-sql analyze --dialect bigquery query.bq.sql
cat file.sql | lexega-sql analyze --stdin

What you get:

  • Full risk report for one file
  • All matching signals displayed inline
  • Evidence with line numbers and statement previews
  • Policy decision (allow/warn/block) if --policy is set

This is the mode you're most likely to use during development or in targeted CI checks.
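A targeted check usually means running single-file mode only on the SQL files that changed. The sketch below is one way to do that, not a built-in Lexega feature: the helper name sql_only is hypothetical, and the usage comment assumes a Git checkout with staged changes.

```shell
#!/bin/sh
# sql_only: read file paths on stdin, print only those ending in .sql.
# Hypothetical helper for illustration; not part of Lexega.
sql_only() {
  # "|| true" keeps the pipeline from failing when nothing matches
  grep -E '\.sql$' || true
}

# Usage (assumes a Git repository with staged changes):
#   git diff --cached --name-only | sql_only | while IFS= read -r f; do
#     lexega-sql analyze "$f"
#   done
```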

Batch Mode

Pass a directory (with -r for recursive scanning) to analyze an entire codebase at once. Lexega scans all .sql files, aggregates results, and prints a summary.

# Scan current directory recursively
lexega-sql analyze . -r

# Scan a specific folder
lexega-sql analyze models/ -r

# With dialect and severity filter
lexega-sql analyze . -r --dialect bigquery --min-severity medium

The default batch output is a compact summary printed to stderr:

Batch Analysis
════════════════════════════════════
Scanning: ./models
Files found: 42
Files analyzed: 38
Files skipped: 4

Statements parsed: 127
Statements analyzed: 112
Unrecognized: 15 (opaque, no analysis)
Files with signals: 11
Total signals: 27
  By level: 2 CRITICAL, 5 HIGH, 12 MEDIUM, 8 LOW
Average signals per file: 2.5
Highest risk level: CRITICAL

Top risky files:
  models/admin/grants.sql - 4 signal(s), max CRITICAL
  models/staging/raw_load.sql - 3 signal(s), max HIGH
  ...

Top Matched Rules:
  MASK-DROP (Masking Policy dropped) - 3 occurrence(s)
  GRT-TO-PUBLIC (Grant to PUBLIC role) - 5 occurrence(s)
  ...

This summary tells you the overall health of the codebase without overwhelming detail.
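Because the summary is plain text on stderr, a script can capture it and pull out individual figures. This is a minimal sketch that assumes the summary lines keep the "Label: value" shape shown above. The helper name summary_field is hypothetical and not part of Lexega.

```shell
#!/bin/sh
# summary_field LABEL: read a batch summary on stdin and print the value
# that follows "LABEL: ". Hypothetical helper; not part of Lexega.
summary_field() {
  # Split each line on ": " and match the label exactly
  awk -F': ' -v label="$1" '$1 == label { print $2; exit }'
}

# Usage (the summary is on stderr, so route stderr into the pipe):
#   lexega-sql analyze . -r 2>&1 >/dev/null | summary_field "Total signals"
```

Matching on the exact label keeps the helper from confusing similar lines such as "Files found" and "Files analyzed".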

The --detail Flag

When the batch summary isn't enough, add --detail to see per-file signal breakdowns after the summary:

lexega-sql analyze . -r --detail
lexega-sql analyze . -r --detail --min-severity high

Text Output (default)

With --detail, the summary is followed by a per-file section:

Per-File Signal Details
══════════════════════════════════════════════════════════════

── models/admin/grants.sql (2 signals)
  [CRITICAL] Masking Policy dropped. Column data protection removed.
    ↳ Line 12 • `governance:masking_policy:dropped` • "DROP MASKING POLICY ema..."
  [HIGH] Network Policy dropped. Access control removed.
    ↳ Line 18 • `security:network_policy:dropped`

── models/staging/raw_load.sql (3 signals)
  [MEDIUM] SELECT * usage detected
    ↳ Line 4 • `query:select_statement:select_star`
  [LOW] Implicit column ordering
    ↳ Line 4 • `query:select_statement:implicit_order`
  [MEDIUM] COPY INTO without ON_ERROR handling
    ↳ Line 22 • `data_access:copy_into_statement:no_on_error`

Each file shows its signals with severity level, message, and evidence (line numbers, signal values, statement previews). Files with no signals are omitted.

Markdown Output

When combined with --format markdown (for PR comments), --detail adds a Per-File Signal Details section with tables:

lexega-sql analyze . -r --detail --format markdown

This produces per-file tables with level icons:

| Level | Signal | Evidence |
|---|---|---|
| 🔴 Critical | Masking Policy dropped. Column data protection removed. | Line 12: governance:masking_policy:dropped |
| 🟠 High | Network Policy dropped. Access control removed. | Line 18: security:network_policy:dropped |

Level icons: 🔴 Critical, 🟠 High, 🟡 Medium, 🟢 Low, ℹ️ Info.

JSON / YAML Output

JSON and YAML formats always include full per-file reports regardless of --detail. The flag only affects text and markdown output.

# Full per-file JSON (--detail not needed)
lexega-sql analyze . -r --format json -q > report.json

# YAML
lexega-sql analyze . -r --format yaml -q > report.yaml

SARIF Output

SARIF is inherently per-file — each result carries its physical location (file path, line number, region). The --detail flag has no effect on SARIF output since the format already includes full detail by design.

# Single file → stdout
lexega-sql analyze file.sql --format sarif

# Single file → write to directory
lexega-sql analyze file.sql --format sarif --report-out .lexega/

# Batch → produces batch_summary.sarif in the output directory
lexega-sql analyze . -r --format sarif --report-out .lexega/

In batch mode, all signals across all files are combined into a single SARIF document. Each result entry includes the originating file path via locations[].physicalLocation, so tools like GitHub Code Scanning, VS Code SARIF Viewer, and other SARIF consumers can map findings back to exact source locations.
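Since every result carries its file path, the combined document can be regrouped by file after the fact. The sketch below uses jq, which is an assumption (any JSON tool would do), and relies only on the standard SARIF 2.1.0 layout (runs[].results[].locations[].physicalLocation.artifactLocation.uri). The helper name sarif_counts_by_file is hypothetical.

```shell
#!/bin/sh
# sarif_counts_by_file FILE: print "<count><TAB><uri>" for each source file
# referenced by results in a SARIF document. Requires jq.
# Hypothetical helper; not part of Lexega.
sarif_counts_by_file() {
  jq -r '
    [ .runs[].results[]?
      | .locations[0].physicalLocation.artifactLocation.uri ]
    | group_by(.)                      # one group per distinct file path
    | map("\(length)\t\(.[0])")        # group size, then the path
    | .[]
  ' "$1"
}

# Usage, with the batch output location shown above:
#   sarif_counts_by_file .lexega/batch_summary.sarif
```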

Key Differences: Single-File vs Batch

| Aspect | Single-File | Batch |
|---|---|---|
| Input | One file or stdin | Directory with -r |
| Default output | Full report with all signals | Aggregate summary only |
| Per-file signals | Always shown | Only with --detail |
| JSON/YAML | Full report | Full per-file reports (always) |
| --min-severity | Filters signals shown | Filters signals per-file and in summary |
| Policy enforcement | Single allow/warn/block decision | Per-file decisions, exit code 2 if any blocked |
| Embedded SQL | N/A | Use --scan-embedded for .py/.ipynb files |

Common Workflows

Quick codebase health check

lexega-sql analyze . -r

Prints just the summary: file counts, signal counts, and the highest risk level.

Pre-merge review with full detail

lexega-sql analyze models/ -r --detail --min-severity medium

See every medium-and-above signal, file by file.

CI pipeline with JSON artifact

lexega-sql analyze . -r --format json -q > report.json

Full structured data for dashboards or downstream processing.

PR comment with detail tables

lexega-sql analyze . -r --detail --format markdown --min-severity high

Markdown tables with per-file signal breakdowns, ready for GitHub/GitLab PR comments.

Policy enforcement across a repo

lexega-sql analyze . -r --policy policy.yaml --env prod

Each file gets its own allow/warn/block decision. Exit code 2 if any file is blocked.
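In CI, the exit code is what gates the merge. The wrapper sketched below assumes only what is stated above, namely that exit code 2 means at least one file was blocked. It also assumes that 0 means nothing was blocked. Other non-zero codes are not documented here, so it reports them as a generic failure. The helper name explain_exit is hypothetical.

```shell
#!/bin/sh
# explain_exit CODE: turn a lexega-sql exit code into a one-line message.
# Hypothetical helper; only code 2 (blocked) is documented above.
explain_exit() {
  case "$1" in
    0) echo "ok: no file blocked" ;;                        # assumed meaning
    2) echo "blocked: at least one file violated policy" ;; # documented
    *) echo "error: lexega-sql exited with code $1" ;;      # undocumented codes
  esac
}

# Usage in a CI step:
#   lexega-sql analyze . -r --policy policy.yaml --env prod
#   code=$?
#   explain_exit "$code"
#   exit "$code"
```

Re-exiting with the original code keeps the pipeline's pass/fail status tied to Lexega's own decision.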
