Block dangerous SQL before it runs.

Incidents happen when risky SQL slips past review: injection through dynamic SQL, a masking policy that masks nothing, PII reaching an output column. Lexega reads it as a semantic model and enforces your policy before it runs. Same engine whether a developer opens a PR or an AI agent writes the query. One binary. No database connection. No dbt install.

Snowflake·PostgreSQL·BigQuery·MSSQL·Databricks·Redshift·dbt & Jinja·Python & notebooks·macOS · Linux · Windows

The Problem

Valid SQL fails in more ways than review catches. These are the classes Lexega reads for — each hiding in SQL that looks safe and runs clean:

Control — injection

Untrusted input rewrites the statement, traced through procedures, branches, and loops to the execute sink.

DYNSQL-CONCAT · PROC-DYNSQL

Confidentiality — exposure

A masking policy that masks nothing, PII reaching an output column, row or column-security bypass potential.

MASK-ALLOW-ALL · Q-FLOW-TAINT

Integrity — misstatement

A fan-out double-count, a LEFT join collapsed to INNER, a NOT IN emptied by one NULL.

Q-JOIN-FANOUT-CENH · Q-JOIN-LEFT-FILT · Q-NULL-NOTIN

Authorization — privilege

Overbroad grants, privilege escalation, dropped row-access or masking policies, encryption removed.

GRT-ALL-PRIV · GRT-BROAD-PRIV

Secrets — credentials

Credentials embedded in COPY INTO, stage, and storage-credential DDL.

CRED-*

Durability — destructive writes

An UPDATE or DELETE with no filter, an unguarded DROP, an unbounded write.

DML-WRITE-UNBOUNDED · TBL-DROP

What Lexega CatchesLive

These aren't screenshots. Press Run it live and the real engine analyzes the SQL and blocks it. The highlighted spans are the engine's own evidence — the exact bytes each verdict points at, including inside string literals.

Injection, three hops from the sink: the executed variable is clean — the tainted concatenation is three assignments back, behind a decoy

-- The sink runs a bare variable. The concatenation that taints it is
-- three assignments back; a decoy copy is returned, never executed.
CREATE OR REPLACE PROCEDURE etl.add_audit_column(target_col VARCHAR)
RETURNS VARCHAR
LANGUAGE SQL
AS
$
DECLARE
  a VARCHAR;
  b VARCHAR;
  c VARCHAR;
  preview VARCHAR;
BEGIN
  a := 'ALTER TABLE fct_events ADD COLUMN ' || target_col || ' VARCHAR';
  b := a;
  preview := 'ddl: ' || b;
  c := b;
  EXECUTE IMMEDIATE c;
  RETURN preview;
END;
$;
Runs against the live engine. Nothing is stored.
BLOCKEDBlocked by severity_actions.critical (signal severity = critical)
CRITICALDYNSQL-CONCAT

Dynamic SQL argument is built via string concatenation or FORMAT(...) interpolation. SQL injection risk — switch to a parameterized query (USING clause / sp_executesql parameter binding). (2 occurrences)

HIGHDYNSQL

Dynamic SQL execution detected. SQL injection risk if input parameters are not validated. Consider parameterized queries (USING clause / sp_executesql parameters). (2 occurrences)

HIGHPROC-DYNSQL

Stored procedure executes dynamic SQL with potential injection vector. SQL injection risk if inputs are not validated. Consider parameterized queries (e.g. USING clause / sp_executesql parameters).

Same relay, same bare-variable sink. Chasing c ← b ← a finds a constant template this time, with the runtime value bound via USING — so it’s allowed. The verdict tracks the data flow, not the shape.

-- Same three-hop relay, same bare-variable sink. But the traced value
-- is a constant template; the runtime value is bound with USING.
CREATE OR REPLACE PROCEDURE etl.count_tenant_events(tenant_id VARCHAR)
RETURNS VARCHAR
LANGUAGE SQL
AS
$
DECLARE
  a VARCHAR;
  b VARCHAR;
  c VARCHAR;
BEGIN
  a := 'SELECT COUNT(*) FROM fct_events WHERE tenant_id = ?';
  b := a;
  c := b;
  EXECUTE IMMEDIATE c USING (tenant_id);
  RETURN 'ok';
END;
$;
Runs against the live engine. Nothing is stored.
ALLOWEDAllowed — no critical or high signals

No critical or high signals — nothing to block.

Injection across a procedure boundary: input flows through an OUTPUT parameter and QUOTENAME() before execution

-- Untrusted @term crosses into build_clause, is wrapped in QUOTENAME()
-- (not a sanitizer in a string-literal context), returned via an OUTPUT
-- parameter, then concatenated and executed back in the caller.
CREATE PROCEDURE dbo.build_clause @raw NVARCHAR(200), @out NVARCHAR(400) OUTPUT
AS
BEGIN
  SET @out = N'WHERE name = ''' + QUOTENAME(@raw) + N'''';
END;
GO

CREATE PROCEDURE dbo.run_search @term NVARCHAR(200)
AS
BEGIN
  DECLARE @clause NVARCHAR(400);
  EXEC dbo.build_clause @term, @clause OUTPUT;
  DECLARE @sql NVARCHAR(800) = N'SELECT * FROM users ' + @clause;
  EXEC sp_executesql @sql;
END;
GO
Runs against the live engine. Nothing is stored.
BLOCKEDBlocked by severity_actions.critical (signal severity = critical)
CRITICALDYNSQL-CONCAT

Dynamic SQL argument is built via string concatenation or FORMAT(...) interpolation. SQL injection risk — switch to a parameterized query (USING clause / sp_executesql parameter binding). (2 occurrences)

HIGHDYNSQL

Dynamic SQL execution detected. SQL injection risk if input parameters are not validated. Consider parameterized queries (USING clause / sp_executesql parameters). (2 occurrences)

HIGHPROC-DYNSQL

Stored procedure executes dynamic SQL with potential injection vector. SQL injection risk if inputs are not validated. Consider parameterized queries (e.g. USING clause / sp_executesql parameters).

Same structure, parameterized. The user value is bound as a parameter instead of concatenated, so the same shape is allowed.

-- Identical intent, but @term is bound via sp_executesql parameters
-- instead of being concatenated into the statement text.
CREATE PROCEDURE dbo.run_search @term NVARCHAR(200)
AS
BEGIN
  DECLARE @sql NVARCHAR(800) = N'SELECT * FROM users WHERE name = @p';
  EXEC sp_executesql @sql, N'@p NVARCHAR(200)', @p = @term;
END;
GO
Runs against the live engine. Nothing is stored.
ALLOWEDAllowed — no critical or high signals

No critical or high signals — nothing to block.

The dangerous statement is inside the string literal — and its WHERE clause is a lie. Lexega parses the embedded SQL and proves the filter is always true

-- The DELETE inside the literal has a WHERE clause — one that is
-- provably always true. The engine parses the embedded SQL and proves it.
CREATE PROCEDURE dbo.archive_stale
AS
BEGIN
  EXEC sp_executesql N'DELETE FROM dbo.audit_log
    WHERE batch_id = 42 OR retained = 0 OR batch_id <> 42 OR batch_id IS NULL';
END;
Runs against the live engine. Nothing is stored.
BLOCKEDBlocked by severity_actions.critical (signal severity = critical)
CRITICALDML-WRITE-UNBOUNDEDfound in dynamic SQL body

Unbounded write operation detected - the statement has no WHERE clause or its WHERE clause is provably always true. This affects ALL rows in the target table(s).

HIGHDYNSQL

Dynamic SQL execution detected. SQL injection risk if input parameters are not validated. Consider parameterized queries (USING clause / sp_executesql parameters). (2 occurrences)

HIGHPROC-DYNSQL

Stored procedure executes dynamic SQL with potential injection vector. SQL injection risk if inputs are not validated. Consider parameterized queries (e.g. USING clause / sp_executesql parameters).

MEDIUMQ-PRED-TAUTOLOGYfound in dynamic SQL body

Tautological predicate: condition is always true (e.g., WHERE x=1 OR x<>1). The WHERE clause provides no filtering.

Same shape, really bounded. The DELETE inside the literal has a WHERE clause that actually filters, with the cutoff bound as a parameter — the engine reads through the string either way and can tell a real bound from a fake one.

-- Same shape. The DELETE inside the literal is bounded for real, and
-- the cutoff arrives as a bound parameter.
CREATE PROCEDURE dbo.archive_stale @cutoff DATETIME2
AS
BEGIN
  EXEC sp_executesql
    N'DELETE FROM dbo.audit_log WHERE logged_at < @c',
    N'@c DATETIME2', @c = @cutoff;
END;
Runs against the live engine. Nothing is stored.
ALLOWEDAllowed — no critical or high signals

No critical or high signals — nothing to block.

Misstatement: a revenue rollup that runs without error and returns the wrong number — three different ways, including a billable-only filter that is provably always true

-- A revenue rollup that runs without error and returns the wrong number.
WITH active AS (
  SELECT account_id, region FROM accounts WHERE status = 'active'
),
billable AS (
  SELECT a.account_id, SUM(i.amount) AS revenue
  FROM active a
  LEFT JOIN invoices i ON i.account_id = a.account_id
  WHERE i.status = 'posted'
    AND (i.is_billable = TRUE OR i.is_billable = FALSE OR i.is_billable IS NULL)
  GROUP BY a.account_id
)
SELECT * FROM billable
WHERE account_id NOT IN (SELECT account_id FROM suspended_accounts);
Runs against the live engine. Nothing is stored.
BLOCKEDBlocked by severity_actions.critical (signal severity = critical)
CRITICALQ-JOIN-LEFT-FILT

LEFT JOIN nullable side filtered in WHERE clause. This effectively converts the LEFT JOIN to an INNER JOIN, likely a bug.

HIGHQ-NULL-NOTIN

NULL-logic hazard: NOT IN with subquery on column ''. If subquery returns any NULL, entire predicate evaluates to UNKNOWN and filters all rows. Use NOT EXISTS or ensure subquery has WHERE ... IS NOT NULL.

MEDIUMQ-PRED-TAUTOLOGY

Tautological predicate: condition is always true (e.g., WHERE x=1 OR x<>1). The WHERE clause provides no filtering.

Same rollup, null-safe, really filtered. The join filters move into the ON clause so unmatched accounts survive, the billable filter actually filters, and NOT EXISTS is immune to a NULL in suspended_accounts.

-- The join filters live in the ON clause, so unmatched accounts survive;
-- NOT EXISTS is immune to a NULL in suspended_accounts.
WITH active AS (
  SELECT account_id, region FROM accounts WHERE status = 'active'
),
billable AS (
  SELECT a.account_id, SUM(i.amount) AS revenue
  FROM active a
  LEFT JOIN invoices i
    ON i.account_id = a.account_id
    AND i.status = 'posted'
    AND i.is_billable = TRUE
  GROUP BY a.account_id
)
SELECT * FROM billable b
WHERE NOT EXISTS (
  SELECT 1 FROM suspended_accounts s WHERE s.account_id = b.account_id
);
Runs against the live engine. Nothing is stored.
ALLOWEDAllowed — no critical or high signals

No critical or high signals — nothing to block.

Data exposure: a masking policy that masks nothing

-- Looks conditional. Every branch returns the input untouched.
CREATE MASKING POLICY mask_ssn AS (val STRING) RETURNS STRING ->
  CASE WHEN CURRENT_ROLE() = 'ADMIN' THEN val ELSE val END;
Runs against the live engine. Nothing is stored.
BLOCKEDBlocked by severity_actions.critical (signal severity = critical)
CRITICALMASK-ALLOW-ALL

Masking Policy body passes through the original value without masking. Policy is effectively a no-op and sensitive data may be exposed.

Same policy, real masking. The non-admin branch actually masks, so the same policy shape is allowed.

-- Same policy. The non-admin branch actually masks.
CREATE MASKING POLICY mask_ssn AS (val STRING) RETURNS STRING ->
  CASE WHEN CURRENT_ROLE() = 'ADMIN' THEN val ELSE '***-**-****' END;
Runs against the live engine. Nothing is stored.
ALLOWEDAllowed — no critical or high signals

No critical or high signals — nothing to block.

Every signal is surfaced; what to do with each one lives in one place. Your policy sets block, warn, or allow per rule and per path, so exceptions are reviewed and version-controlled, not scattered as inline suppressions added to quiet a warning. The default policy starts conservative and you tune from there, producing a decision artifact (and exit code) you can gate on before it reaches production.

Start Small, Go Deeper

Every signal above comes from a single pass over your SQL. Give Lexega more context and it sees deeper, but each tier is optional. A single statement works on its own; add a diff, a dbt project, or a warehouse catalog only when you want the deeper signals.

One statement

Parse + rule analysis on a single query: injection taint inside a procedure, a no-op masking policy, GRANT ALL, an unbounded write. Just the SQL, with no database connection, no second version to compare, no dbt project.

MASK-ALLOW-ALL
A diff

Two versions compared by what they mean, not how they read: a WHERE that vanished, a LEFT join narrowed to INNER, a masking policy dropped, a column removed.

DIFF-JOIN-NARROW
A procedure

Taint followed through control flow: IF/CASE branches, loops, local-variable relays, even across OUTPUT parameters between procedures. The injection is the chain.

DYNSQL-CONCAT
Your dbt project

Jinja and ref()s rendered without Python, and the classification tags you already declared on your sources flowed through your SQL. No warehouse connection, no extra tagging.

Q-FLOW-TAINT
A warehouse catalog

Keys, row counts, nullability, masking and row-access policies, and the grant graph, turning "this JOIN looks risky" into "10M × 500K fanout" and a role grant into its blast radius.

Q-JOIN-FANOUT-CENH

How It Works

Lexega is policy-as-code for the SQL layer. It renders your SQL — Jinja and dbt included — reads it as a semantic model and turns what it finds into a decision you can enforce: on a single statement, a stored procedure, a dbt project, or a PR diff, in CI (shift-left) and at agent runtime.

  1. 1
    Render: Jinja and dbt macros are evaluated natively — no Python, dbt, or warehouse connection. {{ ref('orders') }} becomes the real table, so a model is analyzable before it's ever built.
  2. 2
    Parse: The rendered SQL is read into a semantic model across Snowflake, PostgreSQL, BigQuery, MSSQL, Databricks, and Redshift.
  3. 3
    Analyze: Signals are emitted from that model: injection taint followed through branches, loops, and across procedure boundaries; JOIN, NULL, and filter hazards; column lineage and PII-tag flow; no-op masking policies; privilege and governance changes. Across a dbt project it reaches cross-model — upstream schemas flow into downstream checks.
  4. 4
    Compare: On a diff (diff / review), the semantic models of two branches are compared; a removed WHERE, a LEFTINNER narrowing, a dropped masking policy.
  5. 5
    Enforce: Each signal gets a severity; your policy decides block, warn, or allow — emitting a decision record and exit code (text, JSON, YAML, or SARIF) for CI to gate on or an agent to act on at runtime.

Where It Lives

Lexega fits best in two places: on a pull request in CI, and inline at agent runtime. Both run through the same engine. What changes is only what happens next: in CI it gates the merge, at runtime it gates the query before it executes.

🔄

CI/CD Pipelines

Catch the risky change in review, not in production. Lexega reviews the SQL in a pull request and fails the check when a change trips your policy.

lexega-sql review main..HEAD -r --pr-comment
  • Semantic diff between commits
  • PR comments with signals
  • Exit code 2 = blocked
🤖

Agent Runtime

An agent steered by prompt injection or a poisoned tool result can emit SQL that exfiltrates or destroys data. Lexega evaluates every agent-generated query against your policy before execution, so a manipulated agent still can't run what your policy forbids.

lexega-sql analyze --stdin -q \
  --policy policy.yaml --mode runtime \
  --decision-out -
  • Pipe SQL, get JSON decision on stdout
  • Low latency (<20ms typical)
  • Exit code 2 = blocked

Drop it into a pipeline, no dependencies:

# .github/workflows/sql-review.yml
on: [pull_request]
permissions:
  pull-requests: write
steps:
  - uses: actions/checkout@v4
    with: { fetch-depth: 0 }
  - run: lexega-sql review $BASE..$HEAD . -r --pr-comment
    env:
      LEXEGA_LICENSE_KEY: ${{ secrets.LEXEGA_LICENSE_KEY }}
      GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

# Exit code 2 = blocked, 0 = passed

Runs in Your Environment

Lexega is a single binary you run inside your own CI or runtime. By default it talks to nothing.

Your data never moves

Lexega reads SQL text. It never connects to your warehouse to run queries, and never sees table data or query results.

Offline by default

No telemetry. License validation is offline and cryptographic. The optional catalog pull and PR comments are opt-in and only reach systems you configure.

Deterministic & stateless

Same input, same decision, every time. Analysis runs in memory and is released: no caching, no temp files, no persistent state.

Auditable by design

Every decision is a record hashed (SHA256) over the SQL, policy, and result: immutable evidence of what was evaluated and why.

Core analysis operates on SQL, not data, which keeps most GDPR/HIPAA/PCI scope off the table. See Security & Privacy for the full data-flow boundaries.

Who It's For

Built for security teams extending injection, data-exposure, and integrity coverage to the SQL layer, and enforcing it in CI and at agent runtime, before the query runs.

Also works for:

  • Platform teams adding policy gates to CI/CD pipelines
  • Engineering teams needing guardrails on LLM-generated SQL
  • Data leads scaling beyond "everyone reviews everything"
  • Compliance and risk teams requiring decision records with cryptographic proof

Get Started

Early Access
Free Trial

30 Days · No Credit Card

Full functionality. Run it on your repo, see what it catches. Self-serve install.

Install (macOS / Linux):

curl -sSL https://lexega.com/install.sh | sh
  • All detection rules + semantic diff
  • Policy engine + custom rules
  • CI integration + PR comments

Your key will appear instantly. No spam, no account required. Privacy Policy

Manual Download (GitHub)

Formatting is free forever. Trial key unlocks risk analysis.

Design Partner Pilot

$12,000 · 8 Weeks · Hands-On

Go production-ready with dedicated onboarding. Pilot fee credited 100% toward your annual license.

  • Everything in trial, plus:
  • Dedicated onboarding & policy tuning
  • Custom rules for your org
  • Weekly check-ins with our team

Success criteria we define together: CI integration on 2+ repos, policy blocking in prod, team reviewing signals.

Start Pilot Conversation →

Questions? [email protected] · We respond within 24 hours.