Known Limitations

Parser Coverage

Lexega parses most Snowflake and PostgreSQL SQL constructs, but some are handled opaquely:

Snowflake

  • Opaque handling: CREATE/ALTER for WAREHOUSE, PIPE, FILE FORMAT, and some others are preserved as OpaqueContent (parsed but not structurally analyzed)
  • Dynamic SQL (EXECUTE IMMEDIATE with concatenated strings) is parsed but not analyzed
  • Stored procedure bodies (JavaScript, Python) are opaque to the analyzer
  • Some proprietary Snowflake functions may not be recognized
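
Because dynamic SQL is parsed but not analyzed, it can help to surface those statements for manual review before relying on the analysis. The following is a minimal sketch of such a pre-scan, not part of Lexega: the function name flag_dynamic_sql and the sample script are hypothetical, and the plain regex will also match EXECUTE IMMEDIATE inside comments or string literals.

```python
import re

# Case-insensitive match for the Snowflake dynamic SQL entry point.
EXECUTE_IMMEDIATE = re.compile(r"\bEXECUTE\s+IMMEDIATE\b", re.IGNORECASE)


def flag_dynamic_sql(sql_text: str) -> list[int]:
    """Return 1-based line numbers where an EXECUTE IMMEDIATE begins.

    This is a text-level scan only; it does not parse SQL, so matches
    inside comments or string literals are reported as well.
    """
    return [
        sql_text.count("\n", 0, match.start()) + 1
        for match in EXECUTE_IMMEDIATE.finditer(sql_text)
    ]


# Hypothetical script used only for illustration.
script = (
    "CREATE TABLE audit_log (id INT);\n"
    "EXECUTE IMMEDIATE 'DROP TABLE ' || table_name;\n"
    "SELECT 1;\n"
)
flagged_lines = flag_dynamic_sql(script)
print(flagged_lines)
```

Lines reported this way are candidates for manual inspection, since their runtime SQL text is not covered by structural analysis.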

PostgreSQL

  • Catalog integration is currently Snowflake-only; it is not yet available for PostgreSQL
  • Function/procedure bodies in dollar-quoted $$...$$ blocks are parsed with full PL/pgSQL support, including PERFORM, RAISE, GET DIAGNOSTICS, FOREACH, EXECUTE ... USING, %TYPE, and labeled blocks; rare or highly dynamic constructs may fall back to opaque handling
  • Some PostgreSQL-specific expression syntax (e.g., ::type casts in complex positions) may fall back to opaque handling
  • Advisory locks and other session-level features are not tracked

Jinja/dbt Rendering

  • Error positions: When SQL is rendered from nested macros and refs, error positions refer to the rendered SQL, not the original source file. There is no reliable way to map back to a specific line in macro-heavy dbt projects.
  • Unresolved variables: If Jinja variables are not provided, placeholders remain in rendered output. This may cause parse errors or incomplete analysis.
  • Macro edge cases: Some complex macro patterns (recursive macros, dynamic macro names) may not render correctly.
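
Unresolved variables can be caught before analysis by scanning the rendered SQL for leftover template tokens. The sketch below assumes that unresolved placeholders still look like Jinja delimiters ({{ ... }} or {% ... %}), which this page does not confirm; find_unresolved_placeholders and the sample SQL are hypothetical and not part of Lexega. Reported line numbers refer to the rendered SQL, not the original source file.

```python
import re

# Jinja expression ({{ ... }}) and statement ({% ... %}) delimiters.
JINJA_TOKEN = re.compile(r"\{\{.*?\}\}|\{%.*?%\}", re.DOTALL)


def find_unresolved_placeholders(rendered_sql: str) -> list[tuple[int, str]]:
    """Return (line_number, token) pairs for Jinja tokens left in rendered SQL."""
    hits = []
    for match in JINJA_TOKEN.finditer(rendered_sql):
        # Count newlines before the match to get a 1-based line number.
        line_number = rendered_sql.count("\n", 0, match.start()) + 1
        hits.append((line_number, match.group(0)))
    return hits


# Hypothetical rendered output in which target_schema was never supplied.
rendered = "SELECT id\nFROM {{ target_schema }}.orders\nWHERE ds = '2024-01-01'"
leftovers = find_unresolved_placeholders(rendered)
for line_number, token in leftovers:
    print(f"line {line_number}: unresolved {token}")
```

A non-empty result suggests supplying the missing variables before trusting parse or analysis output.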

Risk Analysis

  • Catalog population requires a companion sidecar binary (the CLI alone provides offline analysis only)
  • Heuristic analysis without a catalog is conservative (it may produce false positives)
  • No query execution time prediction (only structural risk)
  • Dynamic table names (from variables) cannot be resolved statically

Formatter

  • Jinja structure is preserved; by default SQL inside Jinja blocks is formatted (pass-through mode is configurable via jinja_preserve_original)
  • Some deeply nested expressions may exceed line width targets
  • Comment positioning may shift slightly to maintain syntactic validity

Performance

  • Formatting: ~1.1 µs per statement (10,000 statements format in ~11ms)
  • Risk analysis (current benchmarks):
    • 10k statements + 50k-table catalog: ~411ms
    • 5k statements + 10k-table catalog: ~153ms
    • Complex Jinja (530-line hostile template): ~1.7ms
    • Nested CTEs (2k statements, 50k catalog): ~325ms (87% faster than before the lineage-tracking optimization)
  • Memory: Risk analysis with large catalogs (>50,000 tables) uses ~200-500MB of RAM for the catalog index. Formatting is low-memory (~10-20MB typical).
  • Scaling: Approximately linear with statement count. Nested CTE lineage tracking has been optimized; pathological cases that previously took 2+ seconds now complete in under 400ms.
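
The figures above can be cross-checked with some simple arithmetic. The sketch below only re-derives numbers from this list; it does not measure anything. It assumes that "87% faster" means an 87% reduction in elapsed time, and the helper names are hypothetical.

```python
# (statement count, total milliseconds), copied from the benchmark list above.
RISK_BENCHMARKS = {
    "10k statements, 50k-table catalog": (10_000, 411.0),
    "5k statements, 10k-table catalog": (5_000, 153.0),
    "nested CTEs, 2k statements, 50k-table catalog": (2_000, 325.0),
}


def per_statement_us(statements: int, total_ms: float) -> float:
    """Average cost per statement in microseconds."""
    return total_ms * 1000.0 / statements


def implied_baseline_ms(current_ms: float, percent_faster: float) -> float:
    """Earlier elapsed time, assuming 'N% faster' means N% less elapsed time."""
    return current_ms / (1.0 - percent_faster / 100.0)


risk_per_statement = {
    name: round(per_statement_us(count, total_ms), 1)
    for name, (count, total_ms) in RISK_BENCHMARKS.items()
}

# Formatting: 10,000 statements at ~1.1 microseconds each, in milliseconds.
formatting_total_ms = round(10_000 * 1.1 / 1000.0, 1)

# Baseline implied by ~325ms being 87% faster than the earlier run.
nested_cte_baseline_ms = round(implied_baseline_ms(325.0, 87.0))

for name, cost_us in risk_per_statement.items():
    print(f"{name}: ~{cost_us} microseconds per statement")
print(f"formatting 10k statements: ~{formatting_total_ms} ms")
print(f"implied nested-CTE baseline: ~{nested_cte_baseline_ms} ms")
```

Under these assumptions, risk analysis averages roughly 41, 31, and 163 microseconds per statement for the three runs, and the implied nested-CTE baseline of about 2.5 seconds is consistent with the "2+ seconds" figure quoted above. Per-statement cost differs between runs because catalog size and query shape differ as well as statement count.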

Unsupported Constructs: If you encounter parse failures or incorrect analysis, please include the failing SQL in your feedback.
