SQL dialect reference: quoting, comments, and cast syntax across 14 engines
Tidysql formats SQL across 14 dialects supported by sql-formatter v15, applying the quoting, comment, and identifier rules each engine recognizes. This reference catalogs the dialect-specific syntax differences the formatter respects, the option controls available on the toolbar, and the parse-failure patterns to watch for in the diff bar.
All 14 supported dialects
The dialect dropdown maps directly to sql-formatter v15's dialect list. Each row shows the four syntax differences most likely to break when porting a query across engines.
| Dialect | Identifier quote | Single-line comment | Type cast | String literals |
|---|---|---|---|---|
| PostgreSQL | "name" | -- | value::type or CAST(...) | '...', E'...' |
| MySQL | `name` | -- or # | CAST(...) only | '...', "..." |
| MariaDB | `name` | -- or # | CAST(...) only | '...', "..." |
| T-SQL (SQL Server) | [name] or "name" | -- | CAST(...) only | '...', N'...' |
| BigQuery | `name` | -- or # | CAST(...) | '...', "...", r'...' |
| SQLite | "name" or `name` | -- | CAST(...) | '...' |
| Snowflake | "name" | -- | value::type or CAST(...) | '...' |
| Redshift | "name" | -- | value::type or CAST(...) | '...' |
| DB2 | "name" | -- | CAST(...) | '...' |
| Hive | `name` | -- | CAST(...) | '...', "..." |
| Oracle PL/SQL | "name" | -- | CAST(...) | '...', q'[...]' |
| N1QL (Couchbase) | `name` | -- | TO_* functions | '...', "..." |
| SingleStoreDB | `name` | -- or # | CAST(...) | '...', "..." |
| Spark SQL | `name` | -- | CAST(...) | '...', "..." |
The SQL-92 standard requires -- followed by a space before a comment. MySQL, MariaDB, and SingleStoreDB enforce that space at runtime; others accept --comment without the trailing space.
Reserved-word identifier quoting
When a column or table name matches a reserved word, the formatter wraps it using the dialect's quoting style. The same input rendered across the major dialects:
| Dialect | Output for select * from order |
|---|---|
| PostgreSQL | SELECT * FROM "order" |
| MySQL / MariaDB | SELECT * FROM `order` |
| T-SQL | SELECT * FROM [order] |
| BigQuery | SELECT * FROM `order` |
| SQLite | SELECT * FROM "order" |
| Snowflake | SELECT * FROM "order" |
| Redshift | SELECT * FROM "order" |
| Oracle PL/SQL | SELECT * FROM "order" |
| Hive / Spark | SELECT * FROM `order` |
Three families emerge: double-quote (PostgreSQL-derived engines plus Oracle and DB2), backtick (MySQL and the Hadoop/cloud query engines), and square-bracket (T-SQL alone). The formatter applies the engine's quoting rules from the parsed AST, not a regex on the output, so it will only quote identifiers that the parser flags as reserved words for the selected dialect.
Comment styles by dialect
| Dialect | -- comment | # comment | /* block */ |
|---|---|---|---|
| PostgreSQL | yes | parse error | yes |
| MySQL / MariaDB | yes (space required) | yes | yes |
| T-SQL | yes | parse error | yes |
| BigQuery | yes | yes | yes |
| SQLite | yes | parse error | yes |
| Snowflake | yes | yes | yes |
| Redshift | yes | parse error | yes |
| DB2 | yes | parse error | yes |
| Oracle PL/SQL | yes | parse error | yes |
| Hive / Spark | yes | parse error | yes |
Switching dialect after pasting MySQL with # comments to PostgreSQL produces a warning in the diff bar. The output panel shows the best-effort result and leaves the # lines untouched, but the query will fail at runtime against PostgreSQL.
Type cast operators
The single most common source of dialect-mismatch breakage:
| Cast style | PostgreSQL | Redshift | Snowflake | BigQuery | MySQL | T-SQL | Oracle |
|---|---|---|---|---|---|---|---|
value::type | yes | yes | yes | no | no | no | no |
CAST(value AS type) | yes | yes | yes | yes | yes | yes | yes |
CONVERT(type, value) | no | no | no | no | yes | yes | no |
value AT TIME ZONE 'UTC' | yes | yes | yes | no | no | no | no |
CAST(... AS ...) is the only universally portable form. The formatter preserves whichever cast style the input uses; it never rewrites ::int to CAST(x AS int) when the dialect is switched. The diff bar flags :: casts as unrecognized tokens when the active dialect does not support them, but leaves the surrounding query formatted.
Identifier case folding
A subtle dialect difference that affects whether a query resolves to the same object after copy-paste:
| Dialect | Unquoted identifier | Quoted identifier |
|---|---|---|
| PostgreSQL | folded to lowercase | preserved as written |
| MySQL / MariaDB (Linux) | preserved | preserved |
| MySQL / MariaDB (Windows) | folded to lowercase | folded to lowercase |
| T-SQL | preserved (collation-dependent) | preserved |
| BigQuery | preserved | preserved |
| SQLite | preserved | preserved |
| Snowflake | folded to UPPERCASE | preserved as written |
| Redshift | folded to lowercase | preserved as written |
| Oracle PL/SQL | folded to UPPERCASE | preserved as written |
The Snowflake and Oracle uppercase-folding rule is why SELECT id FROM users and select id from USERS both resolve to the table named USERS, while "users" and "USERS" are different objects. The formatter does not touch identifier case in any direction; the keyword case option only affects tokens the parser classifies as reserved words.
Formatting option matrix
The toolbar controls three orthogonal settings:
| Option | Values | Scope |
|---|---|---|
| Dialect | 14 engines | Reserved words, quoting, comment styles, cast operators |
| Keyword case | UPPER, lower, Preserve | Reserved-word tokens only (never identifiers) |
| Indentation | 2 spaces, 4 spaces, tab | Indent width inside SELECT lists, JOIN clauses, CASE blocks, subqueries |
Indent applies recursively. A subquery in WHERE id IN (SELECT ...) uses two levels of the selected indent: four spaces at the 2-space setting, eight spaces at the 4-space setting, two tab characters in tab mode.
The Preserve keyword case is the right choice when reformatting code that intentionally mixes casing, for example a style guide that uppercases reserved words but lowercases aggregate functions like count(*). The UPPER and lower modes apply uniformly to every token in the reserved-word table for the selected dialect.
Minify mode behavior
Minify collapses the SQL to a single line by stripping all formatting whitespace. The output is the input parsed and re-emitted with a single space between tokens; the dialect, keyword case, and indent options are ignored.
| Input shape | Format mode output | Minify mode output |
|---|---|---|
| Single-line input | Multi-line indented | Single line, normalized spacing |
| Multi-line input | Re-indented per options | Single line, normalized spacing |
| Already-minified input | Multi-line indented | Single line, normalized spacing |
Minify is useful when piping a query into a log line, a curl payload, or a YAML field where multi-line SQL is awkward to escape. Switching back to Format re-parses the minified output and applies the current Format options, which lets you toggle freely between the two views on the same query.
Quick reference: keyword case behavior
The keyword case option applies only to tokens the parser recognizes as SQL reserved words for the selected dialect. Identifiers, string literals, comments, and dialect-specific functions outside the keyword table are left untouched.
| Input | UPPER output | lower output | Preserve output |
|---|---|---|---|
select id | SELECT id | select id | select id |
Select ID | SELECT ID | select ID | Select ID |
JOIN users ON users.id = ... | JOIN users ON users.id = ... | join users on users.id = ... | JOIN users ON users.id = ... |
count(*) | COUNT(*) | count(*) | count(*) |
count, sum, avg, and other aggregate functions are in the keyword table for every supported dialect, so they follow the case setting alongside SELECT and FROM.
Parse warning patterns
The diff bar shows a yellow warning banner when the parser cannot classify a token. The output panel still renders a best-effort result. Common triggers and their fixes:
| Symptom | Cause | Fix |
|---|---|---|
| Whole query unchanged after Format click | Parser bailed on the first unknown token | Check for missing semicolon, unclosed string, or unbalanced bracket |
| String literal contents reformatted | Embedded SQL inside a string (e.g., EXECUTE 'SELECT ...') | Format the inner query separately; the parser does not descend into string contents |
# comments rendered as identifiers | Dialect mismatch: # is a comment only in MySQL, MariaDB, BigQuery, Snowflake | Switch dialect or replace with -- |
:: cast flagged as syntax error | Dialect mismatch: :: is supported in PostgreSQL, Redshift, Snowflake only | Switch dialect or rewrite to CAST(... AS ...) |
Window function OVER (...) flattened to one line | Old issue from sql-formatter v14, resolved in v15 | None; tidysql ships v15 |
| Vendor-specific function name lowercased | Function not in the dialect's keyword table | Use Preserve keyword case |
| Multiple CTEs run together without blank lines | WITH a AS (...), b AS (...) is one statement to the parser | Expected; the formatter does not insert separator blank lines |
| Stored procedure body untouched | CREATE PROCEDURE ... AS BEGIN ... END is out of scope | sql-formatter v15 does not parse procedural extensions; format the body queries individually |
| Removed-lines count non-zero on Format pass | Parser ate a token (very rare on v15) | Inspect the diff; report the input if reproducible |
Related concepts
EXPLAINplan output: a separate concern from query formatting; tidysql formats the query text, not the plan that comes back from the engine.- SQL linting (SQLFluff, SQLLint): rule checks beyond formatting, such as trailing-comma policy, CTE ordering, and column-list style; complementary to a formatter, not a substitute.
- Pretty-printing inside stored procedures: handled by dialect-aware tools like
pg_format; sql-formatter v15 parses statements only. - ORM-emitted SQL: typically already auto-quoted at write time, which is why the formatter rarely needs to add quotes when reformatting an ORM debug-log dump.
- AST-level transformations: sql-formatter v15 exposes a parser AST, but tidysql uses only the printer; query rewriting is out of scope.