Simple SQL Formatter

SQL in general.

https://www.dropbox.com/s/i5qvotqtdks9r5r/SQL%20Keywords.txt?dl=0

Compiled from

https://www.drupal.org/docs/develop/coding-standards/list-of-sql-reserved-words

Kem

Part of your test code is:

 "where" = 'and' -- my where AND

and you expect to get:

"where" = 'and' -- my where AND
I have a problem with this particular construct as there is no termination go the “–” connect style. For “/" there is a termination of "/” so the issue is not there.

My question is how can any parser recognise that the AND part is part of the SQL or just part of the comment? This bit is nonsensical to me!

[quote=325090:@Simon Berridge]
I have a problem with this particular construct as there is no termination go the “–” connect style[/quote]
there is
its the end of line (if its across multiple lines) which is possible

Simon, you didn’t quote my test code, you quoted the output from the formatter. In my test code, the comment ends at the end-of-line and the “and” starts the next line.

Comments that start with “–” are terminated by the end of the line. Comments that start with “/" are terminated by the matching "/”.

so much for “simple sql formatter” :stuck_out_tongue:

not more complicated than a csv import … :wink:

csv is a simple finite state machine (literally has about 20 states)

SQL is hugely complex and “doing it right” (esp if you want to handle nested subqueries etc) is way more complex
I’ve written a sql parser many years ago as the front end to a custom back end

[quote=325098:@Kem Tekinay]-- some document
SELECT
*
FROM
" where "
WHERE
“where” = ‘and’ – my where
AND
“or” = ‘or’ /* or another comment */ OR
“not” IS NOT NULL[/quote]
You’re right, Kem, sorry about that.

The formatter now takes:

-- some document Select * from " where " where "where" = 'and' -- my where and "or" = 'or' /* or another comment */ or "not" is not null
and produces:

-- some document SELECT * FROM " where " WHERE "where" = 'and' -- my where AND "or" = 'or' /* or another comment */ OR "not" IS NOT NULL
We are getting there!

Latest version: sqlFormatter

Cant blame you for avoiding subqueries entirely :slight_smile:
A quick test says it wont touch them in any way

We now get an extra blank line before the first SELECT, and strings that split across lines are still altered.

Select 'this
And that' from table

becomes

<blank line>
SELECT
 'this And that'
FROM
 table

not sure what happened to my previous post…
But I’m liking this “contest” :slight_smile:

Would be great if someone could come up with a complex valid SQL query that incorporates all the structures discussed so far.

  • double and single quotes
  • EOL inside of quotes
  • sub query
  • block comments
  • line comments

and see who can come up with the best (most complete/accurate) formatter that does not alter the users intent nor break the SQL

we’d have several to play with if this were a contest

bring it on :slight_smile:

the prize is of course, the knowledge gained.

wasn’t volunteering to write one - I’ve done that once in C and have no desire to do it again
just that “simons code doesn’t do X” and he fixes it isn’t exactly a “contest”

you had said you were writing one though

[quote=325135:@Norman Palardy]wasn’t volunteering to write one - I’ve done that once in C and have no desire to do it again
just that “simons code doesn’t do X” and he fixes it isn’t exactly a “contest”

you had said you were writing one though[/quote]
Yes I am… hence the “contest”… I have posted the results of mine many times above,

Let’s get one going here.

--
-- Some test sql
-- with a multi-line comment
-- and stuff
--

with my_with_1 as (
select col1, col2, col3 as "col 3"
from (
select * from table2) as sub_table1 where
id = 'some string'
)
with my_with_2 as (
select some_function(param1, param2, 'string param')
from table3 t3
left join table4 as t4 on (t4.id = t4.table4_id)
where upper(lower(string_function('some 
long
string', otherparam))) like my_with_1.col1
)

/* Here come
the select statements, but not here
select * from table1 where i = 4;
*/
(
select col1, col2, col3 as "   select  "
from my_with_2 -- using my_with_2 here
right join table5 t5 -- comment in the middle of join
on (
select distinct * from table6, table7 where table6.table7_id = table7.id
) where 
t5 = some_other_function(
param1, param
)
group by col3, col2, col1
order by col1

limit 5 offset 4

) union
(
-- just a copy of the one above
select col1, col2, col3 as "   select  "
from my_with_1 -- using my_with_1 here
right join table5 t5 -- comment in the middle of join
on (
select distinct * from table6, table7 where table6.table7_id = table7.id
) where 
t5 = some_other_function(
param1, param
)
group by col3, col2, col1
order by col1

limit 5 offset 4

)

One of the online formatters made mincemeat out of this.

Edit: I was missing a paren. After adding it, the online formatter did a fine job.

I take it back, the online formatter didn’t do that great a job. It doesn’t like the second “with” I guess.

here is my first attempt…

--
-- some test sql
-- with a multi-line comment
-- and stuff
--
WITH my_with_1 AS (
SELECT col1, col2, col3 AS "col 3"
  FROM (
SELECT *
  FROM table2) AS sub_table1
 WHERE id = 'some string') WITH my_with_2 AS (
SELECT some_function (param1, param2, 'string param'
  FROM table3 t3 LEFT JOIN table4 AS t4 ON (t4.id = t4.table4_id)
 WHERE UPPER(LOWER(string_function ('some 
long
string', otherparam))) LIKEmy_with_1.col1) /* here come
the select statements, but not here
select * from table1 where i = 4;
* / (
SELECT col1, col2, col3 AS "   select  "
  FROM my_with_2 -- using my_with_2 here
 RIGHT JOIN table5 t5 -- comment in the middle of join
    ON (
SELECT DISTINCT *
  FROM table6, table7
 WHERE table6.table7_id = table7.id)
 WHERE t5 = some_other_function (param1, param)
 GROUP BY col3, col2, col1
 ORDER BY col1
 LIMIT 5
OFFSET 4)
 UNION
   (-- just a copy of the one above
SELECT col1, col2, col3 AS "   select  "
  FROM my_with_1 -- using my_with_1 here
 RIGHT JOIN table5 t5 -- comment in the middle of join
    ON (
SELECT DISTINCT *
  FROM table6, table7
 WHERE table6.table7_id = table7.id)
 WHERE t5 = some_other_function (param1, param)
 GROUP BY col3, col2, col1
 ORDER BY col1
 LIMIT 5
OFFSET 4)

You used the version with the missing close paren, but still not bad.

I should post what I’d personally like to see.

This is more or less how I’d format it by hand. Naturally this reflects my own preferences, aka, “the right way.” :stuck_out_tongue:

--
-- Some test sql
-- with a multi-line comment
-- and stuff
--

WITH my_with_1 AS (
  SELECT
    col1,
    col2,
    col3 as "col 3"
  FROM (
    SELECT
      * 
    FROM
      table2
    ) AS sub_table1
  WHERE
    id = 'some string'
)

WITH my_with_2 AS (
  SELECT
    some_function(param1, param2, 'string param')
  FROM
    table3 t3
    LEFT JOIN table4 AS t4 ON (
      t4.id = t4.table4_id
    )
  WHERE
    UPPER(LOWER(string_function('some 
long
string', otherparam))) LIKE my_with_1.col1
)

/* Here come
the select statements, but not here
select * from table1 where i = 4;
*/
(
  SELECT
    col1,
    col2,
    col3 AS "   select  "
  FROM
    my_with_2 -- using my_with_2 here
    RIGHT JOIN table5 t5 -- comment in the middle of join
    ON (
      SELECT DISTINCT
        *
      FROM
        table6,
        table7
      WHERE
        table6.table7_id = table7.id
    )
  WHERE 
    t5 = some_other_function(param1, param)
  GROUP BY
    col3,
    col2,
    col1
  ORDER BY
    col1
  LIMIT 5
  OFFSET 4
) UNION (
  -- just a copy of the one above
  SELECT
    col1,
    col2,
    col3 AS "   select  "
  FROM
    my_with_2 -- using my_with_2 here
    RIGHT JOIN table5 t5 -- comment in the middle of join
    ON (
      SELECT DISTINCT
        *
      FROM
        table6,
        table7
      WHERE
        table6.table7_id = table7.id
    )
  WHERE 
    t5 = some_other_function(param1, param)
  GROUP BY
    col3,
    col2,
    col1
  ORDER BY
    col1
  LIMIT 5
  OFFSET 4
)