Simple SQL Formatter

Kem

I’m not sure I understand the SQL string you are sending. It certainly isn’t a valid sql statement as far as I can see.

It does replace all double-space to be single-space unless it is within single or double quotes. Also, the result you posted seems to be an exact representation of your sent string but minus the multiple spaces!

Sorry, it seems like it reformatted it perfectly!

Where am I wrong?

Simon, while his example is not valid SQL, your formatter (according to what he posted) DID in fact change the content
it dropped the “” from after the FROM and after the WHERE, and adding in SINGLE QUOTE

Double and/or Single quotes in a SQL statement SHOULD NEVER BE ALTERED… for some database engines it makes a difference if a string has single quotes or double

if I take the results he posted, and just remove a few end of lines

SELECT  'select ( ) from a'
FROM  ??? ' where '
WHERE ???  'where' = 'select' and 'and a = 2' <> 3

mine

SELECT 'select     (    ) from   a'
FROM ""
WHERE ""
WHERE 'where' = 'select'
AND 'and a = 2' <> 3 

you can plainly see the altered quotes totally changed the results

Seems right to me considering you start with

SELECT 'select     (    ) from   a' FROM "      where    " WHERE 'where' = 'select' and 'and a = 2' <> 3

as the query

Single quotes are used for strings whereas double quotes denote identifiers. SQLite is lax about enforcing this but other engines are not as forgiving.

Regardless, nothing between any quotes should be altered in any way. My fake sql is just meant to illustrate the potential issues.

Also, remember that I was posting Xojo code so double quotes are doubled to escape them in the code.

yup
took your code and stuck it in xojo code like

   textarea1.text =   "SELECT 'select     (    ) from   a' FROM ""      where     "" WHERE 'where' = 'select' and 'and a = 2' <> 3"

to show the “original string”

I’m not sure that single quote / double quote “rules” are part of the sql standard
they could be but I havent read them in ages

sqlite will let you do

create table 'from' ( 'from' , 'where' ) ;
select 'from' from 'from' where 'where' = 'from' ;

other engines let you do slightly different variations :slight_smile:

SQLite is forgiving in this respect… Oracle however (for example) is not… there you rarely ever use Double quotes, and using them incorrectly raises exceptions

Even SQLite discourages using them incorrectly as that “forgiveness” is there for historical purposes only. See:

https://sqlite.org/lang_keywords.html

Postgres won’t let you switch them either.

When I used Sybase day in day out it was common to simply alternate
If you started a literal with ’ then internally you could requote with " and if you needed to quote inside that you’d use ’ again (and after that you needed your head examined)
Table names etc were never quoted in any form

This seems to be one of those areas where “different engines do different things”
And I’m not about to buy the ANSI 2016 standard just to figure this out - esp because different engines do different things

[quote=324877:@Kem Tekinay]Single quotes are used for strings whereas double quotes denote identifiers. SQLite is lax about enforcing this but other engines are not as forgiving.

Regardless, nothing between any quotes should be altered in any way. My fake sql is just meant to illustrate the potential issues.[/quote]
Kem

You have to remember that this sqlFormatter was created to deal with an SQLite database only. According to the SQLite documents the " and ’ are interchangeable, provided that there are two of them. As the standard for SQLite is ’ I change any " to ’ in the final return.

Your line is:

"SELECT 'select     (    ) from   a' FROM ""      where     "" WHERE 'where' = 'select' and 'and a = 2' <> 3"

That will translate to an actual SQL string of:

SELECT 'select     (    ) from   a' FROM "      where     " WHERE 'where' = 'select' and 'and a = 2' <> 3

My sqlFormatter changes this to:

SELECT 'select ( ) from a' FROM ' where ' WHERE 'where' = 'select' and 'and a = 2' <> 3
Taking away the end of lines gives:

SELECT 'select ( ) from a' FROM ' where 'WHERE 'where' = 'select' and 'and a = 2' <> 3

I am still failing to see where it is wrong! I accept that it has removed a whole bunch of spaces but this is (normally) not an issue.

I am sorry to keep on about this but I am really interested in getting it right and I want to understand your point.

well it shouldn’t strip spaces out of literals since they may be relevant to whatever output is being generated

That’s true and I set it up not to. I will check that out tomorrow as it is 23:15 here now!

You cannot change the internals of a given string in any way, even whitespace. Doing so will change the meaning of the SQL and might actually break it since there is probably a good reason why someone wants or doesn’t want whitespace inside their strings. It will completely break if you change the internals of an identifier. " item " is not the same thing as " item "

I really don’t know of a better way to phrase that.

If you look at the SQLite page I referenced, you’ll see that double-quotes are used for identifiers and single-quotes for strings. That it doesn’t have to be that way is incidental and should not be considered correct. Your formatter should leave the original quote character in place.

From that page:

a) the internals of a quoted string should NEVER be changed (as Norman just said)
b) Double/Single quotes should NEVER be altered (even if it is just SQLite)… do not change what the user may have intended
c) whitespace OUTSIDE of quoted strings is ok (my opinion) to condense

SELECT 'select     (    ) from   a'
FROM "      where    "
WHERE 'where' = 'select'
AND 'and a = 2' <> 3
SELECT
 'select ( ) from a'  <----- REMOVED WHITESPACE INSIDE QUOTES
FROM
 ' where ' <--- REMOVED WHITESPACE INSIDE QUOTES AND ALTERED QUOTES
WHERE
 'where' = 'select' and 'and a = 2' <> 3 <--- FAILED TO RECOGNIZE THE 'AND" that is outside of the quotes

I have taken on board all these comments and I now understand what it was that Kem was pointing out. The latest version is here:
sqlFormatter

It now takes:

SELECT 'select     (    ) from   a' FROM "      where     " WHERE 'where' = 'select' and 'and a = 2' <> 3

and the result is:

SELECT 'select ( ) from a' FROM " where " WHERE 'where' = 'select' AND 'and a = 2' <> 3
Any more comments would be useful.

Thanks to all.

Simon.

Much better, thanks.

I see you strip comment lines entirely, but I’m not sure that’s the best choice. Can’t you leave them in place?

Also, inline comments with “–” and “/* */” are not handled properly. Again, for illustration:

-- some document 
Select *
 from "   where " 
where 
"where" = 'and' -- my where
and "or" = 'or' /* or another comment */ or "not" is not null

turns into

SELECT
 *
FROM
 "   where "
WHERE
 "where" = 'and' -- my where AND
 "or" = 'or' /* OR
 another comment */ OR
 "not" is not null

Also note how “not” and “is not null” are not capitalized.

What I’d expect is:

-- some document 
SELECT
 *
FROM
 "   where "
WHERE
 "where" = 'and' -- my where
 AND
 "or" = 'or' /* or another comment */ OR
 "not" IS NOT NULL

(The capitalization thing is an aside to my main point about comment handling.)

It would also be acceptable to turn inline comments in the “–” form into the “/* */” form for the sake of expediency, so you could get this:

-- some document 
SELECT
 *
FROM
 "   where "
WHERE
 "where" = 'and' /* my where */ AND
 "or" = 'or' /* or another comment */ OR
 "not" IS NOT NULL

I like where you’ve gone with this and would like to see it better, so I hope you take my comments in that spirit, not as criticism of your so-far excellent work.

End of lines in strings are being stripped too. Consider this sql that returns a result in our database:

select
id, query
from
qc_query
where
id = 157 and
replace( query, chr(13), chr(10)) ilike
'select
id, code, first_name, last_name, primary_role
from
users'

After formatting, it no longer works:

SELECT
 id,
 query
FROM
 qc_query
WHERE
 id = 157 AND
 replace( query, chr(13),
 chr(10)) ilike 'select id, code, first_name, last_name, primary_role from users' 

Also notice how it breaks on the comma in the REPLACE function parameters.

Kem… I’m following this as well, since I am writing the same function (my formatting “style” is different than Simon)
I don’t understand why removing EOL “breaks” you example above.
Oh wait… the EOL is INSIDE the quotes. ok… makes sense.

I didn’t even consider comments… :slight_smile:
So for that we have

  • /* */
    nothing else (all my SQL commenting was in PL/SQL only)

Also… for the record… when complete, my formatter will have

  • the ability to emit keywords as UPPER, or LOWER, or TITLECASE (user choice)
  • the ability to emit NON Keywords the same… so you can say KEY=Upper, NON=lower
  • obviously contents of quoted strings will not be violated (regardless of the type of quotes)
  • the formatting style CAN be altered by the user, since it uses a table to determine if EOL should be placed before, after (or both) for specific keywords

here is what mine did to your example

SELECT id, query
  FROM qc_query
 WHERE id = 157
   AND REPLACE(query, chr (13), chr (10)) ilike 'select
id, code, first_name, last_name, primary_role
from
users'

Sounds nice, looking forward to seeing it.

For the record, the “/* */” style of comments don’t have to be dealt with at all, just included as-is where found.

BTW, I compiled a list of every SQL keyword in a text file, if that helps.

yes it would…
SQLite specific? or SQL in general