Is there something wrong with this database?

Hiya,

I’m trying to process some data for work. I have converted the data (which was in CSV format) to an SQLite database using Norman Palardy’s tool (GitHub - npalardy/CSVParser: Event based and recordset like CSV parser for Xojo).

The database has a single table (Radiologists) and the full database has about 18,000 rows.

I was just trying a simple SQL statement:

SELECT * FROM Radiologists WHERE radID = 19;

but I get an SQL error saying that column radID doesn't exist. I have definitely spelled it correctly. As far as I can tell there are no weird characters in the column name.

Oddly, if I do an almost identical query but use the Values column the query works fine:

SELECT * FROM Radiologists WHERE Values = 6;

I have attached a massively edited version of the database with only 500 rows that exhibits the same behaviour.

I have tried querying the database both from within Xojo and using a third party tool (Base for macOS) and get the same error.

What am I doing wrong?

Radiologists.sqlite.zip (186.5 KB)

A simple look at your .sqlite file (as.txt) lead to:

As you can see, there is something strange with the radID column name. I need to fire a specific tool to really watch what happened. I have to search on my SSD for it and come back later.

2 Likes

You’ve got some dodgy characters in your Create statement.

Have you looked at it with the sqlite3 CLI tool? That can do csv import too.

This looks like that, but a further look denies that:

I copied the radID line from the right ListBox (image above) and paste it into TextEdit (macOS): no gremlin in the text. I will add a bread in the code and watch what can be seen in the debugger.

How strange. I ported the code from Norm’s repo to modernise it - this is the create table code in the Xojo debugger:

CREATE TABLE "Radiologists"(
"id" INTEGER PRIMARY KEY,
"radID" INTEGER,
"FRCR" INTEGER,
"Gender" TEXT,
"ReportingYear" INTEGER,
"Age" INTEGER,
"YearsWithMedica" INTEGER,
"Type" TEXT,
"Values" TEXT
);

Don’t see any weird characters. Very odd.

I’ll take a look at the SQLite CLI - didn’t even know there was one (DBs are not something I use very often).

That’s it:

1 Like

I use the free version of Beekeeper Studio. Runs on macOS, linux and Windows and supports MySQL, PostgreSQL, SQLite and SQL Server. You can even connect via SSH to a remote database:

I think the dodgy characters are either being stripped out are not visible. Try using the cursor keys to see if the cursor appears to stick when it should move.

If I open the database in SQLite Manager, copy the Create statement and paste it into BBEdit and then do a hex dump I see the following:

0000: 43 52 45 41 54 45 20 54 41 42 4C 45 20 22 52 61 	CREATE TABLE "Ra
0010: 64 69 6F 6C 6F 67 69 73 74 73 22 28 0A 22 69 64 	diologists"(¬"id
0020: 22 20 49 4E 54 45 47 45 52 20 50 52 49 4D 41 52 	" INTEGER PRIMAR
0030: 59 20 4B 45 59 2C 0A 22 EF BB BF 72 61 64 49 44 	Y KEY,¬"radID
0040: 22 20 54 45 58 54 2C 0A 22 46 52 43 52 22 20 54 	" TEXT,¬"FRCR" T
0050: 45 58 54 2C 0A 22 47 65 6E 64 65 72 22 20 54 45 	EXT,¬"Gender" TE
0060: 58 54 2C 0A 22 52 65 70 6F 72 74 69 6E 67 59 65 	XT,¬"ReportingYe
0070: 61 72 22 20 54 45 58 54 2C 0A 22 41 67 65 22 20 	ar" TEXT,¬"Age" 
0080: 54 45 58 54 2C 0A 22 59 65 61 72 73 57 69 74 68 	TEXT,¬"YearsWith
0090: 4D 65 64 69 63 61 22 20 54 45 58 54 2C 0A 22 54 	Medica" TEXT,¬"T
00A0: 79 70 65 22 20 54 45 58 54 2C 0A 22 56 61 6C 75 	ype" TEXT,¬"Valu
00B0: 65 73 22 20 54 45 58 54 0A 29                   	es" TEXT¬)

If you display the create stable statement with UTF8 encoding, you see nothing; but with a Macintosh encoding and Kevin hex too, you can see “garbage” before the Column name.

It’s not garbage; it’s valid UTF-8 character as follows:

U+FEFF  ef bb bf ZERO WIDTH NO-BREAK SPACE

I think all that @GarryPettet has to do is edit the first line of the CSV file to remove that in the header line, and re-import.

Display nothing because it uses UTF8 as the encoding.

Did I say it should display it? :wink:
I just mentioned it as a free database tool (o;

Don’t you see the quotes around “garbage” ?

In the context, it can be considered as garbage (or call it gremlin). It have nothing to be removed. But removing something you cannot see is hard.

It was invisible too in my software tool. I only explained why we see nothing where there is something to watch.

@GarryPettet can open the CSV file with a text editor, and remove not just radID in the header, but chars to the left and right of that, and then carefully enter those characters back. That way, the characters deleted will contain the NO-BREAK-SPACE.

Job done.

Emile is correct.

There’s invisible garbage inserted. Clean it up and everything will work as it should be.

Try this:
SELECT * FROM “Radiologists” WHERE radID = 19;

This would work if the table “Radiologists” has been created with the quotes (which means “hey, i am dead serious about the capitalization”).

Not the case. Emile found the problem. Probably came from some copy/paste content. Easy fix.

That’s not where the problem lies, though. The original CSV file from which the Radiologists table and its columns was created, has a UTF-8 non-breaking space as the first char of the radID column name in the header line of the CSV file. This can be removed by editing and the database recreated.