Sorting through data in .csv file

[quote=179133:@Axel Schneider]try this

Test[/quote]

That works really well actually! How would I differentiate the different columns though?

So, let’s say I wanted to fetch data out of row 2, column 3.

Here’s a method I wrote a while back that converts an CSV String into an array. You can then easily get row 2 column 3 for example. It also handles quoted CSV files:

[code]Protected Function csvToArray(csv as string, deliminator as string = “”) As string(,)
dim sa(-1,-1) as string
csv = ReplaceLineEndings(csv,Chr(13))
dim rows() as string = csv.Split(Chr(13))
if rows.Ubound = -1 then return sa
//Clean out Blank Rows
For i as integer = 0 to rows.Ubound
if rows(i).Trim= “” then
rows.Remove(i)
i = i - 1
end
Next

//Determine Column Count
Dim cols as string = rows(0)
if deliminator = “” then deliminator = if(cols.CountFields(",") < cols.CountFields(chr(9)), chr(9), “,”)

dim Imbedding as Boolean = false
dim colcount as integer = 0
For i as integer = 1 to cols.Len
if cols.mid(i,1) = chr(34) then
if Imbedding and cols.mid(i+1,1) = chr(34) then //Imbedding and we have “” to escape a "
i = i+1 //Jump the extra double quote "
else
imbedding = not Imbedding
end
elseif cols.mid(i,1) = deliminator and not imbedding then
colcount = colcount+1
end
Next
colcount = colcount + 1

Redim sa(rows.Ubound,colcount-1)
For r as integer = 0 to rows.Ubound
dim row as String = rows®
Imbedding = false
dim curcount as integer = 0
dim e as String
For i as integer = 1 to row.Len
if row.mid(i,1) = chr(34) then
if Imbedding and row.mid(i+1,1) = chr(34) then //Imbedding and we have “” to escape a "
e = e+""""
i = i+1 //Jump the extra double quote "
else
imbedding = not Imbedding
end
elseif row.mid(i,1) = deliminator and not imbedding then
sa(r,curcount) = e
e = “”
curcount = curcount + 1
if curcount = colcount then exit
elseif row.mid(i,1) = “” then
i = i + 1
e = e+row.mid(i,1)
else
e = e+row.mid(i,1)
end
if i = row.Len then //reached the end early
sa(r,curcount) = e
exit
end
Next
Next
return sa
End Function
[/code]

Converting one line into an array of strings is some work and requires a lot of thinking.
The plugin offers it Brock just posted another code version.
If you can parse one row you can loop all lines in a file.
Put it in some data structure and sort it.

MsgBox Listbox1.Cell(2,3)

[quote=179159:@Axel Schneider] MsgBox Listbox1.Cell(2,3) [/quote]

That didn’t seem to work with your example you posted.

I put that code in after everything on the btn1 action event and it just came up with a blank message box.

Perhaps I am doing something incorrectly.

[quote=179141:@Brock Nash]Here’s a method I wrote a while back that converts an CSV String into an array. You can then easily get row 2 column 3 for example. It also handles quoted CSV files:

[code]Protected Function csvToArray(csv as string, deliminator as string = “”) As string(,)
dim sa(-1,-1) as string
csv = ReplaceLineEndings(csv,Chr(13))
dim rows() as string = csv.Split(Chr(13))
if rows.Ubound = -1 then return sa
//Clean out Blank Rows
For i as integer = 0 to rows.Ubound
if rows(i).Trim= “” then
rows.Remove(i)
i = i - 1
end
Next

//Determine Column Count
Dim cols as string = rows(0)
if deliminator = “” then deliminator = if(cols.CountFields(",") < cols.CountFields(chr(9)), chr(9), “,”)

dim Imbedding as Boolean = false
dim colcount as integer = 0
For i as integer = 1 to cols.Len
if cols.mid(i,1) = chr(34) then
if Imbedding and cols.mid(i+1,1) = chr(34) then //Imbedding and we have “” to escape a "
i = i+1 //Jump the extra double quote "
else
imbedding = not Imbedding
end
elseif cols.mid(i,1) = deliminator and not imbedding then
colcount = colcount+1
end
Next
colcount = colcount + 1

Redim sa(rows.Ubound,colcount-1)
For r as integer = 0 to rows.Ubound
dim row as String = rows®
Imbedding = false
dim curcount as integer = 0
dim e as String
For i as integer = 1 to row.Len
if row.mid(i,1) = chr(34) then
if Imbedding and row.mid(i+1,1) = chr(34) then //Imbedding and we have “” to escape a "
e = e+""""
i = i+1 //Jump the extra double quote "
else
imbedding = not Imbedding
end
elseif row.mid(i,1) = deliminator and not imbedding then
sa(r,curcount) = e
e = “”
curcount = curcount + 1
if curcount = colcount then exit
elseif row.mid(i,1) = “” then
i = i + 1
e = e+row.mid(i,1)
else
e = e+row.mid(i,1)
end
if i = row.Len then //reached the end early
sa(r,curcount) = e
exit
end
Next
Next
return sa
End Function
[/code][/quote]

And then with it in the array, how would I find the individual cell?

add thid to the listbox in the example

CellClick

if IsContextualClick then Return False MsgBox me.cell(row , column)

ConstructContextualMenu

base.append new MenuItem ("show Cell Text")

ContextualMenuAction

[code]Dim xValue As Integer
xValue = System.MouseX - Me.Left - Self.Left

Dim yValue As Integer
yValue = System.MouseY - Me.Top - Self.Top

Dim row, column As Integer
row = Me.RowFromXY(xValue, yValue)
column=Me.ColumnFromXY(xValue, yValue)

if hitItem.Text = “show Cell Text” then
MsgBox me.cell(row , column)
return True
end if[/code]

Then click with left or right mouse button in a cell

[quote=179192:@Axel Schneider]add thid to the listbox in the example

CellClick

if IsContextualClick then Return False MsgBox me.cell(row , column)

ConstructContextualMenu

base.append new MenuItem ("show Cell Text")

ContextualMenuAction

[code]Dim xValue As Integer
xValue = System.MouseX - Me.Left - Self.Left

Dim yValue As Integer
yValue = System.MouseY - Me.Top - Self.Top

Dim row, column As Integer
row = Me.RowFromXY(xValue, yValue)
column=Me.ColumnFromXY(xValue, yValue)

if hitItem.Text = “show Cell Text” then
MsgBox me.cell(row , column)
return True
end if[/code]

Then click with left or right mouse button in a cell[/quote]
I edited the program as you suggested. Still have issues.

With this file:

https://www.dropbox.com/s/fx0fkr5kvsmvejg/Example.csv?dl=0

I got the this read out from the example:

https://www.dropbox.com/s/r796vbqeoejm9i8/Screen%20Shot%202015-04-08%20at%201.46.32%20PM.png?dl=0

Which displayed this message box:

https://www.dropbox.com/s/jx6ewqw4aspcopg/Screen%20Shot%202015-04-08%20at%201.46.46%20PM.png?dl=0

Sorry brother. I appreciate all of your help!

your splitter is comma, type a comma in the splitter field

if you want to post a picture here, you can change the dropbox url

remove ?dl=0 at the end
and replace www with dl
(‘https://dl.dropbox.com/s/r796vbqeoejm9i8/Screen%20Shot%202015-04-08%20at%201.46.32%20PM.png’)

[quote=179197:@Axel Schneider]your splitter is comma, type a comma in the splitter field

[/quote]

Duh. Embarrassing. Thank you.

[quote=179184:@Meade Lewis]
And then with it in the array, how would I find the individual cell?[/quote]

dim results(,) as string = csvToArray(csvString) dim cell as string = results(2,3) //OR dim row as integer = 2 dim column as integer = 3 dim anotherCell as string = results(row, column)

[quote=179043:@Axel Schneider]it does not create columns (loading Workbook1.tab)
Xojo 2013

Xojo 2015 (additional trouble with encoding? )
[/quote]

OK
That code is doing the right thing
Everything enclosed in a [ ] IS one field from the CSV
Fun part is a CSV can have differing numbers of columns on each row :stuck_out_tongue: (yay !!)
It just concatenates the fields into one row
It doesn’t add a column for each field to the list box (although that’d be pretty simple

Change the code in the db like parser button to

  f = GetOpenFolderItem("special/any")
  
  if f is nil then return
  
  c = new CSVRecordSet(f)
  
  while c.EOF <> true
    s = ""
    
    listbox1.ColumnCount = max(c.FieldCount, ListBox1.ColumnCount)
    
    listbox1.addrow ""
    
    for i = 1 to c.FieldCount
      dbf = c.idxField(i)
      s = "[" + ReplaceLineEndings(dbf.StringValue,"<CR>") + "]"
      listbox1.Cell(listbox1.LastIndex,i-1) = s
    next
    
    c.MoveNext
    
  wend
  
  c.Close

A big plug for Norm’s CSV parser! I needed one earlier this week, happened across it in another forum post, tried it using Xojo2015r1, and it works great (as Norm’s code tends to do).

My CSV data has edge cases such as: quotes around some but not all fields, fields with embedded quotes and newlines, etc. About the only edge case not represented is variable fields per row. Norm’s parser swallows it all without a hiccup.

Prior to this I tried opening the file with Excel, which was great at parsing it, but terrible at saving it in a clean form. No such issues with Norm’s parser. Many thanks, and well done sir!

(For folks looking at the sample project, you don’t need everything in there. I just copied/pasted the CSVParser class into my project, implemented the Headers and NewLine events more or less as shown in the sample project, and fired it off as shown in PushButton1’s Action event. Easy.)

This is the code that I actually went with. It is an adjusted version of many of the examples that were given and checks for some errors that may occur. Axel really helped with his example so shout out to him!

Thanks y’all!

Some code has been removed to protect identity.

[code] dim textstring as string
dim tis as TextInputStream = TextInputStream.Open(openfile)
ListBox2.DeleteAllRows

while not tis.EOF
textstring = tis.ReadLine(encodings.UTF8)
ListBox2.AddRow
ListBox2.Cell(ListBox2.LastIndex,-1) = textstring.ReplaceAll( “,”, chr(9))
wend

if listbox2.cell(0,0) <> “" or listbox2.cell(0,1) <> "" or listbox2.cell(0,2) <> "" then
ListBox2.DeleteAllRows
MsgBox("
___”)
return
end

checkopenfields

// count number of Rows
If listbox2.listcount + openfield > 38 then
MsgBox("______________________________")
end

dim rows as integer = listbox2.ListCount

if rows < 2 then
MsgBox("_________________________________")
end

ComparePressures(rows)

rows = listbox2.ListCount

if rows < 2 then
return
end

PSIField.Text = listbox2.cell(1,6) //Pressure

if rows >= 2 then
textfield50.text = listbox2.cell(1,3).uppercase // Customer
textfield1.text = listbox2.cell(1,4).uppercase // Location
textfield2.text = listbox2.cell(1,0).uppercase //Job Number
textfield62.text = listbox2.cell(1,1).uppercase //Unit Number
TextField3.text = listbox2.cell(1,2).uppercase //Serial Number
TextField4.text = listbox2.cell(1,7).uppercase+" “+ listbox2.cell(1,5).uppercase+” “+ Listbox2.cell(1,8).uppercase+” “+ Listbox2.cell(1,10).uppercase+” “+ ListBox2.Cell(1,9).uppercase+” “+ ListBox2.Cell(1,11).uppercase+” “+ ListBox2.Cell(1,12).uppercase+” “+ ListBox2.Cell(1,13).uppercase+” “+ ListBox2.Cell(1,14).uppercase+” “+ ListBox2.Cell(1,15).uppercase+” "+ ListBox2.Cell(1,16).uppercase
end

[/code]

Release 2015r2 won’t compile Norm’s CSV parser. It complains that BinaryStream does not have a “ReadByte” member.

I changed all occurrences to ReadUInt8 and it seems to work fine.

But I’m surprised that I get an error, because I don’t see ReadByte on any list of deprecations, nor in the 2015r2 Release Notes. And it appears in the online docs as usual.

Is this a bug?

BinaryStream.ReadLong, ReadShort, ReadByte, WriteLong, WriteShort and WriteByte were actually deprecated back in 2006. That’s before the now-centralized deprecation page lists- but were only fully removed recently.

Aha! You are correct – “recently” being with the release of 2015r2 this week.

2015r1 compiles and runs all those methods without complaint, but if I Analyze Project, it lists them as deprecated and directs ReadInt8 (not ReadUInt8) as a replacement for ReadByte.

Thanks for the info.

I did say that code was old
Updated
Not bad - update things once every 8 years or so :stuck_out_tongue:

To load that project with 2015r1, I had to change its file extension from rb to rbp (else it crashed Xojo (infinite loop ?).

I get a consistent crashes when I click in the “stress test db parser” button.

At last, but it is certainly my fault (too stressed at the momment ?), I do not understand the used code / results… I must look things when I am cool, calm, no stress around and so on.

Update works great for me in 2015r2. Thanks Norm, see you in 2023! :slight_smile:

Wow, 2023 already!

Hey Norm, the CSV Parser isn’t happy with my UTF8 input file. (It has ® characters that get garbled.)

I think this is because the parser uses BInaryStream.ReadUInt8 to get a byte at a time. But UTF8 is a multibyte character set. So I think what I want is a way to read one character at a time, using a specific encoding (UTF8 in my case).

Is there such a thing? TextInputStream.Read has Count and Encoding parameters, but the docs say that Count is a byte count, not a character count. If that is a doc error, and the parameter is really a character count, then perhaps I can simply open the file as a TextInputStream and replace each call to BInaryStream.ReadUInt8 with a call to TextInputStream.Read(1, Encodings.UTF8).

Except that TextInputStream has no Position call, which CSVParser uses a bunch. Hmm…

Thoughts?