Trying to get my head around the new framework

I’m having trouble converting RB code to the new Xojo framework. We have been using our own file format for 10+ years (so changing it is out of the question; we might as well change programming languages in that case).

With the old framework and the string/memoryblock we could easily write/parse our files.

With the new framework, besides being overly cumbersome, I’m not getting the results we want.

The following code does not make sense on its own, but it tries to demonstrate our problem. I can’t post the real code as it is part of a huge engine and proprietary.

Situation:

' old RB string where different delimiters are used, like the $ sign and chr(23)
  Dim OldString as string = "$Code1$Naam1$0" + chr(23) + "1" + chr(23) + "2" + chr(23) + "3"+ "$Code2$Naam2$0" + chr(23) + "3" + chr(23) + "2" + chr(23)  + "1"
  
' new framework Text version of chr(23), I think
  Dim NewChr23 as Text = chr(23).ToText
  Dim NewChr23MB as xojo.Core.MemoryBlock = Xojo.Core.TextEncoding.UTF8.ConvertTextToData(NewChr23)
  
' convert the old RB string to the new frameworks Text
  Dim NewText as Text = OldString.ToText
  Dim NewMB as xojo.Core.MemoryBlock = Xojo.Core.TextEncoding.UTF8.ConvertTextToData(NewText)
  
' search in the memoryblock for the chr(23)
  Dim i as UInteger
  i = NewMB.IndexOf(0,NewChr23MB)
  
' get the result, minus 1 to get also the first char
  Dim NewResultMB as xojo.Core.MemoryBlock = newMB.Mid(i-1,  7)
' convert the memoryblock to Text
  Dim NewResult as Text = xojo.Core.TextEncoding.UTF8.ConvertDataToText(NewResultMB)
  
' Remove all the chr(23) chars
  Dim NewReplaced as Text = NewResult.ReplaceAll(NewChr23, "")

Up to the last line, the code seems to do what it needs to do and produces a Text containing “0” + chr(23) + “1” + chr(23) + “2” + chr(23) + “3”.
After running the ReplaceAll, NewReplaced becomes empty.

Questions:

  1. Why does the last statement not work and how could it be resolved?
  2. Occasionally, we noticed that the result of NewMB.IndexOf() is some chars off when we use it in the newMB.Mid(i-1, 7) statement. It is very hard to reproduce, and in this example it seems to work correctly. However, I suspect it has something to do with the UTF8 encoding and probably accented chars. If this is the case, how can we handle this?

Thanks for looking into it.

NewReplaced is 0123 here.

  Dim NewText as Text = TextOldString.ToText
  Dim i As UInteger = NewText.IndexOf(&u017)
  Dim NewResult as Text = NewText.Mid(i - 1, 7)
  Dim NewReplaced as Text = NewResult.ReplaceAll(&u017, "")

Before doing TextOldString.ToText it might be necessary to do the following, depending on the encoding:

TextOldString = TextOldString.DefineEncoding(Encodings.SomeEncoding).ConvertEncoding(Encodings.UTF8)

Thanks @Eli Ott for your feedback. Weird that you’re getting the desired output while on my machine it is empty (I’m running it on Windows 64-bit).

The encoding of OldString is already UTF8 according to the debug window.

I also tried your code, but the result of IndexOf(&u017) is 0.

I know this looks weird going through the memoryblock in my code, but as mentioned, this is only mockup code to simulate what we need to do in the real thing. From what I understand of the new framework, it should work as I did it, however ridiculous it may look in the sample I gave :slight_smile:

The result of IndexOf(&u017) is 14 in my case; it should be the same for you.

The new Text data type is better suited to your case than a MemoryBlock because you don’t have to check for diacritics and similar stuff. Text will do that automatically.
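
To illustrate the diacritics point, here is a minimal sketch using only the calls already shown in this thread. The variable names (Accent, Marker, Sample, CharIndex, ByteIndex and so on) are made up for the example, and it is untested on Windows, where IndexOf was reported above to misbehave. It compares a character offset taken from Text with a byte offset taken from the UTF-8 MemoryBlock of the same data:

```xojo
' Sketch only: hypothetical names, not code from the real engine.
Dim Accent As Text = &uE9   ' é, one character but two bytes in UTF-8
Dim Marker As Text = &u17   ' same code point as chr(23)
Dim Sample As Text = Accent + "0" + Marker + "1"

' Character-based search on the Text itself
Dim CharIndex As Integer = Sample.IndexOf(Marker)

' Byte-based search on the UTF-8 bytes of the same Text
Dim SampleMB As Xojo.Core.MemoryBlock = Xojo.Core.TextEncoding.UTF8.ConvertTextToData(Sample)
Dim MarkerMB As Xojo.Core.MemoryBlock = Xojo.Core.TextEncoding.UTF8.ConvertTextToData(Marker)
Dim ByteIndex As Integer = SampleMB.IndexOf(0, MarkerMB)

' Because é occupies two bytes, ByteIndex is expected to be one larger
' than CharIndex. Using a byte offset as a character offset (or the
' other way round) would therefore land one position off.
```

If that is what happens in the real code, each offset should only be used with the type it came from: byte offsets with MemoryBlock.Mid, character offsets with Text.Mid.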

I sure hope so :slight_smile:

This code on Windows 64-bit:

  Dim OldString as string = "$Code1$Naam1$0" + chr(23) + "1" + chr(23) + "2" + chr(23) + "3"+ "$Code2$Naam2$0" + chr(23) + "3" + chr(23) + "2" + chr(23)  + "1"
  
  Dim NewText as Text = OldString.ToText
  Dim i As UInteger = NewText.IndexOf(&u017)
  Dim NewResult as Text = NewText.Mid(i - 1, 7)
  Dim NewReplaced as Text = NewResult.ReplaceAll(&u017, "")
  
  MsgBox NewReplaced

Result:

Can’t help you further, as I don’t have any Windows at hand currently and string encoding isn’t my strength…

Thanks for checking it out for me @Eli Ott! It may prove there is something wrong on the Windows side. Xojo techs to the rescue :wink:

Not sure how you read the file with the old framework, but reading as text, converting to bytes, then converting back to text seems like a lot of work and could be introducing conversion issues.
That said, I do not get i = 0 with your code.
I see it as 14, and the only issue I encounter seems to be related to the last line’s ReplaceAll.
I’d file a bug report with this code as the sample.

@Norman Palardy Thanks for posting the report. Yes, I know it’s a bit of a stretch what I’m doing here :slight_smile: Once we find the time to convert everything to the new framework, the code will be a lot cleaner. We’re using some of our own string methods based on memoryblocks, and I’m trying to find some quick ways to convert them to the new framework without having to redevelop those functions.

And indeed, everything seems to work well, except IndexOf() and the last ReplaceAll().

I apologize for not having time to go through the entire thread, and I know that this is just some test code, but I’d like to mention it could be simplified by using &u17 in place of chr(23). Using the Unicode literal allows the compiler to do constant folding of the string, avoiding having to do any work at runtime.

@Joe Ranieri thanks for the tip. An old GW-Basic habit I can’t seem to get rid of when I’m writing some quick snippets in any Basic language :slight_smile: At some point I used to have a module full of constants like ConstChr23 when I needed that extra millisecond, but &u17 looks much cleaner