Stumped by a Customer's Result...

Hey all,

I was working recently with a customer who was getting exceptions thrown from the “.ToText” function, which I use to convert a String to Text. One of the things my software does is connect to specific devices on the local network and read information from them, such as the firmware version. The firmware version is always of the format A6.5.5 or something like that, so why that would cause an exception in .ToText was a mystery to me. I put some Try/Catch blocks in the code and had the software email me whenever the exception occurs, along with whatever is in the “FirmwareVersion” property that is causing the issue. What I got back just doesn’t make sense and I can’t see how it could possibly happen.

There is one single place in my code where I update the FirmwareVersion property of the device being read. It’s in a method, triggered by a timer from a TCPSocket. I parse the data with regular expressions looking for patterns like firmware version, serial port baud rates, and other device info. Here is the code I use to look for the firmware version:

```
rx.SearchPattern = "(?msi-U)^([A-Z]\x20?(?:\d+\.){1,2}\d*\x20?[A-Za-z]*.*)\R+</pre>"

match = rx.Search(MySockData)

If match <> Nil Then
  FirmwareVersion = match.SubExpressionString(1)
  Dim fw() As String = FirmwareVersion.Split(EndOfLine)
  FirmwareVersion = fw(0)
  FirmwareVersion = Trim(FirmwareVersion)
End If
```

Now, if you look at the code, whatever the regex picks up goes into an If/Then block that splits it on line endings, so you should end up with a one-line string that has been trimmed of any white space. I don’t see how you could get anything else.

But this is what is being reported back to me by the app at the customer:

```
A6.4.12
</pre>

       <!-- Screen Capture

       ----------------------------------------------------------------------------------->
       <fieldset>
               <legend style="font-weight:bold;">Unnamed 2G Transmitter  - Image Pull &trade;</legend>
               <img src="pull.bmp" width=320px height=180px style="margin:auto;display:block;"></img>
               <br />
               <button style="margin:auto;display:block;" onClick="window.location.reload(true)">Refresh</button>
       </fieldset>/usr/local/bin # astparam dump
CRC = 0xF4926749
default_gateway=192.168.200.25
web_ui_cfg=nevwaus
pull_on_boot=n
soip_type2=y
soip_guest_on=y
hdcp_always_on=y
no_usb=n
en_video_wall=y
seamless_switch=y
edid_rom_mode=HDMI
ip_mode=static
ipaddr=192.168.30.26
netmask=255.255.255.0
gatewayip=192.168.30.25
/usr/local/bin # astparam r soip_guest_on
n/usr/local/bin # astparam r soip_type2
y/usr/local/bin # astparam g soip_guest_on
```

-snip-

There’s actually a whole lot more that I’ve cut out.

When I use Kem’s RegExRX to test this, the expression works fine and I get the A6.4.12 back as SubExpression #1. But even if I got more than that, there’s a gazillion line endings in this, so why would the code in the If/Then block not give me just the first line? It seems 100% impossible. And there is nowhere else in the code where this FirmwareVersion property gets set. Nowhere.

And I’ve only had this reported from this one customer.

App was compiled with Xojo 2017R3.1 and is running on Windows 7 in 64 bit.

And this doesn’t happen all the time. Only randomly it seems. I’ll get a bunch of emails all at once.

I’m not at my dev desk currently, but I’d suggest first replacing all line endings with Unix line endings, for example.
I assume the Split is not recognizing the line endings as such.

a/ Would ReplaceLineEndings help?
(What style of EndOfLine are you getting fed here?)

Does the version always appear before the `</pre>`?
b/ What about (I know this is simplistic)
[code]dim interim as string = match.SubExpressionString(1)
FirmwareVersion = trim(mid(interim, instr(interim, "</pre>") - 9, 9))[/code]

The line endings should all be Unix line endings as the devices we connect to run Linux.

The same sometimes happens in one of my apps while I process the reply of a DSL LineCard via SSH. Then I started to replace all line endings with Unix line endings and the issues went away.

Since that day, I always replace all line endings in text I receive from “outside of my app” before I process the text further.

[quote=399752:@Jeff Tullin]a/ Would ReplaceLineEndings help?
(What style of EndOfLine are you getting fed here?)

Does the version always appear before the `</pre>`?
b/ What about (I know this is simplistic)
[code]dim interim as string = match.SubExpressionString(1)
FirmwareVersion = trim(mid(interim, instr(interim, "</pre>") - 9, 9))[/code][/quote]

So version is obtained by printing out an html file on the device:

```
/usr/local/bin # cat /www/version.html
<pre>
Thu, 01 Mar 2018 12:26:17 +0800
796406601 204988 u-boot_c.bin
4103912027 3136240 uuImage
2700342294 13731840 initrd2m
A6.5.5
</pre>

        <!-- Screen Capture

        ----------------------------------------------------------------------------------->
        <fieldset>
                <legend style="font-weight:bold;">Unnamed 3G Receiver  - Image Pull &trade;</legend>
                <img src="pull.bmp" width=320px height=180px style="margin:auto;display:block;"></img>
                <br />
                <button style="margin:auto;display:block;" onClick="window.location.reload(true)">Refresh</button>
/usr/local/bin #
```

I just did this to one of mine. So that is what you get. Always, every time. Now, when I connect to the device, I send a number of commands one right after another to get other settings, etc. Then I let the parsing routine sort it all out. I have never had this sort of a problem with it in all of my testing. Just at this one customer. And regardless, the code should strip out anything.

And technically, the whole line ending thing should never come into play anyhow. If the RegEx gets the right value, it should never be anything but just the correct firmware value. I just added the other stuff as a “belts and suspenders” type of thing.

If you take my RegEx and the output above and put it into RegExRX, sub-expression 1 will be A6.5.5

[quote=399755:@Sascha S]The same sometimes happens in one of my apps while I process the reply of a DSL LineCard via SSH. Then I started to replace all line endings with Unix line endings and the issues went away.

Since that day, I always replace all line endings in text I receive from “outside of my app” before I process the text further.[/quote]
It’s a good idea. I’m still not sure how I am even getting to that point though!

easy :smiley:

First:

Dim s As String
s = ReplaceLineEndings(FirmwareVersion, EndOfLine.Unix)

Later:

 Dim fw() As String = s.Split(EndOfLine.Unix)

Docs: http://documentation.xojo.com/index.php/ReplaceLineEndings

never assume

[quote=399758:@Sascha S]easy :smiley:

First:

Dim s As String
s = ReplaceLineEndings(FirmwareVersion, EndOfLine.Unix)

Later:

 Dim fw() As String = s.Split(EndOfLine.Unix)

Docs: http://documentation.xojo.com/index.php/ReplaceLineEndings[/quote]

Yes, I know how to do that. I think you misunderstood me. Given the RegEx and the way the data comes in, I’m not sure why I even need to worry about EndOfLine characters at all. Given that I have a specific grouping and substring in the RegEx, it should almost never end up being anything but a few characters.

But yours is still a good idea and something I will implement.

Nobody has mentioned Encoding - could it be a string encoding problem?

Unlikely in this case. And even if the encoding is unknown (Nil), the non-ASCII symbols are HTML-encoded entities in his example.

But I am just assuming again; I do not know for sure, and you may be right @Michael Diehr :slight_smile:

I don’t think it’s encoding. If it was, then there would be an issue with the RegEx being matched. And I’m confident it’s not as the device I am reading from is a known entity and I’ve already set the encoding type when I read the data from the TCPSocket.

Another silly suggestion, and more in keeping with what you are seeing here:

XML as a format doesn’t need the end of lines, or the white space.
Is it possible there is NO EndOfLine in the data supplied?
The XML would still work, but your code would fail.

[code]A6.4.12</pre><!-- Screen Capture  ----------------------------------------------------------------------------------->       <fieldset>[/code]

[quote=399781:@Jeff Tullin]Another silly suggestion, and more in keeping with what you are seeing here:

XML as a format doesn’t need the end of lines, or the white space.
Is it possible there is NO EndOfLine in the data supplied?
The XML would still work, but your code would fail.

[code]A6.4.12</pre><!-- Screen Capture  ----------------------------------------------------------------------------------->       <fieldset>[/code][/quote]

But I am not reading XML format or doing anything like that. I'm just reading output from a telnet connection. The fact that I have to read an HTML file with some HTML code in it is secondary. And there are line endings.

And again, I repeat: this code works 100% of the time for me and I've not had any issues anywhere else. Maybe it's some oddball line ending thing, but if you look at the regex, the only way you'd have anything other than the firmware version I want is if something gets between the firmware version and the `</pre>` item. I suppose that could happen but the output doesn’t show that.

I’m pretty sure I had a similar issue at one point and it was due to splitting on EndOfLine. I had to do as others are suggesting: replace line endings with a particular line ending, say UNIX, and then split on EndOfLine.UNIX.

EndOfLine is a class and the string split method requires a string for the delimiter.

I think I have figured this out. I do believe it is a combination of line endings and a problem with my regular expression. I was looking more at the data that was received and put into the property. I realized that my code would remove the very last `</pre>` from the string as part of the expression. I took all the data, put it into RegExRX and added a return and `</pre>` at the end (this is not everything but enough to show the point):

```
A6.4.12
</pre>

       <!-- Screen Capture

       ----------------------------------------------------------------------------------->
       <fieldset>
               <legend style="font-weight:bold;">Unnamed 2G Transmitter  - Image Pull &trade;</legend>
               <img src="pull.bmp" width=320px height=180px style="margin:auto;display:block;"></img>
               <br />
               <button style="margin:auto;display:block;" onClick="window.location.reload(true)">Refresh</button>
       </fieldset>/usr/local/bin # astparam dump
CRC = 0xF4926749
default_gateway=192.168.200.25
web_ui_cfg=nevwaus
pull_on_boot=n
soip_type2=y
soip_guest_on=y
hdcp_always_on=y
no_usb=n
en_video_wall=y
seamless_switch=y

<- SNIP - >

AUD_DLY_READ_CURRENT
CEC_guest
/usr/local/bin # cat /www/version.html
<pre>
Thu, 12 Jan 2017 10:41:09 +0800
1664032386 135992 u-boot_h.bin
2750015725 2375376 uuImage
2772483838 9277440 initrd2m
A6.4.12
</pre>
```

So in this case, the RegEx grabs the entire dump from the TCPSocket. It works fine on a Mac because EndOfLine on a Mac is the Unix line ending. EndOfLine on Windows is not, so the Split on Windows never finds a line break to split on! I’ll need to use EndOfLine.Unix in the Split and, just for safety’s sake, convert the line endings to Unix first. I’ll also work on fixing the RegEx.
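A minimal sketch of that platform difference, assuming the device sends LF-only text. The sample string and variable names are made up for illustration and are not taken from the app:

```xojo
// Assumed sample: LF-only text, as a Linux device would send it.
Dim raw As String = "A6.4.12" + Chr(10) + "</pre>" + Chr(10) + "<fieldset>"

// On Windows, EndOfLine is CR+LF. That sequence never occurs in raw,
// so Split finds no delimiter and returns the whole string as one element.
Dim crlf As String = Chr(13) + Chr(10)
Dim parts() As String = Split(raw, crlf)
// parts.Ubound is 0 and parts(0) is the entire dump.

// Normalizing first and splitting on an explicit ending does not
// depend on the platform the app runs on.
Dim normalized As String = ReplaceLineEndings(raw, EndOfLine.Unix)
Dim lines() As String = Split(normalized, EndOfLine.Unix)
Dim firmware As String = Trim(lines(0))
```

Spelling out `EndOfLine.Unix` in both calls is the point here: the bare `EndOfLine` changes meaning with the platform, while the explicit constant does not.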

I just had a second customer install report this so it’s not unique to one customer any longer…

Changing the RegEx to:

[code]^([A-Z]\x20?(?:\d+\.){1,2}\d*\x20?[A-Za-z]*.*)\R+</pre>[/code]

solves the issue.
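For anyone landing here later, a rough sketch of how the two fixes could fit together. This is only an illustration, not the code from the app. It assumes the corrected pattern is the original one without the `(?msi-U)` prefix and still ending in a literal closing `pre` tag, and that the RegEx options are left at their defaults. The likely reason the prefix matters is that the `s` flag lets `.` match line breaks, so a greedy `.*` can run all the way to the last `</pre>` in the buffer. The function name is hypothetical.

```xojo
// Hypothetical helper; name and signature are illustrative only.
Function ExtractFirmwareVersion(sockData As String) As String
  // Normalize whatever arrived to LF so every later step
  // behaves the same on Windows, macOS and Linux.
  Dim normalized As String = ReplaceLineEndings(sockData, EndOfLine.Unix)

  Dim rx As New RegEx
  // No (?s) prefix, so "." is not asked to match across line breaks.
  rx.SearchPattern = "^([A-Z]\x20?(?:\d+\.){1,2}\d*\x20?[A-Za-z]*.*)\R+</pre>"

  Dim match As RegExMatch = rx.Search(normalized)
  If match Is Nil Then Return ""

  // Belt and suspenders: keep only the first line of the capture.
  Dim lines() As String = Split(match.SubExpressionString(1), EndOfLine.Unix)
  If lines.Ubound < 0 Then Return ""
  Return Trim(lines(0))
End Function
```

Returning an empty string when nothing matches leaves it to the caller to decide whether to keep the previous FirmwareVersion value or report the failure.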