Parse data from TCP response


I’ve just started using Xojo and am trying to work out the best way to evaluate the response I get from a video player. When I issue the “get clips” command to it, I receive the following response:

205 clips info:chr(13)chr(10)clip count: 1chr(13)chr(10)1: file 00:00:00:00 00:00:30:00chr(13)chr(10)chr(13)chr(10)

I need to get the number of clips, the clip id (in this case 1:), the complete filename and then the values 00:00:00:00 and 00:00:30:00. These will ultimately go into a listbox.

I guess I need to use some kind of regex here, and I have tried using an online generator to make sure I was getting the right pieces, but when I copied the regex into Xojo it didn’t work. Do I need to create a regex to capture the whole response, so (205 clips info:)(.*), and then run another regex on the result, or can I do it all in one pass?



Try this:

dim cnt as integer
dim id as string
dim name as string
dim startTime as string
dim endTime as string

dim rx as new RegEx
rx.SearchPattern = "(?mi-Us)clip count: (\d+)\R(\d+): (.*) ((?:\d{2}:){3}\d{2}) ((?:\d{2}:){3}\d{2})"

dim match as RegExMatch = rx.Search( sourceText )
if match <> nil then
  cnt = val( match.SubExpressionString( 1 ) )
  id = match.SubExpressionString( 2 )
  name = match.SubExpressionString( 3 )
  startTime = match.SubExpressionString( 4 )
  endTime = match.SubExpressionString( 5 )
end if
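
When clip count is greater than 1, the pattern above only captures the first clip line. Here is a minimal sketch of one way to pick up every clip line, assuming each line has the same "id: name start end" shape as the sample. ClipList is a hypothetical ListBox with four columns, not something from the original post:

```xojo
' Sketch only: ClipList is a hypothetical four-column ListBox.
dim rxClip as new RegEx
rxClip.SearchPattern = "(?mi-Us)^(\d+): (.*) ((?:\d{2}:){3}\d{2}) ((?:\d{2}:){3}\d{2})"

dim clipMatch as RegExMatch = rxClip.Search( sourceText )
while clipMatch <> nil
  ' One row per clip: id, file name, start time, end time
  ClipList.AddRow( clipMatch.SubExpressionString( 1 ) )
  ClipList.Cell( ClipList.LastIndex, 1 ) = clipMatch.SubExpressionString( 2 )
  ClipList.Cell( ClipList.LastIndex, 2 ) = clipMatch.SubExpressionString( 3 )
  ClipList.Cell( ClipList.LastIndex, 3 ) = clipMatch.SubExpressionString( 4 )

  ' Calling Search with no arguments continues after the previous match
  clipMatch = rxClip.Search
wend
```

The pattern deliberately has no $ at the end, since the lines end in chr(13)chr(10) and the chr(13) would sit between the last digit and the line break.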

Amazing, thanks Kem, that works a treat. I’d managed to fumble around and get quite close, but using (?mi-Us) at the beginning fixed it. I’ve now managed to work my way through all of the other options I needed and get the results I need, except one.

Sometimes the same header is returned but not all the data, so for example I get this back from one query:

508 transport info:
status: play
speed: 100

but I can also get

508 transport info:
status: play
speed: 100
loop: true

or just

508 transport info:
loop: true

I’ve built [code]"508 transport info:\Rstatus: (\w+)\Rspeed: (\d+)\Rloop: (\w+)"[/code] which works for the middle option when all values are there, but doesn’t for the other two. I can’t just reference them directly as status: is also used in other situations. Do I need to build a regex for every possible combination or is there another method of getting the values out into my variables?

This works. Note that I turned on free-space mode to make editing easier in RegExRX so the spaces have to be represented by their code \x20:


The structure [code](?:...)[/code] forms a non-capturing group, and ? makes something optional. Using the two, I made every line after the first optional. You just have to see if there is a match in the groups that represent the value you want. For example, in the first example, there will only be matches in SubExpressionString( 0 ) thru ( 2 ). In the second, you will get all three groups, but in the last, there will be nothing in SubExpressionString( 1 ) and ( 2 ).
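
To illustrate the optional non-capturing groups described above, here is a sketch of what such a pattern could look like. It is written without free-space mode, so plain spaces are used instead of \x20, and it is not necessarily identical to the pattern copied from RegExRX:

```xojo
dim rx as new RegEx
' Each line after the header is wrapped in (?: ... )? so it may be absent.
' Group 1 = status, group 2 = speed, group 3 = loop.
rx.SearchPattern = "(?mi-Us)508 transport info:(?:\Rstatus: (\w+))?(?:\Rspeed: (\d+))?(?:\Rloop: (\w+))?"

dim TransportMatch as RegExMatch = rx.Search( sourceText )
```

Because the header is the only required part, this matches all three of the sample responses; which groups are filled depends on which lines were present.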

The mode block (?mi-Us) is how RegExRX copies the pattern for pasting, and its letters represent options you could otherwise set through rx.Options. I prefer to use modes over rx.Options.
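
For comparison, here is a rough sketch of the same settings made through rx.Options instead of a mode block. The mapping in the comments is my understanding of how the RegExOptions properties line up with the mode letters, so treat it as an assumption and test it against your own data:

```xojo
dim rx as new RegEx
rx.Options.CaseSensitive = false         ' roughly (?i): ignore case
rx.Options.TreatTargetAsOneLine = false  ' roughly (?m): ^ and $ work per line
rx.Options.Greedy = true                 ' roughly (?-U): quantifiers stay greedy
rx.Options.DotMatchAll = false           ' roughly (?-s): . does not match line breaks
rx.SearchPattern = "clip count: (\d+)"
```

The advantage of the mode block is that the pattern carries its own settings, so it behaves the same wherever it is pasted.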

BTW, I copied the pattern as one line, this is how it’s actually written in RegExRX:


That makes it a bit easier to follow.

This is brilliant, thanks Kem. Two days ago I hated regex, now that is all changing. It is so powerful.

How would I now get the data? I’m using

transportState = TransportMatch.SubExpressionString( 1)
transportSpeed = TransportMatch.SubExpressionString( 2)
transportLoop = TransportMatch.SubExpressionString( 3)

to get my values. I get an out of bounds exception if there is no SubExpressionString(3) value. I thought of doing an if statement, but should my results not appear in the same order then this would break.

How should I get the correct match from regex into the correct variable if sometimes there are two matches and sometimes three? Sorry if that’s an obvious question.

The order of the parentheses defines the order of the groups. (There is one exception, but it’s an advanced feature that we’re not working with here.) Use the SubExpressionCount to figure out how many matches there actually are. If match is not nil, there will always be at least 1, the main match, so SubExpressionString( 0 ) will have a value. If there are 2, SubExpressionString( 1 ) will have a value, and so on.

But whether there are 2, 3, or 4, you can rest assured that SubExpressionString( 1 ) will have the status, SubExpressionString( 2 ) will be speed, and SubExpressionString( 3 ) will be loop because that’s the order of the parentheses. If loop is missing, there will only be 3. If loop and speed are missing, there will only be 2, but the order is always preserved.

If the status line is missing, for example, but loop is there, SubExpressionString( 1 ) will be empty. What’s more, SubExpressionStartB( 1 ) will be -1, indicating “no match”, but SubExpressionCount will be 4 because loop was, in fact, matched.

I hope this makes sense…

Thanks Kem,

I’ve gone over it a couple of different ways and this is what I ended up with.

if TransportMatch.SubExpressionStartB(1) >0 then
  transportState = TransportMatch.SubExpressionString(1)
end if
if TransportMatch.SubExpressionStartB(2) >0 then
  transportSpeed = TransportMatch.SubExpressionString( 2)
end if
if TransportMatch.SubExpressionCount =4 then
  transportLoop = TransportMatch.SubExpressionString( 3)
end if

So far it hasn’t broken, but I’m not convinced that it is quite the way you were suggesting.

I’d modify slightly:

    dim matchCount as integer = TransportMatch.SubExpressionCount
    if matchCount > 1 and TransportMatch.SubExpressionStartB(1) >0 then
      transportState = TransportMatch.SubExpressionString(1)
    end if
    if matchCount > 2 and TransportMatch.SubExpressionStartB(2) >0 then
      transportSpeed = TransportMatch.SubExpressionString( 2)
    end if
    if matchCount = 4 then
      transportLoop = TransportMatch.SubExpressionString( 3)
    end if
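
If you end up doing this for several responses, the same checks could be wrapped in a small helper. This is only a sketch; GroupText is a made-up name, and it returns an empty string whenever the group is out of range or did not take part in the match:

```xojo
Function GroupText( m as RegExMatch, index as integer ) As String
  ' No match at all
  if m is nil then return ""
  ' Group index is beyond the last group that matched
  if index >= m.SubExpressionCount then return ""
  ' Group exists but did not participate (-1 means "no match")
  if m.SubExpressionStartB( index ) < 0 then return ""
  return m.SubExpressionString( index )
End Function
```

With that in place the three assignments become transportState = GroupText( TransportMatch, 1 ), transportSpeed = GroupText( TransportMatch, 2 ) and transportLoop = GroupText( TransportMatch, 3 ), and a missing value simply comes back empty.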