String Parsing Question

Hi,

I am wondering what the best way is to search a packet for a specific binary match. I can do it with RegEx; however, I am struggling a bit to find the right InStr, Len, and/or Mid solution.

Say you have the following hex value:
FFFB 01FF FB03 0D0A 4444 44444 FFFB

I can’t seem to find the best methodology to match this pattern without RegEx. Any help would be appreciated.

Thanks!

Why don’t you want to use RegEx? This is precisely what it is designed to accomplish.

I was wondering too.

In any case, you should be able to find it like this:

[code]dim i as integer
dim stringLength as integer
dim findLength as integer

stringLength=Len(sourceString)

findLength=len(stringToFind)

dim foundMatch as boolean

'Mid positions are 1-based, so the loop runs from 1 to the last possible start
for i=1 to stringLength-findLength+1
if mid(sourceString, i, findLength)=stringToFind then
'i is the position of the stringToFind within sourceString
foundMatch=true
exit
end if
next

if foundMatch then
'celebrate
end if[/code]

Thanks Eric! I will give that a go.

I am having problems matching the binary using RegEx.

I am trying this pattern with no success. It works in RegExRx but not with the real packet binary.

(?<=FF)(.*)(?=FF)

Here is my whole RegEx Code snippet BTW.

  While Me.BytesAvailable > 0
    ReceiveBuffer = Me.ReadAll(Encodings.ASCII)
    
    IAC_RegEx = New RegEx
    IAC_RegEx.Options.Greedy = False
    IAC_RegEx.Options.MatchEmpty = True
    IAC_RegEx.Options.StringBeginIsLineBegin = True
    IAC_RegEx.Options.StringEndIsLineEnd = True
    IAC_RegEx.Options.TreatTargetAsOneLine = False
    IAC_RegEx.Options.CaseSensitive = True
    IAC_RegEx.Options.DotMatchAll = False

    IAC_RegEx.SearchPattern = "(?<=FF)(.*)(?=FF)"
     IAC_RegExMatch =  IAC_RegEx.Search(ReceiveBuffer)
    
    if IAC_RegExMatch <> nil then
       IAC_RegEx_HitText = IAC_RegExMatch.SubExpressionString(0)
    end if
  Wend

Ok I found my issue, but I am still struggling with the lookahead/lookbehind a bit.

The following match pattern matches one byte of 0xFF or ASCII 255. I am trying to get this into my pattern of (?<=FF)(.*)(?=FF)

IAC_RegEx.SearchPattern = ChrB(255)

So far this isn’t working.
(?<=ChrB(255))(.*)(?=ChrB(255))

Ok this works and I am almost there. Thanks guys for kicking me in the right direction

IAC_RegEx.SearchPattern = "(?<="+ChrB(255)+")(.*)(?="+ChrB(255)+")" 

It matches correctly but I think I have to Turn on Greedy since it stopped matching at the first match.

With Greedy On it matched more than I wanted (FF’s in the middle).

Kem, in RegExRx (Greedy not checked) it matches all of my patterns properly.

RegExRx Screenshot

But with my Code in Xojo it only matches match #1 on RegExRx. How do I get Xojo to keep matching? :wink:

Thanks again!!

Note that you can use \xFF to represent Chr(255) in your pattern.

But what exactly are you trying to accomplish here? You’ve got a complexish pattern, but your original message says that you were just looking to extract an exact binary pattern from some data. Is there a pattern to the extracted data that you need to match?

For binary data, you should be using the non-unicode InstrB, LenB and MidB functions.
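
As a minimal sketch of that advice, the loop below uses InStrB and LenB to find every occurrence of a byte sequence. The names packet, needle and pos, and the sample bytes, are made up for illustration; a real packet would come from the socket.

```xojo
// Hypothetical sample data: the bytes FF FB 01 FF FB 03 0D 0A
Dim packet As String = ChrB(&hFF) + ChrB(&hFB) + ChrB(&h01) + _
  ChrB(&hFF) + ChrB(&hFB) + ChrB(&h03) + ChrB(&h0D) + ChrB(&h0A)

// The byte sequence to look for (FF FB)
Dim needle As String = ChrB(&hFF) + ChrB(&hFB)

// InStrB is 1-based and returns 0 when nothing is found
Dim pos As Integer = InStrB(1, packet, needle)
While pos > 0
  System.DebugLog("Match at byte " + Str(pos))
  // Continue searching just past the current match
  pos = InStrB(pos + LenB(needle), packet, needle)
Wend
```

With the sample bytes above, this should report matches at byte positions 1 and 4.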

Eric I am searching for all data that starts with 0xFF and ends with 0xFF basically.

Tim, thanks! I had been using ChrB so far in my RegEx searching, and when I had my Mid/Len code I was using MidB and LenB, since I quickly found no matches without them :slight_smile: I am also using Wireshark (sniffer) to verify that the binary constructs are consistent.

Thanks again guys!

This is my current pattern that works on RegExRx but seems to only Match the first instance and stops on Xojo.

Eric, also, if you look at my link above, it is a screenshot of received packets in binary format. Thanks again, as your recommendation of “\xFF” worked and is MUCH easier to write as my pattern text. :slight_smile:

  
  While Me.BytesAvailable > 0
    ReceiveBuffer = Me.ReadAll(Encodings.ASCII)
    
    IAC_RegEx = New RegEx
    IAC_RegEx.Options.Greedy = False
    IAC_RegEx.Options.MatchEmpty = False
    IAC_RegEx.Options.StringBeginIsLineBegin = True
    IAC_RegEx.Options.StringEndIsLineEnd = True
    IAC_RegEx.Options.TreatTargetAsOneLine = True
    IAC_RegEx.Options.CaseSensitive = True
    IAC_RegEx.Options.DotMatchAll = False
    
    IAC_RegEx.SearchPattern = "("+ChrB(255)+").*(?="+ChrB(255)+")........." 
    
    
     IAC_RegExMatch =  IAC_RegEx.Search(ReceiveBuffer)
    
    if IAC_RegExMatch <> nil then
       IAC_RegEx_HitText = IAC_RegExMatch.SubExpressionString(0)
    end if
    
  Wend

That’s how RegEx works: it only finds the first instance of your pattern. However, you can tell it to start at a specific position in the data to find the next instance by using the SearchStartPosition parameter of RegEx.Search:

s="Eric Eric Eric Eric"

r.SearchPattern="Eric"

'This will find the first Eric
firstEricMatch=r.Search(s)

'This will find the next Eric by starting the search at the end of the first match
'(SubExpressionStartB and the start position are both zero-based byte offsets)
secondEricMatch=r.Search(s, firstEricMatch.SubExpressionStartB(0)+LenB(firstEricMatch.SubExpressionString(0)))

Better, put it into a loop:

    while IAC_RegExMatch <> nil
       IAC_RegEx_HitText = IAC_RegExMatch.SubExpressionString(0)
        // do something with the result
       IAC_RegExMatch = IAC_RegEx.Search // Will find the next match
    wend

Ooo, thanks, Kem! I didn’t know about that shortcut.

You just have to be careful with it. Some patterns will yield a zero-width match, so the internal pointer never moves forward. In other words, match is not nil, but SubExpressionString(0).LenB = 0. This is rare (and probably a bad pattern), but if your pattern does that, you’ll be stuck in an endless loop. In those cases, you’ll have to use a formula, similar to the one you outlined, to move the pointer forward.

An example of such a pattern is [code](?<=FF).*(?=FF)[/code] run against the text “FFFF FF12FF”. You will never match “12” because it will never loop past the nothing between the first and second sets of “FF”.

If there is no chance your pattern will do that, you don’t need to worry about it.
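
A minimal sketch of such a guarded loop follows. It assumes single-byte data, so stepping one byte forward is safe. The names source, rx, startPos and hit are hypothetical, and the pattern and text are the example above.

```xojo
Dim source As String = "FFFF FF12FF"

Dim rx As New RegEx
rx.Options.Greedy = False
rx.SearchPattern = "(?<=FF).*(?=FF)"

Dim startPos As Integer = 0 // zero-based byte offset for the next search
Dim match As RegExMatch = rx.Search(source, startPos)

While match <> Nil
  Dim hit As String = match.SubExpressionString(0)

  // Byte offset just past the current match
  startPos = match.SubExpressionStartB(0) + LenB(hit)

  If LenB(hit) = 0 Then
    // Zero-width match: step one byte so the search cannot stall
    startPos = startPos + 1
  Else
    // do something with hit
  End If

  // Stop once the offset runs past the end of the data
  If startPos > LenB(source) Then Exit

  match = rx.Search(source, startPos)
Wend
```

Passing the start position explicitly on every call, rather than relying on the parameterless Search, is what lets the loop move past an empty match.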

Guys that is exactly what I was missing. I sincerely appreciate your help both Eric and Kem!

Thank you Eric I didn’t understand how the SubExpressionStartB worked until you pointed it out and I went through the LR again. Awesome.