String manipulation

Hello everyone,

I come from a C Background and I’m learning Xojo. I’m currently wrestling with string manipulation. It’s a dilemma that could be solved rather easily with pointers, but I must use other tools here.

Essentially, my problem is that I have a string like this: “5+10+11-2*5/3”. I need to split that string into an array of numbers and an Array of operators (+, -, …). When using split I encountered two problems. The first was that I had many spaces. For example, if each element of my array is represented by (): “(5)( )(10)( )(11)( ) …”. Using join and trim I was able to eliminate most of these, but not all.

What is the best way to do this? It’s my first day using Xojo and I’m fumbling in the dark a bit. I’m sure Xojo has a way to do this but I don’t know it.

Thanks

You can simulate the pointer approach. You can use the Mid() function to look at each character, or you could use the Split function with an empty delimiter to get a string array of characters, though the language reference documents a bug with Japanese strings that may not affect your use. Either way, you would walk through the string testing for numbers or operators, as you would in C/C++.
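
As a rough illustration of the Mid() approach, here is a minimal sketch that fills one array with numbers and one with operators. The names sourceText, numArr, opArr and currNumber are just placeholders I picked, and it assumes unsigned whole numbers and only the four basic operators, so treat it as a starting point rather than a finished parser.

```xojo
' Placeholder names: sourceText, numArr, opArr, currNumber
Dim sourceText As String = "5 + 10 + 11 - 2 * 5 / 3"
Dim numArr() As String
Dim opArr() As String
Dim currNumber As String = ""

For i As Integer = 1 To Len(sourceText) ' Mid() positions are 1-based
  Dim ch As String = Mid(sourceText, i, 1)
  
  If InStr("0123456789", ch) > 0 Then
    currNumber = currNumber + ch ' extend the number being read
  Else
    If currNumber <> "" Then
      numArr.Append currNumber ' a non-digit ends the current number
      currNumber = ""
    End If
    If InStr("+-*/", ch) > 0 Then
      opArr.Append ch ' spaces and any other characters are ignored
    End If
  End If
Next

If currNumber <> "" Then
  numArr.Append currNumber ' flush the last number
End If
```

Because spaces are simply skipped inside the loop, there is no need for the Join and Trim cleanup afterwards.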

I would recommend using a MemoryBlock if speed is important.

Here is a basic method to get you started. It should parse an expression, such as the one in your question, into a string array.

Function SplitExpression(exp As String) As String()
  Dim tokens() As String
  Dim expMB As MemoryBlock = exp
  Dim pos As Integer = 0
  Dim state As Integer = 0 ' 0 = start, 1 = number, 2 = operator
  Dim currToken As String = ""
  
  while pos < expMB.Size
    
    if expMB.Byte(pos) > 32 then ' skip whitespace
      
      if (expMB.Byte(pos) >= 48) and (expMB.Byte(pos) <= 57) then ' number
        
        if state = 1 then 
          currToken = currToken + Chr(expMB.Byte(pos))
        else
          if currToken <> "" then
            tokens.Append currToken
          end if
          currToken = Chr(expMB.Byte(pos))
          state = 1 ' number
        end if
        
      else
        
        if state = 2 then
          currToken = currToken + Chr(expMB.Byte(pos))
        else
          if currToken <> "" then
            tokens.Append currToken
          end if
          currToken = Chr(expMB.Byte(pos))
          state = 2 ' operator
        end if
        
      end if
      
    else
      
      if currToken <> "" then
        tokens.Append currToken
        currToken = ""
      end if
      
    end if
    
    pos = pos + 1
    
  wend
  
  if currToken <> "" then
    tokens.Append currToken
  end if
  
  return tokens
End Function

Example to use it:

Dim tokenArr() As String
Dim exp As String = "5+10+11-2*5/3"
  
tokenArr = SplitExpression(exp)
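
If you want the numbers and operators in two separate arrays, as in the original question, one possible follow-up is to sort the tokens by their first character. This is only a sketch: numArr and opArr are names I made up, and it assumes every token that does not start with a digit is an operator.

```xojo
' numArr and opArr are placeholder names; tokenArr comes from the example above
Dim numArr() As String
Dim opArr() As String

For Each token As String In tokenArr
  ' A token that starts with a digit is a number; anything else is treated as an operator
  If InStr("0123456789", Left(token, 1)) > 0 Then
    numArr.Append token
  Else
    opArr.Append token
  End If
Next
```

For "5+10+11-2*5/3" this should leave 5, 10, 11, 2, 5, 3 in numArr and +, +, -, *, / in opArr.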

Thanks for the answers. They were most helpful.

Josh, have you tried using regular expressions? I use RegEx in Xojo quite a lot, and it works very efficiently when the search pattern is optimized for your requirements. I also use the RegExRx tool to validate my search patterns, which saves a ton of time and helps my pattern efficiency.

Probably a much more elegant solution to the problem.

I haven’t tried regular expressions. I’ll look into it though, thanks.

I managed to get it to work using Mid(), but I’m always open to better solutions.

This expression will do what you need:

(\d+)|([-+/*])

This is the code that would do it:

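// Assumes sourceText already holds the expression, e.g. "5+10+11-2*5/3"
dim numArr() as string
dim opArr() as string

dim rx as new RegEx
rx.SearchPattern = "(\d+)|([-+/*])"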

dim match as RegExMatch = rx.Search( sourceText )
while match <> nil
  if match.SubExpressionString( 1 ) <> "" then
    numArr.Append match.SubExpressionString( 1 ) // It's a number
  else
    opArr.Append match.SubExpressionString( 2 ) // It's an operator
  end if

  match = rx.Search
wend

We can get fancier and account for the positive/negative (e.g., “+78/-98”) like this:

sourceText = sourceText.ReplaceAll( " ", "" ) // Remove the spaces

dim rx as new RegEx
rx.SearchPattern = "(?mi-Us)((?<=[-+*/]|^)[-+]?\d+)|([-+/*])"

dim match as RegExMatch = rx.Search( sourceText )

while match <> nil
  if match.SubExpressionString( 1 ) <> "" then
    numArr.Append match.SubExpressionString( 1 ) // It's a number
  else
    opArr.Append match.SubExpressionString( 2 ) // It's an operator
  end if

  match = rx.Search
wend

Thanks Kem. This is going to be very useful in one of my upcoming projects.

Kem, you are the RegEx expert by far! Using your application (RegExRx) I couldn’t get the pattern quite right. I always have issues with “repeating the rest of the text searching for the same match”.

Thanks for the post, as “(?mi-Us)((?<=[-+*/]|^)[-+]?\d+)|([-+/*])” is what I needed too, for future use!