Regex help!

I'm completely new to RegEx (and Xojo, for that matter), but I have to use it for this application at work, as I was told it would be the easiest way.
I know my search pattern works because I tested it in RegExRx. However, I cannot figure out how to get the info I need to display in the TextField.

Note… I am trying to just get it in the Text Field for now, and will try to figure out how to get it added into a list box under the appropriate columns. But that’s for another day.

First - Everything works up to line 50. It displays the following info to the Text Box:
kMDItemColorSpace = "RGB"
kMDItemDisplayName = "getme.jpg"
kMDItemKind = "JPEG image"
kMDItemPhysicalSize = 16384
kMDItemPixelHeight = 208
kMDItemPixelWidth = 250
kMDItemProfileName = (null)
kMDItemResolutionHeightDPI = 96

After line 50, when I get into the Regex Search stuff, is where I’m getting lost.
What I want is to print the searched pattern to the text box. The pattern is simply taking the characters in-between the quotes and also any numbers, as I only care about that data. So it should output:
RGB
getme.jpg
JPEG image
16384
208
250
(null)
96

I added in the MsgBox for quick output. Each time, it pops up with nothing.

Here is my full code:

[code]dim sh as new Shell
dim openFile as FolderItem

dim fileName as String
dim modFileName as String
dim fileType as String
dim fileSize as String
dim imageColor as String
dim profile as String
dim dpi as String
dim dimensions as String
dim height as String
dim width as String
dim output() as String

dim imageCommands(9) as String

if obj.FolderItemAvailable then
openFile = obj.FolderItem

//shell commands
fileName = "-name kMDItemDisplayName"
fileType = "-name kMDItemKind"
fileSize = "-name kMDItemPhysicalSize"
imageColor = "-name kMDItemColorSpace"
profile = "-name kMDItemProfileName"
dpi = "-name kMDItemResolutionHeightDPI"
height = "-name kMDItemPixelHeight "
width = "-name kMDItemPixelWidth "

//Array of metadata attributes
imageCommands(0) = fileName
imageCommands(1) = fileType
imageCommands(2) = fileSize
imageCommands(3) = imageColor
imageCommands(4) = profile
imageCommands(5) = dpi
imageCommands(6) = dimensions
imageCommands(7) = height
imageCommands(8) = width

//Shell execution
sh.Execute("mdls " + Join(imageCommands, " ") + openFile.ShellPath)

dim s as string
s = sh.result

//THIS IS LINE 50 => //txtResults.text = s

//Regex stuff
Dim rg as New RegEx
Dim myMatch as RegExMatch
Dim result as String

rg.SearchPattern="(?mi-Us)""([^""]*)""|(\d*)" //find everything inside quotes and any numbered strings
myMatch = rg.Search(s) //searches the returned shell string
result = myMatch.SubExpressionString(1) //No matter what number i put in here, 0-2, nothing works

if myMatch is nil then
MsgBox("text not found")
//txtResults.text="Text not found!"
else
MsgBox(result)
//txtResults.text = result

end if
end if
[/code]

You must loop, displaying each match, until the match is Nil. At that point you have reached the end of the result list.
take a look at the example in the manual : https://documentation.xojo.com/api/text/regular_expressions/regex.html#regex-search

I had that code implemented as well and nothing happened. :-\

The problem is that the pattern matches at every position between characters, because the part after the alternation is \d*, which means “zero or more” digits. Well, the empty position between two characters counts as zero digits. :) Change the * to a + and you should get better results.

"(?mi-Us)""([^""]*)""|(\d+)"

(Note: I did not check the rest of the code. I defer to Jean-Yves in his additional suggestion.)
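
To illustrate the point about zero-length matches, here is a minimal sketch. The variable names and the one-line sample string are made up for illustration; only the two patterns come from this thread.

[code]
Dim rg As New RegEx
Dim m As RegExMatch
Dim sample As String = "kMDItemColorSpace = ""RGB""" // hypothetical one-line sample

// With \d*, the second alternative can match zero digits, so the first
// match is an empty string at the very start of the text, before the "k".
rg.SearchPattern = "(?mi-Us)""([^""]*)""|(\d*)"
m = rg.Search(sample)
If m <> Nil Then
  MsgBox("Length of first match with *: " + Str(Len(m.SubExpressionString(0))))
End If

// With \d+, at least one digit is required, so the engine moves on
// until it reaches the quoted text.
rg.SearchPattern = "(?mi-Us)""([^""]*)""|(\d+)"
m = rg.Search(sample)
If m <> Nil Then
  MsgBox("First match with +: " + m.SubExpressionString(1))
End If
[/code]

The first message box should report a length of 0, which would explain a MsgBox that pops up with nothing in it.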

Thanks Kem, but changing it to a + does the same thing… In the case of the output string with multiple numbers, I want all of them anyway.

Can you repost your updated code?

Also, the way you’ve written the pattern, text between quotes will be in SubExpressionString( 1 ), whereas a number will be in SubExpressionString( 2 ). You have to check the SubExpressionCount to know which you should be accessing.

Or rewrite your pattern like this so the result will always be in SubExpressionString( 1 ):

"(?mi-Us)(?|""([^""]*)""|(\d+))"

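Both options can be sketched roughly as follows. This assumes that SubExpressionCount is one more than the highest-numbered group that took part in the match, which is what the advice above relies on; the sample text and the names s, m and values are hypothetical.

[code]
Dim rg As New RegEx
Dim m As RegExMatch
Dim values() As String // hypothetical array collecting the extracted values
Dim s As String = "kMDItemKind = ""JPEG image""" + EndOfLine + "kMDItemPixelHeight = 208"

// Option 1: two separate groups, so check which one took part in the match.
rg.SearchPattern = "(?mi-Us)""([^""]*)""|(\d+)"
m = rg.Search(s)
While m <> Nil
  If m.SubExpressionCount > 2 Then
    values.Append(m.SubExpressionString(2)) // a number
  Else
    values.Append(m.SubExpressionString(1)) // text between quotes
  End If
  m = rg.Search // continue after the previous match
Wend

// Option 2: the branch-reset group (?|...) numbers both alternatives as group 1.
ReDim values(-1) // start again with an empty array
rg.SearchPattern = "(?mi-Us)(?|""([^""]*)""|(\d+))"
m = rg.Search(s)
While m <> Nil
  values.Append(m.SubExpressionString(1))
  m = rg.Search
Wend
[/code]

Either loop should leave values holding JPEG image and 208, in that order.
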
Using Kem’s suggested expression with this code, I get the results shown below:

[code]//Regex stuff
Dim rg As New RegEx
Dim myMatch As RegExMatch
Dim result As String

Dim s As String = TextArea1.Text

rg.SearchPattern = "(?mi-Us)""([^""]*)""|(\d+)" // find everything inside quotes and any numbered strings

myMatch = rg.Search(s) //searches the returned shell string

Do
If myMatch <> Nil Then
TextArea2.AppendText(myMatch.SubExpressionString(0) + EndOfLine)

myMatch = rg.Search

End If
Loop Until myMatch Is Nil[/code]

[quote=424579:@Kem Tekinay]Can you repost your updated code?

Also, the way you’ve written the pattern, text between quotes will be in SubExpressionString( 1 ), whereas a number will be in SubExpressionString( 2 ). You have to check the SubExpressionCount to know which you should be accessing.

Or rewrite your pattern like this so the result will always be in SubExpressionString( 1 ):

"(?mi-Us)(?|""([^""]*)""|(\d+))" [/quote]
I guess I need to read up more on the subExpressionStrings. Not really sure what that means.
In any case, I did change the pattern to match what you said and it worked! But my final question is why does it work with SubExpressionString( 1 ) AND SubExpressionString( 0 )?

In the regular expression world, a SubExpressionString is a “subgroup”: it captures what’s between a pair of parentheses, numbered in the order the parentheses appear. If the pattern were (a)(b)(c), SubExpressionString( 1 ) would be “a”, SubExpressionString( 2 ) “b”, and SubExpressionString( 3 ) “c”. SubExpressionString( 0 ) is the entire match, in this example “abc”.
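
A small sketch of that numbering, using the (a)(b)(c) example; the variable names are illustrative only.

[code]
Dim rg As New RegEx
rg.SearchPattern = "(a)(b)(c)"

Dim m As RegExMatch = rg.Search("abc")
If m <> Nil Then
  // Index 0 is the whole match; 1 to 3 are the groups in the order
  // their opening parentheses appear in the pattern.
  For i As Integer = 0 To m.SubExpressionCount - 1
    System.DebugLog(Str(i) + ": " + m.SubExpressionString(i))
  Next
End If
[/code]

This should log “abc” for index 0, followed by “a”, “b” and “c” for indexes 1 to 3.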

Awesome- that makes much more sense.
One last question @Kem Tekinay- is there a way to get the data back in a certain order?

I’m not sure what you mean. Example?

When it outputs
RGB
getme.jpg
JPEG image
16384
208
250
(null)
96
What determines the order that ^^ is returned in?

The original text determines the order.
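
If a specific order is needed, one possible approach (not something suggested earlier in this thread) is to capture the attribute name along with its value and look the values up by name afterwards. The sketch below assumes each line of the mdls output has the form name = value; the pattern, the sample text and the names s, values and order are all hypothetical.

[code]
Dim rg As New RegEx
Dim m As RegExMatch
Dim values As New Dictionary // attribute name -> value

// Hypothetical sample in the same shape as the output shown above.
Dim s As String = "kMDItemColorSpace = ""RGB""" + EndOfLine + _
  "kMDItemDisplayName = ""getme.jpg""" + EndOfLine + _
  "kMDItemPixelWidth = 250"

// Group 1 is the attribute name, group 2 the value without surrounding quotes.
rg.SearchPattern = "(?mi-Us)^(\w+)\s*=\s*""?([^""\r\n]*)""?\s*$"
m = rg.Search(s)
While m <> Nil
  values.Value(m.SubExpressionString(1)) = m.SubExpressionString(2)
  m = rg.Search
Wend

// Read the values back in whatever order is wanted.
Dim order() As String = Array("kMDItemDisplayName", "kMDItemPixelWidth", "kMDItemColorSpace")
For Each key As String In order
  If values.HasKey(key) Then
    System.DebugLog(key + ": " + values.Value(key).StringValue)
  End If
Next
[/code]

Because the lookup is by name, the order of the lines in the original text no longer matters, which may also help when filling list box columns later.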