Split String help

Hi All,

I have the following string

0    /dev/disk0s2  931Gi   88Gi  842Gi    10% 23229661 220751081   10%   /

I need to split this string into an array which I am happy with:

  [code]Dim theShell As New Shell
  Dim theVolumesData(-1) As String
  
  theShell.Execute "df -h " + f.NativePath // RETURNS 0    /dev/disk0s2  931Gi   88Gi  842Gi    10% 23229661 220751081   10%   /
  theVolumesData = Split(ReplaceLineEndings(theShell.ReadAll, EndOfLine), EndOfLine)
  
  
  Dim anArray(-1) as String
  anArray=Split(str(theVolumesData(1))," ")[/code]

However, this obviously splits on every single space. Is there any way to wildcard the space? Because the string comes from the shell, there will always be an unknown number of spaces between the elements.

Thank you

Several approaches here.

  • Get the MacOSLib package and take a look at the DiskUtil module within it. That will give you an alternate, object-oriented way to get disk information.
  • Get my M_String module and take a look at SplitByRegEX. (The pattern \x20+ would do it.)
  • Perhaps easier: use Squeeze within M_String to reduce each run of spaces to a single space, then use ordinary Split.
  • Use a regular expression. The pattern ([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+) + will break out each piece of data into its own subgroup.
  • Do it exactly as you’re doing it, but then run through the array, delete the empty elements, and trim the rest.
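For what it's worth, the last approach in the list above might look roughly like this. It is only a sketch using the classic framework calls (Split, Trim, Ubound, Remove), and the sample line is made up for illustration:

```xojo
// Sketch: split on single spaces, then discard the empty elements
// that runs of spaces leave behind. "line" is a sample value.
Dim line As String = "/dev/disk0s2  931Gi   88Gi  842Gi    10%   /"
Dim parts() As String = Split(line, " ")

// Walk backwards so removing an element does not shift unvisited ones.
For i As Integer = parts.Ubound DownTo 0
  parts(i) = Trim(parts(i))
  If parts(i) = "" Then parts.Remove(i)
Next

// parts now holds: /dev/disk0s2, 931Gi, 88Gi, 842Gi, 10%, /
```

Looping from the end of the array is the important detail; a forward loop would skip the element that slides into the slot just removed.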

[quote=93097:@Steven Church]However, this obviously splits on every single space. Is there any way to wildcard the space? Because the string comes from the shell, there will always be an unknown number of spaces between the elements.

[/quote]

Do a Result = ReplaceAll(Result, "  ", " ") (two spaces replaced by one) three or four times in a row to reduce each run of spaces to a single space, then split?
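Rather than guessing how many passes are needed, the same idea can be looped until no double space is left. A minimal sketch, assuming the classic InStr and ReplaceAll functions; the sample string is made up, and the double space is built with Chr(32) so it cannot get collapsed when the code is copied from a web page:

```xojo
// Sketch: repeat the two-spaces-to-one replacement until none remain.
Dim twoSpaces As String = Chr(32) + Chr(32)
Dim result As String = "931Gi" + Chr(32) + Chr(32) + Chr(32) + "88Gi" + twoSpaces + "842Gi"

While InStr(result, twoSpaces) > 0
  result = ReplaceAll(result, twoSpaces, " ")
Wend

// Every run is now a single space, so an ordinary Split works.
Dim fields() As String = Split(result, " ")
// fields holds: 931Gi, 88Gi, 842Gi
```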

Since you’re already in the shell, why not parse the spaces there?

I believe this should work:

theShell.Execute "df -h " + f.NativePath + " | tr -s '[:space:]'"
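Put together with the splitting from the first post, that could look something like the sketch below. The path "/" is just a stand-in for f.NativePath, and ReadAll is used the same way as in the original code:

```xojo
// Sketch: let tr squeeze the whitespace, then split on single spaces.
// The path "/" is a sample value standing in for f.NativePath.
Dim sh As New Shell
sh.Execute("df -h / | tr -s '[:space:]'")

Dim lines() As String = Split(ReplaceLineEndings(sh.ReadAll, EndOfLine), EndOfLine)

// lines(0) is the header row; lines(1) is the first volume, if present.
If lines.Ubound >= 1 Then
  Dim fields() As String = Split(lines(1), " ")
  // Each element of fields is now one column of the df output.
End If
```

Quoting '[:space:]' keeps the shell from treating the brackets as a filename pattern. A mount point that itself contains spaces would still be split into several elements.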

Oh magic shell, is there nothing you can’t do?

Well, it hasn’t helped me get a date with the cute Swedish teller at my bank, but then again, I’m not that skilled with the shell yet.

…or with cute Swedish bank tellers…

Oh cute Swedish bank tellers, … um…

I got nothin’.

Get with the shell, man!
:slight_smile:

Just to close the loop, this code will squeeze the spaces if you want to keep it all within Xojo code:

dim rx as new RegEx
rx.SearchPattern = "\x20{2,}" // two or more consecutive spaces
rx.ReplacementPattern = " "
rx.Options.ReplaceAllMatches = True

s = rx.Replace(s)

So that’s the equivalent of the ReplaceAll I suggested above… \x20 means Chr(&h20)?

Yes. I could have just put in a space, but I like that form (especially on the forum) for clarity.

FYI, in regex, you can specify a code point as either \xNN or \x{NNNN}. The latter form will work for any number of hex digits so \x{a}, for example, would match a linefeed.
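A quick sketch of both forms with the RegEx class, using made-up sample strings and a "-" replacement purely so the result is easy to see:

```xojo
// Sketch: \xNN and \x{NNNN} both name a character by its hex code point.
Dim rx As New RegEx
rx.Options.ReplaceAllMatches = True
rx.ReplacementPattern = "-"

rx.SearchPattern = "\x20"      // two-digit form: a space
Dim a As String = rx.Replace("a b")                // a-b

rx.SearchPattern = "\x{20}"    // braced form: the same space
Dim b As String = rx.Replace("a b")                // a-b

rx.SearchPattern = "\x{a}"     // braced form with one digit: a linefeed
Dim c As String = rx.Replace("a" + Chr(10) + "b")  // a-b
```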

[quote=93191:@Kem Tekinay]Yes. I could have just put in a space, but I like that form (especially on the forum) for clarity.

FYI, in regex, you can specify a code point as either \xNN or \x{NNNN}. The latter form will work for any number of hex digits so \x{a}, for example, would match a linefeed.[/quote]

I have learned a tiny bit of RegEx today. Thank you :slight_smile:

[quote=93183:@Kem Tekinay]Just to close the loop, this code will squeeze the spaces if you want to keep it all within Xojo code:

rx.SearchPattern = "\x20{2,}" [/quote]

Interesting. If I understand right, your search pattern looks for two spaces and replaces them with one. ReplaceAll has an issue when it encounters three spaces: it leaves two, so it must be run twice. Your code replaces three spaces with one in a single pass.

That pattern looks for two or more spaces. If you follow a token with a {x} structure, that says, “repeating x times”. You can do a range with {x,y}, but you can also leave it open-ended by leaving off y, as I did above.

So in {2,} it says “replace {2, or as many as necessary}” ?

However, due to a flaw in the native RegEx implementation, that pattern may take forever against long strings, so your way could be much faster.

With the MBS version (RegExMBS), there is no contest. In my M_String test project, where you supply a character set to squeeze, looping over a long string takes almost six seconds. With RegExMBS, it takes about 0.6 seconds.

Cross-post. :slight_smile: Yes, that’s what it means, 2 to infinite.
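To make the three quantifier forms concrete, here is a small sketch. The sample string has runs of one, two, three and four spaces, and the "_" replacement is only there to make each match visible:

```xojo
// Sketch: {x}, {x,y} and {x,} applied to runs of 1, 2, 3 and 4 spaces.
Dim rx As New RegEx
rx.Options.ReplaceAllMatches = True
rx.ReplacementPattern = "_"

Dim s As String = "a b  c   d    e"

rx.SearchPattern = "\x20{2}"    // exactly two spaces per match
Dim exact As String = rx.Replace(s)     // a b_c_ d__e

rx.SearchPattern = "\x20{2,3}"  // two or three spaces per match
Dim ranged As String = rx.Replace(s)    // a b_c_d_ e

rx.SearchPattern = "\x20{2,}"   // two or more spaces per match
Dim openEnded As String = rx.Replace(s) // a b_c_d_e
```

Only the open-ended form swallows every run whole, which is why it squeezes in a single pass.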

Neat.

Now, if I wanted to replace several characters, how could I indicate, for instance, to search for “this” instead of \x20 ?

I think I understand the string to replace it with is :

rx.ReplacementPattern = " "

So if I wanted to replace by “that”, I would go

rx.ReplacementPattern = "that" ?

Yes, it’s as simple as that.

rx.SearchPattern = "this"
rx.ReplacementPattern = "that"

Every letter, number, and almost every symbol is a token that represents itself.

[quote=93226:@Kem Tekinay]Yes, it’s as simple as that.

rx.SearchPattern = "this"
rx.ReplacementPattern = "that"

Every letter, number, and almost every symbol is a token that represents itself.[/quote]

You are right, it is quite simple. So if I wanted to apply the same search for two or more, I would go ?

rx.SearchPattern = "this{2,}"
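One thing to watch with that pattern: a quantifier applies only to the token immediately before it, so in "this{2,}" the {2,} attaches to the final s alone. To repeat the whole word, it has to be grouped first. A sketch with a made-up sample string, using the non-capturing group syntax (?:...):

```xojo
// Sketch: quantifying one letter versus quantifying a whole group.
Dim rx As New RegEx
rx.Options.ReplaceAllMatches = True
rx.ReplacementPattern = "that"

Dim s As String = "thisthisthis thiss this"

rx.SearchPattern = "this{2,}"      // "thi" followed by two or more "s"
Dim a As String = rx.Replace(s)    // thisthisthis that this

rx.SearchPattern = "(?:this){2,}"  // the word "this" two or more times
Dim b As String = rx.Replace(s)    // that thiss this
```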