Average string length is 22 chars, so I unrolled 28 times and picked up a little more time - it all adds up!
Out of curiosity, can you let us know the time you started with and the time you ended up with? Did you get it as fast as you hoped?
(BTW I assume you used the Pragmas too)
-karen
It's hard to say. When it was a separate method it was easy to measure, because I could see the calls in the Profiler results. Now that it's inline, it's all mixed up with the parent method. I think overall my batch processing time has been reduced by about 10 percent.
I'm not seeing any speed improvement from the Pragmas.
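For anyone following along: once the code is inlined and no longer shows up on its own in the Profiler, a manual timer around the loop is one way to get a number back. The following is only a sketch. The method name `TimeLeftTrim` and the `lines()` array are made up for illustration, and `Microseconds` is the classic framework call (`System.Microseconds` in API 2).

```xojo
// Hypothetical timing harness; "lines()" is an assumed array of input strings
Sub TimeLeftTrim(lines() As String)
  // Pragmas apply to the method they appear in
  #Pragma BackgroundTasks False
  #Pragma BoundsChecking False
  #Pragma NilObjectChecking False
  #Pragma StackOverflowChecking False

  Dim startTime As Double = Microseconds
  For i As Integer = 0 To lines.Ubound
    // ... inlined trimming code under test goes here ...
  Next
  Dim elapsedMs As Double = (Microseconds - startTime) / 1000.0
  System.DebugLog("Left trim pass: " + Str(elapsedMs) + " ms")
End Sub
```

Running the same harness with and without the pragma lines would show whether they make any measurable difference for this particular loop.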
Of course I'd like it to be faster, but I think I'm at (or past, lol) the point of diminishing returns with this piece of code. Added to some gains I achieved in other parts of the program, overall it's much better than it was a week ago.
Thanks to all for your time, insights, and suggestions!
I haven't gone through all the code examples, but the initial code example, if you are trying to eliminate all characters < 33, assumes that once you hit one that is >= 33 there won't be any more < 33, and it exits.
I don't know your data, so maybe you know that once you hit a character >= 33 there won't be any more to remove.
That is correct. The goal is to eliminate all characters at the start of the string whose ASCII value is less than 33. Once we get a char 33 or greater, we are in the string proper.
Then, and I'm sorry if this was covered in one of the 43 previous posts, consider that rather than loop and erase, erase, erase, one character at a time, you just need to find the first char 33 or greater and, with one command, trim everything to the left of that position. So it's one search and one trim.
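A minimal sketch of the "one search and one trim" idea, not the original poster's code. The function name `LeftTrimBelow33` is hypothetical, and the byte-wise functions assume ASCII or UTF-8 data, where every byte of a multi-byte character is above 32.

```xojo
// Hypothetical helper: returns s without its leading bytes below 33
Function LeftTrimBelow33(s As String) As String
  Dim n As Integer = LenB(s)
  Dim firstKeep As Integer = n + 1   // sentinel: no byte >= 33 found yet

  // One scan: find the 1-based position of the first byte >= 33
  For i As Integer = 1 To n
    If AscB(MidB(s, i, 1)) >= 33 Then
      firstKeep = i
      Exit For
    End If
  Next

  If firstKeep > n Then Return ""    // nothing but control chars and spaces
  Return MidB(s, firstKeep)          // one cut instead of repeated erases
End Function
```

For example, `LeftTrimBelow33(Chr(9) + Chr(10) + " Some Text")` should give back `Some Text`, since tab, line feed and space are all below 33.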
What the compiler does behind the scenes in compiler code, and whether some "native" action or regular functionality results in faster execution, that's out of my realm.
Remember, every year computers get faster, but humans may have to look at the code days/weeks/months/years later, so there is something to be said for a little inefficiency traded off for more readability and understandability later on.
This is a great idea, but Xojo needs to search for every character >= 33 to find which one is leftmost. In practice that means many searches.
The Trim function removes whitespace. Wouldn't there be room for a new Xojo function to remove all unprintable characters? (I know "unprintable" might be a varying set depending on its definition, but I'd say characters 0-31 and 127).
And how do you propose to do that? My initial suggestion was RegEx. Create the RegEx object and search pattern ahead of the loop, then execute it against each string inside the loop.
TrimLeft accepts a ParamArray of characters to be removed. I tried it with an array of all characters 0-32 but it was slower than any other method by a factor of 3-5x.
I know I'm late to the chat but I was wondering if you tried something like this (swiped some of KarenA's code), and if so, how did it compare?
Dim S As String
Dim FirstLegalChar As String = Encodings.ASCII.Chr(33)
// Build a test string: 100 control characters followed by real text
For j As Integer = 1 To 100
S = S + Encodings.ASCII.Chr(j Mod 32)
Next
S = S + "Some Text"
// Split into single-byte strings and drop leading entries below "!"
Dim CharArr() As String = S.SplitB("")
While CharArr.Ubound >= 0 And CharArr(0) < FirstLegalChar
CharArr.RemoveAt 0
Wend
S = Join(CharArr, "")
Yes, I tried the array method. Its speed is similar to that of the memory block suggestion but slower than the original using string functions, modified per Jim's suggestion.
I'm still curious how this would compare to using RegEx instead, especially if you pre-create the RegEx object ahead of your loop and call it within your loop rather than abstracting it into a called function.
We all have our experiences with different languages and "native" commands. I understand this is Xojo and admit that I don't have a strong command of its (or third-party) string tools. But I've seen that most languages have somewhat the same feature set - though the syntax and function/statement names might be different.
Also, though one might eliminate an explicit loop by calling a single function, it might be that "under the hood" that function runs its own loop to do the job. The faith is that the compiler code will be more efficient - like the difference between raising a number to a power in Assembler rather than BASIC.
Let's say we have an action - call it trim(here) just to give it a name - that lops off all characters to the left of a position in a string. So now we just have to find that position - a position, starting from the left, where the character is >= 33.
Now 33 is a lower bound. I'm guessing the user can guarantee an upper bound - like nothing greater than 126.
I've used languages that allow me to specify a domain of values (33-126) and, searching from left to right, return the first position where the character is within that domain. That value, or that value minus 1, would be passed to the trim.
Now most likely "behind the scenes" that search function generates a loop. But it should be a very efficient, compiler-level loop.
I'm thinking of something in the form of:
here = search ("! - ~",mystring)
It would look from left to right, and return the first position where a character in the domain of ! to ~ (that's 33 to 126) appears. That location is passed to the trim function.
What Xojo has, or what any third-party plug-in has, or what RegEx has, that does that - I don't know. But I've done it before (maybe in the procedure language of the Mac database Panorama) so I'm guessing the same functionality is available in the Xojo world.
My suggestion was to approach the problem that way, if possible, rather than explicitly looping from left to right at the user coding level.
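One way the range search described above might look in Xojo is a RegEx character class plus a single `MidB`. This is a sketch under assumptions, not a built-in "search by domain" function: `TrimToFirstPrintable` is a hypothetical name, `mystring` is borrowed from the pseudocode above, and the position comes from `RegExMatch.SubExpressionStartB`, which is assumed here to return a zero-based byte offset.

```xojo
// Hypothetical helper illustrating "search for a range, then cut once"
Function TrimToFirstPrintable(re As RegEx, s As String) As String
  Dim m As RegExMatch = re.Search(s)
  If m = Nil Then Return ""          // no character in the range at all
  // Offset 0 means the match is at the very first byte, hence the + 1 for MidB
  Return MidB(s, m.SubExpressionStartB(0) + 1)
End Function

// In the calling code: set up once, outside the loop
Dim re As New RegEx
re.SearchPattern = "[\x21-\x7E]"     // "!" through "~", i.e. 33 to 126
Dim cleaned As String = TrimToFirstPrintable(re, mystring)
```

Note that with an upper bound of 126, any leading characters above that (accented letters, for instance) would also be cut off, so this only fits data where that guarantee holds.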
Which is what virtually everyone has said, in one way or another.
I'm not sure another solution remains to be found...
So am I
All you need to do is insert something like this in your code:
// Create RegEx object ahead of loop
var re as new RegEx
re.SearchPattern = "^[\x00-\x20]+"
re.ReplacementPattern = ""
// Iterate over source data;
' user code here for setting up variable "data" to trim
// Strip off leading data which is hex 00 to hex 20 (up to and including ASCII space)
var s as string = re.Replace(data)
Just instantiate the RegEx object and set the options ahead of your half million iterations. For each one, just use re.Replace({source string}) to left trim the values below ASCII 33.
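Spelled out as a complete loop, that might look like the sketch below. The method name `LeftTrimAll` and the `records()` array are placeholders for whatever holds the half million strings, and the pattern uses the \x00-\x20 character class this post describes.

```xojo
// Hypothetical batch routine; "records()" stands in for the real source data
Sub LeftTrimAll(records() As String)
  // Create and configure the RegEx object once, ahead of the loop
  Dim re As New RegEx
  re.SearchPattern = "^[\x00-\x20]+"
  re.ReplacementPattern = ""

  // Reuse the same object for every string
  For i As Integer = 0 To records.Ubound
    records(i) = re.Replace(records(i))
  Next
End Sub
```

Keeping the `re.Replace` call directly in the loop body, rather than in a separate helper method, avoids a method call per string, which is the comparison asked about earlier in the thread.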
In the search expression "^[\x00-\x20]+" the parts simply mean:
- ^ = must occur at start of string; that is, this performs a left trim only
- [\x00-\x20] = match any character in the range hex 00 to hex 20 (i.e. null to ASCII 32)
- + = this pattern must occur at least once or nothing happens
Edit: the last item should be a plus sign, not a degree symbol; the forum software keeps changing it on me.
And if you have MBS plugins, try the same thing with RegExMBS...
Many thanks, @Douglas_Handy, I'll give it a shot as soon as time permits.
The Trim function accepts an array of characters to trim (thanks to Julia for having taught me that); it surely is exactly what you suggest (and it has been tried, and isn't the fastest).