Its purpose is to remove all leading characters whose ASCII value is below 33 from a string.
Function Strip(s As String) As String
  Dim temp As String
  If s <> "" Then
    temp = s
    While Asc(Left(temp, 1)) < 33
      temp = Right(temp, Len(temp) - 1)
      If temp = "" Then Exit // Deal with s consisting of only spaces.
    Wend
  End If
  Return temp
End Function
Try this one. It reuses s instead of copying it to a temporary variable, and this may improve things:
Function Strip(s As String) As String
  #Pragma BackgroundTasks False
  #Pragma StackOverflowChecking False
  #Pragma NilObjectChecking False
  If s <> "" Then
    While Asc(Left(s, 1)) < 33
      s = Right(s, Len(s) - 1)
      If s = "" Then Exit // Deal with s consisting of only spaces.
    Wend
  End If
  Return s
End Function
Alternatively, ask @Christian_Schmitz for a C-optimized version, or have @Kem_Tekinay look at this if he’s not near a sunny spot on the beach.
I would not modify the string inside the While loop, since every pass builds a new string. Instead, just look for the first character you want to keep and note where it is located in the string, then use .Mid or .Right to remove all the unwanted characters at once.
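I have an old regex that I use to strip invisible characters (I think). Note that it removes them anywhere in the string, not just at the start, and it keeps tabs, LFs, CRs and spaces:
Dim theString As String
// Matches everything outside the XML 1.0 valid character ranges; the last range ends at 10FFFF.
Dim searchPattern As String = "[^\x{0009}\x{000A}\x{000D}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]+"
Dim rx As RegExMBS = New RegExMBS
If rx.Compile(searchPattern) And rx.Study() Then
  // theXML comes from my own project; pass your own source string here.
  theString = rx.ReplaceAll(theXML.ToString, "")
End If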
Somewhat surprisingly, this is not significantly faster on average:
If s <> "" Then
  Dim slen As Integer = s.Length
  Dim c As Integer
  For c = 1 To slen
    If Asc(Mid(s, c, 1)) > 32 Then
      Exit
    End If
  Next
  If c > slen Then
    Return ""
  Else
    Return Mid(s, c)
  End If
Else
  Return ""
End If
I guess the time is being spent in Asc() and Mid().
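If you are sure there is no double space in the rest of the string, you could split it on a double space. The last member of the resulting array will then be the part without the leading spaces. This only works cleanly when the number of leading spaces is even; an odd number leaves one space behind.
Var s As String = "    hello, world" // Four leading spaces.
Var spl() As String = s.Split("  ") // Split on two spaces.
MsgBox spl(spl.Ubound) // The last element of the array.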
Here are 4 different ways. One of them may work better for your type of data. Of course, use the pragmas to speed things up.
This is API 1 code:
Dim S, S1, S2, S3, S4 As String, i As Integer
' Build a test string: 100 control characters followed by some text.
For j As Integer = 1 To 100
  S = S + Encodings.ASCII.Chr(j Mod 32)
Next
S = S + "Some Text"
' Using Char Comparison
Dim SLen As Integer = S.Len
Dim FirstLegalChar As String = Encodings.ASCII.Chr(33)
For i = 1 To SLen
  If S.Mid(i, 1) >= FirstLegalChar Then
    S1 = S.Mid(i)
    Exit
  End If
Next
' Using Char Binary Comparison
Dim SLenB As Integer = S.LenB
For i = 1 To SLenB
  If S.MidB(i, 1) >= FirstLegalChar Then
    S2 = S.MidB(i)
    Exit
  End If
Next
' Using Split Char Binary Comparison
Dim CharArr() As String = S.SplitB("")
Dim ub As Integer = CharArr.Ubound
For i = 0 To ub
  If CharArr(i) >= FirstLegalChar Then
    S3 = S.MidB(i + 1) ' The array is 0-based, MidB is 1-based.
    Exit
  End If
Next
' Using a MemoryBlock
Dim MB As MemoryBlock = S
For i = 0 To MB.Size - 1
  If MB.Byte(i) > 32 Then
    S4 = S.MidB(i + 1) ' Byte offsets are 0-based, MidB is 1-based.
    Exit
  End If
Next
Break ' Inspect S1 to S4 in the debugger.
BTW, I’m pretty sure that at one time in the stone age, in some other version of BASIC (or in some other language I used), Trim would have taken care of that…
You know, back in the days when one used ASCII control codes and ASCII was all there was!
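Public Function Strip(s As String) As String
  Var mb As MemoryBlock = s
  Var lim As Integer = mb.Size - 1
  Var count As Integer = 0
  // Count the leading bytes below 33.
  For i As Integer = 0 To lim
    If mb.Byte(i) < 33 Then
      count = count + 1
    Else
      Exit For i
    End If
  Next i
  // In ASCII and UTF-8 each of these bytes is one character, so the byte count can be used as a character index.
  Return s.Middle(count, s.Length - count)
End Function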
I’m not a pro, but just counting bytes and manipulating the string only once should be quite fast.
EDIT: I just gave it a try and in my testing scenario my algorithm is just 10 to 20 percent faster than yours. So it’s not worthwhile…
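For anyone who wants to compare the variants in this thread on their own data, a rough timing harness could look like the sketch below. It is only an illustration: the sample string, the number of runs and the variable names are made up, and Strip stands for whichever version you are testing.

```xojo
' Rough timing sketch (assumption: a Strip function from this thread is in scope).
#Pragma BackgroundTasks False

' Hypothetical sample: spaces, a tab, CR and LF in front of some text.
Var sample As String = "   " + Encodings.ASCII.Chr(9) + Encodings.ASCII.Chr(13) + Encodings.ASCII.Chr(10) + "Some Text"
Var runs As Integer = 100000 ' Arbitrary; raise it if the timings are too noisy.

Var result As String
Var startTime As Double = System.Microseconds
For run As Integer = 1 To runs
  result = Strip(sample)
Next run
Var elapsed As Double = System.Microseconds - startTime

' Average time per call in microseconds.
Var perCall As Double = elapsed / runs
System.DebugLog("Strip: " + perCall.ToString + " microseconds per call")
```

Run it in a built app rather than in the debugger, and test with strings that look like your real data, since a short sample like this one mostly measures call overhead.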
Sorry if I wasn’t clear, @KarenA - I need to remove all offending characters from the start of the string, not just the first one, and I don’t know how many of them there are at the start of the string.
The set of possible unwanted characters is probably limited to tabs, spaces, and maybe CRs and LFs, but checking the ASCII value against 32 covers all of those.
I’d forgotten that you can split a string into characters using “” as the delimiter, thanks. Maybe iterating through an array will improve speed vs Mid().
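A minimal API 2 sketch of that idea, for what it’s worth. The name StripViaSplit is just made up for this post, and whether it actually beats Mid() would have to be measured, because Split has to build an array holding every character of the string first.

```xojo
' Hypothetical helper; the name StripViaSplit is not from the thread.
Public Function StripViaSplit(s As String) As String
  ' Splitting on "" yields one array element per character.
  Var chars() As String = s.Split("")
  For i As Integer = 0 To chars.LastIndex
    If chars(i).Asc > 32 Then
      ' Middle is 0-based, so i is the index of the first character to keep.
      Return s.Middle(i)
    End If
  Next i
  ' Empty input, or every character was below 33.
  Return ""
End Function
```

For long strings with only a few leading tabs or spaces, the cost of splitting everything may well outweigh the saved Mid() calls.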