I'm having trouble classifying some strings, and String.Uppercase and String.TitleCase apparently give the same result, which is confusing.
In my case the names are in the format “FAMILYNAME Firstname”, and I need to separate the family name, which is always in capital letters, from the first name, which can be one or several words in title case or lowercase but never all capitals.
The family name can also consist of one or several words.
I have tried a lot of code and regex and it always fails. Can someone shed some light here, please?
Assuming you won’t get spaces in the FAMILYNAME, e.g., “FAMILY NAME”, this will capture the first string that is all caps into SubExpressionString(1) and the remaining characters after a space into SubExpressionString(2).
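For illustration, here is a minimal sketch of that kind of capture. The pattern, the sample name and the variable names are my own assumptions and not necessarily the ones originally intended. It relies on the same no-spaces-in-the-family-name assumption.

```xojo
// Sketch only: the pattern below is an assumed example.
Var rx As New RegEx
rx.Options.CaseSensitive = True // without this, [A-Z] would also match lowercase letters
rx.SearchPattern = "^([A-Z]+) (.+)$"

Var match As RegExMatch = rx.Search("DUPONT Jean Pierre")
If match <> Nil Then
  Var familyName As String = match.SubExpressionString(1) // "DUPONT"
  Var firstNames As String = match.SubExpressionString(2) // "Jean Pierre"
  System.DebugLog("Family: " + familyName + ", First: " + firstNames)
End If
```

Search returns Nil when the input does not start with an all-caps word followed by a space, so the Nil check is needed before reading the subexpressions.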
Private Function SplitFullName(fullName As String) As Dictionary
  // Create a dictionary to store the first name and last name
  Var result As New Dictionary
  result.Value("FirstName") = ""
  result.Value("LastName") = ""
  Try
    // Debug: Log the input
    System.DebugLog("Input FullName: " + fullName)
    // Split the full name into parts by spaces
    Var nameParts() As String = fullName.Split(" ")
    Var lastNameParts() As String
    Var firstNameParts() As String
    // Define regex patterns for classification.
    // Note: the patterns are not anchored with $, so they test how a word
    // starts (or what it contains), not the whole word. The order of the
    // checks below is what makes the classification work.
    Var rxLowerCase As New RegEx
    Var ro As New RegExOptions
    ro.CaseSensitive = True
    rxLowerCase.Options = ro
    rxLowerCase.SearchPattern = "^[a-z]+" // Word starts with a lowercase letter
    Var rxTitleCase As New RegEx
    rxTitleCase.Options = ro
    rxTitleCase.SearchPattern = "[A-Z][a-z]+" // Word contains a capital followed by lowercase letters
    Var rxUpperCase As New RegEx
    rxUpperCase.Options = ro
    rxUpperCase.SearchPattern = "^[A-Z]+" // Word starts with a capital letter
    // Iterate through each part and classify
    For Each part As String In nameParts
      System.DebugLog("Processing Part: " + part)
      If rxLowerCase.Search(part) <> Nil Then
        // Starts with lowercase -> FirstName
        System.DebugLog("Matched as LowerCase: " + part)
        firstNameParts.Add(part)
      ElseIf rxTitleCase.Search(part) <> Nil Then
        // Capital followed by lowercase -> FirstName
        System.DebugLog("Matched as TitleCase: " + part)
        firstNameParts.Add(part)
      ElseIf rxUpperCase.Search(part) <> Nil Then
        // Starts with a capital and failed both tests above -> LastName
        System.DebugLog("Matched as UpperCase: " + part)
        lastNameParts.Add(part)
      Else
        // Debug: Log unclassified parts
        System.DebugLog("Unclassified Part: " + part)
      End If
    Next
    // Reconstruct the first name and last name from the arrays
    result.Value("FirstName") = String.FromArray(firstNameParts, " ").Trim
    result.Value("LastName") = String.FromArray(lastNameParts, " ").Trim
  Catch e As RuntimeException
    System.DebugLog("Error: " + e.Message)
    // If something goes wrong, fall back to treating the full name as the first name
    result.Value("FirstName") = fullName
  End Try
  // Debug: Output results
  System.DebugLog("Final Parsed FirstName: " + result.Value("FirstName"))
  System.DebugLog("Final Parsed LastName: " + result.Value("LastName"))
  Return result
End Function
This is basically the working code. If it can be done better, ideas are more than welcome.
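One possible simplification, offered as a sketch rather than a drop-in replacement: in Xojo the = operator compares strings case-insensitively, which may be why comparing against Uppercase or TitleCase seemed to give the same result. A case-sensitive String.Compare of each word against its own Uppercase avoids the regex entirely. The names IsAllCaps and SplitByCase are hypothetical, and I am assuming the API 2 String.Compare with ComparisonOptions.CaseSensitive is available in your Xojo version.

```xojo
// Hypothetical helper: True when the word has no lowercase letters.
// Words without any letters (e.g. "-") also count as all caps here.
Private Function IsAllCaps(word As String) As Boolean
  Return word.Compare(word.Uppercase, ComparisonOptions.CaseSensitive) = 0
End Function

// Hypothetical replacement for the regex classification:
// all-caps words go to LastName, everything else to FirstName.
Private Function SplitByCase(fullName As String) As Dictionary
  Var lastNameParts() As String
  Var firstNameParts() As String
  For Each part As String In fullName.Split(" ")
    If part = "" Then Continue // skip empty parts caused by double spaces
    If IsAllCaps(part) Then
      lastNameParts.Add(part)
    Else
      firstNameParts.Add(part)
    End If
  Next
  Var result As New Dictionary
  result.Value("FirstName") = String.FromArray(firstNameParts, " ")
  result.Value("LastName") = String.FromArray(lastNameParts, " ")
  Return result
End Function
```

Because it compares whole words instead of matching A-Z ranges, this should also treat accented capitals and hyphenated family names as uppercase, which the character-range patterns do not.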
If you don’t want to use RegEx, you could split the line into words, then test each word’s characters. Assume a word is all caps until a character with a code above that of “Z” (90) is found. Gather the successes in one string and the failures in another.
I’m sure this can be optimized…
Var vName As String = "THIS IS A Test of First and LAST Names"
Var aWords() As String = vName.Split(" ")
Var aChars() As String
Var firstN, lastN As String = ""
Var hasLC As Boolean
For i As Integer = 0 To aWords.LastIndex
  // Split the word into individual characters
  aChars = aWords(i).Split("")
  hasLC = False
  For j As Integer = 0 To aChars.LastIndex
    // Any character code above "Z" (90) is treated as lowercase
    If aChars(j).Asc > 90 Then
      hasLC = True
      Exit
    End If
  Next
  If hasLC Then
    firstN = firstN + " " + aWords(i)
  Else
    lastN = lastN + " " + aWords(i)
  End If
Next
// Trim removes the leading space added while concatenating
MessageBox("Last=" + lastN.Trim + ", First=" + firstN.Trim)