Correcting problematic date formats

I’ve got various date formats originating from Windows and Mac which have proved problematic and would like to change these date formats as of when they are read.

The existing code I’m attempting to amend already performs some date corrections from the problematic into the desired format dd/mm/yyyy.

Below are the problematic formats:

dd.mm.yyyy

dd.mm.yy

dd-mmm-yy

d/m/yy

yyyy-mm-dd,

dd mmmm yyyy

Below is the code:

  dim filename, filenameA as string
  
  dim datesep As integer
  
  filename = "23.4.15"
  
  System.DebugLog("Filename content: " + filename)
  
  filenameA = filename + "-" +"05:49:00"
  
  System.DebugLog("FilenameA: " + filenameA)
  
  filename = filename +"05:49:00"
  
  System.DebugLog("Before correction: " + filenameA)
  
  If InStr(filename,"/")>0 then
    datesep=InStr(filename,"/")
    System.DebugLog("datesep == " + Str(datesep))
    filename.ch(datesep)=""
  end if
  
  If InStr(filename,"/")>0 then
    datesep=InStr(filename,"/")
    System.DebugLog("datesep == " + Str(datesep))
    filename.ch(datesep)=""
  end if
  
  If InStr(filename,":")>0 then
    datesep=InStr(filename,":")
    System.DebugLog("datesep == " + Str(datesep))
    filename.ch(datesep)=""
  end if
  
  If InStr(filename,":")>0 then
    datesep=InStr(filename,":")
    System.DebugLog("datesep == " + Str(datesep))
    filename.ch(datesep)=""
  end if
  
  If InStr(filename,".")>0 then
    datesep=InStr(filename,".")
    System.DebugLog("datesep == " + Str(datesep))
    filename.ch(datesep)=""
  end if
  
  If InStr(filename,".")>0 then
    datesep=InStr(filename,".")
    System.DebugLog("datesep == " + Str(datesep))
    filename.ch(datesep)=""
  end if
  
  If InStr(filenamea,"/")>0 then
    datesep=InStr(filenamea,"/")
    System.DebugLog("datesep == " + Str(datesep))
    filenamea.ch(datesep)="-"
  end if
  
  If InStr(filenamea,"/")>0 then
    datesep=InStr(filenamea,"/")
    System.DebugLog("datesep == " + Str(datesep))
    filenamea.ch(datesep)="-"
  end if
  
  If InStr(filenamea,"/")>0 then
    datesep=InStr(filenamea,"/")
    System.DebugLog("datesep == " + Str(datesep))
    filenamea.ch(datesep)="-"
  end if
  
  If InStr(filenamea,"/")>0 then
    datesep=InStr(filenamea,"/")
    System.DebugLog("datesep == " + Str(datesep))
    filenamea.ch(datesep)="-"
  end if
  
  If InStr(filenamea,":")>0 then
    datesep=InStr(filenamea,":")
    System.DebugLog("datesep == " + Str(datesep))
    filenamea.ch(datesep)="-"
  end if
  
  If InStr(filenamea,":")>0 then
    datesep=InStr(filenamea,":")
    System.DebugLog("datesep == " + Str(datesep))
    filenamea.ch(datesep)="-"
  end if
  
  If InStr(filenamea,":")>0 then
    datesep=InStr(filenamea,":")
    System.DebugLog("datesep == " + Str(datesep))
    filenamea.ch(datesep)="-"
  end if
  
  If InStr(filenamea,":")>0 then
    datesep=InStr(filenamea,":")
    System.DebugLog("datesep == " + Str(datesep))
    filenamea.ch(datesep)="-"
  end if
  
  If InStr(filenamea,".")>0 then
    datesep=InStr(filenamea,".")
    System.DebugLog("datesep == " + Str(datesep))
    filenamea.ch(datesep)="/"
    System.DebugLog(filename.ch(datesep))
  end if
  
  If InStr(filenamea,".")>0 then
    datesep=InStr(filenamea,".")
    System.DebugLog("datesep == " + Str(datesep))
    filenamea.ch(datesep)="/"
  end if
  
  System.DebugLog("After correction: "+ filenamea)

I need help with two bits of the code:

The first is to understand what [quote]filenamea.ch(datesep)="-"[/quote] means? does

Secondly below I’ve played around with the code to attempt correcting the first problematic date format I mentioned above i.e. dd.mm.yyyy or dd.mm.yy to dd/mm/yyyy.

This is how do it:

  If InStr(filenamea,".")>0 then
    datesep=InStr(filenamea,".")
    System.DebugLog("datesep == " + Str(datesep))
    filenamea.ch(datesep)="/"
  end if

How do you advise I attempt correcting this date format i.e. dd-mmm-yyy

Clarification: Are you attempting to locate the existing date in the filename and replace it with the current date/time, or are you attempting to locate the existing date and fix its order?

I’m attempting to locate the existing date basically from a list on the same screen and fix its order?

In the main code the filename variable is populated as follows:

filename = LogDetails.cell(0,1) 

Thanks for any suggestions

Sorry … Logdetails is a Listbox … So I want to correct the date in this variable to the non problematic format for my subsequent task

So, other than the order “yyyy mm dd”, every other expected order is “dd mm yy” and the separators are either space, ., /, or -, right?

I’d use two regular expressions and greatly simplify this code. If you’re open to that, I’ll give you code to start with.

thanks for the reply …

Ah regular expressions sound great

The expected order is dd/mm/yyyy so in essence whatever date format is read from LogDetails.cell(0,1) it needs to be changed to dd/mm/yyyy

Any regex to simplify my life would be really really appreciated

Well, here it is anyway. :slight_smile:

dim rx1 as new RegEx
rx1.SearchPattern = "\\b(\\d{4})([-./ ])(1[0-2]|0[1-9])\\g2(3[01]|[12]\\d|0[1-9])\\b" // yyyy mm dd

dim rx2 as new RegEx
rx2.SearchPattern = "\\b(\\d{1,2})([-./ ])(\\d{1,4})\\g2(\\d{4}|\\d{2})\\b"

dim y as integer
dim m as integer
dim d as integer
dim useRegEx as RegEx

dim match as RegExMatch = rx1.Search( filename )
if match IsA RegExMatch then
  useRegEx = rx1
  y = match.SubExpressionString( 1 ).Val
  m = match.SubExpressionString( 3 ).Val
  d = match.SubExpressionString ( 4 ).Val
else
  match = rx2.Search( filename )
  if match IsA RegExMatch then
    useRegEx = rx2
    d = match.SubExpressionString( 1 ).Val
    m = match.SubExpressionString ( 3 ).Val
    y = match.SubExpressionString( 4 ).Val
  end if
end if

if useRegEx is nil then
  // Couldn't find a match so deal with it
else
  // d, m, y have the values you need so format them into whatever replacement string makes sense
  useRegEx.ReplacementPattern = newDate
  filename = useRegEx.Replace( filename )
end if

I’ve tested the patterns but not this code as written so forgive any bugs I might have missed.

Sorry to bother you but with the 1st regex

rx1.SearchPattern = “\b(\d{4})([-./ ])(1[0-2]|0[1-9])\g2(3[01]|[12]\d|0[1-9])\b” // yyyy mm dd

How does this work (I’m still a newbie but I know from my java experience regex’s make life easier)

I’ll break it down for you.

\\b                    # a word break
(\\d{4})               # any 4 digits
([-./ ])              # any of - . / or space (the separator
(1[0-2]|0[1-9])       # the month from 01-12
\\g2                   # the same separator that was matched in group 2
(3[01]|[12]\\d|0[1-9]) # the day from 01-31
\\b                    # a word break

How about the following:

[code]dim rx as new RegEx
rx.SearchPattern = “(?mi-Us)(\d{1,4}).{1}(\d{1,2}|[a-zA-Z]+\.?).{1}(\d{1,4})”

dim rxOptions as RegExOptions = rx.Options
rxOptions.LineEndType = 4

dim match as RegExMatch = rx.Search( sourceText )
[/code]

This would find all Number values in strings starting with a 1-4 letters Number, followed by any Sign, followed by a 1-2 letters Number OR a Text followed by any sign, followed by a 1-4 letters Number.

This would recognize everything in
1.12.1980
01.02.80
2015-11-4
25-May-2014
2015/11/4
25-May-2014
Substrings 1 and 3 are filled with the day and year Numbers and Substring 2 may contain a month name or month number.

Then you will always get the Numbers (and Month name?) as Substrings and can handle them accordingly.

BTW: I did not want to bring a “better” solution compared to @Kem Tekinay solution. It’s just so easy to build RegEx with RegExRX for OS X, that i could not resist :wink:

[quote=181656:@Alex Monari]I’ve got various date formats originating from Windows and Mac which have proved problematic and would like to change these date formats as of when they are read.

The existing code I’m attempting to amend already performs some date corrections from the problematic into the desired format dd/mm/yyyy.

Below are the problematic formats:

dd.mm.yyyy

dd.mm.yy

dd-mmm-yy

d/m/yy

yyyy-mm-dd,

dd mmmm yyyy
[/quote]

Dates are such fun ! :stuck_out_tongue:
Is 01/03/04 dd mm yy, mm dd yy, yy mm dd
Thats an ambiguity that your code might not be able to determine on its own

Dim tempdate, tempcorrectdate as String

’ temp date
tempdate = “2015 02 15”
tempcorrectdate= “”

@ Kem T … Really appreciate the suggestions and as for my 2nd QST please ignore it …

I’ve broken down your code and I get how the regexs work but is there a reason as to why the System.Debuglog statements don’t print in my mssgs tab in the debugger

[code]
’ Regex meaning
'\b # a word break
'(\d{4}) # any 4 digits
'([-./ ]) # any of - . / or space (the separator
'(1[0-2]|0[1-9]) # the month from 01-12
'\g2 # the same separator that was matched in group 2
'(3[01]|[12]\d|0[1-9]) # the day from 01-31
'\b # a word break

dim rx1 as new RegEx
rx1.SearchPattern = “\b(\d{4})([-./ ])(1[0-2]|0[1-9])\g2(3[01]|[12]\d|0[1-9])\b” // yyyy mm dd

dim rx2 as new RegEx
rx2.SearchPattern = “\b(\d{1,2})([-./ ])(\d{1,4})\g2(\d{4}|\d{2})\b”

dim y as integer
dim m as integer
dim d as integer
dim useRegEx as RegEx

dim match as RegExMatch = rx1.Search( tempdate)

System.DebugLog("tempdate = " + tempdate)[/code]

@ Everyone:
Please ignore my last post. Dunno why I clicked it was an answer … Still integrating Kem Tekinay 's suggestion

@Norman Palardy:
I get your point … The Kem Tekinay 's suggestion deals with this

@Sascha S:
Thanks … I’ll give your suggestion a go if it doesn’t go well with Kem Tekinay 's suggestion

Just to be clear - Kems code will get you the digits
What you cannot tell using it is which piece is day, month and year unless they use a 4 digit year which allows you to rule out the 4 digit item as either month or day. But when month & day are < 12 then you’re guessing.
Hopefully correctly.

Strictly numeric forms can be horribly ambiguous.
Here in Canada it’s not uncommon to see mm/dd/yyyy and also dd/mm/yyyy with varying separators in different documents (cheques, letters, etc) and even digitally.

Long ago the company I worked for settled on using YYYYMMDD for a variety of reasons
Even when used as a string it sorts correctly and its unambiguous (all software was required to use this format)
They settled on this format before ISO 8601 was in use

All other forms from outside institutions (banks, edi partners etc) were converted to this internal form

According to his examples, only a four-digit year or a one- or two-digit day will appear first in the string.

10, 11 and 12 are not always a day though
Is 12/10/2014 dd/mm/yyyy or mm/dd/yyyy ?
You cannot tell except by some other data, which may or may not be present, or some other heuristic like assuming its dd/mm

Given the examples, that assumption would have to be made. Otherwise, without “just knowing”, how would you know?

But points for using “heuristic” in a sentence today. :stuck_out_tongue:

I don’t - thats exactly the point :slight_smile:
Its why I mentioned the issue in relation the the company wide edict that “dates will be represented this way” - and I do mean edict.
But we had to find out from each vendor what format their software was using & code the conversion for whatever format they used. Fortunately that code could be reused when there were duplicate formats.

It sucks but dates in random formats are horrible to deal with.

The only other one I’ve seen that is not ambiguous is one using three letter month names and 4 digit years
Like
10 Aug 2019
Aug 1 2013
2013 aug 4
2013 5 aug

if you permit 2 digit years, esp right now when the 2 digit year is also a valid day number, then its a mess as well like
14 aug 15
Is that Aug 14th 2015 or Aug 15th 2014 ?

And don’t get me on about timezones - they’re equally screwy and hard to deal with esp when it crosses years and locations.
Blech !

We should all just use stardates & be done with it :stuck_out_tongue: