Advice on password substitutions

When creating a password, people will often make substitutions to make it a readable word while still using symbols. For example, they might use “pa55word” because the “5” looks like “s”. Other examples: “p4ssword”, “p4ssw0rd” (that’s a zero), etc.

I came up with a list of the potential substitutions that might be used:

“ah4@”, “be38”, “c\(\{\[”, “d\)\]\}”, “g9”, “il|1!”, “o0q”, “s5\$”, “vw”

So, “a”, “h”, “4”, and “@” are interchangeable, “b”, “e”, “3”, and “8” are interchangeable, etc. (Ignore the backslashes; they are there because these will be turned into regular expressions.)

Have I missed any?

(I’ve created a method that will check a given password against the list of the 10k most common passwords, but want to include variations based on these substitutions.)
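
One way to read the groups above is as regex character classes, one class per password character. Below is a minimal sketch of that idea in Python (not the Xojo used later in this thread). The names `GROUPS`, `COMMON`, `variation_pattern`, and `find_common_variation` are made up for illustration, and `COMMON` is a three-word stand-in for the real 10k list.

```python
import re

# Substitution groups from the post above (unescaped); characters in the
# same group are treated as interchangeable.
GROUPS = ["ah4@", "be38", "c({[", "d)]}", "g9", "il|1!", "o0q", "s5$", "vw"]

# Hypothetical stand-in for the list of the 10k most common passwords.
COMMON = ["password", "letmein", "dragon"]


def variation_pattern(password):
    """Build a compiled regex that matches any variation of password."""
    parts = []
    for ch in password.lower():
        group = next((g for g in GROUPS if ch in g), None)
        if group is None:
            # No known substitution: match the character literally.
            parts.append(re.escape(ch))
        else:
            # Escape every member so symbols like ] or $ stay literal.
            parts.append("[" + "".join(re.escape(c) for c in group) + "]")
    return re.compile("".join(parts))


def find_common_variation(password):
    """Return the common password that password is a variation of, or None."""
    pattern = variation_pattern(password)
    for word in COMMON:
        if pattern.fullmatch(word):
            return word
    return None


print(find_common_variation("p@55w0rd"))  # password
print(find_common_variation("xyzzy"))     # None
```

Because whole groups are interchangeable, this also treats “b” and “e” as equivalent, so it errs on the side of rejecting more passwords than strictly necessary.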

Some other common substitutions that come to mind: g6, t+7, z2.

Quite right, thanks. Keep 'em coming.

/\/ / /\/ _/ /\

You get the drift. Not many people use the slash keys for N, I, M, A, V, W, J, L, H, U… but some do.

/\/
/
/\/\
/\
\/
\/\/
_/
/_

/-/
/_/

Best advice I can give: don’t.

Seriously, these things have already been well mapped out by crackers. They have tools that take dictionary words and try any number of crazy substitutions. These days, the only truly “secure” password is a completely random and long one.
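
To give a feel for how cheap those substitutions are to enumerate, here is a small Python sketch. The `LEET` map and the `variants` function are illustrative assumptions, not taken from any real cracking tool, whose rule sets are far larger.

```python
from itertools import product

# Illustrative map only: each letter lists the characters that may stand in for it.
LEET = {"a": "a4@", "e": "e3", "i": "i1!", "o": "o0", "s": "s5$"}


def variants(word):
    """Return every spelling of word under the LEET map."""
    choices = [LEET.get(ch, ch) for ch in word.lower()]
    return ["".join(combo) for combo in product(*choices)]


spellings = variants("pass")
print(len(spellings))       # 27 (1 * 3 * 3 * 3)
print("p@55" in spellings)  # True
```

Each substitutable letter only multiplies the candidate count by a small constant, so a dictionary word with substitutions is still a small search space.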

Eric: Hmm, I’d have to do those manually. I’ll think about whether that’s worth it, but I thank you.

Thom: Your advice is don’t… what? Don’t check against the list of 10k passwords, or don’t try to eliminate based on variations? Since I’m just creating a tool for programmers, I’m not sure what I’m not supposed to be doing. I also don’t see the downside of restricting variations if I were to put this into practice.

Ah, sorry, I missed your last paragraph. I meant “don’t use substitutions at all”.

Oh, well, nobody is going to stop that, I think. :)

I added both “\” and “/” as aliases for “i” and “l”, and “m” as an alias for “a” (to catch those cases where someone typed “AA” to mean “M”), and then added this code:

  // Substitute letters made from slashes
  // (longer to shorter)
  pw = pw.ReplaceAll( "\/\/", "w" )
  pw = pw.ReplaceAll( "/\/\", "m" )
  pw = pw.ReplaceAll( "/-/", "h" )
  pw = pw.ReplaceAll( "/_/", "u" )
  pw = pw.ReplaceAll( "\_\", "u" )
  pw = pw.ReplaceAll( "\_/", "u" )
  pw = pw.ReplaceAll( "/_\", "u" )
  pw = pw.ReplaceAll( "\/", "v" )
  pw = pw.ReplaceAll( "_\", "j" )
  pw = pw.ReplaceAll( "/_", "l" )
  pw = pw.ReplaceAll( "\_", "l" )
  pw = pw.ReplaceAll( "/\", "a" )
  pw = pw.ReplaceAll( "_/", "j" )
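
The “longer to shorter” ordering matters because the shorter slash letters are substrings of the longer ones. Here is a minimal Python sketch of why, using a subset of the table above. `SLASH_LETTERS` and `replace_slash_letters` are names invented for this example, and the backslashes are doubled only because Python string literals require it.

```python
# Subset of the slash-letter table, ordered longest first.
SLASH_LETTERS = [
    ("\\/\\/", "w"),
    ("/\\/\\", "m"),
    ("/-/", "h"),
    ("/_/", "u"),
    ("\\_/", "u"),
    ("\\/", "v"),
    ("/\\", "a"),
]


def replace_slash_letters(pw, table=SLASH_LETTERS):
    """Apply each (art, letter) replacement in table order."""
    for art, letter in table:
        pw = pw.replace(art, letter)
    return pw


# Longest first: the four-character "w" is consumed before the "v" rule runs.
print(replace_slash_letters("\\/\\/ord"))  # word

# Shortest first: the "v" rule fires twice and the "w" is lost.
shortest_first = sorted(SLASH_LETTERS, key=lambda pair: len(pair[0]))
print(replace_slash_letters("\\/\\/ord", shortest_first))  # vvord
```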

Here is the code so far. The basic idea is that it takes the characters of the given string and turns them into a regular expression that is run against the 10k most common passwords. If it finds a match, it returns the matching password as the result. So “\/\/ord” would return “password”, for example.

The name of the method will probably change.


Protected Function VariationOn10K(pw As String) As String
  // Checks to see if a variation of the given password is on the 10,000 list.
  // Makes some common substitutions.
  
  static substitutions() as string = Array( _
  "ahm4@", "be38", "c({[", "d)]}", "g69", "il|1!/\", "o0q", "s5$", "t+7", "vw"  _
  )
  static allSubstitutions as string = join( substitutions, "" ).ReplaceAll( "\", "" )
  static subPatterns() as string
  if subPatterns.Ubound = -1 then
    redim subPatterns( substitutions.Ubound )
    for subIndex as integer = 0 to substitutions.Ubound
      dim group as string = substitutions( subIndex )
      dim chars() as string = group.Split( "" )
      for charIndex as integer = 0 to chars.Ubound
        dim thisChar as string = chars( charIndex )
        if ( thisChar >= "0" and thisChar <="9" ) or ( thisChar >= "a" and thisChar <= "z" ) then
          // Do nothing
        else
          chars( charIndex ) = "\x" + EncodeHex( thisChar )
        end if
      next charIndex
      subPatterns( subIndex ) = join( chars, "" )
    next subIndex
  end if
  
  // Massage the password
  pw = pw.ConvertEncoding( Encodings.UTF8 )
  pw = ReplaceLineEndings( pw, "" ) // Shouldn't be line endings anyway, but just in case
  
  static squeezerRX as RegEx
  if squeezerRX is nil then
    squeezerRX = new RegEx
    squeezerRX.Options.ReplaceAllMatches = true
    squeezerRX.SearchPattern = "(?mi-Us)(.)\g1+"
    squeezerRX.ReplacementPattern = "$1"
  end if
  pw = squeezerRX.Replace( pw )
  
  // Substitute letters made from slashes
  // (longer to shorter)
  pw = pw.ReplaceAll( "\/\/", "w" )
  pw = pw.ReplaceAll( "/\/\", "m" )
  pw = pw.ReplaceAll( "/-/", "h" )
  pw = pw.ReplaceAll( "/_/", "u" )
  pw = pw.ReplaceAll( "\_\", "u" )
  pw = pw.ReplaceAll( "\_/", "u" )
  pw = pw.ReplaceAll( "/_\", "u" )
  pw = pw.ReplaceAll( "\/", "v" )
  pw = pw.ReplaceAll( "_\", "j" )
  pw = pw.ReplaceAll( "_/", "j" )
  pw = pw.ReplaceAll( "/_", "l" )
  pw = pw.ReplaceAll( "\_", "l" )
  pw = pw.ReplaceAll( "/\", "a" )
  
  dim chars() as string = pw.Split( "" )
  
  dim rx as new RegEx
  rx.Options.ReplaceAllMatches = true
  
  // Turn the password into a pattern
  for charIndex as integer = 0 to chars.Ubound
    dim thisChar as string = chars( charIndex )
    if allSubstitutions.InStr( thisChar ) = 0 then
      
      // Won't be a substitution so replace it with its value
      thisChar = "\x{" + EncodeHex( thisChar ) + "}"
      
    else
      
      for subIndex as integer = 0 to substitutions.Ubound
        dim thisSub as string = subPatterns( subIndex )
        rx.SearchPattern = "[" + thisSub + "]"
        rx.ReplacementPattern = "[" + thisSub.ReplaceAll( "\", "\\" ) + "]"
        thisChar = rx.Replace( thisChar )
        if thisChar.Len <> 1 then exit
      next
      
    end if
    
    chars( charIndex ) = thisChar.DefineEncoding( Encodings.UTF8 )
  next
  
  // Remove dups from the list
  for i as integer = chars.Ubound downto 1
    if chars( i ) = chars( i - 1 ) then
      chars.Remove i
    end if
  next
  dim pattern as string = "^.*" + join( chars, "+" ) + "+.*$" // Any of the characters may repeat
  
  // Now see if this pattern is within the 10K.
  dim r as string
  rx.SearchPattern = pattern
  try
    dim match as RegExMatch = rx.Search( kTenThousandString )
    if match <> nil then
      r = match.SubExpressionString( 0 )
    end if
  catch err As RegExSearchPatternException
  end try
  
  return r
  
End Function

Are you trying to calculate the quality of the password? In that case, do you know about the entropy calculation?

Anyway, I usually use easy-to-type passwords, e.g. I prefer letters in a row or in some pattern based on the location of the keys. Think of “qwert”. While that’s probably not in any dictionary nor matched by your above efforts, it’s still a very bad choice as a password, I’d think.

What I’m trying to say is: Isn’t your work a bit futile, as you’ll still get a lot of false positives, i.e. believe “asdfgh” is a rather good pw when it’s not?
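
To put rough numbers on that, here is a small Python sketch of the naive entropy estimate (length times log2 of the character pool) next to a simple keyboard-row check. Both `naive_entropy_bits` and `is_keyboard_run` are hypothetical helpers written for this example, and the row check only assumes a QWERTY layout.

```python
import math
import string


def naive_entropy_bits(password):
    """Estimate entropy as length * log2(size of the character pool in use)."""
    pool = 0
    if any(c in string.ascii_lowercase for c in password):
        pool += 26
    if any(c in string.ascii_uppercase for c in password):
        pool += 26
    if any(c in string.digits for c in password):
        pool += 10
    if any(c in string.punctuation for c in password):
        pool += 32
    return len(password) * math.log2(pool) if pool else 0.0


# Letter rows of a QWERTY keyboard (an assumption about the layout).
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def is_keyboard_run(password, min_len=4):
    """True if the whole password is a run along one row, in either direction."""
    p = password.lower()
    if len(p) < min_len:
        return False
    return any(p in row or p in row[::-1] for row in ROWS)


print(round(naive_entropy_bits("asdfgh"), 1))  # 28.2
print(is_keyboard_run("asdfgh"))               # True
```

The estimate gives “asdfgh” the same score as any other six lowercase letters, which is why a pattern check like the second function has to be a separate test.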

No, this has nothing to do with “quality” of the password. There is a list of the 10,000 most commonly used passwords so I’m creating a tool that a programmer can use to disallow those specific passwords, including parts and variations. Any additional checks for “quality” would be up to the programmer, although I may add tools for that too.