Safely storing user auth passwords in a database - COMMENT PLEASE

Jason_Doller · November 15, 2020, 12:02pm

What is the point of this post?
There are two desired outcomes of this post:

Approach confirmation / correction: Someone notices a problem with my approach and gives me an opportunity through suggestions or discussion to fix what I’ve done and make it better.
Community accepted good practice: My approach is confirmed by the community as a good way to implement an existing best practice, providing a cut-and-paste solution for people looking to implement this on their own.

Assumptions & Background

The best way to store a user’s password in a database is to hash it with a salt.
While using a single salt for all passwords is certainly more secure than no salt, it does open you up to people with the same passwords having the same hash.
You could combat this by using a salt made up of a system salt + the user’s username (or similar). This creates a problem in systems where the username may change regularly, or where you need to change usernames site-wide for some reason.
The best way to salt a password is to have a unique salt for each password, and store that salt with the hashed password. While this does take up more storage, it’s generally a relatively tiny amount of storage.
The salt doesn’t need to be secret, or secure, or difficult to guess, it only has to be relatively unique.
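To illustrate the difference between a shared salt and a per-password salt, here is a minimal sketch in Python (not Xojo, purely for illustration). The `hash_password` helper, the sample password, and the iteration count of 1000 are assumptions made for the example, not part of my implementation.

```python
import hashlib
import secrets


def hash_password(password: str, salt: bytes, iterations: int = 1000) -> str:
    """Hash a password with PBKDF2-HMAC-SHA256 and return the digest as hex.

    The iteration count is an illustrative value, not a recommendation.
    """
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return digest.hex()


# One salt for every user: identical passwords give identical hashes.
shared_salt = b"one-salt-for-everyone"
alice_shared = hash_password("hunter2", shared_salt)
bob_shared = hash_password("hunter2", shared_salt)

# A unique salt per password: identical passwords give different hashes.
alice_unique = hash_password("hunter2", secrets.token_bytes(16))
bob_unique = hash_password("hunter2", secrets.token_bytes(16))

print("shared salt, same hash:", alice_shared == bob_shared)
print("unique salts, same hash:", alice_unique == bob_unique)
```

With a shared salt, anyone who sees the table can tell which users picked the same password; with unique salts they cannot.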

My Approach - Creating the salt

I decided that the easiest and fastest way to create the salt would be to use the current date and time.
- (If your site is busy enough that a timestamp would not offer enough granularity, consider adding the remote password to the timestamp to add uniqueness.)
While it is theoretically possible for people to guess the value used for the salt, the salt does not have to be secret, it just has to be unique.
Additionally, I then convert the timestamp to an MD5 hash (MD5 chosen because it’s cheap and collisions aren’t an issue), which I convert to hex, and use a part of the string (32 characters in my case).
- I do this to get a consistently long string that is unique enough.

This is the function I use to do this:

Public Function MakeSalt(SaltLength As Integer) As String
  // Seed: the current date and time as a whole number of seconds
  Dim dDate As New Date
  Dim RetVal As String = Format(dDate.TotalSeconds, "#")
  
  // An MD5 digest is 16 bytes, so its hex form is at most 32 characters
  If SaltLength > 32 Then SaltLength = 32
  
  RetVal = MD5(RetVal)
  RetVal = EncodeHex(RetVal)
  
  RetVal = RetVal.Left(SaltLength)
  
  Return RetVal
  
End Function

My Approach - Hashing and storing the password

Once I have the salt, hashing the password is simple using PBKDF2.
I then convert the hash to hex
- Converting it to hex makes it easier to work with as a string.
- Storing the password as a hex string makes it trivial to reversibly disable a user’s password - add a non-hex string to the beginning or end of the password hash.
And then combine the salt and the password hash to create the string that gets stored in the database.
- I use a delimiter between the salt and the password hash. This is a convenience only.
- It would be more secure to store the salt and password as a single string without a delimiter, requiring an attacker to know the length of the salt before they could start attacking the password.
  - I do not believe this increased security has any real-world benefit - security through obscurity seldom does.
  - Specifically, knowing the salt is computationally almost meaningless if they do not know the PBKDF2 iterations, and if they know the iterations they know the hash length and can deduce the salt.

This is the function I use to create the hashed password string for storage:

Public Function PasswordHash(password as String, salt as String) as String
  
  Var RetVal As String
  Var cryptIterations As Integer = 712
  Var cryptHashLength As Integer = 64
  
  If salt = "" Then
    salt = MakeSalt(32)
  End If
  
  Var tmpSalt As Text = salt.DefineEncoding(Encodings.UTF8).ToText
  Dim saltData As Xojo.Core.MemoryBlock  = Xojo.Core.TextEncoding.UTF8.ConvertTextToData(tmpSalt)
  Var tmpPassword As Text = password.DefineEncoding(Encodings.UTF8).ToText
  Dim passwordData As Xojo.Core.MemoryBlock = Xojo.Core.TextEncoding.UTF8.ConvertTextToData(tmpPassword)
  
  Dim hashData As Xojo.Core.MemoryBlock = Xojo.Crypto.PBKDF2(saltData, passwordData, cryptIterations, cryptHashLength, Xojo.Crypto.HashAlgorithms.SHA256)
  
  RetVal = MemBlockToHex(hashData)
  
  if password  = "" Then 
    RetVal = "Z" + RetVal + "Z"
    System.DebugLog("No password given.  Corrupting stored password. Good luck logging in!")
  end if
  
  RetVal = salt + ":::" + RetVal
  
  Return RetVal
End Function
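The "reversibly disable a password" idea mentioned above can be sketched as follows. This is a Python illustration rather than Xojo. The `:::` delimiter is the one I use; the `Z` marker position and the helper names (`split_stored`, `is_usable`, `disable`, `enable`) are hypothetical and only meant to show the principle.

```python
import string

DELIMITER = ":::"    # delimiter between salt and hash, as in the post
DISABLED_MARK = "Z"  # hypothetical non-hex marker; any non-hex character works


def split_stored(stored: str) -> tuple:
    """Split a stored string into (salt, hash_part) at the first delimiter."""
    salt, _, hash_part = stored.partition(DELIMITER)
    return salt, hash_part


def is_usable(stored: str) -> bool:
    """A stored password is usable only if its hash part is non-empty, pure hex."""
    _, hash_part = split_stored(stored)
    return bool(hash_part) and all(c in string.hexdigits for c in hash_part)


def disable(stored: str) -> str:
    """Prefix the hash with a non-hex marker so no PBKDF2 output can ever match."""
    salt, hash_part = split_stored(stored)
    if hash_part.startswith(DISABLED_MARK):
        return stored  # already disabled
    return salt + DELIMITER + DISABLED_MARK + hash_part


def enable(stored: str) -> str:
    """Remove the marker again, restoring the original stored string."""
    salt, hash_part = split_stored(stored)
    if hash_part.startswith(DISABLED_MARK):
        hash_part = hash_part[len(DISABLED_MARK):]
    return salt + DELIMITER + hash_part
```

Because a hex-encoded hash can never contain the marker, a disabled entry cannot match any freshly computed hash, and removing the marker restores the original value exactly.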

My Approach - Authenticating a user

Get the username / password from the user
Get the password string for the username
Extract the salt from the password string (made up of the salt [+ delimiter] + password hash)
Turn the provided password into a password string using the stored salt.
Compare generated password string to stored password string
- A match means a successful auth

This is the function I use for this:

Public Function PasswordCheck(username as String, password As String) as Boolean
  
  Var SQL As String = "SELECT * FROM users_active WHERE username = ?"
  Var RS As RowSet = userdb.SelectSQL(SQL, username)
  
  If RS = Nil Or RS.RowCount < 1 Then
    Return False  // No matching username (RowCount < 1), or some other problem (RS = Nil)
  Else
    Var StoredPasswordString As String = RS.Column("password").StringValue
    Var StoredSalt As String = StoredPasswordString.NthField(":::", 1) // I use ::: as a delimiter.  You can use StoredPasswordString.Left(salt_length) if you do not use a delimiter.
    Var HashedPasswordString As String = PasswordHash(password, StoredSalt)  // This function is defined above
    
    If HashedPasswordString = StoredPasswordString Then
      Return True // Passwords match, auth SUCCESS
    Else
      Return False // Passwords do not match, auth FAIL
    End If
  End If
  
End Function
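For readers who want to try the authentication steps outside Xojo, here is a rough Python equivalent. It keeps the 712 iterations, 64-byte hash length, and `:::` delimiter from the code above. The in-memory `users` dictionary stands in for the database, and both the blank-password guard and the constant-time comparison via `hmac.compare_digest` are additions for the sketch, not things my Xojo code does.

```python
import hashlib
import hmac

DELIMITER = ":::"
ITERATIONS = 712   # value from the post; see the discussion about raising it
HASH_LENGTH = 64   # bytes of PBKDF2 output, as in the post


def password_hash(password: str, salt: str) -> str:
    """Build the stored string: salt + delimiter + hex PBKDF2 hash."""
    digest = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        salt.encode("utf-8"),
        ITERATIONS,
        dklen=HASH_LENGTH,
    )
    return salt + DELIMITER + digest.hex()


def password_check(users: dict, username: str, password: str) -> bool:
    """Return True only if the offered password reproduces the stored string."""
    stored = users.get(username)
    if not stored or not password:
        return False  # unknown user, empty stored value, or blank password
    salt = stored.split(DELIMITER, 1)[0]
    candidate = password_hash(password, salt)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(candidate.encode("ascii"), stored.encode("ascii"))


# Hypothetical stand-in for the users table.
users = {"jason": password_hash("correct horse", "0123456789abcdef")}
print(password_check(users, "jason", "correct horse"))
print(password_check(users, "jason", "wrong guess"))
```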

Summary

I use the hex-encoded MD5 hash of a timestamp as a salt.
I store the salt with the password hash in the database.
Authenticating is as simple as reading the stored password, extracting the salt, hashing the offered password, and comparing to see if they match.

Suggestions or corrections, please…

Kem_Tekinay · November 16, 2020, 3:17pm

Keeping in mind that I only skimmed your post…

There has been much written on this topic, including quite a few posts on this forum. You might want to take a look if you haven’t already.

My comments/suggestions in no particular order:

Generally, your scheme is fine. For the salt, I recommend Crypto.GetRandomBytes (12 or more) and use the raw output in PBKDF2. Once you have a result, convert the hash and salt to hex. When pulling the data for comparison, convert the hex to raw bytes before processing.
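As a rough sketch of that flow (in Python rather than Xojo, so the calls are standard-library ones and not the Crypto module): random bytes for the salt, raw bytes into PBKDF2, hex only for storage, and hex decoded back to raw bytes before checking. The helper names, the 16-byte salt, and the 1000 iterations are assumptions for the example.

```python
import hashlib
import hmac
import secrets


def make_record(password: str, iterations: int = 1000) -> tuple:
    """Return (salt_hex, hash_hex) ready to be stored."""
    salt = secrets.token_bytes(16)  # raw random bytes, 12 or more
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt.hex(), digest.hex()  # hex is used for storage only


def verify(password: str, salt_hex: str, hash_hex: str, iterations: int = 1000) -> bool:
    """Decode the stored hex back to raw bytes, rehash, and compare."""
    salt = bytes.fromhex(salt_hex)
    expected = bytes.fromhex(hash_hex)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(digest, expected)


salt_hex, hash_hex = make_record("example password")
print(verify("example password", salt_hex, hash_hex))
print(verify("another password", salt_hex, hash_hex))
```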

You’re using the Xojo module to do the bulk of your work, but all these tools are available in the Crypto module. Conversion to a MemoryBlock is unnecessary with the API2 calls.

It’s unclear what your settings are for PBKDF2, but make sure they are sufficiently high so that it takes some time to hash. I recommend somewhere between 0.25 and 0.5 seconds.
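One way to find a suitable iteration count is to measure on the machine that will actually do the hashing. The following Python sketch (not Xojo) doubles the iteration count until a single hash takes at least the target time. The function names, the starting count, and the upper bound are hypothetical choices for the illustration.

```python
import hashlib
import time


def time_pbkdf2(iterations: int, hash_length: int = 64) -> float:
    """Return the seconds one PBKDF2-HMAC-SHA256 call takes on this machine."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac(
        "sha256", b"benchmark-password", b"benchmark-salt", iterations, dklen=hash_length
    )
    return time.perf_counter() - start


def calibrate(target_seconds: float = 0.25,
              start_iterations: int = 1000,
              max_iterations: int = 50_000_000) -> int:
    """Double the iteration count until one hash takes at least target_seconds."""
    iterations = start_iterations
    while iterations < max_iterations and time_pbkdf2(iterations) < target_seconds:
        iterations *= 2
    return iterations


if __name__ == "__main__":
    chosen = calibrate(target_seconds=0.25)
    print("iterations:", chosen, "seconds:", round(time_pbkdf2(chosen), 3))
```

Run it on the production server rather than the development machine, since the two can differ widely in speed.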

PBKDF2 is good, Bcrypt is (supposed to be) better, and Scrypt is (supposed to be) better than that. The latter are included in my free M_Crypto package.

I recommend storing a “version” column alongside the password. This will let you update or change your scheme in the future. For example, you may start with 1000 iterations for PBKDF2 (version 1), then decide later you want 2000 (version 2), then later use a different algorithm entirely (version 3).
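A minimal sketch of how a version value might be used, written in Python rather than Xojo. The `SCHEMES` table mirrors the 1000 and 2000 iteration example above; the record layout and the upgrade-on-login behaviour are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

SCHEMES = {1: 1000, 2: 2000}  # version -> PBKDF2 iterations (example values)
CURRENT_VERSION = 2


def make_record(password: str, version: int = CURRENT_VERSION) -> dict:
    """Hash a password under the given scheme version."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, SCHEMES[version]
    )
    return {"version": version, "salt": salt.hex(), "hash": digest.hex()}


def verify_and_upgrade(password: str, record: dict) -> tuple:
    """Check the password under the record's own version.

    Returns (ok, record). On success with an outdated version, the returned
    record is a fresh one hashed under CURRENT_VERSION, ready to be saved.
    """
    digest = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        bytes.fromhex(record["salt"]),
        SCHEMES[record["version"]],
    )
    if not hmac.compare_digest(digest, bytes.fromhex(record["hash"])):
        return False, record
    if record["version"] != CURRENT_VERSION:
        record = make_record(password)  # rehash while the plain text is available
    return True, record
```

The upgrade can only happen at login, because that is the only moment the plain-text password is available.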

Be sure to test against a blank password column to make sure it fails authentication.

Kem_Tekinay · November 16, 2020, 3:28pm

One more thing. No matter how secure the scheme on your end, it will likely fail if the user uses a bad password. Create some minimum rules for the password, like a minimum length, etc.

Also, there is an available list of the 10k most common passwords. Do not allow your users to use those or close variations of those.
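A simple version of such a check could look like the Python sketch below. The four-entry `COMMON_PASSWORDS` set is a tiny stand-in for the real 10k list, and the minimum length of 10 and the "strip trailing digits and exclamation marks" rule for close variations are arbitrary assumptions.

```python
# Tiny stand-in for the real list of common passwords.
COMMON_PASSWORDS = {"password", "123456", "qwerty", "letmein"}
MIN_LENGTH = 10  # assumed minimum; pick your own policy


def is_acceptable(password: str, common: set = COMMON_PASSWORDS) -> bool:
    """Reject short passwords, common passwords, and simple variations of them."""
    if len(password) < MIN_LENGTH:
        return False
    lowered = password.lower()
    # Treat "Password123!" style variations as the underlying common word.
    stripped = lowered.rstrip("0123456789!")
    return lowered not in common and stripped not in common


print(is_acceptable("Password123!"))
print(is_acceptable("correct horse battery staple"))
```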

Jason_Doller · November 16, 2020, 4:36pm

I’ll certainly look at Crypto.GetRandomBytes. In fact, I’ll take a look at the entire Crypto module a little more closely.

Why do you suggest converting the hex to raw bytes for the comparison?

I’m only recently getting back into Xojo after about 6 years away, and I’ll be honest - the documentation is confusing and seems to be lacking. That could be a result of me being out of it for a while, but I am feeling a little sorry for people who are new to Xojo, but not programming in general.

Between 0.25 and 0.5 seconds will be next to impossible to calculate. My dev machine has 12 3GHz cores and 64GB RAM. The VPS this particular app will be run on has 2 2GHz cores and 4GB RAM. I have no idea what spec of machine an attacker may use, although it’s a fair guess they will be using something GPU assisted.

The Iterations I’m using in the above is currently 712, with a hashlength of 64. I couldn’t find a formula to give an estimate of hashing times given an input processor, but sometime in the next few days I’ll create some kind of measurement app and see what that shows.

Bcrypt is better, but not better enough that I’d consider switching. Scrypt is, but one of the primary benefits of Scrypt - its high memory utilisation - is also a weakness. If using less than 4 MB, Scrypt is actually worse than Bcrypt (higher resource usage for equivalent or less strength). And for a web server, you really don’t want your server using a large chunk of its RAM for logins (think denial of service attack). (Please note that my criticism of Scrypt is solely as a password storage solution.)

Given that one of the things I hoped to accomplish with this is discussion about the best ways to implement this, I would be very interested in other viewpoints on this - I think PBKDF2 is good enough. We are securely storing passwords to be safe IF we are compromised, so how secure do we need to be? (I know the answer depends on ‘how often will you need to authenticate’, ‘how secure do you need your passwords to be?’ and ‘what is your server configuration’.) How do you make a recommendation?

Would you be opposed to storing the version in the password field? I would suggest storing all this information in a single field and extracting it using length or character delimiters, since all the information is only needed together.

If I understand what you are suggesting, I deliberately corrupt blank passwords by adding a non-hex character to the hex string.

Summary

Look at Crypto.GetRandomBytes vs Timestamp
Look at Crypto in general to see what I could be using
Create a tool that measures hashing time.

Suggestions for password complexity and checking for bad passwords are valid and important, but not in the context of password storage. But here, just because I can.

Jason

Kem_Tekinay · November 16, 2020, 4:48pm

There might be a miscommunication here. I am suggesting that hex only be used for storage, otherwise converted to raw bytes for calculations. Whether you ultimately validate against the hex or the raw bytes is of no consequence.

Jason_Doller · November 16, 2020, 4:54pm

I apologise, I did misunderstand.

Jeff_Tullin · November 16, 2020, 4:57pm

I only skimmed this.
But I came away thinking

"Haha! I have stored away stuff in a REALLY secret place.
And this is exactly where I hid it."

Hopefully your actual algorithm is different and/or nobody who gets their hands on your database reads this forum.

Kem_Tekinay · November 16, 2020, 5:03pm

If I didn’t get the joke, that’s on me, but I’m curious why you’d think so?

Jeff_Tullin · November 16, 2020, 5:24pm

Only that each stage is followed by actual code showing how stuff is encrypted.
If I was trying to secure something, I’d ask, but not show specifics.
Don’t mind me.

Kem_Tekinay · November 16, 2020, 5:31pm

Not encrypted, hashed, and there is nothing really out of the ordinary here. If his technique relied on keeping something other than the password itself secret, it would be a problem, but the principle of effective security is that you should be able to publish everything about it short of the password and still be able to sleep at night.

The bad actor would still need access to the database anyway, and presumably that’s protected too.

Thom_McGrath · November 17, 2020, 12:18pm

If your security relies on somebody not knowing how it works, then it isn’t very secure. See Kerckhoffs’s principle. The security should still work even when everything about the system is known. Except the private keys, of course. So if you really want your mind blown, my app is fully open source. All code, database schemas, installed scripts, art files, for both the app and its website, are available on GitHub. And yet, the security still works.

I would be. There’s little purpose to storing everything in one field, other than it seems familiar. There’s no advantage over using separate database columns for each value. One could argue there is an advantage to splitting the values because it’s easier to review in the future. On the other hand, somebody attacking your hashes would need to write custom logic to break the values apart. I wouldn’t count on that as defense though.

Jason_Doller · November 17, 2020, 6:36pm

Back-to-front response.
Security through obscurity is only a problem if obscurity is your primary defense, since if that goes you are wide open. Using obscurity on top of a proper security implementation at worst does nothing, at best makes things more secure. If the stored hash has no obvious delimiter (i.e. width delimited) then the attacker would have no real way to tell what is password hash and what is metadata, requiring them to not only compromise the app runtime AND the database, but also reverse engineer the runtime to find the delimiter widths. Security is not absolute, it’s about making things so hard the attacker would rather go elsewhere.

Second part - one string. Assuming Xojo doesn’t completely suck at string manipulation, the cost of splitting a string vs getting an additional field from the database is negligible. It’s also easier to pass a single string around than multiple strings. With that said, easing code review is important generally, and also with regards to security. I’ll have to think about why I prefer a single string.

Finally, just to make it clear, all the comments so far have been helpful, and have challenged me to defend or reconsider my approach. I’m grateful regardless of how I appear to respond.
