Perhaps you, too, have been advised something similar and have been wondering what the right thing to do really is. So let's explore this dilemma.
Usually, when you want to implement this measure (not revealing the existence of an email on the password reset page), the purpose is to prevent potential attackers from:
- Building a list of valid emails (to spam them later or to mount a phishing attack).
- Finding out who is registered on your website.
The latter is especially dangerous because sometimes just knowing that you are registered somewhere can put you at risk. And once an attacker discovers a valid email address on one website, they can try to use it to gain access elsewhere, perhaps somewhere less secure.
But why exactly do you want to implement this measure? What's the issue and the benefit?
The "why" is that you have identified a specific risk, either to your business or to your users, and you want to mitigate it for the benefit of keeping your users safe(r). But if you just fix the password reset form, you won't really achieve your goal, will you?
To fully mitigate this, your first step should be fully identifying what the risk is (done!) and then you can work your way through to a technical solution. If you do this, I am sure you will discover many places where you expose the existence of an email address; for example, the registration form, password reset form and, for logged-in users, don't forget the email change form, if you have it. Remember, a registered and logged-in user can also be an attacker!
Once you understand this, it becomes obvious that the original advice to not reveal valid email addresses on the password reset form is sound but largely incomplete; to mitigate the risk entirely, you need to identify and fix all the places where it exists. Otherwise, there truly is no point.
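To make the idea concrete, here is a minimal sketch of a password reset handler that answers identically for known and unknown addresses. The names `REGISTERED_EMAILS`, `GENERIC_MESSAGE`, `request_password_reset` and the `outbox` list are hypothetical stand-ins for illustration, not part of any particular framework.

```python
# Hypothetical in-memory user store; a real application would query its database.
REGISTERED_EMAILS = {"alice@example.com"}

# The same text is returned for every request, registered or not.
GENERIC_MESSAGE = "If that address is registered, we have sent reset instructions to it."


def request_password_reset(email: str, outbox: list) -> str:
    """Queue a reset email only for known addresses, but always answer the same way."""
    normalized = email.strip().lower()
    if normalized in REGISTERED_EMAILS:
        # Stand-in for queueing the real reset email.
        outbox.append(normalized)
    return GENERIC_MESSAGE


outbox = []
known = request_password_reset("Alice@example.com", outbox)
unknown = request_password_reset("mallory@example.com", outbox)
print(known == unknown)  # the caller cannot tell the two cases apart
```

The same pattern would have to be applied to the registration and email change forms, and a response that takes noticeably longer for registered addresses can still leak the answer, so the email work is best done in the background.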
Let's take a look at other common pieces of "good advice" you might find online with little to no context as to why you might want to implement them.
"Hash your users' passwords with a high number of rounds."
I'd assume everyone nowadays knows that we should be hashing our users' passwords instead of storing them in plaintext in the database. But why exactly are we doing this, and what is the adequate number of rounds to use?
The goal of password hashing is to make sure that nobody except the user knows what the actual password is. You might argue that protecting the database well enough would also accomplish this, but don't forget that database administrators exist! These people should also not be able to see the passwords, as not all database administrators have purely good intentions. Also, accidents happen; a seemingly innocuous SQL script might accidentally leak the unprotected plaintext passwords. That's why hashing the passwords is the only safe way to store them.
Now, why do we need to use a hashing algorithm that takes ages (in CPU time) to compute instead of just running the password through MD5 and reducing our carbon footprint? If the risk is revealing your users' passwords, then the mitigation for that must be to make the process of recovering the original password from a hash impossible. Since we cannot really achieve "impossible" here, we need to make each guess prohibitively expensive for the attacker, but not so expensive that verifying a legitimate user's login becomes too slow.
The theory is that the stolen password hash must withstand a direct attack only long enough for the user to be notified about the leak and for them to change the password, rendering the stolen hash worthless. This is also the reason why you should not simply increment a counter in your password when you change it, but really change the whole thing. If the attacker sees a pattern in your old password, they might try a few other similar combinations and get lucky.
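As an illustration of the "number of rounds" trade-off, here is a small sketch using PBKDF2 from Python's standard library. The iteration count and the helper names `hash_password` and `verify_password` are assumptions made for this example; the right work factor depends on your hardware and on how long you can let a login take.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed work factor for illustration only; measure on your own servers.
ITERATIONS = 600_000


def hash_password(password: str, iterations: int = ITERATIONS) -> Tuple[bytes, bytes, int]:
    """Return (salt, digest, iterations) so the parameters can be stored with the hash."""
    salt = os.urandom(16)  # a unique salt per password defeats precomputed tables
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest, iterations


def verify_password(password: str, salt: bytes, digest: bytes, iterations: int) -> bool:
    """Recompute the hash with the stored parameters and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)


salt, digest, rounds = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest, rounds))
print(verify_password("wrong guess", salt, digest, rounds))
```

Storing the iteration count next to each hash lets you raise the work factor later and upgrade old hashes the next time each user logs in.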
"Use a strong, non-default database password."
Some databases ship with a default or empty password. Fortunately, most of them nowadays initially restrict access to localhost-only clients, which makes things a bit more secure. However, if you open the database up to remote connections but forget to change the default password, you are in for some big trouble! That's why you will see this advice everywhere, and it is good advice indeed! But don’t forget that your database will most likely be running on some kind of server. What about that server's password? Have you changed the default account's password on the server as well? Or, in the case of a cloud-based infrastructure, how about the cloud account's password?
Recently, it has become more and more common to use a managed database service (like Amazon's RDS or Google's Cloud SQL), which will require a custom or randomly generated database password when you create the instance. The same goes for managed servers. For example, Amazon's EC2 instances are typically set up to allow only public key-based login when you create them, which is generally considered more secure than password-based SSH authentication. Still, it is a good idea to check all the possible access points that lead to your database or other important pieces of infrastructure, and to make sure that all the accounts are well protected and all passwords are strong ones. You can start from the database and make your way through the infrastructure all the way up to the cloud provider's accounts.
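If you need to replace a default password yourself, one option is to generate it rather than invent it. The sketch below uses Python's `secrets` module; the function name `generate_password`, the 32-character length and the alphanumeric alphabet are assumptions chosen for illustration.

```python
import secrets
import string

# Letters and digits only, to avoid quoting problems in connection strings.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 32) -> str:
    """Build a password from a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_password())
```

A generated password like this belongs in a secrets manager or password vault, not in source control.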
And About Password Resets…
When a user requests a password reset, some applications generate a relatively short random string or token and send it to the user's email address. When the user provides that token back to the application, they are allowed to change their password. What is sometimes forgotten is that this short token is also a means of logging in, so it should be subject to the same rigorous security measures as the main login flow itself.
Imagine this: You generate an 8-digit password reset code and send it to the user's email address as a clickable link. The password reset endpoint allows an unlimited number of attempts on the reset code because you forgot to add any kind of rate limiting, or the reset code never expires. The attacker requests a reset for someone else's account. They don't receive the link, of course, but now they can abuse your password reset endpoint to brute-force just the eight digits instead of a regular password, which is substantially easier because there are far fewer possible combinations than in a regular password. Plus, there is very likely no slow hashing involved, so the backend responds blazingly fast to each request. In effect, this is a downgrade attack: the attacker sidesteps the well-protected login path and attacks a weaker one instead.
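A rough back-of-the-envelope calculation shows how large the gap is. The guess rate of 1,000 requests per second and the 10-character alphanumeric password used for comparison are assumptions picked for illustration, and the helper names are hypothetical.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of possible values for a secret of the given alphabet and length."""
    return alphabet_size ** length


def worst_case_hours(space: int, guesses_per_second: float) -> float:
    """Hours needed to try every value at a constant guess rate."""
    return space / guesses_per_second / 3600


ASSUMED_RATE = 1_000  # guesses per second against an unthrottled endpoint (assumption)

reset_code_space = keyspace(10, 8)   # 8 decimal digits
password_space = keyspace(62, 10)    # 10 characters from a-z, A-Z, 0-9 (assumption)

print(f"reset code: {reset_code_space:,} combinations")
print(f"password:   {password_space:,} combinations")
print(f"hours to exhaust the reset code: {worst_case_hours(reset_code_space, ASSUMED_RATE):.2f}")
```

Under these assumptions the whole reset code space can be tried in a little over a day, while the password space is billions of times larger.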
For the best security, the password reset token should really be considered a "password": you generate a long-enough random string, hash it with the same algorithm you use to hash passwords and then store only the hash in the database. At the same time, the number of attempts to guess the token should be limited to only a few tries and, if they all fail, the password reset token should be invalidated. It should also expire on its own after a short time. Now the regular password and the password reset code are equivalent in terms of cracking or guessing difficulty, meaning that an attacker does not gain an advantage in either of the two login procedures.
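The following sketch puts those pieces together: a long random token, a stored hash instead of the token itself, a small attempt budget and an expiry time. The names `ResetRecord`, `issue_token` and `check_token`, the 15-minute lifetime, the three-attempt limit and the PBKDF2 parameters are all assumptions for illustration; a real application would persist the record in its database.

```python
import hashlib
import hmac
import os
import secrets
import time
from dataclasses import dataclass
from typing import Optional, Tuple

MAX_ATTEMPTS = 3              # assumed attempt budget per token
TOKEN_TTL_SECONDS = 15 * 60   # assumed lifetime of a token
ITERATIONS = 100_000          # assumed PBKDF2 work factor


def _hash(token: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", token.encode("utf-8"), salt, ITERATIONS)


@dataclass
class ResetRecord:
    """What gets stored server-side; the plaintext token is never kept."""
    salt: bytes
    token_hash: bytes
    expires_at: float
    attempts_left: int = MAX_ATTEMPTS


def issue_token(now: Optional[float] = None) -> Tuple[str, ResetRecord]:
    """Create a token to email to the user and the record to store."""
    issued = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe encoded
    salt = os.urandom(16)
    return token, ResetRecord(salt, _hash(token, salt), issued + TOKEN_TTL_SECONDS)


def check_token(record: ResetRecord, candidate: str, now: Optional[float] = None) -> bool:
    """Accept the token once, within its lifetime and attempt budget."""
    current = time.time() if now is None else now
    if record.attempts_left <= 0 or current >= record.expires_at:
        return False
    if hmac.compare_digest(_hash(candidate, record.salt), record.token_hash):
        record.attempts_left = 0  # single use: a redeemed token cannot be replayed
        return True
    record.attempts_left -= 1     # every failed guess burns one attempt
    return False


token, record = issue_token()
print(check_token(record, "not-the-token"))  # a wrong guess is rejected
print(check_token(record, token))            # the real token is accepted once
```

Because the record is looked up per account, an attacker who exhausts the attempts only invalidates that one token; the legitimate user can simply request a new one.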
If you’d like to keep your password reset codes short and easy to remember, the best thing you can do is to severely limit the number of requests on the password reset endpoint, for example to only three attempts per hour per account. That way, the attacker has to wait so long between guesses that a brute-force attack is no longer feasible.
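A limit like "three attempts per hour" can be sketched as a sliding-window counter. The class name `SlidingWindowLimiter` and the in-memory storage are assumptions for illustration; with several application servers, the counters would need to live in a shared store.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most max_attempts per key within any window_seconds span."""

    def __init__(self, max_attempts: int = 3, window_seconds: float = 3600.0):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = defaultdict(deque)  # key -> timestamps of recent attempts

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        current = time.monotonic() if now is None else now
        attempts = self._attempts[key]
        # Forget attempts that have fallen out of the window.
        while attempts and current - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False
        attempts.append(current)
        return True


limiter = SlidingWindowLimiter()
results = [limiter.allow("alice@example.com") for _ in range(4)]
print(results)  # the fourth attempt within the hour is refused
```

Keying the limiter on the targeted account rather than only on the client's IP address matters here, because an attacker can spread guesses across many addresses. At three guesses per hour, exhausting an 8-digit code space would take on the order of thousands of years.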
Defending Is a Lot Harder Than Attacking
Application security is a complex topic, and you as the "defender" have a very difficult job to do. An attacker has it much easier, as they need to find just one weak point, whereas you need to find and (ideally) fix all of them. Hopefully, this article helped guide your thinking about some of these concepts so you can better identify weak spots and implement complete security measures to mitigate them.