It can be arbitrarily complicated. It depends on how many fields that you have to deal with (First, Last, Middle, Phone, Birthday, SS# etc.)
The more fields that you have, the more sophisticated a strategy can be employed. It depends on how may people are in the database.
If you are dealing with large databases (100,000) range and if there is manual entry then there will be many many duplicates. It is easy to alert when the names are the same and the data entry is perfect. But the issue in real life is that people’s names “change” (marriage, divorce, nicknames etc.) Data entry is very imperfect when done by humans.
So consider doing lots of different evaluations to look for duplicates. Basically you are trying to create metric for “similarity”. Robert and Bob are more similar than Robert and David. Social Security numbers that are off by 1 or 2 numbers can be assigned a value for similarity (The last 4 digits much more important than the others) Birthdays that are “off” by one month or day are “similar”. Transpositions are common with manual entry (7/12/50 is similar to 7/21/50) etc. In my own work I developed a “score” to estimate how “similar” two entries were.
Then I could manually look at all cases with high scores.
There are algorithms to evaluate how “similar” two strings are. (for example Levenshtein Distance that you can look up in Wikipedia but I found custom code appropriate to the population at hand more successful and faster. There are algorithms that evaluate how similar names are in terms of sound. (Kathy/Cathy Jefferson/Jeffersen etc.)
If you have 100,000 people, comparing every entry against every other entry is very or perhaps prohibitively time consuming. I would tend to sort the people by many different criteria and then only compare adjacent or the matching group by that criteria. So I might have ten ways of sorting the database and then compare adjacent entries and give them a similarity score.
For example sort by the first 4 letters of the last name concatenated to the last 5 digits of the phone number. Or the birthday and the last 4 digits of the social security number etc. Then assign a score to the “similarity” of adjacent people and create a list of these pairs and ultimately sort them by their score. The true duplicates will tend to float to the top.
The population matters. Old white Americans, Frenchmen, etc. all will have algorithms based on names that would be different. Two entries with last name Smith are less likely to be duplicates that two entries with the last name Quismodo. A match between a “middle” name and a last name or first name and middle name is given “some” weight but not as much as between two last names.
Once you start finding the duplicates in the database, you can start refining the scores that you assign to various degrees of similarity.
Finding exact matches is not a real description of the task at hand. You build up a fairly complex bit of code to deal with this situation. Just keep refining your code as you start seeing the “results”