There are definitely times when you are handling confidential data and want to make sure that data stays confidential. This type of data could be anything from credit card numbers to social security numbers. Typically what is done in the real world is that a part of the data is replaced with a symbol (usually an asterisk), which allows the user to make an educated guess that the data is correct and prevents others from viewing the confidential data.
In this post I will show you how you can anonymize your data through an Excel formula and through a VBA macro.
Anonymize via Excel Formula
For example purposes let’s take a made up social security number (123456789) and try to make it partially anonymous by only showing the last 4 digits.. To do this we will need to use three different functions:
- RIGHT( ) – Takes a given text or number and return the rightmost specified amount of characters. For example =RIGHT(“Hello”, 2) would return “lo”
- REPT( ) – Give this function a value and tell it how many times to repeat it. As an example, =REPT(“A”,5) would return “AAAAA”
- LEN( ) – This function outputs the character length of a given string. If you wanted to find out how many letters were in the word Mississippi, you could write a formula stating =LEN(“Mississippi”) and the LEN function would return 11.
Now that you understand the three functions we need to use, let’s put them together so we can anonymize our social security number. Below is a picture of how the functions can be combined to only show the last four digits of the social security number.
Anonymize via VBA
I will not go into too much depth on how this macro works, as conceptually it is very similar to how the Excel formula in the previous section workes. I will note that the below VBA macro code actually overrides the original data. Of course you could modify the code slightly to place the anonymized data in the next column over by incorporating an Offset function to the "cell" variable (ie cell.Offset(0,1).value = ...). Also, at the beginning of the code there are inputs for:
- Deciding whether you want to anonymize the end or beginning of the data string
- How many characters you want shown
- What symbol you want to use to replace the original data
- What range your sensitive data is located in
These inputs will provide you with a little more variability so you can conform the VBA code to your specific needs.
'PURPOSE: Anonymize a number or text so only certain amount of original characters are showing (used alot with credit card #s)
Dim ShowChars As Integer
Dim AnonymChar As String
Dim SymbolString As String
Dim StringLength As Long
Dim AnonymEnd As Boolean
Dim cell As Range
Dim rng As Range
'Do you want to anonymize begining or end of text (TRUE for end, FALSE for beginning)
AnonymEnd = False
'Number of characters you want to show
ShowChars = 4
'Symbol you want to replace original characters
AnonymChar = "*"
'Range where data is stored
Set rng = Range("A2:A25")
'Loop through each cell and anonymize text
For Each cell In rng.Cells
StringLength = Len(cell.Value)
SymbolString = Application.WorksheetFunction.Rept(AnonymChar, StringLength - ShowChars)
If StringLength > ShowChars Then
'Anonymize ending characters (ie 730*******)
If AnonymEnd = True Then cell.Value = Left(cell.Value, ShowChars) & SymbolString
'Anonymize beginning characters (ie *******540)
If AnonymEnd = False Then cell.Value = SymbolString & Right(cell.Value, ShowChars)
Here's the macro code in in action! Pretty cool, right?!
Get The Example File!
As always, in order to download this article's example file you will need to be a subscriber of my free tips newsletter. If you click the green button below you can easily sign up and you will be emailed the password to get into the subscribers-only area of this website.
Real World Examples
I have not personally had a need for this functionality in my professional career but I envision it might be very useful in the areas of Human Resources, Audit, Accounts Payable, Payroll, and Treasury. Analysts in these positions typically are pulling data that includes account numbers, social security numbers, and other personal information that most likely should not be shown outside the respective department. By anonymizing the sensitive data, they can easily present their numbers to others while at the same time showing exactly what type of data was pulled. Can you think of any other reasons someone might want to partially anonymize their data? Leave you answer in the comments section below. I look forward to reading your thoughts!
How Do I Modify This To Fit My Specific Needs?
Chances are this post did not give you the exact answer you were looking for. We all have different situations and it's impossible to account for every particular need one might have. That's why I want to share with you: My Guide to Getting the Solution to your Problems FAST! In this article, I explain the best strategies I have come up with over the years to getting quick answers to complex problems in Excel, PowerPoint, VBA, you name it!
I highly recommend that you check this guide out before asking me or anyone else in the comments section to solve your specific problem. I can guarantee 9 times out of 10, one of my strategies will get you the answer(s) you are needing faster than it will take me to get back to you with a possible solution. I try my best to help everyone out, but sometimes I don't have time to fit everyone's questions in (there never seem to be quite enough hours in the day!).
I wish you the best of luck and I hope this tutorial gets you heading in the right direction!
Chris "Macro" Newman :)