×

The "Security Effect" - How To Create Partially Anonymous Data

By Chris Newman •  Updated: 03/20/14 •  5 min read
Partially Anonymous String Formula Anonymize Microsoft Excel Data

There are definitely times when you are handling confidential data and want to make sure that data stays confidential.  This type of data could be anything from credit card numbers to social security numbers.  Typically what is done in the real world is that a part of the data is replaced with a symbol (usually an asterisk), which allows the user to make an educated guess that the data is correct and prevents others from viewing the confidential data.

In this post, I will show you how you can anonymize your data through an Excel formula and through a VBA macro.

Anonymize via Excel Formula

For example purposes, let’s take a made-up social security number (123456789) and try to make it partially anonymous by only showing the last 4 digits. To do this we will need to use three different functions:

  • RIGHT( ) – Takes a given text or number and returns the rightmost specified amount of characters.  For example =RIGHT(“Hello”, 2) would return “lo”
  • REPT( ) – Give this function a value and tell it how many times to repeat it.  As an example, =REPT(“A”,5) would return “AAAAA”
  • LEN( ) – This function outputs the character length of a given string.  If you wanted to find out how many letters were in the word Mississippi, you could write a formula stating =LEN(“Mississippi”) and the LEN function would return 11.

Now that you understand the three functions we need to use, let’s put them together so we can anonymize our social security number.  Below is a picture of how the functions can be combined to only show the last four digits of the social security number.

The formula calculates out to =REPT, 9 - 4 RIGHT123456789, 4

Anonymize via VBA

I will not go into too much depth on how this macro works, as conceptually it is very similar to how the Excel formula in the previous section works.  I will note that the below VBA macro code actually overrides the original data.  Of course, you could modify the code slightly to place the anonymized data in the next column over by incorporating an Offset function to the "cell" variable (ie cell.Offset(0,1).value = ...).  Also, at the beginning of the code there are inputs for:

  • Deciding whether you want to anonymize the end or beginning of the data string
  • How many characters do you want to be shown?
  • What symbol do you want to use to replace the original data
  • What range your sensitive data is located in

These inputs will provide you with a little more variability so you can confirm the VBA code to your specific needs.

Sub AnonymizeText()

'PURPOSE: Anonymize a number or text so only certain amount of original characters are showing (used alot with credit card #s)
'SOURCE: www.TheSpreadsheetGuru.com

Dim ShowChars As Integer
Dim AnonymChar As String
Dim SymbolString As String
Dim StringLength As Long
Dim AnonymEnd As Boolean
Dim cell As Range
Dim rng As Range

'Do you want to anonymize begining or end of text (TRUE for end, FALSE for beginning)
  AnonymEnd = False

'Number of characters you want to show
  ShowChars = 4

'Symbol you want to replace original characters
  AnonymChar = "*"

'Range where data is stored
  Set rng = Range("A2:A25")
  
'Loop through each cell and anonymize text
  For Each cell In rng.Cells
    
    StringLength = Len(cell.Value)
    SymbolString = Application.WorksheetFunction.Rept(AnonymChar, StringLength - ShowChars)
    
    If StringLength > ShowChars Then
      'Anonymize ending characters (ie 730*******)
        If AnonymEnd = True Then cell.Value = Left(cell.Value, ShowChars) & SymbolString
      
      'Anonymize beginning characters (ie *******540)
        If AnonymEnd = False Then cell.Value = SymbolString & Right(cell.Value, ShowChars)
     
    End If
  
  Next cell

End Sub

Here's the macro code in in action! Pretty cool, right?!

Partially Anonymous String Formula Anonymize Microsoft Excel Data

Download Example Excel File

If you would like to get a copy of the Excel file I used throughout this article, feel free to directly download the spreadsheet by clicking the download button below.


Real World Examples

I have not personally had a need for this functionality in my professional career but I envision it might be very useful in the areas of Human Resources, Audit, Accounts Payable, Payroll, and Treasury.  Analysts in these positions typically are pulling data that includes account numbers, social security numbers, and other personal information that most likely should not be shown outside the respective department.  By anonymizing the sensitive data, they can easily present their numbers to others while at the same time showing exactly what type of data was pulled.  Can you think of any other reasons someone might want to partially anonymize their data?  Leave your answer in the comments section below.  I look forward to reading your thoughts!


Keep Learning

Chris Newman

Chris Newman

Chris is a finance professional and Excel MVP recognized by Microsoft since 2016. With his expertise, he founded TheSpreadsheetGuru blog to help fellow Excel users, where he shares his vast creative solutions & expertise. In addition, he has developed over 7 widely-used Excel Add-ins that have been embraced by individuals and companies worldwide.