Under the Health Insurance Portability and Accountability Act (HIPAA) of 1996, all entities that handle health information are required by law to secure any data that contains personally identifiable information (PII) or protected health information (PHI). Fines for leaking this data can range from $100 to $50,000 per leaked record. A data breach or leak is extremely costly for both patients and the companies entrusted with their PHI. In our presentation, we introduce Gonymizer, a tool written in Go at SmithRx to anonymize the PHI and PII in data from our production database instances.
This data is anonymized and loaded into non-production environments so that we can develop and test against representative data. Gonymizer makes anonymizing sensitive information quick and simple by using a column map defined in a single JSON file for your dataset. We have built a selection of processors to handle basic tasks, such as anonymizing first and last names and replacing location data such as street addresses, cities, ZIP codes, and states with fake values. The interface for building processors is also fully extensible, and anyone with basic Go experience should be able to build processors that anonymize their data efficiently. We will also show how this tool decreases our development time for new features and simplifies testing in a compliant environment (HIPAA, PCI, etc.) with non-sensitive data sets.
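To give a feel for the column-map idea, here is a minimal sketch in Go. It is not Gonymizer's actual file format or API: the JSON layout, the `Processor` type, the processor names (`Identity`, `FakeFirstName`, `FakeCity`), and the helper functions are all hypothetical and chosen only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
)

// Processor turns one sensitive value into a non-sensitive replacement.
// Hypothetical signature, assumed for this sketch only.
type Processor func(value string) string

// ColumnRule names the processor to apply to one column (assumed layout).
type ColumnRule struct {
	Name      string `json:"name"`
	Processor string `json:"processor"`
}

// ColumnMap is the whole map for one table (assumed layout).
type ColumnMap struct {
	Table   string       `json:"table"`
	Columns []ColumnRule `json:"columns"`
}

// exampleMapJSON is an illustrative column map, not a real Gonymizer file.
const exampleMapJSON = `{
  "table": "patients",
  "columns": [
    {"name": "id",         "processor": "Identity"},
    {"name": "first_name", "processor": "FakeFirstName"},
    {"name": "city",       "processor": "FakeCity"}
  ]
}`

var (
	fakeFirstNames = []string{"Alex", "Jordan", "Sam", "Taylor"}
	fakeCities     = []string{"Springfield", "Riverton", "Lakeside"}
)

// pick chooses a replacement by hashing the input, so the same input
// always maps to the same fake value within one run of this sketch.
func pick(value string, choices []string) string {
	h := fnv.New32a()
	h.Write([]byte(value))
	return choices[h.Sum32()%uint32(len(choices))]
}

// processors is the registry that column rules refer to by name.
var processors = map[string]Processor{
	"Identity":      func(v string) string { return v },
	"FakeFirstName": func(v string) string { return pick(v, fakeFirstNames) },
	"FakeCity":      func(v string) string { return pick(v, fakeCities) },
}

// ParseColumnMap decodes a JSON column map.
func ParseColumnMap(data []byte) (ColumnMap, error) {
	var cm ColumnMap
	err := json.Unmarshal(data, &cm)
	return cm, err
}

// AnonymizeRow applies each rule to its column. Only mapped columns are
// copied to the output, so an unmapped column is dropped, not leaked.
func AnonymizeRow(cm ColumnMap, row map[string]string) (map[string]string, error) {
	out := make(map[string]string, len(cm.Columns))
	for _, rule := range cm.Columns {
		p, ok := processors[rule.Processor]
		if !ok {
			return nil, fmt.Errorf("column %q: unknown processor %q", rule.Name, rule.Processor)
		}
		out[rule.Name] = p(row[rule.Name])
	}
	return out, nil
}

func main() {
	cm, err := ParseColumnMap([]byte(exampleMapJSON))
	if err != nil {
		panic(err)
	}
	row := map[string]string{
		"id": "42", "first_name": "Alice", "city": "Oakland", "ssn": "000-00-0000",
	}
	safe, err := AnonymizeRow(cm, row)
	if err != nil {
		panic(err)
	}
	fmt.Println(safe)
}
```

In this sketch, adding a new processor means registering one more function in the `processors` map and referring to it by name in the JSON file. Copying only the mapped columns is a deliberate choice here: a column that nobody has classified is left out of the output instead of passing through unchanged.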
Toward the end of our presentation, we will discuss how we built our infrastructure using Docker to containerize Gonymizer and Kubernetes to schedule the anonymization and loading of our test environments. This talk is aimed at anyone working in the healthcare space where collected data contains PHI and/or PII and is regulated by HIPAA.