
3 data privacy solutions for public health research
How can public health researchers leverage patient data without compromising privacy?
When it comes to patient data privacy, public health leaders are caught between a rock and a hard place. On one hand, they require high-quality and comprehensive data to fuel their research and develop effective interventions. On the other, they must protect privacy, as patient health data is a frequent target of data breaches. In fact, between 2015 and 2022, more data breaches were recorded in the healthcare sector than in the financial and manufacturing sectors—almost double the number recorded in those industries.
Complicating matters is the need to connect and combine patient data sets to derive insights. For example, if a patient’s name and Social Security number are the link between their HIV test results and their medical risk factors, a data breach could expose the patient’s HIV status, name, Social Security number, and more. Likewise, for smaller test groups such as historically underrepresented communities and rare disease patients, it can be difficult to make progress on research because the risk of re-identifying individuals grows as the sample size shrinks.
“Despite the challenges, data sharing is essential. The public and health care sectors need to share data to prevent and control infectious disease outbreaks, chronic diseases, and other risks to the public. We saw the importance of such sharing during the COVID-19 pandemic when policymakers and the public wanted the most accurate assessment of risk. But such sharing must be done with the utmost caution and privacy protections, or other serious problems will result—including loss of trust in the health system.”
How can agencies protect highly sensitive, personally identifiable information (PII) and protected health information (PHI) without restricting it so much that it can’t be used at all?
Promising health data privacy solutions
Here are three techniques we’ve been exploring and researching for our federal health clients:
Homomorphic encryption is a cryptographic technique that allows analysts to perform analytics and data processing on patient-level data—without needing to decrypt it first. Because the data is fully encrypted and never exposed, it remains unreadable even to those doing the computations, protecting patient privacy while offering agencies the full research value of the data.
Using homomorphic encryption, our data scientists have successfully carried out analytics and trained classification models on data while it was fully encrypted—in other words, the data was encrypted not only at rest and in transit, but also while in use. We conducted analytics as both single-party and multi-party computations for a leading U.S. public health agency, helping them assess the limitations and opportunities homomorphic encryption presents for public health research.
Homomorphic encryption is best suited for simple computations on small to moderately sized quantitative datasets. However, advancements in techniques and hardware acceleration are gradually improving its performance.
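To make the property concrete, here is a minimal, illustrative sketch in Python of an additively homomorphic scheme: a toy Paillier cryptosystem with deliberately tiny, insecure keys. This is not the tooling used in our client work, and production analytics typically rely on hardened libraries implementing lattice-based schemes such as BFV or CKKS; the point is only to show that an analyst can aggregate values without ever decrypting them.

```python
import math
import secrets

# Toy Paillier cryptosystem: additively homomorphic encryption.
# Illustrative only; real deployments use vetted libraries and far larger keys.

def generate_keys(p=499, q=547):
    """Generate a (tiny, insecure) Paillier key pair from two small primes."""
    n = p * q
    n_sq = n * n
    lam = math.lcm(p - 1, q - 1)          # Carmichael's function for n = p*q
    mu = pow(lam, -1, n)                   # modular inverse of lambda mod n
    return (n, n_sq), (lam, mu, n, n_sq)   # public key, private key

def encrypt(pub, m):
    """Encrypt integer m < n under the public key."""
    n, n_sq = pub
    while True:
        r = secrets.randbelow(n - 1) + 1   # random blinding factor
        if math.gcd(r, n) == 1:
            break
    # With g = n + 1, the ciphertext is (1 + n)^m * r^n mod n^2
    return (pow(1 + n, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(priv, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu, n, n_sq = priv
    x = pow(c, lam, n_sq)
    return (((x - 1) // n) * mu) % n

def add_encrypted(pub, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    _, n_sq = pub
    return (c1 * c2) % n_sq

if __name__ == "__main__":
    pub, priv = generate_keys()
    # An analyst receives only ciphertexts, e.g., encrypted lab values.
    enc_a = encrypt(pub, 120)
    enc_b = encrypt(pub, 135)
    enc_sum = add_encrypted(pub, enc_a, enc_b)   # computed without decryption
    assert decrypt(priv, enc_sum) == 255
    print("decrypted sum:", decrypt(priv, enc_sum))
```

Paillier supports addition on ciphertexts but not general multiplication; fully homomorphic schemes remove that restriction at a significant computational cost, which is one reason the technique currently fits simple computations on modest datasets best.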
Confidential computing is an infrastructure technology that protects data while it is in use by processing it in a secure area of the main processor, which prevents unauthorized access or data manipulation. It works by establishing a security boundary, or secure enclave, called a trusted execution environment (TEE), to isolate the computation from the rest of the system. Data is decrypted only within the TEE—once the computation is complete, the data is re-encrypted and returned to its original state.
Our data scientists have developed a proof of concept that demonstrates single- and multi-party computational analytics in a TEE in the cloud. While there are many intricacies to the confidential computing architecture, this technique is suitable for complex workloads and large datasets, and it often requires collaboration with a cloud provider or an enterprise partner.
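The sketch below simulates only the data flow around a TEE, assuming the third-party cryptography package for encryption. The tee_compute function is a stand-in for enclave code; real confidential computing depends on hardware isolation and remote attestation (for example, via a cloud provider's confidential VMs), which no Python script can reproduce.

```python
from cryptography.fernet import Fernet  # third-party: pip install cryptography
import json
import statistics

# Simulation of the confidential-computing data flow. A real TEE enforces
# the boundary with hardware isolation and remote attestation; here a
# function scope merely stands in for that boundary to show the pattern.

def tee_compute(ciphertexts: list[bytes], data_key: bytes, result_key: bytes) -> bytes:
    """Stand-in for code running inside a trusted execution environment.
    Plaintext exists only within this function's scope."""
    f_in, f_out = Fernet(data_key), Fernet(result_key)
    values = [json.loads(f_in.decrypt(c)) for c in ciphertexts]  # decrypt inside the "enclave"
    result = {"mean": statistics.mean(values), "n": len(values)}
    return f_out.encrypt(json.dumps(result).encode())            # only ciphertext leaves

if __name__ == "__main__":
    data_key, result_key = Fernet.generate_key(), Fernet.generate_key()
    f_data = Fernet(data_key)

    # A data owner encrypts patient measurements before handing them to the host.
    encrypted_records = [f_data.encrypt(json.dumps(v).encode()) for v in [118, 126, 131]]

    # The untrusted host sees only ciphertext going in and coming out.
    encrypted_result = tee_compute(encrypted_records, data_key, result_key)
    print(json.loads(Fernet(result_key).decrypt(encrypted_result)))
```

The essential pattern is that plaintext exists only inside the trusted boundary, and only encrypted inputs and results cross it.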
Privacy-preserved datasets use a hybrid of masking techniques to create variant data sets that can be shared and protected at different levels for different purposes and population sizes, with varying degrees of granularity. This bypasses the typical limitations seen in sophisticated analysis by mixing synthetic data in with the real data, or by masking certain fields, without losing the significance of the data set. The original data can then be repurposed in a variety of ways. We are exploring public health use cases with Anonos and their patented implementation of privacy-preserved datasets, Variant Twins.
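As a rough illustration of the idea (not of Anonos's patented Variant Twins method), the sketch below uses pandas to derive two variants of the same hypothetical records: a higher-granularity variant for vetted internal analysts and a coarsened variant for broader sharing. The field names and masking rules are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical patient-level records; all values are made up for illustration.
records = pd.DataFrame({
    "name":       ["A. Rivera", "B. Chen", "C. Okafor"],
    "zip_code":   ["20001", "20002", "20010"],
    "birth_year": [1958, 1987, 1992],
    "hiv_test":   ["negative", "positive", "negative"],
})

def internal_variant(df: pd.DataFrame) -> pd.DataFrame:
    """Higher-granularity variant for vetted internal analysts:
    direct identifiers dropped, quasi-identifiers kept."""
    return df.drop(columns=["name"])

def public_variant(df: pd.DataFrame) -> pd.DataFrame:
    """Lower-granularity variant for broad sharing: identifiers dropped,
    quasi-identifiers generalized to reduce re-identification risk."""
    out = df.drop(columns=["name"]).copy()
    out["zip_code"] = out["zip_code"].str[:3] + "XX"          # coarsen geography
    out["birth_decade"] = (out["birth_year"] // 10) * 10      # coarsen age
    return out.drop(columns=["birth_year"])

print(internal_variant(records))
print(public_variant(records))
```

In practice, the decision about which fields to drop, generalize, or replace with synthetic values is driven by the re-identification risk for the specific population and purpose.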
These three techniques—homomorphic encryption, confidential computing, and privacy-preserved datasets—make it easier for risk-averse data owners to share their data, and privacy-preserving technologies are likely to play a prominent role in shaping the legal and regulatory landscape surrounding public health data management and sharing.
Making a choice
These are just three of many data privacy technologies now available. Some can be combined at scale, but because none is the single best solution across the full range of public health data privacy challenges, it can be hard to know what to look for, especially with new techniques frequently coming online.
Our initial R&D work has helped our public health agency clients understand the fundamental differences in the use cases these techniques apply to—when it’s prudent to use one versus another—and will help them make informed decisions moving forward.
A trusted partner with experience not only in data privacy research and development in general, but also in the public health sector, is vital to applying the right technology to your unique challenge. Explore our health IT and data and analytics capabilities.