A Hippocratic Oath for Big Data

The news last week regarding Amazon’s Health Insurance Portability and Accountability Act (HIPAA) compliance status for health-related Alexa apps raises the question: will Amazon use health data for other purposes?

Responding to increased data production and use for health services and other purposes, privacy advocates have urged greater data use restrictions, while tech giants have advocated for a generally applicable privacy law. However, these proposals restrict data use in ways that could adversely impact both individual patient health outcomes and broader public health goals.


When medical data is securely combined across patient populations, more effective treatment is possible. If a patient is presented with a multitude of data points from these aggregated data — potential diagnoses, treatment options, predicted outcomes, likely drug interactions and expected side effects — health outcomes will likely improve. These outcomes may also be achieved at a substantially lower cost, a central point of discussion for virtually all insurance debates, including those around universal and subsidized health care.

Rather than focusing on data use restrictions, the United States should invest in better de-identification techniques, which remove data that can identify an individual patient, and should build on computational methods to secure data.

Do no (digital) harm

Consumers are increasingly concerned about digital record abuse, especially for sensitive medical data. Yet the sharing of data is economically valuable. In medicine, sharing digital records is required both for receiving quality care and for developing new therapies, and convenience has increasingly become a differentiator in health outcomes.

For example, the use of sensitive data can enable radical improvements in health outcomes through personalized medicine. Rather than aggressively restricting data collection, we should embrace data usability and privacy as the bedrock of trust, while championing new systems that improve health outcomes for individual patients and transform public health.

Trust is the essential ingredient for friendship, medicine, commerce and virtually every human interaction. Without trust, information sharing relationships crumble, negatively impacting our digital economy as consumers and patients avoid using new technologies or services.

Often, health-care laws like HIPAA restrict data use from a position of distrust, and the most valuable data ends up siloed. Rather than limiting data collection in a manner that hamstrings constructive use, we might instead develop technical systems that reduce patient privacy risks while improving data usability. This approach reinforces the relationship between health-care providers and patients, which in turn improves health.

Data identifiability

In the practice of medicine, physicians and health-care organizations default to a seemingly outdated “paper trust” model to meet HIPAA requirements: privacy policies are presented to a patient at a health-care provider’s office or when downloading an app.

Usually, organizations like Amazon are restricted by what is communicated in a notice of privacy practices or specific authorizations under HIPAA. Under some state laws, health-care providers also must collect consent, which on the surface seems to “prove” that patients agreed to the terms. Yet patients may not fully understand the risks, including how de-identified data might be put to additional, for-profit uses or be untraceably transferred to third parties.

The HIPAA de-identification safe harbor permits organizations to freely aggregate, transfer, and even sell data when specific identifiers are removed from a record, without having to comply with HIPAA. Despite the belief that de-identified data do not pose much risk to patients, data science has demonstrated that “de-identified” data reuse is not so safe. A handful of data elements can often be used to re-identify an individual with remarkable precision. The emergence of artificial intelligence used in health applications has made de-identification highly ineffective.
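To make the re-identification point concrete, here is a minimal sketch of a k-anonymity check, one common way to estimate how exposed a "de-identified" data set is. The records, the field names (zip3, birth_year, sex) and the helper functions are hypothetical and chosen only for illustration; they are not drawn from HIPAA or from any real data set. The idea is that a record whose combination of remaining attributes is unique can be matched against an outside source that holds the same attributes alongside a name.

```python
from collections import Counter

# Hypothetical records with direct identifiers (name, record number) removed.
# The remaining demographic fields act as "quasi-identifiers".
RECORDS = [
    {"zip3": "606", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip3": "606", "birth_year": 1980, "sex": "F", "diagnosis": "diabetes"},
    {"zip3": "606", "birth_year": 1975, "sex": "M", "diagnosis": "hypertension"},
    {"zip3": "607", "birth_year": 1991, "sex": "F", "diagnosis": "migraine"},
    {"zip3": "607", "birth_year": 1991, "sex": "M", "diagnosis": "asthma"},
]

# Assumed set of quasi-identifiers for this illustration.
QUASI_IDENTIFIERS = ("zip3", "birth_year", "sex")


def group_sizes(records, keys):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[k] for k in keys) for r in records)


def k_anonymity(records, keys):
    """Return k, the size of the smallest group; k == 1 means someone is unique."""
    sizes = group_sizes(records, keys)
    return min(sizes.values()) if sizes else 0


def unique_fraction(records, keys):
    """Return the share of records whose quasi-identifier combination is unique."""
    if not records:
        return 0.0
    sizes = group_sizes(records, keys)
    unique = sum(1 for r in records if sizes[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


if __name__ == "__main__":
    print("k with three fields:", k_anonymity(RECORDS, QUASI_IDENTIFIERS))
    print("unique share:", unique_fraction(RECORDS, QUASI_IDENTIFIERS))
    print("k with ZIP prefix only:", k_anonymity(RECORDS, ("zip3",)))
```

In this toy data set, three of the five records are unique on just three fields, while coarsening to the ZIP prefix alone puts every record in a group of at least two. Real assessments involve far larger populations and more attributes, so this is a sketch of the reasoning, not a measure of actual risk.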

Amazon’s Alexa apps, increasingly leveraging artificial intelligence technology, may be compliant with HIPAA, but even de-identified data may still identify individual patients.

Risks associated with identifying an individual from sensitive health data include the potential for discrimination, harm to social or familial ties, and damage to economic prospects. Moreover, certain data types, such as genetic data, carry risk across generations: disclosure can affect our children and their descendants.

A digital Hippocratic Oath

Certainly, consumers should question Amazon’s interest in these data, and these risks suggest that perhaps data should not be collected, retained, or shared by big technology companies. Yet these data may enhance access to health care, improve health outcomes, or lower provider costs.

Investing in technologies that reduce identifiability while preserving data use lowers patient risk and improves medical efficacy and efficiency, providing the balance necessary for better medicine. A Digital Hippocratic Oath demands that health-care providers effectively balance privacy and usability to the benefit of individual patients and broader public health goals.

Piers Nash, Ph.D., is a cancer biologist who was a University of Chicago professor involved in genome project analysis for dozens of species and tens of thousands of human genomes. He served as director in the Center for Data Intensive Science for the architecting and deployment of the National Cancer Institute’s Genomic Data Commons. He is involved in Sympatic Inc., a startup venture dedicated to building core technologies that create trust and allow secure sharing of health data.

Charlotte A. Tschider is a law professor and Jaharis Faculty fellow in health law, intellectual property and information technology at the DePaul University College of Law and a Fulbright Specialist in Privacy and Cybersecurity Law. She is author of "International Cybersecurity and Privacy Law in Practice," contributor to the American Bar Association’s The Law of Artificial Intelligence and Smart Machines and author of a number of academic and practical articles on the intersection of health law policy, privacy and cybersecurity.