AI Doctors: Muddying The Waters of Patient Privacy?

By: Nirmalya Chaudhuri

Introduction

In November 2024, a study reported that ChatGPT, acting on its own, outperformed trained doctors in the US at diagnosing diseases, achieving a 90% success rate. While singing paeans to Artificial Intelligence (AI) in almost every field of human activity has now become par for the course, the possible rise of “AI doctors” raises important questions about patient privacy and, more broadly, the doctor-patient relationship.

Crisis of Confidentiality: From Ethics to Law

The delicate nature of the doctor-patient relationship was beautifully expounded in the eighteenth-century writings of John Gregory and Thomas Percival, two pioneering figures in the field of medical ethics. As Gregory and Percival noted, doctors inevitably gain access to the private lives of their patients, which naturally requires them to exercise great discretion in dealing with the personal information entrusted to them. In certain fields, such as psychotherapy, this is all the more important, and it has led to a strong movement (and its acceptance by some courts in the US) to carve out a “psychotherapist-patient privilege” exception to the Federal Rules of Evidence governing confidential communications.

Today, the narrative on this issue may well shift from a purely ethical viewpoint to a more legalistic perspective on data protection and privacy. It is, therefore, not surprising that the General Data Protection Regulation (GDPR) in the European Union classifies “health data” as a category of “sensitive personal data” to which heightened standards and safeguards apply. When patients interact directly with an “AI doctor,” sensitive personal data relating not only to their health but also to their private lives would be disclosed.

In the absence of strong legal regulations for processing such data, patient data could be used for purposes other than providing patient care. These concerns are exacerbated by the fact that generative AI may “learn” or “train” itself on such sensitive patient data, leading to outcomes that are both unforeseeable and perhaps undesirable for the patient. A possible counter-argument is that the patient’s explicit consent would be a prerequisite before any health data is divulged to the AI doctor; however, a wealth of secondary literature questions whether patients truly know what they are consenting to, or whether they instead treat consent as a routine, box-ticking exercise.

Judicial Opinion

Case law on this topic is non-existent, presumably because the issue of “AI doctors” is so novel. In the past, however, courts have trodden cautiously where disclosing patient records to third parties is concerned. The Ohio Supreme Court, in Biddle v. Warren General Hospital, went so far as to suggest that the unauthorized disclosure of patient records to a third party was an independent tort in itself, irrespective of how that information was used by the third party. It is, of course, true that the court was dealing with a case in which the patients had not consented to the disclosure of their data. In the UK case of Source Informatics, the Court of Appeal allowed patient records to be passed to a third party; notably, however, the patient data was anonymized before disclosure, making the data protection argument that much weaker.

These cases would prove to be of limited application if and when the debate over AI doctors reaches the stage of litigation. The courts would then be deciding whether patients can consent to the disclosure of non-anonymized sensitive personal data to an AI entity, and the extent to which the AI tool can use that information.

Conclusion

In the absence of a federal data protection law in the US, legal regulation of the wide-ranging use of AI doctors is likely to be State-specific and fragmented. Even more importantly, a lack of legal regulation raises serious questions about whether AI doctors are indeed “practicing” medicine, which would directly determine whether they are bound by the professional obligations ordinarily applicable to doctors, such as preserving the confidentiality of patient data. A possible solution could lie in a conclusive determination of the legal status of AI doctors, and of the techno-legal accountability standards, such as purpose limitation and data minimization, to which they would be subject. While AI can potentially bring great benefits to medical science, one must ensure that confidentiality, privacy, and the protection of personal data are not sacrificed at the altar of convenience and diagnostic efficiency.

Hide Your Info: Exploring the Lackluster Protection of HIPAA

By: Zach Finn

The Health Insurance Portability and Accountability Act (HIPAA) was enacted in 1996 and has since become a touchstone for protecting the confidentiality and security of personal health information in the United States.

Or, so we thought. Advances in technology have transformed the way information is stored and shared. Biomedical databases store high volumes of information, ranging from personal identifiers and medical records to individual genetic sequences, exemplified by 23andMe’s and Ancestry’s storage of genetic information. Large databases and biobanks (collections of biological samples, such as blood, paired with health information) provide access to a plethora of high-quality human data, which proves valuable for medical research, clinical trials, and understanding genomics. But at what cost?

HIPAA requires medical and genetic information to be anonymized before it is distributed or shared with third parties outside the relationship of medical providers and patients. Technology, however, has created a loophole in HIPAA through re-identification: the matching of anonymized medical information back to specific individuals using open-source data. Re-identification, as it stands, disarms HIPAA, rendering de-identified (anonymized) medical information essentially unprotected from parties who link it back to the people it describes.

HIPAA establishes national standards for protecting the privacy and confidentiality of individuals’ personal health information (PHI). It requires covered entities (healthcare providers, health plans, and healthcare clearinghouses) to give individuals notice when sharing their genetic information, and it is violated when a covered entity discloses personal, identifiable health information without the patient’s consent. Technology allows these entities to de-identify and anonymize large data sets so that health information can be shared in compliance with HIPAA. Anonymization removes personal identifiers such as names, addresses, and dates of birth; HIPAA sets out which identifiers must be removed, and once anonymized, personal health information is shareable and HIPAA compliant.
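To make the mechanics concrete, here is a minimal Python sketch of what this style of de-identification amounts to; the record fields and the identifier list are illustrative only, not HIPAA’s actual enumeration of identifiers.

```python
# Illustrative only: a toy de-identification pass that strips direct
# identifiers from a patient record. The field names are hypothetical,
# not HIPAA's official identifier list.

DIRECT_IDENTIFIERS = {"name", "address", "date_of_birth", "ssn", "phone"}

def deidentify(record: dict) -> dict:
    """Return a copy of the record with direct identifiers removed."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

patient = {
    "name": "Jane Doe",
    "address": "12 Main St",
    "date_of_birth": "1984-03-07",
    "zip3": "021",   # coarse attributes like these typically survive
    "sex": "F",
    "birth_year": 1984,
    "diagnosis": "type 2 diabetes",
}

print(deidentify(patient))
# {'zip3': '021', 'sex': 'F', 'birth_year': 1984, 'diagnosis': 'type 2 diabetes'}
```

Notice what survives: coarse attributes like a truncated ZIP code, sex, and birth year. These quasi-identifiers are precisely what re-identification, discussed next, exploits.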

Re-identification is the process by which materials and data stored in biobanks can be linked to the names of the individuals from whom they were derived. This is done by taking public information and matching it to the anonymized data. It sounds difficult, but one study concluded that 99.98% of Americans could be correctly re-identified in any dataset using 15 demographic attributes such as age, gender, and marital status. For example, in the 1990s, one could purchase the Cambridge, MA voter registration list for $20 and link it to a public version of the state’s hospital discharge database to reveal the persons associated with many clinical diagnoses.
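In the spirit of that voter-list example, a few lines of Python are enough to sketch such a linkage attack; the datasets, names, and fields below are entirely hypothetical.

```python
# Illustrative only: join an "anonymized" health record to a public voter
# list on shared demographic attributes (quasi-identifiers).

deidentified_health = [
    {"zip3": "021", "sex": "F", "birth_year": 1984, "diagnosis": "type 2 diabetes"},
]

public_voter_list = [
    {"name": "Jane Doe", "zip3": "021", "sex": "F", "birth_year": 1984},
    {"name": "John Roe", "zip3": "021", "sex": "M", "birth_year": 1962},
]

QUASI_IDENTIFIERS = ("zip3", "sex", "birth_year")

def link(health_rows, voter_rows):
    """Match anonymized health records to named voters on quasi-identifiers."""
    index = {}
    for v in voter_rows:
        key = tuple(v[q] for q in QUASI_IDENTIFIERS)
        index.setdefault(key, []).append(v["name"])
    for h in health_rows:
        candidates = index.get(tuple(h[q] for q in QUASI_IDENTIFIERS), [])
        if len(candidates) == 1:  # a unique match re-identifies the record
            yield candidates[0], h["diagnosis"]

for name, diagnosis in link(deidentified_health, public_voter_list):
    print(f"{name} -> {diagnosis}")
# Jane Doe -> type 2 diabetes
```

With only three shared attributes, a unique match ties a named voter to a clinical diagnosis; the 15-attribute study cited above is essentially this same join, scaled up.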

HIPAA has yet to catch up with this technological innovation. Its anonymization requirements lack the sophistication and protective measures needed to combat expanding re-identification practices. HIPAA’s Privacy Rule does not restrict the use or disclosure of de-identified health information, since such information is no longer considered protected health information. This means that re-identification of formerly protected information is not subject to HIPAA at all, demonstrating the weakness of HIPAA’s protections and how alarmingly accessible our genetic and medical information is to third parties.

Re-identification of HIPAA-compliant anonymized information is not a violation of the statute. We must consider reforming HIPAA to acknowledge technology’s capacity to bypass its safeguards. One way individuals can protect the privacy of their genetic and medical information is by not consenting to its sharing or storage, and covered entities must give notice and obtain consent before de-identifying and sharing biobank data. However, this comes at the price of stifling research, trials, and genomics. Hopefully we can strike a balance between confidentiality and the sharing of private information, but it starts with drafting laws that actually protect our personal and most private information!