What if Corporations Process Our Data Without Ever Accessing it?


Photo by Markus Spiske temporausch.com on Pexels.com

By: Samy Danesh

“Is it possible to delegate processing of your data without giving away access to it?” This was the question posed by Craig Gentry in his paper Computing Arbitrary Functions of Encrypted Data, in which he described the first fully homomorphic encryption scheme.

Tech companies find the privacy discussion threatening: they fear that broad privacy legislation and high standards of privacy protection will dry up the well of innovation and economic growth. For example, the Center for Data Innovation and IAPP view the EU’s General Data Protection Regulation as potentially inhibiting European companies’ ability to compete in AI development and the data-driven economy. This fear is not completely unfounded, since much of today’s technology depends on access to large troves of data. However, innovative solutions like homomorphic encryption create an opportunity for greater data privacy without compromising the industry’s ability to access data and process it to provide products and services.

The Problem: Trusting Strangers

Organizations today store most of the data their users and customers generate in the cloud and perform most of their computations in the cloud as well. But the cloud does not inherently work to protect users’ privacy. This problem can be further refined into two sub-problems: (1) unauthorized access to data by third parties, and (2) misuse of data by controllers that data subjects have given access to.

To address unauthorized access, data center operators like AWS, Google Cloud, and Microsoft Azure, as well as data controllers like Facebook and Google, apply state-of-the-art encryption to stored data. Encryption scrambles data into another form so that only people with the decryption key or password can reverse the scrambling and read the actual data.

Encrypted data, however, is not useful to service providers. Take, for example, Ancestry, a gene sequencing service used by millions of people globally. As stated in its Privacy Policy, Ancestry depends on access to genetic information to calculate ancestry, connect people with other family members, and provide information about how users’ genes may affect their health. Ancestry’s users therefore have to trust Ancestry with their decryption key and with the decrypted, plaintext form of their genetic data in exchange for its services. In turn, those users are exposed to potential misuse of their data by the corporation.

This is not a problem specific to Ancestry. Users of most services are left having to trust a corporation.

A Proposed Solution: Homomorphic Encryption

In principle, homomorphic encryption (“HE”) enables computation on data without ever decrypting it, so the computing party never gains access to the plaintext form of the data. HE therefore removes the need to trust a foreign data controller without hindering access to the controller’s valuable services.

The use of HE can greatly reduce the potential for misuse of data. Imagine two parties: the data subject (party A) and an entity proposing to use the data for a purpose desirable to the data subject (party B). If party A’s data is encrypted using HE, then party B can perform functions on the data without its contents ever being revealed to party B. Party B returns an encrypted result that only party A can decrypt. This gives party A all of the benefits of party B’s services without having to trust party B with the raw data.
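To make the idea concrete, here is a minimal sketch of a homomorphic property using textbook RSA, which happens to be multiplicatively homomorphic. This is not Gentry's scheme and it is not secure: the primes, the exponent, and the values 6 and 7 are all assumptions chosen for illustration, and real systems would use a vetted HE library. The point is only that party B can compute on ciphertexts without holding the private key.

```python
# Toy textbook RSA with tiny primes: insecure, for illustration only.
# Requires Python 3.8+ for the modular inverse via pow(e, -1, phi).
p, q = 61, 53                  # hypothetical tiny primes
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent, known only to party A


def encrypt(m: int) -> int:
    """Encrypt with the public key (e, n)."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Decrypt with the private key (d, n)."""
    return pow(c, d, n)


# Party A encrypts two values and hands only the ciphertexts to party B.
c1, c2 = encrypt(6), encrypt(7)

# Party B multiplies the ciphertexts; it never sees 6, 7, or the private key.
c_product = (c1 * c2) % n

# Party A decrypts the result and recovers the product of the plaintexts.
result = decrypt(c_product)
print(result)  # 42
```

Textbook RSA supports only multiplication, which makes it a partially homomorphic scheme. A fully homomorphic scheme supports both addition and multiplication on ciphertexts, which is what allows arbitrary functions to be evaluated.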

Potential Applications of HE to Genetic Data

Empirical studies suggest that many people believe that their genetic data inherently belongs to them. However, ownership of data comes at the cost of data management, a herculean task for most. Navigating the healthcare industry, storing and sharing data, and the sheer number of data-sharing opportunities are likely to overwhelm most data subjects.

For these reasons, having corporations assist in data generation, storage and sharing makes sense. Currently, however, data subjects have to trust those companies with their raw sensitive genomic data.

In 2018, direct-to-consumer gene sequencing giant 23andMe announced a $300 million deal with GlaxoSmithKline (GSK). The deal is described as a four-year collaboration in which GSK will mine 23andMe’s trove of genetic data from 5 million customers in an R&D effort to develop new drugs. This partnership, however, raises privacy concerns for the data subjects. Speaking about privacy concerns with data-based collaborations, Peter Pitts, President of the Center for Medicine in the Public Interest and former FDA Associate Commissioner, told Time magazine, “[t]he risk is magnified when one organization shares it with a second organization. When information moves from one place to another, there’s always a chance for it to be intercepted by unintended third parties.”

The application of HE in these types of transactions can alleviate privacy concerns without adversely affecting the usefulness of the underlying genetic data. For example, if 23andMe applies HE to its genetic data, then GSK scientists can analyze 23andMe users’ genetic data without that data ever being revealed to GSK staff.
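As a rough illustration of this kind of analysis, the sketch below uses a toy version of the Paillier cryptosystem, which is additively homomorphic, to count how many users carry a genetic variant without the analyst seeing any individual's status. Everything specific here is an assumption made for the example: the tiny primes (insecure), the made-up carrier flags, and the choice that the data holder keeps the private key while the analyst works only on ciphertexts. It does not describe how 23andMe or GSK handle data.

```python
import math
import random

# Toy Paillier key with tiny primes: insecure, for illustration only.
p, q = 293, 433                                    # hypothetical tiny primes
n = p * q                                          # public modulus
n_sq = n * n
g = n + 1                                          # standard simple generator
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # private: lcm(p-1, q-1)
mu = pow(lam, -1, n)                               # private: valid for g = n + 1


def encrypt(m: int) -> int:
    """Encrypt m under the public key (n, g) with fresh randomness."""
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(c: int) -> int:
    """Decrypt with the private key (lam, mu)."""
    l_value = (pow(c, lam, n_sq) - 1) // n
    return (l_value * mu) % n


# Data holder: encrypt each user's carrier flag (1 = carries the variant).
carrier_flags = [1, 0, 1, 1, 0]                    # made-up example data
encrypted_flags = [encrypt(flag) for flag in carrier_flags]

# Analyst: multiplying Paillier ciphertexts adds the hidden plaintexts.
encrypted_total = 1
for c in encrypted_flags:
    encrypted_total = (encrypted_total * c) % n_sq

# Data holder: decrypt only the aggregate, never the individual flags.
total_carriers = decrypt(encrypted_total)
print(total_carriers)  # 3
```

Because each encryption uses fresh randomness, two users with the same flag produce different ciphertexts, so the analyst cannot tell carriers apart by comparing the encrypted values.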

Compared to the current status quo in data sharing, this end result is a huge leap forward. HE still faces technical issues, such as long calculation times and the need for high computational power, but as these issues are resolved, HE will have the potential for broad adoption and a real impact in protecting data subjects’ privacy.
