A new form of “secret” computing will allow banks to analyze and monetize data while it is still encrypted, protecting its value and remaining compliant with data protection rules, writes Jordan Brandt.
Data is the new oil, according to The Economist. The problem is how to extract and trade it without diminishing its value or compromising privacy and security.
Unlike oil, it can be copied and shared, which undermines its scarcity value. This is a source of huge frustration for banks, which have come to regard customer data as a valuable asset. If they sell the data itself, they are diminishing its value. But that is not their only problem.
Strict global and sovereign regulations protect customer data from exploitation and misuse, including fraud. As a result, it is illegal in many jurisdictions to export customer data. This makes it harder – and more expensive – for banks to aggregate and analyze data sets across regions or feed them into machine learning programs to build credit risk profiles, fight fraud or combat money laundering. Banks are stuck with two options: run multiple models on individual data sets within each region, or anonymize data for export.
The first option results in smaller data sets, which limits their usefulness and delivers poorer predictions. The second option fails to deliver the full value of the data. Here’s why.
Problems with anonymization
One problem is that anonymization is neither secure nor private. It’s been proven time and again – not least when a customer sued Netflix after her data became public – that it is possible to identify individuals by matching anonymized data with information already in the public domain, such as Facebook and LinkedIn profiles. As a result, some countries (including the UK) have recently amended the law to make it illegal to re-identify anonymized data.
Another snag is that anonymization removes features that are highly valuable to machine learning, such as gender, age and income. This makes it harder for banks to train machines to do specific tasks. What’s more, companies such as Google, Apple and Uber have found that there is a trade-off between accuracy and privacy when they use differential privacy to anonymize data.
Differential privacy introduces “noise” into statistical data so that individuals cannot be identified. While great for looking at trends – the length of the average Uber ride in London in August, for example – it sacrifices accuracy as a result of the noise. Differential privacy is also poor at identifying outliers, which are highly valuable in financial analysis.
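To make the trade-off concrete, here is a minimal Python sketch of one common differential privacy technique, the Laplace mechanism, applied to an average. It is an illustration only, not the method used by any company named here; the function name `private_mean`, the ride lengths and the clipping bounds are all invented for the example.

```python
import random

def private_mean(values, lower, upper, epsilon, rng=random):
    """Return a noisy mean of `values` using the Laplace mechanism (sketch)."""
    # Clip every value into [lower, upper] so that a single record can
    # only shift the mean by a bounded amount.
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    # Replacing one record moves the clipped mean by at most this much.
    sensitivity = (upper - lower) / len(clipped)
    scale = sensitivity / epsilon
    # The difference of two exponential draws with mean `scale`
    # follows a Laplace distribution centred on zero with that scale.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_mean + noise

# Hypothetical ride lengths in minutes; the 240-minute ride is an outlier.
rides = [12, 15, 9, 22, 18, 14, 11, 240]
noisy_average = private_mean(rides, lower=0, upper=60, epsilon=0.5)
```

A smaller `epsilon` means stronger privacy and more noise. The clipping step also shows why outliers suffer: the 240-minute ride is treated as a 60-minute ride before any noise is added, so the unusual record that a fraud analyst would care about is deliberately blurred.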
The upshot is that the tools at hand that conform with data protection rules are sub-optimal, and banks have been unable to unlock the full value of the data they hold. Until now.
Encryption in use
There is a third way of keeping data secure, called encryption in use.
Although the idea has existed in theory for years, the computing power and algorithms needed to make it practical have not. About 10 years ago, research scientist and former MacArthur Genius Fellow Craig Gentry suggested that computing could be done on encrypted data sets without exposing vital or sensitive information – an approach known as Fully Homomorphic Encryption, or FHE. At the time of his presentation, however, Gentry acknowledged that it would practically take until the end of the universe to perform complex machine learning on massive data sets.
Today, raw computer power and new algorithms mean that machine learning on encrypted data can happen in a matter of seconds.
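The idea of computing on data that stays encrypted can be shown with a toy example. The sketch below uses the Paillier cryptosystem, which supports only addition on ciphertexts; it is not fully homomorphic, it is not Gentry's scheme, and it is not Inpher's implementation. The primes are deliberately tiny and insecure, and the function names and account balances are invented for illustration.

```python
import math
import secrets

def generate_keys(p=1789, q=1861):
    """Toy Paillier key pair. These default primes are far too small to be secure."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)   # private exponent
    mu = pow(lam, -1, n)           # valid because the generator is fixed at n + 1
    return n, (lam, mu)

def encrypt(n, message):
    """Encrypt an integer 0 <= message < n under public key n."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # random value in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    # With generator n + 1, (n + 1)^message mod n^2 equals 1 + message * n.
    return ((1 + message * n) % n_sq) * pow(r, n, n_sq) % n_sq

def decrypt(n, private_key, ciphertext):
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n) * mu % n

def add_encrypted(n, c1, c2):
    """Multiplying two ciphertexts yields an encryption of the sum of the plaintexts."""
    return c1 * c2 % (n * n)

# Hypothetical balances, encrypted before they leave the bank.
n, private_key = generate_keys()
encrypted_total = add_encrypted(n, encrypt(n, 1200), encrypt(n, 345))
total = decrypt(n, private_key, encrypted_total)
```

Whoever calls `add_encrypted` needs only the public value `n` and the two ciphertexts, so it never sees 1200 or 345; only the holder of the private key recovers the total of 1545. Fully homomorphic schemes extend this idea to both addition and multiplication, which is what makes general-purpose analytics on encrypted data possible.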
A second innovation that will greatly advance data analysis is secure multi-party computation, which allows many teams to work on data while still keeping that data private.
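A simple building block of secure multi-party computation is additive secret sharing. The Python sketch below simulates it in a single process: each party splits its private number into random-looking shares, and only the combined total is ever reconstructed. It assumes every party follows the protocol honestly, it leaves out networking entirely, and it is not Inpher's protocol; the names `share` and `secure_sum` and the exposure figures are invented for the example.

```python
import secrets

MODULUS = 2**61 - 1  # all arithmetic is done modulo this value

def share(value, n_parties):
    """Split `value` into n_parties random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_inputs):
    """Simulate a sum in which no party reveals its own input."""
    n = len(private_inputs)
    # Row i holds the shares that party i hands out, one per party.
    distributed = [share(v, n) for v in private_inputs]
    # Party j adds up the shares it received (column j) and publishes
    # only that partial sum, which looks random on its own.
    partial_sums = [sum(row[j] for row in distributed) % MODULUS
                    for j in range(n)]
    return sum(partial_sums) % MODULUS

# Hypothetical exposures held by three banks in different jurisdictions.
combined_exposure = secure_sum([120, 45, 300])
```

Any single share, or any single partial sum, is a uniformly random number that says nothing about the input behind it. Only when all the partial sums are added together does the total of 465 appear, which is the sense in which several teams can work on data that stays private.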
Through these innovations, it is possible to extract the valuable features of data while keeping it private – opening up new revenue streams and business opportunities for banks while remaining compliant with data protection regulations.
At Inpher, we have improved the speed of encryption in use by several orders of magnitude without compromising practical security. The result is Secret Computing – a new generation of encryption for those wanting to store, trade and analyze tamper-proof data. It allows banks to move data between jurisdictions and work on it – a big step forward on the road to monetizing that data and retaining its value. This innovation could not have come at a better time.
The European Payment Services Directive 2, which comes into force in January, will allow banks to create their own application programming interfaces (APIs) to give third parties access to their data for analysis. This is expected to speed up the process of transforming data into a new revenue stream for financial groups.
Secret Computing will be vital in determining the market value of data sets. With Secret Computing, financial services companies will be able to run analytics to assess the value of a bank’s encrypted data before deciding whether to subscribe to it. It might also be used to develop new products by allowing banks and phone companies, for example, to share encrypted customer data. The prospects opened up by Secret Computing are hugely exciting.
At Inpher we are also developing a solution to run analytics on data stored on blockchains while it remains encrypted.
Secret Computing will help banks to remain compliant. It will open up new revenue streams and allow banks to quantify the value of their data. It cracks the problem of how to monetize encrypted data – at last turning it into digital oil.