When combined with other anti-fraud measures, machine learning has been shown to cut false positives by a factor of five. The more data computers analyze, the better they become at spotting fraudulent behavior, making it harder for criminals to succeed.
Fraud prevention has long used data analysis to help stop criminals. By looking at how fraudsters operate, we can learn how to stop and even catch them.
Traditionally, data analysis was a long, slow slog, with the upshot that fraud was mostly spotted after the event, if at all. Today, data crunching is faster, broader and deeper because we can teach computers to recognize unusual bank transactions – a process known as machine learning.
Machine learning relies on data – the more data, the better the algorithm becomes at detecting, and even preventing, fraud.
The use of machine learning, in tandem with other anti-fraud measures, has been shown to reduce the number of false positives (an alert that is raised for a genuine transaction) by a factor of five. This frees up time for investigation departments to deal with genuine problem transactions and avoids the need to inconvenience customers unnecessarily.
For optimum results, machine learning needs to come in two forms – supervised and unsupervised.
Unsupervised learning is based on unlabeled data: data that has not been examined by the bank and comes without any description. Supervised learning is based on labeled data; in a fraud detection context, each transaction is described as fraudulent or genuine.
Labeled data allows computer programs to build up a picture of what a normal transaction looks like; unlabeled data lets them look for transactions that deviate from the norm. You need both for anti-fraud algorithms to work well.
When unlabeled data looks suspicious, the algorithm flags it and an alert is sent to the bank, which will examine it and return a verdict of genuine or fraudulent. In this way, data becomes labeled, and over time, the algorithm will refine its ability to detect suspicious transactions.
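The feedback loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method any particular bank or vendor uses: the anomaly score is a simple distance from the mean measured in standard deviations, the cutoff of 2.0 is arbitrary, the transaction amounts are made up, and the `review` function is a hypothetical stand-in for the bank's human investigators.

```python
from statistics import mean, stdev

def anomaly_scores(amounts):
    """Return how many standard deviations each amount lies from the mean."""
    mu = mean(amounts)
    sigma = stdev(amounts)
    if sigma == 0:
        return [0.0 for _ in amounts]
    return [abs(a - mu) / sigma for a in amounts]

def flag_suspicious(transactions, cutoff=2.0):
    """Unsupervised step: flag unlabeled transactions that deviate from the norm."""
    scores = anomaly_scores([t["amount"] for t in transactions])
    return [t for t, s in zip(transactions, scores) if s > cutoff]

def record_verdicts(flagged, review, labeled):
    """Feedback step: store the bank's verdict so the data becomes labeled."""
    for t in flagged:
        labeled.append({**t, "label": review(t)})
    return labeled

# Hypothetical, made-up unlabeled transactions.
amounts = [40, 55, 38, 60, 45, 52, 47, 5000]
transactions = [{"id": i, "amount": a} for i, a in enumerate(amounts)]

# Stand-in for a human investigator returning "fraudulent" or "genuine".
def review(t):
    return "fraudulent" if t["amount"] > 1000 else "genuine"

flagged = flag_suspicious(transactions)
labeled = record_verdicts(flagged, review, labeled=[])
print(labeled)
```

In a real system the labeled records would then feed a supervised model, and the scoring would draw on far more than the transaction amount.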
The next step is for algorithms to look for suspicious patterns in data – for example, a succession of transfers just below an alert threshold to the same recipient. The number of transactions required to build a good anti-fraud model varies according to the size and complexity of the client. A company, for example, has a much bigger financial footprint than an individual, and will require more data crunching as a result.
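As a rough illustration of the pattern mentioned above, the following Python sketch groups transfers by sender and recipient and reports pairs with repeated transfers just under an alert threshold. The threshold of 10,000, the 10% margin, the minimum count of three, and the field names are all assumptions chosen for the example. A production check would also consider the time window in which the transfers occur, which is omitted here.

```python
from collections import defaultdict

def find_threshold_skirting(transfers, threshold, margin=0.1, min_count=3):
    """Report (sender, recipient) pairs with at least `min_count` transfers
    whose amount falls just below `threshold` (within `margin` of it)."""
    lower = threshold * (1 - margin)
    near = defaultdict(list)
    for t in transfers:
        if lower <= t["amount"] < threshold:
            near[(t["sender"], t["recipient"])].append(t["amount"])
    return {pair: amts for pair, amts in near.items() if len(amts) >= min_count}

# Hypothetical, made-up transfers.
transfers = [
    {"sender": "acct-A", "recipient": "acct-B", "amount": 9500},
    {"sender": "acct-A", "recipient": "acct-C", "amount": 200},
    {"sender": "acct-A", "recipient": "acct-B", "amount": 9800},
    {"sender": "acct-A", "recipient": "acct-B", "amount": 12000},  # above threshold
    {"sender": "acct-A", "recipient": "acct-C", "amount": 9700},   # only one near-threshold transfer
    {"sender": "acct-A", "recipient": "acct-B", "amount": 9900},
]

suspicious = find_threshold_skirting(transfers, threshold=10_000)
print(suspicious)
```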
A single algorithm will never stop fraud by itself – that ultimately depends on using a combination of techniques and algorithms and multiple data feeds. Rules are also imperative as they allow data to be categorized and labeled. Once the data is labeled, machines can look for patterns in the way fraudsters avoid or break the rules. Through variations in patterns, machines might be able to identify new types of fraud.
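One way to picture rules and algorithms working together is the minimal Python sketch below. It assumes a model score between 0 and 1 is already available from some other component. The two rules, their names, the 10,000 amount limit, and the 0.8 score cutoff are hypothetical and chosen only for illustration.

```python
# Hypothetical rules: each is a name plus a check applied to one transaction.
RULES = [
    ("large_amount", lambda t: t["amount"] >= 10_000),
    ("new_recipient", lambda t: t["recipient"] not in t["known_recipients"]),
]

def rule_hits(t):
    """Return the names of the rules this transaction breaks."""
    return [name for name, check in RULES if check(t)]

def should_alert(t, model_score, score_cutoff=0.8):
    """Alert when any rule fires or the model score is high.
    The rule names are returned too, so they can serve as labels."""
    hits = rule_hits(t)
    return (bool(hits) or model_score >= score_cutoff), hits

# Hypothetical, made-up transaction.
tx = {"amount": 12_000, "recipient": "acct-B", "known_recipients": {"acct-B"}}
print(should_alert(tx, model_score=0.2))
```

Returning the rule names alongside the decision keeps a record of why each alert was raised, which is the kind of labeling that later pattern analysis can build on.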
Contribution from Jérôme Bovay, Chief Data Scientist at NetGuardians. NetGuardians is a Temenos partner and Market Place provider for Internal Fraud.