Why banks are taking big data to the cloud

To fully realise the benefits of their data lake, banks are turning to the cloud

Blog, October 10, 2019

Temenos – Company

The rapid emergence of cloud computing is transforming the way financial institutions think about how they handle their data. The advantages of cloud computing in terms of cost, performance, and scale look more compelling than ever. The benefits that the cloud brings to any cloud-deployed banking software also apply to the data lake. Reduced total cost of ownership, a minimized IT footprint, near-unlimited scalability, and access to a myriad of data storage and processing options deliver greater capabilities than any bank can achieve in its own data center.

The cloud is more than a platform on which to deploy specific data and analytics solutions. It is the portal for all data technologies in the future. Very soon, it will be unheard of to see data and AI capabilities delivered in an on-premise environment.

Mainstream adoption of public cloud in banking is predicted to occur by 2023, according to Gartner, which forecasts a five-year CAGR of 15.2% (vs 7.5% for core banking in general). Public cloud deployment is still emerging. It is stronger in neobanks and low-tier banks, while the largest banks are more likely to use public cloud for their digital banks and to use cloud only for testing at the parent bank. Even in the absence of any actual cloud usage, almost all banks now have a cloud strategy. The cloud is an inevitability.
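To put the two growth rates in perspective, the short sketch below compounds each CAGR over five years. Only the two percentages come from the forecast quoted above; the function name `growth_multiple` is ours, and the result is plain compound-growth arithmetic rather than a figure published by Gartner.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Total growth factor implied by a compound annual growth rate."""
    return (1 + cagr) ** years


# Rates quoted in the text; the five-year horizon matches the forecast period.
public_cloud = growth_multiple(0.152, 5)
core_banking = growth_multiple(0.075, 5)

print(f"Public cloud in banking: x{public_cloud:.2f} over five years")
print(f"Core banking in general: x{core_banking:.2f} over five years")
```

In other words, a 15.2% CAGR implies a market roughly doubling in five years, against growth of a little over 40% for core banking as a whole.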

What about the services that banks are selecting to deploy in the cloud? CRM, HR and payroll services are commonly delivered from the cloud – the rise and rise of Salesforce is well documented – and many more services are emerging. Specialist functions required to support the core banking proposition (such as market data, SWIFT, etc.) can be accessed via third parties and partners by extending the service with APIs. And more and more banks are choosing to deploy and run their mission-critical core banking from the cloud, along with the other banking software solutions, like the data lake, that provide the support, governance, and inputs needed to deliver it.

Data lakes can be deployed fully in the cloud, in a hybrid model, or on-premise. The current trend is to deploy them in the cloud because of the power and capacity that can be leveraged from the hyperscale providers. Cloud-based data lakes are better suited to the complex deep learning required for artificial intelligence and machine learning applications.

Reduce the TCO

Subscription models for cloud usage eliminate the requirement for upfront investment in hardware to store, manage and process data, and the on-demand infrastructure means that banks only pay for resources that are actually used. The data ingestion tasks that are typical of a data lake deployment are perfectly suited to the cloud, as they require resources only at certain times of the day and are dormant otherwise. The maintenance, hardware, and physical security layers are all part of the service.

Much of the software underpinning cloud data lakes is serverless, meaning that banks can get to market quickly while, again, only paying for the resources they consume. Fears of spiralling costs are unfounded: cloud service provider solutions are designed to include effective monitoring and management systems, including auto-scaling features that allow banks to automatically optimize resource utilization based on self-determined rules. A bank running a cloud-based data lake can set the minimum and maximum number of instances to be used, so that applications stay up without costs running beyond what the business model allows.
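As an illustration of what a self-determined scaling rule with a minimum and maximum instance count might look like, here is a minimal sketch. It is not any provider's API: `ScalingRule`, `desired_instances`, and the utilization figures are hypothetical names and values chosen for the example, and real auto-scaling services add cooldowns, health checks, and richer metrics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingRule:
    """Hypothetical rule: bounds chosen by the bank, plus a utilization target."""
    min_instances: int
    max_instances: int
    target_utilization_pct: int  # e.g. 60 means "aim for 60% average load"


def desired_instances(rule: ScalingRule, current: int, observed_pct: int) -> int:
    """Scale proportionally to load, then clamp to the configured bounds."""
    if current <= 0:
        return rule.min_instances
    # Integer ceiling of current * observed / target (avoids float rounding).
    load = current * observed_pct
    raw = (load + rule.target_utilization_pct - 1) // rule.target_utilization_pct
    return max(rule.min_instances, min(rule.max_instances, raw))


rule = ScalingRule(min_instances=2, max_instances=10, target_utilization_pct=60)

print(desired_instances(rule, current=4, observed_pct=90))  # busy ingestion window
print(desired_instances(rule, current=4, observed_pct=5))   # dormant period
print(desired_instances(rule, current=8, observed_pct=95))  # spike beyond the cap
```

The lower bound keeps the application available during quiet periods, while the upper bound caps spend during spikes, which is the cost control the paragraph above describes.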

The power of scale and speed

The data lake holds a vast amount of raw, unstructured data in its native format, and the volumes of data are growing. As the data set grows, the demands upon it will grow too, especially as AI and machine learning solutions become more prevalent in the provision of banking services. The impact of scale isn’t limited to the capture and storage of the data. At some point, the bank will want to mine the data. AI and ML applications are fuelled by data, and they are going to need access to the lake. In an on-premise deployment, manual monitoring and intervention are required to free up resources to accommodate increases in data volume, processing requests, and user numbers, as well as spikes in activity. A hyperscale cloud provider (Amazon Web Services, Microsoft Azure, or Google Cloud Platform) offers a more flexible, agile environment that can scale elastically on demand and provides all of the capabilities needed to capture, store, and analyze data faster and more easily.

From 2013 to 2020, the digital universe will grow by a factor of 10 – from 4.4 trillion gigabytes to 44 trillion. It more than doubles every two years.
IDC

Security and Governance

The management of data privacy and security can be complicated in a traditional on-premise deployment. Each tool used in the stack must have appropriate role-based data access, authentication, and encryption. Implementing and managing these controls requires specialist expertise and monitoring to ensure the data isn’t just safe but also useful. Fortunately, given the massive investment in cloud data centers, security is not merely a hygiene requirement; it is a primary component of the providers’ platform proposition. The hyperscale providers support global and localised requirements for reporting, governance, and security standards, all of which are detailed carefully in their literature. These include GDPR, ISO, SOC2, and PCI DSS, and the list goes on. Coupled with cloud-native, cloud-agnostic data lake software with integrated metadata management tools to support compliance obligations, the cloud can be optimized to deliver a secure solution. Whilst the technology is mature enough to provide a secure data foundation, data security and privacy frameworks should remain a core tenet of a bank’s data strategy.
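To make the idea of role-based data access concrete, the sketch below shows a deny-by-default permission check over data lake zones. The role names, zone names, and the `is_allowed` helper are all hypothetical and exist only for this example; in practice a bank would rely on the identity and access management service of its cloud provider or data lake software rather than hand-rolled code.

```python
# Hypothetical mapping of roles to "zone:action" permissions.
ROLE_PERMISSIONS = {
    "data_engineer": {"raw:read", "raw:write", "curated:read", "curated:write"},
    "analyst": {"curated:read"},
    "auditor": {"raw:read", "curated:read", "audit_log:read"},
}


def is_allowed(roles, zone, action):
    """Return True only if at least one of the user's roles grants the permission.

    Unknown roles grant nothing, so access is denied by default.
    """
    needed = f"{zone}:{action}"
    return any(needed in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_allowed(["analyst"], "curated", "read"))       # analysts read curated data
print(is_allowed(["analyst"], "raw", "read"))           # but not the raw zone
print(is_allowed(["contractor"], "curated", "read"))    # unknown role: denied
```

Keeping the mapping in one place, rather than configuring each tool in the stack separately, is the kind of simplification that centralised access management is meant to provide.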

Preparing for the future

The future-proof nature of the cloud, from its receptiveness to new technologies to the low-commitment scalability of the platform, gives banks a new luxury: time. Many use cases for the data lake haven’t yet been devised, but it remains one of the bank’s greatest assets. The raw, unstructured nature of the data that resides within the data lake provides flexibility now and in the future. The foundation is in place for the moment when the proverbial light bulb goes on and the data needs mining for insights.

Summary

In the rapidly evolving banking landscape, financial institutions are juggling multiple strategies in order to manage costs whilst meeting compliance obligations, delivering on customer demands, and finding or reaffirming their position in an increasingly competitive environment. There is a natural symbiosis between the cloud strategy and the data strategy, one which will pay dividends in the future.

In the pursuit of big data engineering and the investment in a modern data architecture, banks are recognizing they should be deploying it in a modern way.

Temenos Data Lake is part of the market-leading Temenos Analytics product, which is embedded into cloud-native, cloud-agnostic Temenos Infinity and Temenos Transact. Temenos Data Lake supports multiple underlying database and processing platforms including Apache Kafka, Spark and Hadoop. It is a real-time data lake platform built specifically to satisfy the digital requirements of global banks. It provides capabilities for real-time data event streaming and flexible data engineering, together with an analytical banking data model that serves hundreds of banking analytical APIs. The product is fully integrated with Temenos banking software products including Temenos Transact and provides the capability to integrate and blend other data sources to maximize the analytical value of the data. Temenos Data Lake can leverage virtually any cloud or database platform as its underlying technology, giving banks more choice of underlying architecture, which can reduce cost whilst increasing scale.