RBC’s plans to marry large language and transaction models

Canada’s biggest bank has been training in-house AI models on vast bodies of financial data. Now it is integrating them with large language models developed by a local rival to Microsoft and Google. Is this a ChatGPT for banking?

One of the paradoxes of banking’s enthusiasm for generative artificial intelligence (gen AI) is that the technology has excelled more in words than numbers.

The ChatGPT model that triggered all the excitement around gen AI two years ago was notoriously bad at maths. OpenAI has sought to improve this in its latest iterations. But the ability of large language models such as ChatGPT to reliably do complex calculations remains in question.

This has not stopped bankers from embracing gen AI for tasks such as summarising documents or sifting through internal policies and procedures – mainly internally, but with some attempts now to make these systems customer-facing. Banks are also getting gen AI models to write marketing content and to help with coding, among other relatively low-risk use cases.

Yet, most banking data is numerical: securities trading, payments, risk metrics and so on. At Royal Bank of Canada (RBC), for example, 80% of the data on its servers is not in language form. The ability of language-based models alone to get to the financial heart of banking, and add value to it, is consequently limited.

Maths literacy

“Large language models don’t innately understand financial data,” says Foteini Agrafioti, RBC’s senior vice-president for data and AI, and chief science officer. “They have seen it in documents. But if you ask them to create a transactional sequence, they can’t create it from scratch. If you ask them to describe the financial health of an individual, intuitively, they can’t. They have to have seen it written somewhere.”

Before the launch of ChatGPT in 2022, banks focused on more traditional forms of machine learning and analytics that could work off financial data for narrow uses, based on relatively small data sets.

Nevertheless, banking sources working in AI, speaking to Euromoney, believe wider-purpose models developed by banks can learn from much larger bodies of both language and financial data. That could help banks with their core task of managing risk more effectively than either earlier machine learning or large language models working alone. And RBC, they say, is one of the most advanced banks on this front.

Canada’s biggest bank, RBC has one of the largest AI research and development teams in banking – set up about 10 years ago. Named Borealis, the division has trained foundational AI models and operates one of Canada’s largest enterprise-focused clusters of graphics processing units (GPUs) built partly with chipmaker NVIDIA.

Rather than a large language model with a chat function attached, such as ChatGPT, the bank has spent the past two years developing what Agrafioti calls large transaction models. Trained on the bank’s entire transactional history, they are known as Atom (asynchronous temporal models). Like ChatGPT – which you could ask for a lasagne recipe one moment and a lesson on semiotics the next – RBC’s models allow for more general use. That includes understanding a customer’s product preferences, risk tolerance, susceptibility to fraud and so on.

“You can train this model once and then you can use it across many different banking applications,” Agrafioti says. “You have the same model that understands risk and what kind of content an individual may be interested in.”

Home advantage

The next stage is to integrate Atom’s foundational models into large language models built by Cohere, a Canadian business-to-business AI company that already has close relationships with TD Bank, another top-four Canadian lender, and US IT firm Oracle, among others. Agrafioti describes it as bringing language and transaction-based AI models into one.

The aim is to go beyond the sorts of use cases that involve feeding in less sensitive non-financial data, such as company policies and procedures or market research. In future, the bank’s staff will be able to gain a deeper financial understanding of specific customers, using natural language, via this platform.

The design responds to some of the biggest challenges in using gen AI in banking: being able to explain the output of gen AI models with confidence, while maintaining data security. Many banks consider this especially difficult when they are using third-party models. The firms behind those models might not disclose their internal functioning and they typically run on a public cloud, often outside the bank’s national borders, which can be a problem for the local financial supervisor.

Like BNP Paribas in its partnership with French AI company Mistral, RBC is tackling this in part by working with Cohere, which offers large language models that can be deployed on the client’s own infrastructure.

Cohere is based almost across the road in Toronto. This January, the firm launched a workspace product called North, claiming to outperform peers such as Microsoft Copilot and Google Vertex AI Agent, and focused on businesses with special privacy and security concerns, such as finance and healthcare, which sometimes need models deployed via on-premises data infrastructure or a private cloud. RBC’s version of North will be called North for Banking.

“We’re going to be able to deploy gen AI securely on RBC infrastructure, on our data, rather than relying on the cloud,” says Agrafioti about the tie-up. “We feel that by having these technologies on our servers, with secure access to RBC data, we can unlock a ton of opportunity for how we use them. You can do more interesting things with them when they can see our data sets.”

Understanding personal finance

Already, RBC is using Atom to get a better sense, for example, when a customer might be interested in a loan or a credit card, and how creditworthy they are. This has allowed the bank to lend to some clients that it would otherwise have refused, according to Agrafioti.

Reflecting the rise of the gig economy, the model draws on much more than what she describes as rudimentary methods of risk scoring based on salaries and expenses. “Humans are way more complex than that,” she says. “How you earn money is way more complex today than before. It’s not the same paycheque every two weeks into your account. Understanding finances at a personal level is very, very important.”

But, like other banks, RBC was not comfortable using closed-source large language models hosted on a public cloud for personal data, despite the usual assurances from the cloud providers about how they would gate and control information flows. RBC also believes its more customised, in-house models – bigger than traditional AI, but more tailored to banking than general consumer large language models like ChatGPT – will result in more relevant and reliable outputs.

Speak banking?

Gen AI models for bankers, in other words, should be less fun, but less prone to hallucinations. It does not matter if the model cannot create poetry in iambic pentameter.

Agrafioti says: “We’re going to get them to forget how to speak Shakespeare and to learn more how to speak banking, through our own data. And I think that will help address a ton of the risks, where we see models go off tangent and have random conversations or provide misinformation.”

As for any incumbent bank, trust should be a differentiator, especially versus newer or less heavily regulated firms. “We say no a lot,” says Agrafioti, betraying some of the frustration about all the things bank technologists could do if their employers were less cautious. She adds: “The stance we took right away was that we’re not going to be client-faced with gen AI. We did not feel that the technology was ready for that.”

The first way of dealing with risk, as ever, is to understand it. Here that means understanding the data science behind the models the bank is using. This is why RBC believes its investments in training its own AI models, and the exclusivity of its partnership with Cohere, are crucial. A few downloads and a bit of prompt engineering will not cut it.

“You’ve got to go back to the science and solve it there,” says Agrafioti. “This goes beyond simple fine-tuning. You’ve got to attack it at a foundational level.”