Industry

AI Agents for Banking

A C-suite guide to where they fit, what regulators require, and where to start — written for the eighteen-month window that just opened.

Neural Factory Team

Industry Research

The question banks are asking in 2026 is no longer "should we do AI?"

JPMorgan Chase has 450+ AI use cases in production and attributes roughly $1.5 billion in annual value to them [6]. Bank of America's Erica has crossed 3 billion client interactions [7]. Wells Fargo's Fargo grew from 21 million interactions in 2023 to 245 million in 2024 [8], with zero sensitive data exposed to the underlying LLM. The era of AI-in-banking-as-pilot ended last quarter.

The real question is sharper, and it sits with the C-suite: where do we deploy agents first, under which regulations, and on what timeline?

This is the answer.

Fig. 1 — Where banking agents are deployed today, by function and maturity.

What "agent" means in banking — and what it doesn't

An AI agent is not a chatbot, not an RPA bot, and not a "model" in the SR 11-7 sense. It is software that reasons through a goal, reads context across systems, and takes action — with cited sources, audit trails, and human-in-the-loop where regulation requires it.

That last clause is the one that matters in regulated finance. A copilot drafts a memo. An agent does the work — opens the case, pulls the documentation, drafts the disposition, routes for approval, and logs every step a regulator might later ask about.

Three properties separate the category from everything older:

01 · REASONING

The agent decides how to handle a case, not just which steps to execute.

A rules engine cannot triage a novel AML pattern. An agent can — by picking the right tools, sequencing the actions, and changing course when the evidence does.

02 · CONTEXT

It reads messy, unstructured input the way a human does.

A wire-transfer narrative, a 200-page commercial credit file, a customer's email — without the world being pre-formatted into rows.

03 · ACTION

It does the work across the stack, with guardrails.

Core banking, CRM, AML platform, document store, ticketing. Cited sources, traceable decisions, escalation when uncertain.

Miss any one of these three properties and the system is something older with a new label.

Where banks are already deploying agents — by function

Strip the marketing from disclosed deployments and a maturity map emerges. Three tiers, by function.

MATURE · DEPLOYED AT SCALE

Customer service, document review, fraud detection.

Erica (Bank of America), Fargo (Wells Fargo), and CashPro Chat handle hundreds of millions of interactions per year, with cost per AI-resolved interaction at $0.50–$2 versus $8–$15 for a human channel. BNY Mellon has deployed a Contract Review Assistant across thousands of negotiated agreements per year [9]. Citi's developer-side agents completed over one million code reviews in 2025 and recovered roughly 100,000 engineering hours per week. The Commonwealth Bank of Australia's agentic fraud system contributed to three-quarters of new card-fraud rules created in 2024.

EMERGING · DEPLOYED SELECTIVELY

AML, trade surveillance, wealth research, KYC.

HSBC's AML agent reduced alert volume by 60% while increasing true-positive detection 2–4×. Deutsche Bank's trade surveillance agent (with Google Cloud) cut false positives by ~40% and saves an estimated $5M/year. Morgan Stanley's AskResearchGPT cut institutional-sales response time by 90%. Onboarding flows that took days now compress to minutes — though human review on high-risk cases remains the rule.

EXPERIMENTAL · GATED BY REGULATION

Credit underwriting, lending decisions, full-autonomy SAR filing.

Powerful, but classified as high-risk under the EU AI Act and squarely inside GDPR Article 22's right-not-to-be-subject-to-automated-decision regime. Live deployments exist; full autonomy does not. For regulatory reporting and SAR filing, audit-trail requirements, materiality thresholds, and effective-challenge obligations make full automation unsafe. Agents draft. Humans approve.

The pattern: agents have already industrialized the work where the cost of being wrong is low and the cost of being slow is high. The frontier is the inverse.

The regulatory backbone every C-suite needs to know

A bank operating in the US, EU, or UK cannot deploy an agent without understanding three frameworks. They are converging on the same principle: every agent decision must be explainable, auditable, and reversible.

Fig. 2 — Three frameworks. One principle: every decision explainable, auditable, reversible.

UNITED STATES · SR 26-2

Effective April 17, 2026 — and it deliberately leaves agents out of scope.

SR 26-2 superseded the fifteen-year-old SR 11-7 as the Federal Reserve's model-risk supervisory guidance [1]. It tightens definitions, sharpens materiality assessment, and explicitly requires effective challenge by independent reviewers. It also does something striking: it excludes generative and agentic AI from its scope, calling them "novel and rapidly evolving," and instructs banks to apply existing enterprise risk management and third-party-risk frameworks until further guidance arrives. The Fed, OCC, and FDIC have signaled an RFI on agentic AI later this year. Read this carefully: it does not give banks a holiday. It gives them a window.

EUROPEAN UNION · EU AI ACT

Binding for high-risk systems on August 2, 2026 — three months away.

Annex III classifies credit scoring and creditworthiness assessment of natural persons as high-risk. So does life and health insurance underwriting. Fraud detection is excluded. For high-risk systems, the Act imposes data-governance standards (training data must reflect the demographic composition of the target market), human oversight, accuracy and robustness benchmarking by subgroup, transparency to users, post-deployment monitoring, and conformity assessment [2]. Penalties under the Act reach €35M or 7% of global turnover for prohibited practices, and €15M or 3% for breaches of most other obligations, including the high-risk requirements.

EU + UK · GDPR ART. 22 + SS1/23

Already in force. Already settled by the courts.

A 2023 European Court of Justice ruling (the SCHUFA case) settled the question that mattered to banks: automated credit scoring is automated decision-making. Article 22 prohibits decisions made solely by an agent in significant matters (credit denial, account closure, AML offboarding) unless one of three exceptions applies, and even then the bank must provide meaningful human intervention and a route to challenge. Bank of England SS1/23 lands in the same place under UK supervision: tech-agnostic, explicit on AI, explicit on third-party explainability [3].

The cross-cutting takeaway is operationally simple. Whatever the agent does, the bank must be able to (1) show the inputs it used, (2) show the rule or reasoning that produced the output, (3) show the human who reviewed it where required, and (4) reverse it on a timely customer challenge. Agents that cannot do all four do not belong in production.
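The four tests above can be read as a record schema plus a gate. What follows is a minimal Python sketch, not a reference implementation: the class name, the field names, and the `missing_evidence` helper are all hypothetical and are not drawn from any regulation or vendor API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DecisionRecord:
    """One agent decision, stored with the evidence the four tests ask for.

    Every field name here is hypothetical and illustrative only.
    """
    decision_id: str
    inputs: list[str] = field(default_factory=list)  # (1) sources the agent read
    reasoning: str = ""                               # (2) rule or rationale behind the output
    requires_human_review: bool = False               # set by the bank's own policy
    reviewer: Optional[str] = None                    # (3) who signed off, where required
    reversal_procedure: Optional[str] = None          # (4) how to undo it on a challenge


def missing_evidence(record: DecisionRecord) -> list[str]:
    """Return the tests this record fails; an empty list means all four pass."""
    gaps = []
    if not record.inputs:
        gaps.append("inputs")
    if not record.reasoning.strip():
        gaps.append("reasoning")
    if record.requires_human_review and not record.reviewer:
        gaps.append("reviewer")
    if not record.reversal_procedure:
        gaps.append("reversal")
    return gaps
```

Under this sketch, a deployment gate would refuse to release any decision for which `missing_evidence` returns a non-empty list. Whether a given decision type requires human review is a policy question the code does not answer.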

The cost lever, in plain terms

The reason banks deploy agents is not that they are interesting. It is that they convert a stack of fixed costs into a variable one.

US and Canadian financial institutions reported $61 billion in compliance costs in 2024 [10], and 99% reported the costs rising. AML alert investigation carries an estimated 95% false-positive rate at industry baseline; HSBC's agent cut alert volume by 60% while detecting more real cases. Customer-service economics shift from $8–$15 per human interaction to $2 or less per AI-resolved one. McKinsey projects up to a 20% reduction in banking-industry costs [4] from broad AI adoption and a 50% reduction in human-serviced customer contacts. Citi GPS pegs the 2028 sector profit uplift at roughly $170 billion [5].
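To make the per-interaction arithmetic concrete, here is a small Python sketch. The unit costs are the ranges quoted in this article ($8–$15 per human interaction, $0.50–$2 per AI-resolved one). The annual volume of 10 million interactions and the function name are assumptions chosen only for illustration.

```python
def annual_savings_range(volume, human_cost=(8.0, 15.0), ai_cost=(0.5, 2.0)):
    """Return (low, high) annual savings in dollars if `volume` interactions
    move from a human channel to AI resolution.

    low  = cheapest human cost minus most expensive AI cost
    high = most expensive human cost minus cheapest AI cost
    """
    low = (human_cost[0] - ai_cost[1]) * volume
    high = (human_cost[1] - ai_cost[0]) * volume
    return low, high


# Hypothetical volume: 10 million interactions a year.
low, high = annual_savings_range(10_000_000)
print(f"${low:,.0f} to ${high:,.0f} per year")
```

With these inputs the range works out to $60 million to $145 million a year. The sketch ignores build, oversight, and vendor costs, and it assumes every interaction is one the agent can actually resolve, so it is an upper bound on the idea rather than a forecast.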

Each one of those numbers is a different way of saying the same thing: agents move the bank from "paying for compliance, service, and ops by the headcount" to "paying for them by the transaction."

That is a structural change, not an optimization.

Where to start: a prioritization framework

Not every workflow should be the first one a bank automates. The right sequence is governed by two axes: expected ROI and regulatory friction.

Fig. 3 — Build governance muscle in the top-left. Use it to earn the top-right.

TOP-LEFT · START HERE

High ROI, low friction. Internal ops, document review, AML triage with analyst-in-loop, customer service.

The disclosed deployments cluster here for a reason: ROI is measurable in months, the regulatory surface is governed by existing third-party-risk frameworks, and the institutional muscle the bank builds — audit logging, citation infrastructure, escalation routing — is exactly what becomes table stakes for the high-friction quadrant later.

TOP-RIGHT · EARN THE RIGHT

High ROI, high friction. Credit underwriting, lending decisions, full-autonomy SAR filing.

These are where the largest value sits and where regulation is sharpest. Move here only after the bank has six months of clean audit trails, validated explainability, and demonstrated human-in-the-loop discipline in the easier quadrant.

BOTTOM-RIGHT · AVOID

Low ROI, high friction. Full-autonomy regulatory reporting, model-risk decisioning, customer-facing legal communications.

The cost of being wrong dominates the value of being faster.

The discipline this implies is not new to banking. It is exactly the way a well-run institution rolls out any new line of activity: contained pilot, supervised expansion, scope earned. Apply it to agents.
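The two-axis framework can be expressed as a small classifier. This is a sketch under stated assumptions: the 0-to-1 scores, the 0.5 cut-off, and every name in the code are hypothetical, and a bank would have to define its own scoring for ROI and regulatory friction. The article does not discuss the low-ROI, low-friction quadrant, so the sketch only labels it.

```python
from enum import Enum


class Quadrant(Enum):
    START_HERE = "top-left: high ROI, low friction"
    EARN_THE_RIGHT = "top-right: high ROI, high friction"
    AVOID = "bottom-right: low ROI, high friction"
    UNADDRESSED = "bottom-left: low ROI, low friction (not discussed in the article)"


def classify(roi: float, friction: float, threshold: float = 0.5) -> Quadrant:
    """Place a workflow on the two-axis map.

    `roi` and `friction` are hypothetical scores in [0, 1];
    the default cut-off of 0.5 is arbitrary.
    """
    high_roi = roi >= threshold
    high_friction = friction >= threshold
    if high_roi and not high_friction:
        return Quadrant.START_HERE
    if high_roi and high_friction:
        return Quadrant.EARN_THE_RIGHT
    if high_friction:
        return Quadrant.AVOID
    return Quadrant.UNADDRESSED


# Illustrative scores only; they are not measurements.
backlog = {
    "document review": (0.8, 0.2),
    "credit underwriting": (0.9, 0.9),
    "full-autonomy regulatory reporting": (0.3, 0.9),
}
for name, (roi, friction) in backlog.items():
    print(f"{name}: {classify(roi, friction).name}")
```

The useful part of an exercise like this is less the output than the argument over the scores: agreeing on why a workflow counts as high-friction is itself a piece of the governance work the framework asks for.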

The eighteen-month window

The compressed period between April 17, 2026 (SR 26-2 effective) and roughly the end of 2027 (when an OCC/Fed/FDIC RFI on agentic AI is likely to produce specific guidance) is the strategic deployment horizon. EU AI Act high-risk obligations bind early in it, on August 2, 2026.

Banks moving inside this window will set their own governance precedent. They will build the audit-trail and explainability infrastructure that any future regulator-mandated configuration will require. They will negotiate vendor terms (data portability, contract escrow, evidence cooperation) while they still have leverage. The banks waiting for fully crystallized rules will inherit the precedent the early movers wrote.

The institutions that come out of 2027 ahead will not be the ones with the largest AI budgets. They will be the ones that, in 2026, treated agents the way they treat any new line of business: scope it, govern it, prove it, scale it.

The right question is not whether to deploy. It is which workflow you start with on Monday.
