By Vijay Vijayasankar, Global Agentic AI Officer, Genpact
Earlier this year, a small textile exporter in Tiruppur found his current account frozen without prior notice or human review. An automated fraud detection system had flagged an unusual pattern in his transaction history. The activity was a seasonal spike in outward remittances. It took eleven days, three branch visits, and a formal complaint to restore access. His working capital facility lapsed in the interim.
This is not an isolated incident. As Indian financial institutions accelerate their AI deployment across lending, fraud detection, compliance monitoring, and customer servicing, the gap between what these systems can execute and what governance structures can oversee is widening. The cost of this gap shows up as frozen accounts, wrongful rejections, regulatory violations, and broken customer relationships.
From Productivity Tool to Operational Infrastructure
The scale of AI deployment in Indian banking today is striking. NBFC underwriting engines process thousands of loan applications overnight with no human intervention. Fraud detection models run continuously beneath NPCI’s UPI infrastructure, which processed approximately 23 billion transactions in May 2026 alone. Compliance systems monitor trading activity at domestic brokerages in real time. These are no longer experimental pilots but core operational infrastructure.
Automation scales in ways human teams cannot, but infrastructure carries obligations that productivity tools do not. When an AI system is embedded in decisions that determine whether a farmer in Maharashtra gets a crop loan or a business owner in Tamil Nadu is flagged as a financial risk, system governance becomes just as critical as algorithmic performance.
A study Genpact conducted jointly with HFS Research, surveying 545 senior executives across 11 industries, captures the nature of this tension. While 92% of enterprise leaders expect agentic AI systems to fundamentally change how business workflows are executed, only 22% feel comfortable granting these systems broad autonomy. Nearly 80% of organisations continue to operate them in supervised mode, requiring final approval from a human. That hesitation is not timidity. It is institutional memory. Mature financial institutions understand what happens when technology scales faster than the controls designed to govern it.
The Regulator is Watching
Reserve Bank of India (RBI) Deputy Governor Swaminathan J. has warned the industry against algorithmic opacity, the condition in which AI-driven decisions become black boxes that institutions cannot adequately explain and customers cannot meaningfully challenge. Those remarks were a clear signal about where regulatory scrutiny is heading.
The RBI’s Master Direction on Information Technology Governance, Risk, Controls and Assurance Practices sets out clear expectations for how regulated entities validate, monitor, and audit the automated systems driving their decisions. SEBI has moved in parallel, tightening its oversight of algorithmic trading and pushing for greater transparency in automated order execution. Together, these moves indicate that regulators will increasingly expect institutions to demonstrate not just that their AI systems work, but that they can account for every significant decision those systems make. For banking and financial services leaders, it is no longer sufficient to ask whether AI improves transactional throughput. The question is whether every individual outcome can be traced, explained, and, if necessary, defended before a regulator or a consumer court.
Why Traditional Explainability Falls Short
The conventional response to AI governance concerns has been to invest in explainability tools and techniques that attempt to reconstruct why a model reached a particular conclusion. Frameworks like LIME and SHAP were designed for this purpose, and they work well for classical machine learning systems with contained, interpretable structures. However, these tools were not designed for the deep neural networks and autonomous architectures that institutions are deploying today. Modern LLMs operate across hundreds of billions of parameters.
The path from input to output is a probabilistic process, not a decision tree, and cannot be fully reconstructed. When that process leads to a loan rejection or an account freeze, the inability to provide a clear, auditable explanation is a direct legal and regulatory liability. The problem deepens with agentic AI, which executes multi-step workflows, gathers data, draws inferences, and triggers financial actions without human checkpoints at each stage. A flawed assumption at the start of an autonomous workflow shapes every subsequent decision in the chain. By the time an error surfaces, its point of origin is difficult to isolate.
Post-hoc explainability addresses the symptom without addressing the fundamental need: reliable, auditable AI behaviour at scale, across thousands of decisions, every day.
Three Principles for Governing AI at Scale
Indian financial institutions are not newcomers to managing systems that produce probabilistic outputs under uncertainty. Credit scoring, actuarial modelling, and market risk management all operate within defined tolerances, backed by escalation protocols and independent review structures. Governing agentic AI applies these familiar control principles to faster, more complex environments.
Three principles should anchor every institution’s approach:
1. Build auditability in, not on: In most AI deployments, governance is added after the fact, with reviews scheduled and documentation compiled only once a system is already live. For autonomous systems making consequential decisions at scale, this is record-keeping, not governance. Auditability must be designed into the system architecture from the start. Every significant AI-driven decision should generate an immutable record: data used, path followed, confidence level assigned, and actions taken. Under current RBI guidelines, this level of forensic traceability is shifting from good practice to baseline expectation. Think of it as a flight data recorder, not consulted unless something goes wrong, but always running, always complete.
2. Match human oversight to decision risk: Most AI-driven actions in financial services are high-volume and low-stakes, and blanket human oversight eliminates the efficiency gains that justify the investment. But not all decisions are equal. A credit denial, an account freeze, or a compliance escalation carries consequences significant enough that human judgement must remain available when a system is operating near the limits of its confidence. Institutions need to classify AI-driven decisions by risk level upfront and establish hard circuit breakers, precise points where the system must pause and hand off to a human reviewer. The Tiruppur exporter’s eleven-day ordeal was not inevitable. It was the result of a system deployed without such a circuit breaker.
3. Separate the system that decides from the system that reviews: A fundamental principle of banking governance holds that the person who initiates a transaction cannot be the one who approves it. The same logic applies to AI systems, and it is being bypassed routinely. Two systems trained on the same data, built on the same architecture, and fine-tuned through the same process will share the same biases and failure modes. Asking one to audit the other does not produce independent oversight.
Effective governance requires genuine segregation, consistent with the Three Lines of Defence framework that Indian banks already apply to financial controls:
First line: The operational agent executes the workflow, optimised for performance and speed.
Second line: An independent validation layer, using a materially different model architecture (for example, a specialised open-source system auditing a proprietary commercial model), evaluates risk, bias, and drift.
Third line: Human-led internal audit and periodic external validation provide the final layer of accountability.
All three lines should be independent of one another.
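The first two principles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the action names, the 0.90 confidence floor, and the `AuditLog` and `route_decision` names are hypothetical, not drawn from any RBI direction or production system. The log chains each record to the hash of the previous one, which makes tampering detectable; genuine immutability would also need write-once storage.

```python
import hashlib
import json
from dataclasses import dataclass, field

# Hypothetical risk tier and confidence floor; real values would come from policy.
HIGH_RISK_ACTIONS = {"credit_denial", "account_freeze", "compliance_escalation"}
CONFIDENCE_FLOOR = 0.90  # assumed threshold, not a regulatory figure
GENESIS_HASH = "0" * 64


def route_decision(action: str, confidence: float) -> str:
    """Circuit breaker: high-risk actions below the floor pause for a human."""
    if action in HIGH_RISK_ACTIONS and confidence < CONFIDENCE_FLOOR:
        return "human_review"
    return "auto_execute"


def _digest(body: dict) -> str:
    # Canonical JSON (sorted keys) so the same body always hashes the same way.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class AuditLog:
    """Append-only log; each record stores the hash of the record before it."""
    records: list = field(default_factory=list)

    def append(self, data_used: dict, path: list, confidence: float, action: str) -> dict:
        body = {
            "data_used": data_used,      # inputs the system relied on
            "path": path,                # steps the workflow followed
            "confidence": confidence,    # confidence level assigned
            "action": action,            # action proposed
            "routed_to": route_decision(action, confidence),
            "prev_hash": self.records[-1]["hash"] if self.records else GENESIS_HASH,
        }
        record = {**body, "hash": _digest(body)}
        self.records.append(record)
        return record

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered record breaks the chain."""
        prev_hash = GENESIS_HASH
        for record in self.records:
            body = {k: v for k, v in record.items() if k != "hash"}
            if body["prev_hash"] != prev_hash or _digest(body) != record["hash"]:
                return False
            prev_hash = record["hash"]
        return True
```

In this sketch, an account freeze proposed at 0.62 confidence would be routed to `human_review` and logged, while a low-stakes action would proceed automatically but still leave a verifiable record.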
Measuring What Actually Matters
The dominant metrics for AI in Indian BFSI are throughput metrics: processing times, cost per transaction, straight-through processing rates. These are legitimate commercial measures, but they evaluate what AI achieves, not how safely or reliably it does so.
Alongside traditional performance metrics, institutions should track governance indicators such as audit trail completeness, model drift, exception escalations, false positives, and human overrides. These measures provide early warning signals when AI systems are operating outside acceptable risk and confidence thresholds.
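As a rough illustration, several of these indicators can be computed directly from decision records. The `Decision` fields and metric names below are assumptions made for the example, not a standard schema, and model drift is left out because it requires a reference distribution to compare against.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    # Hypothetical fields; a real schema would follow the institution's own logs.
    has_audit_record: bool  # was a complete audit record written?
    escalated: bool         # did the system hand off to a human reviewer?
    overridden: bool        # did a human reverse the AI outcome?
    flagged: bool           # did the system raise a fraud or risk flag?
    confirmed: bool         # was that flag later confirmed as genuine?


def governance_metrics(decisions: list[Decision]) -> dict[str, float]:
    """Summarise governance indicators as simple ratios over a batch of decisions."""
    total = len(decisions)
    if total == 0:
        return {}
    flagged = [d for d in decisions if d.flagged]
    unconfirmed = [d for d in flagged if not d.confirmed]
    return {
        "audit_trail_completeness": sum(d.has_audit_record for d in decisions) / total,
        "escalation_rate": sum(d.escalated for d in decisions) / total,
        "override_rate": sum(d.overridden for d in decisions) / total,
        # Share of raised flags that turned out to be false alarms.
        "unconfirmed_flag_share": len(unconfirmed) / len(flagged) if flagged else 0.0,
    }
```

Tracked over time, a falling `audit_trail_completeness` or a rising `override_rate` would be the kind of early warning signal described above.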
Institutions that build these frameworks now will be considerably better placed when regulatory expectations harden. Based on the current trajectory of both RBI and SEBI guidance, that is a matter of when, not whether.
The Trust Dividend
The case for AI governance in Indian BFSI is often framed as a compliance obligation. That framing is too narrow. India has hundreds of millions of active digital banking users, a population expanding steadily into Tier 2 and Tier 3 markets. These consumers are highly aware of their rights, vocal when systems fail them, and willing to escalate through formal channels. The RBI’s Integrated Ombudsman Scheme received 9,34,355 complaints in FY2024, a 33% increase on the previous year, and that volume will only grow as AI-driven decisions reach deeper into everyday financial life.
The institutions that lead India’s next phase of financial services growth will not necessarily be those that deploy AI fastest. They will be those that deploy it in ways their customers trust and their regulators can inspect. Governance is not a constraint on that ambition. It is the condition for it.
The textile exporter in Tiruppur eventually recovered his account. But he moved his banking relationship to a competitor the following month. In a market where customer acquisition costs are rising and switching is increasingly frictionless, that is a loss no productivity metric will ever capture.