When Model Risk Management Meets Generative AI
Model risk frameworks built for statistical models are being stretched over generative AI, and the 2026 rulebook is openly acknowledging the seams. What validators now need is provable lineage, not paperwork.
The frameworks were built for regression, not generation
Model risk management grew up in a world of statistical models. A credit scorecard, a value-at-risk engine, a stress-test projection: each had a defined functional form, a fixed set of inputs, a training dataset that could be named, and an output that behaved the same way every time you fed it the same numbers. The supervisory architecture that governs these models was designed around that determinism. The United States Federal Reserve guidance known as Supervision and Regulation Letter 11-7 (SR 11-7), issued in 2011, asked banks to hold a complete model inventory, to validate models independently of the teams that built them, to version them, and to keep a record of effective challenge. The United Kingdom Prudential Regulation Authority (PRA) codified comparable expectations in Supervisory Statement 1/23 (SS1/23), which took effect on 17 May 2024 and set out five principles covering model identification and risk classification, governance, development and use, independent validation, and risk mitigants.
Both regimes share an unstated assumption. They assume a validator can reconstruct what the model did. You can re-run a scorecard, inspect its coefficients, and reproduce its decision. Generative models break that assumption at the root. A large language model fine-tuned on a bank's policy documents does not expose its weights as a coefficient table. Its behaviour shifts with a prompt edit, a retrieval-augmented generation (RAG) index refresh, or a vendor pushing a new base model underneath the same application programming interface (API). The thing being validated is no longer stable, and the record of what it did is no longer self-evident.
2026 is the year the rulebook acknowledged the gap
This year's regulatory movement is unusually candid about that mismatch. On 17 April 2026 the Office of the Comptroller of the Currency, the Federal Reserve and the Federal Deposit Insurance Corporation jointly issued revised model risk guidance, designated SR 26-2 and OCC Bulletin 2026-13, superseding SR 11-7. The revision narrows the definition of a model, introduces a thirty-billion-dollar asset threshold for full applicability, and moves to a materiality-tiered, principles-based posture. For anyone deploying generative systems, the most telling detail is what the guidance leaves out. It states that generative and agentic artificial intelligence (AI) models are novel and rapidly evolving and are not within its scope. The agencies signalled a forthcoming request for information addressing model risk generally and, specifically, banks' use of AI, including generative and agentic AI.
Read carefully, this is not a reprieve. The agencies did not call generative AI low risk. They placed it outside the existing model rulebook and left it to be governed by a firm's broader risk management practices until a dedicated framework arrives. The United Kingdom has reached a similar fork from the opposite direction. On 1 April 2026 Deputy Governor Sarah Breeden and PRA Chief Executive Sam Woods wrote to the Chancellor setting out the Bank of England and PRA approach to safe AI innovation. They declined to write new AI-specific rules, kept SS1/23 technology-neutral, and made AI adoption a named PRA supervisory priority for 2026. Their AI Consortium is now examining explainability and transparency in generative AI, edge cases as the technology moves into credit assessment and trading, and concentration risk from a small number of third-party model providers.
What a validator actually needs, and cannot get
Strip away the acronyms and the validator's job is concrete. To sign off a model, an independent reviewer must be able to answer a short list of questions: which version produced this output, what inputs and context were supplied, what data the model was trained or fine-tuned on, who approved the change that introduced it, and whether the same conditions reproduce the same result. For a regression model these answers live in a versioned repository and a validation report. For a generative model they are scattered, ephemeral, or simply absent.
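As an illustration, the validator's five questions can be treated as the required fields of a single record. The sketch below is hypothetical throughout: the class name LineageRecord, its field names and the helper functions are assumptions made for this example, not part of any regulatory template or vendor schema. It shows how a reviewer could flag unanswered questions and derive a stable fingerprint for re-run comparison.

```python
from dataclasses import dataclass, asdict
import hashlib
import json


@dataclass(frozen=True)
class LineageRecord:
    """Hypothetical record: one field per question a validator must answer."""
    model_version: str                  # which version produced this output
    prompt: str                         # what inputs were supplied
    retrieval_context: tuple[str, ...]  # what context was supplied (e.g. RAG document ids)
    training_data_ref: str              # what data the model was trained or fine-tuned on
    approved_by: str                    # who approved the change that introduced it
    output: str                         # the result, kept for reproduction checks


def missing_fields(record: LineageRecord) -> list[str]:
    """Return the questions left unanswered (empty values count as unanswered here)."""
    return [name for name, value in asdict(record).items() if not value]


def fingerprint(record: LineageRecord) -> str:
    """SHA-256 over a canonical JSON form, so identical conditions hash identically."""
    canonical = json.dumps(asdict(record), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two runs with the same fingerprint were made under the same recorded conditions; whether a generative model then returns the same output is exactly the reproducibility question the validator still has to test.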
Industry guidance through 2026 has converged on the same evidence regulators are expected to want: model inventory completeness, hallucination testing, prompt and RAG change control, outcome fairness metrics, and documented circuit breakers for high-risk uses. The hard gaps are well known. Vendor foundation models hide their pre-training data, so lineage controls have to shift toward transparency reviews and fine-tuning governance. Prompts behave as code, yet a one-line prompt edit that materially changes behaviour often sits outside formal change management entirely. The European Union Artificial Intelligence Act (EU AI Act) sharpens the point. From 2 August 2026 its high-risk obligations require automatic, tamper-evident logging retained for at least six months, dataset versioning, and traceable model lineage covering inputs, outputs and decision points. Financial institutions may fold those logs into records they already keep under Union financial services law, but the logs must exist and must be trustworthy.
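To make the "prompts behave as code" point concrete, here is a minimal sketch of change control over a generative configuration. It assumes a firm keeps a set of approved configuration fingerprints; the function names, the three-part configuration (prompt template, RAG index version, base model identifier) and the registry are hypothetical choices for illustration, not a prescribed control.

```python
import hashlib


def config_fingerprint(prompt_template: str, rag_index_version: str, base_model: str) -> str:
    """Hash the parts of a deployment that can change behaviour without a code release."""
    h = hashlib.sha256()
    for part in (prompt_template, rag_index_version, base_model):
        data = part.encode("utf-8")
        # Length-prefix each part so ("ab", "c") and ("a", "bc") cannot collide.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def is_approved(approved: set[str], prompt_template: str,
                rag_index_version: str, base_model: str) -> bool:
    """True only if this exact combination went through change management."""
    return config_fingerprint(prompt_template, rag_index_version, base_model) in approved
```

Under this sketch a one-character prompt edit, an index refresh or a vendor model swap each produces a new fingerprint, so none of them can reach production without either an approval entry or a visible mismatch.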
The trust problem inside the evidence
There is a deeper difficulty the frameworks gesture at but rarely name. Every one of these records (the inventory, the version history, the challenge log, the input trace) is produced by the institution the regulator is examining. A validator inside the bank is asked to trust logs the bank itself generated and could, in principle, edit after the fact. A supervisor reviewing those logs is asked to do the same. When models were deterministic and reproducible, that circularity mattered less, because an examiner could re-run the model and check. With generative systems that cannot be cleanly reproduced, the integrity of the record becomes the whole game. If a log can be altered, backdated, or selectively assembled after an incident, it is not evidence. It is testimony.
This is the unsolved structural problem under the 2026 guidance. Materiality tiering, independent validation and challenge logs are only as strong as the assurance that the underlying record was written when it claims to have been written and has not changed since. A bank can hold a pristine model inventory and still be unable to prove, to a sceptical outside party, that the inventory reflects what actually ran on the day a customer was declined credit or a trade was mispriced.
A record the regulator can verify without trusting the bank
This is the precise problem the Open Audit Record (OAR) is built to solve. The OAR is a subsystem of the Mickai Sovereign Intelligence Operating System (SIOS), and its principle is strict. Every action is signed before it executes, not logged afterwards, and written into an append-only, hash-chained ledger. For a generative model that means the prompt, the model version, the retrieval context, the parameters and the human approval are committed cryptographically at the moment of the decision, in the order they happened. The chain cannot be reordered or quietly amended, because each entry binds to the one before it. The signatures use a post-quantum scheme, the ML-DSA-65 parameter set of Federal Information Processing Standard (FIPS) 204, so the evidence is designed to remain verifiable as older cryptographic assumptions weaken.
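The general technique of a hash-chained, commit-before-execute ledger can be sketched in a few lines. This is not the OAR implementation, whose internals are not described here; every function and field name below is hypothetical. One substitution matters in particular: the Python standard library has no ML-DSA-65, so the sketch uses HMAC-SHA256 as a placeholder. HMAC is symmetric, which means the verifier must hold the secret key, whereas a public-key signature scheme would let an outsider verify without it.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    """SHA-256 over canonical JSON of the entry body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def _tag(key: bytes, entry_hash: str) -> str:
    """Placeholder signature: HMAC-SHA256 stands in for a real signature scheme."""
    return hmac.new(key, entry_hash.encode("ascii"), hashlib.sha256).hexdigest()


def append_entry(ledger: list, action: dict, key: bytes) -> dict:
    """Bind the new entry to the previous one, tag it, and append it."""
    prev_hash = ledger[-1]["entry_hash"] if ledger else GENESIS
    body = {"index": len(ledger), "prev_hash": prev_hash, "action": action}
    entry_hash = _digest(body)
    entry = {**body, "entry_hash": entry_hash, "signature": _tag(key, entry_hash)}
    ledger.append(entry)
    return entry


def run_committed(ledger: list, action: dict, key: bytes, execute):
    """Commit the action to the ledger first, then execute it."""
    append_entry(ledger, action, key)
    return execute(action)


def verify_chain(ledger: list, key: bytes) -> bool:
    """Recompute every hash and tag; any edit, reorder or removal before the last entry fails."""
    prev_hash = GENESIS
    for index, entry in enumerate(ledger):
        if entry["index"] != index or entry["prev_hash"] != prev_hash:
            return False
        body = {"index": entry["index"], "prev_hash": entry["prev_hash"],
                "action": entry["action"]}
        expected_hash = _digest(body)
        if entry["entry_hash"] != expected_hash:
            return False
        if not hmac.compare_digest(entry["signature"], _tag(key, expected_hash)):
            return False
        prev_hash = expected_hash
    return True
```

Because each entry hashes its predecessor's hash, altering an early record invalidates every later one. Truncating the tail is not detectable from the chain alone, which is one reason a design like this would also publish its latest hash to an external anchor.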
Verification is the part that addresses the trust problem head on. The record is checkable offline by a browser-resident verifier that needs no network connection and no trust in the vendor. A supervisor, an external auditor or a model validator can confirm the chain's integrity independently, without asking the bank to vouch for its own logs. The audit root anchors to Pantheon, Mickai's sovereign Layer 1 blockchain, whose root in turn anchors to Bitcoin, giving the record a timestamp no single party controls. Across the fifty brains (twenty-five domain and twenty-five operational) that run on the Poseidon silicon substrate, the same discipline holds: inputs, versions and decisions become a replayable lineage rather than a reconstruction. Mickai is held privately by founder Micky Irons. The underlying methods are covered by 101 filed United Kingdom patent applications spanning roughly 2,234 claims, owned by Mickai LTD (Companies House 17166618) and naming Micky Irons as inventor.
From documentation burden to provable lineage
The direction of travel across all three jurisdictions is consistent even where the rules differ. The United States has parked generative AI outside SR 26-2 pending a dedicated framework. The United Kingdom keeps SS1/23 technology-neutral while its supervisors probe explainability and concentration risk. The European Union is making tamper-evident logging a hard legal obligation from August 2026. Each of these paths ends at the same requirement: a credible, durable, independently checkable account of what a model did and why.
Banks that treat this as a documentation exercise will keep producing records that prove only that someone wrote them. The more defensible posture is to make the evidence base verifiable by construction, so that lineage is captured at execution and can be replayed and checked by an outsider who trusts nobody. When the agencies' promised request for information becomes a rule, and when the next generative model quietly changes its behaviour overnight, the institutions that can hand a regulator a record it can verify for itself, rather than one it must take on faith, will be the ones still inside the perimeter of supervisory comfort.


