MICKAI
Article · 14 June 2026

The Duty to Give Reasons Did Not Survive Automation. It Has To.

Automated benefits decisions still owe citizens an explanation they can understand and an appeal they can win. Most systems quietly lose the record that makes either possible.

Author
Micky Irons
Tags: administrative law, automated decision-making, public sector AI, duty to give reasons, EU AI Act

A letter that explains nothing

A woman in a council flat opens an envelope. Her disability payment has been reduced. The letter tells her the decision was made, that she has a right to ask for it to be looked at again, and that she should call a number between nine in the morning and five in the afternoon. What the letter does not tell her is why. Not the real why. Not the rule that was applied, the figure that tipped the calculation, the field on the form that was read one way rather than another. She is left holding a conclusion with the reasoning amputated. If she rings the number, the person who answers will read the same letter back to her. Somewhere upstream, a model or a rules engine scored her, and the score became the outcome, and the outcome became the letter. Nobody in that chain can reconstruct the path from input to result, because nobody kept it.

I run a company that builds the opposite of that letter. So I want to be honest about what is happening in public administration right now, and about what the law actually demands, before I tell you what we built. The duty to give reasons is one of the oldest expectations we place on the state. Automation did not abolish it. Automation just made it easy to forget. And the forgetting is not malicious. It is structural. The tools we bought to make decisions faster were never designed to make those decisions answerable, and the gap between those two jobs is exactly where this woman now stands.

Why the state owes you reasons, not just outcomes

There is a principle that runs through administrative law in most mature democracies, and it is simple to state. When a public body makes a decision that affects your rights, your money, or your liberty, it owes you an explanation you can understand and challenge. This is not courtesy. It is the mechanism that makes the rest of the system work. You cannot appeal a decision whose grounds you do not know. A tribunal cannot review reasoning it cannot see. An ombudsman cannot find maladministration in a black box. The duty to give reasons is the hinge on which accountability swings. Remove it and every other safeguard, the right of appeal, the right to a fair hearing, the right to judicial review, becomes ornamental, because each of them assumes you can find out what was decided and on what basis.

In the United Kingdom the duty has grown through case law into a general expectation for decisions of real consequence, and it is reinforced by the requirement that decisions be rational, that relevant matters be considered and irrelevant ones ignored, and that the affected person have a fair chance to respond. In the United States, due process under the Fourteenth Amendment has long required notice and an opportunity to be heard before a benefit is taken away, a line drawn firmly in the welfare context decades ago. Across the European Union (EU), the right to good administration includes an explicit obligation on institutions to give reasons. The General Data Protection Regulation (GDPR) adds a right not to be subject to a decision based solely on automated processing where it produces legal or similarly significant effects, alongside a right to meaningful information about the logic involved.

None of these instruments cares whether a human or a machine produced the decision. The duty attaches to the decision, not to the decision maker. That is the point administrators keep missing when they procure automated systems. The obligation does not soften because the work was outsourced to software. If anything it sharpens, because the volume of automated decisions is far higher than a human caseload could ever reach, and an error baked into a model is not one mistake by one tired official; it is the same mistake repeated thousands of times before anyone notices. Scale does not dilute the duty. It multiplies the stakes of failing it.

What automation quietly removes

When a caseworker decided your benefit by hand, the reasoning was imperfect but it was there. There was a file. There were notes. There was a person who could be asked, who could say I read this rule this way because of that fact. The reasoning was recoverable, even if it took a complaint and a few months to recover it. Automation promised to keep all of that and add speed and consistency. In practice, many deployments kept the speed and lost the file. The irony is sharp. The machine that could in principle record everything perfectly is routinely configured to record almost nothing that matters to the person it judges.

Consider how these systems are usually built. A vendor supplies a scoring model or a rules engine. It ingests application data, sometimes enriched with third-party data the applicant never sees. It outputs a number, a flag, or a category. A downstream template turns that output into a letter. The decision is fast and it is consistent, in the narrow sense that the same inputs give the same output. But ask the system to show its working for one specific person on one specific day, and it often cannot. The model that scored her has since been retrained. The data that fed it has been overwritten by a later snapshot. The version of the rules in force that morning was replaced last week. The letter survives. The reasoning does not.

This is the central failure, and it is not a failure of intelligence. The model may have been accurate. It is a failure of record. A decision you cannot reconstruct is a decision you cannot defend, and a decision the state cannot defend is one it had no business making. The Dutch childcare benefits scandal, where an automated risk system wrongly branded thousands of families as fraudsters and clawed back money that ruined lives, was not primarily a story about a bad algorithm. It was a story about a system whose outputs could not be questioned in time, whose logic was opaque to the people it harmed, and whose record did not support the scrutiny that eventually, far too late, arrived. The lesson administrators should take from it is not that automation is dangerous. It is that automation without a defensible record is a liability waiting for a docket number.

The appeal that cannot find its target

Picture the appeal that follows the letter I described at the start. Months later, the case reaches a tribunal. The panel wants to know one thing above all: what was the decision actually based on. They ask the department to produce the inputs, the rule set, and the calculation as they stood on the day. This is where automated systems tend to fall apart, and they fall apart in three distinct ways. Each one looks, from a distance, like a paperwork problem. Each one is in fact a fatal gap between what the system did and what it can prove it did.

Figure: A conclusion with the reasoning amputated: the letter survives, the record does not.

First, drift. The model or the rules have changed since the decision, and nobody preserved the exact version that ran. The department produces today's system and asserts it would have produced the same result, which is an assertion, not evidence. Second, opacity. The output was a score with no attached explanation, so even the original system cannot say which factors moved the result for this person. Third, mutability. The underlying data has been updated, so the inputs the tribunal sees are not the inputs the system saw. In all three cases the citizen is asked to accept the state's word that the process was sound. That is precisely the trust the duty to give reasons exists to make unnecessary.
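Two of these three gaps, drift and mutability, can be made detectable with very little machinery. The following is a minimal sketch in Python, not a description of any deployed system: it assumes the inputs and the rule set can be serialised as JSON, and every name and value in it (`fingerprint`, `snapshot`, `compare`, the case data) is hypothetical. Opacity is a different problem and is not addressed here, because a digest proves what was used, not why it mattered.

```python
import hashlib
import json


def fingerprint(obj) -> str:
    """SHA-256 digest of a canonical JSON encoding (sorted keys, fixed separators)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def snapshot(inputs: dict, rules: dict) -> dict:
    """Taken at decision time: pins exactly what the system saw and what it ran."""
    return {"inputs_digest": fingerprint(inputs), "rules_digest": fingerprint(rules)}


def compare(pinned: dict, inputs_now: dict, rules_now: dict) -> list:
    """Name the gaps that appear when material is produced later, e.g. on appeal."""
    problems = []
    if fingerprint(rules_now) != pinned["rules_digest"]:
        problems.append("drift")       # the rule set is not the one that ran
    if fingerprint(inputs_now) != pinned["inputs_digest"]:
        problems.append("mutability")  # the inputs are not the ones the system saw
    return problems


# Hypothetical data for one case on the day of the decision.
rules_v1 = {"version": "2026-01", "threshold_points": 8}
inputs_v1 = {"case_id": "C-001", "mobility_points": 7}
pinned = snapshot(inputs_v1, rules_v1)

# What the department might hold months later.
rules_v2 = {"version": "2026-05", "threshold_points": 9}
inputs_v2 = {"case_id": "C-001", "mobility_points": 6}

print(compare(pinned, inputs_v1, rules_v1))  # []
print(compare(pinned, inputs_v1, rules_v2))  # ['drift']
print(compare(pinned, inputs_v2, rules_v2))  # ['drift', 'mutability']
```

The digests only help if they are stored somewhere the operator cannot quietly rewrite, which is a separate requirement from computing them.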

An explanation produced after the fact, reverse-engineered to justify a conclusion already reached, is not a reason. It is an alibi. The law has always been suspicious of reasons that appear only once a decision is challenged, because reasoning that was genuinely part of the decision can be shown to have existed at the moment of decision. The honest test for any automated benefits system is not whether it can explain itself when asked. It is whether it can prove what it knew, what rule it applied, and what it concluded, fixed at the instant it decided, and whether a third party can check that proof without taking anyone's word for it. A system that can only narrate its reasoning in hindsight has not given a reason. It has performed one.

Coming obligations make this urgent, not optional

The regulatory direction of travel removes any comfort administrators might take from the current patchiness of enforcement. Under the European Union's Artificial Intelligence Act (the EU AI Act), systems used to determine eligibility for public assistance benefits and services fall within the high-risk category. From August 2026, the high-risk obligations take effect: risk management, data governance, detailed technical documentation, logging that enables traceability of the system's functioning, human oversight, and record keeping that lets authorities reconstruct how the system behaved. The Act does not merely ask whether the model is accurate. It asks whether you can account for what it did.

Logging and traceability are the words to dwell on. A regulator arriving to inspect a public benefits system will not be satisfied by a confident description of the model. They will want the records: which version ran, on which inputs, producing which output, under which human oversight, at which time. If those records do not exist, or exist only in mutable logs that the operator could in principle have edited, the operator is exposed. And the trend extends well beyond Europe. Liability for automated decisions is rising across jurisdictions, the standard of documentation expected is rising with it, and the era in which a public body could deploy a scoring system and keep no defensible record of its individual decisions is closing. An administrator who procures a system today that cannot produce a per-decision record is not saving money. They are deferring a cost, with interest, to the first serious challenge.

What an appealable record actually requires

Let me be concrete about what a record has to do to satisfy both the duty to give reasons and the coming obligations, because vague talk of transparency helps no one. A record adequate to an automated benefits decision has to satisfy four properties at once, and most systems satisfy none of them. The four are not a wish list. They are the minimum below which a record stops being evidence and becomes mere assertion in a more elaborate font.

It must be contemporaneous. The reasoning has to be captured at the moment of decision, not assembled later from whatever survived. It must be complete. The inputs, the rule or model version, the relevant intermediate values, and the output all belong in the record, because a reason that omits the figure that tipped the balance is not a reason. It must be tamper-evident. A record the operator can quietly edit is worth nothing in a dispute, because the citizen and the tribunal have no way to distinguish the original from a convenient revision. And it must be independently verifiable. The person harmed, or their representative, or a tribunal, has to be able to check the record without trusting the body that produced it, because the entire purpose of the exercise is to function where trust has broken down.
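The first two properties can be illustrated with a toy example. The sketch below is a hedged illustration in Python: the points rule, the threshold, the field names and the `DecisionTrace` and `decide` names are all invented for the purpose and do not correspond to any real benefit scheme. It shows a decision function that writes down its inputs, rule version and intermediate values as it runs, including the figure that tipped the balance. It does nothing for tamper evidence or independent verification, which need signing and chaining on top.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DecisionTrace:
    decided_at: str     # captured when the decision runs, not reconstructed later
    rule_version: str
    inputs: dict
    intermediates: dict = field(default_factory=dict)
    outcome: str = ""
    reason: str = ""


def decide(inputs: dict, rule_version: str, threshold: int) -> DecisionTrace:
    """Apply a hypothetical points rule and record the working as it happens."""
    trace = DecisionTrace(
        decided_at=datetime.now(timezone.utc).isoformat(),
        rule_version=rule_version,
        inputs=dict(inputs),  # copy, so later edits to the caller's data do not leak in
    )
    total = inputs["daily_living_points"] + inputs["mobility_points"]
    trace.intermediates = {
        "total_points": total,
        "threshold": threshold,
        "shortfall": max(0, threshold - total),
    }
    if total >= threshold:
        trace.outcome = "award_maintained"
        trace.reason = f"Total of {total} points meets the threshold of {threshold}."
    else:
        trace.outcome = "award_reduced"
        trace.reason = (
            f"Total of {total} points is {threshold - total} short of "
            f"the threshold of {threshold} under rule set {rule_version}."
        )
    return trace


trace = decide({"daily_living_points": 4, "mobility_points": 3}, "2026-01", 8)
print(trace.outcome)        # award_reduced
print(trace.intermediates)  # {'total_points': 7, 'threshold': 8, 'shortfall': 1}
print(trace.reason)
```

The design point is that the reason is a by-product of the calculation itself, so it cannot diverge from what the calculation actually did.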

Hold those four together and you will notice that ordinary system logs fail at least two of them. Logs are usually mutable, the operator holds the keys, and verifying them means trusting the operator's own infrastructure. A screenshot fails all four. A model card describes the system in general but says nothing about your decision in particular. The standard tooling of the industry was built to help engineers debug systems, not to help citizens hold the state to account. Those are different jobs, and the second one is the one administrative law actually requires. The reason almost no deployed system clears the bar is that almost no one set out to clear it. They set out to make decisions, and treated the record as exhaust.

The signed record, fixed before the act

This is the problem we built Mickai to solve, and I will describe the mechanism plainly rather than dress it up. Mickai is a Sovereign Intelligence Operating System (SIOS), built and in production, not a prototype and not a roadmap. Inside it, every action an artificial intelligence (AI) takes is written to what we call the Open Audit Record (OAR). The record is signed before the action executes, not after. That ordering is the whole game. A reason that is committed before the decision runs cannot be a later justification, because it physically predates the outcome it would supposedly excuse. The record is hash-chained and append-only, so each entry is cryptographically bound to the one before it and nothing can be silently removed or reordered. The signatures are post-quantum, using the United States National Institute of Standards and Technology (NIST) standard for digital signatures (Federal Information Processing Standard 204, which specifies the ML-DSA algorithm, here at the ML-DSA-65 parameter set), so the proofs are designed not to decay as cryptography advances and the records remain checkable for the decades over which a benefits decision can echo through a person's life.
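The ordering and the chaining can be sketched in a few lines. What follows is an illustration of the general technique in Python, not the Open Audit Record format, which this article does not specify. The `AuditLog` class, its field names and the `intent` payload are assumptions. The signer is also a stand-in: the Python standard library has no ML-DSA implementation, so the sketch uses an HMAC purely to mark where a signature would go. An HMAC is symmetric and would not give a third party independent verification; a real system would use an asymmetric signature there.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # prev_hash used by the first entry


def canonical(obj) -> bytes:
    """Deterministic JSON bytes, so the same entry always hashes the same way."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


class AuditLog:
    def __init__(self, sign):
        self._sign = sign   # callable: bytes -> signature string
        self.entries = []   # append-only by convention in this sketch

    def commit(self, intent: dict) -> dict:
        """Bind the entry to its predecessor, sign it, and append it."""
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else GENESIS
        body = {"index": len(self.entries), "prev_hash": prev_hash, "intent": intent}
        entry_hash = hashlib.sha256(canonical(body)).hexdigest()
        entry = dict(body, entry_hash=entry_hash,
                     signature=self._sign(entry_hash.encode("ascii")))
        self.entries.append(entry)
        return entry

    def act(self, intent: dict, action):
        """Record first, act second: the entry exists before the action runs."""
        entry = self.commit(intent)
        return entry, action()


# Stand-in signer (assumption): NOT ML-DSA-65, and not suitable for real use.
DEMO_KEY = b"demo-key-not-for-production"


def demo_sign(message: bytes) -> str:
    return hmac.new(DEMO_KEY, message, hashlib.sha256).hexdigest()


log = AuditLog(demo_sign)


def reduce_payment():
    # By the time this runs, the entry describing it is already in the log.
    return len(log.entries)


first, seen = log.act(
    {"case_id": "C-001", "action": "reduce_payment", "rule_version": "2026-01"},
    reduce_payment,
)
second, _ = log.act({"case_id": "C-002", "action": "maintain_award"}, lambda: None)

print(seen)                                        # 1
print(second["prev_hash"] == first["entry_hash"])  # True
```

Because `act` never calls the action before `commit` returns, there is no code path in this sketch where an outcome exists without a prior entry.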

The property that matters most for a citizen is the last one. The record is verifiable offline, in an ordinary web browser, with no trust in the vendor. A person who receives an automated decision, or the adviser helping them, or the tribunal reviewing it, can take the record and check the signatures and the chain themselves, on their own machine, without calling us and without believing a word we say. That is what independent verification means. It is the difference between "we promise this is what happened" and "here is the proof, check it yourself." For a tribunal, that distinction is the difference between accepting testimony and examining evidence, and the second is the one the law was built around.
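The chain half of that check is simple enough to show. This is a sketch in Python of the general idea only: the entry layout and the `entry_hash`, `build_chain` and `verify_chain` helpers are hypothetical, the real verifier described above runs in a browser, and signature checking is left out because the Python standard library does not provide ML-DSA. The point it illustrates is that the verifier needs nothing from the operator except the record itself.

```python
import copy
import hashlib
import json

GENESIS = "0" * 64  # prev_hash expected on the first entry


def entry_hash(index: int, prev_hash: str, payload: dict) -> str:
    """Hash of an entry's content, including the link to its predecessor."""
    body = {"index": index, "prev_hash": prev_hash, "payload": payload}
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_chain(payloads: list) -> list:
    """Operator side: produce a chained record (sample data for the demo)."""
    chain, prev = [], GENESIS
    for i, payload in enumerate(payloads):
        h = entry_hash(i, prev, payload)
        chain.append({"index": i, "prev_hash": prev, "payload": payload, "entry_hash": h})
        prev = h
    return chain


def verify_chain(chain: list):
    """Verifier side: return the position of the first bad entry, or None."""
    prev = GENESIS
    for position, entry in enumerate(chain):
        if entry["index"] != position or entry["prev_hash"] != prev:
            return position  # removed, reordered, or relinked
        recomputed = entry_hash(entry["index"], entry["prev_hash"], entry["payload"])
        if recomputed != entry["entry_hash"]:
            return position  # content edited after the fact
        prev = entry["entry_hash"]
    return None


chain = build_chain([
    {"case_id": "C-001", "outcome": "award_reduced"},
    {"case_id": "C-002", "outcome": "award_maintained"},
    {"case_id": "C-003", "outcome": "award_reduced"},
])

edited = copy.deepcopy(chain)
edited[1]["payload"]["outcome"] = "award_reduced"  # quiet revision of one entry

removed = chain[:1] + chain[2:]                    # one entry silently dropped

print(verify_chain(chain))    # None
print(verify_chain(edited))   # 1
print(verify_chain(removed))  # 1
```

A hash chain on its own does not stop someone who can rewrite an entry and every entry after it, since the recomputed hashes would be internally consistent. That is the job of the signatures and of anchoring the chain's root somewhere the operator does not control.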

Figure: A sealed, hash-chained record anchors a decision the way a column anchors a building: it lets a third party verify the structure without trusting the builder.

Under the surface, the decisions are produced by fifty specialist models, twenty-five covering domains and twenty-five handling operations, running on a silicon substrate we call Poseidon. We are actively training our own models now, fine-tuning and specialising open foundations including Llama 3.2 and Qwen 2.5 and building a sealed corpus, with funding scaling toward fully native weights. The audit root can be anchored to Bitcoin through Pantheon, our sovereign Layer 1 settlement chain, whose token PAN has a fixed supply of five billion, and which is the one part of the stack still in build. The architecture itself is the subject of 101 filed United Kingdom patent applications, roughly 2,234 claims, owned by Mickai LTD, with myself as named inventor. But for the administrator reading this, the architecture is secondary. The point is the record, and the property of the record is that it can be checked by the person it was made about.

The honest caveats

I am a security realist, so I will not pretend a signed record fixes everything. It does not make a bad rule good. If the underlying policy is unfair, an immaculate audit trail simply proves, with cryptographic precision, that you applied an unfair rule consistently. The record makes decisions contestable. It does not make them correct. That remains the work of good policy, good model training, and human judgment about where automation belongs at all and where a person must decide. A signed record is a flashlight, not a conscience. It shows you exactly what was done, which is necessary, and entirely separate from whether what was done should have been done.

Nor does verifiability substitute for explanation in plain language. A citizen needs both the proof that the record is genuine and a statement, in words she can read, of why her payment changed. The signed record makes the explanation trustworthy. It does not write the explanation for you. And a record is only as honest as the data committed to it. If a department feeds the system the wrong facts, the record will faithfully and permanently memorialise a decision made on wrong facts. That is still an improvement, because the error becomes visible and fixed in place rather than deniable, but nobody should mistake tamper evidence for truth. The record guarantees that what you are looking at is what happened. It cannot guarantee that what happened was right, and any vendor who tells you otherwise is selling you comfort, not accountability.

What I would ask of anyone automating a public decision

So here is the standard I would hold any automated benefits system to, ours included. Before it decides anything about a real person, ask whether it can produce, for that one decision, a record that is contemporaneous, complete, tamper-evident, and verifiable by the citizen without trusting the operator. If it cannot, it is not ready to make decisions that reduce someone's income, because it cannot meet the duty the law already imposes, and it will not survive the documentation obligations now arriving. Speed and consistency were never the hard part. The hard part was always keeping a record worthy of an appeal. Any vendor can give you the first two. Ask hard about the third, because that is the one that will be tested in front of a panel, under oath, with a person's livelihood on the table.

The woman opening that envelope does not need a faster letter. She needs a letter she can argue with, backed by a record she can check, in front of a tribunal that can see exactly what the machine knew and concluded on the morning it decided. That is not a feature request. It is the oldest promise the state makes to the governed: that power over your life will be exercised for reasons you are allowed to know and entitled to contest. Automation can keep that promise better than paper ever did, because a signed, hash-chained record forgets nothing and hides nothing. It can also break that promise more quietly than paper ever could, because a deleted log leaves no torn page and no missing file. The signed, offline-verifiable record is how we make sure it keeps the promise rather than breaks it. Everything else in this essay is detail. That sentence is the thesis.

Originally published at https://mickai.co.uk/articles/appealable-record-automated-benefits-decisions. If you operate in a regulated sector or want sovereign AI on your own hardware, the audit form on mickai.co.uk is the entry point.