Chain of Custody for a Machine Decision
Forensic evidence handling solved provenance a century ago. Machine decisions are about to be held to the same standard, and most artificial intelligence systems cannot meet the only demand that matters: prove what you did.
The question nobody asks until it is too late
I spent years in and around systems where the output of a machine could ruin someone's day, or their year. A fraud model freezes an account. A triage model deprioritises a patient. A scoring model denies a mortgage. When something goes wrong, the first question is never about accuracy. It is about provenance. Who decided this, on what input, under which version of the model, and can you prove it without taking the vendor's word for it? That is not a data science question. It is a chain of custody question, and most artificial intelligence (AI) systems cannot answer it.
Forensic evidence handling solved this problem a long time ago, and it solved it the hard way, through cases that collapsed because a sample was unsealed, a bag was unlabelled, or a log had a gap nobody could explain. The lesson is brutal and simple. Evidence that cannot prove its own handling is not weak evidence. It is no evidence. A jury never gets to weigh it. I think we are about to learn the same lesson about machine decisions, and I would rather we learn it on purpose than in a courtroom. The organisations that get ahead of this will treat every consequential decision as a piece of evidence from the moment it is made, not as a line in a file they hope nobody ever subpoenas.
What evidence handling actually requires
Chain of custody is an unglamorous discipline. It demands that every item be uniquely identified, that every transfer be recorded with who, when, and why, that the record be tamper-evident, and that any break in the chain be visible rather than hidden. It does not require that the handlers be honest. That is the whole point. The system is designed to survive dishonest, careless, or simply forgetful handlers, because the integrity lives in the record, not in anyone's good character.
Now hold an ordinary AI deployment up against that standard. The decision is real. The consequences are real. But the record is usually a log file, written after the fact, stored in a system the operator controls, editable by anyone with the right access, and meaningful only if you trust the people who run the database. There is no seal. There is no independent identity for each decision. A gap in the log looks exactly like a quiet day. By forensic standards, that is not a chain of custody. It is a story told by an interested party, and an interested party's story is the first thing a serious investigator sets aside.
Logs are an alibi, not a record
Here is the uncomfortable distinction. A log is something you write down. A record, in the forensic sense, is something you cannot later deny. The difference is everything, and almost every AI system on the market today produces only the first kind. Application logs are written by the same process that made the decision, into storage that the same organisation administers. If that organisation is careless, the log is unreliable. If it is motivated, the log is editable. Either way, when you need it most, you are asking a court, a regulator, or a customer to trust the defendant's own diary.
I am not being cynical about people. I am being realistic about incentives and about how breaches actually unfold. When an AI system is the thing under suspicion, retrospective logs are the first artefact anyone with access will be tempted to tidy. A real chain of custody removes the temptation by removing the ability. The record has to be sealed at the moment of the act, not assembled afterwards when the stakes are clear. Once the consequences are visible, every log becomes a negotiation with the past, and a record that can be renegotiated is not a record at all.
Sign before you act, not after
This is the hinge of the whole argument, so I will be blunt about it. If you record a decision after it executes, you are documenting history and hoping nobody rewrites it. If you sign the decision before it executes, you have created custody. The order matters more than almost anything else in the design. A signature that comes first commits the system to exactly one account of what it was about to do. There is no room to reconcile the story with the outcome later, because the story was sealed before the outcome existed.
In the Mickai Sovereign Intelligence Operating System (SIOS), this is the function of the Open Audit Record (OAR). Every AI action is signed before it executes, then hash-chained into an append-only sequence where each entry locks the one before it. Break a link and the break is visible to anyone. The signatures are post-quantum, using the United States National Institute of Standards and Technology (NIST) standard FIPS 204 (ML-DSA-65), because a custody record has to outlive the cryptography that was fashionable when it was written. A chain of custody that can be forged in ten years is not a chain of custody. It is a delay. We chose to sign first and to sign with cryptography built for the next decade precisely because the cheap shortcut, recording after the fact with today's algorithms, fails exactly when someone finally cares enough to attack it.
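To make the ordering concrete, here is a minimal Python sketch of sign-first custody. It is an illustration under stated assumptions, not the OAR implementation. The class and function names (CustodyChain, commit_then_run) and the field names are hypothetical, and HMAC-SHA256 stands in for a real signature only because ML-DSA-65 is not in the Python standard library. A keyed hash shared with the operator would not give a doubter independent verification, so treat the sign function as a placeholder for a public-key scheme.

```python
import hashlib
import hmac
import json

# ASSUMPTION: HMAC-SHA256 is a stand-in for a real digital signature.
# The article names ML-DSA-65 (FIPS 204); that needs a dedicated library.
SIGNING_KEY = b"demo-key-not-for-production"
GENESIS = "0" * 64  # hypothetical marker for "no previous entry"
BODY_FIELDS = ("seq", "decision_id", "intent", "prev_hash")


def canonical(obj):
    """Serialise deterministically so equal content always hashes the same."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(payload):
    """Placeholder signature over the payload bytes."""
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


def seal(body, signature):
    """Hash of body plus signature; the next entry stores this as prev_hash."""
    return hashlib.sha256(canonical(body) + signature.encode("utf-8")).hexdigest()


class CustodyChain:
    def __init__(self):
        self.entries = []

    def commit_then_run(self, decision_id, intent, action):
        """Sign and append the intended decision, and only then execute it."""
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else GENESIS
        body = {
            "seq": len(self.entries),
            "decision_id": decision_id,
            "intent": intent,
            "prev_hash": prev_hash,
        }
        signature = sign(canonical(body))
        entry = {**body, "signature": signature, "entry_hash": seal(body, signature)}
        self.entries.append(entry)
        # The record is sealed before this line; the outcome cannot shape it.
        return action()

    def verify(self):
        """Recompute every link; any edit, gap, or reordering returns False."""
        prev_hash = GENESIS
        for index, entry in enumerate(self.entries):
            body = {key: entry[key] for key in BODY_FIELDS}
            if entry["seq"] != index or entry["prev_hash"] != prev_hash:
                return False
            if not hmac.compare_digest(entry["signature"], sign(canonical(body))):
                return False
            if entry["entry_hash"] != seal(body, entry["signature"]):
                return False
            prev_hash = entry["entry_hash"]
        return True


chain = CustodyChain()
# The action inspects the chain: its own entry already exists when it runs.
seen_at_run = chain.commit_then_run(
    "d-001",
    {"action": "freeze_account", "model_version": "fraud-v3"},
    lambda: len(chain.entries),
)
```

If the action raises or is never reached, the signed entry still exists. That is the intent of the ordering: the system is committed to one account of what it was about to do, whatever happens next.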
Admissibility is the real test
We are walking into a period where machine decisions will be examined the way physical evidence is examined. The European Union (EU) Artificial Intelligence Act brings hard obligations for high-risk systems from August 2026, including logging and traceability that you can actually produce on demand. AI liability exposure is rising across jurisdictions, and the direction of travel is clear even where the precise rules are still settling. Operators will increasingly be asked not whether their model was good, but whether they can prove what it did, in a form that survives hostile scrutiny.
Admissibility has never been about how clever your evidence is. It is about whether its handling can be trusted by people who have every reason to doubt you. The forensic world expresses this through a handful of plain demands, and they map onto machine decisions almost without translation. Meet them and your record walks into the room. Miss any one of them and a competent opponent has the whole thing excluded before anyone debates whether the model was right.
- Identity: every decision is a uniquely identified item, not an anonymous line in a stream.
- Integrity: the record is tamper-evident, so alteration is detectable rather than deniable.
- Continuity: no silent gaps, because a missing link is visible as a missing link.
- Independence: the record can be checked without trusting the party that produced it.
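The four demands above can be read as checks a doubter runs over an exported set of records. The Python sketch below is one hedged way to express that. The record layout (seq, id, decision, prev, hash) and the function names are hypothetical, chosen only for illustration. Signature checking is left out to keep the example short; a real verifier would also check each entry's signature against a published public key, since a hash chain alone does not stop the producer from rebuilding the whole chain.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical marker for "no previous record"


def record_hash(record):
    """Hash every field except the stored hash, in a canonical form."""
    body = {key: value for key, value in record.items() if key != "hash"}
    data = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def make_record(seq, record_id, decision, prev):
    """Build one record for the demo; field names are illustrative."""
    record = {"seq": seq, "id": record_id, "decision": decision, "prev": prev}
    record["hash"] = record_hash(record)
    return record


def audit(records):
    """Return (demand, seq, problem) findings; an empty list means no findings.

    Independence: this needs only the exported records and the standard
    library, with no call back to the system that produced them.
    """
    findings = []
    seen_ids = set()
    prev_hash = GENESIS
    expected_seq = 0
    for record in records:
        # Identity: each decision must be its own uniquely named item.
        if record["id"] in seen_ids:
            findings.append(("identity", record["seq"], "duplicate id"))
        seen_ids.add(record["id"])
        # Integrity: the content must still match the hash stored with it.
        if record_hash(record) != record["hash"]:
            findings.append(("integrity", record["seq"], "content does not match hash"))
        # Continuity: numbering and back-links must be unbroken.
        if record["seq"] != expected_seq or record["prev"] != prev_hash:
            findings.append(("continuity", record["seq"], "missing or reordered link"))
        prev_hash = record["hash"]
        expected_seq = record["seq"] + 1
    return findings


r0 = make_record(0, "d-001", "freeze_account", GENESIS)
r1 = make_record(1, "d-002", "deny_mortgage", r0["hash"])
r2 = make_record(2, "d-003", "deprioritise", r1["hash"])

clean = audit([r0, r1, r2])
with_gap = audit([r0, r2])  # the middle record was quietly removed
with_edit = audit([r0, dict(r1, decision="approve_mortgage"), r2])
```

In this sketch the removed record surfaces as a continuity finding and the edited record as an integrity finding, rather than either one passing as a quiet day.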
Verifiable by the doubter, offline
The last demand is the one most systems quietly fail, and it is the one I care about most. If verifying a decision requires logging into the vendor's portal, calling the vendor's interface, or trusting the vendor's dashboard, then the vendor is still the custodian, and we are back to the diary. Real custody means the person who distrusts you most can verify the record on their own terms. With the OAR, you can take a record and confirm it in an ordinary web browser, offline, with no connection to Mickai and no trust in us required. The mathematics either holds or it does not, and you do not need our permission to find out.
That offline, vendor-independent check is what turns an internal log into something closer to admissible evidence. And because the SIOS runs sovereign, with the fifty brains (twenty-five domain and twenty-five operational) on the Poseidon substrate, the custody record is anchored where you control it rather than dispersed across infrastructure you rent. The Pantheon chain, the one part of this we are still building, is a sovereign Layer 1 that extends that anchor by rooting the audit trail to Bitcoin (through its token PAN, fixed supply five billion), so the integrity of the record stops depending on any single operator surviving, including us. We are also actively training our own models now, fine-tuning and specialising open foundations and building a sealed corpus, so that over time even the intelligence behind a decision sits inside the same chain of custody as the decision itself.
The standard I would want applied to me
I will put it plainly, because I think the industry has been hiding behind complexity. If an AI decision affects a person, that decision should carry the same burden of proof we already demand of a fingerprint or a blood sample. Identified, sealed, continuous, and checkable by someone who does not trust the lab. We have known how to do this for evidence for over a century. There is no honest reason machine decisions should be held to a lower standard, and every reason to expect that regulators and courts will soon refuse to accept one. Complexity has been the excuse for skipping the discipline. It was never a good one, and it is running out of road.
Accuracy will always matter. But accuracy is the argument you make after your evidence is admitted. Custody is what gets it admitted in the first place. Build the record so it is signed before the act, sealed against tampering, continuous by construction, and verifiable by the person most determined to prove you wrong. Do that, and you are not asking anyone to trust you. You are giving them the means to check. That is the difference between a log and a record, and it is the whole of the thesis.


