Article · 15 June 2026

The Provenance of a Defect

When an AI passes a faulty part, liability turns on a provable account of the decision, not on how clever the model was.

Author

Micky Irons

manufacturing AI, quality control, AI liability, provenance, tamper-evident audit

A part that passed

A torque-bearing bracket leaves the line stamped good. Six months later it fails in the field, a vehicle is recalled, and a lawyer asks a single question: who decided this part was acceptable, and on what basis? In a modern plant the honest answer is no longer a person. It is a model. An optical inspection system, a vibration classifier, or a fused vision-and-sensor network looked at that bracket, scored it, and waved it through. The defect was always there. The machine simply did not flag it, or flagged it and was overruled by a downstream rule nobody can now reconstruct. I have spent enough time around both manufacturing floors and machine-learning systems to believe the central problem of industrial artificial intelligence (AI) is not accuracy. It is provenance. When a model passes a faulty part, liability does not turn on how clever the model was. It turns on whether you can produce a provable, tamper-evident account of the decision it made.

This is an essay about that account: why the record of an AI quality decision is now the most valuable artifact in the building, why most factories cannot produce one that survives scrutiny, and what it takes to fix that. I will be plain about where the industry is, plainer still about where it is going, and I will end on what we have built at Mickai to address it. I am not selling a miracle. I am describing a discipline, and a discipline is something you practise on every part, every shift, not something you buy once and forget.

The defect was always going to happen

Start with a fact that engineers accept and lawyers resist: no inspection system catches everything. Quality control has always been a statistical exercise. You set a threshold, you accept a false-negative rate, and you live with the tail. Human inspectors missed parts for a century. The difference now is scale and speed. A vision model can clear thousands of parts an hour, which means a rare miss is not a hypothetical, it is a routine certainty over a production run. When you industrialise judgment, you industrialise its errors too, and you do it at a volume no human line ever reached.
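To put rough numbers on "a routine certainty", here is a back-of-envelope sketch. Every rate in it is a hypothetical assumption chosen for illustration, not a figure from any real plant, and it treats parts as independent, which a real line may not be.

```python
# Back-of-envelope: how often does a defective part get passed?
# All three constants are hypothetical assumptions for illustration only.

PARTS_PER_HOUR = 2_000        # assumed line throughput
DEFECT_RATE = 0.001           # assumed share of parts that are truly defective
FALSE_NEGATIVE_RATE = 0.02    # assumed share of defective parts the model passes


def escape_probability(defect_rate: float, false_negative_rate: float) -> float:
    """Probability that a single part is defective AND is passed anyway."""
    return defect_rate * false_negative_rate


def expected_escapes(parts: int, defect_rate: float, false_negative_rate: float) -> float:
    """Expected number of defective parts passed over a run of `parts`."""
    return parts * escape_probability(defect_rate, false_negative_rate)


def prob_at_least_one_escape(parts: int, defect_rate: float, false_negative_rate: float) -> float:
    """Chance of one or more escapes in the run, assuming independent parts."""
    q = escape_probability(defect_rate, false_negative_rate)
    return 1.0 - (1.0 - q) ** parts


if __name__ == "__main__":
    shift = PARTS_PER_HOUR * 8      # one eight-hour shift
    year = shift * 3 * 250          # three shifts a day, 250 working days
    for label, n in (("shift", shift), ("year", year)):
        print(
            f"{label}: {n} parts, "
            f"{expected_escapes(n, DEFECT_RATE, FALSE_NEGATIVE_RATE):.2f} expected escapes, "
            f"P(at least one) = {prob_at_least_one_escape(n, DEFECT_RATE, FALSE_NEGATIVE_RATE):.4f}"
        )
```

Under these made-up rates a single part has a one-in-fifty-thousand chance of escaping, yet over a year of production the expected count runs into the hundreds. The exact figures matter less than the shape: a small per-part rate multiplied by industrial volume stops being rare.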

So the interesting question is never whether a defect escaped. Defects escape. The question is what you knew, when you knew it, and what your system did with that knowledge. A part that fails because the model genuinely could not have seen the flaw is one kind of event. A part that fails because the model scored it borderline and a confidence threshold was quietly relaxed last quarter to hit throughput targets is an entirely different kind of event. The physical defect is identical. The legal and moral weight is not. And the only thing that separates the two is the record. Without it, both stories look the same, and in a dispute the worse story is the one a court will be invited to assume.

Liability followed the decision, and the decision moved

For most of industrial history, the chain of responsibility for a defective part was traceable through people and paper. An inspector signed off. A supervisor approved a deviation. A quality engineer authored the control plan. Each of those was a human act with a name attached, and the law is comfortable assigning fault to named humans. What has changed is that the decisive act, the moment of accept or reject, has migrated into software. The named human is now three steps removed, configuring a model, approving a dataset, or simply trusting a vendor's closed system. The decision still gets made thousands of times a day. It just no longer leaves a signature in the old sense.

Regulators have noticed. The European Union (EU) AI Act treats AI used in the safety components of products as high-risk, and from August 2026 the substantive obligations for many high-risk systems begin to bite: logging, traceability, human oversight, and technical documentation that lets an authority reconstruct how the system behaved. Product liability regimes across the EU and beyond are being updated so that a defect in software, including the behaviour of an AI model, can ground a claim the same way a cracked weld would. The direction of travel is unambiguous. If an AI made the call, you will be expected to show the call. In practical terms, the burden of proof is shifting toward the operator who deployed the system and away from the person who was harmed by it.

That shift is the whole game. In the old world a claimant had to prove your process was negligent. In the emerging world you increasingly have to prove your process was sound, and you have to prove it with evidence that a court, an auditor, or an insurer will accept. Evidence you generated yourself, and could have edited, is weak evidence. This is the uncomfortable centre of the matter, and most quality systems are not built for it.

Why your current logs will not save you

Walk into most plants, ask to see the AI inspection logs, and you will get something. A database table. A manufacturing execution system feed. Per-part scores, timestamps, a pass or fail flag. This feels like a record. Under adversarial conditions it is closer to a liability than a defence, for three reasons that compound.

The first is mutability. An ordinary database log can be altered, and crucially it can be altered without a trace. If your defence rests on a record that your own administrators could have rewritten after the incident, opposing counsel will say exactly that, and they will be right to. The record proves what the database currently says, not what happened at the time.

The second is the after-the-fact problem. Many systems log the outcome but not the decision as it stood before execution. They tell you the part was passed. They do not prove what the model actually saw, what threshold was live, and what configuration governed that exact pass, sealed at the instant before the part moved on. A log written after the conveyor advanced is a description of history, not a witness to it.

The third is vendor dependence. When the inspection model is a supplier's closed system, your evidence is whatever the supplier chooses to expose, in whatever format they choose, retained for however long their retention policy allows. Your ability to defend yourself in court is hostage to a third party's logging design and their willingness to cooperate years later.

None of this means current logs are worthless. It means they are the wrong shape for the question being asked. The question is not what the record says now. The question is whether you can prove the record has not changed since the decision, and whether you can prove it to someone who does not trust you.
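The mutability point is easy to demonstrate. The sketch below uses an in-memory SQLite table as a stand-in for an ordinary inspection log; the table name, columns, serial number, and values are all hypothetical. It shows that an edited row looks exactly as legitimate as the original, and that the edit only becomes visible if a digest taken at decision time was kept somewhere the editor could not reach.

```python
import hashlib
import sqlite3


def row_digest(row: tuple) -> str:
    """SHA-256 over a fixed text rendering of the row (illustrative encoding)."""
    return hashlib.sha256(repr(row).encode("utf-8")).hexdigest()


def demo() -> dict:
    # Hypothetical schema standing in for a typical per-part inspection log.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inspection_log ("
        "serial TEXT PRIMARY KEY, score REAL, threshold REAL, result TEXT)"
    )
    conn.execute(
        "INSERT INTO inspection_log VALUES (?, ?, ?, ?)",
        ("SN-0001", 0.91, 0.90, "PASS"),
    )
    query = (
        "SELECT serial, score, threshold, result "
        "FROM inspection_log WHERE serial = ?"
    )

    original = conn.execute(query, ("SN-0001",)).fetchone()
    # Imagine this digest was exported at decision time to a place the
    # database administrators cannot write to.
    digest_at_decision = row_digest(original)

    # Later, someone with write access edits the score.
    # Nothing in the table itself records that this happened.
    conn.execute(
        "UPDATE inspection_log SET score = ? WHERE serial = ?",
        (0.97, "SN-0001"),
    )
    current = conn.execute(query, ("SN-0001",)).fetchone()
    conn.close()

    return {
        "original": original,
        "current": current,
        "tampered": row_digest(current) != digest_at_decision,
    }


if __name__ == "__main__":
    print(demo())
```

Without the externally held digest, the table alone offers no way to tell the two histories apart. That is the gap the requirements in the next section are meant to close.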

What a defensible record actually requires

Strip the problem to its requirements and a short, demanding list appears. A record fit to anchor liability has to satisfy all of these at once, and most systems satisfy maybe two. Read the list as an engineer first, then read it again as an adversary, because the adversary is who it is really written for.

Pre-commitment: the decision is recorded before it takes effect, not described after. The model's input, the configuration, the threshold, and the score are sealed at the moment of judgment, so the record witnesses the act rather than reporting on it.
Tamper evidence: any later alteration is detectable. The record is hash-chained and append-only, so a changed or deleted entry breaks the chain visibly. You cannot quietly rewrite history, and just as important, you cannot be credibly accused of having done so.
Independent verifiability: a third party can check the record without trusting you and without trusting your vendor. Verification does not depend on a live connection to your servers or on software only you control.
Durability against future cryptography: the signatures protecting the record must hold up against the computing of the coming decades, including quantum computers, so that a record made today is still defensible when the claim is litigated years from now.
Offline checkability: an investigator can verify the record on an ordinary machine, disconnected, with no special portal and no cooperation required from the party being investigated.

Read that list back as an adversary and its logic becomes obvious. Every property exists to remove a way the record could be doubted. Pre-commitment removes the suspicion that you decided the story afterwards. Tamper evidence removes the suspicion that you edited it. Independent and offline verification remove the suspicion that the tooling is rigged in your favour. The goal is a record whose trustworthiness does not rest on anyone trusting you, because in a dispute, nobody will. That is a high bar, and it is deliberately high. A record that only convinces people already inclined to believe you is not evidence. It is a comfort blanket.
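The tamper-evidence and offline-checkability properties can be sketched in a few lines. This is a minimal illustration of a hash-chained, append-only log, not a description of any particular product: the field names and the JSON encoding are assumptions, and signatures are left out entirely because the Python standard library has no post-quantum signature scheme.

```python
import hashlib
import json
from typing import List, Optional

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: List[dict], payload: dict) -> None:
    """Add an entry that commits to everything that came before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"prev": prev_hash, "payload": payload, "hash": entry_hash(prev_hash, payload)}
    )


def verify(chain: List[dict]) -> Optional[int]:
    """Return the index of the first broken entry, or None if the chain is intact.

    Needs only the chain itself: no network, no server, no vendor tooling.
    """
    prev_hash = GENESIS
    for i, entry in enumerate(chain):
        if entry["prev"] != prev_hash:
            return i  # an earlier entry was removed or reordered
        if entry["hash"] != entry_hash(entry["prev"], entry["payload"]):
            return i  # this entry's content was altered
        prev_hash = entry["hash"]
    return None


if __name__ == "__main__":
    log: List[dict] = []
    for n, score in enumerate((0.97, 0.91, 0.88), start=1):
        append(log, {"serial": f"SN-{n:04d}", "score": score, "threshold": 0.90})
    print("intact:", verify(log))

    log[1]["payload"]["score"] = 0.99   # quiet edit to the middle entry
    print("after edit, first broken index:", verify(log))
```

One limit is worth stating plainly: a bare chain like this detects edits and deletions in the middle, but not truncation of the most recent entries, and anyone able to rewrite the whole file can recompute every hash. Closing those gaps requires signing each entry and publishing the latest hash somewhere independent, which is what the signature and anchoring requirements are for.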

A concrete walk through a single part

Make it concrete. A casting arrives at an optical inspection station. The vision model is asked one question: is the porosity within tolerance? Here is the sequence that matters, and the point at which each step either creates evidence or destroys it. The camera captures the image. The model produces a score, say 0.91 against an accept threshold of 0.90. The part is about to be passed. In a defensible system, the moment before that pass executes, the system seals a record: the image reference, the model identity, the exact threshold in force, the score, the timestamp, and the operator or automation context. That seal is signed and chained to the seal before it. Only then does the part advance. The decision was committed before it took effect, and the commitment is now part of an unbroken chain.
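The ordering in that sequence is the part that is easiest to get wrong, so here is a minimal sketch of it. All names are hypothetical, the accept rule (score at or above threshold) is an assumption consistent with the 0.91 against 0.90 example, and signing is omitted. The only point the code makes is structural: the seal is written first, and if sealing fails the part does not move.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Decision:
    """The fields the walkthrough says must be sealed (names are illustrative)."""
    serial: str
    image_ref: str
    model_id: str
    threshold: float
    score: float
    timestamp: str
    context: str


def seal(ledger: List[dict], decision: Decision, accepted: bool) -> str:
    """Chain the decision to the previous seal and store it. Signing is omitted."""
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    record = dict(asdict(decision), accepted=accepted)
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
    ledger.append({"prev": prev_hash, "record": record, "hash": digest})
    return digest


def inspect_and_release(
    ledger: List[dict], decision: Decision, advance: Callable[[str], None]
) -> bool:
    """Seal first, act second. An exception in seal() means advance() never runs."""
    accepted = decision.score >= decision.threshold   # assumed accept rule
    seal(ledger, decision, accepted)                   # commit the decision
    if accepted:
        advance(decision.serial)                       # only now does the part move
    return accepted


if __name__ == "__main__":
    ledger: List[dict] = []
    released: List[str] = []
    casting = Decision(
        serial="C-1001",
        image_ref="img/C-1001.png",
        model_id="porosity-model-v3",
        threshold=0.90,
        score=0.91,
        timestamp="2026-03-16T08:14:02Z",
        context="automation:station-4",
    )
    print(inspect_and_release(ledger, casting, released.append), released, len(ledger))
```

Rejected parts are sealed too. A chain that only records passes cannot show what the system was turning away at the time, which is often the context an investigator needs.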

Now run the failure forward. Months later the casting fails. An investigator pulls the record for that serial number. They see the part scored 0.91 against a 0.90 threshold. That is a defensible miss: the system did what it was configured to do, the configuration was reasonable, and you can prove both. But suppose instead the investigator finds the threshold had been moved from 0.95 to 0.90 two weeks before the part was made, that the score was 0.91, and that under the old threshold the part would have been rejected. Now the record tells a story about a decision to loosen tolerance, who made it, and when. That is not a defence, but it is the truth, and a system that hides it is worse than one that reveals it, because the truth comes out either way and only honesty is survivable. The record does not make you innocent. It makes you accountable on the facts rather than on a fight about whose logs to believe, and a fight about whose logs to believe is one you lose the moment it starts.
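The investigator's question in that second scenario is mechanical once the configuration history is part of the record. The sketch below assumes threshold changes are themselves logged with an effective time and an author; the dates, names, and data layout are hypothetical. It replays one part against the threshold that was live when it was made and against the one it replaced.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Tuple

# (effective_from, threshold, changed_by), sorted by time. Hypothetical data.
ThresholdChange = Tuple[datetime, float, str]

HISTORY: List[ThresholdChange] = [
    (datetime(2026, 1, 5), 0.95, "quality-engineer-a"),
    (datetime(2026, 3, 2), 0.90, "line-manager-b"),
]


def index_in_force(history: List[ThresholdChange], when: datetime) -> int:
    """Index of the change that governed decisions made at `when`."""
    times = [entry[0] for entry in history]
    i = bisect_right(times, when) - 1
    if i < 0:
        raise ValueError("no threshold was in force at that time")
    return i


def review(history: List[ThresholdChange], made_at: datetime, score: float) -> dict:
    """Reconstruct the decision and compare it with the prior configuration."""
    i = index_in_force(history, made_at)
    effective_from, threshold, changed_by = history[i]
    previous = history[i - 1][1] if i > 0 else None
    return {
        "threshold_in_force": threshold,
        "passed": score >= threshold,
        "set_by": changed_by,
        "days_since_change": (made_at - effective_from).days,
        "previous_threshold": previous,
        # None when there is no earlier configuration to compare against.
        "would_have_passed_before": None if previous is None else score >= previous,
    }


if __name__ == "__main__":
    finding = review(HISTORY, datetime(2026, 3, 16), score=0.91)
    for key, value in finding.items():
        print(f"{key}: {value}")
```

The output of a replay like this is not a verdict. It is the set of facts (which threshold, set by whom, how recently, and what the earlier one would have done) on which the argument can then proceed.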

The honest caveats

I am a security realist, which means I will not pretend a perfect record solves everything. Three honest limits. The first is garbage in. A signed, tamper-evident record of a badly designed inspection is a beautifully preserved account of a bad decision. Provenance is not a substitute for getting the model, the thresholds, and the validation right. It tells the truth about your quality system, and if that system is poor, the truth will not flatter you. The second is scope. The record proves what the AI did and under what configuration. It does not prove the physics of the defect, nor does it settle every causal question about why the part failed in service. It is one strong link in an evidentiary chain, not the whole chain. Anyone who tells you a cryptographic log resolves a product liability case has never sat through one.

The third is discipline. A record is only as good as the completeness of what you choose to seal. If you log the score but not the threshold, or the outcome but not the input, you have a chain with a hole in it. Good provenance is an engineering practice, not a feature you switch on and forget. It demands that you decide, in advance, what evidence a future dispute will need, and that you seal it every single time. These limits are not arguments against provenance. They are the boundaries that tell you what provenance is for. It will not make a weak quality system strong, and it will not absolve you of bad engineering. What it will do is end the argument about whether the record can be believed, which in practice is the argument that decides most of these cases before the engineering is ever discussed.
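The discipline limit can be partly enforced in code: refuse to seal a record that is missing something a future dispute would need. A minimal sketch follows, with a required-field list taken from the casting walkthrough earlier in this essay. The field names themselves are illustrative, and a real list would be decided by the plant's own quality and legal teams.

```python
from typing import List

# Fields the walkthrough says belong in every seal (names are illustrative).
REQUIRED_FIELDS = (
    "image_ref",
    "model_id",
    "threshold",
    "score",
    "timestamp",
    "context",
)


class IncompleteRecord(ValueError):
    """Raised when a record lacks evidence that must be sealed."""


def missing_fields(record: dict) -> List[str]:
    """List required fields that are absent or explicitly None."""
    return [name for name in REQUIRED_FIELDS if record.get(name) is None]


def require_complete(record: dict) -> dict:
    """Return the record unchanged, or raise before anything gets sealed."""
    gaps = missing_fields(record)
    if gaps:
        raise IncompleteRecord("cannot seal, missing: " + ", ".join(gaps))
    return record


if __name__ == "__main__":
    partial = {
        "image_ref": "img/C-1001.png",
        "model_id": "porosity-model-v3",
        "score": 0.91,
        "timestamp": "2026-03-16T08:14:02Z",
        "context": "automation:station-4",
    }
    print(missing_fields(partial))   # the threshold was never captured
```

A check like this only guards against omissions you anticipated. Deciding what belongs on the list is the engineering judgment the paragraph above describes, and no validator makes that decision for you.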

Why this is a substrate problem, not an add-on

The instinct in most plants is to treat provenance as something you bolt on: a logging module, a compliance tool, a retention policy. I think that instinct is the mistake. If signing and chaining the record is a separate step that runs alongside the decision, then there is always a moment where the decision exists and the record does not, and that gap is exactly where doubt lives. The record has to be produced by the same act that makes the decision, or it is not a witness to the decision at all. You cannot stitch certainty back in afterwards. The gap, however small, is the seam an opposing expert will pull at.

That is an architectural commitment, and it is the one we made when we built Mickai as a Sovereign Intelligence Operating System (SIOS) rather than a model with some logging attached. Mickai runs fifty specialist brains, twenty-five domain and twenty-five operational, on a silicon substrate we call Poseidon. Underneath every one of them sits the Open Audit Record, the OAR. The principle is simple and strict: every AI action is signed before it executes, hash-chained into an append-only record, and verifiable offline in an ordinary browser with no trust in us, the vendor. The signatures use a post-quantum standard, Federal Information Processing Standard 204 from the United States National Institute of Standards and Technology (the Module-Lattice-Based Digital Signature Algorithm, in its ML-DSA-65 parameter set), so a record sealed today survives the cryptography of tomorrow. We anchor the audit root through Pantheon, a sovereign Layer 1 settlement chain we are building, which itself anchors to Bitcoin, so the chain's integrity can be checked against an independent public reference rather than only against our own systems. Pantheon carries a fixed-supply token, PAN, capped at five billion.

I will be precise about status, because precision is the whole point of this essay. The SIOS is built and in production. The Open Audit Record exists, signs before execution, and verifies offline today. Pantheon is the one component still being built. I will be precise about our own models too: we are actively training our specialised sovereign models now, hardening them on a sealed corpus, with funding scaling that work toward fully native weights. The architecture is backed by 104 filed United Kingdom patent applications, 2,340 claims, owned by Mickai LTD with me as the named inventor. I give you the count not to impress you but to be specific: filed, which means lodged and on the public record. I will not dress that up as anything more than it is.

What I would do on Monday

If you run quality for a plant that has put AI in the inspection loop, you do not need to buy anything from me to start fixing this. You need to ask four questions and not accept comfortable answers. First: can I reproduce, for any given part, exactly what the model saw and what configuration governed its decision? If the answer is "the database says so", you have a description, not a witness. Second: could someone have changed that record without it being detectable? If yes, you have a record that proves nothing under pressure. Third: can an outside investigator verify it without trusting my team and without my vendor's cooperation? If no, your defence depends on goodwill you may not have when it matters most.

And the fourth, the one people skip: have I decided, in advance, what a future dispute will need me to prove, and am I sealing exactly that, every time, before the part moves? Provenance is not retroactive. You cannot reconstruct a tamper-evident pre-commitment after the fact, because the whole value was that it existed before the fact. The work has to be done while the line is running, on every part, or it is not done at all. Ask those four questions honestly and you will usually find you can answer two of them well and two of them not at all. The two you cannot answer are the ones that decide the case. Start there, before an incident decides the order for you.

The part will still fail. The account is what you control.

I opened with a bracket that passed and failed, and I will close there, because the lesson sits in that gap. You will never drive your false-negative rate to zero, and you should be suspicious of anyone who promises you will. Defects will escape. Parts will fail. The thing you actually control is not whether the machine is ever wrong. It is whether, when the machine is wrong, you can stand in front of an auditor, an insurer, or a court and produce a record that says, truthfully and verifiably, here is exactly what was decided, here is the configuration that decided it, here is the proof none of it was touched since.

That record is not a feature. In the world that is arriving, it is the difference between an honest account of a hard problem and a story nobody is obliged to believe. We built Mickai so that the account is always there, sealed before the act, checkable by anyone, owned by no one's goodwill. The regulations are tightening, the cryptography is being forced forward by the prospect of quantum computers, and the liability is moving toward whoever deployed the model. None of that is in your control. The defect was always going to happen. The provenance is a choice, and it is one of the few choices on the floor that is still entirely yours to make.

Originally published at https://mickai.co.uk/articles/provenance-of-a-defect-ai-quality-control. If you operate in a regulated sector or want sovereign AI on your own hardware, the audit form on mickai.co.uk is the entry point.