Self-Hosted Enterprise Search AI: Air-Gapped RAG Over Decades of Records
The on-premise alternative to cloud enterprise search, connecting decades of un-redacted corporate data to a local AI with zero data egress
What self-hosted enterprise search AI is
Self-hosted enterprise search AI connects decades of un-redacted corporate records (contracts, correspondence, board papers, research, case files) to a local model running on hardware the organisation owns, so that a single natural-language question can be answered from the entire archive without any of it leaving the building. It is the on-premise alternative to cloud enterprise search: the model is brought to the records rather than the records being shipped to the model, and what happens in the server room stays in the server room.
For a chief information officer, a chief information security officer or a general counsel, that is the whole proposition. An organisation's most valuable asset is often its own accumulated memory, and that memory is exactly what a retrieval-augmented generation (RAG) system must read in full to be useful. The cloud route asks the organisation to expose that un-redacted memory to a third-party processor and, where the search runs offshore, to a cross-border transfer. The Mickai Sovereign Intelligence Operating System (SIOS) removes both the cross-border transfer and the third-party processing path, because retrieval and inference both sit inside the organisation's own perimeter.
The cloud tools it replaces, and why on-premise wins
The enterprise-search market has produced capable cloud products, Glean and Box AI among them, and an honest comparison should acknowledge how well they index a connected estate. The Mickai distinction is not about connectors or relevance. It is about where the corporate memory lives at the moment of retrieval, and who owns the engine reading it.
A cloud search tool, however well secured, indexes the organisation's content on infrastructure the organisation does not control. Self-hosted enterprise search inverts that on the dimensions that decide a regulated procurement.
- **The archive stays put.** Every document is indexed and retrieved in place by a local engine, with zero data egress. There is no transit path to intercept and no offshore copy to account for.
- **The model and the index are owned.** Both the search brain and the Mickai sovereign vector store are snapshots the organisation holds, so a cloud vendor changing its data-use policy, or reworking a hosted service in response to the European Union Artificial Intelligence Act, does not change what the organisation runs.
- **Context is ingested without a throttle.** Owned compute allows unthrottled context ingestion across millions of historical documents, where a cloud service would meter the same work expensively per token.
- **The economics flip to capital.** Indexing and querying decades of records run at near-zero marginal cost on owned hardware rather than as a recurring consumption bill.
“The point of air-gapped RAG is not to protect the pipeline that carries your archive to the cloud. It is to remove the pipeline, so the archive never travels at all.”
The compliance barrier it clears
Enterprise search touches everything an organisation holds, which is precisely why the most regulated firms have hesitated. A sovereign deployment clears the barriers at the level of architecture.
Data protection under the UK GDPR and the EU GDPR
A full corporate archive is dense with personal data, and often with special-category data. Indexing it through an external model adds a third-party processor and, where retrieval runs offshore, a cross-border transfer. Running the system on-premise means data residency holds and the records never leave the building. The organisation keeps its own controller obligations on a fully contained footprint.
Privilege, confidentiality and fiduciary duty
For a law firm, a bank or a professional-services firm, the archive contains privileged, confidential and fiduciary material. Containment is what preserves it: the records are read locally, and every material answer is wrapped in an Open Audit Record, a signed, inspectable account of which documents the model drew on and what it concluded.
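The idea of a signed, inspectable record can be illustrated with a minimal sketch. The format of the Open Audit Record itself is not described here, so everything below is an assumption for illustration: the field names, the `seal_audit_record` and `verify_audit_record` helpers, and the use of an HMAC with a locally held key in place of whatever signature scheme the product actually uses.

```python
import hashlib
import hmac
import json

# Hypothetical key for illustration; a real deployment would hold it in local key storage.
SIGNING_KEY = b"example-local-signing-key"


def _canonical(body):
    """Serialise the record body deterministically so the signature is reproducible."""
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def seal_audit_record(question, answer, sources, key=SIGNING_KEY):
    """Build a record of the sources behind an answer and sign it with an HMAC."""
    body = {
        "question": question,
        "answer": answer,
        # A digest per source lets a reviewer check the record against the archive.
        "sources": [
            {"doc_id": doc_id, "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest()}
            for doc_id, text in sources
        ],
    }
    signature = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return {"body": body, "signature": signature}


def verify_audit_record(record, key=SIGNING_KEY):
    """Return True only if the body still matches the signature it was sealed with."""
    expected = hmac.new(key, _canonical(record["body"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])
```

Any change to the answer or the source list after sealing makes verification fail, which is the property a reviewer relies on when inspecting the record.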
Sector-specific regimes
Where the organisation also sits under financial secrecy rules, the Network and Information Systems regime, or export controls, keeping the corpus and the inference inside the perimeter removes the external exposure those regimes are most concerned with. The organisation still holds its own obligations, but the structural transfer risk is gone.
The Mickai studio that delivers it: Pinakes
Within the Mickai SIOS, knowledge management and enterprise search are delivered by Pinakes, named for the ancient catalogue of the Library of Alexandria, often regarded as the first systematic index of recorded knowledge. Pinakes is a horizontal capability the organisation composes into a vertical pack: full-archive indexing, natural-language search across un-redacted records, citation-grounded answers, and knowledge-base generation, paired with the organisation's own domain knowledge base and a compliance crosswalk.
Pinakes works alongside the other Mickai studios. It pairs with Astraea, the contract-review studio, for legal-ops retrieval; with Nomos, the compliance studio, for regulator-facing evidence; and with Documents for ingestion of mixed-format archives. The retrieval layer is the Mickai sovereign vector store, which holds the embeddings locally and has no external route. The whole institutional memory is searchable, and none of it is exposed.
What makes Mickai different
Many providers will offer a private index. The Mickai difference is that the guarantees are engineered into the system rather than promised in a contract.
- **The Open Audit Record.** Every consequential answer is sealed into a signed, inspectable record showing the sources the model used, which gives an auditor, a regulator or a risk committee evidence they can examine.
- **A defensible patent moat.** The architecture rests on 101 filed United Kingdom patent applications owned by Mickai LTD, covering the sovereign substrate, its audit machinery and its identity model. The barrier is intentional.
- **Hardware-bound identity.** The instance's identity is bound to the silicon it runs on, so the index cannot be quietly cloned or relocated off the organisation's estate.
- **Built and owned, not rented.** The organisation owns the model, the vector store and the compute. Search runs independent of cloud outages because the organisation owns the machine, and the index is insulated from a vendor rewriting the terms.
Mickai's own sovereign brains do the reasoning, and the Mickai sovereign vector store does the retrieval. There is no dependency on an external public model or a third-party store, and the corporate archive is never harvested to train someone else's.
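How the hardware-bound identity listed above is implemented is not described here. As a rough sketch of the general idea only, one way to tie an index to a specific machine is to compute a keyed tag over the index digest using a secret that never leaves that machine. The `bind_index` and `may_open_index` names, and the plain byte string standing in for a hardware-held secret, are hypothetical; in practice such a secret would normally live in a hardware security component rather than in application code.

```python
import hashlib
import hmac


def bind_index(hardware_secret: bytes, index_digest: str) -> str:
    """Compute a tag tying this index digest to the machine holding the secret."""
    return hmac.new(hardware_secret, index_digest.encode("ascii"), hashlib.sha256).hexdigest()


def may_open_index(hardware_secret: bytes, index_digest: str, stored_tag: str) -> bool:
    """Allow the index to load only if the tag was produced on this machine."""
    return hmac.compare_digest(bind_index(hardware_secret, index_digest), stored_tag)
```

An index copied to another machine would carry the original tag, which the second machine's secret cannot reproduce, so the load check fails there.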
How a sovereign deployment actually runs
The pattern is undramatic by design. The organisation provisions local compute inside its own data centre, sized to the scale of its archive and its query load. Its record systems are connected to the Mickai sovereign vector store in place, where the content is embedded and indexed without a copy leaving the perimeter. Pinakes answers questions locally, grounding each response in retrieved sources and sealing it into the Open Audit Record. Nothing in that loop needs an internet path to the archive, so a cloud outage does not interrupt search, and the attack surface is reduced to the organisation's own perimeter.
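The loop described above (embed and index locally, retrieve, answer with citations) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the Mickai implementation: the `LocalVectorStore` class, the `embed` and `answer_with_citations` functions and the sample documents are all hypothetical, a bag-of-words count stands in for a locally hosted embedding model, and the "answer" simply returns the retrieved passages where a real system would pass them to a local model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: term counts. A stand-in for a locally hosted embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class LocalVectorStore:
    """In-memory index; nothing here opens a network connection."""

    def __init__(self):
        self._entries = []  # (doc_id, text, vector)

    def add(self, doc_id, text):
        self._entries.append((doc_id, text, embed(text)))

    def search(self, query, k=2):
        q = embed(query)
        scored = [(cosine(q, vec), doc_id, text) for doc_id, text, vec in self._entries]
        scored.sort(key=lambda item: item[0], reverse=True)
        # Keep the top k, dropping documents with no overlap at all.
        return [(doc_id, text) for score, doc_id, text in scored[:k] if score > 0]


def answer_with_citations(store, question, k=2):
    """Retrieve locally, then return the passages with the ids they came from."""
    hits = store.search(question, k)
    # Placeholder for local inference: a real system would pass `hits` to a local model.
    answer = " ".join(text for _, text in hits) if hits else "No supporting record found."
    return {"answer": answer, "citations": [doc_id for doc_id, _ in hits]}


store = LocalVectorStore()
store.add("contract-2009", "Supplier contract renewal terms agreed in 2009")
store.add("board-2014", "Board paper on pension scheme funding")
result = answer_with_citations(store, "What were the supplier contract renewal terms?")
print(result["citations"])  # ['contract-2009']
```

The design point the sketch carries over is that every step takes place in one process on one machine: the index, the query and the cited passages never cross a network boundary.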
The honest boundary: this delivers zero data egress for the archive and removes the cross-border transfer and third-party processing path, and it reduces the external attack surface. It does not remove the organisation's own data-protection and confidentiality obligations, and insider and physical access remain the organisation's to control. The promise is data residency, ownership of model and index, and an institutional memory that stays in-house.
Request a private demonstration
If you are a chief information officer, chief information security officer, chief financial officer, chief operating officer or general counsel deciding how to put decades of un-redacted records to work with artificial intelligence without any of it leaving the building, the right next step is to watch the system answer real questions from data that never leaves the room.
Mickai was built by Micky Irons, founder, chief executive and named inventor, on a single principle: bring the intelligence to the archive and keep both inside the institution. Request a private demonstration, and we will show you self-hosted enterprise search AI indexing your records, answering with citations and sealing an Open Audit Record entirely behind your own firewall.






