Home
MICKAI

Mickai Subsystem

Mickai Lama

Mickai Lama is the subsystem of the Mickai SIOS that serves local models from 1.1B to 32.5B parameters, on x86-64 and ARM64. Ten rebranded SKUs, an OpenAI-compatible API on port 11438, dynamic model loading, in-app benchmarking, RAG-ready context windows. Mickai is downloadable at mickai.co.uk/download and runs on Windows, Linux, or macOS.

View capabilities
SovereignLocalQuantisedAuditable

The Mickai SIOS

Mickai is a Sovereign Intelligence Operating System (SIOS). It runs entirely on your own hardware, on Windows, Linux, or macOS. No cloud, no telemetry. This page describes one subsystem of the Mickai SIOS. Download Mickai at mickai.co.uk/download.

A subsystem of the Mickai SIOS. Local model serving from 1.1B to 32.5B parameters, on x86-64 and ARM64. Weights stay on the operator's machine.

Read the patentsVerify a Mickai audit chain

Local model serving, your hardware, your weights.

What Lama serves

Seven primitives that turn a workstation into a model server. Ten rebranded SKUs, OpenAI-compatible API on port 11438, dynamic loading, in-app benchmarking, RAG-ready context windows up to 128k tokens.

01 / SKUs

Ten rebranded model SKUs

Mickai-tiny (1.1B), mickai-small (3B), mickai-base (7B), mickai-medium (14B), mickai-large (32.5B), plus specialist variants for code, reasoning, embedding, and routing. Every SKU ships with a signed manifest and a deterministic inference seed.

02 / API

OpenAI-compatible API on port 11438

/v1/chat/completions, /v1/completions, /v1/embeddings, /v1/models. Drop-in for any client that already speaks OpenAI. The shim translates aliases (gpt-4 to mickai-large, gpt-3.5-turbo to mickai-base) so existing code switches over with one base URL change.

03 / Loading

Dynamic model loading

Models load on first request and stay resident under an LRU policy. The runtime advertises memory headroom so a workstation with 64 GB can hold mickai-large alongside two specialist brains, while a 16 GB laptop swaps between SKUs without manual intervention.

04 / Benchmarking

In-app benchmarking

Run a benchmark suite against the local installation. Tokens per second, time-to-first-token, prompt-cache hit rate, KV-cache memory, all reported and signed into the audit chain. No third-party benchmark site, no opaque scoring.

05 / Context

RAG-ready context windows

Up to 128k tokens on the larger SKUs, with KV-cache compression and grouped-query attention for memory-bound hardware. Hippocampus retrievals stream straight in without copying, so a multi-document RAG query finishes inside one inference call.

06 / Architectures

x86-64 and ARM64

Builds for Windows, Linux, macOS, on Intel, AMD, Apple Silicon, and ARM64 server hardware. AVX-512, AVX-2, NEON, and Apple Metal back-ends. Quantised weights (Q4_K_M, Q5_K_M, Q8_0) included for low-memory machines.

07 / Sovereignty

Weights on your hardware

Model weights live on the operator's machine. No cloud calls, no model-update telemetry, no per-token billing. Plug a different model in if you wish; Lama treats foreign GGUF weights as first-class.

Patent anchors

Lama sits on three of the 31 filed UK patent applications behind the Mickai SIOS. Patent 02 anchors multi-brain routing, patent 04 the multi-tenant brain isolation, patent 05 the privacy-preserving RAG primitive.

GB2607309.8 to GB2610422.4 · 31 filed UK patent applications · 914 claims

Wired with

  • Ten Mickai SKUs (1.1B to 32.5B parameters)
  • OpenAI-compatible API on port 11438
  • Dynamic model loading with LRU residency
  • In-app benchmark suite, results signed into the chain
  • Up to 128k context with KV-cache compression
  • AVX-512, AVX-2, NEON, Apple Metal back-ends
  • Q4_K_M, Q5_K_M, Q8_0 quantised weights
  • 100 percent on-device, weights stay on the operator's machine
Read

Serve sovereign models on your hardware.

Lama serves the ten Mickai SKUs locally on Windows, Linux, or macOS. Read the multi-brain patent, or download Mickai and run mickai-large against the OpenAI-compatible API on port 11438.

Read patent 02

Engineered by Micky Irons in Cumbria, United Kingdom · @mickyirons