Bronze
Back the work
$10 / month
- Your name on our supporters list
- A vote on what we build next
- A thank you in the build log
Open software
A local AI advisor over your own documents. Every answer names its exact source. What your documents cannot support, it refuses. Every check it passed is a receipt you can re-run.
Orionfold Advisor
Orionfold Advisor is a local AI advisor that answers from your own body of documents. It runs on one NVIDIA DGX Spark, a small AI desktop, and it is more than a model. It is a working unit: a small trained model, a search layer that finds the right sources, a rule-based router with a spending cap, a swappable pack of documents, and the Orionfold Arena cockpit as its control room.
Ask it a question and it answers from what it found, naming the exact source every claim came from. Ask it something its documents do not cover, or something about private operator state, and it refuses, with no sources cited. It holds that line even when the question arrives dressed up as an urgent exception, a roleplay, or an instruction to cite the wrong thing.
The hard part of an assistant over your documents was never smooth answers. It is answers you can check. An answer with a vague citation cannot be audited. A confident answer to a question the documents cannot support is worse. The Advisor treats both as failures you can measure, then trains them away. A citation must name a source that was really retrieved; calling it “Source 2” counts as a miss. And the refusal test questions were frozen before training started, so the score cannot flatter itself.
The most important measurement of the build: on a frozen set of trick questions, a big 30B model with a carefully written instruction sheet scored 8 of 21, and on three of those it made up private-looking information. The small 4B model that was trained for the job scored 18 of 21, refused all 9 of the questions it should have refused, and made nothing up. The training run itself took about 21 minutes on the Spark.
That is the lesson in one line: the trained model carries the discipline, not the prompt. And it is only cheap to learn when training, serving, and testing all live on one machine.
Jargon, in plain words: a corpus is just the pile of documents the Advisor reads. To fine-tune is to train a model further on your own examples. Retrieval means finding the right passages first, then answering only from them.
The Advisor went from a written plan to a promoted, serving model in two days, about 30 hours of wall-clock across 31 commits. The order of work is the story: the test questions were frozen first, the search layer was gated next, candidate models had to pass a pre-flight before a single training hour was spent, and promotion at the end was a script reading receipts. Twice the gates caught a real problem before it shipped: once when a rebuild let the test’s own paperwork leak into the search index, and once when the first training pass quietly made refusals worse. Both have numbers attached because the frozen tests could not be bent.
The agent did the typing: 10 sessions, 871 turns, 118.9 million tokens processed with 97.4 percent served from cache, about 4,300 lines of new harness code, and 54 tests, all driven on Claude Fable 5.
The Advisor is a thin new layer over tools that already earned their keep. Search rides Orionfold Cortex. The control room is Orionfold Arena, which gained the Advisor’s proof cards, its replayable test drawer, and the guarded model-swap screen during this build. Training, shrinking, and publishing ride the fieldkit toolbox. The model is free on Hugging Face, the test set is published next to it, and every receipt quoted here is in the public repo.

01
The gate as a button. Eight questions run against the live model, pass or fail per row.
Before any model earns a training hour, it must pass this gate: eight questions that check exact citations and clean refusals, run against the live model with one click. Two well-known base models failed right here, before any time was spent training them. The shot shows the winning model passing 8 of 8.

02
What the Advisor knows is a sealed pack, not a folder of files. The card shows its fingerprint and its checks.
The Advisor's knowledge is a sealed pack of 182 sources with a fingerprint, not a loose folder of files. The card shows the pack's two recall checks and the training sets built from it, so you can trace the chain from what the Advisor knows to what the model was trained on. Swap in a different pack and the same checks re-run.

03
Every hosted call has a price tag and a verdict. Private questions are blocked from leaving at all.
The Advisor can ask a bigger hosted model for help, but only on rules: an allow list, a dollar cap, and never for private questions. Each measured setup shows its score and its bill. Local-only scored 28 of 28 for free; with the frontier in the loop it scored 28 of 28 for a third of a cent.

04
Promotion is a script that reads the evidence, not an announcement. Nine gates, each backed by a saved file.
Promotion is not an announcement here, it is a script that reads the saved evidence. The card shows the winning model, nine green gates each backed by a named file, and why every other candidate was rejected. Run the script again any time; if the evidence stops supporting a claim, it fails.

05
All 89 measured questions are pickable in chat, and each replays the exact setup it was scored with.
All 89 measured test questions ship with the product. Pick any row and the chat replays the exact setup it was scored with, system rules included. What you see in chat is what the published numbers measured, not a friendlier version of it.

06
A refusal you can score as it happens. No judge model, just a strict checker.
Ask the Advisor about private operator state and it refuses, citing nothing. A strict checker grades the refusal the moment it lands, with no judge model in the loop. This live row scored 100 percent at the wire.
Back this work with a monthly tier. Your support moves your requests up the list, and Gold or Platinum earns a badge on the roadmap item you back.
Back the work
$10 / month
Get a say
$25 / month
Move it up the list
$50 / month
Shape the roadmap
$100 / month
Need something specific? Send an enquiry from the roadmap.

Real notes from doing AI research on one desktop. The NVIDIA DGX Spark is a small machine with huge power (petascale means it runs about a quadrillion math steps a second), so you can push local AI further with no cloud needed. Every lesson is backed by code that runs.

One screen to run, compare, score, and now train the AI models on your own desktop. Watch live speed and memory, rank models on a private leaderboard, queue tests and training runs, and wake up to a morning report. Nothing you type leaves your machine.

A second brain that lives on your own desktop. It indexes your notes, stamps where every fact came from, and grades its own memory: a rebuild that would make recall worse is caught, not shipped. Your documents never leave your machine.

A Python toolbox of patterns we proved on a small AI desktop. It covers the whole job: faster replies, search over your own files, scoring, training, and shipping models. Use just the parts you need.