Open-weight model

II-Medical 8B

Name: II-Medical 8B
Brand: Orionfold
Availability: InStock

An open AI model for medical questions and clinical text. It runs fully offline on a small desktop, so patient details never leave the clinic.

Sponsor this work Open on HuggingFace

Field: Medicine
Runs: Fully offline
Built on: II-Medical 8B
License: Apache-2.0, free

II-Medical 8B

Sponsor

II-Medical 8B is an open AI model for medical questions and clinical text. It runs fully offline on a small desktop, so patient details never leave the clinic.

What it can do

It answers health and medical questions, explains conditions and terms in plain words, and works through clinical text step by step. It is built on Intelligent-Internet’s II-Medical-8B, which learned to reason its way to an answer rather than just guess, and it is packed into ready-to-run files for a single desktop.

How well it works

We scored five builds on MedMCQA, a 50-question medical exam test, on a small Spark desktop. The Q5_K_M build scored the best at 52 percent, above the full-size build, while running at about 36 tokens a second. The table above shows every build.

This is a short test, and the model is a study and drafting helper, not a doctor. It can be wrong, so never use it to make a real medical decision. Always check with a qualified clinician.

How to run it

Download the GGUF files (the ready-to-run format) and run them with llama.cpp on a Spark-class desktop, a small AI machine with 128 GB of memory. The Q5_K_M build is the sweet spot here: the best score and a healthy 36 tokens a second.

Install

huggingface-cli download Orionfold/II-Medical-8B-GGUF

llama-cli -hf Orionfold/II-Medical-8B-GGUF:Q5_K_M

Use it

llama-cli -hf Orionfold/II-Medical-8B-GGUF:Q5_K_M -p "Explain the difference between Type 1 and Type 2 diabetes."

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Orionfold/II-Medical-8B-GGUF",
    filename="*Q5_K_M.gguf",
)
out = llm("Explain the difference between Type 1 and Type 2 diabetes.")
print(out["choices"][0]["text"])

Specs

Base model: Intelligent-Internet/II-Medical-8B
Format: GGUF (ready to run)
Builds: Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Best build: Q5_K_M (about 36 tokens a second on a Spark desktop)
License: Apache-2.0 (free to use)

Benchmarks

Build	MedMCQA score	Speed on a Spark
Q4_K_M	42%	44 tokens a second
Q5_K_M (best pick)	52%	36 tokens a second
Q6_K	46%	33 tokens a second
Q8_0	48%	28 tokens a second
F16 (full size)	48%	16 tokens a second

Used in the open

Live counts from HuggingFace, refreshed when the site builds. Built and maintained in the open by Orionfold.

129
Downloads · last 30 days

Get the Proof playbook

Think a small local model can beat the frontier ones?

We proved it. Rerun it yourself, do not take our word for it.

By subscribing you agree to receive the AI For Everyone digest, one email a week, no more. You can unsubscribe any time. See our privacy policy.

See which AI wins, on your own desk

Run, compare, and score on your own Spark.

Orionfold Arena is the cockpit that runs, compares, scores, and trains local AI models on one DGX Spark. The model, the tests, and the results in one place you control.

$349 founding, first 25 then $499 one time

or see the full details

Orionfold Arena poster: the eval cockpit you run on your own DGX Spark.

Keep exploring

Model

Patent Strategist

Offline patent reasoning in ready-to-run files, built with the NeMo toolkit. Nothing leaves your desktop.

Book

AI Research on NVIDIA DGX Spark

Real notes from doing AI research on one desktop. The NVIDIA DGX Spark is a small machine with huge power (petascale means it runs about a quadrillion math steps a second), so you can push local AI further with no cloud needed. Every lesson is backed by code that runs.

Resources

WeightsGGUF files (ready to run)

II-Medical 8B

What it can do

How well it works

How to run it

Install

Use it

Specs

Benchmarks

Used in the open

Sponsor II-Medical 8B

Bronze

Silver

Gold

Platinum

Think a small local model can beat the frontier ones?

Run, compare, and score on your own Spark.

Keep exploring

Patent Strategist

AI Research on NVIDIA DGX Spark

Resources

Further reading