Models 8 · quant · lora · adapter
Benches 3 · eval datasets
Notebooks 5 · runnable on-ramps
Tooling 6 · harnesses · skills
Free tier 22 · run offline · yours
quant advisor free
A governed 4B advisor over your corpus — exact source-id citations, trusted refusals, local on a DGX Spark
base nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16
recommended Q4_K_M · 70 tok/s
quant astro free
A numeric astrodynamics reasoner — one verifiable boxed number out, served local on a DGX Spark for $0 a query
base Qwen/Qwen3-8B
recommended Q8_0 · 21 tok/s
lora patent free
Offline patent-prosecution reasoning on Spark-class hardware
base deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
nemo recommended BF16
quant patent free
Offline patent-prosecution reasoning on Spark-class hardware
base deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
nemo recommended Q5_K_M · 35 tok/s
quant medical free
An 8B medical-reasoning model with a visible think-chain, quantized for offline clinical Q&A
base Intelligent-Internet/II-Medical-8B
recommended Q5_K_M · 36 tok/s
quant cyber free
A 7B cybersecurity chat model, quantized to run offline on a consumer GPU
base ZySec-AI/SecurityLLM
recommended Q4_K_M · 48 tok/s
quant legal free
A 7B legal-domain chat model, quantized to run offline on a consumer GPU
base Equall/Saul-7B-Instruct-v1
recommended Q5_K_M · 20 tok/s
quant finance free
A finance-specialized 7B chat model, quantized to run offline on a 4 GB consumer GPU
base AdaptLLM/finance-chat
recommended F16 · 12 tok/s
bench advisor free
The eval set that caught what prompting couldn't hold — frozen OOD curveballs for grounded citation, refusal, and routing
base n/a
0 variants
bench free
hermes-brain-bench-v0.1
base n/a
0 variants
bench patent free
patent-strategist-bench-v0.1
base n/a
0 variants
notebook finance free
Build the finance-chat quant — and call the model — on a Spark or a free cloud GPU
base AdaptLLM/finance-chat
recommended builder
notebook legal free
Build the Saul-7B quant — and call the legal model — on a Spark or a free cloud GPU
base Equall/Saul-7B-Instruct-v1
recommended builder
notebook cyber free
Build the SecurityLLM quant — and call the model — on a Spark or a free cloud GPU
base ZySec-AI/SecurityLLM
recommended builder
notebook medical free
Build the II-Medical-8B quant — and call the reasoner — on a Spark or a free cloud GPU
base Intelligent-Internet/II-Medical-8B
recommended builder
notebook patent free
Run the patent-strategist build — and use the model — on a Spark or a free cloud GPU
base deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
nemo recommended builder
harness advisor free
A local memory layer that gates its own recall
base fieldkit.memory · pgvector(vectors/blog_chunks) · NIM llama-nemotron-embed-1b-v2
recommended cosine-only · top_k=5 · GB10 measured baseline
arena run astro free
An operator cockpit you run on your own DGX Spark
base fieldkit[arena] · Astro + FastAPI sidecar
0 variants
harness free
When does local stop being enough? Measure first, then route.
base Hermes Agent v0.14.0
recommended Local Spark — Qwen3-30B-A3B MoE Q4_K_M
harness free
Which local lane should drive your always-on Spark agent?
base Hermes Agent v0.14.0
recommended llama.cpp · Qwen3-30B-A3B (MoE, Q4_K_M) · 88 tok/s
skill free
The skills you write for Claude Code load into Hermes unchanged.
base agentskills.io SKILL.md (Hermes / Claude Code compatible)
recommended spark-serve
harness free
One always-on brain, five specialists, zero LLM-classifier overhead.
base Hermes Agent v0.14.0
recommended Default brain (MoE)