GPU Util % utilisation
GPU Temp °C die
Unified GB of 128 · 8 GB guard
Throughput tok / second
TTFT ms · first token
throughput & first-token from the active lane
Active Lane idle no warm brain

← Models

What it's for
  • Local numeric Q&A over astrodynamics and quantitative astrophysics — periods, transfers, orbital elements — with boxed, verifiable answers
  • A worked example of gate-driven method selection (base preflight → SFT gate → RL-headroom gate) for greenfield verticals
  • A measured four-variant quant ladder (Q4_K_M→Q8_0) for quality-vs-throughput tradeoffs on unified-memory hardware

Audience — Spark operators who want a $0 local astro reasoner with bench receipts, and builders studying how to decide SFT vs RLVR before spending the run.

Quant economics quality × speed per build
Variant tok/s astro-bench v0.1 held-out (n=44, \boxed ±2%)
Q4_K_M 33.1 0.75
Q5_K_M 28.1 0.75
Q6_K 24.6 0.84
Q8_0 sweet spot 21.1 0.89