Bronze
Back the work
$10 / month
- Your name on our supporters list
- A vote on what we build next
- A thank you in the build log
Open-weight model
An open AI model for cyber security work, like spotting threats and writing up what they mean. It runs fully offline on a small desktop, so sensitive details never leave your network.
SecurityLLM
SecurityLLM is an open AI model tuned for cyber security. Security work means finding weak spots, reading attack reports, and explaining what a threat does and how to stop it. A lot of that text is sensitive, so this model does its thinking fully offline, and nothing leaves your network.
It answers security questions in plain words, sums up threat reports, and walks through how an attack works and how to defend against it. It is built on ZySec-AI’s SecurityLLM, an open model already trained on security material, and packed into ready-to-run files so it starts fast on a single desktop.
We scored five builds on CyberMetric, a 50-question security quiz, on a small Spark desktop. The surprise: the smallest, fastest build (Q4_K_M) scored the best at 40 percent and ran at about 48 tokens a second. The full-size build was both slower and a touch behind. So the cheap build is the one to run. The table above has every build.
These scores come from a short quiz, not a full audit. Treat the model as a fast helper for security questions, and always check its answers against trusted sources before you act.
Download the GGUF files (the ready-to-run format) and run them with llama.cpp on a Spark-class desktop, a small AI machine with 128 GB of memory. Start with the Q4_K_M build: it is the fastest here and scored highest on our test.
huggingface-cli download Orionfold/SecurityLLM-GGUFllama-cli -hf Orionfold/SecurityLLM-GGUF:Q4_K_Mllama-cli -hf Orionfold/SecurityLLM-GGUF:Q4_K_M -p "Explain how a SQL injection attack works and how to stop it."from llama_cpp import Llama
llm = Llama.from_pretrained(
repo_id="Orionfold/SecurityLLM-GGUF",
filename="*Q4_K_M.gguf",
)
out = llm("Explain how a SQL injection attack works and how to stop it.")
print(out["choices"][0]["text"])
| Build | CyberMetric score | Speed on a Spark |
|---|---|---|
| Q4_K_M (best pick) | 40% | 48 tokens a second |
| Q5_K_M | 38% | 40 tokens a second |
| Q6_K | 36% | 35 tokens a second |
| Q8_0 | 36% | 30 tokens a second |
| F16 (full size) | 34% | 17 tokens a second |
Live counts from HuggingFace, refreshed when the site builds. Built and maintained in the open by Orionfold.
Back this work with a monthly tier. Your support moves your requests up the list, and Gold or Platinum earns a badge on the roadmap item you back.
Back the work
$10 / month
Get a say
$25 / month
Move it up the list
$50 / month
Shape the roadmap
$100 / month
Need something specific? Send an enquiry from the roadmap.

Offline patent reasoning in ready-to-run files, built with the NeMo toolkit. Nothing leaves your desktop.

Real notes from doing AI research on one desktop. The NVIDIA DGX Spark is a small machine with huge power (petascale means it runs about a quadrillion math steps a second), so you can push local AI further with no cloud needed. Every lesson is backed by code that runs.