Open-weight model

Finance Chat

Name: Finance Chat
Brand: Orionfold
Availability: InStock

An open AI model for finance and money questions in plain chat. It runs fully offline on a small desktop, so account details and deal terms never leave your machine.

Sponsor this work Open on HuggingFace

Field: Finance
Runs: Fully offline
Built on: AdaptLLM finance-chat
License: Free to use

Finance Chat

Sponsor

Finance Chat is an open AI model for money and finance questions in plain chat. It runs fully offline on a small desktop, so account numbers, deal terms, and other private figures never leave your machine.

What it can do

It talks through finance ideas in simple words: what a margin is, how a balance sheet fits together, or what a term in a deal means. It is built on AdaptLLM’s finance-chat, an open model trained on finance text, and packed into ready-to-run files for a single desktop.

How well it works

We scored five builds on FinanceBench, a strict 50-question test that only counts an exact number as right. The scores are low, 14 to 18 percent, and this is the honest weak spot: the model is good at explaining finance in words, but not at pulling exact figures out of a filing. So use it to learn and to draft, and check every number yourself. One nice find from our testing: the Q8_0 build matches the full-size model almost exactly while taking far less space.

How to run it

Download the GGUF files (the ready-to-run format) and run them with llama.cpp on a Spark-class desktop, a small AI machine with 128 GB of memory. The Q4_K_M build is the fastest, at about 31 tokens a second, and a good place to start.

Install

huggingface-cli download Orionfold/finance-chat-GGUF

llama-cli -hf Orionfold/finance-chat-GGUF:Q4_K_M

Use it

llama-cli -hf Orionfold/finance-chat-GGUF:Q4_K_M -p "Explain the difference between gross margin and net margin."

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Orionfold/finance-chat-GGUF",
    filename="*Q4_K_M.gguf",
)
out = llm("Explain the difference between gross margin and net margin.")
print(out["choices"][0]["text"])

Specs

Base model: AdaptLLM/finance-chat
Format: GGUF (ready to run)
Builds: Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Best build: Q4_K_M (about 31 tokens a second on a Spark desktop)
License: Free to use

Benchmarks

Build	FinanceBench score	Speed on a Spark
Q4_K_M (fastest)	14%	31 tokens a second
Q5_K_M	16%	27 tokens a second
Q6_K	16%	24 tokens a second
Q8_0	18%	9 tokens a second
F16 (full size)	18%	12 tokens a second

Used in the open

Live counts from HuggingFace, refreshed when the site builds. Built and maintained in the open by Orionfold.

179
Downloads · last 30 days

Get the Proof playbook

Think a small local model can beat the frontier ones?

We proved it. Rerun it yourself, do not take our word for it.

By subscribing you agree to receive the AI For Everyone digest, one email a week, no more. You can unsubscribe any time. See our privacy policy.

Run your agents from one place

Put your trusted AI to work.

Orionfold Relay is the free and open engine that runs your AI agents and workflows from one board. Own a license and add the premium packs that save you the setup.

$349 founding, first 25 then $499 one time

or see the full details

Orionfold Relay poster: one operator running a whole AI agency from one board.

Keep exploring

Model

Patent Strategist

Offline patent reasoning in ready-to-run files, built with the NeMo toolkit. Nothing leaves your desktop.

Book

AI Research on NVIDIA DGX Spark

Real notes from doing AI research on one desktop. The NVIDIA DGX Spark is a small machine with huge power (petascale means it runs about a quadrillion math steps a second), so you can push local AI further with no cloud needed. Every lesson is backed by code that runs.

Resources

WeightsGGUF files (ready to run)

Finance Chat

What it can do

How well it works

How to run it

Install

Use it

Specs

Benchmarks

Used in the open

Sponsor Finance Chat

Bronze

Silver

Gold

Platinum

Think a small local model can beat the frontier ones?

Put your trusted AI to work.

Keep exploring

Patent Strategist

AI Research on NVIDIA DGX Spark

Resources

Further reading