Ornith 1.0 Model Comparison: Choose the Right Size
Ornith 1.0 ships in four sizes, from the edge-deployable Ornith 1.0-9B to the production-grade Ornith 1.0-397B. Each variant uses self-scaffolding RL to deliver best-in-class agentic coding for its parameter count.
All Ornith 1.0 Models at a Glance
Every Ornith 1.0 model is MIT-licensed, with GGUF, FP8, and bf16 weights on Hugging Face.
| Model | Parameters | Architecture | Base | VRAM | Best For |
|---|---|---|---|---|---|
| Ornith-1.0-9B | 9B Dense | Dense (all params active) | Qwen 3.5 | ~19 GB (bf16), ~6 GB (Q4) | Edge deployment, single-GPU setups, fast triage |
| Ornith-1.0-31B | 31B Dense | Dense (all params active) | Gemma 4 | ~62 GB (bf16), ~20 GB (Q4) | Balanced quality and speed |
| Ornith-1.0-35B | 35B MoE (~3B active/token) | Mixture-of-Experts | Qwen 3.5 MoE | ~25 GB (Q5_K_M) | Best value: faster than 9B, more accurate than 31B |
| Ornith-1.0-397B | 397B MoE | Mixture-of-Experts | Qwen 3.5 397B | ~800 GB (bf16), ~400 GB (FP8) | Maximum accuracy, production agent pipelines |
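The VRAM column can be sanity-checked with a back-of-envelope estimate: weight memory is roughly the parameter count times the bits stored per weight, divided by eight. The sketch below is a minimal version of that arithmetic, not an official sizing tool. The function name is illustrative, the bits-per-weight values for Q4_K_M and Q5_K_M are assumed approximate averages for those GGUF formats, and the result covers weights only, so KV cache and runtime overhead come on top.

```python
# Approximate bits stored per weight for each format.
# bf16 and FP8 are exact; the K-quant values are rough averages (assumption).
BITS_PER_WEIGHT = {
    "bf16": 16.0,
    "fp8": 8.0,
    "q5_k_m": 5.7,   # assumed average bits per weight
    "q4_k_m": 4.85,  # assumed average bits per weight
}


def estimate_weight_gb(params_billion: float, fmt: str) -> float:
    """Return approximate weight memory in GB (1 GB = 1e9 bytes)."""
    bits = BITS_PER_WEIGHT[fmt]
    # (params_billion * 1e9 params) * (bits / 8 bytes) / 1e9 bytes per GB
    return params_billion * bits / 8.0


if __name__ == "__main__":
    for name, params, fmt in [
        ("Ornith-1.0-9B", 9, "bf16"),
        ("Ornith-1.0-9B", 9, "q4_k_m"),
        ("Ornith-1.0-31B", 31, "bf16"),
        ("Ornith-1.0-35B", 35, "q5_k_m"),
    ]:
        print(f"{name} {fmt}: ~{estimate_weight_gb(params, fmt):.1f} GB")
```

The estimates land at or slightly below the table's figures (18 GB against ~19 GB for the 9B at bf16, for instance), which would be consistent with the table allowing for some runtime overhead.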
Ornith 1.0 Model Details
Architecture, VRAM, GPU requirements, and recommended use cases for each Ornith 1.0 variant:
Ornith-1.0-9B
9B Dense · Dense (all params active) · Qwen 3.5
VRAM Required: ~19 GB (bf16), ~6 GB (Q4)
GPU Setup: Single consumer GPU
Context Length: 262K tokens
Best For: Edge deployment, single-GPU setups, fast triage
Ornith-1.0-31B
31B Dense · Dense (all params active) · Gemma 4
VRAM Required: ~62 GB (bf16), ~20 GB (Q4)
GPU Setup: Single 80GB GPU
Context Length: 262K tokens
Best For: Balanced quality and speed
Ornith-1.0-35B
35B MoE (~3B active/token) · Mixture-of-Experts · Qwen 3.5 MoE
VRAM Required: ~25 GB (Q5_K_M)
GPU Setup: Single GPU with 24GB+
Context Length: 262K tokens
Best For: Best value (faster than 9B, more accurate than 31B)
Ornith-1.0-397B
397B MoE · Mixture-of-Experts · Qwen 3.5 397B
VRAM Required: ~800 GB (bf16), ~400 GB (FP8)
GPU Setup: 8x 80GB GPUs
Context Length: 262K tokens
Best For: Maximum accuracy, production agent pipelines
Ornith 1.0 Model Selection Guide
Not sure which Ornith 1.0 model to pick? Here are the recommendations by hardware tier:
The Sweet Spot: Ornith 1.0-35B MoE
Ornith 1.0-35B is a rare case where the larger model is both faster and more accurate than the smaller one. Because it is a Mixture-of-Experts model, only ~3B parameters activate per token, so each token costs less compute than in the dense Ornith 1.0-9B while the model still draws on 35B total parameters. All 35B parameters have to fit in memory, so VRAM needs follow the total size, not the active size. If you have 24GB+ VRAM, this is the Ornith 1.0 model to choose.
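To make the speed argument concrete, a common rule of thumb puts forward-pass compute at roughly 2 FLOPs per active parameter per token. The sketch below applies that rule to the parameter counts quoted above. The class and function names are illustrative, and real throughput also depends on memory bandwidth, routing overhead, and the inference engine, so treat the ratio as an estimate and not a benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    total_params_b: float   # parameters that must be stored, in billions
    active_params_b: float  # parameters used per token, in billions


def flops_per_token(spec: ModelSpec) -> float:
    """Rule-of-thumb forward-pass cost: ~2 FLOPs per active parameter."""
    return 2.0 * spec.active_params_b * 1e9


# Parameter counts as quoted on this page.
dense_9b = ModelSpec("Ornith-1.0-9B", total_params_b=9, active_params_b=9)
moe_35b = ModelSpec("Ornith-1.0-35B", total_params_b=35, active_params_b=3)

# How much more compute the dense 9B spends per token than the 35B MoE.
compute_ratio = flops_per_token(dense_9b) / flops_per_token(moe_35b)
# How many more parameters the 35B MoE has to keep in memory.
storage_ratio = moe_35b.total_params_b / dense_9b.total_params_b

if __name__ == "__main__":
    print(f"9B needs ~{compute_ratio:.1f}x the compute per token of the 35B MoE")
    print(f"35B MoE stores ~{storage_ratio:.1f}x as many parameters")
```

Under this rule the dense 9B spends about three times the compute per token, while the MoE model stores nearly four times as many parameters: the trade is less compute for more memory.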
Budget Hardware: Ornith 1.0-9B
At Q4_K_M quantization, Ornith 1.0-9B fits in about 6 GB, which makes it viable on gaming laptops and entry-level GPUs. Despite its compact size, it matches Gemma 4-31B on Terminal-Bench 2.1 and scores 69.4 on SWE-Bench Verified.
Maximum Quality: Ornith 1.0-397B
For teams running production agent pipelines where accuracy is critical, Ornith 1.0-397B delivers the highest scores in the Ornith 1.0 family across all benchmarks. It requires 8x 80GB GPU infrastructure but outperforms Claude Opus 4.7 on every agentic coding benchmark.
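The three tiers above can be written down as a small lookup. This is a minimal sketch of the guide's thresholds, with a hypothetical function name and signature; it does not account for context length, batch size, or CPU offloading, all of which change real memory needs.

```python
from typing import Optional


def recommend_model(vram_gb_per_gpu: float, num_gpus: int = 1) -> Optional[str]:
    """Map a hardware tier to the model this guide suggests for it."""
    # Maximum quality tier: the guide calls for 8x 80GB GPUs.
    if num_gpus >= 8 and vram_gb_per_gpu >= 80:
        return "Ornith-1.0-397B"
    # Sweet spot: the guide recommends the 35B MoE for 24GB+ VRAM.
    # The table lists ~25 GB at Q5_K_M, so a 24 GB card is a tight fit.
    if vram_gb_per_gpu >= 24:
        return "Ornith-1.0-35B"
    # Budget tier: the 9B fits in about 6 GB at Q4_K_M.
    if vram_gb_per_gpu >= 6:
        return "Ornith-1.0-9B"
    # Below ~6 GB this guide gives no recommendation.
    return None


if __name__ == "__main__":
    print(recommend_model(8))       # gaming-laptop class GPU
    print(recommend_model(24))      # single 24 GB GPU
    print(recommend_model(80, 8))   # 8x 80GB server
```

Ornith-1.0-31B never comes back from this function because the guide's three tiers do not name it; on a single 80GB GPU it remains an option for anyone who specifically wants a dense model.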
Ornith 1.0 Model FAQ
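Which Ornith 1.0 model should I use for local development?
With 24GB+ of VRAM, use Ornith 1.0-35B. On smaller GPUs, use Ornith 1.0-9B at Q4_K_M, which fits in about 6 GB.
Why is Ornith 1.0-35B faster than the 9B model?
It is a Mixture-of-Experts model: only ~3B of its 35B parameters are active per token, while the dense 9B model runs all 9B parameters for every token.
Can I run Ornith 1.0-397B on consumer hardware?
No. Ornith 1.0-397B calls for 8x 80GB GPUs.
What quantization should I use for Ornith 1.0?
The VRAM figures on this page assume Q4 for the 9B and 31B models (Q4_K_M for the 9B), Q5_K_M for the 35B, and FP8 or bf16 for the 397B.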
Start Running Ornith 1.0
Download any Ornith 1.0 model from Hugging Face and deploy it locally in minutes. The Ornith 1.0 setup guide covers vLLM, Ollama, and LM Studio.