Ornith 1.0 Model Comparison: Choose the Right Size

Ornith 1.0 ships in four model sizes, from the edge-deployable Ornith 1.0-9B to the production-grade Ornith 1.0-397B. Each Ornith 1.0 variant uses self-scaffolding RL to deliver best-in-class agentic coding for its parameter count.

All Ornith 1.0 Models at a Glance

Ornith 1.0 comes in four sizes. Every Ornith 1.0 model is MIT-licensed with GGUF, FP8, and bf16 weights on Hugging Face.

| Model | Parameters | Architecture | Base | VRAM | Best For |
|---|---|---|---|---|---|
| Ornith-1.0-9B | 9B Dense | Dense (all params active) | Qwen 3.5 | ~19 GB (bf16), ~6 GB (Q4) | Edge deployment, single-GPU setups, fast triage |
| Ornith-1.0-31B | 31B Dense | Dense (all params active) | Gemma 4 | ~62 GB (bf16), ~20 GB (Q4) | Balanced quality and speed |
| Ornith-1.0-35B (★ Best Value) | 35B MoE (~3B active/token) | Mixture-of-Experts | Qwen 3.5 MoE | ~25 GB (Q5_K_M) | Best value: faster than 9B, more accurate than 31B |
| Ornith-1.0-397B | 397B MoE | Mixture-of-Experts | Qwen 3.5 397B | ~400 GB (bf16), ~200 GB (FP8) | Maximum accuracy, production agent pipelines |
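
As a rough cross-check on VRAM figures like these, weight memory can be estimated as parameter count times bits per weight, divided by eight. The sketch below is only an approximation: the bits-per-weight values for the GGUF quantizations are assumed averages chosen for illustration, not published Ornith 1.0 numbers, and the helper name `weight_memory_gb` is hypothetical.

```python
# Rough weight-only memory estimate: parameters x bits per weight / 8.
# Bits-per-weight values for the quantized formats are ASSUMED averages,
# used here for illustration only.
BITS_PER_WEIGHT = {
    "bf16": 16.0,
    "fp8": 8.0,
    "q5_k_m": 5.5,  # assumed average for a 5-bit GGUF quantization
    "q4_k_m": 4.5,  # assumed average for a 4-bit GGUF quantization
}


def weight_memory_gb(params_billions: float, fmt: str) -> float:
    """Return the approximate size of the weights in GB (10^9 bytes)."""
    bits = BITS_PER_WEIGHT[fmt]
    return params_billions * bits / 8.0


if __name__ == "__main__":
    examples = [
        ("Ornith-1.0-9B", 9, "bf16"),
        ("Ornith-1.0-9B", 9, "q4_k_m"),
        ("Ornith-1.0-31B", 31, "bf16"),
        ("Ornith-1.0-35B", 35, "q5_k_m"),
    ]
    for name, params, fmt in examples:
        print(f"{name} {fmt}: ~{weight_memory_gb(params, fmt):.1f} GB")
```

This estimate counts weights only. Real usage also includes the KV cache, activations, and runtime overhead, so actual requirements are typically somewhat higher than the raw weight size.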

Ornith 1.0 Model Details

A closer look at each Ornith 1.0 model: architecture, VRAM, GPU requirements, and recommended use cases for every variant.

Ornith-1.0-9B

9B Dense · Dense (all params active) · Qwen 3.5

VRAM Required: ~19 GB (bf16), ~6 GB (Q4)
GPU Setup: Single consumer GPU
Context Length: 262K tokens
Best For: Edge deployment, single-GPU setups, fast triage

Ornith-1.0-31B

31B Dense · Dense (all params active) · Gemma 4

VRAM Required: ~62 GB (bf16), ~20 GB (Q4)
GPU Setup: Single 80 GB GPU
Context Length: 262K tokens
Best For: Balanced quality and speed

Ornith-1.0-35B

35B MoE (~3B active/token) · Mixture-of-Experts · Qwen 3.5 MoE

Recommended for most users

VRAM Required: ~25 GB (Q5_K_M)
GPU Setup: Single GPU with 24 GB+
Context Length: 262K tokens
Best For: Best value (faster than 9B, more accurate than 31B)

Ornith-1.0-397B

397B MoE · Mixture-of-Experts · Qwen 3.5 397B

VRAM Required: ~400 GB (bf16), ~200 GB (FP8)
GPU Setup: 8x 80 GB GPUs
Context Length: 262K tokens
Best For: Maximum accuracy, production agent pipelines

Ornith 1.0 Model Selection Guide

Not sure which Ornith 1.0 model to pick? Here is the Ornith 1.0 recommendation for every hardware tier:

The Sweet Spot: Ornith 1.0-35B MoE

Ornith 1.0-35B is a rare case where the larger Ornith 1.0 model is both faster and more accurate than a smaller one. Because of its MoE architecture, only ~3B parameters activate per token, which makes Ornith 1.0-35B faster than the dense Ornith 1.0-9B while still drawing on 35B total parameters. If you have 24 GB+ of VRAM, this is the Ornith 1.0 model to choose.
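
The speed argument can be illustrated with a back-of-the-envelope calculation. The sketch below uses the common rule of thumb of roughly 2 FLOPs per active parameter per generated token for a forward pass; that rule and the function name `forward_flops_per_token` are assumptions for illustration, not figures from the Ornith 1.0 release.

```python
def forward_flops_per_token(active_params_billions: float) -> float:
    """Approximate forward-pass FLOPs per token (about 2 per active parameter)."""
    return 2.0 * active_params_billions * 1e9


# Dense 9B: all 9B parameters are active for every token.
dense_9b = forward_flops_per_token(9.0)

# 35B MoE: only about 3B parameters are active for each token.
moe_35b = forward_flops_per_token(3.0)

ratio = dense_9b / moe_35b

if __name__ == "__main__":
    print(f"Dense 9B: {dense_9b:.2e} FLOPs/token")
    print(f"35B MoE:  {moe_35b:.2e} FLOPs/token")
    print(f"Dense 9B needs about {ratio:.0f}x the compute per token")
```

This covers compute only. All 35B parameters still have to be held in memory, and real throughput also depends on memory bandwidth, expert routing overhead, and the inference runtime.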

Budget Hardware: Ornith 1.0-9B

At Q4_K_M quantization, Ornith 1.0-9B fits in about 6 GB, which makes this Ornith 1.0 variant viable on gaming laptops and entry-level GPUs. Despite its compact size, Ornith 1.0-9B matches Gemma 4-31B on Terminal-Bench 2.1 and scores 69.4 on SWE-Bench Verified.

Maximum Quality: Ornith 1.0-397B

For teams running production agent pipelines where accuracy is critical, Ornith 1.0-397B delivers the highest Ornith 1.0 scores across all benchmarks. This Ornith 1.0 variant requires 8x 80GB GPU infrastructure but outperforms Claude Opus 4.7 on every agentic coding benchmark.
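
The three tiers above can be expressed as a small lookup. This is a minimal sketch that encodes only the thresholds stated in this guide (about 6 GB, 24 GB+, and 8x 80 GB); the function name `recommend_model` and the choice to treat VRAM as one pooled number are simplifying assumptions.

```python
from typing import Optional


def recommend_model(total_vram_gb: float) -> Optional[str]:
    """Map available VRAM (GB) to the guide's recommended Ornith 1.0 model.

    Thresholds mirror the tiers in this guide; pooling VRAM across GPUs
    into a single number is a simplification for illustration.
    """
    if total_vram_gb >= 8 * 80:  # 8x 80 GB GPUs
        return "Ornith-1.0-397B"
    if total_vram_gb >= 24:      # single GPU with 24 GB+
        return "Ornith-1.0-35B (Q5_K_M)"
    if total_vram_gb >= 6:       # budget hardware
        return "Ornith-1.0-9B (Q4_K_M)"
    return None                  # below the smallest tier in the guide


if __name__ == "__main__":
    for vram in (4, 8, 24, 80, 640):
        print(f"{vram} GB -> {recommend_model(vram)}")
```

The thresholds are approximate: a model that only just fits leaves little room for long contexts, so some headroom above each tier is advisable.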

Ornith 1.0 Model FAQ

Which Ornith 1.0 model should I use for local development?
Ornith 1.0-35B MoE is the best choice for most developers. It runs on a single GPU with 24 GB+ of VRAM, is faster than the dense 9B model thanks to its MoE architecture, and delivers significantly better accuracy. If you only have 6-8 GB of VRAM, start with Ornith 1.0-9B Q4.

Why is Ornith 1.0-35B faster than the 9B model?
Ornith 1.0-35B uses a Mixture-of-Experts (MoE) architecture in which only about 3B of its 35B total parameters are active per token. Each inference step therefore performs less computation than the dense 9B model, which yields faster generation while retaining access to more learned knowledge.

Can I run Ornith 1.0-397B on consumer hardware?
Not practically. Ornith 1.0-397B requires approximately 200 GB of VRAM in FP8 or 400 GB in bf16, which typically means 8x 80 GB GPUs. On consumer hardware, Ornith 1.0-35B MoE at Q5_K_M quantization (~25 GB) is the best-performing option.

What quantization should I use for Ornith 1.0?
For Ornith 1.0-35B, use Q5_K_M for the best quality-to-size ratio (~25 GB). For Ornith 1.0-9B, Q4_K_M (~6 GB) is practical when VRAM is limited. All GGUF quantizations are published on Hugging Face by DeepReinforce.

Start Running Ornith 1.0

Download any Ornith 1.0 model from Hugging Face and deploy Ornith 1.0 locally in minutes. Our Ornith 1.0 setup guide covers vLLM, Ollama, and LM Studio.