Ornith 1.0: Self-Improving Open-Source Models for Agentic Coding

Ornith 1.0 is a family of MIT-licensed LLMs from DeepReinforce that jointly learn to solve coding tasks and build their own scaffolds. From 9B edge-deployable to 397B MoE rivaling Claude Opus 4.7 — run locally with vLLM, Ollama, or LM Studio.

Jun 25, 2026

Release Date

9B → 397B

Model Sizes

MIT

License

82.4%

SWE-Bench

What Is Ornith 1.0?

Ornith 1.0 is a family of open-source large language models built by DeepReinforce AI specifically for agentic coding. Released on June 25, 2026, Ornith 1.0 spans four parameter sizes — 9B Dense, 31B Dense, 35B MoE, and 397B MoE — all under MIT license with no regional restrictions. The name Ornith comes from the ancient Greek word for bird, and like a bird building its own nest, Ornith 1.0 learns to construct its own scaffolding before solving coding tasks.

The core innovation behind Ornith 1.0 is self-scaffolding reinforcement learning. Traditional coding agents rely on human-designed harnesses — fixed workflows for tool calls, error recovery, and task decomposition. Ornith 1.0 treats the scaffold as a learnable object that co-evolves with the model's policy during RL training. This means Ornith 1.0 generates its own task plans, launches tools, inspects intermediate results, and rewrites failing steps without human intervention.

At flagship scale, Ornith 1.0-397B achieves 77.5 on Terminal-Bench 2.1 and 82.4 on SWE-Bench Verified, surpassing Claude Opus 4.7 and outperforming every other open-source model of comparable size. The smaller Ornith 1.0-35B MoE scores 64.2 on Terminal-Bench 2.1 — beating Qwen 3.5-397B (53.5) with a fraction of the parameters. All Ornith 1.0 variants are available on Hugging Face with FP8, GGUF, and bf16 weights.

How Ornith 1.0 Works

Ornith 1.0 introduces a self-improving training framework that sets it apart from standard coding models:

1

Self-Improving RL Training

Unlike standard fine-tuning, Ornith 1.0 jointly optimizes both the solution code and the scaffold (task plan, tool calls, error recovery) during reinforcement learning.

2

Scaffold Co-Evolution

The model learns to construct its own orchestration framework — which tools to call, when to retry, how to decompose complex tasks — rather than relying on human-designed harnesses.

3

Anti-Reward Hacking

Three layers of safeguards — fixed trust boundary, deterministic monitor, and frozen LLM judge — prevent the model from gaming benchmark scores during RL training.

4

Reasoning + Tool Calls

Every response starts with a thinking block before the final answer. The models emit well-formed tool calls for agent loops, compatible with any OpenAI-format agent framework.

Ornith 1.0 Use Cases

Ornith 1.0 excels in scenarios where agentic coding agents need to autonomously plan, execute, and recover from errors. Here are the top Ornith 1.0 applications:

Multi-File Refactoring

Ornith 1.0 generates its own task plans, launches tools, inspects intermediate results, and rewrites failing steps — ideal for repository-scale refactoring across dozens of files.

Bug Localization & Fixes

The self-scaffolding approach lets Ornith 1.0 systematically search a codebase, narrow down root causes, and produce test-driven patches with minimal human intervention.

Terminal-Based Agents

Ornith 1.0 is optimized for terminal-native coding agents. Works directly with Claude Code, OpenHands, OpenClaw, and Hermes Agent out of the box.

Edge / Offline Coding

The 9B and 35B models run on consumer hardware — a gaming GPU or MacBook Pro — giving you a fully private, offline AI coding assistant with no API costs.

Ornith 1.0 Benchmark Highlights

Ornith 1.0-397B is the top open-source agentic coding model on major benchmarks. See the full benchmark page for all model sizes.

Benchmark Ornith 397B Qwen 3.5 Opus 4.7
Terminal-Bench 2.1 77.5 53.5 70.3
SWE-Bench Verified 82.4 76.4 80.8
SWE-Bench Pro 62.2 51.6 64.3
SWE-Bench Multilingual 78.9 69.3

Ornith 1.0 Model Family

Choose the right Ornith 1.0 model for your hardware. Every Ornith 1.0 variant uses the same self-scaffolding RL training. See the full Ornith 1.0 model comparison for detailed specs.

9B Dense

9B

~19 GB (bf16), ~6 GB (Q4)

Best for: Edge deployment, single-GPU setups, fast triage

31B Dense

31B

~62 GB (bf16), ~20 GB (Q4)

Best for: Balanced quality and speed

35B MoE (~3B active/token)

35B

~25 GB (Q5_K_M)

Best for: Best value — faster than 9B, more accurate than 31B

Recommended

397B MoE

397B

~400 GB (bf16), ~200 GB (FP8)

Best for: Maximum accuracy, production agent pipelines

Ornith 1.0 FAQ

Frequently Asked Questions

What is Ornith 1.0?
Ornith 1.0 is a family of open-source large language models built by DeepReinforce AI specifically for agentic coding. Ornith 1.0 comes in four sizes — 9B Dense, 31B Dense, 35B MoE, and 397B MoE — all released under MIT license. The key innovation of Ornith 1.0 is self-scaffolding: the model jointly learns to solve coding tasks and construct the orchestration framework that guides those solutions.
Who made Ornith 1.0?
Ornith 1.0 was created by DeepReinforce AI and released on June 25, 2026. The name comes from the ancient Greek word for bird. All Ornith 1.0 models are available on Hugging Face under the deepreinforce-ai organization with MIT licensing and no regional restrictions.
What base models is Ornith 1.0 built on?
Ornith 1.0 models are built on two base architectures. The 9B Dense, 35B MoE, and 397B MoE variants are post-trained on Qwen 3.5. The 31B Dense variant is post-trained on Gemma 4. All variants undergo the same self-improving reinforcement learning process that jointly optimizes scaffolds and solutions.
Which Ornith 1.0 model should I choose?
For most users, Ornith 1.0-35B MoE is the sweet spot — it is actually faster than the 9B model due to MoE architecture (only ~3B parameters active per token) while being significantly more accurate. If you only have 6-8GB VRAM, Ornith 1.0-9B Q4 is a realistic entry point. The 397B model is for production agent pipelines where maximum accuracy matters.
How much VRAM do I need for Ornith 1.0?
Ornith 1.0-9B needs about 6 GB in Q4 quantization or 19 GB in bf16. Ornith 1.0-35B MoE needs about 25 GB in Q5_K_M. Ornith 1.0-397B requires approximately 200 GB in FP8 or 400 GB in bf16, typically served across 8x 80GB GPUs. The 35B MoE is the best option for consumer GPUs with 24GB+ VRAM.

Ready to Run Ornith 1.0?

Get started with Ornith 1.0 locally using vLLM, Ollama, or LM Studio — Ornith 1.0 needs no API keys, is completely free, and keeps your code private.