Mistral Large 2512

Weighted composite

3.63

Recommendation

Practitioner-grade

Cohort

large open weight

Scorecard

Per-criterion scores are not available for this model. It appears in the cross-part overview but is not part of the canonical hand-graded set (Models 1–10), so only the headline composite is shown.

Per-part composites

Part	Opus 4.7 (inline)	DeepSeek V4 Pro (judge)
Part A	3.63	2.40
Part B	3.30	1.75
Part C	3.45	4.45
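
The gap between the two columns can be read straight off the table. A minimal sketch in Python, with the values copied from the rows above; the dictionary layout and variable names are illustrative and are not taken from the repo:

```python
# Per-part composites copied from the table above:
# (Opus 4.7 inline, DeepSeek V4 Pro judge).
per_part = {
    "Part A": (3.63, 2.40),
    "Part B": (3.30, 1.75),
    "Part C": (3.45, 4.45),
}

# Signed gap: positive means the inline score is higher than the judge score.
gaps = {
    part: round(inline - judge, 2)
    for part, (inline, judge) in per_part.items()
}

for part, gap in gaps.items():
    print(f"{part}: {gap:+.2f}")
```

On these numbers the two columns differ by at least a full point on every part, and the sign flips on Part C.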

Notes from the evaluation

| **Mistral Large 2512 r1** | **12/12** | 13/15 | 8/10 | 6/8 | 3/5 | **3.63** |
| **Mistral Large 2512 r2** | **12/12** | 13/15 | 8/10 | 6/8 | 3/5 | **3.65** |
| **Mistral Large 2512 r3** | **12/12** | 13/15 | 8/10 | 6/8 | 3/5 | **3.68** |
| **Qwen 3.5 397B r1** | 11/12 | 12/15 | 7/10 | 5/8 | 4/5 | **3.50** |
| **Qwen 3.5 397B r2** | 11/12 | 13/15 | 8/10 | 5/8 | 4/5 | **3.55** |
| **Qwen 3.5 397B r3**
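
The three Mistral Large 2512 runs in the excerpt above differ only in the final composite column. A small Python sketch of the run-to-run mean and spread, assuming the last bolded value in each row is that run's composite; the variable names are illustrative:

```python
from statistics import mean

# Composite per run for Mistral Large 2512, copied from the excerpt above.
# Assumption: the last bolded value in each row is the run's composite.
composites = {"r1": 3.63, "r2": 3.65, "r3": 3.68}

values = list(composites.values())
avg = round(mean(values), 2)                 # average composite across runs
spread = round(max(values) - min(values), 2) # highest run minus lowest run

print(f"mean={avg:.2f} spread={spread:.2f}")
```

The headline composite of 3.63 shown at the top of the page matches the r1 row.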

Source files in the repo

Cross-judge composites: analysis/results_overview.md

Full report · PDF

Get the full report

All 111 questions, the complete rubric, per-model verdicts, and the methodology paper. Delivered as PDF within 60 seconds. CC-BY 4.0.
