Knowledge
SimpleQA
Short-answer factuality benchmark.
Three models have published a score.
| # | Model | Company | Score |
|---|---|---|---|
| 1 | Gemini 3 Pro | Google DeepMind | 72.1 |
| 2 | GPT-5.2 | OpenAI | 58.0 |
| 3 | Mistral Large 3 | Mistral AI | 23.8 |