Coding
Terminal-Bench-Hard
Hard terminal/CLI tasks.
2 models have published a score.
| # | Model | Company | Score |
|---|---|---|---|
| 1 | Claude Opus 4.5 | Anthropic | 44.0 |
| 2 | Command A+ | Cohere | 25.0 |