I gave 9 AI models the same coding challenge: write code to generate an animation. Same prompt, same constraints, same evaluation criteria. No cherry-picking, no re-prompts, just raw output.

| # | Model | Provider |
|---|---|---|
| 1 | GPT-5.5 | OpenAI |
| 2 | Claude Haiku | Anthropic |
| 3 | Claude Sonnet | Anthropic |
| 4 | Claude Opus | Anthropic |
| 5 | Gemini 3.1 Pro | Google |
| 6 | Qwen 3.6 27B | Alibaba |
| 7 | Kimi K2.5 | Moonshot |
| 8 | GPT-5.4 / Mini | OpenAI |

Ranked by quality; scroll down for detailed outputs from every model:
Full video with panda commentary: youtube.com/watch?v=CeMXQGfNuXo