
I Tested 9 AI Models in a Real Scenario (Surprising Results)

Ep01 · 27:46 · Published Apr 26, 2026 · 57 views

The Test

I gave 9 AI models the same coding challenge: write code to generate an animation. Same prompt, same constraints, same evaluation criteria. No cherry-picking, no re-prompts: just raw output.

Models Tested

#  Model            Provider
1  GPT-5.5          OpenAI
2  Claude Haiku     Anthropic
3  Claude Sonnet    Anthropic
4  Claude Opus      Anthropic
5  Gemini 3.1 Pro   Google
6  Qwen 3.6 27B     Alibaba
7  Kimi K2.5        Moonshot
8  GPT-5.4 / Mini   OpenAI

Full Results

All prompts, model outputs, and my notes are available on the companion benchmark page:

📊 kuro-llm-benchmark.pages.dev

What I Learned

Watch on YouTube

Full video with panda commentary: youtube.com/watch?v=CeMXQGfNuXo