@karpathy: I quite like the idea using ga...
@karpathy
Feb 03, 2025
I quite like the idea of using games to evaluate LLMs against each other, instead of fixed evals. Playing against another intelligent entity self-balances and adapts difficulty, so each eval (/environment) is leveraged a lot more. There are some early attempts around. Exciting area.
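A rough sketch of how head-to-head game results might be turned into a relative ranking, assuming an Elo-style rating (the tweet does not name a scoring scheme; the function names, the K-factor of 32, and the 400-point scale are illustrative assumptions):

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one game.

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    k is an assumed K-factor; larger values react faster to each result.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two hypothetical models start level; model A wins one game.
model_a, model_b = update_elo(1500.0, 1500.0, score_a=1.0)
```

Because the expected score depends on the rating gap, beating a stronger opponent moves the ratings more than beating a weaker one, which is one way the "self-balancing" property could show up in practice.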