I don't know what labs are doing to these poor LLMs during RL, but they are mortally terrified of exceptions, even in infinitesimally likely cases. Exceptions are a normal part of life and of a healthy dev process. Sign my LLM welfare petition for improved rewards in cases of exceptions.
POV: Your LLM agent is dividing a by b
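
A rough sketch of the kind of code this is poking at, not taken from any real agent output. Both function names are made up for illustration. The first swallows every conceivable failure and returns a silent default; the second just divides and lets `ZeroDivisionError` surface.

```python
import math


def divide_defensively(a, b):
    """Hypothetical caricature: every unlikely case is swallowed, nothing is raised."""
    try:
        if a is None or b is None:
            return 0.0
        if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
            return 0.0
        if b == 0 or math.isnan(b) or math.isinf(b):
            return 0.0
        return a / b
    except Exception:
        # Catch-all fallback: the caller never learns that anything went wrong.
        return 0.0


def divide(a, b):
    """Plain version: a zero divisor raises ZeroDivisionError and the caller finds out."""
    return a / b
```

With the first version, `divide_defensively(1, 0)` quietly returns `0.0`, which looks like a valid result. With the second, the same call fails loudly at the point where the bad input appeared.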

