Remember reinforcement fine-tuning (RFT)? We’ve been working away at it since last December, and it’s available today with OpenAI o4-mini! RFT uses chain-of-thought reasoning and task-specific grading to improve model performance, which is especially useful in complex domains. Take @AccordanceAI, which used RFT to fine-tune a model that’s state-of-the-art on their tax and accounting tasks.
And in supervised fine-tuning news: you can now fine-tune GPT-4.1 nano. Get even more from our fastest, cheapest model by training it specifically for your use case.