Osnat Haj Yahia

Intuit
The Reward is All You Need (to Get Wrong): A Practitioner's Framework for Stress-Testing Reward Functions in RL Fine-Tuning

Abstract

Reinforcement fine-tuning can teach language models to reason — but the hardest part isn’t the model or the optimizer. It’s the reward function. A reward function looks like engineering: it compiles, it runs, it returns a number. But it behaves like a contract with an adversarial optimizer that will find every loophole you left open.
Through a real-world case study — teaching LLMs to match financial transactions across record-keeping systems — we iterated through three rounds of reward redesign across both OpenAI’s proprietary RFT API and open-weight GRPO training. Each iteration uncovered a new failure mode hiding behind the one we just fixed: the Abstention Trap (asymmetric penalties that make silence optimal), the Partial Credit Cliff (elegant math with catastrophic gradient signal), the Stratum Blind Spot (aggregate reward rising while subcategories collapse), the Entropy Death Spiral (mode collapse invisible behind an API), and the Reasoning Ceiling (performance walls that no reward engineering can break without chain-of-thought).
We distill these failures into a Reward Stress Testing framework: five concrete diagnostic checks practitioners can run before and during any RL fine-tuning job. Your reward function has bugs. This talk shows you how to find them before the optimizer does.
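To make the Abstention Trap concrete, here is a minimal sketch of the arithmetic behind it. The reward values (+1 for a correct match, -3 for a wrong one, 0 for abstaining) and the function names are hypothetical illustrations, not the values or code used in the case study. Under those assumptions, answering only beats staying silent once the model is right more than 75% of the time, so a weaker policy is pushed toward never answering.

```python
def expected_reward(p_correct, r_correct=1.0, r_wrong=-3.0):
    """Expected reward for answering when the answer is right with probability p_correct.

    The default reward values are hypothetical, chosen only to show the asymmetry.
    """
    return p_correct * r_correct + (1.0 - p_correct) * r_wrong


def break_even_accuracy(r_correct=1.0, r_wrong=-3.0, r_abstain=0.0):
    """Accuracy at which answering and abstaining earn the same expected reward.

    Solves p * r_correct + (1 - p) * r_wrong = r_abstain for p.
    """
    return (r_abstain - r_wrong) / (r_correct - r_wrong)


if __name__ == "__main__":
    threshold = break_even_accuracy()
    print(f"Answering beats abstaining only above p = {threshold:.2f}")
    for p in (0.5, 0.6, 0.75, 0.9):
        print(f"p = {p:.2f}: expected reward for answering = {expected_reward(p):+.2f}")
```

Running a check like this before training shows whether the penalty scheme rewards silence at the accuracy the base model starts from.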

Bio

Osnat Haj-Yahia is a Staff AI Scientist with 7 years in data science and over 15 years in the tech industry. She holds a BS in Computer Science and pursued graduate studies in Neurobiology, a detour that deepened her thinking about learning systems before she returned full-time to the AI field. Her current work focuses on reinforcement learning fine-tuning, building and developing LLMs and AI agents to help businesses around the world prosper.

Agenda

08:45  Reception & gathering
09:30  Opening remarks by WiDS TLV ambassadors
09:45  Keynote session: Prof. Michal Rosen Zvi
10:15  Keynote session: Hadas Grossmon Ella
10:45  Poster pitches
10:55  Break
11:10  Lightning talks session
12:45  Lunch & poster session
13:30  Roundtable session & poster session
14:20  Roundtable closing
14:30  Talk by Hila Paz
14:50  Talk by Dr. Moran Mizrahi
15:15  Closing remarks
15:30  End