Extracting structured information from free-text medical documents may seem like a straightforward task for large language models. In practice, we found that end-to-end LLM solutions struggle in domains that require strict alignment with predefined medical vocabularies, often producing confident but invalid outputs.
In this talk, we share how we redesigned such a system by shifting from a generative approach to a constrained, multi-step pipeline. Rather than asking an LLM to produce clinical codes directly, we decomposed the problem into interpretable stages: structured finding extraction, retrieval of relevant concept candidates, LLM-based ranking under tight constraints, and deterministic post-processing. This approach significantly reduced hallucinations, improved robustness across different types of imaging reports, and made the system easier to evaluate and debug.
Beyond healthcare, this case study highlights broader lessons for data scientists building LLM-powered systems in high-stakes or closed-vocabulary domains, where reliability matters more than raw generative ability.
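For readers who want a concrete picture of the four stages named in the abstract, here is a minimal Python sketch. It is not the speaker's implementation. Every name in it is hypothetical: the `VOCAB` table uses placeholder IDs rather than real clinical codes, sentence splitting stands in for structured finding extraction, token overlap stands in for retrieval, and `stub_ranker` stands in for the LLM call. It also ignores negation, which a real system would have to handle. The point it illustrates is the constraint: the ranker may only choose among retrieved candidates, and a deterministic check rejects anything else.

```python
import re
from typing import Callable, Optional

# Hypothetical closed vocabulary: placeholder IDs, not real clinical codes.
VOCAB = {
    "C001": "pulmonary nodule",
    "C002": "pleural effusion",
    "C003": "rib fracture",
}


def tokens(text: str) -> set[str]:
    """Lowercase alphabetic tokens of a string."""
    return set(re.findall(r"[a-z]+", text.lower()))


def extract_findings(report: str) -> list[str]:
    """Stage 1 (stand-in): treat each non-empty sentence as one finding."""
    return [s.strip() for s in re.split(r"[.\n]+", report) if s.strip()]


def retrieve_candidates(finding: str, k: int = 2) -> list[str]:
    """Stage 2 (stand-in): keep the top-k vocabulary entries by token overlap."""
    found = tokens(finding)
    scored = [(len(found & tokens(name)), cid) for cid, name in VOCAB.items()]
    scored = [(score, cid) for score, cid in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [cid for _, cid in scored[:k]]


def stub_ranker(finding: str, candidates: list[str]) -> str:
    """Stage 3 (stand-in for an LLM call): return a raw string answer."""
    return candidates[0] if candidates else "NONE"


def postprocess(raw: str, candidates: list[str]) -> Optional[str]:
    """Stage 4: accept the answer only if it is one of the offered candidates."""
    answer = raw.strip().upper()
    return answer if answer in candidates else None


def code_report(
    report: str,
    ranker: Callable[[str, list[str]], str] = stub_ranker,
) -> list[tuple[str, Optional[str]]]:
    """Run all four stages and return (finding, concept ID or None) pairs."""
    results = []
    for finding in extract_findings(report):
        candidates = retrieve_candidates(finding)
        raw = ranker(finding, candidates) if candidates else ""
        results.append((finding, postprocess(raw, candidates)))
    return results


REPORT = (
    "Small pulmonary nodule in the right upper lobe. "
    "Left pleural effusion. Heart size is normal."
)
coded = code_report(REPORT)
# A ranker that invents an ID outside the candidate list is rejected in stage 4.
hallucinated = code_report(REPORT, ranker=lambda finding, cands: "C999")
```

Because the final check is deterministic, an invalid answer from the ranking step becomes an explicit `None` rather than a confident but invalid code, and each stage can be tested separately.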
Sharon Fogel is a Senior Data Scientist at Navina, where she applies machine learning and natural language processing to improve clinical decision-making from unstructured medical data. She brings extensive experience from the healthcare and tech industries: she previously worked at Quris AI, developing machine learning models for predicting drug safety, and at AWS, on large-scale document analysis using NLP and computer vision. Sharon holds an MSc in Electrical Engineering from Tel Aviv University and a BSc in Physics and Mathematics from the Hebrew University of Jerusalem through the Talpiot Program. She enjoys building research-driven ML solutions that make a real-world impact.
Keynote session: Hadas Grossmon Ella
Break
Lightning talks session
Roundtable closing
Talk by Hila Paz
Talk by Dr. Moran Mizrahi
Closing remarks
End
Reception
Opening remarks by WiDS TLV ambassadors
Dr. Mor Geva, Tel Aviv University: “MRI for Large Language Models: Mechanistic Interpretability from Neurons to Attention Heads”
Panel: “Pioneering Progress: a strategic look at the GenAI revolution and the new role of data scientists”. Panelists: Shani Gershtein, Melingo | Mirit Elyada Bar, Intuit | Dr. Asi Messica, Lightricks. Moderated by Nitzan Gado, Intuit.
Poster pitches
Break
Lightning talks session
Lunch & poster session
Roundtable session & poster session
Roundtable closing
Shunit Agmon, Technion: “Bridging the Gender Gap in Clinical AI: Temporal Adaptation with TeDi-BERT”
Shaked Naor Hoffmann, Apartment List: “Building Generative AI Agents for Production: Turning Ideas into Real-World Applications”
Closing remarks
The end