Prof. Reut Tsarfaty

Will Hebrew Speakers Be Able to Use Generative AI in Their Native Tongue?

Bio

Reut Tsarfaty is a Professor at Bar-Ilan University leading the Open Natural Language Processing Research Lab (ONLP), and a Visiting Professor at Google. Her research focuses on natural language parsing, broadly interpreted to cover morphological, syntactic, semantic, and pragmatic aspects for typologically different languages. She is a leading figure in research on morphologically rich languages, and is internationally renowned for her groundbreaking work on Hebrew NLP. Reut’s research lab is funded by prestigious grants and awards, including the European Research Council, the Israel Science Foundation, grants from the Ministry of Science and Technology, and the Israel Innovation Authority.

Abstract

Generative AI in general, and specifically generative Large Language Models (LLMs), have changed the game for many tasks and domains in the field of natural language processing. Indeed, the ability to comprehend and generate natural language in a human-like way is instrumental in many language technology applications – from sentiment analysis and information extraction to dialogue systems.

However, much of this success has been demonstrated in English. For a resource-scarce language such as Hebrew, with its profound linguistic structure and morphological intricacies, do LLMs hold the same promise? Will Hebrew speakers be able to use generative AI in their native tongue?

In this talk, I will present three different threads from my research lab, focusing on Hebrew LLMs. The first concerns the LLM architecture and, in particular, inquires about the correlation of LLMs’ performance with properties of the tokenizer and the LLM’s internal lexicon. The second concerns the use of LLMs, particularly how to develop good prompts for Hebrew tasks, in monolingual and multilingual settings. The third concerns the evaluation of LLMs on real downstream tasks, where I will present benchmarks for challenging applications such as summarization and credibility (fake news) detection in Hebrew.

All in all, I will show that on the path to LLM Hebrew fluency, interesting research questions emerge, providing useful answers and further insights into this groundbreaking technology.

Agenda

8:45 Reception
9:30 Opening remarks by WiDS TLV ambassadors Noah Eyal Altman, Or Basson, and Nitzan Gado
9:45 Dr. Aya Soffer, IBM: "Putting Generative AI to Work: What Have We Learned So Far?"
10:15 Prof. Reut Tsarfaty, Bar-Ilan University: "Will Hebrew Speakers Be Able to Use Generative AI in Their Native Tongue?"
10:45 Poster Pitches
10:55 Break
11:10 Lightning talks
12:30 Lunch & poster session
13:30 Roundtable session & poster session
14:15 Roundtable closing
14:30 Break
14:40 Naomi Ken Korem, Lightricks: "Mastering the Art of Generative Models: Training and Controlling Text-to-Video Models"
15:00 Dr. Yael Mathov, Intuit: "Surviving the AI-pocalypse: Your Guide to LLM Security"
15:20 Closing remarks
15:30 The end