The opaque nature of large language models (LLMs) is one of the biggest challenges preventing organizations from moving promising AI concepts into production.
Traditional machine learning evaluation techniques simply fall short with LLMs, and new LLM evaluation frameworks seem to pop up every week. What’s the right approach for your RAG and fine-tuning use cases? In this webinar, our AI experts discuss how to evaluate LLM effectiveness and risks.
Attendees will learn:
- Tips for defining clear objectives for what the LLM should achieve.
- Which performance metrics to track for accuracy, relevance, response time, toxicity, and user satisfaction (see the sketch after this list).
- How to identify and categorize errors through error analysis.
- Benchmarking considerations.
- How and when to consider qualitative user feedback.
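To make the metrics bullet concrete, here is a minimal Python sketch of an evaluation harness that scores exact-match accuracy and response latency over a small test set. The `generate` function and `EVAL_SET` are hypothetical stand-ins, not part of any specific framework; toxicity and user satisfaction typically require dedicated classifiers or human ratings and are omitted here.

```python
import time

# Hypothetical evaluation set: (prompt, reference answer) pairs.
EVAL_SET = [
    ("What is the capital of France?", "Paris"),
    ("What is 2 + 2?", "4"),
]

def generate(prompt: str) -> str:
    """Stand-in for your LLM call; replace with your model or API client."""
    return "Paris" if "France" in prompt else "4"

def evaluate(eval_set):
    correct = 0
    latencies = []
    for prompt, reference in eval_set:
        start = time.perf_counter()
        answer = generate(prompt)
        latencies.append(time.perf_counter() - start)
        # Exact match is a crude proxy for accuracy; real setups often use
        # semantic similarity or LLM-as-judge scoring instead.
        correct += int(answer.strip().lower() == reference.strip().lower())
    return {
        "accuracy": correct / len(eval_set),
        "avg_latency_s": sum(latencies) / len(latencies),
    }

if __name__ == "__main__":
    print(evaluate(EVAL_SET))
```

Even a toy harness like this illustrates the webinar's core point: pick the metrics first, then automate them so every model or prompt change is measured the same way.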