IN THIS LESSON

This is the thirteenth lecture in the Language Models and Intelligent Agentic Systems course, run by Meridian Cambridge in collaboration with the Cambridge Centre for Data Driven Discovery (C2D3).

This lecture covers evaluations for frontier AI models. We discuss how model evaluations inform decision-making both before and after a model is deployed. We examine how dangerous capability evaluations can assess a frontier model's capacity to cause extreme harm. Finally, we cover how evaluations can be used to forecast future developments and progress in AI.


13. Evaluations