Evaluating LLMs Through a Federated, Scenario-Writing Approach

Article by Bogdana “Bobi” Rakova: “What do screenwriters, AI builders, researchers, and survivors of gender-based violence have in common? I’d argue they all imagine new, safe, compassionate, and empowering approaches to building understanding.

In partnership with Kwanele South Africa, I lead an interdisciplinary team, exploring this commonality in the context of evaluating large language models (LLMs) — more specifically, chatbots that provide legal and social assistance in a critical context. The outcomes of our engagement are a series of evaluation objectives and scenarios that contribute to an evaluation protocol with the core tenet that when we design for the most vulnerable, we create better futures for everyone. In what follows I describe our process. I hope this methodological approach and our early findings will inspire other evaluation efforts to meaningfully center the margins in building more positive futures that work for everyone…(More)”

Share

How to contribute:

Did you come across – or create – a compelling project/report/book/app at the leading edge of innovation in governance?

Share it with us at info@thelivinglib.org so that we can add it to the Collection!

About the author

Stefaan Verhulst

Get the latest news right in you inbox

Subscribe to curated findings and actionable knowledge from The Living Library, delivered to your inbox every Friday

Evaluating LLMs Through a Federated, Scenario-Writing Approach

Share

How to contribute:

About the author

Stefaan Verhulst

Get the latest news right in you inbox

Related articles

Making Civic Trust Less Abstract: A Framework for Measuring Trust Within Cities

The AI Policy Playbook

Europe’s dream to wean off US tech gets reality check