Paper by Angelina McMillan-Major, Emily M. Bender, and Batya Friedman: “Responsible computing ultimately requires that technical communities develop and adopt tools, processes, and practices that mitigate harms and support human flourishing. Prior efforts toward the responsible development and use of datasets, machine learning models, and other technical systems have led to the creation of documentation toolkits to facilitate transparency, diagnosis, and inclusion. This work takes the next step: to catalyze community uptake, alongside toolkit improvement. Specifically, starting from one such proposed toolkit specialized for language datasets, data statements for natural language processing, we explore how to improve the toolkit in three senses: (1) the content of the toolkit itself, (2) engagement with professional practice, and (3) moving from a conceptual proposal to a tested schema that the intended community of use may readily adopt. To achieve these goals, we first conducted a workshop with natural language processing practitioners to identify gaps and limitations of the toolkit as well as to develop best practices for writing data statements, yielding an interim improved toolkit. Then we conducted an analytic comparison between the interim toolkit and another documentation toolkit, datasheets for datasets. Based on these two integrated processes, we present our revised Version 2 schema and best practices in a guide for writing data statements. Our findings more generally provide integrated processes for co-evolving both technology and practice to address ethical concerns within situated technical communities…(More)”
How to contribute:
Did you come across – or create – a compelling project/report/book/app at the leading edge of innovation in governance?
Share it with us at info@thelivinglib.org so that we can add it to the Collection!
About the author
Get the latest news right in you inbox
Subscribe to curated findings and actionable knowledge from The Living Library, delivered to your inbox every Friday
Related articles
INSTITUTIONAL INNOVATION
Why PeaceTech must be the next frontier of innovation and investment
Posted in June 18, 2025 by Stefaan Verhulst
artificial intelligence
Sharing trustworthy AI models with privacy-enhancing technologies
Posted in June 17, 2025 by Stefaan Verhulst
INSTITUTIONAL INNOVATION
2025 State of the Digital Decade
Posted in June 17, 2025 by Stefaan Verhulst