This collection gathers foundational and recent work on **data leverage**: the strategic use of data withholding, contribution, or manipulation as a form of collective action.
## Core Concepts
- **Data Strikes**: Coordinated refusal to contribute data to platforms
- **Data Poisoning**: Intentionally corrupting training data to degrade model performance
- **Conscious Data Contribution**: Strategically directing data to preferred systems
- **Data Valuation**: Methods for quantifying individual data contributions (Shapley values, etc.)
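
To make the data valuation idea concrete, here is a minimal sketch of exact Shapley values over a tiny group of data contributors. The contributor names, the `contributions` mapping, and the `coverage` utility (number of distinct labels covered) are hypothetical stand-ins chosen for illustration. Practical methods such as Data Shapley use model performance as the utility and approximate the sum, because exact enumeration grows exponentially with the number of contributors.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley values by enumerating every coalition (small inputs only)."""
    n = len(players)
    values = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            # Weight for coalitions of size k that do not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal gain from adding p to coalition s.
                values[p] += weight * (utility(s | {p}) - utility(s))
    return values


# Hypothetical toy data: each contributor supplies examples for some labels.
contributions = {
    "alice": {"cat", "dog"},
    "bob": {"dog"},
    "carol": {"bird"},
}


def coverage(coalition):
    """Assumed utility: number of distinct labels the coalition covers."""
    covered = set()
    for name in coalition:
        covered |= contributions[name]
    return len(covered)


values = shapley_values(list(contributions), coverage)
for name, value in values.items():
    print(f"{name}: {value:.2f}")
```

In this toy setting, alice receives the largest share because she is the only source of `cat`, while bob's single label is also supplied by alice. The values sum to the utility of the full group, which is the efficiency property that makes Shapley values attractive for apportioning credit among data contributors.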
## Why This Matters
As AI systems become more dependent on user-generated content and behavioral data, data creators gain potential leverage over technology companies. This research explores when and how such leverage can be effectively exercised.
-
Algorithmic Collective Action with Two Collectives
Aditya Karan, Nicholas Vincent, Karrie Karahalios, Hari Sundaram
ACM FAccT (2025)
-
The Economics of AI Training Data: A Research Agenda
Hamidah Oderinwale, Anna Kazlauskas
arXiv preprint (2025)
-
Collective Bargaining in the Information Economy Can Address AI-Driven Power Concentration
Nicholas Vincent, Matthew Prewitt, Hanlin Li
NeurIPS Position Papers (2025)
-
Push and Pull: A Framework for Measuring Attentional Agency on Digital Platforms
Zachary Wojtowicz, Shrey Jain, Nicholas Vincent
ACM FAccT (2025)
-
Poisoning Web-Scale Training Datasets is Practical
Nicholas Carlini, Matthew Jagielski, Christopher A. Choquette-Choo, Daniel Paleka, Will Pearce, Hyrum Anderson, Andreas Terzis, Kurt Thomas, Florian Tramèr
IEEE Symposium on Security and Privacy (S&P) (2024)
-
Large language models reduce public knowledge sharing on online Q&A platforms
R. Maria del Rio-Chanona, Nadzeya Laurentsyeva, Johannes Wachs
PNAS Nexus (2024)
-
Algorithmic Collective Action in Machine Learning
Moritz Hardt, Eric Mazumdar, Celestine Mendler-Dünner, Tijana Zrnic
International Conference on Machine Learning (ICML) (2023)
-
The Dimensions of Data Labor: A Road Map for Researchers, Activists, and Policymakers to Empower Data Producers
Hanlin Li, Nicholas Vincent, Stevie Chancellor, Brent Hecht
ACM FAccT (2023)
-
Behavioral Use Licensing for Responsible AI
Danish Contractor, Daniel McDuff, Julia Katherine Haines, Jenny Lee, Christopher Hines, Brent Hecht, Nicholas Vincent, Hanlin Li
ACM FAccT (2022)
-
Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses
Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, Tom Goldstein
IEEE Transactions on Pattern Analysis and Machine Intelligence (2022)
-
Addressing Documentation Debt in Machine Learning Research: A Retrospective Datasheet for BookCorpus
Jack Bandy, Nicholas Vincent
NeurIPS Datasets and Benchmarks (2021)
-
Machine Unlearning
Lucas Bourtoule, Varun Chandrasekaran, Christopher A. Choquette-Choo, Hengrui Jia, Adelin Travers, Baiwu Zhang, David Lie, Nicolas Papernot
IEEE Symposium on Security and Privacy (S&P) (2021)
-
Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom B. Brown, Dawn Song, Úlfar Erlingsson, Alina Oprea, Nicolas Papernot
Proceedings of USENIX Security Symposium (2021)
-
Can "Conscious Data Contribution" Help Users to Exert "Data Leverage" Against Technology Companies?
Nicholas Vincent, Brent Hecht
Proceedings of the ACM on Human-Computer Interaction (2021)
-
Data Leverage: A Framework for Empowering the Public in its Relationship with Technology Companies
Nicholas Vincent, Hanlin Li, Nicole Tilly, Stevie Chancellor, Brent Hecht
ACM FAccT (2021)
-
Data Shapley: Equitable Valuation of Data for Machine Learning
Amirata Ghorbani, James Zou
International Conference on Machine Learning (ICML) (2019)
-
BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain
Tianyu Gu, Brendan Dolan-Gavitt, Siddharth Garg
IEEE Access (2019)
-
How Do People Change Their Technology Use in Protest?: Understanding Protest Users
Hanlin Li, Nicholas Vincent, Janice Tsai, Jofish Kaye, Brent Hecht
ACM CSCW (2019)
-
"Data Strikes": Evaluating the Effectiveness of a New Form of Collective Action Against Technology Companies
Nicholas Vincent, Brent Hecht, Shilad Sen
The World Wide Web Conference (WWW) (2019)
-
Examining Wikipedia With a Broader Lens: Quantifying the Value of Wikipedia's Relationships with Other Large-Scale Online Communities
Nicholas Vincent, Isaac Johnson, Brent Hecht
ACM CHI (2018)