Total Error Sheets for Datasets (TES-D) -- A Critical Guide to Documenting Online Platform Datasets
Abstract: This paper proposes a template for documenting datasets collected from online platforms for research purposes. The template is intended to help researchers reflect critically on data quality and to increase transparency in research fields that use online platform data. The paper describes our motivation and outlines the procedure for developing a specific documentation template, which we refer to as TES-D (Total Error Sheets for Datasets). The current version of the template, guiding questions, and a manual are attached as supplementary material. The TES-D approach builds on prior work on error frameworks for data from online platforms, namely the Total Error Framework for digital traces of human behavior on online platforms (TED-On, https://doi.org/10.1093/poq/nfab018).
- Data statements for natural language processing: Toward mitigating system bias and enabling better science. Transactions of the Association for Computational Linguistics, 6:587--604.
- Datasheets for datasets. Communications of the ACM, 64(12):86--92.
- Survey Methodology. John Wiley & Sons.
- Total survey error: Past, present, and future. Public Opinion Quarterly, 74(5):849--879.
- Social data: Biases, methodological pitfalls, and ethical boundaries. Frontiers in Big Data, 2:13.
- “Call me sexist, but...”: Revisiting sexism detection using psychological scales and adversarial samples. In Proceedings of the International AAAI Conference on Web and Social Media, volume 15, pages 573--584.
- A total error framework for digital traces of human behavior on online platforms. Public Opinion Quarterly, 85(S1):399--422.