Data Cleaning / Editing



Discipline-specific repositories

Generalist repositories

Institutional repositories

Further comparisons of repository features compiled by:

Project Management Platforms


“Cleaning Data with OpenRefine,” The Programming Historian

“Cleaning Data with OpenRefine for Ecologists” and “OpenRefine for Social Science Data”, Data Carpentry: Building Communities Teaching Universal Data Literacy

Checklist for Digital Humanities Projects, La Red de Humanidades Digitales (RedHD), English and Spanish versions available

Programming Historian: Preserving Your Research Data: “This lesson will suggest ways in which historians can document and structure their research data so as to ensure it remains useful in the future.”

Library Carpentry: Tidy Data for Librarians

Library Carpentry: OpenRefine

Library Carpentry: Top 10 FAIR Data & Software Things, a list of field-specific FAIR principles/techniques

Black Living Data Booklet section on "3 Steps to Download and Decode Data" PDF

Data Literacies: DH Institutes on tidy data, CSV, stages of data analysis, etc.

NEH’s Office of Digital Humanities Guide to Data Management Plans

Methods & Best Practices

Example Datasets

Further Readings

Data & Method

Tanya E. Clement, “Where Is Methodology in Digital Humanities?”, Debates in the Digital Humanities 2016

Ryan Cordell, “Teaching Humanistic Data Analysis” (2019)

Luke Stark and Anna Lauren Hoffmann, “Data Is the New What? Popular Metaphors & Professional Ethics in Emerging Data Culture,” Cultural Analytics (2019)

Daniel Rosenberg, “Data Before the Fact,” in “Raw Data” Is an Oxymoron, ed. Lisa Gitelman (MIT Press, 2013)

Johanna Drucker, “HTML and Structured Data” (2013)

Michael Hancher, “Re: Search and Close Reading,” Debates in the Digital Humanities 2016

Ricardo L. Punzalan, Diana E. Marsh, Kyla Cools, “Beyond Clicks, Likes, and Downloads: Identifying Meaningful Impacts for Digitized Ethnographic Archives,” Archivaria 84 (Fall 2017)

Lauren F. Klein, “The Image of Absence: Archival Silence, Data Visualization, and James Hemings,” American Literature 85, no. 4 (2013): 661–688

Data Cleaning

Katie Rawson & Trevor Muñoz, “Against Cleaning” (2016)

Mia Ridge, “Mia Ridge explores the shape of Cooper-Hewitt collections”, Cooper-Hewitt Labs (2012)

Lauren F. Klein, “The Image of Absence: Archival Silence, Data Visualization, and James Hemings,” American Literature 85, no. 4 (2013)

Garfinkel, Simson L. “De-Identification of Personal Information.” National Institute of Standards and Technology NISTIR 8053, October 2015.

Lincoln, Matthew D. “Tidy Data for the Humanities.” Matthew Lincoln, PhD (blog), 26 May 2020. (Accessed January 31, 2022.)

Sperberg-McQueen, C.M. and David Dubin. “Data Representation.” DH Curation Guide, (no date) (Accessed January 31, 2022.)

Wickham, Hadley. “Tidy Data.” Journal of Statistical Software 2014; 59 (10): 1-23.

Babeu, Alison. “Classics, ‘Digital Classics’ and Issues for Data Curation.” DH Curation Guide, (no date) (Accessed February 14, 2022)

Gebru, Timnit, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. “Datasheets for Datasets.” Communications of the ACM 64(12): pp. 86-92, 2021.

Levine, Melissa. “Policy, Practice, and Law.” DH Curation Guide, (no date) (Accessed February 14, 2022)

Mitchell, Margaret, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. “Model Cards for Model Reporting.” FAT* ’19: Proceedings of the Conference on Fairness, Accountability, and Transparency: pp. 220-229, 2019.

Van den Eynden, Veerle, Louise Corti, Matthew Woollard, Libby Bishop, and Laurence Horton. “Managing and Sharing Data: Best Practice for Researchers.” 3rd Ed. Essex: UK Data Archive, 2011. (Accessed February 28, 2022)