Creating and Developing a Digital Humanities Project - From Inception to Implementation and Dissemination: DATA MANAGEMENT

An Essential Step by Step Approach: From Planning to Completing and Disseminating Your Digital Humanities Project.

WHAT IS DATA MANAGEMENT IN THE DIGITAL HUMANITIES?

Digital Humanities (DH) data are not the objects themselves, but digital representations of those objects, or of structured information related to them, which are created and analyzed by computational methods. Most digital projects rely on the synthesis, analysis, and visualization of data. Broadly defined, data are any information collected or created in order to answer a research question, and may include the primary objects of study, such as texts, paintings, documentary sources, and surveys, as well as secondary literature. In large projects, data can even be crowdsourced from the general public.

In DH research, data often also constitute valuable outputs in the form of digital resources, such as encoded texts, databases, images of artifacts, and digital collections.

In DH projects, "data management" refers to the systematic process of organizing, storing, archiving, preserving, and sharing research data collected through digital technologies. It includes planning for the handling of data at every stage, from collection through analysis to dissemination, so that the data remain accessible and usable for future research. This sequence of stages is often called the research data lifecycle.

KEY ASPECTS OF DATA MANAGEMENT IN DH PROJECTS

  • Data Collection:

Identifying relevant data sources, whether digitized texts, images, audio recordings, or other digital artifacts, and implementing appropriate methods for data capture. 

  • Data Cleaning and Standardization:

Cleaning up messy data by correcting errors, standardizing formats, and applying consistent metadata to ensure data quality and interoperability. 

  • Data Storage and Archiving:

Selecting appropriate platforms or repositories to securely store digital data, considering file formats, version control, and long-term preservation strategies. 

  • Metadata Creation:

Developing detailed descriptive information about the data (e.g., origin, date, content, context) to facilitate discoverability and understanding. 

  • Data Analysis Tools:

Utilizing software applications like OpenRefine, text analysis tools, and data visualization platforms to analyze and interpret the data. 

  • Data Sharing and Dissemination:

Making research data accessible to the wider community through open repositories, data portals, or published digital projects. 
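
The cleaning and standardization step described above can be illustrated with a short Python sketch. It is only one possible approach, not a prescribed workflow: the field names ("title", "date"), the sample records, and the list of accepted date spellings are all assumptions made for illustration, and the month/day order of slash-separated dates is assumed to be US style.

```python
import re
import unicodedata
from datetime import datetime

# Date spellings this sketch recognizes (an assumption; extend as needed).
DATE_FORMATS = ("%Y-%m-%d", "%d %B %Y", "%B %d, %Y", "%m/%d/%Y")


def clean_text(value: str) -> str:
    """Normalize Unicode, collapse runs of whitespace, and trim the ends."""
    value = unicodedata.normalize("NFC", value)
    return re.sub(r"\s+", " ", value).strip()


def standardize_date(value: str) -> str:
    """Convert a recognized date spelling to ISO 8601 (YYYY-MM-DD).

    Values that match none of the known formats are returned with only
    whitespace cleanup, so they can be reviewed by hand instead of guessed.
    """
    cleaned = clean_text(value)
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    return cleaned


def clean_record(record: dict) -> dict:
    """Clean every text field and standardize the 'date' field if present."""
    cleaned = {key: clean_text(val) for key, val in record.items()}
    if "date" in cleaned:
        cleaned["date"] = standardize_date(cleaned["date"])
    return cleaned


# Hypothetical sample records with typical transcription noise.
records = [
    {"title": "  Letter to   the Editor ", "date": "March 3, 1848"},
    {"title": "Diary\u00a0entry", "date": "circa 1850"},
]
cleaned_records = [clean_record(r) for r in records]
```

Uncertain values such as "circa 1850" are deliberately left as they are: in humanities data, an approximate date is information in its own right and should not be forced into a precise format.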

Examples of Digital Humanities Projects that heavily rely on data management:

  • Corpus-based analysis of literary texts:

Building and analyzing large digital collections of texts to study language patterns and themes. 

  • Historical mapping projects:

Geo-referencing and visualizing historical data on maps to explore spatial relationships. 

  • Digital archives of cultural heritage materials:

Digitizing and providing online access to collections like photographs, manuscripts, and audio recordings. 

  • Social media analysis for cultural studies:

Collecting and analyzing large volumes of social media data to understand public discourse and trends. 

ISSUES TO TACKLE IN DH DATA MANAGEMENT

Challenges in Digital Humanities Data Management:

  • Data Heterogeneity: Dealing with diverse data formats and structures from different sources. 
  • Data Volume: Managing large datasets that can be computationally demanding to process. 
  • Access and Copyright Issues: Obtaining permissions to use copyrighted materials and ensuring ethical data collection practices. 

ESSENTIAL CONSIDERATIONS

Important Considerations:

  • Data Management Plans (DMPs):

Creating a detailed plan outlining data collection, storage, access, and preservation strategies, often required by funding agencies. 

  • Collaboration with Librarians and Data Specialists:

Leveraging expertise from library staff and data scientists to navigate complex data management issues. 

  • Open Standards and Interoperability:

Utilizing open data formats and metadata standards to enable data sharing and reusability across different projects. 
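
As a rough illustration of open formats and metadata standards, the sketch below describes one item using the fifteen element names of the Dublin Core Metadata Element Set and saves it as both JSON and CSV. Dublin Core is used here only as one widely adopted example; a given project may call for a different standard. The helper names (`make_record`, `to_csv`) and the sample item are hypothetical.

```python
import csv
import io
import json

# The fifteen element names of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_record(**fields: str) -> dict:
    """Build a record with every Dublin Core element, rejecting other names."""
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"Not Dublin Core elements: {sorted(unknown)}")
    return {name: fields.get(name, "") for name in DC_ELEMENTS}


def to_csv(records: list) -> str:
    """Serialize records as CSV text with one column per element."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DC_ELEMENTS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical item; all values are invented for illustration.
record = make_record(
    title="Letter to the Editor",
    date="1848-03-03",
    type="Text",
    format="image/tiff",
    identifier="letters_0001",
    rights="Public domain",
)

json_text = json.dumps([record], indent=2, ensure_ascii=False)
csv_text = to_csv([record])
```

Because JSON and CSV are plain-text formats that do not depend on any particular software, records stored this way can be read by other projects and tools without conversion.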

TOOLS

Tools that will allow you to work easily with collections of data:

Bulk Rename Utility: Simplifies file renaming for Windows users by making it easy to establish and follow naming conventions.

     Resource: How to Use Bulk Rename Utility (video, 5.59 mins.)

DMPTool: Helps create the Data Management Plan, which will be required in a DH project that involves generating data.

From the Page: Transcription tool that allows users to transcribe, index, and describe historic documents, either by crowdsourcing the work or by collaborating with a restricted group of individuals or volunteers.

     Resource: From the Page "How-To"

OpenRefine: A powerful, free, open source tool for working with messy data: cleaning it, transforming it from one format into another, and extending it with web services and external data.

     Resource: OpenRefine User Manual

Tropy: Allows users to organize, annotate, tag, search visually, and export collections for research.

     Resource: Introduction to Tropy (video, 4.12 mins.)
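
For readers who are not on Windows, or who prefer a script, the kind of naming convention that Bulk Rename Utility helps enforce can also be sketched in Python. The pattern used here (collection name, date, four-digit sequence number) is only an assumed example of a convention, and the file and collection names are hypothetical. The sketch builds a rename plan without touching any files, so the plan can be checked before it is applied.

```python
import re
from pathlib import Path


def slugify(text: str) -> str:
    """Lowercase the text and replace runs of other characters with hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower())
    return slug.strip("-")


def plan_renames(paths: list, collection: str, date: str) -> dict:
    """Map each path to a new name: <collection>_<date>_<NNNN><extension>.

    Files are numbered in sorted order. Nothing is renamed on disk here.
    """
    prefix = f"{slugify(collection)}_{date}"
    plan = {}
    for number, path in enumerate(sorted(paths), start=1):
        new_name = f"{prefix}_{number:04d}{path.suffix.lower()}"
        plan[path] = path.with_name(new_name)
    return plan


# Hypothetical scans with inconsistent names.
scans = [Path("scans/page two.TIF"), Path("scans/page one.tif")]
plan = plan_renames(scans, "Smith Family Letters", "1848-03-03")

for old, new in plan.items():
    print(f"{old.name} -> {new.name}")
```

Once the printed plan looks right, each pair could be applied with `old.rename(new)`. Keeping a copy of the plan alongside the data also documents how the original file names map to the new ones.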