Text & Data Mining: Overview

This guide describes freely available text mining resources and tools and indicates whether the Libraries' subscription databases support text/data mining.

What is Text/Data Mining?

Text mining is a research technique, used in a variety of disciplines, that applies computational analysis to extract trends and patterns from large text-based data sets (Source: University of Chicago's Text & Data Mining Guide). The difference between text mining and data mining is that "in text mining the patterns are extracted from natural language text rather than from structured databases of facts" (Source: "What Is Text Mining?" by Marti Hearst). Text mining examines and analyzes full-text digitized content, while data mining might only need to look at metadata describing that content.
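To make the distinction concrete, here is a minimal Python sketch. It assumes a tiny, made-up corpus in which each record carries a publication year (metadata) and a short full text; the records and the function names `term_counts_by_year` and `documents_per_year` are hypothetical and exist only for illustration. Real projects work with far larger collections and usually with dedicated libraries.

```python
import re
from collections import Counter

# Hypothetical toy corpus: each record has metadata (year) and full text.
documents = [
    {"year": 1901, "text": "The railway changed the city. The city grew."},
    {"year": 1901, "text": "A new railway line opened."},
    {"year": 1950, "text": "The highway replaced the railway."},
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts_by_year(docs, term):
    """Text mining: read the full text and count a term's occurrences per year."""
    counts = Counter()
    for doc in docs:
        counts[doc["year"]] += tokenize(doc["text"]).count(term)
    return dict(counts)


def documents_per_year(docs):
    """Data mining on metadata only: count records per year without reading the text."""
    return dict(Counter(doc["year"] for doc in docs))


city_trend = term_counts_by_year(documents, "city")
docs_per_year = documents_per_year(documents)
print(city_trend)
print(docs_per_year)
```

The first function has to read every word of every document, while the second never looks past the `year` field. That is the practical difference between needing full-text access and needing only descriptive metadata.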

TED Talk: What We Learned from 5 Million Books

Contact

Caroline Muglia
Contact:
Co-Associate Dean for Collections & Technical Services; Head of Resource Sharing and Collection Assessment Librarian
213-821-0756