Content Mining: Cost incurred resources for mining

This guide provides information about available text mining resources and tools, and about whether the Libraries' subscription databases support content mining.

USC Libraries' databases where cost is incurred to perform mining

Most of the libraries' databases do not allow content mining research due to license agreements with publishers. We will continue to work with database vendors to include mining in future license agreements. If you do not see a resource listed here, please contact us and let us know!

Vendor Fee Details
Cambridge University Press  Cost negotiated per request

Contact USC Libraries to initiate the process.

 

Gale Primary Resources Some free; downloading large datasets incurs cost

Contact USC Libraries to initiate the process.

Gale Artemis: Primary Sources, which searches across 23 of our Gale primary source databases covering 1500-2012, has a Term Frequency search option and Term Clusters viewer.

To download large datasets, USC Libraries must request the data on your behalf from our Gale sales representative. Requests can take up to 3 weeks to process.

IEEE Cost negotiated per request

Contact USC Libraries to initiate the process.

The library facilitates access on a case-by-case basis through negotiation of the vendor license.

Newsbank Cost incurred

Contact USC Libraries to initiate the process.

Restrictions are in place. TDM research costs between $6,000 and $8,000, and requests can take 6-8 weeks to process.

Oxford University Press (OUP) Cost incurred

Contact USC Libraries to initiate the process.

Researchers may use the resource for non-commercial text mining. OUP also offers a consultation service with a technical project manager to assist in planning a project, including "avoidance of any technical safeguard triggers OUP has in place to protect stability and security of website."

ProQuest Cost negotiated per request; TDM Studio platform also available

Contact USC Libraries to initiate the process.

ProQuest allows free text mining of the newspapers for which USC Libraries has purchased perpetual access licenses. USC Libraries will have to request this data on your behalf. TDM Studio, a 2019 platform, offers select researchers paid access to ProQuest resources, including newspapers, for research.

ScienceDirect (Elsevier) Free (with subscription)

Contact USC Libraries to initiate the process.

Researchers may text and data mine all subscribed content, as long as it is for non-commercial purposes. Users access the content via Elsevier's ScienceDirect APIs.
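As an illustrative sketch only, a request to an Elsevier API might be prepared along the following lines. The endpoint URL, the `X-ELS-APIKey` header name, and the need for an API key are assumptions drawn from Elsevier's public developer documentation and should be verified there before use. The function name `build_article_request` is hypothetical. The sketch only builds the request; it does not send it.

```python
import urllib.request
from urllib.parse import quote

# Assumed base URL for article retrieval by DOI; confirm against
# Elsevier's developer documentation before relying on it.
API_BASE = "https://api.elsevier.com/content/article/doi/"


def build_article_request(doi, api_key, accept="text/xml"):
    """Prepare (but do not send) a GET request for one article by DOI.

    `api_key` is assumed to be a key issued by Elsevier; the header
    name used to pass it is likewise an assumption.
    """
    # Percent-encode the DOI but keep the slash that separates
    # the DOI prefix from the suffix.
    url = API_BASE + quote(doi, safe="/")
    headers = {"X-ELS-APIKey": api_key, "Accept": accept}
    return urllib.request.Request(url, headers=headers)
```

To send the request, pass the returned object to `urllib.request.urlopen` from a network where the USC subscription is recognized, and keep the use non-commercial as the license requires.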

SpringerLink Free (with subscription)

Users can download subscribed and open access content for TDM purposes directly from the SpringerLink platform.

Content can be downloaded via a web browser or with an HTTP GET request using a scripting tool such as curl, wget, or Python’s urllib.

No API key or other authentication is required. TDM researchers are requested to be considerate and limit their downloading speed to a reasonable rate.
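A minimal sketch of a considerate download loop with Python's urllib is shown below. The function name `fetch_politely`, the one-second default delay, and the User-Agent string are illustrative assumptions, not SpringerLink requirements; the guide only asks for a "reasonable rate" and does not define one.

```python
import time
import urllib.request


def fetch_politely(urls, delay_seconds=1.0,
                   opener=urllib.request.urlopen, sleep=time.sleep):
    """Download each URL with a plain HTTP GET, pausing between requests.

    `opener` and `sleep` are injectable so the loop can be exercised
    without touching the network. Returns a dict of URL -> bytes.
    """
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first to limit
            # the download rate.
            sleep(delay_seconds)
        request = urllib.request.Request(
            url,
            # Hypothetical identifying string; replace with your own contact.
            headers={"User-Agent": "tdm-research-script (contact: you@example.edu)"},
        )
        with opener(request) as response:
            results[url] = response.read()
    return results
```

Call it with a list of content URLs gathered from the SpringerLink platform, for example `fetch_politely(my_urls, delay_seconds=2.0)`, and write each returned byte string to disk. Raising the delay is the simplest way to lower the request rate.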