Library Guides: Research Data Repositories: Finding and Storing Data: Literature, Linguistics, & Languages

Literature, Linguistics, & Languages

AILLA: Archive of the Indigenous Languages of Latin America
AILLA is a digital archive of recordings and texts in and about the indigenous languages of Latin America. Access to archive resources is free of charge. Most of the resources in the AILLA database are available to the public, but some have special access restrictions.
British Library Datasets
British Library Datasets contains numerous history datasets (e.g., Black History Month, Magna Carta, women’s rights), literature datasets (focused on famous authors), music datasets (e.g., History of Music), and other datasets (e.g., national parks, theology).
Early Novels Database
The Early Novels Database (END) project generates high-quality metadata about novels published between 1660 and 1850 in order to make early works of fiction more available to both traditional and computational modes of humanistic study.
OLAC: Open Language Archives Community
OLAC is "an international partnership of institutions and individuals who are creating a worldwide virtual library of language resources by: (i) developing consensus on best current practice for the digital archiving of language resources, and (ii) developing a network of interoperating repositories and services for housing and accessing such resources."
Oxford Text Archive
The Oxford Text Archive is a repository of full-text literary and linguistic resources. It includes thousands of texts in more than 25 languages.
The Rosetta Project
The Rosetta Project is a global collaboration of language specialists and native speakers working to build a publicly accessible digital library of material on the nearly 7,000 known human languages. The collection currently contains nearly 100,000 pages of material documenting over 2,500 languages, as well as a growing multimedia collection of modern and historical language recordings.
TalkBank
TalkBank is an open access repository for spoken language data.
TROLLing: The Tromsø Repository of Language and Linguistics
TROLLing "is designed as an archive of linguistic data and statistical code. The archive is open access, which means that all information is available to everyone. All postings are accompanied by searchable metadata that identify the researchers, the languages and linguistic phenomena involved, the statistical methods applied, and scholarly publications based on the data (where relevant). Linguists worldwide are invited to post datasets and statistical models used in linguistic research."

OHIO UNIVERSITY LIBRARIES