AILLA is a digital archive of recordings and texts in and about the indigenous languages of Latin America. Access to archive resources is free of charge. Most of the resources in the AILLA database are available to the public, but some have special access restrictions.
British Library Datasets contains numerous history datasets (e.g., Black History Month, Magna Carta, Women’s rights), literature datasets (focused on famous authors), music datasets (e.g., History of Music), and other datasets (e.g., National parks, Theology).
The Early Novels Database (END) project generates high-quality metadata about novels published between 1660 and 1850 in order to make early works of fiction more available to both traditional and computational modes of humanistic study.
OLAC is "an international partnership of institutions and individuals who are creating a worldwide virtual library of language resources by: (i) developing consensus on best current practice for the digital archiving of language resources, and (ii) developing a network of interoperating repositories and services for housing and accessing such resources."
The Rosetta Project is a global collaboration of language specialists and native speakers working to build a publicly accessible digital library of material on the nearly 7,000 known human languages. The collection currently contains nearly 100,000 pages of material documenting over 2,500 languages, as well as a growing multimedia collection of modern and historical language recordings.
TROLLing "is designed as an archive of linguistic data and statistical code. The archive is open access, which means that all information is available to everyone. All postings are accompanied by searchable metadata that identify the researchers, the languages and linguistic phenomena involved, the statistical methods applied, and scholarly publications based on the data (where relevant). Linguists worldwide are invited to post datasets and statistical models used in linguistic research."