Corpus del Español The Corpus del Español has two sections. The first contains over 100 million words of historical data from the 1200s-1900s from a number of sources, including spoken, fiction, newspaper, and academic texts. The second section, added in 2016, contains nearly two billion words of internet data drawn from webpages in 21 different countries. These web texts were collected between 2013 and 2014.
CORPES The "Corpus del Español del Siglo XXI" contains thousands of written and oral texts from 2001 to 2012, constituting over 225 million forms. These texts are from a range of countries, including Spain, the Americas, the Philippines and Equatorial Guinea.
CREA The "Corpus de Referencia del Español Actual" contains oral and written texts from every Spanish-speaking country covering the years 1975-2004. You can search it for lemmas, forms and grammatical categories.
CORDE CORDE, or "El Corpus Diacrónico del Español", contains texts from the beginning of the Spanish language to 1974, constituting over 250 million forms. The corpus is divided by text type, including narrative, lyrical, dramatic, scientific, historical, legal, and religious texts, among others.
SPLLOC The Spanish Learner Language Oral Corpora contains data collected from classroom learners of Spanish whose first language is English. The participants range from beginner to advanced level and perform a range of oral tasks. The database contains the sound files and transcripts.
Biblioteca Miguel de Cervantes The Biblioteca Miguel de Cervantes is a digital library project, run by the University of Alicante. It holds a collection of open-access digitized Spanish-language texts with more than 22,000 individual works.
ESPAL ESPAL is a corpus from which researchers can extract word properties such as word frequency, orthographic structure and neighborhoods, and phonological structure and neighborhoods, or find words according to these properties. The data come from two sources: 1) a large collection of written data from the Web and 2) subtitles from movies.
NIM NIM is a stimuli search engine for psycholinguistic research. You can look for words in Spanish, English and Catalan based on a number of selection criteria, such as part of speech, frequency and number of letters, among other features.
CREA frequency list This website contains lists of the 1,000, 5,000 and 10,000 most frequent forms in the CREA corpus. You can also download the full list of word frequencies.
PEBL The Psychology Experiment Building Language is an experiment platform that allows you to design your own experiments or use any of a range of already existing experiments.
OpenSesame OpenSesame is an experiment platform that allows you to create experiments for psychology, neuroscience and experimental economics.