Corpus del Español The Corpus del Español has five sections ranging in size from 40,000 words to 45 billion words. These subcorpora contain historical texts, texts from a variety of Spanish dialects, news articles from the web, Google Books, and the top 40,000 words in Spanish.
CORPES The "Corpus del Español del Siglo XXI" contains thousands of written and oral texts from 2001 to 2012, totaling over 225 million forms. These texts come from a range of countries, including Spain, the Americas, the Philippines and Equatorial Guinea.
CREA The "Corpus de Referencia del Español Actual" contains oral and written texts from every Spanish-speaking country, covering the years 1975-2004. The corpus can be searched by lemma, form and grammatical category.
CORDE CORDE, or "El Corpus Diacrónico del Español", contains texts from the beginning of the Spanish language to 1974, totaling over 250 million forms. The corpus is divided by text type, including narrative, lyrical, dramatic, scientific, historical, legal, and religious texts, among others.
SPLLOC The Spanish Learner Language Oral Corpora contains data that has been collected from classroom learners of Spanish who have English as their first language. The participants range from beginners to advanced levels, and perform a range of oral tasks. The database contains the sound files and transcripts.
Biblioteca Miguel de Cervantes The Biblioteca Miguel de Cervantes is a digital library project, run by the University of Alicante. It holds a collection of open-access digitized Spanish-language texts with more than 22,000 individual works.
Word Properties
ESPAL ESPAL is a corpus from which researchers can extract word properties such as word frequency, orthographic structure and neighborhoods, and phonological structure and neighborhoods. Researchers can also search for words that match these properties. The data come from two sources: 1) a large collection of written data from the Web and 2) subtitles from movies.
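As an illustration of what an orthographic neighborhood is, the following minimal Python sketch finds the substitution neighbors of a word: words of the same length that differ from it in exactly one letter position (the definition commonly known as Coltheart's N). The function name and the toy lexicon are hypothetical and chosen only for this example; ESPAL's own neighborhood measures may be computed differently.

```python
def orthographic_neighbors(word, lexicon):
    """Return the words in `lexicon` that have the same length as `word`
    and differ from it in exactly one letter position."""
    return sorted(
        candidate
        for candidate in lexicon
        if len(candidate) == len(word)
        and sum(a != b for a, b in zip(candidate, word)) == 1
    )


# Toy lexicon for illustration only (not taken from ESPAL).
lexicon = {"casa", "cosa", "cama", "caso", "masa", "mesa", "pasa", "perro"}

neighbors = orthographic_neighbors("casa", lexicon)
print(neighbors)       # ['cama', 'caso', 'cosa', 'masa', 'pasa']
print(len(neighbors))  # neighborhood size: 5
```

Here "mesa" is excluded because it differs from "casa" in two positions, and "perro" because it has a different length.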
NIM NIM is a stimulus search engine for psycholinguistic research. You can look for words in Spanish, English and Catalan based on a number of selection criteria, such as part of speech, frequency and number of letters.
CREA frequency list This website contains lists of the 1,000, 5,000 and 10,000 most frequent forms in the CREA corpus. You can also download the list of all word frequencies.
Experiment presentation
PEBL The Psychology Experiment Building Language is an experiment platform that allows you to design your own experiments or use any of a range of already existing experiments.
OpenSesame OpenSesame is an experiment platform that allows you to create experiments for psychology, neuroscience and experimental economics.