The Salamanca Corpus

Digital Archive of English Dialect Texts


The Project




From 8th March 2015 all new updates will only be seen at

The present contents of The Salamanca Corpus will be gradually added to the new website. Until the migration is completed, these contents will remain available at this site and, eventually, at the new site as well.

The linguistic history of English dialects still suffers from a considerable lack of diachronic data representative of the period extending from early modern to modern times (c.1500-c.1950). Whilst the increasing availability of textual corpora has enabled successful diachronic research into the history of standard English, variation in regional varieties of English remains virtually unexplored. No diachronic compilations have hitherto been available to fill the lacunae still present in the field. For this reason, a group of researchers from the University of Salamanca (initially led by Dr. Gudelia Rodríguez Sánchez ✝) has been working over the past few years on a long-term project whose primary aim is to remedy the scarcity of data so that linguists may be able to sketch the regional setting from a diachronic perspective. Consisting of documents representative of literary dialects and dialect literature, the Salamanca Corpus has been conceived as an electronic repository of diachronic dialect material which might bridge some of the gaps still existing in the field. It aims to cover a time span of no fewer than four centuries (c.1500-c.1950), thereby presenting documents in which dialect traits from pre-1974 English counties are documented. Some of the texts supplement the monumental primary sources of the English Dialect Dictionary (1898-1905), thus adding to our understanding of old regional speech. The compilation follows the pluralistic stance that has recently been adopted by diachronic linguistics, seeking to provide a democratic account of non-canonical literatures as well.

The Salamanca Corpus has been made possible by the generous financial support of the Spanish Ministry of Education and Science. Two research grants have so far funded our investigation:


Title: “Variación lingüística en el Inglés Moderno Temprano: Dialectos y sociolectos marginados en el proceso de estandardización” (PB98-0258).

Period: 30/12/1999-30/12/2002

Main Researcher: Dr. Gudelia Rodríguez Sánchez.


Title: “Idiolectos y sociolectos ingleses marginados en el proceso de estandardización desde fines del siglo XVI hasta mediados del siglo XX” (BFF 2003-09376).

Period: 10/12/2003-09/12/2006

Main Researcher: Dr. María F. García-Bermejo Giner.

We are also grateful to the University of Salamanca both for granting us space on their server for this web page and for permanently hosting this electronic Corpus in the University Digital Archive, GREDOS.





Copyright and citation

Copyright © 2011-DING, The Salamanca Corpus, Universidad de Salamanca

Last updated: 8th March 2015

407 texts & counting

Word count: 11,721,610 words