A corpus of Scottish texts from 1945 to the present day ... currently contains over 1,100 written and spoken texts, totalling over 4 million words of running text. 80% of this total is made up of written texts and 20% is made up of spoken texts, which are provided in the form of an orthographic transcription, synchronised with the source audio or video.
« The American National Corpus (ANC) project is creating a massive electronic collection of American English, including texts of all genres and transcripts of spoken data produced from 1990 onward. The ANC will provide the most comprehensive picture of American English ever created, and will serve as a resource for education, linguistic and lexicographic research, and technology development. »
« The corpus contains more than 360 million words of text, including 20 million words each year from 1990-2007, and it is equally divided among spoken, fiction, popular magazines, newspapers, and academic texts. The corpus will also be updated at least twice each year from this point on, and will therefore serve as a unique record of linguistic changes in American English. The interface allows you to search for exact words or phrases, wildcards, lemmas, part of speech, or any combinations of these. You can search for surrounding words (collocates) within a ten-word window (e.g. all nouns somewhere near chain, all adjectives near woman, or all verbs near key). »
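The collocate search described above can be illustrated with a minimal sketch. It is not the corpus interface's own implementation: it assumes that a "window" means up to N tokens on either side of the search word, it works on a plain whitespace-tokenised string, and it ignores part-of-speech filtering. The function name `collocates` and the sample sentence are hypothetical.

```python
from collections import Counter


def collocates(tokens, node, window=10):
    """Count the words found within `window` tokens either side of each hit for `node`."""
    counts = Counter()
    lowered = [t.lower() for t in tokens]
    node = node.lower()
    for i, tok in enumerate(lowered):
        if tok != node:
            continue
        # Clamp the window to the bounds of the text.
        start = max(0, i - window)
        end = min(len(lowered), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the search word itself at this position
                counts[lowered[j]] += 1
    return counts


# Hypothetical sample text, for illustration only.
text = "she locked the chain on the gate and lost the key to the chain"
hits = collocates(text.split(), "chain", window=2)
print(hits.most_common(3))
```

A real query such as "all nouns near chain" would additionally need part-of-speech tags on each token, so that the inner loop could keep only the collocates carrying the requested tag.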
LAEME will allow you to search and retrieve linguistic data from its corpus of lexico-grammatically tagged texts; search and retrieve data from the Index of Sources; view maps showing the geographical distribution of linguistic features across space; and view and create chronological tables, graphs and charts showing the distribution of linguistic features through time.