Researchers release largest ever public collection of British conversations
25 Sep 2017 12:59 PM
Language experts at Lancaster University and Cambridge University Press yesterday published the largest ever public collection of transcribed British conversations, totalling 11.5 million words of spontaneous British English collected between 2012 and 2016.
The study has revealed that use of the word 'like' at the beginning of sentences has risen substantially in the last few decades, from 160 per million sentences in the 1990s to 625 per million in the 2010s.
Use of the split infinitive, as in the infamous Star Trek line 'To boldly go', has almost tripled over the last three decades. This grammatical construction sees the word 'to' and the verb broken up by an intervening word, usually an adverb.
Linguists working on the project found that use of the split infinitive had risen from a mere 44 occurrences per million words in the early 1990s to a staggering 117 per million words in the 2010s, with common examples including 'to just go', 'to actually get' and 'to really want'.
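For readers who want to check the arithmetic, the short Python sketch below shows how a per-million rate is usually calculated and how the two reported rates compare. It is an illustration only, not the researchers' own method: the function name `per_million` and the raw count of 1,150 are hypothetical and chosen for round numbers; only the rates of 44 and 117 come from the figures above.

```python
def per_million(count: int, corpus_size: int) -> float:
    """Normalised frequency: occurrences per million words of text."""
    return count * 1_000_000 / corpus_size


# Hypothetical raw count, used only to illustrate the normalisation:
# 1,150 occurrences in an 11.5-million-word corpus is 100 per million.
example_rate = per_million(1_150, 11_500_000)
print(example_rate)  # 100.0

# Rates reported in the release (split infinitives per million words).
rate_1990s = 44.0
rate_2010s = 117.0

# Ratio of the later rate to the earlier one.
ratio = rate_2010s / rate_1990s
print(f"{ratio:.2f}")  # 2.66
```

A ratio of roughly 2.66 is consistent with the description of split infinitive use as having 'almost tripled'. Normalising to a rate per million words is what allows two corpora of different sizes to be compared directly.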
The split infinitive and the word 'like' at the beginning of sentences are just two examples of language that is becoming a normal part of speech. Researchers on the Spoken British National Corpus 2014 project have previously identified that words such as 'marvellous' and 'marmalade' were out of fashion and that newcomers such as 'awesome' and 'massively' were bang on trend.
The recordings used for the project were made between 2012 and 2016. They were gathered by members of the British public, who used their smartphones to record everyday conversations with their families and friends. These included a newlywed couple reminiscing about their recent honeymoon, students drinking in their halls, a father and daughter chatting in the car and grandparents visiting family for the day.
In a landmark moment for social science, the anonymised transcripts of these recordings were released yesterday, free of charge, to the public. This is the largest collection or 'corpus' of British English conversations ever made freely available.
The creators, including Lancaster University's Professor Tony McEnery and Cambridge University Press's Dr Claire Dembry, intend for the transcripts to be used by linguists and language educators around the world.
These conversations will help linguists to understand what influences language change over short periods of time, as well as how best to teach learners of English.
Professor McEnery, who set up the research project, said: "The launch of the Spoken British National Corpus 2014 is an important moment for the study of spoken English. Never before has it been possible to compare millions of words of spoken English across decades in this way. This will help linguists to understand better the changing nature of English speech and help a new generation of learners of English in the modern world."
Principal Research Manager at Cambridge University Press, Dr Dembry, highlighted the importance of keeping up with language change. She said: "Learners of English deserve to be taught in a way which is informed by the most up-to-date research into how the language is used in the real world.
"The rise of the split infinitive is just one example of language phenomena which some commentators might not like, but which are becoming a normal part of everyday speech. Language teaching should reflect these changes, which can only be observed in a corpus such as this."
The corpus will also make it possible to compare how different social groups talk, including men vs women, young vs old and north vs south, as well as to study how the British public discusses topics including politics, religion, immigration and the economy.
The Spoken British National Corpus 2014 was gathered by the ESRC-funded Centre for Corpus Approaches to Social Science (CASS) at Lancaster University and Cambridge University Press.
Notes for editors
- The findings about changes in frequency over time are generated by comparing the Spoken BNC2014 to an existing corpus, collected in the 1990s – the spoken component of the original British National Corpus.
- The first research articles to be based on the Spoken BNC2014 are due to be published in the International Journal of Corpus Linguistics later this year. These will be followed by a book in the Routledge Advances in Corpus Linguistics series and the recordings of the conversations.
- Lancaster University is ranked among the top 10 universities by all leading league tables in the UK and has a rising global reputation. It is the highest ranked University in the North West of England in the Guardian, Times/Sunday Times and Complete University Guide. It is also top for employability and student satisfaction in its region. 83% of Lancaster’s research is judged to be internationally excellent and world leading. The University has a strong focus on working with business and has helped create more than 5,000 new jobs.
- Cambridge University Press is part of the University of Cambridge. It furthers the University's mission by disseminating knowledge in the pursuit of education, learning and research at the highest international levels of excellence. Its extensive peer-reviewed publishing lists comprise 50,000 titles covering academic research and professional development, as well as school-level education and English language teaching. Playing a leading role in today's international marketplace, Cambridge University Press has more than 50 offices around the globe, and it distributes its products to nearly every country in the world.
- The Economic and Social Research Council (ESRC) is the UK’s largest funder of research on the social and economic questions facing us today. It supports the development and training of the UK’s future social scientists and also funds major studies that provide the infrastructure for research. ESRC-funded research informs policymakers and practitioners and helps make businesses, voluntary bodies and other organisations more effective. The ESRC also works collaboratively with six other UK research councils and Innovate UK to fund cross-disciplinary research and innovation addressing major societal challenges. The ESRC is an independent organisation, established by Royal Charter in 1965, and funded mainly by the Government.