Oxford Dictionaries announce that the billionth word has just been added to the massive Oxford English Corpus (OEC), the groundbreaking research project into the language of the 21st century.
This is equivalent to half a million pages of the average broadsheet (or over a million pages of a tabloid newspaper), or nearly nine million Shakespeare sonnets. Reading it aloud would take the average TV newsreader 30 years without stopping.
To really understand how our language is developing during the 21st century, researchers at Oxford Dictionaries are monitoring the whole of the English language as it is being used by everyone, everywhere, every day. Since 1 January 2000, Oxford Dictionaries researchers have fed over one billion words of what people around the world are writing and saying into the corpus, a unique electronic database that makes it possible to see exactly how and why English is changing.
This ongoing research programme not only provides Oxford Dictionaries with the most up-to-date and accurate information about how words, spellings, and meanings are changing, but also reveals many fascinating insights into our language and culture.
The corpus can be searched to discover which words most often appear together as 'collocates'. When, for example, the corpus is examined for verbs which correlate with 'man' or 'boy' but not 'woman' or 'girl', we find that men assault, hijack, crouch, kidnap, rob, grin, shoot, dig, stagger, leap, invent, or brandish. But they don't consent, faint, sob, cohabit, undress, clutch, scorn, or gossip, because, according to the evidence of the corpus, that's what women do.
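As a rough illustration of the kind of comparison described above, here is a minimal Python sketch. It assumes the text has already been reduced to (subject noun, verb) pairs; the sample pairs and the function name `verbs_exclusive_to` are invented for this example and do not represent the actual OEC software or data.

```python
from collections import Counter

# Hypothetical toy data: (subject noun, verb) pairs, as if already extracted
# from parsed sentences. Invented for illustration; not taken from the corpus.
PAIRS = [
    ("man", "brandish"), ("boy", "leap"), ("man", "walk"),
    ("woman", "sob"), ("girl", "gossip"), ("woman", "walk"),
    ("man", "leap"), ("girl", "sob"),
]

def verbs_exclusive_to(pairs, group, other):
    """Return verbs seen with nouns in `group` but never with nouns in `other`.

    Results are sorted by descending count, then alphabetically.
    """
    group_counts = Counter(verb for noun, verb in pairs if noun in group)
    other_verbs = {verb for noun, verb in pairs if noun in other}
    exclusive = {v: c for v, c in group_counts.items() if v not in other_verbs}
    return sorted(exclusive, key=lambda v: (-exclusive[v], v))

male_only = verbs_exclusive_to(PAIRS, {"man", "boy"}, {"woman", "girl"})
female_only = verbs_exclusive_to(PAIRS, {"woman", "girl"}, {"man", "boy"})
print(male_only)    # ['leap', 'brandish']
print(female_only)  # ['sob', 'gossip']
```

In this toy data 'walk' appears with both groups, so it is dropped from both lists. A real corpus query would presumably rely on statistical association measures rather than strict exclusion.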
And while we may like to think of the Internet as a worthy tool providing access to knowledge for all, the corpus shows overwhelmingly that we actually use the word 'online' in conjunction with gaming, gambling, dating, and shopping much more often than with any other activity.
If you thought there wasn't much difference between terms of abuse such as 'scum', 'git', or 'bastard', the corpus reveals otherwise. Discounting the 'F' word, which scores highly in most abusive contexts, the corpus shows that 'gits' are mostly irritating: smarmy, smug, and pompous, while 'scum' are bad through and through: cheating, traitorous, filthy, racist, vile, nazi, and murdering. If you're a 'bastard' you may be poor rather than bad – or sick, fat, or even lucky – but regrettably also heartless, arrogant, and greedy.
Many of our most familiar two-word phrases are being fused into one, according to corpus evidence – and the US is mostly to blame! Forever, somebody, and everyone have long been accepted, but now in American English someday, anymore, and underway are also becoming standard, and Britain is starting to follow. Britain is mostly responsible, however, for popularizing thankyou as opposed to thank you. Other increasingly popular word fusions tracked by the corpus include instore and alot.
New words and expressions revealed by the corpus include a whole array of variations on the theme of the 'inner child'. In order, they are: inner geek, inner nerd, inner diva, inner dweeb, inner slut, inner cynic, inner hippie, and inner brat. The corpus also shows whose jargon we find the most annoying by revealing the most popular uses of the suffix -speak. In order, they are: management-speak, corporate-speak, geek-speak, business-speak, therapy-speak, art-speak, lawyer-speak, media-speak, government-speak, consultant-speak, techno-speak, adspeak, and, er, PR-speak.
In the world of fashion, it's no longer enough to be simply chic when you could have radical chic, geek chic, heroin chic, shabby chic, retro chic, urban chic, lesbian chic, casual chic, cheap chic, cutie chic, or porno chic.
See www.askoxford.com/oec for more examples...
For more information on the Oxford English Corpus, please contact Kate Farquhar-Thomson on 01865 353423 or email kate.farquhar-thomson@oup.com