A fascinating article from Language Log about the value of Twitter as a corpus of contemporary language.
To say Twitter is colloquial is putting it lightly. “Brother,” for example, occurs in Twitter data during the week of May 10-17, 2010 with an average frequency of once every 7,338 words, not too distant from its frequency in its closest cousin, the Corpus of Contemporary American English (once every 9,405 words). The difference for “bro,” however, is much more dramatic: in the Twitter data during that same period, it occurs once every 5,833 words (more frequently, in fact, than “brother”), while in the COCA it occurs once every 757,575 words – two orders of magnitude less frequently.
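As a quick sanity check on the quoted figures, here is a minimal sketch that converts each "once every N words" rate into occurrences per million words and compares the two corpora. The numbers are taken directly from the quote; the dictionary and function names are only illustrative.

```python
# Figures from the quote: each word occurs once every N words in the corpus.
WORDS_PER_OCCURRENCE = {
    "brother": {"twitter": 7_338, "coca": 9_405},
    "bro": {"twitter": 5_833, "coca": 757_575},
}


def per_million(words_per_occurrence: int) -> float:
    """Convert 'once every N words' into occurrences per million words."""
    return 1_000_000 / words_per_occurrence


def twitter_to_coca_ratio(word: str) -> float:
    """How many times more frequent the word is on Twitter than in COCA."""
    counts = WORDS_PER_OCCURRENCE[word]
    return counts["coca"] / counts["twitter"]


if __name__ == "__main__":
    for word, counts in WORDS_PER_OCCURRENCE.items():
        print(
            f"{word}: "
            f"Twitter {per_million(counts['twitter']):.2f} per million, "
            f"COCA {per_million(counts['coca']):.2f} per million, "
            f"ratio {twitter_to_coca_ratio(word):.1f}x"
        )
```

On these figures, "bro" comes out roughly 130 times more frequent on Twitter than in COCA, which matches the "two orders of magnitude" in the quote, while "brother" is only about 1.3 times more frequent.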
Posted in: Language, Social Media

Posted on 25 June 2010