10 May 2006

'Many' vs. 'Virtually Every'

Submitted by Karl Hagen
I recently ran into a claim that seems to have risen to the level of folk linguistics. As usually stated, the claim is that "in the vast majority of languages" or "in virtually every language" the word for mother begins with m. Google {"word for mother" begin m} and you'll see what I mean.

Stated this way, the observation seems unlikely to be true. Just to take two examples from languages with which I have a passing acquaintance, the Tamil word is amma, and the Korean word is oma. Granted, they both contain the sound [m], but as the claim is phrased, vowel-initial words are out.

This looks like a badly mangled version of an observation originally made by George Murdock (and explained by Roman Jakobson in his paper "Why 'mama' and 'papa'?"): in 52% of the 470 languages that Murdock surveyed, the word for mother had ma, me or mo in it. Phrased this way, the Tamil and Korean examples are part of this group, but notice that this constitutes only a slim majority, not "virtually every language". And if we restricted ourselves to m-initial words only, I'd be willing to bet the numbers drop to well below a majority. Even with the looser definition, 48% of languages don't have an m + V structure. Many others are formed with n (another nasal), and 7% even have papa-type words meaning 'mother'. (By the way, in Tamil, papa means 'baby', and maamaa means 'uncle'.)
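To make the difference between the two phrasings concrete, here is a rough sketch that checks both criteria against the two words mentioned above. The function names are my own invention, the spellings are the informal romanizations used in this post, and two words obviously prove nothing about the real proportions.

```python
import re

# Only the two words cited in this post (informal romanizations).
# A real count would need a full word list for every language surveyed.
MOTHER_WORDS = {"Tamil": "amma", "Korean": "oma"}


def begins_with_m(word):
    """Strict folk claim: the word itself starts with m."""
    return word.lower().startswith("m")


def contains_m_plus_vowel(word):
    """Looser criterion: the word contains ma, me or mo anywhere."""
    return re.search(r"m[aeo]", word.lower()) is not None


for language, word in MOTHER_WORDS.items():
    # Print each word with the result of the strict and the loose test.
    print(language, word, begins_with_m(word), contains_m_plus_vowel(word))
```

Both words fail the strict "begins with m" test and pass the looser "contains ma, me or mo" test, which is exactly the gap between the folk claim and the original observation.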

The reality is harder to phrase in a punchy fashion, but it is just as interesting. There's a nice summary of Jakobson's explanation (and of why the similarities don't prove that all languages come from a common ancestor, Proto-World) here [pdf].

It would also be nice if the Rosetta Project put the Swadesh lists that they have collected into some sort of searchable tool, so that we could get all the words for mother in a single shot and find out the real numbers. At the moment, I can't even figure out how to access any of the lists they claim to have, even one at a time.