Friday, July 5, 2013

NLP - List Of World's Weirdest Languages

The World Atlas of Language Structures evaluates 2,676 different languages in terms of a bunch of different language features. These features include word order, types of sounds, ways of doing negation, and a lot of other things—192 different language features in total.

So rather than take an English-centric view of the world, WALS allows us take a worldwide view. That is, we evaluate each language in terms of how unusual it is for each feature. For example, English word order is subject-verb-object—there are 1,377 languages that are coded for word order in WALS and 35.5% of them have SVO word order. Meanwhile only 8.7% of languages start with a verb—like Welsh, Hawaiian and Majang—so cross-linguistically, starting with a verb is unusual. For what it’s worth, 41.0% of the world’s languages are actually SOV order. (Aside: I’ve done some work with Hawaiian and Majang and that’s how I learned that verbs are a big commitment for me. I’m just not ready for verbs when I open my mouth.)

The data in WALS is fairly sparse, so we restrict ourselves to the 165 features that have at least 100 languages in them (at this stage we also knock out languages that have fewer than 10 of these—dropping us down to 1,693 languages).

- Read the whole thing & check out the list here (very interesting)


No comments: