In Search of the Ur-Wikipedia: Universality, Similarity, and Translation in the Wikipedia Inter-language Link Network

Wikipedia has become one of the primary encyclopaedic information repositories on the World Wide Web. It started in 2001 with a single edition in the English language and has since expanded to more than 20 million articles in 283 languages. Criss-crossing between the Wikipedias is an inter-language link network, connecting the articles of one edition of Wikipedia to another. We describe characteristics of articles covered by nearly all Wikipedias and those covered by only a single language edition, we use the network to understand how we can judge the similarity between Wikipedias based on concept coverage, and we investigate the flow of translation between a selection of the larger Wikipedias. Our findings indicate that the relationships between Wikipedia editions follow Tobler’s first law of geography: similarity decreases with increasing distance. The number of articles in a Wikipedia edition is found to be the strongest predictor of similarity, while language similarity also appears to have an influence. The English Wikipedia edition is by far the primary source of translations. We discuss the impact of these results for Wikipedia as well as user-generated content communities in general.


Sorin Adam Matei

Sorin Adam Matei, Professor of Communication at Purdue University, studies the relationship between information technology and social groups. He has published papers and articles in the Journal of Communication, Communication Research, Information Society, and Foreign Policy. He is the author or co-editor of several books, most recently Structural Differentiation in Social Media. He also co-edited Ethical Reasoning in Big Data, Transparency in Social Media, and Roles, Trust, and Reputation in Social Media Knowledge Markets: Theory and Methods (Computational Social Sciences), all three products of the NSF-funded KredibleNet project. Dr. Matei's teaching portfolio includes classes on online interaction and on online community analytics and development. His teaching makes use of a number of software platforms he has co-developed, such as Visible Effort. Dr. Matei is also known for his media work. He is a former BBC World Service journalist whose contributions have been published in Esquire and several leading Romanian newspapers. In Romania, he is known for his books Boierii Mintii (The Mind Boyars), Idolii forului (Idols of the Forum), and Idei de schimb (Spare Ideas).

