Automatic Acquisition of Taxonomies in Different Languages from Multiple Wikipedia Versions

Key:	DRS11-1
Author:	Renato Domínguez García, Christoph Rensing, Ralf Steinmetz
Date:	September 2011
Kind:	In proceedings
Publisher:	ACM International Conference Proceedings Series ACM Inc.
Book title:	Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies
Abstract:	In recent years, the vision of the Semantic Web has led to many approaches that aim to automatically derive knowledge bases from Wikipedia. These approaches rely mostly on the English Wikipedia, as it is the largest Wikipedia version, and have led to valuable knowledge bases. However, each Wikipedia version contains socio-cultural knowledge, i.e., knowledge with specific relevance for a culture or language. One difficulty in applying existing approaches to multiple Wikipedia versions is their use of additional corpora. In this paper, we describe the adaptation of existing heuristics that makes it possible to extract large sets of hyponymy relations from multiple Wikipedia versions with little information about each language. Further, we evaluate our approach with Wikipedia versions in four different languages and compare the results with GermaNet for German and WordNet for English.

The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.