Abstract

This paper addresses one of the largest and most complex data curation workflows in existence: Wikipedia and Wikidata with a high number of users and curators adding factual information from external sources via a non-systematic Wiki workflow to Wikipedia’s infoboxes and Wikidata items. We present high-level analyses of the current state, the challenges and limitations in this workflow and supplement it with a quantitative and semantic analysis of the resulting data spaces by deploying DBpedia’s integration and extraction capabilities. Based on an analysis of millions of references from Wikipedia infoboxes in different languages, we can find the most important sources which can be used to enrich other knowledge bases with information of better quality. An initial tool is presented, the GlobalFactSync browser, as a prototype to discuss further measures to develop a more systematic approach for data curation in the WikiVerse.
