bookmarks  4

  •  

    An ontology is a computer-processable collection of knowledge about the world. This thesis explains how an ontology can be constructed and expanded automatically. The proposed approach consists of three contributions: 1. A core ontology, YAGO. YAGO is an ontology that has been constructed automatically. It combines high accuracy with large coverage and serves as a core that can be expanded. 2. A tool for information extraction, LEILA. LEILA is a system that can extract knowledge from natural language texts. LEILA will be used to find new facts for YAGO. 3. An integration mechanism, SOFIE. SOFIE is a system that can reason on the plausibility of new knowledge. SOFIE will assess the facts found by LEILA and integrate them into YAGO. Each of these components comes with a fully implemented system. Together, they form an integrative architecture, which not only gathers new facts but also reconciles them with the existing facts. The result is an ever-growing, yet highly accurate ontological knowledge base. A survey of applications of the ontology completes the thesis.
    16 years ago by @sia
     
  •  

    We present a taxonomy automatically generated from the system of categories in Wikipedia. Categories in the resource are identified as either classes or instances and included in a large subsumption (i.e. isa) hierarchy. The taxonomy is made available in RDFS format to the research community, e.g. for direct use within AI applications or to bootstrap the process of manual ontology creation.
    16 years ago by @sia
     
  •  

    This paper presents the process of acquiring a large, domain-independent taxonomy from the German Wikipedia. We build upon a previously implemented platform that extracts a semantic network and taxonomy from the English version of Wikipedia. We describe two accomplishments of our work: the semantic network for the German language, in which isa links are identified and annotated, and an expansion of the platform for easy adaptation to a new language. We identify the platform's strengths and shortcomings, which stem from the scarcity of free processing resources for languages other than English. We show that the taxonomy induction process is highly reliable: evaluated against GermaNet, the German version of WordNet, the resource obtained shows an accuracy of 83.34%.
    16 years ago by @sia
     
  •  

    This paper presents an automatic method for differentiating between instances and classes in a large-scale taxonomy induced from the Wikipedia category network. The method exploits characteristics of the category names and the structure of the network. The approach we present is the first attempt to make this distinction automatically in a large-scale resource. In contrast, this distinction has been made in WordNet and Cyc based on manual annotations. The result of the process is evaluated against ResearchCyc. On the subnetwork shared by our taxonomy and ResearchCyc, we report 84.52% accuracy.
    16 years ago by @sia
     