Domain ontology learning from the web

D. Sánchez. The Knowledge Engineering Review, 24 (04): 413--413 (2009)

Abstract

Ontology Learning is defined as the set of methods used for building from scratch, enriching or adapting an existing ontology in a semi-automatic fashion using heterogeneous information sources. This data-driven procedure uses text, electronic dictionaries, linguistic ontologies and structured and semi-structured information to acquire knowledge. Recently, with the enormous growth of the Information Society, the Web has become a valuable source of information for almost every possible domain of knowledge. This has motivated researchers to start considering the Web as a valid repository for Information Retrieval and Knowledge Acquisition. However, the Web suffers from problems that are not typically observed in classical information repositories: human-oriented presentation, noise, untrusted sources, high dynamicity and overwhelming size. Even so, it also presents characteristics that can be interesting for knowledge acquisition: owing to its huge size and heterogeneity, the Web has been assumed to approximate the real distribution of information in humankind. The present work presents a novel approach to ontology learning, introducing new methods for knowledge acquisition from the Web. It is distinguished from previous work by its adaptation of several well-known learning techniques to the Web corpus and its exploitation of particular characteristics of the Web environment, which together yield an automatic, unsupervised and domain-independent approach. With respect to the ontology building process, the following methods have been developed: i) extraction and selection of domain-related terms, organising them in a taxonomical way; ii) discovery and labelling of non-taxonomical relationships between concepts; iii) additional methods for improving the final structure, including the detection of named entities, class features, multiple inheritance and also a certain degree of semantic disambiguation.
The full learning methodology has been implemented in a distributed agent-based fashion, providing a scalable solution. It has been evaluated on several clearly distinct domains of knowledge, with good-quality results. Finally, several direct applications have been developed, including automatic structuring of digital libraries and web resources, and ontology-based Web Information Retrieval.

Links and resources

BibTeX key: sanchez2009domain
entry type: article
year: 2009
journal: The Knowledge Engineering Review
number: 04
pages: 413--413
publisher: Cambridge Univ Press
volume: 24
timestamp: 2010-11-10 11:09:12
username: dbenz
intrahash: 1d6ef9dbccf2f21c8395427fefa8a8a9
issn: 0269-8889
file: sanchez2009domain.pdf:sanchez2009domain.pdf:PDF
interhash: 958a2cf02d6bdad93c5a61fd952385e6
journalpub: 1
groups: public
url: http://scholar.google.de/scholar.bib?q=info:1b5eMmkxoXoJ:scholar.google.com/&output=citation&hl=de&as_sdt=2000&ct=citation&cd=45

Cite this publication

%0 Journal Article
%1 sanchez2009domain
%A Sánchez, D.
%D 2009
%I Cambridge Univ Press
%J The Knowledge Engineering Review
%K ol_web2.0 background
%N 04
%P 413--413
%T Domain ontology learning from the web
%U http://scholar.google.de/scholar.bib?q=info:1b5eMmkxoXoJ:scholar.google.com/&output=citation&hl=de&as_sdt=2000&ct=citation&cd=45
%V 24
%X Ontology Learning is defined as the set of methods used for building from scratch, enriching or adapting an existing ontology in a semi-automatic fashion using heterogeneous information sources. This data-driven procedure uses text, electronic dictionaries, linguistic ontologies and structured and semi-structured information to acquire knowledge. Recently, with the enormous growth of the Information Society, the Web has become a valuable source of information for almost every possible domain of knowledge. This has motivated researchers to start considering the Web as a valid repository for Information Retrieval and Knowledge Acquisition. However, the Web suffers from problems that are not typically observed in classical information repositories: human oriented presentation, noise, untrusted sources, high dynamicity and overwhelming size. Even though, it also presents characteristics that can be interesting for knowledge acquisition: due to its huge size and heterogeneity it has been assumed that the Web approximates the real distribution of the information in humankind. The present work introduces a novel approach for ontology learning, introducing new methods for knowledge acquisition from the Web. The adaptation of several well known learning techniques to the web corpus and the exploitation of particular characteristics of the Web environment composing an automatic, unsupervised and domain independent approach distinguishes the present proposal from previous works. With respect to the ontology building process, the following methods have been developed: i) extraction and selection of domain related terms, organising them in a taxonomical way; ii) discovery and label of non-taxonomical relationships between concepts; iii) additional methods for improving the final structure, including the detection of named entities, class features, multiple inheritance and also a certain degree of semantic disambiguation. The full learning methodology has been implemented in a distributed agent-based fashion, providing a scalable solution. It has been evaluated for several well distinguished domains of knowledge, obtaining good quality results. Finally, several direct applications have been developed, including automatic structuring of digital libraries and web resources, and ontology-based Web Information Retrieval.

@article{sanchez2009domain,
  abstract = {Ontology Learning is defined as the set of methods used for building from scratch, enriching or adapting an existing ontology in a semi-automatic fashion using heterogeneous information sources. This data-driven procedure uses text, electronic dictionaries, linguistic ontologies and structured and semi-structured information to acquire knowledge. Recently, with the enormous growth of the Information Society, the Web has become a valuable source of information for almost every possible domain of knowledge. This has motivated researchers to start considering the Web as a valid repository for Information Retrieval and Knowledge Acquisition. However, the Web suffers from problems that are not typically observed in classical information repositories: human oriented presentation, noise, untrusted sources, high dynamicity and overwhelming size. Even though, it also presents characteristics that can be interesting for knowledge acquisition: due to its huge size and heterogeneity it has been assumed that the Web approximates the real distribution of the information in humankind. The present work introduces a novel approach for ontology learning, introducing new methods for knowledge acquisition from the Web. The adaptation of several well known learning techniques to the web corpus and the exploitation of particular characteristics of the Web environment composing an automatic, unsupervised and domain independent approach distinguishes the present proposal from previous works. With respect to the ontology building process, the following methods have been developed: i) extraction and selection of domain related terms, organising them in a taxonomical way; ii) discovery and label of non-taxonomical relationships between concepts; iii) additional methods for improving the final structure, including the detection of named entities, class features, multiple inheritance and also a certain degree of semantic disambiguation. The full learning methodology has been implemented in a distributed agent-based fashion, providing a scalable solution. It has been evaluated for several well distinguished domains of knowledge, obtaining good quality results. Finally, several direct applications have been developed, including automatic structuring of digital libraries and web resources, and ontology-based Web Information Retrieval.},
  added-at = {2011-02-17T17:43:15.000+0100},
  author = {S{\'a}nchez, D.},
  biburl = {https://www.bibsonomy.org/bibtex/21d6ef9dbccf2f21c8395427fefa8a8a9/dbenz},
  file = {sanchez2009domain.pdf:sanchez2009domain.pdf:PDF},
  groups = {public},
  interhash = {958a2cf02d6bdad93c5a61fd952385e6},
  intrahash = {1d6ef9dbccf2f21c8395427fefa8a8a9},
  issn = {0269-8889},
  journal = {The Knowledge Engineering Review},
  journalpub = {1},
  keywords = {ol_web2.0 background},
  number = 04,
  pages = {413--413},
  publisher = {Cambridge Univ Press},
  timestamp = {2013-07-31T15:39:42.000+0200},
  title = {Domain ontology learning from the web},
  url = {http://scholar.google.de/scholar.bib?q=info:1b5eMmkxoXoJ:scholar.google.com/&output=citation&hl=de&as_sdt=2000&ct=citation&cd=45},
  username = {dbenz},
  volume = 24,
  year = 2009
}
