MultiCrawler: A Pipelined Architecture for Crawling and Indexing Semantic Web Data

The Semantic Web - ISWC 2006 (2006)

Abstract

The goal of the work presented in this paper is to obtain large amounts of semistructured data from the web. Harvesting semistructured data is a prerequisite to enabling large-scale query answering over web sources. We contrast our approach with conventional web crawlers, and describe and evaluate a five-step pipelined architecture to crawl and index data from both the traditional and the Semantic Web.

Description

SpringerLink - Book Chapter
