Article,

Learning object identification rules for information integration

Information Systems, 26 (8): 607--633 (December 2001)
DOI: 10.1016/S0306-4379(01)00042-4

Abstract

When integrating information from multiple websites, the same data objects can exist in inconsistent text formats across sites, making it difficult to identify matching objects using exact text match. We have developed an object identification system called Active Atlas, which compares the objects’ shared attributes in order to identify matching objects. Certain attributes are more important for deciding if a mapping should exist between two objects. Previous methods of object identification have required manual construction of object identification rules or mapping rules for determining the mappings between objects. This manual process is time-consuming and error-prone. In our approach, Active Atlas learns to tailor mapping rules, through limited user input, to a specific application domain. The experimental results demonstrate that we achieve higher accuracy and require less user involvement than previous methods across various application domains.
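
To make the idea of comparing shared attributes with unequal importance concrete, here is a minimal sketch in Python. It is not the Active Atlas implementation: the function names, the weights, the threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration. In particular, the weights here are set by hand, whereas the abstract describes a system that learns its mapping rules from limited user input.

```python
from difflib import SequenceMatcher


def attribute_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1].

    SequenceMatcher is a stand-in measure (assumption), not the
    similarity function used in the paper.
    """
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(obj_a: dict, obj_b: dict, weights: dict) -> float:
    """Weighted average of similarities over attributes both objects share."""
    shared = [k for k in weights if k in obj_a and k in obj_b]
    total = sum(weights[k] for k in shared)
    if total == 0:
        return 0.0  # no weighted attribute in common, so no evidence of a match
    weighted = sum(
        weights[k] * attribute_similarity(obj_a[k], obj_b[k]) for k in shared
    )
    return weighted / total


def is_match(obj_a: dict, obj_b: dict, weights: dict, threshold: float = 0.8) -> bool:
    """Declare a mapping when the weighted score reaches the threshold.

    The default threshold is an illustrative assumption.
    """
    return match_score(obj_a, obj_b, weights) >= threshold
```

A hypothetical usage: with `weights = {"name": 3.0, "phone": 1.0}`, two records whose names agree but whose phone numbers differ still score highly, because the name attribute carries more weight. An exact text match on the full records would reject the same pair.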
