Article

A community approach to data integration: Authorship and building meaningful links across diverse archaeological data sets

Geosphere, 1 (2): 97-109 (2005)
DOI: 10.1130/GES00013.1

Abstract

The ability to link and compare diverse archaeological data sets will catalyze innovative research of great scope and analytic rigor. However, information heterogeneity, limited budgets, and limited information technology skills challenge data dissemination initiatives. This paper argues for new methods of community-based data integration pioneered by the University of Chicago's Extensible Markup Language (XML) System for Textual and Archaeological Research project (XSTAR). With XSTAR, data integration takes place in two steps: (1) syntactic-schematic integration, in which legacy data sets are migrated for representation in the data structures described by the Archaeological Markup Language (ArchaeoML), and (2) semantic integration, in which mappings are established between related terms and classes in each source database. Because the nuances of meaning are often very subtle, human experts must classify related items in each data set. Initial syntactic-schematic mapping of data into XSTAR is simple and fast but occurs at a relatively abstract level of meaning. Nevertheless, this initial step can accommodate diverse archaeological (and other) data sets, and will facilitate community-led development of more semantically specific data integration. XSTAR aims to enable multiple semantic data integration schemas to develop and keep pace with changing research agendas. Though rooted in archaeology, this paper discusses challenges faced by many disciplines in encouraging more powerful diachronic and regional syntheses. ArchaeoML's highly generalized data model has applicability outside archaeology, especially with subdisciplines of the earth sciences that have yet to develop formal ontologies. In addition, because this is a community-driven approach, incentives for community participation must be explored. Intellectual property and professional rewards are key factors in determining the success of online dissemination systems across many disciplines.
