Abstract
We present a new approach to integrating annotation data from public
sources for the expression analysis of genes and proteins. Expression
data is materialized in a data warehouse supporting high performance
for data-intensive analysis tasks. In contrast, annotation
data is integrated virtually according to analysis needs. Our virtual
integration uses the commercial product SRS (Sequence Retrieval
System) from LION bioscience. To couple the data warehouse and SRS,
we implemented a query mediator that exploits correspondences between
molecular-biological objects; these correspondences are captured explicitly
from public data sources. This hybrid integration approach has been implemented for
a large gene expression warehouse and supports functional analysis
using annotation data from Gene Ontology, LocusLink and Ensembl.
The paper motivates the chosen approach, details the integration
concept and implementation, and provides results of preliminary performance
tests.
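
The coupling idea can be illustrated with a minimal sketch. It is not the paper's mediator and does not use the SRS interface. It only assumes that expression values are held locally (the materialized side) and that annotation objects are reached at query time through a table of correspondences (the virtual side). All names, identifiers and values below (`EXPRESSION`, `CORRESPONDENCES`, `fetch_annotations`, `g1`, `A:001`, and so on) are hypothetical, and the in-memory dictionary merely stands in for a lookup against a remote annotation source.

```python
from collections import defaultdict

# Materialized side: expression values keyed by a warehouse gene id (made-up data).
EXPRESSION = {"g1": 2.5, "g2": 0.4, "g3": 3.1}

# Correspondences between objects of different sources (made-up identifiers).
# Each entry maps (source, accession) to related objects in other sources.
# In a real setting this lookup would be answered by the annotation source.
CORRESPONDENCES = {
    ("warehouse", "g1"): [("annotation_a", "A:001")],
    ("warehouse", "g2"): [("annotation_a", "A:002")],
    ("warehouse", "g3"): [("annotation_a", "A:001"), ("annotation_b", "B:007")],
}


def fetch_annotations(source, min_expression):
    """Group genes at or above an expression threshold by annotation in `source`."""
    result = defaultdict(list)
    for gene, value in sorted(EXPRESSION.items()):
        if value < min_expression:
            continue  # filter on the materialized expression data first
        # Follow correspondences from the warehouse object to annotation objects.
        for target_source, accession in CORRESPONDENCES.get(("warehouse", gene), []):
            if target_source == source:
                result[accession].append(gene)
    return dict(result)


groups = fetch_annotations("annotation_a", min_expression=2.0)
print(groups)
```

Filtering on the local expression data before following correspondences keeps the number of annotation lookups small, which is one plausible reason to combine a warehouse with virtual integration.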