Inproceedings,

Terminology Evolution Module for Web Archives in the LiWA Context

N. Tahmasebi, G. Zenz, T. Iofciu, and T. Risse.
Proc. of 10th International Web Archiving Workshop in conjunction with iPRES in Vienna, Austria (2010)

Abstract

More and more national libraries and institutes are archiving the web as part of the cultural heritage. As with all long-term archives, these archives contain text and language that evolve over time. This is particularly true for web archives, as content published online is highly dynamic and changes at a fast rate. Language evolution causes gaps between the terminology used for querying and the terminology stored in long-term archives. To ensure access to and interpretability of these archives, language evolution must be detected and handled automatically. In this paper we present the LiWA terminology evolution module, TeVo, which takes us one step closer to fully automatic detection of terminology evolution. TeVo consists of a pipeline, based on the UIMA framework, for finding evolution in web archives. The module consists of two main processing chains: the first for WARC file extraction and text processing, and the second for finding terminology evolution. We also present the terminology evolution browser, the TeVo browser, which aids in exploring the evolution of terms present in archives.

BibTeX key: L3S_2aaf8c67a65197717cd4f40e01d9c29987f5277d
entry type: inproceedings
booktitle: Proc. of 10th International Web Archiving Workshop in conjunction with iPRES in Vienna, Austria, 2010
year: 2010
Document: http://www.l3s.de/~risse/pub/iwaw2010.pdf
