
University of Sheffield: Description of the LaSIE-II System as Used for MUC-7

Proc. MUC-7, SAIC, (1998). On-line proceedings, http:\\www.muc.saic.com/.

Abstract

The University of Sheffield NLP group took part in MUC-7 using the LaSIE-II system, an evolution of the LaSIE (Large Scale Information Extraction) system first created for participation in MUC-6 and part of a larger research effort into information extraction underway in our group. LaSIE-II was used to carry out all five of the MUC-7 tasks and was, in fact, the only system to take part in all of the MUC-7 tasks. While LaSIE-II is significantly different from the earlier version (differences are detailed below), there are no radical changes in the basic philosophy of the approach. This could be described as seeking a pragmatic middle way in the shallow vs. deep analysis debate which has characterised the last several MUCs. That is, while aware that information extraction tasks may not require full text understanding, and hence that systems should be optimised to make use of shallow techniques where appropriate, we have not wanted to preclude the application of arbitrarily sophisticated linguistic analysis techniques where these may prove useful. The result is an eclectic mixture of techniques including finite state recognition of domain-specific lexical patterns, partial parsing using a restricted context-free grammar, simplified semantic representation of each sentence in the text, and a formal representation of the whole discourse from which all of the IE task results and the coreference task results are derived. From our perspective, LaSIE-II should not be viewed as the expression of a theory about how to do IE, but as a laboratory in which ongoing experiments with different component NL processing techniques and, most importantly, their interaction are being carried out. Seen this way, one of the most important developments in LaSIE-II is its modularised architecture and integration into the GATE platform (see below), which has enabled us to gain much deeper insights into the strengths and weaknesses of the system's components and the ways in which these interact.
