
Ain’t that sweet. Reflections on scene level indexing and annotation in the House Corpus Project.

1, pages 151-181. ESE - Salento University Publishing (2019)
DOI: 10.1285/i9788883051531p151

Abstract

This paper outlines the strategies, rationale and potential uses motivating the construction of the House Corpus, a one-million-word corpus that can be accessed by authorised users through the MWSWeb site (Taibi et al. 2015a) at http://openmws.itd.cnr.it. Part 1 illustrates the tools and techniques used to index the corpus data: transcriptions of all 177 episodes in the House MD series (original US version). In particular, it describes the commercially available Elasticsearch (https://www.elastic.co), used as an indexing, annotational and search tool. Part 2 explains that this is a multimedia corpus allowing viewings of different types of scene. The 6000-plus scenes in the corpus have been annotated in terms of their typological features: Location type (e.g. patient’s hospital room, medical lab, etc.); Event type (handover, differential diagnosis, precipitating medical event, patient examination, etc.) and Character Group type...
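
To make the scene-level annotation scheme in the abstract concrete, here is a minimal sketch of how one annotated scene and a search over its typological features might be represented. It is not the House Corpus implementation: the field names (`location_type`, `event_type`, `character_group_type`, `transcript`), the `Scene` class and both helper functions are hypothetical, and the sketch assumes the annotation fields would be mapped as exact-match keyword fields. The query body uses the standard Elasticsearch Query DSL (`bool`, `filter`, `term`, `match`) but is only built as a dictionary here, so no server or client library is needed.

```python
from dataclasses import dataclass, asdict


@dataclass
class Scene:
    """One annotated scene. All field names are hypothetical."""
    episode: int
    location_type: str
    event_type: str
    character_group_type: str
    transcript: str


def build_scene_query(text=None, **annotations):
    """Build an Elasticsearch Query DSL body as a plain dict.

    Each keyword argument becomes an exact-match `term` filter on an
    annotation field (assumed to be keyword-mapped); `text`, if given,
    becomes a full-text `match` on the hypothetical `transcript` field.
    """
    filters = [{"term": {name: value}}
               for name, value in sorted(annotations.items())]
    bool_query = {"filter": filters}
    if text is not None:
        bool_query["must"] = [{"match": {"transcript": text}}]
    return {"query": {"bool": bool_query}}


def matches(scene, **annotations):
    """Local stand-in for the `term` filters, usable without a server."""
    record = asdict(scene)
    return all(record.get(name) == value
               for name, value in annotations.items())


# Illustrative record; the values are placeholders, not corpus data.
scene = Scene(
    episode=1,
    location_type="medical lab",
    event_type="differential diagnosis",
    character_group_type="(placeholder group)",
    transcript="(placeholder transcript text)",
)

query = build_scene_query(
    text="biopsy",
    event_type="differential diagnosis",
    location_type="medical lab",
)
```

Keeping the typological features as separate exact-match fields, apart from the free-text transcript, is one plausible design: it lets a user combine, say, an Event type and a Location type as filters while still searching the dialogue itself.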
