Article

Robust Multimodal Audio-Visual Processing for Advanced Context Awareness in Smart Spaces

Personal and Ubiquitous Computing, 13(1): 3–14 (2009)
DOI: 10.1007/s00779-007-0169-9

Abstract

Identifying people and tracking their locations is a key prerequisite to achieving context awareness in smart spaces. Moreover, in realistic context-aware applications, these tasks have to be carried out in a non-obtrusive fashion. In this paper we present a set of robust person identification and tracking algorithms based on audio and visual processing. A main characteristic of these algorithms is that they operate on far-field and unconstrained audio-visual streams, which ensures that they are non-intrusive. We also illustrate that combining their outputs can lead to composite multimodal tracking components suitable for supporting a broad range of context-aware services. In combining audio-visual processing results, we exploit a context-modeling approach based on a graph of situations. Accordingly, we discuss the implementation of realistic prototype applications that make use of the full range of audio, visual, and multimodal algorithms.
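The abstract gives no implementation detail on how the audio and visual tracker outputs are combined. As a rough, hypothetical illustration only (not the authors' method), the sketch below fuses a far-field audio localization estimate with a visual tracker estimate by inverse-variance weighting, a common generic way to merge two independent position estimates; all names and values are assumptions for the example.

```python
# Hypothetical sketch (not from the paper): fusing one-axis position
# estimates from an audio localizer and a visual tracker by
# inverse-variance weighting, so the more confident modality dominates.

from dataclasses import dataclass


@dataclass
class Estimate:
    x: float         # position along one room axis, in metres
    variance: float  # modality-specific uncertainty of the estimate


def fuse(audio: Estimate, video: Estimate) -> Estimate:
    """Combine two independent estimates of the same position.

    Each modality is weighted by the inverse of its variance; the
    fused variance is smaller than either input, reflecting the
    information gained by combining modalities.
    """
    wa = 1.0 / audio.variance
    wv = 1.0 / video.variance
    fused_x = (wa * audio.x + wv * video.x) / (wa + wv)
    fused_var = 1.0 / (wa + wv)
    return Estimate(fused_x, fused_var)


if __name__ == "__main__":
    # Far-field audio is typically noisier than a calibrated camera,
    # so it gets a larger (assumed) variance here.
    audio = Estimate(x=2.4, variance=0.50)
    video = Estimate(x=2.1, variance=0.05)
    print(fuse(audio, video))  # Estimate(x=2.127..., variance=0.045...)
```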
