Abstract
This book aims to provide an overview of general deep learning methodology
and its applications to a variety of signal and information processing tasks. The
application areas are chosen according to the following three criteria: 1) expertise or
knowledge of the authors; 2) the application areas that have already been
transformed by the successful use of deep learning technology, such as speech
recognition and computer vision; and 3) the application areas that have the
potential to be impacted significantly by deep learning and that have gained
concentrated research efforts, including natural language and text processing,
information retrieval, and multimodal information processing empowered by
multi-task deep learning.
In Chapter 1, we provide the background of deep
learning, as intrinsically connected to the use of multiple layers of nonlinear
transformations to derive features from sensory signals such as speech and
visual images. In the most recent literature, deep learning is also embodied as
representation learning, which involves a hierarchy of features or concepts in which
higher-level representations are defined from lower-level ones and in which
the same lower-level representations help to define higher-level ones. In Chapter
2, a brief historical account of deep learning is presented. In particular,
selected chronological developments in speech recognition are used to illustrate
the recent impact of deep learning, which has become a dominant technology in
the speech recognition industry within only a few years of the start of a
collaboration between academic and industrial researchers in applying deep
learning to speech recognition. In Chapter 3, a three-way classification scheme
for a large body of work in deep learning is developed. We classify a growing
number of deep learning techniques into unsupervised, supervised, and hybrid
categories, and present qualitative descriptions and a literature survey for each
category. From Chapter 4 to Chapter 6, we discuss in detail three popular deep
networks and related learning methods, one in each category. Chapter 4 is devoted
to deep autoencoders as a prominent example of the unsupervised deep learning
techniques. Chapter 5 gives a major example in the hybrid deep network category,
which is the discriminative feed-forward neural network for supervised learning
with many layers initialized using layer-by-layer generative, unsupervised
pre-training. In Chapter 6, deep stacking networks and several of their variants
are discussed in detail, which exemplify the discriminative or supervised deep
learning techniques in the three-way categorization scheme.
In Chapters
7-11, we select a set of typical and successful applications of deep learning in
diverse areas of signal and information processing and of applied artificial
intelligence. In Chapter 7, we review the applications of deep learning to speech
and audio processing, with emphasis on speech recognition organized according to
several prominent themes. In Chapter 8, we present recent results of applying
deep learning to language modeling and natural language processing. Chapter 9 is
devoted to selected applications of deep learning to information retrieval
including Web search. In Chapter 10, we cover selected applications of deep
learning to image object recognition in computer vision. Selected applications of
deep learning to multi-modal processing and multi-task learning are reviewed in
Chapter 11. Finally, an epilogue is given in Chapter 12 to summarize what we
have presented in earlier chapters and to discuss future challenges and
directions.