In this thesis, a number of possible solutions to source separation are suggested. Although they differ significantly in shape and intent, they share a heavy reliance on prior domain knowledge. Most of the developed algorithms are intended for speech applications, and hence, structural features of speech have been incorporated.
Single-channel separation of speech is a particularly challenging signal processing task, where the purpose is to extract a number of speech signals from a single observed mixture. I present a few methods to obtain separation, which rely on the sparsity and structure of speech in a time-frequency representation. My own contributions are based on learning dictionaries for each speaker separately and subsequently applying a concatenation of these dictionaries to separate a mixture. Sparse decompositions required for the decomposition are computed using non-negative matrix factorization as well as basis pursuit.
In my work on the multi-channel problem, I have focused on convolutive mixtures, which is the appropriate model in acoustic setups. We have been successful in incorporating a harmonic speech model into a greater probabilistic formulation. Furthermore, we have presented several learning schemes for the parameters of such models, more specifically, the expectation-maximization (EM) algorithm and stochastic and Newton-type gradient optimization.
%0 Thesis
%1 olsson2007algorithms
%A Olsson, Rasmus Kongsgaard
%D 2007
%K audio separation source
%T Algorithms for Source Separation - with Cocktail Party Applications
%X In this thesis, a number of possible solutions to source separation are suggested. Although they differ significantly in shape and intent, they share a heavy reliance on prior domain knowledge. Most of the developed algorithms are intended for speech applications, and hence, structural features of speech have been incorporated.
Single-channel separation of speech is a particularly challenging signal processing task, where the purpose is to extract a number of speech signals from a single observed mixture. I present a few methods to obtain separation, which rely on the sparsity and structure of speech in a time-frequency representation. My own contributions are based on learning dictionaries for each speaker separately and subsequently applying a concatenation of these dictionaries to separate a mixture. Sparse decompositions required for the decomposition are computed using non-negative matrix factorization as well as basis pursuit.
In my work on the multi-channel problem, I have focused on convolutive mixtures, which is the appropriate model in acoustic setups. We have been successful in incorporating a harmonic speech model into a greater probabilistic formulation. Furthermore, we have presented several learning schemes for the parameters of such models, more specifically, the expectation-maximization (EM) algorithm and stochastic and Newton-type gradient optimization.
@mastersthesis{olsson2007algorithms,
abstract = {In this thesis, a number of possible solutions to source separation are suggested. Although they differ significantly in shape and intent, they share a heavy reliance on prior domain knowledge. Most of the developed algorithms are intended for speech applications, and hence, structural features of speech have been incorporated.
Single-channel separation of speech is a particularly challenging signal processing task, where the purpose is to extract a number of speech signals from a single observed mixture. I present a few methods to obtain separation, which rely on the sparsity and structure of speech in a time-frequency representation. My own contributions are based on learning dictionaries for each speaker separately and subsequently applying a concatenation of these dictionaries to separate a mixture. Sparse decompositions required for the decomposition are computed using non-negative matrix factorization as well as basis pursuit.
In my work on the multi-channel problem, I have focused on convolutive mixtures, which is the appropriate model in acoustic setups. We have been successful in incorporating a harmonic speech model into a greater probabilistic formulation. Furthermore, we have presented several learning schemes for the parameters of such models, more specifically, the expectation-maximization (EM) algorithm and stochastic and Newton-type gradient optimization.},
added-at = {2011-09-30T14:06:50.000+0200},
author = {Olsson, Rasmus Kongsgaard},
biburl = {https://www.bibsonomy.org/bibtex/264c40230ce9a0dfed59ffd5c2bcddb58/nosebrain},
interhash = {551fc83a10baf6a81e555f01ca5297b3},
intrahash = {64c40230ce9a0dfed59ffd5c2bcddb58},
keywords = {audio separation source},
timestamp = {2012-11-04T17:54:17.000+0100},
title = {Algorithms for Source Separation - with Cocktail Party Applications},
year = 2007
}