Article

Attention based convolutional recurrent neural network for environmental sound classification

Z. Zhang, S. Xu, S. Zhang, T. Qiao, and S. Cao.
Neurocomputing, (2021)
DOI: https://doi.org/10.1016/j.neucom.2020.08.069

Abstract

Environmental sound classification (ESC) is a challenging problem due to the complexity of sounds. The classification performance is heavily dependent on the effectiveness of representative features extracted from the environmental sounds. However, ESC often suffers from the semantically irrelevant frames and silent frames. In order to deal with this, we employ a frame-level attention model to focus on the semantically relevant frames and salient frames. Specifically, we first propose a convolutional recurrent neural network to learn spectro-temporal features and temporal correlations. Then, we extend our convolutional RNN model with a frame-level attention mechanism to learn discriminative feature representations for ESC. We investigated the classification performance when using different attention scaling function and applying different layers. Experiments were conducted on ESC-50 and ESC-10 datasets. Experimental results demonstrated the effectiveness of the proposed method and our method achieved the state-of-the-art or competitive classification accuracy with lower computational complexity. We also visualized our attention results and observed that the proposed attention mechanism was able to lead the network to focus on the semantically relevant parts of environmental sounds.
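To make the idea of frame-level attention concrete, here is a minimal sketch of attention-weighted pooling over a sequence of frame features. It is an illustration only, not the authors' implementation: the function names `softmax` and `attention_pool`, the single scoring vector `w`, the choice of softmax as the scaling function, and the toy feature values are all assumptions. The abstract only states that several scaling functions and layers were compared.

```python
import math

def softmax(scores):
    """Normalize scores into weights that sum to 1 (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(frames, w):
    """Weight each frame by a learned relevance score and sum over time.

    frames: list of T feature vectors (one per time frame), each of length D.
    w: hypothetical scoring vector of length D (would be learned in practice).
    Returns the pooled D-dimensional vector and the T attention weights.
    """
    # One scalar relevance score per frame (dot product with w).
    scores = [sum(wi * xi for wi, xi in zip(w, frame)) for frame in frames]
    # Softmax is assumed here as the attention scaling function.
    weights = softmax(scores)
    dim = len(frames[0])
    pooled = [
        sum(a * frame[d] for a, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return pooled, weights

# Toy input: a silent frame, a salient frame, and a near-silent frame.
frames = [[0.0, 0.0], [1.0, 2.0], [0.1, 0.0]]
w = [1.0, 1.0]
pooled, weights = attention_pool(frames, w)
print(weights)  # the salient middle frame receives the largest weight
```

In this sketch, silent or irrelevant frames receive small weights and contribute little to the pooled representation, which is the effect the abstract attributes to the attention mechanism.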

BibTeX key: attention-based-crnn
Entry type: article
Year: 2021
Journal: Neurocomputing
Pages: 896-903
Volume: 453
ISSN: 0925-2312
DOI: https://doi.org/10.1016/j.neucom.2020.08.069
URL: https://www.sciencedirect.com/science/article/pii/S0925231220313618

Cite this publication

@article{attention-based-crnn,
  abstract = {Environmental sound classification (ESC) is a challenging problem due to the complexity of sounds. The classification performance is heavily dependent on the effectiveness of representative features extracted from the environmental sounds. However, ESC often suffers from the semantically irrelevant frames and silent frames. In order to deal with this, we employ a frame-level attention model to focus on the semantically relevant frames and salient frames. Specifically, we first propose a convolutional recurrent neural network to learn spectro-temporal features and temporal correlations. Then, we extend our convolutional RNN model with a frame-level attention mechanism to learn discriminative feature representations for ESC. We investigated the classification performance when using different attention scaling function and applying different layers. Experiments were conducted on ESC-50 and ESC-10 datasets. Experimental results demonstrated the effectiveness of the proposed method and our method achieved the state-of-the-art or competitive classification accuracy with lower computational complexity. We also visualized our attention results and observed that the proposed attention mechanism was able to lead the network to focus on the semantically relevant parts of environmental sounds.},
  added-at = {2022-07-12T01:34:06.000+0200},
  author = {Zhang, Zhichao and Xu, Shugong and Zhang, Shunqing and Qiao, Tianhao and Cao, Shan},
  biburl = {https://www.bibsonomy.org/bibtex/2da93181010963739d62d39ce11154b05/fachter},
  doi = {https://doi.org/10.1016/j.neucom.2020.08.069},
  interhash = {4b700231319dee6ac55e38bc13a1178d},
  intrahash = {da93181010963739d62d39ce11154b05},
  issn = {0925-2312},
  journal = {Neurocomputing},
  keywords = {attention_mechanism audio_classification convolutional_recurrent_neural_network thema:cnn_and_attention_methods_for_audio_classification},
  pages = {896-903},
  timestamp = {2022-07-12T10:20:36.000+0200},
  title = {Attention based convolutional recurrent neural network for environmental sound classification},
  url = {https://www.sciencedirect.com/science/article/pii/S0925231220313618},
  volume = 453,
  year = 2021
}
