Signal processing techniques for robust sound event recognition

dc.contributor.advisor	Ferri Rabasa, Francesc Josep
dc.contributor.advisor	Cobos Serrano, Máximo
dc.contributor.author	Martín Morató, Irene
dc.contributor.other	Departament d'Informàtica	es_ES
dc.date.accessioned	2019-11-25T12:26:17Z
dc.date.available	2019-11-26T05:45:05Z
dc.date.issued	2019	es_ES
dc.date.submitted	25-11-2019	es_ES
dc.identifier.uri	https://hdl.handle.net/10550/72345
dc.description.abstract	The computational analysis of acoustic scenes is today a topic of major interest, with a growing community focused on designing machines capable of identifying and understanding the sounds produced in our environment, similar to how humans perform this task. Although this domain has not reached the industrial popularity of related audio domains such as speech recognition or music analysis, applications designed to identify the occurrence of sounds in a given scenario are rapidly increasing. These applications are usually limited to a set of sound classes, which must be defined beforehand. To train sound classification models, representative sets of sound events are recorded and used as training data. However, the acoustic conditions present during the collection of training examples may not coincide with the conditions during application testing. Background noise, overlapping sound events, or weakly segmented data, among others, may substantially affect audio data, lowering the actual performance of the learned models. To avoid such situations, machine learning systems have to be designed with the ability to generalize to data collected under conditions different from the ones seen during training. Traditionally, the techniques used for the computational understanding of sound events have been inspired by similar domains such as music or speech, so the features selected to represent acoustic events come from those specific domains. Most of the contributions of this thesis concern how such features can be suitably applied to sound event recognition, proposing specific methods to adapt the extracted features both within classical recognition approaches and within modern end-to-end convolutional neural networks. 
The objective of this thesis is therefore to develop novel signal processing techniques aimed at making the features that represent acoustic events more robust to adverse conditions that cause a mismatch between training and test conditions in model learning. To achieve this objective, we first analyze the importance of classical feature sets such as Mel-frequency cepstral coefficients (MFCCs) or the energies extracted from log-mel filterbanks, as well as the impact of noise, reverberation, or segmentation errors in diverse scenarios. We show that the performance of both classical and deep learning-based approaches is severely affected by these factors, and we propose novel signal processing techniques designed to improve their robustness by means of a non-linear transformation of feature vectors along the temporal axis. This transformation is based on the so-called event trace, which can be interpreted as an indicator of the temporal activity of the event within the feature space. Finally, we propose the use of the energy envelope as a target for event detection, which implies a change from a classification-based approach to a regression-oriented one.	es_ES
dc.format.extent	141 p.	es_ES
dc.language.iso	en	es_ES
dc.subject	audio classification	es_ES
dc.subject	support vector machines	es_ES
dc.subject	deep learning	es_ES
dc.subject	feature selection	es_ES
dc.subject	sound event recognition	es_ES
dc.title	Signal processing techniques for robust sound event recognition	es_ES
dc.type	doctoral thesis	es_ES
dc.subject.unesco	UNESCO::CIENCIAS TECNOLÓGICAS	es_ES
dc.embargo.terms	0 days	es_ES