Improving semi-supervised learning for audio classification with FixMatch

Grollmisch, Sascha; Cano, Estefanía

doi:10.3390/electronics10151807

Article, Wed., 28 July 2021, CC BY 4.0

Published

Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. Recent SSL methods have in common that they rely strongly on the augmentation of unannotated data, which remains largely unexplored for audio data. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks covering music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNNs) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the CNN baseline performance using the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only for the most challenging dataset from acoustic scene classification, showing that there is still room for improvement.

Classification

Published in: Electronics : open access journal
Vol. 10, No. 15 (28.07.2021), Art. No. 1807
Volume: 10
Issue: 15
Date of creation: 09.05.2022
Date of publication: 28.07.2021
DOI: 10.3390/electronics10151807
PPN: 1772135992
Language: English
Resource type: Text
Extent: 20 pages
Keywords: semi-supervised learning; deep learning; industrial sound analysis; music information retrieval; acoustic scene classification
DDC subject group of the DNB: 621.3 Electrical engineering, electronics
Institution: Technische Universität Ilmenau, Fakultät für Elektrotechnik und Informationstechnik

Cite

Citation form:

10.3390/electronics10151807

Rights

Use and reproduction:
