Audio and speech enhancement: emerging trends in Deep Learning

Theme and Scope

In the last decade, deep learning has revolutionized the research fields of audio and speech signal processing and acoustic scene analysis. In these fields, methods relying on deep learning have achieved remarkable performance across various applications and tasks, surpassing legacy approaches that combine hand-crafted signal processing operations with separately designed machine learning algorithms. The huge success of deep learning methods stems from their ability to learn representations from audio and speech signals that are useful for a variety of downstream tasks.

The typical methodology adopted in these tasks consists of extracting and manipulating useful information from audio and speech streams to drive the execution of automated services. In addition, obtaining reliable performance on data recorded in real acoustic environments, where several unpredictable sources of corruption (such as background noise, reverberation, and multiple interferers) always degrade algorithm behavior, is a challenge of fundamental importance.

In light of all the aforementioned aspects, it is of great interest to the scientific community to understand the effectiveness of novel computational algorithms for audio and speech processing that operate under these environmental conditions and are able to enhance the quality of recorded signals in order to successfully fulfill specific tasks, such as machine listening, automatic diarization, auditory scene analysis, music information retrieval, and many others. Moreover, recent advances in Deep Learning models that directly handle raw acoustic data make use of cross-domain approaches to exploit the information contained in diverse kinds of environmental audio signals.

The aim of this session is therefore to present the most recent advances in Deep Learning for audio and speech enhancement, covering a wide range of processing tasks and applications in real acoustic environments.


Potential topics include, but are not limited to:

  • Machine Learning for Speech and Audio Processing

  • Cross-domain Audio Analysis

  • Deep Learning for Audio Applications in Real Acoustic Environments

  • Audio-based Security Systems and Surveillance

  • Speech and Audio Forensic Applications

  • Transfer Learning for Changing Environments

  • End-to-end Audio Processing and Learning

  • Meta-Learning of Machine Listening Models

  • Separation and Localization of Real Recorded Audio Sources

  • Computational Acoustic Scene Understanding

  • Computational Methods for Wireless Acoustic Sensor Networks

  • Computational Audio Denoising and Dereverberation

  • Deep-ad-hoc Acoustic Beamforming

  • Context-aware Audio Interfaces

  • Adversarial Techniques for Audio-Metric Learning

Important Dates

  • Paper submission: January 31, 2022

  • Decision notifications: April 26, 2022

  • Camera-ready papers: May 23, 2022


Manuscripts intended for the special session should be submitted via the paper submission website of WCCI 2022 as regular submissions. All papers submitted to special sessions will be subject to the same peer-review procedure as regular papers. Accepted papers will be part of the regular conference proceedings.

Paper submission guidelines:


Download the PDF flyer.