Audio and speech enhancement: emerging trends in Deep Learning

Theme and Scope

In the last decade, deep learning has revolutionized the research fields of audio and speech signal processing and acoustic scene analysis. In these fields, methods relying on deep learning have achieved remarkable performance across various applications and tasks, surpassing legacy approaches that combine hand-crafted signal processing operations with separately designed machine learning algorithms. The huge success of deep learning methods stems from their ability to learn representations from audio and speech signals that are useful for a variety of downstream tasks.

The typical methodology adopted in these tasks consists of extracting and manipulating useful information from audio and speech streams to drive the execution of automated services. In addition, obtaining reliable performance on data recorded in real acoustic environments, where several unpredictable sources of corruption (such as background noise, reverberation, and multiple interferers) always degrade algorithm behavior, is a challenge of fundamental importance.

In light of all the aforementioned aspects, it is of great interest to the scientific community to understand the effectiveness of novel computational algorithms for audio and speech processing that operate under these environmental conditions and are able to enhance the quality of recorded signals in order to successfully fulfill specific tasks, such as machine listening, automatic diarization, auditory scene analysis, music information retrieval, and many others. Moreover, recent advances in Deep Learning models that directly handle raw acoustic data make use of cross-domain approaches to exploit the information contained in diverse kinds of environmental audio signals.

The aim of this session is therefore to present the most recent advances in Deep Learning for audio and speech enhancement, covering a wide range of processing tasks and applications in real acoustic environments.


Potential topics include, but are not limited to:

  • Machine Learning for Speech and Audio Processing

  • Cross-domain Audio Analysis

  • Deep Learning for Audio Applications in Real Acoustic Environments

  • Audio-based Security Systems and Surveillance

  • Speech and Audio Forensic Applications

  • Transfer Learning for Changing Environments

  • End-to-end Audio Processing and Learning

  • Meta-Learning of Machine Listening Models

  • Separation and Localization of Real Recorded Audio Sources

  • Computational Acoustic Scene Understanding

  • Computational Methods for Wireless Acoustic Sensor Networks

  • Computational Audio Denoising and Dereverberation

  • Deep-ad-hoc Acoustic Beamforming

  • Context-aware Audio Interfaces

  • Adversarial Techniques for Audio-Metric Learning

Important Dates

  • Paper submission: January 31, 2022

  • Decision notifications: April 26, 2022

  • Camera-ready papers: May 23, 2022


Manuscripts intended for the special session should be submitted via the paper submission website of WCCI 2022 as regular submissions. All papers submitted to special sessions will be subject to the same peer-review procedure as regular papers. Accepted papers will be part of the regular conference proceedings.

Paper submission guidelines:


Download the PDF flyer.