Welcome to the Symposium on Signal Processing Technology and Research, where we bring together leading researchers in the field. This event will cover a range of topics, with a focus on audio, video, language, and biomedical signals. Below you will find an overview of our distinguished speakers and their talks. Join us for an insightful day of discussions and networking!
Time | Speaker | Topic |
---|---|---|
10:00-10:05 | Nilesh Madhu | Introduction |
10:05-10:45 | Prof. Jesper Rindom Jensen (Aalborg University, Denmark) | Robust Optimal Filtering for Acoustic Array Applications. Abstract: Acoustic array applications, such as sound zone control, demand robust filtering strategies capable of adapting to dynamic and uncertain acoustic conditions. This talk discusses recent methods designed to enhance the robustness and adaptability of acoustic array systems. A dictionary-based selection approach is introduced, which involves precomputing an extensive set of control filters tailored to diverse acoustic scenarios. By dynamically selecting the most appropriate filter using acoustic data only, the method effectively manages variability and uncertainty in acoustic environments. Additionally, a parametric modeling technique for adaptively modifying acoustic impulse responses is presented. This approach allows acoustic filters to continuously adapt to evolving environmental conditions, such as temperature variations, ensuring stable and reliable performance. Together, these strategies increase the robustness of acoustic filtering methods, offering practical solutions for real-world acoustic array applications. |
10:45-11:25 | Prof. Kris Demuynck (Ghent University - Imec, Belgium) | TBA |
11:25-11:45 | Break | |
11:45-12:25 | Dr. Kasper Claes (Imec, Belgium) | Voice analysis for Parkinson's (abstract TBA) |
12:25-12:55 | Jihyun Lee (Yonsei University, South Korea) | Multi-modal speech synthesis: bridging modalities with shared representations. Abstract: With the rapid advancement of deep learning, synthesizing speech from diverse modalities, such as text, images, video, articulatory data, and even brain signals, has become significantly more tractable. A major approach to this challenge is to learn shared latent representations between these modalities and speech. In this talk, we present our recent work on generating speech from face videos, articulatory data, and brain signals. We explore architectures and training techniques for high-quality multi-modal speech synthesis while addressing challenges such as preserving speaker identity and the shortage of paired training data. |
12:55-13:00 | Nilesh Madhu | Closing Words |
13:00 | Lunch | |
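The dictionary-based selection idea described in Prof. Jensen's abstract (precompute control filters for many acoustic scenarios, then pick the best one from measured acoustic data alone) can be illustrated with a toy sketch. Everything here is an illustrative assumption, not material from the talk: the filters are random FIR taps, and "best" is taken to mean least residual energy after filtering a microphone snippet.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical precomputed dictionary: one FIR control filter per
# candidate acoustic scenario (shapes are illustrative).
n_scenarios, filt_len = 4, 16
dictionary = rng.standard_normal((n_scenarios, filt_len))

def select_filter(dictionary, mic_signal):
    """Pick the dictionary entry whose filtered output leaves the
    least residual energy in the observed microphone signal
    (a stand-in criterion for this sketch)."""
    residuals = [
        np.sum(np.convolve(w, mic_signal, mode="same") ** 2)
        for w in dictionary
    ]
    best = int(np.argmin(residuals))
    return best, dictionary[best]

# Simulated microphone snippet; in practice this would be live
# acoustic data from the controlled zone.
mic_signal = rng.standard_normal(256)
idx, w = select_filter(dictionary, mic_signal)
print(f"selected scenario filter: {idx}")
```

The appeal of the approach is that the expensive filter design happens offline; at run time only the cheap selection step remains.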
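The shared-latent-representation idea in Jihyun Lee's abstract can likewise be sketched in miniature: two encoders map paired inputs from different modalities (say, video features and speech features) into a common latent space, trained so that paired embeddings coincide. The linear encoders, dimensions, and plain MSE alignment loss below are simplifying assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
d_video, d_speech, d_latent, n_pairs = 12, 20, 8, 64

# Synthetic paired data standing in for video and speech features.
video = rng.standard_normal((n_pairs, d_video))
speech = rng.standard_normal((n_pairs, d_speech))

# Toy linear "encoders" into the shared latent space.
W_v = rng.standard_normal((d_video, d_latent)) * 0.1
W_s = rng.standard_normal((d_speech, d_latent)) * 0.1

def align_loss(W_v, W_s):
    z_v, z_s = video @ W_v, speech @ W_s
    return np.mean((z_v - z_s) ** 2)

lr = 0.01
loss0 = align_loss(W_v, W_s)
for _ in range(200):
    diff = video @ W_v - speech @ W_s   # paired-embedding mismatch
    # Gradient descent on the MSE alignment loss for both encoders.
    W_v -= lr * 2 * video.T @ diff / (n_pairs * d_latent)
    W_s += lr * 2 * speech.T @ diff / (n_pairs * d_latent)
print(f"alignment loss: {loss0:.3f} -> {align_loss(W_v, W_s):.3f}")
```

Once such a space is learned, a decoder trained on speech embeddings can, in principle, synthesize speech from any modality that maps into the same space, which is what makes the shared representation the bridge between modalities.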