Abstract
Speaker diarization is the task of identifying 'who spoke when'. Today, audio recordings of speakers are usually accompanied by visual information, and recent work has substantially improved speaker diarization performance by exploiting the visual information synchronized with the audio in Audio-Visual (AV) content. This paper presents a deep learning architecture for an AV speaker diarization system that emphasizes Voice Activity Detection (VAD). Traditional AV speaker diarization systems use hand-crafted features, such as Mel-frequency cepstral coefficients, to perform VAD. In contrast, the VAD module in our proposed system employs Convolutional Neural Networks (CNN) to learn and extract features directly from the audio waveforms. Experimental results on the AMI Meeting Corpus indicate that the proposed multimodal speaker diarization system reaches a state-of-the-art VAD False Alarm rate thanks to the CNN-based VAD, which in turn boosts the whole system's performance.
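To make the headline metric concrete, the following minimal sketch shows one common way a frame-level VAD false alarm rate can be computed. It is not taken from the paper: the function name `false_alarm_rate`, the toy label sequences, and the choice to normalize by the number of reference non-speech frames are all assumptions made for illustration. Diarization scoring often normalizes false-alarm time by total reference speech time instead, and the abstract does not say which convention the authors use.

```python
def false_alarm_rate(reference, hypothesis):
    """Fraction of reference non-speech frames that the detector labeled as speech.

    Both arguments are equal-length sequences of 0 (non-speech) and 1 (speech).
    Assumption: the rate is normalized by the count of non-speech frames.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same number of frames")

    non_speech = 0
    false_alarms = 0
    for ref, hyp in zip(reference, hypothesis):
        if not ref:
            non_speech += 1
            if hyp:
                # Detector fired on a frame the reference marks as non-speech.
                false_alarms += 1

    # With no non-speech frames there is nothing to falsely detect.
    return false_alarms / non_speech if non_speech else 0.0


# Hypothetical frame labels, for illustration only.
reference = [0, 0, 1, 1, 1, 0, 0, 0, 1, 0]
hypothesis = [0, 1, 1, 1, 0, 0, 0, 1, 1, 0]
rate = false_alarm_rate(reference, hypothesis)
print(f"{rate:.3f}")  # prints 0.333 (2 false alarms over 6 non-speech frames)
```

A lower value means the detector passes fewer non-speech frames on to the later diarization stages, which is the mechanism the abstract credits for the system-level improvement.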
Original language | English (US) |
---|---|
Title of host publication | MWSCAS 2022 - 65th IEEE International Midwest Symposium on Circuits and Systems, Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9781665402798 |
State | Published - 2022 |
Event | 65th IEEE International Midwest Symposium on Circuits and Systems, MWSCAS 2022 - Fukuoka, Japan; Duration: Aug 7 2022 → Aug 10 2022 |
Publication series
Name | Midwest Symposium on Circuits and Systems |
---|---|
Volume | 2022-August |
ISSN (Print) | 1548-3746 |
Conference
Conference | 65th IEEE International Midwest Symposium on Circuits and Systems, MWSCAS 2022 |
---|---|
Country/Territory | Japan |
City | Fukuoka |
Period | 08/7/22 → 08/10/22 |
Bibliographical note
Publisher Copyright: © 2022 IEEE.
Keywords
- audio-visual
- convolutional neural networks
- deep learning
- false alarm rate
- speaker diarization
- voice activity detection
ASJC Scopus subject areas
- Electronic, Optical and Magnetic Materials
- Electrical and Electronic Engineering