Abstract
In this work, we explore parameter-efficient learning (PEL) techniques to repurpose a general-purpose speech model (GSM) for Arabic dialect identification (ADI). Specifically, we investigate different setups for incorporating trainable features into a multi-layer encoder-decoder GSM while keeping the pre-trained weights frozen. Our architecture includes residual adapters and model reprogramming (input prompting). We design a token-level label mapping to condition the GSM for ADI. With vanilla fine-tuning, we achieve new state-of-the-art accuracy on the ADI-17 dataset. We further reduce the training budget with the PEL method, which comes within 1.86% accuracy of fine-tuning while using only 2.5% (extra) trainable network parameters. Our study demonstrates how to identify Arabic dialects using a small dataset and limited computation. Open-source code is available at https://github.com/Srijith-rkr/KAUST-Whisper-Adapter.
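As a rough illustration of what a token-level label mapping could look like, the sketch below assigns each dialect label to one vocabulary token and classifies by comparing only the logits at those tokens. It is a minimal sketch under stated assumptions, not the authors' implementation: the three labels, the token ids, the 400-entry vocabulary, and the function name `predict_dialect` are all hypothetical, and the toy logits stand in for what a frozen speech model would produce at one decoding step.

```python
import math

# Hypothetical mapping from dialect labels to vocabulary token ids.
# The ids are placeholders, not taken from any real tokenizer.
LABEL_TO_TOKEN = {"EGY": 101, "SAU": 205, "MAR": 309}


def predict_dialect(logits, label_to_token=LABEL_TO_TOKEN):
    """Return (label, probabilities) from one decoding step's vocabulary logits.

    Only the logits at the mapped token ids are compared; every other
    vocabulary entry is ignored.
    """
    labels = list(label_to_token)
    scores = [logits[label_to_token[lab]] for lab in labels]
    # Numerically stable softmax over the restricted set of tokens.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = {lab: e / total for lab, e in zip(labels, exps)}
    best = max(probs, key=probs.get)
    return best, probs


# Toy logits over a 400-entry vocabulary (assumed size for illustration).
toy_logits = [0.0] * 400
toy_logits[205] = 2.0  # token mapped to "SAU"
toy_logits[17] = 9.0   # unmapped token, ignored by the mapping
label, probs = predict_dialect(toy_logits)
print(label)  # SAU
```

Restricting the comparison to the mapped tokens means a high score on an unrelated token (index 17 here) cannot change the predicted dialect.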
Original language | English (US) |
---|---|
Pages | 1958-1962 |
Number of pages | 5 |
State | Published - 2023 |
Event | 24th International Speech Communication Association, Interspeech 2023 - Dublin, Ireland. Duration: Aug 20 2023 → Aug 24 2023 |
Conference
Conference | 24th International Speech Communication Association, Interspeech 2023 |
---|---|
Country/Territory | Ireland |
City | Dublin |
Period | 08/20/23 → 08/24/23 |
Bibliographical note
Publisher Copyright: © 2023 International Speech Communication Association. All rights reserved.
Keywords
- Arabic Dialect
- Dialect Identification
- Parameter-Efficient Learning
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modeling and Simulation