Most current language identification (LID) systems make little or no use of prosodic information, despite the importance of prosody in LID by humans. The greatest obstacle has been finding an appropriate feature set that captures linguistically relevant prosodic information. The only system to attempt LID entirely on the basis of prosodic variables uses a set of over 200 features, which are selected and combined in a task-specific manner. We apply a novel recurrent neural network model to the task of pairwise discrimination among languages. Network inputs are limited to delta-F0 (the frame-to-frame change in fundamental frequency) and the first difference of the band-limited amplitude envelope. Initial results are based on all pairwise combinations of English, German, Japanese, Mandarin, and Spanish, with 90 speakers per language.
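As an illustration of the two input streams and the pairwise setup described above, the following is a minimal sketch and not the authors' implementation. It assumes F0 and the amplitude envelope are already available as per-frame sequences, and that unvoiced frames are marked with an F0 of 0.0; the function names and the handling of unvoiced frames are hypothetical choices made for this example.

```python
from itertools import combinations

# The five languages named in the abstract.
LANGUAGES = ["English", "German", "Japanese", "Mandarin", "Spanish"]


def first_difference(track):
    """Return frame-to-frame differences x[t] - x[t-1] of a sequence."""
    return [b - a for a, b in zip(track, track[1:])]


def delta_f0(f0_track):
    """Frame-to-frame change in F0.

    Assumption (not from the source): unvoiced frames are marked with 0.0,
    and the delta is set to 0.0 whenever either neighbouring frame is unvoiced.
    """
    return [
        (b - a) if a > 0 and b > 0 else 0.0
        for a, b in zip(f0_track, f0_track[1:])
    ]


def network_inputs(f0_track, envelope):
    """Pair the two difference streams frame by frame as (delta-F0, delta-envelope)."""
    return list(zip(delta_f0(f0_track), first_difference(envelope)))


# All unordered language pairs used for pairwise discrimination.
language_pairs = list(combinations(LANGUAGES, 2))
```

With five languages, `language_pairs` contains ten pairs, so a pairwise evaluation of this kind would involve ten two-way discrimination tasks.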
|Original language||English (US)|
|Title of host publication||6th European Conference on Speech Communication and Technology, EUROSPEECH 1999|
|Publisher||International Speech Communication Association (ISCA)|
|Number of pages||4|
|State||Published - Jan 1 1999|