Speaker age estimation for elderly speech recognition in European Portuguese

Dias, J.; Pellegrini, T.; Hedayati, V.; Trancoso, I.; Hämäläinen, A.

Use this identifier to cite this record: http://hdl.handle.net/10071/25451

Full record

DC field	Value	Language
dc.contributor.author	Dias, J.	-
dc.contributor.author	Pellegrini, T	-
dc.contributor.author	Hedayati, V.	-
dc.contributor.author	Trancoso, I.	-
dc.contributor.author	Hämäläinen, A.	-
dc.contributor.editor	Chng E.S., Li H., Meng H., Ma B. and Xie L.	-
dc.date.accessioned	2022-05-19T09:49:31Z	-
dc.date.available	2022-05-19T09:49:31Z	-
dc.date.issued	2014-01-01	-
dc.identifier.isbn	9781634394352	-
dc.identifier.issn	2308-457X	-
dc.identifier.uri	http://hdl.handle.net/10071/25451	-
dc.description.abstract	Phone-like acoustic models (AMs) used in large-vocabulary automatic speech recognition (ASR) systems are usually trained with speech collected from young adult speakers. Using such models, ASR performance may decrease by about 10% absolute when transcribing elderly speech. Ageing is known to alter speech production in ways that require ASR systems to be adapted, in particular at the level of acoustic modeling. In this study, we investigated automatic age estimation in order to select age-specific adapted AMs. A large corpus of read speech from European Portuguese speakers aged 60 or over was used. Age estimation (AE) based on i-vectors and support vector regression achieved mean error rates of about 4.2 and 4.5 years for males and females, respectively. Compared with a baseline ASR system with AMs trained using young adult speech and a WER of 13.9%, the selection of five-year-range adapted AMs, based on the estimated age of the speakers, led to a decrease in WER of about 9.3% relative (1.3% absolute). Comparable gains in ASR performance were observed when considering two larger age ranges (60-75 and 76-90) instead of six five-year ranges, suggesting that it would be sufficient to use the two large ranges only.	eng
dc.language.iso	eng	-
dc.publisher	International Speech and Communication Association	-
dc.relation	UID/MULTI/0446/2013	-
dc.rights	openAccess	-
dc.subject	Automatic speech recognition	eng
dc.subject	Elderly speech	eng
dc.subject	Automatic age estimation	eng
dc.subject	I-vector extraction	eng
dc.title	Speaker age estimation for elderly speech recognition in European Portuguese	eng
dc.type	conferenceObject	-
dc.event.title	Celebrating the Diversity of Spoken Languages	-
dc.event.type	Conference	eng
dc.event.location	Singapore	eng
dc.event.date	2014	-
dc.peerreviewed	yes	-
dc.journal	15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014)	-
degois.publication.location	Singapore	eng
degois.publication.title	Speaker age estimation for elderly speech recognition in European Portuguese	eng
dc.date.updated	2022-05-19T10:48:25Z	-
dc.description.version	info:eu-repo/semantics/acceptedVersion	-
dc.subject.fos	Domain/Scientific Area::Natural Sciences::Physical Sciences	eng
iscte.identifier.ciencia	https://ciencia.iscte-iul.pt/id/ci-pub-22986	-
iscte.alternateIdentifiers.scopus	2-s2.0-84910028544	-
Appears in collections:	ISTAR-CRI - Papers at international conferences
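
The abstract describes selecting an age-adapted acoustic model from the estimated speaker age, and reports the WER gain in both relative and absolute terms. The following is a minimal Python sketch of those two steps, not the authors' implementation. The range boundaries in `FIVE_YEAR_UPPER`, the clamping of out-of-range estimates, and the function names `select_range` and `relative_reduction` are assumptions made for illustration; the abstract only states that six five-year ranges and the two ranges 60-75 and 76-90 were compared.

```python
from bisect import bisect_left

# Assumed inclusive upper bounds of six five-year ranges spanning ages 60-90.
# The abstract does not give the exact boundaries; these are hypothetical.
FIVE_YEAR_UPPER = [65, 70, 75, 80, 85, 90]
# The two larger ranges named in the abstract: 60-75 and 76-90.
TWO_RANGE_UPPER = [75, 90]


def select_range(estimated_age, upper_bounds, lowest=60):
    """Return the (low, high) age range whose adapted AM would be selected."""
    # Assumption: estimates outside the covered span fall back to the nearest range.
    age = min(max(round(estimated_age), lowest), upper_bounds[-1])
    # Index of the first range whose upper bound is >= the (rounded) age.
    i = bisect_left(upper_bounds, age)
    low = lowest if i == 0 else upper_bounds[i - 1] + 1
    return (low, upper_bounds[i])


def relative_reduction(baseline_wer, absolute_drop):
    """Relative WER reduction in percent for a given absolute drop in WER."""
    return 100.0 * absolute_drop / baseline_wer


if __name__ == "__main__":
    estimate = 72.4  # hypothetical output of an age estimator, in years
    print(select_range(estimate, FIVE_YEAR_UPPER))  # (71, 75)
    print(select_range(estimate, TWO_RANGE_UPPER))  # (60, 75)
    # Baseline WER of 13.9% and a 1.3% absolute drop, as stated in the abstract.
    print(round(relative_reduction(13.9, 1.3), 2))  # 9.35
```

The last line reproduces the arithmetic behind the reported figures: a 1.3% absolute drop from a 13.9% baseline is roughly 9.3% relative. With a mean estimation error of about 4 to 5 years, an estimate could plausibly fall into a neighbouring five-year range, which may be one reason the two wider ranges perform comparably; the abstract itself only reports that the gains were comparable.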

Files in this record:

File	Description	Size	Format
conferenceobject_22986f.pdf	Accepted version	84.99 kB	Adobe PDF
