Use this identifier to cite or link to this record: http://hdl.handle.net/10071/20833
Full record
DC Field | Value | Language
dc.contributor.author | Ricardo Rei | -
dc.contributor.author | Nuno Miguel Guerreiro | -
dc.contributor.author | Batista, F. | -
dc.contributor.editor | Lesot, Marie-Jeanne and Vieira, Susana and Reformat, Marek Z. and Carvalho, João Paulo and Wilbik, Anna and Bouchon-Meunier, Bernadette and Yager, Ronald R. | -
dc.date.accessioned | 2020-11-17T15:15:43Z | -
dc.date.available | 2020-11-17T15:15:43Z | -
dc.date.issued | 2020 | -
dc.identifier.isbn | 978-3-030-50146-4 | -
dc.identifier.uri | http://hdl.handle.net/10071/20833 | -
dc.description.abstract | This paper describes an approach for automatic capitalization of text without case information, such as spoken transcripts of video subtitles, produced by automatic speech recognition systems. Our approach is based on pre-trained contextualized word embeddings, requires only a small portion of data for training when compared with traditional approaches, and is able to achieve state-of-the-art results. The paper reports experiments both on general written data from the European Parliament, and on video subtitles, revealing that the proposed approach is suitable for performing capitalization, not only in each one of the domains, but also in a cross-domain scenario. We have also created a versatile multilingual model, and the conducted experiments show that good results can be achieved both for monolingual and multilingual data. Finally, we applied domain adaptation by finetuning models, initially trained on general written data, on video subtitles, revealing gains over other approaches not only in performance but also in terms of computational cost. | eng
dc.language.iso | eng | -
dc.publisher | Springer International Publishing | -
dc.relation | UIDB/50021/2020 | -
dc.relation | 038510 | -
dc.rights | openAccess | -
dc.title | Automatic truecasing of video subtitles using BERT: a multilingual adaptable approach | eng
dc.type | conferenceObject | -
dc.event.title | IPMU 2020: Information Processing and Management of Uncertainty in Knowledge-Based Systems | -
dc.event.type | Conferência | pt
dc.event.date | 2020 | -
dc.pagination | 708 - 721 | -
dc.peerreviewed | yes | -
dc.journal | Information Processing and Management of Uncertainty in Knowledge-Based Systems | -
degois.publication.firstPage | 708 | -
degois.publication.lastPage | 721 | -
degois.publication.title | Automatic truecasing of video subtitles using BERT: a multilingual adaptable approach | eng
dc.date.updated | 2020-11-17T15:14:24Z | -
dc.description.version | info:eu-repo/semantics/publishedVersion | -
dc.identifier.doi | 10.1007/978-3-030-50146-4_52 | -
dc.subject.fos | Domínio/Área Científica::Engenharia e Tecnologia::Outras Engenharias e Tecnologias | por
dc.subject.fos | Domínio/Área Científica::Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática | por
dc.subject.fos | Domínio/Área Científica::Humanidades::Línguas e Literaturas | por
iscte.subject.ods | Indústria, inovação e infraestruturas | por
iscte.identifier.ciencia | https://ciencia.iscte-iul.pt/id/ci-pub-72401 | -
iscte.alternateIdentifiers.scopus | 2-s2.0-85086257662 | -
Appears in collections: IT-CRI - Comunicações a conferências internacionais

Files in this record:
File | Description | Size | Format
Rei2020_Chapter_AutomaticTruecasingOfVideoSubt.pdf | Publisher's version | 385.68 kB | Adobe PDF



All records in the repository are protected by copyright law, with all rights reserved.