Use this identifier to cite or link to this record:
http://hdl.handle.net/10071/26157
Full record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Bico, M. I. | - |
dc.contributor.author | Baptista, J. | - |
dc.contributor.author | Batista, F. | - |
dc.contributor.author | Cardeira, E. | - |
dc.contributor.editor | Silvello, G., Corcho, O., Manghi, P., Di Nunzio, G. M., Golub, K., Ferro, N., and Poggi, A. | - |
dc.date.accessioned | 2022-09-22T13:40:28Z | - |
dc.date.available | 2022-09-22T13:40:28Z | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | Bico, M. I., Baptista, J., Batista, F., & Cardeira, E. (2022). Early experiments on automatic annotation of Portuguese medieval texts. In G. Silvello, O. Corcho, P. Manghi, G. M. Di Nunzio, K. Golub, N. Ferro, & A. Poggi (Eds.), Lecture notes in computer science: Vol. 13541. Linking theory and practice of digital libraries (pp. 442-449). Springer. https://doi.org/10.1007/978-3-031-16802-4_44 | - |
dc.identifier.isbn | 978-3-031-16802-4 | - |
dc.identifier.issn | 0302-9743 | - |
dc.identifier.uri | http://hdl.handle.net/10071/26157 | - |
dc.description.abstract | This paper presents the challenges and solutions adopted to the lemmatization and part-of-speech (PoS) tagging of a corpus of Old Portuguese texts (up to 1525), to pave the way to the implementation of an automatic annotation of these Medieval texts. A highly granular tagset, previously devised for Modern Portuguese, was adapted to this end. A large text (∼155 thousand words) was manually annotated for PoS and lemmata and used to train an initial PoS-tagger model. When applied to two other texts, the resulting model attained 91.2% precision with a textual variant of the same text, and 67.4% with a new, unseen text. A second model was then trained with the data provided by the previous three texts and applied to two other unseen texts. The new model achieved a precision of 77.3% and 82.4%, respectively. | eng |
dc.language.iso | eng | - |
dc.publisher | Springer International Publishing | - |
dc.relation | info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F50021%2F2020/PT | - |
dc.relation | UI/BD/152806/2022 | - |
dc.relation | info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDP%2F00214%2F2020/PT | - |
dc.relation.ispartof | Linking theory and practice of digital libraries. Lecture Notes in Computer Science | - |
dc.rights | openAccess | - |
dc.subject | Automatic annotation | eng |
dc.subject | Lemmatization | eng |
dc.subject | Part-of-speech tagging | eng |
dc.subject | Old Portuguese | eng |
dc.title | Early experiments on automatic annotation of Portuguese medieval texts | eng |
dc.type | conferenceObject | - |
dc.event.title | 26th International Conference on Theory and Practice of Digital Libraries, TPDL 2022 | - |
dc.event.type | Conferência | pt |
dc.event.location | Padua | eng |
dc.event.date | 2022 | - |
dc.pagination | 442 - 449 | - |
dc.peerreviewed | yes | - |
dc.volume | 13541 | - |
dc.date.updated | 2022-09-22T14:34:49Z | - |
dc.description.version | info:eu-repo/semantics/acceptedVersion | - |
dc.identifier.doi | 10.1007/978-3-031-16802-4_44 | - |
iscte.identifier.ciencia | https://ciencia.iscte-iul.pt/id/ci-pub-90833 | - |
Appears in collections: | IT-CRI - Comunicações a conferências internacionais |
Files in this record:
File | Size | Format | |
---|---|---|---|
conferenceobject_90833.pdf | 428.17 kB | Adobe PDF | View/Open |
All records in the repository are protected by copyright law, with all rights reserved.