Full metadata record
DC Field | Value | Language
dc.contributor.author | Bico, M. I. | -
dc.contributor.author | Baptista, J. | -
dc.contributor.author | Batista, F. | -
dc.contributor.author | Cardeira, E. | -
dc.contributor.editor | Silvello, G., Corcho, O., Manghi, P., Di Nunzio, G. M., Golub, K., Ferro, N., and Poggi, A. | -
dc.identifier.citation | Bico, M. I., Baptista, J., Batista, F., & Cardeira, E. (2022). Early experiments on automatic annotation of Portuguese medieval texts. In G. Silvello, O. Corcho, P. Manghi, G. M. Di Nunzio, K. Golub, N. Ferro, & A. Poggi (Eds.), Lecture notes in computer science: Vol. 13541. Linking theory and practice of digital libraries (pp. 442-449). Springer. | -
dc.description.abstract | This paper presents the challenges and solutions adopted to the lemmatization and part-of-speech (PoS) tagging of a corpus of Old Portuguese texts (up to 1525), to pave the way to the implementation of an automatic annotation of these Medieval texts. A highly granular tagset, previously devised for Modern Portuguese, was adapted to this end. A large text (∼155 thousand words) was manually annotated for PoS and lemmata and used to train an initial PoS-tagger model. When applied to two other texts, the resulting model attained 91.2% precision with a textual variant of the same text, and 67.4% with a new, unseen text. A second model was then trained with the data provided by the previous three texts and applied to two other unseen texts. The new model achieved a precision of 77.3% and 82.4%, respectively. | eng
dc.publisher | Springer International Publishing | -
dc.relation | info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F50021%2F2020/PT | -
dc.relation | info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDP%2F00214%2F2020/PT | -
dc.relation.ispartof | Linking theory and practice of digital libraries. Lecture Notes in Computer Science | -
dc.subject | Automatic annotation | eng
dc.subject | Part-of-speech tagging | eng
dc.subject | Old portuguese | eng
dc.title | Early experiments on automatic annotation of Portuguese medieval texts | eng
dc.event.title | 26th International Conference on Theory and Practice of Digital Libraries, TPDL 2022 | -
dc.pagination | 442 - 449 | -
Appears in Collections: IT-CRI - Comunicações a conferências internacionais
