Please use this identifier to cite or link to this item: http://hdl.handle.net/10071/26157
Full metadata record
DC Field | Value | Language
dc.contributor.author | Bico, M. I. | -
dc.contributor.author | Baptista, J. | -
dc.contributor.author | Batista, F. | -
dc.contributor.author | Cardeira, E. | -
dc.contributor.editor | Silvello, G., Corcho, O., Manghi, P., Di Nunzio, G. M., Golub, K., Ferro, N., and Poggi, A. | -
dc.date.accessioned | 2022-09-22T13:40:28Z | -
dc.date.available | 2022-09-22T13:40:28Z | -
dc.date.issued | 2022 | -
dc.identifier.citation | Bico, M. I., Baptista, J., Batista, F., & Cardeira, E. (2022). Early experiments on automatic annotation of Portuguese medieval texts. In G. Silvello, O. Corcho, P. Manghi, G. M. Di Nunzio, K. Golub, N. Ferro, & A. Poggi (Eds.), Lecture notes in computer science: Vol. 13541. Linking theory and practice of digital libraries (pp. 442-449). Springer. https://doi.org/10.1007/978-3-031-16802-4_44 | -
dc.identifier.isbn | 978-3-031-16802-4 | -
dc.identifier.issn | 0302-9743 | -
dc.identifier.uri | http://hdl.handle.net/10071/26157 | -
dc.description.abstract | This paper presents the challenges and solutions adopted to the lemmatization and part-of-speech (PoS) tagging of a corpus of Old Portuguese texts (up to 1525), to pave the way to the implementation of an automatic annotation of these Medieval texts. A highly granular tagset, previously devised for Modern Portuguese, was adapted to this end. A large text (∼155 thousand words) was manually annotated for PoS and lemmata and used to train an initial PoS-tagger model. When applied to two other texts, the resulting model attained 91.2% precision with a textual variant of the same text, and 67.4% with a new, unseen text. A second model was then trained with the data provided by the previous three texts and applied to two other unseen texts. The new model achieved a precision of 77.3% and 82.4%, respectively. | eng
dc.language.iso | eng | -
dc.publisher | Springer International Publishing | -
dc.relation | info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F50021%2F2020/PT | -
dc.relation | UI/BD/152806/2022 | -
dc.relation | info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDP%2F00214%2F2020/PT | -
dc.relation.ispartof | Linking theory and practice of digital libraries. Lecture Notes in Computer Science | -
dc.rights | openAccess | -
dc.subject | Automatic annotation | eng
dc.subject | Lemmatization | eng
dc.subject | Part-of-speech tagging | eng
dc.subject | Old portuguese | eng
dc.title | Early experiments on automatic annotation of Portuguese medieval texts | eng
dc.type | conferenceObject | -
dc.event.title | 26th International Conference on Theory and Practice of Digital Libraries, TPDL 2022 | -
dc.event.type | Conferência | pt
dc.event.location | Padua | eng
dc.event.date | 2022 | -
dc.pagination | 442 - 449 | -
dc.peerreviewed | yes | -
dc.volume | 13541 | -
dc.date.updated | 2022-09-22T14:34:49Z | -
dc.description.version | info:eu-repo/semantics/acceptedVersion | -
dc.identifier.doi | 10.1007/978-3-031-16802-4_44 | -
iscte.identifier.ciencia | https://ciencia.iscte-iul.pt/id/ci-pub-90833 | -
Appears in Collections: IT-CRI - Comunicações a conferências internacionais

Files in This Item:
File | Size | Format
conferenceobject_90833.pdf | 428,17 kB | Adobe PDF



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.