Please use this identifier to cite or link to this item: http://hdl.handle.net/10071/25398
Author(s): Barreiro, A.; Batista, F.
Date: 2016
Title: Machine translation of non-contiguous multiword units
Pages: 22 - 30
Event title: Proceedings of a meeting held 17 June 2016, San Diego, California, USA. Held at the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2016)
ISBN: 978-1-5108-2521-5
Abstract: Non-adjacent linguistic phenomena such as non-contiguous multiwords and other phrasal units containing insertions, i.e., words that are not part of the unit, are difficult to process and remain a problem for NLP applications. Non-contiguous multiword units are common across languages and constitute some of the most important challenges to high-quality machine translation. This paper presents an empirical analysis of non-contiguous multiwords and highlights our use of the Logos Model and the Semtab function to deploy semantic knowledge to align non-contiguous multiword units, with the goal of translating these units with high fidelity. The phrase-level manual alignments illustrated in the paper were produced with CLUE-Aligner, a Cross-Language Unit Elicitation alignment tool.
Peer reviewed: yes
Access type: Open Access
Appears in Collections: IT-CRI - Comunicações a conferências internacionais

Files in This Item:
File: conference_object_28921.pdf
Description: Accepted version
Size: 572.2 kB
Format: Adobe PDF



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.