Please use this identifier to cite or link to this item:
http://hdl.handle.net/10071/25398
Author(s): | Barreiro, A.; Batista, F. |
Date: | 2016 |
Title: | Machine translation of non-contiguous multiword units |
Pages: | 22-30 |
Event title: | Proceedings of a meeting held 17 June 2016, San Diego, California, USA. Held at the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2016) |
ISBN: | 978-1-5108-2521-5 |
Abstract: | Non-adjacent linguistic phenomena such as non-contiguous multiwords and other phrasal units containing insertions, i.e., words that are not part of the unit, are difficult to process and remain a problem for NLP applications. Non-contiguous multiword units are common across languages and constitute some of the most important challenges to high-quality machine translation. This paper presents an empirical analysis of non-contiguous multiwords, and highlights our use of the Logos Model and the Semtab function to deploy semantic knowledge to align non-contiguous multiword units with the goal of translating these units with high fidelity. The phrase-level manual alignments illustrated in the paper were produced with the CLUE-Aligner, a Cross-Language Unit Elicitation alignment tool. |
Peerreviewed: | yes |
Access type: | Open Access |
Appears in Collections: | IT-CRI - Comunicações a conferências internacionais |
Files in This Item:
File | Description | Size | Format
---|---|---|---
conference_object_28921.pdf | Accepted version | 572.2 kB | Adobe PDF