Please use this identifier to cite or link to this item: http://hdl.handle.net/10071/16690
Author(s): Silva, S.; Ribeiro, R.; Pereira, R.
Editor: Pedro Rangel Henriques; José Paulo Leal; António Menezes Leitão; Xavier Gómez Guinovart
Date: 2018
Title: Less is more in incident categorization
Volume: 62
ISSN: 2190-6807
ISBN: 978-3-95977-072-9
DOI (Digital Object Identifier): 10.4230/OASIcs.SLATE.2018.17
Keywords: Machine learning; Automated incident categorization; SVM; Incident management; Natural language
Abstract: The IT incident management process requires correct categorization so that incident tickets are assigned to the right resolution group and the system is restored to operation as quickly as possible, with minimal impact on the business and its customers. In this work, we introduce automatic text classification, demonstrating the application of several natural language processing techniques and analyzing the impact of each one on a dataset of real incident tickets. The techniques we explore for pre-processing the text that describes an incident are: tokenization, stemming, stop-word removal, named-entity recognition, and TFxIDF-based document representation. Finally, to build the model and observe the results after applying these techniques, we use two machine learning algorithms: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). Two important findings result from this study: a shorter description of an incident is better than a full description; and pre-processing has little impact on incident categorization, mainly due to the specific vocabulary used in this type of text.
Peer reviewed: yes
Access type: Open Access
Appears in Collections: ISTAR-CRI - Comunicações a conferências internacionais
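
The pipeline named in the abstract (tokenization, stop-word removal, TFxIDF representation, then a classifier) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the toy tickets, the category labels, the stop-word list, the smoothed IDF formula, and the use of cosine similarity for KNN are all assumptions made for illustration, and stemming, named-entity recognition, and the SVM classifier are left out.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list and toy tickets, for illustration only.
STOP_WORDS = {"the", "is", "a", "to", "on", "my", "not", "and", "i"}

TRAIN = [
    ("cannot connect to vpn from home", "network"),
    ("vpn connection drops every hour", "network"),
    ("wifi network is not reachable", "network"),
    ("printer is not printing pages", "hardware"),
    ("laptop screen is broken", "hardware"),
    ("printer paper jam on second floor", "hardware"),
]


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stop-words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def fit_idf(docs):
    """Compute a smoothed inverse document frequency per term (assumed formula)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(tokens, idf):
    """Map tokens to a sparse TFxIDF vector; terms unseen in training are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    return {term: freq * idf[term] for term, freq in tf.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_predict(text, train_vectors, idf, k=3):
    """Return the majority label among the k most similar training tickets."""
    query = tfidf(tokenize(text), idf)
    ranked = sorted(train_vectors, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Fit the representation on the toy training tickets.
train_tokens = [tokenize(text) for text, _ in TRAIN]
IDF = fit_idf(train_tokens)
TRAIN_VECTORS = [
    (tfidf(tokens, IDF), label) for tokens, (_, label) in zip(train_tokens, TRAIN)
]


def categorize(text, k=3):
    """Categorize a new ticket description with the toy KNN model."""
    return knn_predict(text, TRAIN_VECTORS, IDF, k)


print(categorize("vpn is not working"))
print(categorize("printer jam"))
```

With real ticket data, the same steps would more likely be done with an established library, and the short and full description fields could each be passed to `categorize` to compare them, as the study does with its own dataset.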

Files in This Item:
File                        Description          Size         Format
OASIcs-SLATE-2018-17.pdf    Publisher version    350.36 kB    Adobe PDF



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.