Please use this identifier to cite or link to this item:

Title: Mining categorical sequences from data using a hybrid clustering method
Authors: De Angelis, L.
Dias, J. G.
Keywords: Data mining
Sequential data
Hidden Markov models
Categorical data
Issue Date: 2014
Publisher: Elsevier
Abstract: The identification of different dynamics in sequential data has become an everyday need in scientific fields such as marketing, bioinformatics, finance, and the social sciences. Unlike cross-sectional or static data, this type of observation (also known as stream data, temporal data, longitudinal data, or repeated measures) is more challenging because the clustering process has to incorporate data dependency. In this research we focus on clustering categorical sequences. The proposed method combines model-based and heuristic clustering. In the first step, an extension of the hidden Markov model transforms the categorical sequences into a probabilistic space in which a symmetric Kullback-Leibler distance can operate. In the second step, the sequences are clustered by applying hierarchical clustering to the resulting distance matrix. The paper illustrates the potential of this type of hybrid approach using a synthetic data set, the well-known Microsoft dataset of website users' search patterns, and a survey on job career dynamics.
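The second step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the hidden Markov model extension or the linkage rule, so the sketch starts from hypothetical probability vectors (`profiles`) standing in for the output of the model-based step, and it assumes average linkage. The function names are likewise invented for this example.

```python
import math


def symmetric_kl(p, q, eps=1e-12):
    """Symmetric Kullback-Leibler distance KL(p||q) + KL(q||p) between two
    discrete distributions given as equal-length lists of probabilities."""
    total = 0.0
    for pi, qi in zip(p, q):
        # Floor the probabilities to avoid log(0); an assumption of this sketch.
        pi = max(pi, eps)
        qi = max(qi, eps)
        # KL(p||q) + KL(q||p) simplifies to sum((p - q) * log(p / q)).
        total += (pi - qi) * math.log(pi / qi)
    return total


def distance_matrix(profiles):
    """Pairwise symmetric KL distances between all probability vectors."""
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = symmetric_kl(profiles[i], profiles[j])
    return dist


def average_linkage(dist, n_clusters):
    """Naive agglomerative clustering with average linkage (assumed here;
    the abstract does not name the linkage rule)."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = len(clusters[a]) * len(clusters[b])
                avg = sum(dist[i][j]
                          for i in clusters[a]
                          for j in clusters[b]) / pairs
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        # Merge the closest pair; b > a, so deleting b keeps index a valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical probability vectors, one per sequence, standing in for the
# probabilistic representation produced by the model-based first step.
profiles = [
    [0.90, 0.05, 0.05],
    [0.85, 0.10, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.90, 0.05],
]
clusters = average_linkage(distance_matrix(profiles), n_clusters=2)
print(clusters)
```

With these made-up vectors, the first two sequences and the last two sequences end up in separate clusters, because the within-pair distances are far smaller than the cross-pair distances. For real data, a tested library routine for hierarchical clustering would normally replace the naive loop above.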
Description: WOS:000331419800014 (Web of Science accession number)
ISSN: 0377-2217
Publisher version: The definitive version is available at:
Appears in Collections: DMQGE-RI - Artigos em revistas internacionais com arbitragem científica

Files in This Item:
File: publisher_version_Dias2014EJOR.pdf
Size: 780.03 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.