Use this identifier to cite or link to this record: http://hdl.handle.net/10071/9550
Full record
DC Field | Value | Language
dc.contributor.author | Silvestre, C. | -
dc.contributor.author | Cardoso, M. G. M. S. | -
dc.contributor.author | Figueiredo, M. | -
dc.date.accessioned | 2015-08-04T14:38:18Z | -
dc.date.available | 2015-08-04T14:38:18Z | -
dc.date.issued | 2015 | -
dc.identifier.issn | 0266-4720 | -
dc.identifier.uri | http://hdl.handle.net/10071/9550 | -
dc.description.abstract | Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), where a set of latent variables indicate the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data, referred to official statistics, shows its usefulness. | eng
dc.language.iso | eng | -
dc.publisher | Wiley | -
dc.relation | info:eu-repo/grantAgreement/FCT/5876/147442/PT | -
dc.rights | embargoedAccess | por
dc.subject | Cluster analysis | eng
dc.subject | Finite mixtures models | eng
dc.subject | EM algorithm | eng
dc.subject | Feature selection | eng
dc.subject | Categorical features | eng
dc.title | Feature selection for clustering categorical data with an embedded modeling approach | eng
dc.type | article | -
dc.pagination | 444 - 453 | -
dc.publicationstatus | Publicado (Published) | por
dc.peerreviewed | yes | -
dc.journal | Expert Systems | -
dc.distribution | Internacional (International) | por
dc.volume | 32 | -
dc.number | 3 | -
degois.publication.firstPage | 444 | -
degois.publication.lastPage | 453 | -
degois.publication.issue | 3 | -
degois.publication.title | Feature selection for clustering categorical data with an embedded modeling approach | eng
dc.date.updated | 2019-05-07T13:05:46Z | -
dc.description.version | info:eu-repo/semantics/publishedVersion | -
dc.identifier.doi | 10.1111/exsy.12082 | -
dc.subject.fos | Domínio/Área Científica::Ciências Naturais::Ciências da Computação e da Informação (Domain/Scientific Area::Natural Sciences::Computer and Information Sciences) | por
iscte.identifier.ciencia | https://ciencia.iscte-iul.pt/id/ci-pub-18663 | -
iscte.alternateIdentifiers.wos | WOS:000355958900009 | -
iscte.alternateIdentifiers.scopus | 2-s2.0-84930797327 | -
Appears in collections: BRU-RI - Artigos em revistas científicas internacionais com arbitragem científica

Files in this record:
File | Description | Size | Format
Feature selection for clustering.pdf (Restricted Access) | Publisher version | 567.27 kB | Adobe PDF



All records in the repository are protected by copyright law, with all rights reserved.