Please use this identifier to cite or link to this item: http://hdl.handle.net/10071/13072
Title: Performance of combined models in discrete binary classification
Authors: Marques, A.
Ferreira, A. S.
Cardoso, M. G. M. S.
Keywords: Classification performance
Combined models for classification
Discrete discriminant analysis
Separability
Issue Date: 2017
Publisher: Hogrefe and Huber Publisher
Abstract: Diverse Discrete Discriminant Analysis (DDA) models perform differently in different samples. This fact has encouraged research into combined models, which seem particularly promising when the a priori classes are not well separated or when small or moderate-sized samples are considered, as often occurs in practice. In this study, we evaluate the performance of a convex combination of two DDA models: the First-Order Independence Model (FOIM) and the Dependence Trees Model (DTM). We use simulated data sets with two classes and consider diverse data complexity factors that may influence the performance of the combined model: the separation of classes, balance, and number of missing states, as well as sample size and the number of parameters to be estimated in DDA. We resort to cross-validation to evaluate the precision of classification. The results obtained illustrate the advantage of the proposed combination when compared with FOIM and DTM: it yields the best results, especially when very small samples are considered. The experimental study also provides a ranking of the data complexity factors, according to their relative impact on classification performance, by means of a regression model. It leads to the conclusion that the separation of classes is the most influential factor in classification performance. The ratio between the number of degrees of freedom and sample size, along with the proportion of missing states in the minority class, also has a significant impact on classification performance. An additional gain of this study, also deriving from the estimated regression model, is the ability to successfully predict the precision of classification in a real data set based on the data complexity factors.
Peer reviewed: yes
URI: http://hdl.handle.net/10071/13072
DOI: 10.1027/1614-2241/a000117
ISSN: 1614-1881
Ciência-IUL: https://ciencia.iscte-iul.pt/id/ci-pub-33952
Accession number: WOS:000397417500003
Appears in Collections: BRU-RI - Artigos em revistas científicas internacionais com arbitragem científica
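
The convex combination of two DDA models described in the abstract can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the Laplace smoothing constant `alpha`, and the mixing weight `beta` are introduced here for illustration only, and the DTM class-conditional estimate is taken as a given number rather than estimated from a dependence tree.

```python
import numpy as np


def foim_likelihood(X_train, x, n_states=2, alpha=1.0):
    """First-order independence estimate of P(x | class).

    X_train holds the training vectors of ONE class (rows = observations,
    columns = discrete variables). The estimate is the product of the
    smoothed marginal frequencies of each observed state.
    `alpha` (Laplace smoothing) is an assumption of this sketch.
    """
    n, p = X_train.shape
    prob = 1.0
    for j in range(p):
        count = np.sum(X_train[:, j] == x[j])
        prob *= (count + alpha) / (n + alpha * n_states)
    return prob


def combine(p_foim, p_dtm, beta):
    """Convex combination of two class-conditional estimates.

    beta = 1 recovers the FOIM estimate, beta = 0 the DTM estimate.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return beta * p_foim + (1.0 - beta) * p_dtm


def classify_bayes(likelihoods, priors):
    """Return the index of the class with the largest prior * likelihood."""
    scores = [pr * lk for pr, lk in zip(priors, likelihoods)]
    return int(np.argmax(scores))


if __name__ == "__main__":
    # Toy binary data for two classes (values are illustrative only).
    X_class0 = np.array([[0, 0], [0, 1], [1, 1], [0, 1]])
    X_class1 = np.array([[1, 0], [1, 1], [1, 0], [0, 0]])
    x_new = np.array([0, 1])

    # Hypothetical DTM estimates, supplied by hand for this sketch.
    p_dtm = [0.50, 0.05]

    combined = [
        combine(foim_likelihood(X_class0, x_new), p_dtm[0], beta=0.5),
        combine(foim_likelihood(X_class1, x_new), p_dtm[1], beta=0.5),
    ]
    print(classify_bayes(combined, priors=[0.5, 0.5]))
```

In practice the weight `beta` would have to be chosen from the data; the abstract does not state how, so any selection rule used with this sketch is the reader's own assumption.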

Files in This Item:
File | Description | Size | Format
2017Methodology Marques Ferreira Cardoso.pdf | Publisher's version | 598.96 kB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.