Performance of combined models in discrete binary classification

Marques, A.; Ferreira, A. S.; Cardoso, M. G. M. S.

doi:10.1027/1614-2241/a000117

Use this identifier to cite or link to this record: http://hdl.handle.net/10071/13072

Full record

DC Field	Value	Language
dc.contributor.author	Marques, A.	-
dc.contributor.author	Ferreira, A. S.	-
dc.contributor.author	Cardoso, M. G. M. S.	-
dc.date.accessioned	2017-04-20T15:21:27Z	-
dc.date.available	2017-04-20T15:21:27Z	-
dc.date.issued	2017	-
dc.identifier.issn	1614-1881	-
dc.identifier.uri	http://hdl.handle.net/10071/13072	-
dc.description.abstract	Diverse Discrete Discriminant Analysis (DDA) models perform differently in different samples. This fact has encouraged research on combined models, which seem particularly promising when the a priori classes are not well separated or when small or moderate sized samples are considered, as often occurs in practice. In this study, we evaluate the performance of a convex combination of two DDA models: the First-Order Independence Model (FOIM) and the Dependence Trees Model (DTM). We use simulated data sets with two classes and consider diverse data complexity factors which may influence the performance of the combined model: the separation of classes, balance, the number of missing states, sample size, and the number of parameters to be estimated in DDA. We resort to cross-validation to evaluate the precision of classification. The results obtained illustrate the advantage of the proposed combination when compared with FOIM and DTM: it yields the best results, especially when very small samples are considered. The experimental study also provides a ranking of the data complexity factors, according to their relative impact on classification performance, by means of a regression model. It leads to the conclusion that the separation of classes is the most influential factor in classification performance. The ratio between the number of degrees of freedom and sample size, along with the proportion of missing states in the minority class, also has a significant impact on classification performance. An additional gain of this study, also deriving from the estimated regression model, is the ability to successfully predict the precision of classification in a real data set based on the data complexity factors.	eng
dc.language.iso	eng	-
dc.publisher	Hogrefe and Huber Publisher	-
dc.relation	info:eu-repo/grantAgreement/FCT/5876/147442/PT	-
dc.rights	embargoedAccess	por
dc.subject	Classification performance	eng
dc.subject	Combined models for classification	eng
dc.subject	Discrete discriminant analysis	eng
dc.subject	Separability	eng
dc.title	Performance of combined models in discrete binary classification	eng
dc.type	article	-
dc.pagination	23 - 37	-
dc.publicationstatus	Published	eng
dc.peerreviewed	yes	-
dc.journal	Methodology	-
dc.distribution	International	eng
dc.volume	13	-
dc.number	1	-
degois.publication.firstPage	23	-
degois.publication.lastPage	37	-
degois.publication.issue	1	-
degois.publication.title	Performance of combined models in discrete binary classification	eng
dc.date.updated	2019-03-21T17:47:10Z	-
dc.description.version	info:eu-repo/semantics/publishedVersion	-
dc.identifier.doi	10.1027/1614-2241/a000117	-
dc.subject.fos	Domain/Scientific Area::Social Sciences::Psychology	eng
dc.subject.fos	Domain/Scientific Area::Social Sciences::Sociology	eng
iscte.identifier.ciencia	https://ciencia.iscte-iul.pt/id/ci-pub-33952	-
iscte.alternateIdentifiers.wos	WOS:000397417500003	-
iscte.alternateIdentifiers.scopus	2-s2.0-85017329682	-
Appears in collections:	BRU-RI - Articles in international peer-reviewed scientific journals