Structure of Dark Triad Dirty Dozen Across Eight World Regions

The Dark Triad (i.e., narcissism, psychopathy, Machiavellianism) has garnered intense attention over the past 15 years. We examined the structure of these traits’ measure—the Dark Triad Dirty Dozen (DTDD)—in a sample of 11,488 participants from three W.E.I.R.D. (i.e., North America, Oceania, Western Europe) and five non-W.E.I.R.D. (i.e., Asia, Middle East, non-Western Europe, South America, sub-Saharan Africa) world regions. The results confirmed the measurement invariance of the DTDD across participants’ sex in all world regions, with men scoring higher than women on all traits (except for psychopathy in Asia, where the difference was not significant). We found evidence for metric (and partial scalar) measurement invariance within and between W.E.I.R.D. and non-W.E.I.R.D. world regions. The results generally support the structure of the DTDD.

Interest in the Dark Triad traits has been growing for over 15 years (Furnham et al., 2013). The Dark Triad (Paulhus & Williams, 2002) comprises the three correlated traits of narcissism (i.e., entitlement and self-aggrandizement; see Note 1), psychopathy (i.e., callous social attitudes and impulsivity), and Machiavellianism (i.e., manipulation and cynicism). These traits, especially psychopathy, are more prevalent in men than in women (Muris et al., 2017). Although a common theme in the Dark Triad is callousness and manipulation (Jones & Figueredo, 2013), the distinct traits relate differently to various outcomes and behaviors, such as intelligence and cheating (Jones & Paulhus, 2017; Kowalski et al., 2018).
Narcissism is the most independent trait within the Dark Triad, as seen in its relatively weaker correlations with the other two traits and in its somewhat different personality profile and downstream outcomes. In contrast, the correlation between Machiavellianism and psychopathy occasionally exceeds .80 (Berry & Feldman, 1985; Klimstra et al., 2014; Pineda et al., 2018). Regardless, the veracity and utility of the three correlated factors model have come into question (Rogoza & Cieciuch, 2018). To address this potential multicollinearity problem, researchers studying samples from different countries have adopted a bifactor modeling approach, which is hypothesized to disentangle common (i.e., general factor) and specific (i.e., orthogonal group factor[s]) sources of variance (Czarna et al., 2016; Jonason & Luévano, 2013; Maneiro et al., 2019). For example, in the context of the Dark Triad, the general factor represents the common dark core, whereas the group factors represent the traits of narcissism, psychopathy, and Machiavellianism (Moshagen et al., 2018).
Although bifactor modelling is a promising statistical method of evaluating structure, it has several limitations. Such a model may not accurately represent psychological functioning as a general factor. That is, a general factor from the bifactor model does not imply a general causal structure (i.e., the Dark Triad is not caused by a single antecedent; Bonifay et al., 2017). Furthermore, a general factor extracts some of the group factors' variance, leaving them in the form of residualized estimates, which might pose substantial interpretational difficulties (Sleep et al., 2017).
For example, what remains in narcissism, after the dark core variance is extracted? This is especially difficult in multigroup contexts, given that a general factor might capture different variance from one group to another, making group comparison meaningless.
Researchers, however, often use a bifactor modeling approach, as it usually results in a better fit to the data than traditional approaches (i.e., correlated factors models). This is because the general factor captures item "noise" or implausible response patterns (Reise et al., 2016). A situation where a bifactor model yields better fit, even when the population-level structure is known not to be bifactor (e.g., three correlated factors), is described as probifactor bias (Greene et al., 2019). In light of these arguments, applying a bifactor modeling approach to study the structure of the Dark Triad traits, although it probably yields better model fit, is not necessarily a good solution to the problems with the structure of the Dark Triad.

Measurement of the Dark Triad Traits
As originally identified (Paulhus & Williams, 2002), the Dark Triad traits have been studied using three independent measures, one per construct (Vize et al., 2018). The traditional measures of individual differences in these constructs are the Narcissistic Personality Inventory (Raskin & Hall, 1979), the Self-Report of Psychopathy (Paulhus et al., 2016), and the MACH-IV (Christie & Geis, 1970) scales. Given that the application of these measures produces a pool of 124 items, two independent teams of researchers developed briefer scales to reduce participant fatigue and facilitate research in this area. These scales are the 27-item Short Dark Triad (SD3; Jones & Paulhus, 2014) and the 12-item Dark Triad Dirty Dozen (DTDD; Jonason & Webster, 2010).
Despite the aforementioned controversies with using the DTDD, especially in comparison with the parent scales, we decided to use the DTDD in the current study for three reasons. First, given the length of our complete set of measures (see OSF project site for methodology codebook), we considered it sensible to reduce participant fatigue where possible. Second, the structure of the DTDD appears to be more stable across different languages and cultural contexts, which is crucial in the testing of invariance. Finally, the DTDD remains popular among researchers because of its brevity, providing a reasonable tradeoff between efficiency and accuracy (Jonason & Luévano, 2013). Nevertheless, the validity of the DTDD may be compromised in comparison to the SD3, and thus our results should be interpreted with caution.

The Structure of the Dark Triad Dirty Dozen Across Cultures
Although most people are not from W.E.I.R.D. (Western, Educated, Industrialized, Rich, Democratic) backgrounds, most behavioral sciences studies rely on W.E.I.R.D. samples (Henrich et al., 2010a, 2010b), and so does research on the DTDD, which was originally developed as a measure of three correlated factors and validated in a North American sample (Jonason & Webster, 2010). Follow-up work on W.E.I.R.D. samples found support for the three correlated factors measurement model (Klimstra et al., 2014; Küfner et al., 2014; Maneiro et al., 2019; Pineda et al., 2018; Savard et al., 2017). Some of this work (Maneiro et al., 2019; Savard et al., 2017) compared the three correlated factors model and a bifactor model. Although the three correlated factors model fit the data well, the bifactor model fit them even better. These findings led to the conclusion that the bifactor model represents the structure of the DTDD best. However, in light of the problems with the bifactor model noted above (e.g., probifactor bias; Greene et al., 2019), such a conclusion is questionable.
Moreover, only a few, generally underpowered, studies have examined the structural properties of the DTDD in non-W.E.I.R.D. samples. However, the results regarding the measurement model were similar to those of W.E.I.R.D. samples. That is, in Asia, the Middle East, non-Western Europe, and South America, the three correlated factors model fit the data well (Dinić et al., 2018; Gouveia et al., 2016; Özsoy et al., 2017; Tamura et al., 2015). Moreover, a pattern consistent with probifactor bias also appeared in some studies of the DTDD, in which the bifactor model fit the data better than the three correlated factors model; in other studies, the bifactor model was treated as the best model without being compared with the three correlated factors model (Czarna et al., 2016; Gouveia et al., 2016; Tamura et al., 2015).
In an attempt to validate the DTDD structure across cultures, one needs not only to compare results from different studies but also, and perhaps more importantly, to assess measurement invariance (Meredith, 1993). There are three models of measurement invariance, representing progressively more stringent assumptions: (a) configural invariance (i.e., whether the same latent constructs are loaded by the same items across compared groups), (b) metric invariance (i.e., where factor loadings are equal across compared groups), and (c) scalar invariance (i.e., where, in addition to factor loadings, item intercepts are equal across compared groups). Establishing configural invariance confirms that the compared structure is essentially the same, reaching metric invariance allows for comparing covariances and unstandardized regression coefficients, and establishing scalar invariance permits meaningful comparisons of latent means (Davidov et al., 2014; Milfont & Fischer, 2010). We conducted a test of measurement invariance of the DTDD in 13 samples originating from three W.E.I.R.D. world regions (i.e., North America, Oceania, Western Europe) and 36 samples from non-W.E.I.R.D. world regions (i.e., Asia, Middle East, non-Western Europe, South America, sub-Saharan Africa).
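The three levels of invariance described above can be written compactly in conventional factor-analytic notation. The symbols below are assumptions introduced for illustration and are not taken from the original analysis scripts: x_g denotes the item responses in group g, τ_g the item intercepts, Λ_g the factor loadings, ξ_g the latent traits with mean κ_g, and ε_g zero-mean residuals.

```latex
% Assumed (conventional) notation; not taken from the source.
\begin{align*}
\text{Measurement model:} \quad & x_g = \tau_g + \Lambda_g \xi_g + \varepsilon_g \\
\text{Configural:} \quad & \text{same pattern of zero and nonzero entries in } \Lambda_g \text{ for all } g \\
\text{Metric:} \quad & \Lambda_g = \Lambda \text{ for all } g \\
\text{Scalar:} \quad & \Lambda_g = \Lambda \text{ and } \tau_g = \tau \text{ for all } g \\
\text{Implied item means under scalar invariance:} \quad & E[x_g] = \tau + \Lambda \kappa_g, \qquad \kappa_g = E[\xi_g]
\end{align*}
```

Under scalar invariance, group differences in the expected item responses arise only through the latent means κ_g, which is why this level is required before latent means are compared.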

Overview
We aimed to test the structure and measurement invariance of the DTDD across cultures in eight world regions (i.e., Asia, Middle East, non-Western Europe, North America, Oceania, South America, sub-Saharan Africa, and Western Europe). We hypothesized that the three correlated factors model would show adequate fit to the data (Hypothesis 1). We hypothesized this structure to be invariant across men and women, with men scoring higher on all Dark Triad traits (particularly psychopathy; Hypothesis 2). We also hypothesized that this structure would be invariant across W.E.I.R.D. and non-W.E.I.R.D. world regions (Hypothesis 3).
To test Hypothesis 1, we evaluated the independent cluster model of confirmatory factor analysis (ICM-CFA), and, to test Hypotheses 2 and 3, we evaluated the multigroup confirmatory factor analysis (MGCFA). In the testing of the ICM-CFA, we relied on standard recommendations. That is, the comparative fit index (CFI) should be ≥.90, and the root mean square error of approximation (RMSEA) should be ≤.08 (Byrne, 1994). To find out if the tested model is invariant, we compared the differences in approximate fit statistics between subsequent models (e.g., between configural and metric or between metric and scalar), whose values should not exceed .015 in RMSEA and .01 in CFI (Chen, 2007). We carried out all the structural analyses using robust maximum likelihood estimation in Mplus v. 7.2 (Muthén & Muthén, 2012). We made all scripts and data available at the OSF project site: https://osf.io/8nsc3.
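The decision rules stated above can be sketched in a few lines of Python. This is a minimal illustration, not the Mplus workflow used in the study: the names `Fit`, `acceptable_fit`, `invariance_holds`, and `highest_invariance_level` are hypothetical, and the sketch assumes that the difference criteria apply to a worsening of fit (a drop in CFI, a rise in RMSEA) when moving to the more constrained model.

```python
from dataclasses import dataclass

# Cutoffs as stated in the text (Byrne, 1994; Chen, 2007).
CFI_MIN = 0.90
RMSEA_MAX = 0.08
DELTA_CFI_MAX = 0.01
DELTA_RMSEA_MAX = 0.015
EPS = 1e-9  # tolerance for floating-point comparisons at the cutoffs


@dataclass(frozen=True)
class Fit:
    """Approximate fit indices of one fitted model (hypothetical container)."""
    cfi: float
    rmsea: float


def acceptable_fit(fit: Fit) -> bool:
    """CFI >= .90 and RMSEA <= .08."""
    return fit.cfi >= CFI_MIN - EPS and fit.rmsea <= RMSEA_MAX + EPS


def invariance_holds(less_constrained: Fit, more_constrained: Fit) -> bool:
    """Assumption: only a worsening of fit counts against invariance."""
    cfi_drop = less_constrained.cfi - more_constrained.cfi
    rmsea_rise = more_constrained.rmsea - less_constrained.rmsea
    return cfi_drop <= DELTA_CFI_MAX + EPS and rmsea_rise <= DELTA_RMSEA_MAX + EPS


def highest_invariance_level(configural: Fit, metric: Fit, scalar: Fit) -> str:
    """Return the most stringent level supported by the nested comparisons."""
    if not acceptable_fit(configural):
        return "none"
    if not invariance_holds(configural, metric):
        return "configural"
    if not invariance_holds(metric, scalar):
        return "metric"
    return "scalar"
```

For instance, with made-up values `Fit(0.95, 0.05)`, `Fit(0.945, 0.052)`, and `Fit(0.92, 0.06)` for the configural, metric, and scalar models, the CFI drop from metric to scalar is .025, so the function reports `"metric"`.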

Participants and Procedure
We report how we determined our sample size, all data exclusions, all manipulations, and all measures. We collected the data (N = 11,723) between April 2016 and October 2017 as part of the "Cross-Cultural Self-Enhancement Project," which brought together over 70 academics from 56 countries. In each country, researchers set out to recruit at least 150 participants, based on a priori power analyses using the average effect in personality and social psychology over the past 100 years (i.e., r ≈ .20; Richard et al., 2003), but ideally to recruit 250 participants so as to reduce estimation error in personality research (Schönbrodt & Perugini, 2013). In a minority of samples from the larger project (i.e., Hong Kong, Spain, Uganda, Uruguay), we failed to gather the minimal number of participants and consequently we excluded these samples from analyses. Participants from two countries (i.e., Philippines and Vietnam) did not complete the DTDD, and so we excluded their data from analyses. Finally, we excluded the Iranian sample due to serious violations of data quality that we were unable to resolve. Although some sites fell short of the ideal of 250 participants, we considered the inclusion of the full range of data important, because of the novelty of this project and the difficulty of obtaining (good) data from some of the regions to which we had access.
In all, we analyzed data from 49 countries (Table 1). The sample consisted of moderately affluent (M = 4.47, SD = 1.10; scale range: 1 = much lower than average, 7 = much higher than average) university students (M = 21.53 years, SD = 3.17 years), with 66% women, 39% taking the study in a paper-and-pencil form, and 18% taking it in English (as native tongue or official language of instruction). We followed informed consent and debriefing procedures in each country. The full list of measures used is available at the OSF project site. The project was reviewed and approved by the ethical committee of the home institution of the second author (UG1/2016), and reciprocal approval was secured at the remaining locations.

Measure
We assessed the Dark Triad traits using the Dirty Dozen measure (Jonason & Webster, 2010). We translated the measure (when relevant) by following the procedure recommended by International Test Commission guidelines for translating and adapting tests in cross-cultural research (Brislin, 1986; Hambleton, 2005). In particular, we translated the 12 items into each language with the help of two native speakers, and back translated the items with the help of a third one. We discussed the back-translated version with the author of the scale (Peter Jonason), and, in case of comments or suggestions, a translator adjusted the scale until a final version was reached. We asked participants how much they agreed (1 = not at all, 7 = very much) with statements such as "I tend to want others to admire me" (i.e., narcissism), "I tend to lack remorse" (i.e., psychopathy), and "I have used deceit or lied to get my way" (i.e., Machiavellianism).

The Dark Triad Dirty Dozen Structure (Hypothesis 1)
We present in Table 2 the model fit indices estimated through the ICM-CFA and intercorrelations between the Dark Triad traits in each world region separately. Results generally supported the hypothesized structure (see Note 2). Nevertheless, to reach acceptable fit indices in all W.E.I.R.D. regions and in Asia, we entered correlations one at a time between residuals until the model fit the data well. In Oceania and Western Europe, we added a correlation between two Machiavellianism items (i.e., 2 and 3). In Asia, we added a correlation between two psychopathy items (i.e., 9 and 10). In North America, we added correlations for the two pairs of items reported above (i.e., 2 and 3, 9 and 10). Hypothesis 1 was mostly confirmed around the world.

Measurement Invariance Across the Sexes (Hypothesis 2)
We present in Table 3 the results of the MGCFA across men and women in each of the analyzed regions. We maintained the correlations between residuals identified in the assessment of the basic model. In all the analyzed world regions, we found support for full scalar invariance in men and women. We present the comparisons of latent means in Table 4. Men scored significantly higher than women on all three traits in all world regions. The only exception was for the psychopathy difference in Asia, which was not significant. Hypothesis 2 was generally confirmed.

Measurement Invariance Across W.E.I.R.D. and Non-W.E.I.R.D. World Regions (Hypothesis 3)
We present the results of the MGCFA across W.E.I.R.D. and non-W.E.I.R.D. samples in Table 5 (see Note 3). Overall, we found metric but not scalar invariance. To identify which parameters were noninvariant in the scalar model, we scrutinized modification indices and freed one intercept at a time. In W.E.I.R.D. regions, we freed the following intercepts: one in North America (i.e., psychopathy: Item 12), two in Oceania (i.e., narcissism: Item 5, psychopathy: Item 12), and four in Western Europe (i.e., Machiavellianism: Item 1, narcissism: Item 4, psychopathy: Items 10 and 12). In non-W.E.I.R.D. regions, we freed the following intercepts: two in Asia (i.e., Machiavellianism: Item 3, narcissism: Item 7), three in the Middle East (i.e., narcissism: Items 5 and 8, psychopathy: Item 12), three in non-Western Europe (i.e., narcissism: Item 8, psychopathy: Items 9 and 12), and three in South America (i.e., narcissism: Items 7 and 8, psychopathy: Item 9). The results supported our hypothesis to a limited extent, especially in the context of the equivalence of narcissism and psychopathy.
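The procedure of freeing one intercept at a time can be sketched as a simple greedy loop. This is only an illustration under stated assumptions, not the authors' Mplus procedure: the function `free_intercepts_stepwise` and the callback `fit_scalar` are hypothetical, the stopping rule (stop once the CFI drop relative to the metric model is no larger than .01) is an assumption, and in practice the choice of which intercept to free also involves substantive judgment.

```python
from typing import Callable, Dict, FrozenSet, Tuple

# Hypothetical callback: given the set of item numbers whose intercepts are
# freed, refit the (partial) scalar model and return its CFI together with
# modification indices for the intercepts that are still constrained.
FitFn = Callable[[FrozenSet[int]], Tuple[float, Dict[int, float]]]


def free_intercepts_stepwise(
    fit_scalar: FitFn,
    metric_cfi: float,
    max_cfi_drop: float = 0.01,  # assumed stopping rule
    max_freed: int = 12,         # the DTDD has 12 items
) -> FrozenSet[int]:
    """Free the intercept with the largest modification index, one at a time."""
    freed: FrozenSet[int] = frozenset()
    while len(freed) < max_freed:
        cfi, mod_indices = fit_scalar(freed)
        # Stop as soon as the partial scalar model is close enough to metric.
        if metric_cfi - cfi <= max_cfi_drop + 1e-9:
            break
        candidates = {i: m for i, m in mod_indices.items() if i not in freed}
        if not candidates:
            break  # nothing left to free
        worst_item = max(candidates, key=candidates.get)
        freed = freed | {worst_item}
    return freed
```

The returned set lists the items whose intercepts were released; the remaining intercepts stay constrained to equality across groups.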

Discussion
The dark side of personality has attracted interest from researchers and laypersons alike (Zeigler-Hill & Marcus, 2016). Yet the existing studies have relied on Western samples, and evidence from non-W.E.I.R.D. countries has been equivocal and mostly underpowered (Gouveia et al., 2016; Özsoy et al., 2017; Tamura et al., 2015). To advance our understanding of the structural properties of the DTDD, we examined the DTDD across the eight world regions of Asia, Middle East, non-Western Europe, North America, Oceania, South America, sub-Saharan Africa, and Western Europe. Our results provided support for the three correlated factors model of the Dark Triad traits in all the analyzed samples. Although the bifactor model yielded better fit in some countries, in others it produced problems with model convergence. This illustrates that, alongside the better model fit attributable to probifactor bias (Greene et al., 2019), the bifactor modeling approach can be problematic (Bonifay et al., 2017). Therefore, we encourage researchers to be more circumspect with the application of this statistical procedure, as it might yield only superficial improvements in approximate fit indices without necessarily aiding in the theoretical understanding of the construct in question.
Note. Standardized correlations between latent factors are presented in brackets. df = degrees of freedom; CFI = comparative fit index; RMSEA = root mean square error of approximation; M = Machiavellianism; P = Psychopathy; N = Narcissism. All correlations were significant (p < .001).
Additionally, the results were consistent with existing meta-analyses examining sex differences in Dark Triad traits (Muris et al., 2017). Men scored higher than women on all Dark Triad traits. However, in Asia, primarily Japan and Korea, we observed no statistically significant differences in psychopathy for men and women, which is consistent with previous findings. An explanation lies in the nature of psychopathy, as the most socially aversive trait (Eisenbarth et al., 2018; Paulhus & Williams, 2002). Japan and Korea are face-saving cultures (Kim & Nam, 1998; Sedikides et al., 2015). As such, there may be strong normative pressure to refrain from manifesting (and admitting to having) such traits, which could harm other people; the potency of this normative pressure might stifle sex differences.
The three correlated factors structure of the DTDD was invariant at the metric level in W.E.I.R.D. and non-W.E.I.R.D. world regions. As such, researchers could compare covariances and unstandardized regression coefficients of the latent DTDD factors. Relevant studies found limited evidence on the DTDD factorial structure in non-W.E.I.R.D. countries (Dinić et al., 2018; Gouveia et al., 2016; Özsoy et al., 2017; Tamura et al., 2015), but these studies neglected several world regions and were generally underpowered. After the removal of some model constraints, mostly associated with narcissism and psychopathy, we reached partial scalar invariance. These results are not surprising, given that the DTDD has been criticized for its limited measurement of these two traits (Kajonius et al., 2016; Maples et al., 2014; Miller et al., 2017). Reaching metric invariance allows testing for validity of the DTDD across world regions, although better (i.e., more valid) measures may exist (Jones & Paulhus, 2014; Miller et al., 2017), problems with their internal structure notwithstanding.
Despite the multinational sample and the large number of participants, our study has several limitations. To begin, there are likely sampling biases present given our reliance on convenience samples of university students. Also, we did not conduct validity tests in this article, beyond testing invariance across sexes and regions; such tests would further help differentiate the optimal model. Finally, in some countries we did not use the national translations but the English versions, which potentially might (in India) or might not (in Nigeria) influence the obtained results depending on participants' linguistic skills. Nevertheless, we have provided evidence for the factor structure of the DTDD. This structure was invariant across the sexes and partially invariant across world regions. Although we advocate caution in the interpretation of the results and the judicious use of this scale, we hope the findings promote cross-cultural research on the Dark Triad traits.

Author's Note
Taciano L. Milfont is now affiliated with The University of Waikato.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Note. We also assessed multigroup confirmatory factor analysis for W.E.I.R.D. and non-W.E.I.R.D. samples independently, also finding only metric invariance. We also assessed the measurement invariance in non-W.E.I.R.D. regions excluding non-Western European countries; however, the results did not change. df = degrees of freedom; CFI = comparative fit index; RMSEA = root mean square error of approximation.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: We thank Jeremy Frimer for providing data for Canada.

Notes
1. Narcissism in the Dark Triad typically refers to the grandiose form of this trait (Sedikides & Campbell, 2017).
2. We also tested the ICM-CFA for each country separately. Furthermore, we tested the ICM-CFA for three additional models: unidimensional, bidimensional with psychopathy and Machiavellianism merged as one factor, and bifactor. The bifactor model fit the data better in some countries, but it failed to converge in other countries. It is likely that this model reflects probifactor bias. Results of these additional analyses are available at the OSF project site.
3. Because of the limitations of the DTDD described in the Introduction, we decided not to interpret latent mean differences across world regions. We uploaded these results on the OSF project page.