THE EFFECT OF EMOTIONS ON BRAND RECALL BY GENDER USING VOICE EMOTION RESPONSE WITH OPTIMAL DATA ANALYSIS

Purpose – To analyse the effect of emotions, obtained by oral reproduction of advertising slogans and established via Voice Emotion Response software, on brand recall by gender; and to show the relevance for marketing communication of combining “Human-Computer-Interaction (HCI)” with “affective computing (AC)” as part of their mission. Design/methodology/approach – A qualitative data analysis reviewed the scientific literature retrieved from the Web-of-Science Core Collection (WoSCC), using CiteSpace’s scientometric technique; a quantitative data analysis examined brand recall over a sample of Taiwanese participants by “optimal data analysis”. Findings – Advertising effectiveness has a positive association with emotions; brand recall varies with gender; and “HCI” connected with “AC” is an emerging area of research. Research limitations/implications – The selection of articles obtained depends on the terms used in WoSCC, and this study used only five emotions. Still, the richness of the data gives some compensation. Practical implications – Marketers involved with brands need a body of knowledge on which to base their marketing communication intelligence gathering and strategic planning. Originality/value – Provides exploratory research findings related to the use of automatic tools capable of mining emotions by gender in real time, which could enhance the feedback of customers toward their brands.


Emotions have an important role in directing responses to stimuli and have been extensively investigated in the field of psychology, receiving unprecedented recognition in the field of marketing (Consoli, 2010). Bagozzi, Gopinath, and Nyer (1999) described emotions as producing a mental state of readiness arising from cognitive appraisals of events or thoughts; Frijda (2007) says that emotions motivate behavior but, as they are short-lived in the field of consciousness, they require immediate attention; Bagozzi and Dholakia (2006) state that emotions are important to accomplish collective goals; Romani, Grappi and Dalli (2012) state that negative emotions drive consumers away from brands; Chodlhry et al. (2015) state that not all negative emotions lead to concrete construal; Dubé and Morgan (1998) affirm that higher states of positive emotions magnified increasing trends in satisfaction; Orth and Holancova (2004) show that emotions evoked by advertisements vary according to gender; Pham, Geuens and Pelsmacker (2013) show that ad-evoked feelings exert a positive influence on brand attitudes; Wierenga (2011) says that sophisticated behavioral laboratories and brain imaging methods are the next research frontier in managerial decision making in marketing.
Recall is an important physiological factor of the human learning process, where information is stored in the mind as collaborated nodes, creating a semantic network. Recalled data with a stronger emotional background will ultimately lead to greater attention to stimuli and increase communication effects (Vakratsas & Ambler, 1999). Precise measurement of emotions is important given the significance of emotion in the advertising process (e.g., Martensen et al., 2007). An appealing advertising campaign arouses consumers' positive emotions toward the communicated message, which is often themed with an advertising slogan. According to Fiske and Taylor (1984), priming exists when current ideas come to a consumer's mind with greater ease than ideas that are not currently activated. A slogan, as the themed emotion of an advertising campaign, can be applied to prime various attributes of brand perceptions (Bouch, 1993).
Psychophysiological measures are more connected to brand recall (Hazlett & Hazlett, 1999; Wang, Chien & Moutinho, 2015). According to Wang, Chien and Moutinho (2015), brand recall in Mandarin Chinese is better captured by voice emotion response than by self-reported measures. Their study in Taiwan in 2015 involved a sample of 142 participants, from 18 to 55 years old. Emotions were measured in five nominal categories (happy, bored, neutral, sad, angry), in order to classify eight advertising slogans of long-established brands, familiar to consumers in the Greater Chinese market: KFC (Just want KFC), Burger King (Roast is delicious), Pepsi (Enjoy delicacies, drink Pepsi!), Coca Cola (Enjoy cold, My Coca Cola), 7-Eleven (Oh! Thanks heaven! 7-Eleven), Family Mart (Family Mart is your home!), Suzuki (Firepower is to win), and SYM (Burn my hot blood chopper dream). To develop this previous study, the authors intend to select the voice emotion response and analyse its effect on brand recall by gender, applying Optimal Data Analysis (ODA), which is robust with small samples (Yarnold & Soltysik, 2004). A mixed method of both qualitative and quantitative design was used in the study, which, according to Fodness (1994), results in a comprehensive measurement in understanding tourist motivations.
In order to statistically measure the scientific contribution of the study, a scientometric review of the bibliographical references, which are growing rapidly in this digital information era, is undertaken. Such a review has theoretical and practical implications since the detection of potentially valuable ideas is essential to safeguard the integrity of scientific knowledge (White & McCain, 1998; Morris & Van, 2008; Tabah, 1999).
The perceived contribution of this paper is its illumination of when, how, and why the effect of emotions on brand recall by gender, using voice emotion response with optimal data analysis, can be useful in the scientific domain. Hence, it is necessary to obtain an up-to-date understanding of the scientific field's intellectual structure, and to identify exactly how the current issue connects with previously disparate patches of knowledge, creating a network of cognitively demanding ideas. The insights into the structure and dynamics of this issue are gained in computational terms, using publication records from Web-of-Science and exporting them to CiteSpace, a scientometric branch of informatics enabling the analysis of bibliographical records, articles actively cited from the domain issue, and emerging trends and changes in scientific literature over time. It is a tool that applies multiple temporal, structural, and semantic metrics and allows the visualisation of patterns from both citing and cited items (White & Griffith, 1982; Chen, Dubin, & Kim, 2014). The temporal interval is sliced into equal, mutually exclusive segments of one year, and an individual co-citation network is derived from each slice. The merged networks and the major changes between adjacent periods can be highlighted in a panoramic visualisation. The following sections present the literature review, methodology, results and conclusion of this study.

Literature review
The literature review on the subject of this research was carried out using structural, temporal and semantic metrics, applying CiteSpace, which reveals the distribution of the references, and their contribution to knowledge, by area of investigation. This purpose is explained in five steps: first, what a scientometric review is; second, the quality of the obtained analysis; third, the identification of the most cited, central, burst and novelty references; fourth, the identification of emerging trends; and fifth, the identification of Optimal Data Analysis as an isolated area in the network of all references.

Scientometric review of the literature with CiteSpace
The bibliographical records published since 1900 appropriate to the subject title "the effect of emotions on brand recall by gender using voice emotion response with optimal data analysis" were collected from the Web-of-Science of Thomson Reuters, using the broadest possible terms to ensure that subsequent analysis via CiteSpace covered all major components of a knowledge domain. The terms chosen with the logical operator "AND" were: emotions, advertising recall, and affective computing; and with the logical operator "OR" were: gender recall, and optimal data analysis. Two data sets were obtained: one corresponding directly to the core or topic search, producing 226 papers, and the other indirectly including 3,826 citations appearing at least once in any paper of the original core set, because they can be thematically relevant to the subject matter (Chen, Dubin, & Kim, 2014). References are a general term for any written scientific work, i.e. articles, books, and conferences. The resultant expanded dataset containing 4,052 references was then merged and exported to CiteSpace for scientometric review.

The scope and quality of the network
CiteSpace filters the analysis, narrowing the period of time to 1991-2016. The corresponding network has 294 unique nodes, each one representing a cited reference, and 686 links connecting them, showing various topics, although some irrelevant ones can always be found. Each node cites a number of related references, where the connectivity between cited and citing papers captures the underlying intellectual structure of each cluster.
All the references published in a given year create a network called a time slice. The entire time period is divided into equal-length, mutually exclusive segments of one year. The network configuration for each time slice is based on one of seven criteria: modified g-index, top N, top N%, threshold, by citations, usage 180, usage 2013. The modified g-index, with a positive scaling factor k, gives the average number of an author's most important publications. In the current paper, the network configuration of citations and co-citations uses the g-index with a threshold (k = 9).
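The g-index selection can be illustrated with a minimal sketch. It assumes the usual definition of the g-index (the largest g such that the g most cited items together received at least g² citations) and assumes that the scaling factor k multiplies the cumulative citation count; neither formula is spelled out in the text above, and the function names and citation counts are hypothetical.

```python
def modified_g_index(citations, k=1):
    """Largest g with g**2 <= k * (sum of the g highest citation counts).

    Assumption: k scales the cumulative citation total, so a larger k
    admits more references into the time-slice network.
    """
    ranked = sorted(citations, reverse=True)
    running_total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        running_total += count
        if rank * rank <= k * running_total:
            g = rank
    return g


def select_top_references(citations_by_ref, k=1):
    """Return the g most cited references of one time slice."""
    g = modified_g_index(list(citations_by_ref.values()), k)
    ranked = sorted(citations_by_ref.items(), key=lambda kv: kv[1], reverse=True)
    return [ref for ref, _ in ranked[:g]]


if __name__ == "__main__":
    # Hypothetical citation counts for one yearly slice.
    slice_counts = {"ref_a": 3, "ref_b": 1, "ref_c": 0, "ref_d": 0}
    print(select_top_references(slice_counts, k=1))
    print(select_top_references(slice_counts, k=9))
```

With k = 1 only the references that clear the unscaled criterion are kept, while a larger scaling factor such as the k = 9 used in this paper relaxes the criterion and retains more nodes per slice.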
Top N represents the number of most cited references or occurrences chosen. Top N% represents the percentage of most cited references or occurrences. Threshold interpolation combines nodes and links, stipulating minimum values for citation counts (c), co-citation counts (cc) and co-citation coefficients (ccv). The sequence of time-sliced networks is merged into one containing all nodes appearing between 1991 and 2016, giving an overview of how the scientific field has been evolving over time. Each link from individual networks is merged based on the earliest time stamp (default option), and subsequent links connecting the same pair of nodes are dropped in order to detect the earliest moment when a connection was first made in the literature (Chen, 2008).
The network can change over time with the addition or elimination of references and links. The objective is to simplify a dense network through effective pruning, allowing its visualisation to be clarified (Chen, 2008). CiteSpace supports two pruning algorithms: the minimum spanning tree, and Pathfinder. Pathfinder is the default option and was used in this study because it keeps the relevant links at a minimum and preserves the chronological growth patterns (Chen, Kuljis, & Paul, 2001; Chen, 1998). The new improved network includes individual components of bibliographic records (nodes), and their relationships and changes over time, established via the method of Document Co-Citation (DCC), which partitioned the network into 50 non-overlapping clusters, measured by cosine coefficients (Small, 1980). The application of structural metrics (betweenness centrality, modularity, silhouette) and temporal metrics (burstness, sigma) allowed the network to be filtered and reduced to 7 relevant clusters (regarded as optimal), using the spectral clustering method (Luxburg, Bousquet & Belkin, 2009; Luxburg, 2006), in which strongly connected nodes were assigned to the same cluster, and non-connected nodes were assigned to different clusters. These 7 major clusters correspond to 59.5% (175) of the 294 references, with silhouettes in the interval [0.850; 0.967], each containing at least 10 references. Clusters of smaller size, despite having a high silhouette, probably indicate that the same author provides all the references, thus being of no interest for analysis (Schneider, 2006).
The scientometric analysis via CiteSpace includes modularity and silhouette as structural metrics that measure the quality of the network. Modularity (Q) ranges from zero to one, measuring the extent to which a network can be divided into a number of independent groups, named clusters, such that nodes within the same group are more tightly connected than nodes between different ones. A low value of Q suggests a network that cannot be reduced to clusters with clear boundaries, whereas a high value, like the one in the current paper (0.827), suggests a well-constructed network. According to Chen (2006), networks with modularity scores of 1 or very close to 1 may turn out to be trivial special cases in which individual components are simply isolated from one another. The silhouette ranges from -1 to 1, and is useful in estimating the uncertainty involved in identifying or interpreting the nature of a cluster (Rousseeuw, 1987). A value of 1 represents a perfect separation from other clusters, while a negative value suggests its diversity or heterogeneity. The mean silhouette for the whole period defined by the merged network (0.544) is higher than 0.5, thereby fulfilling the conditions required by Chen (1994) for further analysis.
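To make the modularity measure concrete, the sketch below computes Q for a small undirected network. It assumes the standard Newman-Girvan definition, Q = Σ_c [L_c/m − (d_c/2m)²], where m is the number of links, L_c the number of links inside cluster c and d_c the total degree of its nodes; the text does not state which variant CiteSpace applies, and the toy network is hypothetical.

```python
from collections import defaultdict


def modularity(edges, cluster_of):
    """Newman-Girvan modularity of an undirected, unweighted network.

    edges: list of (node, node) pairs; cluster_of: dict node -> cluster id.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal_links = defaultdict(int)  # links with both ends in the cluster
    degree_sum = defaultdict(int)      # total degree of the cluster's nodes
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal_links[cu] += 1
    return sum(
        internal_links[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


if __name__ == "__main__":
    # Hypothetical co-citation network: two triangles joined by one link.
    edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
    clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
    print(round(modularity(edges, clusters), 3))
```

Placing every node in a single cluster gives Q = 0 under this definition, which is why values well above zero, such as the 0.827 reported here, are read as evidence of clearly separated clusters.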
The topics involved in the field of research can be delineated in terms of keywords assigned to each reference in the dataset, as shown in Figure 1. Adjacent ones are often attributed to the same reference, with characters' size proportional to their frequency. The keywords optimal data analysis, affective computing, human-computer-interaction, and emotion recognition are marked with arrows due to their connection with the subject title. The frequency of the more relevant keywords is shown in Table 1. Emotions and affective computing were used in Web-of-Science for the current domain.
author, and have their characteristic dimensions proportional to their size (Yu & Somple, 1965). One isolated node, identified by an arrow at the top of Figure 2, refers to Yarnold P R (2005), and is related to Optimal Data Analysis (ODA), which appears in the subject title. According to Yarnold (2016), ODA accommodates all metrics, requires no distributional assumptions, allows for analytic weighting of individual observations, explicitly maximises predictive accuracy, and supports multiple methods of assessing validity. The citation history in Table 2 shows that ODA was cited in seven papers during the period 2008-2013, none of them connected with the domain of the effect of emotions on brand recall by gender using voice emotion response.
identifying the references with high betweenness centrality (structural metrics), and those with citation bursts and novelty (temporal metrics).

Areas of research
To identify the areas of research included in the network, we apply semantic metrics where clusters are labelled from extracted titles according to three algorithms [weighted term frequency (TF*IDF); likelihood ratio (LLR); mutual information (MI)], providing a set of cues that facilitate their interpretation and serving as symbols for scientific ideas and methods (Schneider, 2009). The TF*IDF algorithm tends to represent the most salient aspects of a cluster (Salton, Yang, & Wong, 1975), while MI and LLR reflect its unique aspects (Dunning, 1993). According to Chen (2008), LLR usually gives the best result in terms of uniqueness and coverage of each cluster. The most relevant fifteen clusters, labelled via the TF*IDF algorithm, are represented in a circle packing graph, the size and distance of each circle being proportional to their intercognitivity and relevance (Chen, 1999). Cluster #0 (spontaneous expression), with 39 references, is identified by the largest circle, while cluster #6 (mouse), with 17 references, has the seventh smallest size, marked with an arrow. Other small clusters, with a small number of references, gravitate around the first seven, as shown in Figure 1.
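The TF*IDF labelling idea can be sketched as follows. This is a simplified illustration, not the CiteSpace procedure: it assumes single lower-cased words as terms (rather than extracted noun phrases), treats each cluster as one document, and uses the plain weighting tf × ln(N/df); the cluster titles and the function name are hypothetical.

```python
import math
from collections import Counter


def tfidf_labels(cluster_titles, top_n=1):
    """Pick the top_n highest-scoring TF*IDF terms for each cluster.

    cluster_titles: dict cluster id -> list of title strings.
    Assumption: each cluster counts as one document for the IDF part.
    """
    term_counts = {
        cid: Counter(word for title in titles for word in title.lower().split())
        for cid, titles in cluster_titles.items()
    }
    n_clusters = len(term_counts)
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())  # one count per cluster using the term

    labels = {}
    for cid, counts in term_counts.items():
        total = sum(counts.values())
        scores = {
            term: (freq / total) * math.log(n_clusters / doc_freq[term])
            for term, freq in counts.items()
        }
        # Highest score first; alphabetical order breaks ties deterministically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        labels[cid] = ranked[:top_n]
    return labels


if __name__ == "__main__":
    # Hypothetical titles of citing papers, grouped by cluster.
    titles = {
        0: ["emotion speech recognition", "speech emotion audio"],
        1: ["emotion touch games", "touch behaviour emotion"],
    }
    print(tfidf_labels(titles))
```

A term shared by every cluster (here "emotion") receives a weight of zero, which illustrates why TF*IDF labels emphasise what is salient within one cluster rather than what is common to the whole network.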

The most relevant references of the literature by area of research
The most relevant references, analysed by citations, centrality, burst, and novelty (sigma), are identified by the first author in Table 4. Centrality assumes values in the interval [0; 1], and measures the extent to which a reference is in the middle of a path that connects different clusters (Brandes, 2001). High centrality identifies potentially revolutionary scientific publications as well as gatekeepers in social networks (Freeman, 1977).
Burst detection investigates whether the citations of a reference increase abruptly, using the algorithm introduced by Kleinberg (2002). It detects statistically significant fluctuations in the citation count of a particular reference during a short time interval within the overall time period (Chen, Kuljis, & Paul, 2001), regardless of how many times the host article is cited. Scientific novelty is measured through sigma, a combination of burstness and centrality, identifying publications that represent creative ideas, whose role is more prominent than the rate of their recognition by peers.
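A minimal sketch of how the two measures may be combined is given below. The text only says that sigma combines burstness and centrality; the specific form used here, sigma = (centrality + 1)^burstness, is an assumption based on how the metric is commonly described for CiteSpace, and the input values are hypothetical.

```python
def sigma(centrality, burstness):
    """Novelty score, assumed form: (centrality + 1) ** burstness.

    centrality: betweenness centrality normalised to [0, 1].
    burstness: non-negative citation burst strength (0 if no burst).
    """
    if not 0.0 <= centrality <= 1.0:
        raise ValueError("centrality must lie in [0, 1]")
    if burstness < 0:
        raise ValueError("burstness must be non-negative")
    return (centrality + 1.0) ** burstness


if __name__ == "__main__":
    # Hypothetical references: (centrality, burstness).
    candidates = {"ref_a": (0.30, 4.0), "ref_b": (0.05, 6.0), "ref_c": (0.40, 0.0)}
    ranked = sorted(candidates, key=lambda r: sigma(*candidates[r]), reverse=True)
    print(ranked)
```

Under this assumed form, a reference with no burst or with zero centrality scores exactly 1, so only references that are both structurally central and bursting stand out as novel.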
Table 4: Most relevant references by cluster, citations, burst, centrality and sigma.
The timeline view (Figure 5) shows that most of the documents cited were published after 1995, while the more recent publications, after 2010 and marked with an arrow, came from clusters #1 and #6.
The majority of burst references (52.2%) come from cluster #3, followed by the newest cluster #6, which contains 35%. To understand the links between clusters, Table 5 lists the most relevant papers that cite or mention members of the top seven clusters. It can be seen that Balasubramanian, R (1996), "Gravitational waves from coalescing binaries: detection strategies and Monte Carlo estimation of parameters", is connected with 41% (coverage) of the field of research included in the oldest cluster #5. The paper Zeng, Z H (2007)
The most relevant references, analysed by citations, burst, centrality, and sigma, are shown in Table 4. Each line of the table provides information about the first author, year of publication, source, the cluster to which it belongs, number of citations received, burst, centrality, and novelty (sigma).
Zeng, Z H (2009) is a landmark from cluster #4, labelled as continuous emotion (TF*IDF), co-presentative collaboration (LLR) or arousal classification (MI). It has strong centrality and burst, being a pioneer reference. This paper develops algorithms that can process spontaneously occurring human affective behaviour. It examines available approaches to solving the problem of machine understanding of human affective behaviour, outlining some of the scientific and engineering challenges to advancing human affect-sensing technology.
Pantic, M (2003) is the highest landmark of the major cluster #0, labelled spontaneous expression (TF*IDF), audio (LLR) or speech (MI). This paper discusses how to integrate into computers a number of components of human behavior in the context-constrained analysis of multimodal behavioral signals.
Cowie, R (2001) is a landmark of cluster #3, labelled as empathic technologies (TF*IDF), data fusion (LLR) or speech (MI). It is a pioneer reference, with high burst, responsible for connections between different fields of knowledge (centrality), and a novelty reference (sigma). This paper discusses the recognition of seven different human negative and neutral emotions (bored, disengaged, frustrated, helpless, over-strained, angry, impatient) by technical systems, focusing on problems of data gathering and modelling, in an attempt to create a "Companion Technology" for Human Computer Interaction that allows the computer to react to human emotional signals.
Picard, R W (1997) is a landmark from cluster #2, labelled as user emotion (TF*IDF) or computer (LLR, MI). It has strong centrality and burst, and is a structurally essential and inspirational pioneering reference. It advances a compelling argument in favour of the need for affective computers, suggesting that a truly intelligent system, artificial or otherwise, cannot be implemented without emotional mechanisms, drawing upon data and examples taken from a wide spectrum of disciplines, from neurobiology to folk psychology, showing the potential positive applications of affective computing.
Kapoor, A (2007) is a landmark of cluster #1, labelled as human-computer interaction (TF*IDF), detection system (LLR) or emotion representation (MI). It has the highest centrality, being a pivotal reference. This paper presents the first automated method that assesses, using multiple channels of affect-related information, whether a learner is frustrated. The new assessment method is based on Gaussian process classification and Bayesian inference, and its performance suggests that non-verbal channels carrying affective cues can help to provide important information for a system to allow it to formulate a more intelligent response.
These important landmarks and pivot nodes support the relevance of the subject title, concerned with applying a human-computer interface to analyse how particular emotions (sad, bored, angry, neutral and happy) affect brand recall. Table 4 shows the most relevant references ranked by citations.
In order to detect emergent terms or understand the significance of a reference within a short period of time, regardless of how many times it was cited, it is important to know the top burst references, which could show some tendency over time to support the subject title. Goleman, D (1995), from cluster #2, has a citation burst between 2001 and 2003, describing a model of emotional intelligence based on competences that enable a person to demonstrate intelligent use of their emotions in managing themselves and working with others to be effective at work. This reference seeks to understand the characteristics that predict better performance and more fulfilled lives.
Scherer, K R (2005), from cluster #1, with a burst of citations between 2011 and 2013, attempts to sensitise researchers to the importance of the definition of emotions, in order to guide research and make it comparable across disciplines, which is central for the development of instruments and measurement operations, as well as for the communication of results, and their discussion between scientists. This paper distinguishes emotions from other affective states or traits, and discusses how to measure them in a comprehensive and meaningful way. Kim, J (2008), from cluster #3, with a burst of citations between 2012 and 2016, investigates the potential of physiological signals as reliable channels for emotion recognition. This paper designs a musical induction method to acquire a naturalistic data set for evoking certain emotions based on the voluntary participation of subjects. The emotion recognition problem is decomposed into several refining processes using additional modalities, valence recognition, and the resolution of subtle uncertainties between adjacent emotion classes.
Calvo, R A (2010), from cluster #3, with a recent burst of citations between 2012 and 2016, describes the progress in the field of Affective Computing (AC), with a focus on affect detection. It stresses the need for an integrated examination of emotion theories from multiple areas in order to achieve a truly effective real-world system of affective computing. This paper provides meta-analyses on existing reviews of affect detection systems that focus on modalities like physiology, face, and voice, and also reviews emerging research on more novel channels such as text, body language, and complex multimodal systems.

Gao, Y (2012), from cluster #6, with a recent burst of citations between 2014 and 2016, analyses whether the touch behaviours when people are playing games on touch-screen mobile phones reflect players' emotional states. The use of touch as an affective communicative channel would be an interesting modality when facial expression and body-movement recognition, or bio-signal detection, may not be feasible.
The investigation of references with strong citations bursts reveals that they can be grouped essentially into two branches: one focusing on the theoretical concept of emotions/affection and the other on emotion recognition through the interface with computers. The exploration of the expanded dataset suggests that designing and executing novel approaches to address the recognition of emotions through computers are significant and widely accepted concerns in the domain knowledge.

Emerging trends
The networks are intellectual structures of associated co-cited references representing the knowledge of a scientific field, evolving over time, during which newly published articles may introduce profound structural variation or have little or no impact on the structure (Chen, Song & Yuan, 2008).
Changes in modularity are represented by bars in Figure 6, each network being based on a two-year slicing period.
The number of new publications per year is represented by a rising line. The significant decreases in modularity above 0.5 in three time periods (1995-1996, 2001-2002, and 2005-2006) are expected to be explained by the appearance of citation burst references, which play an important role in changing the overall intellectual structure. Table 4 shows that the emergent trends are explained mainly by five references: Goleman, D (1995); Reeves, B (1996); Cowie, R (2001); Picard, R W (2001); and Scherer, K R (2005).
Note that 2015-2016 has modularity less than 0.5 but has no burst references, implying that those references do not contribute to emerging trends.

Optimal Data Analysis
Optimal Data Analysis (ODA) is a method developed by Yarnold and Soltysik (2005) which offers maximum predictive accuracy, even when the assumptions of alternative statistical models are not met. This method is used to identify patterns in the data that distinguish the effect of voice emotion response on brand recall by gender.
The accuracy of ODA is obtained by calculating the following measures: Sensitivity is the proportion of actual females who are correctly predicted by the model; Specificity is the proportion of actual males who are correctly predicted by the model; and Effect Size Sensitivity (ESS) is an index of predictive accuracy relative to chance, where values less than 25% indicate a relatively weak effect, 25%-50% a moderate effect, 50%-75% a relatively strong effect, and 75% or greater a strong effect over chance.
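The three accuracy measures can be sketched in a few lines. The ESS formula is not given in the text; the sketch assumes the usual two-class form, ESS = 100 × (mean of sensitivity and specificity − 50) / 50, so that 0 corresponds to chance and 100 to perfect classification. The function names and the example predictions are hypothetical, and the treatment of the exact boundaries (25, 50, 75) is also an assumption.

```python
def oda_accuracy(actual, predicted, positive="female", negative="male"):
    """Return (sensitivity, specificity, ess) in percent for two classes."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a == negative and p == negative)
    fp = sum(1 for a, p in pairs if a == negative and p != negative)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in `actual`")
    sensitivity = 100.0 * tp / (tp + fn)  # actual females correctly predicted
    specificity = 100.0 * tn / (tn + fp)  # actual males correctly predicted
    mean_accuracy = (sensitivity + specificity) / 2.0
    ess = 100.0 * (mean_accuracy - 50.0) / 50.0  # assumed two-class ESS
    return sensitivity, specificity, ess


def effect_label(ess):
    """Map ESS to the verbal scale; boundary handling is an assumption."""
    if ess < 25:
        return "relatively weak"
    if ess < 50:
        return "moderate"
    if ess < 75:
        return "relatively strong"
    return "strong"


if __name__ == "__main__":
    # Hypothetical outcomes for eight participants.
    actual = ["female"] * 4 + ["male"] * 4
    predicted = ["female", "female", "female", "male",
                 "male", "male", "female", "female"]
    sens, spec, ess = oda_accuracy(actual, predicted)
    print(sens, spec, ess, effect_label(ess))
```

Because ESS is normalised against chance, a model that simply assigns every participant to the larger gender group scores 0 even though its raw percentage of correct predictions may look acceptable.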
To assess generalisability, ODA first estimates the model using the entire sample (training set), calculating accuracy measures as described previously. Next, the model is cross-validated, and the accuracy measures are recalculated. If the accuracy measures remain consistent with those of the original model using the entire sample, as in the present study, then it can be said that the model is generalisable. The current study applies the approach of 'leave-one-out' (LOO) cross-validation, which is simply an n-fold cross-validation, where n = 141 observations in the dataset. Each observation in turn is left out, and the model is estimated on the 140 remaining observations. The predicted value (female or male) is then calculated for the hold-out observation and compared with the actual outcome for that observation. The results of all predictions are used to calculate the final accuracy estimates. Model accuracy measures are calculated using the average values across all hold-out models. All variables included in the ODA model were constrained to achieve identical classification accuracy in training (total sample) and LOO validity analysis. To ensure adequate statistical power, inhibit over-fitting, and increase the likelihood of cross-validation when the model is applied to classify a smaller independent sample, model endpoints were constrained to have N >= 10% of the total sample (Yarnold & Soltysik, 2016).
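The leave-one-out procedure can be sketched as follows. The sketch does not reproduce the ODA software: a simple majority rule per emotion category stands in for the ODA model, and the data layout (pairs of emotion and gender), the function names and the toy sample are hypothetical.

```python
from collections import Counter, defaultdict


def fit_majority_rule(rows):
    """Learn, for each emotion, the gender observed most often with it.

    rows: list of (emotion, gender) pairs. This majority rule is only a
    stand-in for the ODA model, which is not reproduced here.
    Returns (rule, default), where default is the overall majority gender.
    """
    by_emotion = defaultdict(Counter)
    overall = Counter()
    for emotion, gender in rows:
        by_emotion[emotion][gender] += 1
        overall[gender] += 1
    # sorted() makes ties resolve alphabetically, so results are repeatable.
    default = max(sorted(overall), key=overall.get)
    rule = {e: max(sorted(c), key=c.get) for e, c in by_emotion.items()}
    return rule, default


def leave_one_out(rows):
    """Predict each observation from a model fitted on all the others."""
    if len(rows) < 2:
        raise ValueError("leave-one-out needs at least two observations")
    predictions = []
    for i, (emotion, _) in enumerate(rows):
        training = rows[:i] + rows[i + 1:]  # the n - 1 remaining observations
        rule, default = fit_majority_rule(training)
        predictions.append(rule.get(emotion, default))
    return predictions


if __name__ == "__main__":
    # Hypothetical sample: emotion registered for one slogan, and gender.
    sample = [("happy", "female")] * 3 + [("angry", "male")] * 3
    predicted = leave_one_out(sample)
    hits = sum(p == g for p, (_, g) in zip(predicted, sample))
    print(f"LOO accuracy: {hits}/{len(sample)}")
```

Comparing the accuracy obtained this way with the accuracy of the model fitted on the whole sample is the consistency check described above: if the two agree, the classification rule is not merely fitted to individual observations.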

3. Results
The 141 participants, divided into 80 females and 61 males, were presented with eight slogans to be classified into five categories of emotion, registered by Voice Emotion Response, in order to determine the effect of the emotion on brand recall (Wang, Chien & Moutinho, 2015). This was done by gender, and by applying ODA. With the exception of Family Mart and 7-Eleven, the slogans show two patterns regarding recall by gender: males feel happier with cars (Suzuki and SYM), showing greater recall than females. In the remaining four slogans, Coca-cola, Pepsi-cola, KFC and Burger King, the opposite occurs, with females being happier and showing better recall than males. The results are in line with those found by other researchers (Teixeira, Wedel, & Pieters, 2012; Martensen, Gronholdt, Bendtsen, & Jensen, 2007; Faseur & Geuens, 2006; Janssens & De Pelsmacker, 2005; Vakratsas & Ambler, 1999), who state that a significant relationship exists between advertising effectiveness and positive emotions. Brand recall is higher when associated with positive emotions, as shown in Table 7. The magnitude of recall for similar brands is as follows: SYM (56.74%) is higher than Suzuki (42.55%); Coca-cola (87.23%) is higher than Pepsi-cola (79.43%); KFC (85.11%) is higher than Burger King (65.25%); Family Mart and 7-Eleven are equal (both 95.74%). This can be explained by the fact that in the Taiwanese market, SYM motorcycles are more popular than Suzuki motorcycles, and Coca-cola is still the leading soft drinks brand. KFC came to Taiwan in 1985, and Burger King in 1990, so KFC is much better known and loved by Taiwanese consumers. Finally, Family Mart and 7-Eleven are the two most popular convenience store brands.

Conclusion
The scientometric review supports the relevance of research on the subject title. The analysis of the field and the citation-based expansion has outlined the evolutionary trajectory of the collective knowledge over 1991-2016, and highlighted the areas of active pursuit. Emerging trends are identified from computational properties from CiteSpace, which is designed to facilitate sense-making tasks in relation to scientific frontiers based on relevant domain literature.
The paper tracks the advancement of the collective knowledge of a dynamic scientific community through the analysis of expert references in the literature domain, using computational techniques to discern patterns and trends at various levels of abstraction, as cited and co-cited references. The research on the subject title makes the following contributions to science: connecting two isolated areas (optimal data analysis and marketing communication); integrating them into six of seven major clusters in this knowledge domain; and differentiating brand recall by gender. Human Computer Interaction (cluster #1) and Future Direction (cluster #6) are the most recent issues that continue to be referenced in 2016, and are those with which the subject title is most likely to be connected.
It was found that men had better recall scores than women when related to cars, whereas women scored higher in recall when dealing with soft drinks and fast food. There has always been a clear association between the memorisation process of advertisements and the triggering of an emotional state towards brands. It has been confirmed that a positive relationship between consumer recall and brand stimuli does exist.
Owing to the preliminary nature of this study, our subject sample was limited in size and scope, and the findings of this investigation may not generalize to other samples. The Voice Emotion Response can only recognize five basic emotions, and this critically constrains this study. Researchers at Tatung University are trying to develop further techniques to recognize more emotions that better suit marketing and advertising research. Further research will benefit greatly if the technology improves.
Additionally, methodological diversity improves the robustness of marketing research (Davis et al., 2013).