A divide‐and‐conquer strategy using feature relevance and expert knowledge for enhancing a data mining approach to bank telemarketing

The discovery of knowledge through data mining provides a valuable asset for addressing decision-making problems. Although a list of features may characterize a problem, it is often the case that a subset of those features has a stronger influence on a certain group of events constituting a sub-problem within the original problem. We propose a divide-and-conquer strategy for data mining that uses both data-based sensitivity analysis for extracting feature relevance and expert evaluation for splitting the problem of characterizing telemarketing contacts to sell bank deposits. As a result, the call direction (inbound/outbound) was considered the most suitable candidate feature. The re-evaluation of the inbound telemarketing sub-problem led to a large increase in targeting performance, confirming the benefits of such an approach, which is noteworthy given the importance of telemarketing for business, in particular in bank marketing.

Concerning the modelling phase, in Moro et al. (2014), four DM learning techniques were explored: logistic regression, decision trees, support vector machines (SVMs), and a neural network ensemble. The best result was achieved by the neural network; thus, this is the only technique used for the experiments reported in this paper. Nevertheless, it should be noted that SVMs stand out as one of the most accurate modelling techniques (Yuan, Li, Guan, & Xu, 2010), with successful applications to a wide range of problems, including aircraft fuel consumption (Wang, Liu, Feng, & Zhu, 2015), and customers' feedback in hospitality (Moro, Rita, & Coelho, 2017). SVMs are able to "simultaneously minimize the empirical classification error and maximize the geometric margin classification space" (Yuan et al., 2010, p. 152).
Artificial neural networks represent an attempt to mimic the complex relations of neurons within the human brain to achieve logical reasoning in problem solving (Wang et al., 2010). A typical network is composed of layers, each containing neurons (or nodes). The values of the input features travel through the network, undergoing computational transformations until the final node is reached, where the resulting value reflects all the complex relations hidden within the network. The capability of apprehending complex knowledge depends on both the number of hidden layers and the number of nodes (Li & Xu, 2000). For the present study, the neural network ensemble adopts the popular multilayer perceptron as its base model, with a configuration consisting of one hidden layer of H hidden nodes and one output node. The ensemble consists of N_r distinct trained multilayer perceptrons, whose combined output is the average of the individual neural network predictions. It should be noted that domain knowledge has proved to be a valuable asset when incorporated in DM tasks (Osei-Bryson & Rayward-Smith, 2009). Also, DM has been successfully applied to several cases in banking, such as performance prediction (Wanke, Kalam Azad, Barros, & Hadi-Vencheh, 2015), and telemarketing (e.g., Barzanti & Giove, 2012, for fund raising).
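The ensemble just described can be illustrated with a minimal Python sketch. It is not the implementation used in the study (which relied on the rminer package of R); the logistic activation, the random weights standing in for trained networks, and all function names are assumptions made for illustration only.

```python
import math
import random


def sigmoid(z):
    """Logistic activation (an assumption; the text does not name the activation)."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, hidden_weights, output_weights):
    """One hidden layer of H nodes followed by a single output node.

    hidden_weights: H rows, each [bias, w_1, ..., w_n]
    output_weights: [bias, v_1, ..., v_H]
    """
    hidden = [sigmoid(row[0] + sum(w * xi for w, xi in zip(row[1:], x)))
              for row in hidden_weights]
    return sigmoid(output_weights[0]
                   + sum(v * h for v, h in zip(output_weights[1:], hidden)))


def random_mlp(n_inputs, n_hidden, rng):
    """Randomly initialised weights, used here as a stand-in for a trained network."""
    hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return hidden, output


def ensemble_predict(x, networks):
    """Combined output: the average of the individual network predictions."""
    outputs = [mlp_forward(x, hw, ow) for hw, ow in networks]
    return sum(outputs) / len(outputs)


rng = random.Random(0)
# Seven networks with four hidden nodes each; both values are illustrative.
networks = [random_mlp(n_inputs=3, n_hidden=4, rng=rng) for _ in range(7)]
p = ensemble_predict([0.2, -1.0, 0.5], networks)
```

Averaging leaves a single network unchanged (an ensemble made of copies of one network returns that network's output), so any benefit comes from the networks differing in their initial weights.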
For evaluating classification problems, the data mining model computes a probabilistic outcome. A successful result c is considered when p(c | x_k) > D, where D is the probability threshold that defines achieved success and p(c | x_k) is the data-driven model response for the input client contact x_k. The advantage of using class probabilities is that a receiver operating characteristic curve can easily be constructed by varying D from 0.0 to 1.0, displaying one minus the specificity (x-axis) versus the sensitivity (y-axis; Fawcett, 2006). In effect, the area under the curve (AUC) provides a popular classification metric to evaluate prediction results (Jiménez, Jódar, Martín, Sánchez, & Sciavicco, 2017), and thus, this metric was adopted in this work. Another adopted metric is based on lift analysis, which provides an interesting procedure to assess model prediction capabilities by dividing the dataset into deciles, ordered from the most to the least likely successful results (Witten et al., 2016). The cumulative lift curve plots population samples (x-axis) versus the cumulative percentage of real responses captured (y-axis). Similarly to the AUC metric, the ideal method should present an area under the LIFT (ALIFT) cumulative curve close to 1.0.
Sensitivity analysis has been established as an effective technique for assessing the degree to which a given model is sensitive to changes in the initial assumptions made for building it (Hora & Campos, 2015). The data-based sensitivity analysis (DSA) approach allows understanding neural network models by simultaneously changing several input features from the samples used in the training set and thus can be used to detect the global effect of a feature even if such influence is only visible when it interacts with other input features (Cortez & Embrechts, 2013).
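As a rough illustration of the two metrics, the following Python sketch builds the receiver operating characteristic points by sweeping the threshold D and builds the cumulative lift curve from contacts sorted by predicted probability, integrating both with the trapezoidal rule. The function names, the 100-step threshold grid, the per-contact (rather than per-decile) lift resolution, and the six example contacts are all assumptions made for this example, not values from the study.

```python
def classify(probabilities, threshold):
    """A contact is labelled a success when p(c | x_k) > D."""
    return [p > threshold for p in probabilities]


def roc_points(y_true, probabilities, n_steps=100):
    """(1 - specificity, sensitivity) pairs for D varied from 0.0 to 1.0."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0), (1.0, 1.0)]  # curve end points
    for step in range(n_steps + 1):
        predicted = classify(probabilities, step / n_steps)
        tp = sum(1 for y, hit in zip(y_true, predicted) if y and hit)
        fp = sum(1 for y, hit in zip(y_true, predicted) if not y and hit)
        points.append((fp / negatives, tp / positives))
    return sorted(points)


def cumulative_lift_points(y_true, probabilities):
    """Population fraction (x) versus fraction of real responses captured (y)."""
    order = sorted(range(len(y_true)),
                   key=lambda i: probabilities[i], reverse=True)
    total = sum(y_true)
    points, captured = [(0.0, 0.0)], 0
    for rank, i in enumerate(order, start=1):
        captured += y_true[i]
        points.append((rank / len(y_true), captured / total))
    return points


def area_under(points):
    """Trapezoidal area under a curve given as sorted (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical outcomes (1 = subscription) and model responses.
y_true = [1, 1, 0, 1, 0, 0]
probabilities = [0.95, 0.85, 0.75, 0.65, 0.35, 0.15]
auc = area_under(roc_points(y_true, probabilities))
alift = area_under(cumulative_lift_points(y_true, probabilities))
```

A finite threshold grid only approximates the curve when several probabilities fall between two consecutive thresholds, so a finer grid (or one threshold per distinct probability) may be preferable on real data.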

| Divide-and-conquer strategy
A wide number of divide-and-conquer strategies have been applied in the machine learning and DM domains. Dividing a problem into smaller fractions is a natural way to reduce complexity to a manageable size. In particular, several widespread algorithms for implementing DM techniques adopt tree structures to which divide-and-conquer strategies are recursively applied to shortcut through the search space of the problem (Yun, Lee, & Lee, 2016). Also, a wide number of techniques categorized under rule-based and ensemble classifiers, including random forests, are based on the divide-and-conquer rule induction approach (Stahl & Bramer, 2014). Recent developments continue to use divide-and-conquer strategies for improving DM modelling tasks, such as improving the performance of support vector machines (Hsieh, Si, & Dhillon, 2014), and handling large datasets by using a hierarchical classification for dividing the problem into smaller fractions that can be dealt with by neural networks (Fritsch & Finke, 2012). Chen, Xu, He, and Wang (2017) proposed a divide-and-conquer strategy for improving sentiment analysis based on successively categorizing sentences using neural networks. Their approach outperforms other well-known sentiment analysis methods on several benchmarking datasets. Another related domain to which divide and conquer has been applied is process mining, with the results showing that for large volumes of event logs, decomposing a problem into a maximum number of sub-problems is not always the best choice; rather, non-maximal decompositions tend to provide better results (Verbeek, van der Aalst, & Munoz-Gama, 2016). In particular, relevance is the criterion chosen for such decomposition, which may become a challenging task for an algorithm to handle on its own, without human intervention.
Although a large number of articles are devoted to DM algorithm design, a smaller number emphasize the need to divide the problem to reduce the feature selection search space. Such a need has been mathematically described by Li and Xu (2001) within their study on feature space theory. It encompasses the transformation between extensions and intensions in terms of knowledge representation (Li, Xu, Wang, & Mo, 2003). Furthermore, it is considered a research subject of relevant interest in DM, as the exponential growth associated with real-world problems with several constraints and large search spaces poses serious challenges even with large computational power (Chandrashekar & Sahin, 2014). However, most of the proposals focus on automated solutions or on embedding expert knowledge in the feature selection procedure, rather than on decomposing the problem. As an example, Lin, Liang, Yeh, and Huang (2014) included expert knowledge in their proposal of a wrapper-based feature selection method. Also, expert knowledge has helped to improve the reliability of web attack detection by reducing the number of redundant and irrelevant features (Torrano-Gimenez, Nguyen, Alvarez, & Franke, 2015). The present study highlights a different, mixed approach, which brings a new research contribution by combining computed feature relevance with expert knowledge. Furthermore, as Cortez and Santos (2015) stated, "data-driven knowledge, used either solely or complemented by expert driven knowledge, turned into a key element of modern Expert Systems".

| Bank telemarketing case study
Direct marketing is the method of targeting specific customers, allowing companies to promote products or services on an individual basis. The usage of a customer database can enhance the process, turning it into database marketing (Tapp, 2008). The telephone is still one of the major communication channels, a role reinforced by the proliferation of mobile devices. Marketing promotions and campaigns conducted mainly by telephone are defined as telemarketing (Kotler & Keller, 2016). Companies that need to support a large number of customers typically concentrate communication in contact centres, where automated or human agents can answer customers. Such centres are also used to conduct telemarketing campaigns, either by performing outbound calls to a list of customers or by taking advantage of inbound calls from customers to approach them and promote campaign products. The difference between inbound and outbound telemarketing has been reported by a few studies for problems such as simulation modelling and staffing (e.g., Lin, Chen, Hong, & Lin, 2010; Mehrotra & Fama, 2003).
One of the major problems in telemarketing is to specify the list of customers that present a higher likelihood of buying the product being offered (Talla Nobibon, Leus, & Spieksma, 2011). Decision support systems using predictive models can provide better informed decision-making, increasing telemarketing managers' awareness of the impact resulting from predicted outcomes. DM modelling techniques allow unveiling patterns of information and translating them into knowledge that can be incorporated in such predictive systems (Witten et al., 2016). Despite the potential gain of using DM for modelling bank telemarketing success, few studies have adopted such an approach (Moro, Cortez, & Rita, 2015a). As explained by Moro et al. (2014), in 2012, we initially explored several data-driven models for targeting the subscription of long-term deposits, but we only achieved accurate predictive models when using features that were only known on call execution, such as call duration. In 2014, this drawback was solved by adding features that could be known before call execution, namely, social and economic context features (e.g., Euribor rate; Moro et al., 2014) and lifetime value features (e.g., frequency of past client successes; Moro et al., 2015b). Also in 2014, Javaheri, Sepehri, and Teimourpour (2014) followed a distinct approach, modelling the effect of mass media campaigns (e.g., television) on the buying of new bank products. As explained in Section 1, this paper provides a novel approach to enhance the modelling of bank telemarketing success, where a divide-and-conquer strategy, supported by expert domain knowledge, is used to split the dataset instances according to the direction of the call, with a particular focus on inbound call-specific features.
This research adopts the case of telemarketing contacts executed between May 2008 and June 2013 by a Portuguese bank for selling long-term deposit accounts, comprising a total of 52,944 records. In Moro et al. (2014), a feature selection and engineering procedure took place over the same dataset, from an initial base of 150 features. As a result, the final model used just 22 features. Subsequent work by Moro et al. (2015b) improved the model using five new customer lifetime value related features, increasing the total to 27.
For the experiments in this work, the five most relevant features of the best model (achieved by Moro et al., 2015b) were presented to a panel of three human experts, who assessed their relevance based on their business knowledge, complementing the previously computed feature relevance. The panel is composed of:
• a telemarketing campaign manager, having worked 3 years in technical contact centre support and 10 years as a technician in marketing (#1);
• a bank technician, with 6 years of experience in the commercial area, plus 3 years in the risk department (#2); and
• a former bank information systems manager (2001-2016), who spent 12 years coordinating information systems projects for the bank's contact centre (#3).
This panel is the same one that helped with the definition of the banking dictionary adopted by Moro et al. (2015a). By consulting three experts with different backgrounds, bias in the achieved results is reduced when compared to relying on a single expert.

| Data mining
For the computational experiments when modelling the sub-problem, the rminer package of the R tool was adopted, which provides a small, coherent, and parameterizable set of functions specifically designed for data mining computation (Cortez, 2010). To fit the data records, a neural network base learner was chosen, because this model provided the best results in previous studies with the same dataset, outperforming a support vector machine, a logistic regression, and a decision tree. rminer implements a multilayer perceptron ensemble with N_r learners, where each individual learner consists of a multilayer perceptron with several computing nodes organized in layers (Haykin, 2009). The input layer is fed with the input vector and propagates the activations in a feed-forward fashion, via the weighted connections, through the entire network. The final response of the ensemble is set as the average of the distinct neural networks. The use of such an ensemble makes the model less dependent on the random initialization of the neural network weights, as suggested by Hastie et al. (2009).
Regarding the configuration of the ensemble model, the neural network ensemble is composed of N_r = 7 distinct networks, each trained with 100 epochs of the Broyden-Fletcher-Goldfarb-Shanno algorithm. For setting the number of hidden nodes (H), a grid search was performed within the set H ∈ {0; 2; 6; 8; 10}. The rminer package of the R tool applies this grid search by performing an internal holdout scheme over the training set (with 2/3 of the data), in order to select the best H value, that is, the one corresponding to the highest AUC value measured on the remaining subset of the training set, and then trains the best model with all training set data.
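The grid search with an internal holdout can be sketched in Python as follows. This is only an outline of the selection logic and not rminer's code: `toy_fit` is a hypothetical placeholder for training a network with H hidden nodes, and the helper names, the rank-based AUC, and the convention of returning 0.5 when a class is absent from the validation part are assumptions of this example.

```python
import random


def auc_score(y_true, scores):
    """Rank-based AUC: chance that a positive is scored above a negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        return 0.5  # convention of this sketch when one class is missing
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def internal_holdout(n_rows, seed=0, fit_fraction=2 / 3):
    """Random split of the training set indices into fit and validation parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = int(round(n_rows * fit_fraction))
    return indices[:cut], indices[cut:]


def select_hidden_nodes(rows, labels, fit, fit_idx, val_idx,
                        grid=(0, 2, 6, 8, 10)):
    """Keep the H with the highest validation AUC, then refit on all rows."""
    best_h, best_auc = None, -1.0
    for h in grid:
        model = fit([rows[i] for i in fit_idx],
                    [labels[i] for i in fit_idx], h)
        score = auc_score([labels[i] for i in val_idx],
                          [model(rows[i]) for i in val_idx])
        if score > best_auc:
            best_h, best_auc = h, score
    return best_h, best_auc, fit(rows, labels, best_h)


def toy_fit(rows, labels, h):
    """Hypothetical trainer: only h == 6 yields a model that uses the input."""
    return (lambda x: x[0]) if h == 6 else (lambda x: 0.0)


rows = [[float(i)] for i in range(12)]
labels = [0] * 6 + [1] * 6
fit_idx, val_idx = internal_holdout(len(rows), seed=0)
best_h, best_auc, model = select_hidden_nodes(rows, labels, toy_fit,
                                              fit_idx, val_idx)
```

Separating the split from the selection keeps the selection step deterministic for a given split, which makes it easier to check that the highest, not the lowest, AUC is retained.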

| Divide-and-conquer strategy
For defining a division of the problem, first the five most relevant features from previous research were considered (Table 1). Each of the three experts was asked to assess, using their expertise, the relevance of each of these five features by assigning a quantitative relevance score from 1 (no relevance) to 10 (vital relevance). Next, an unstructured interview took place to discuss their assessment, with the goal of finding the most suitable candidate features for splitting the problem (and the dataset) in the light of business management commitments. To justify this approach, it can be argued that specific domain knowledge via human expertise remains the best method for characterizing a specific problem (Witten et al., 2016). Figure 1 provides a simplified overview of the proposed procedure.
In the suggested strategy, each of the three members of the experts' panel participated in two phases: first, by assessing, from a shortened list of the most relevant features for the DM model, which is the most suitable candidate partition for the problem under analysis; then, by assisting in the characterization of the new sub-problem, which should differ from the main problem, thus justifying a new feature analysis. For the experiments presented here, the five most relevant features from the DM model were re-evaluated by each expert. This limit on the number of features was only imposed to save the valuable time of the human experts who assisted the experiments. Because the three experts hold distinct credentials and banking experience, all three are given the same weight regarding their decisions. Therefore, for obtaining the final rank of feature relevance, the average of the three experts' decisions is computed. This approach is an attempt to benefit from the best of two alternative strategies:
• usage of automated machine learning DM techniques, used to discover possibly unknown yet useful knowledge from data; mixed with
• cognitive human thinking, which is known for making straightforward shortcuts for quick decision making.
Such an approach addresses the complexity associated with the characterization of a problem using its features, which is highly application-dependent.
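The equal-weight aggregation of the experts' scores amounts to a per-feature mean followed by a sort, as in the Python sketch below. The scores and the placeholder feature names (`feature_c` to `feature_e`) are entirely hypothetical and do not reproduce Table 1; only the 1 to 10 scale and the equal weighting come from the text.

```python
def rank_by_mean_score(scores_by_expert):
    """Average each feature's 1-10 score over the experts (equal weights)
    and return (feature, mean score) pairs from most to least relevant."""
    features = list(next(iter(scores_by_expert.values())))
    n_experts = len(scores_by_expert)
    means = {
        feature: sum(scores[feature] for scores in scores_by_expert.values())
        / n_experts
        for feature in features
    }
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores, for illustration only (not the values of Table 1).
scores_by_expert = {
    "expert_1": {"call_direction": 10, "rate_difference": 8,
                 "feature_c": 6, "feature_d": 5, "feature_e": 4},
    "expert_2": {"call_direction": 8, "rate_difference": 9,
                 "feature_c": 7, "feature_d": 5, "feature_e": 3},
    "expert_3": {"call_direction": 9, "rate_difference": 7,
                 "feature_c": 6, "feature_d": 6, "feature_e": 5},
}
ranking = rank_by_mean_score(scores_by_expert)
```

With equal weights, one expert ranking a feature second does not prevent it from ranking first overall, provided the other experts score it highly enough.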

| Modelling procedure and evaluation
After deciding which sub-problem should be addressed, a baseline model using the 27 features previously discovered was developed, in order to allow a straightforward comparison (Moro et al., 2015b). First, a feature selection and feature engineering procedure was conducted. In order to compare the results between every phase, both AUC and ALIFT metrics were computed, in an effort to improve AUC while keeping such improvement aligned with an increase in ALIFT. For assessing model performance on the new sub-problem dataset, two different methods were adopted:
• First, for every phase of the feature selection and reduction procedure, modelling was executed using a random holdout validation, with a selection of 2/3 of the contacts for training and 1/3 for testing predictions. This procedure was performed 20 times, and the average AUC and ALIFT metrics were computed; and
• Second, a more realistic time-ordered holdout was carried out with the most relevant features, by selecting the 90% oldest contacts for building the newly proposed inbound model and then testing it on the most recent 10% of contacts. This split, which differs from the one used in the feature selection procedure, was intended to address the issue of having few contacts; thus, a larger fraction was taken for the training set so that more data could be fitted.
For comparison purposes, metrics were computed for modelling using both the initial features and the newly found characterizing features. The inbound model is the tuned result achieved through the proposed divide-and-conquer strategy and the feature selection based on the new specific sub-problem.
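The two validation schemes can be sketched in Python as index-splitting helpers. The function names and the `evaluate` callback (assumed to train on the first index list and return a metric measured on the second) are hypothetical and only illustrate the splitting logic described above.

```python
import random


def random_holdout(n_rows, train_fraction=2 / 3, rng=None):
    """Random split: 2/3 of the contacts for training, 1/3 for testing."""
    if rng is None:
        rng = random.Random()
    indices = list(range(n_rows))
    rng.shuffle(indices)
    cut = int(round(n_rows * train_fraction))
    return indices[:cut], indices[cut:]


def repeated_holdout(n_rows, evaluate, runs=20, seed=0):
    """Average a metric over several random holdout runs."""
    rng = random.Random(seed)
    scores = [evaluate(*random_holdout(n_rows, rng=rng)) for _ in range(runs)]
    return sum(scores) / len(scores)


def time_ordered_holdout(timestamps, train_fraction=0.9):
    """Oldest contacts for training, most recent contacts for testing."""
    order = sorted(range(len(timestamps)), key=lambda i: timestamps[i])
    cut = int(round(len(order) * train_fraction))
    return order[:cut], order[cut:]


# Hypothetical contact timestamps (any sortable values would do).
timestamps = [3, 1, 2, 10, 5, 4, 7, 6, 9, 8]
train_idx, test_idx = time_ordered_holdout(timestamps)
```

The time-ordered split never lets the model see contacts that are more recent than those it is tested on, which is why it is the more realistic of the two schemes.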

| Divide-and-conquer strategy
The five most relevant features in the study of Moro et al. (2015b) were assessed by the panel of experts. It should be noted that the experts were fully aware of the previous results, which could have influenced their judgement. However, Table 1 shows that the relevance perceived from experience in the real environment is not the same as the one computed from the DM model (using DSA). Call direction, the least relevant feature among the top five of the DM model DSA analysis, was considered the most relevant by both the telemarketing and the information systems managers. The risk manager with a commercial background, however, pointed out the difference between the best rate offered and the national average as the most relevant feature, with call direction in second. It should be stressed that this expert, although holding a valuable banking background, was less aware of the specificities inherent to telemarketing management than the remaining two experts. Nevertheless, the final column in Table 1 shows that call direction was considered the most relevant feature on average. Such a result calls for a discussion to understand why human experience pointed in a different direction from the computational result. First of all, the telemarketing manager indicated the huge difference between the human agents handling the calls: whereas outbound agents are typically young university students (18-24 years) seeking a part-time job to pay some of their day-to-day expenses, inbound agents are older and more experienced. Furthermore, outbound agents usually do not stay for more than a year. In fact, their training reflects this situation, with highly focused and intensive sessions of up to 2 weeks before the beginning of call handling and a smaller session for each new campaign, to understand the product or service being offered. In contrast, inbound agents only receive briefing sessions and electronic documentation for a new distinct product.
Also, quality control is much more emphasized for outbound agents. It is interesting to note the contrast between short-term outbound agents and mid- or long-term inbound agents. In fact, outbound activities are pre-planned, more focused, and can be better structured, whereas inbound activities are likely to be less structured because the initiative does not come from the bank but rather from the consumer. Hence, inbound interaction may demand more experience and breadth from the employee. Outbound leans more toward hard sell, whereas inbound leans more toward soft sell.
The other big difference is customer engagement versus intrusiveness. The manager states that customers are keener to listen to the agent when they begin the interaction. The fact that it was the customer who first established the call is enough to make a difference, because a customer who is calling has some time available for conversation, whereas an outbound call may come at an inconvenient time. In addition, the procedure in this particular contact centre is that customer requests are handled first, and only then does the agent try to offer the product/service. Therefore, once satisfied, the client is more likely to be predisposed to listen to the agent who has just met her/his request.
Last but not least, there is a sense of insecurity in making banking transactions by phone. This is much more pronounced than on a home banking web site, where, as opposed to a phone contact, the client is not interacting with a human being, and it is aggravated by the fact that the client cannot physically see the agent. Therefore, customers who are used to interacting with the bank via the phone communication channel (and thus place phone calls) are much more prone to subscribing products/services using this channel. In fact, the intrusiveness associated with outbound communication, and specifically with phone calls, has been widely studied and regulated by legislation through opt-out registration (Woodside, 2005). Thus, the conclusions of the telemarketing expert come as no surprise. However, the weight given to this characteristic in comparison with the remaining features differs from the feature relevance determined through model analysis. Thus, a discussion about the second most relevant feature for the manager should also take place.
Although it can be argued that the gain the customer achieves through the offered rate in comparison with the competition is a valuable driver toward deposit subscription, the difference is not as remarkable as for the type of phone call. The expert manager states that the dialogue script configured per campaign usually differs for each deposit, being adapted to its characteristics. This difference could suggest that mining a different model per product would improve targeting subscription performance. In fact, another reason may favour the split using the offered rate: the expert mentioned that a DM model would eventually make optimal use of different features associated with the deposit specification (e.g., a deposit with an offered rate higher than the competition may demand a high minimum subscription amount, thus being more related to the client income). Nevertheless, the fact that the interest rate is a numeric value (as opposed to the binary call direction) also poses the difficulty of deciding where the split should occur, or even how many splits there should be.
In the end, the expert clearly indicated the inbound versus outbound as the most relevant split, although emphasizing that other divisions could occur considering the offered rate. Taking into account the size of the analysed dataset, the inbound telemarketing sub-problem consisted in only 1,915 instances, as opposed to the total 52,944 contacts. This number suggests previous research was optimized for outbound telemarketing.
Given such potential to substantially improve inbound telemarketing predictive results, in this study, we adopt the call direction split, with a particular focus on inbound calls. In the next section, the sub-problem of inbound telemarketing is characterized by the telemarketing manager and re-evaluated. As a last remark, it may be argued that the knowledge emerging from this discussion solely based on human experience can hardly be inferred by the state of the art in machine learning. Simply, there are too many context factors to account for in each problem. Results explored in the next section attest the benefits of the proposed approach.

| Modelling
In this section, modelling of the new problem takes place, first for tuning through feature selection, then by comparing the results of the baseline with those of the newly defined model. The previous section suggested splitting the problem between inbound and outbound telemarketing, following expert advice that is also supported by the literature. As stated previously, the panel of experts analysed the new sub-problem of inbound telemarketing and its characteristics and selected a new list of features for adding value to the baseline 27 features (Table 2). It should be noted that two of the initially proposed features were discarded (call.dir and last.prod.result), because these attributes were constant for the small subset of just 1,915 inbound contacts (but not for the whole 52,944 records). Thus, from this point on, the text refers to just 25 features when mentioning those used in the baseline study. Table 3 shows the results for each feature selection phase, aggregated as the average of the 20 executions.
Recalling that AUC and ALIFT values closer to 1.0 represent a more accurate model, Phase 2a, with the originally proposed features, achieved an AUC of 0.8818, whereas modelling using just the new features (Phase 2b) obtained an AUC of 0.7940. Nevertheless, the new features were an addition to the original characteristics; thus, an overall better result by merging all the features was expected, which was achieved through a model with an AUC of 0.9069 (Phase 3), confirming the value of the panel's choices. To validate statistical significance, a Mann-Whitney nonparametric test was executed to check significance at the 95% confidence level (Molodianovitch, Faraggi, & Reiser, 2006).
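For readers unfamiliar with the Mann-Whitney test, the Python sketch below computes the U statistic from average ranks and a two-sided p-value from the normal approximation. It is a simplified version (no tie or continuity correction in the variance), the per-run AUC values are hypothetical, and it is not claimed to match the exact procedure used in the study.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """U statistic and two-sided p-value (normal approximation,
    without tie or continuity correction)."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2.0
    u = min(u_a, n_a * n_b - u_a)
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_value


# Hypothetical AUC values from repeated runs of two feature sets.
auc_phase_x = [0.91, 0.92, 0.93, 0.94, 0.95]
auc_phase_y = [0.80, 0.81, 0.82, 0.83, 0.84]
u_stat, p_value = mann_whitney_u(auc_phase_x, auc_phase_y)
significant = p_value < 0.05  # 95% confidence level
```

For small samples such as these, an exact test is generally preferable to the normal approximation; the approximation becomes more reasonable with the 20 runs per phase mentioned above.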
For the feature reduction procedure (Table 4), a first test was conducted by reducing the 34 features from Phase 3 to the 20 most relevant ones, selected using the DSA method applied to the model with the 34 features. Such a reduction resulted in an improvement in the AUC from 0.9069 to 0.9137. The results achieved are evidence that some of the features carry noise, negatively affecting the performance of the model. In fact, more features do not necessarily mean better models, as it is a challenging task for an algorithm to decide which features are relevant and which are not (Domingos, 2012).
Next, two more steps were performed by removing the five least relevant features in each iteration. The result is an AUC of 0.9139 with 15 features and of 0.9247 with just 10 features. These metrics confirm that the discarded features were less relevant, leading to a model with better predictive performance. This behaviour often occurs when using feature selection methods (Guyon & Elisseeff, 2003). Table 5 shows the results for the realistic simulation where the 90% oldest contacts were used for modelling and the newest 10% for testing the predictive accuracy, using the best set of 25 features proposed by Moro et al. (2015b; original model, optimized for all types of contacts), and the 10 new features discovered and presented in Table 4 (inbound optimized model, as set in Phase 4c). The difference in the results confirms the effectiveness of the suggested procedure. The inbound optimized model achieves the best AUC and ALIFT values, presenting an AUC of 0.89 and an ALIFT of 0.87, which correspond respectively to differences of 14 and 13 percentage points when compared to the original model (with 25 features).
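The stepwise reduction from 34 to 20, 15, and 10 features can be outlined in Python as a loop that keeps the most relevant features at each step. The callbacks `relevance_of` (for instance, a DSA-based relevance computed on a model fitted with the current features) and `evaluate` (for instance, the average AUC) are hypothetical placeholders, and recomputing relevance at every step is an assumption of this sketch rather than a detail confirmed by the text.

```python
def reduce_features(features, relevance_of, evaluate, sizes=(20, 15, 10)):
    """Shrink the feature set step by step, keeping the most relevant
    features at each step and recording the metric after every step."""
    current = list(features)
    history = [(len(current), evaluate(current))]
    for size in sizes:
        relevance = relevance_of(current)  # dict: feature -> relevance
        current = sorted(current, key=lambda f: relevance[f],
                         reverse=True)[:size]
        history.append((len(current), evaluate(current)))
    return current, history


# Hypothetical stand-ins: relevance decreases with the feature index and
# the "metric" is simply the share of the total relevance that is kept.
all_features = ["f%d" % i for i in range(34)]
toy_relevance = {f: 34 - i for i, f in enumerate(all_features)}


def toy_relevance_of(current):
    return {f: toy_relevance[f] for f in current}


def toy_evaluate(current):
    return sum(toy_relevance[f] for f in current) / sum(toy_relevance.values())


selected, history = reduce_features(all_features, toy_relevance_of,
                                    toy_evaluate)
```

In practice, the metric recorded in `history` would be inspected after each step so that the reduction can stop as soon as performance starts to degrade.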
The curves plotted in Figure 2 for both the original model, based on the 27 features previously discovered, and the optimized inbound telemarketing model, based on the newly found features, show the difference between the two models, confirming the values shown in Table 5. Nevertheless, it should be stated that this is a rough comparison, given the large difference between the features used. The confusion matrix (right side of Figure 2) also exhibits a high level of accuracy, with a true positive rate (TPR) of 82% and a true negative rate (TNR) of 82%, when a success is considered whenever the model predicts a subscription probability above 20%. We should note that the D = 20% threshold used to build the example confusion matrix was chosen taking into account that this particular bank prefers to avoid losing business opportunities, which are translated into successful subscriptions. Hence, we opted for a low and thus more sensitive threshold, above which the client is contacted. Figure 3 shows the cumulative lift curves for the simulation test, when using the 10% newest contacts. The plot visually confirms the large difference between both models in terms of prediction results. [...] of the most likely buyers) to achieve the same level of performance.
Table 3. Phase 3 (34 features): AUC 0.9069*, ALIFT 0.8669*. Both features included in ii)a) and in Table 3.
Note. Best values in bold. *Statistically significant under a pairwise comparison with 2a and 2b.
The difference between the best offered rate and the national average is the most relevant feature for the implemented model. This confirms the results achieved by Moro et al. (2015b), who also found this feature to be the most relevant. Interestingly, the second and third most relevant features are specific to inbound calls and correspond to newly proposed features. The former represents the time elapsed since any previous call for the same campaign, whereas the latter stands for the previous outbound attempts within the same campaign. The relevance of the second feature is just slightly below that of the top feature and above 20%, emphasizing its influence for modelling inbound calls. The third feature has an influence of around 13%. The inclusion of these two new features in the top three confirms that inbound telemarketing deserves specific attention, different from outbound telemarketing, supporting the value of the proposed strategy.
Considering the second and third most relevant and newly proposed features, we proceeded by using the sensitivity analysis DSA method to measure the global effect of the feature in the output response. Figure 5 presents the respective variable effect characteristic curve, which plots the input feature range of values (x-axis) versus the average sensitive model responses (y-axis).
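A simplified, one-dimensional flavour of this analysis is sketched below in Python: for each level of the analysed feature, the value is substituted into every sampled row, the model responses are averaged, and the resulting (level, average response) pairs form the curve. This is an illustrative approximation rather than the DSA algorithm of Cortez and Embrechts (2013) as implemented in rminer; the helper names, the evenly spaced levels, the range-based relevance measure, and the demo model are all assumptions.

```python
def feature_levels(samples, feature_index, n_levels=7):
    """Evenly spaced levels between the observed minimum and maximum."""
    values = [row[feature_index] for row in samples]
    low, high = min(values), max(values)
    step = (high - low) / (n_levels - 1)
    return [low + k * step for k in range(n_levels)]


def vec_curve(model, samples, feature_index, levels):
    """Average model response for each level of one feature, with the
    other features kept at the values found in the sampled rows."""
    curve = []
    for level in levels:
        responses = []
        for row in samples:
            probe = list(row)
            probe[feature_index] = level
            responses.append(model(probe))
        curve.append((level, sum(responses) / len(responses)))
    return curve


def relative_importance(model, samples, n_levels=7):
    """Range of the averaged response per feature, normalised to sum to 1
    (the range is one possible sensitivity measure, assumed here)."""
    ranges = []
    for j in range(len(samples[0])):
        curve = vec_curve(model, samples, j,
                          feature_levels(samples, j, n_levels))
        responses = [y for _, y in curve]
        ranges.append(max(responses) - min(responses))
    total = sum(ranges)
    return [r / total if total > 0 else 0.0 for r in ranges]


def demo_model(x):
    """Hypothetical model that responds only to the first feature."""
    return 0.1 * x[0]


samples = [[0.0, 5.0], [10.0, 7.0], [5.0, 6.0]]
curve = vec_curve(demo_model, samples, 0,
                  feature_levels(samples, 0, n_levels=5))
importance = relative_importance(demo_model, samples, n_levels=5)
```

Because the other features keep the values observed in real rows, the averaged response can reflect interactions that a single baseline vector would miss.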
From the plot in Figure 5, it can be observed that an increase in the number of days elapsed since another inbound contact for the same campaign directly increases the likelihood of success. It should be noted that some of the inbound campaigns are resident, that is, permanent, because the corresponding product started to be sold through phone calls. This fact justifies the larger numbers in the curve, such as 175 months (approximately 14 years). Furthermore, each deposit has an expiration period, after which the value invested and the interest earned are deposited in the main current account, meaning that the customer may be contacted again after this date. Several reasons may justify the relationship found. One of the more plausible is that the customer may be bothered if little time has passed since he or she was last asked to subscribe the same deposit. In fact, whereas intrusiveness is more associated with outbound phone calls, inbound communication may also be affected by the intrusiveness perceived by the customer (Jung, 2009).
The third most relevant feature is the number of previous outbound calls, with a higher value influencing positively the likelihood of success. This is not a surprising relationship. In fact, it often occurs that the client is contacted directly through an outbound call and asked to be contacted later, but sometimes it is the client himself that takes the initiative of contacting the bank to finally subscribe the deposit, resulting in a successful contact. However, the studied dataset does not provide information about which contacts are made by the clients specifically for subscribing the deposit.

CONCLUSIONS
In this paper, a divide-and-conquer strategy was presented, using a procedure based on feature relevance computed by a DM model, followed by an expert analysis, in order to find the best feature for splitting the problem into smaller, more manageable sub-problems. This methodology was applied to a bank telemarketing problem consisting of campaigns conducted through phone calls for selling long-term deposits. The expert evaluated the five most relevant features previously identified and assessed, based on his experience, which was the best candidate for splitting the problem. As a result, the call direction was considered the most suitable candidate.
Considering that most of the contacts in the dataset are from outbound calls (51,029), the hypothesis that arose was that inbound calls had been neglected in terms of their characterizing features. To test it, the 1,915 inbound contacts were re-evaluated as a sub-problem of the original problem. First, the human expert helped select additional features to increase the value of the dataset. Then, a feature reduction procedure was applied to improve the model by eliminating the least relevant features, leaving just 10.
A data-driven model was built and evaluated. As a baseline for comparison, we defined a model fed with 25 features previously identified as relevant, but with no distinction between outbound and inbound contacts, the vast majority of the calls being outbound. The obtained results confirm the inbound optimized model as the best solution, outperforming the baseline model by a large margin. Moreover, the inbound optimized model achieves the ideal lift performance (i.e., reaching all potential buyers) when selecting only the half of the clients most likely to buy, whereas the baseline model needs to select a much larger sample (80% of the clients) to reach the same performance.
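The lift comparison described above can be made concrete with a small sketch. It assumes that contacts are ranked by decreasing predicted score and that, for each selected fraction of the ranking, the share of all actual buyers captured is recorded. The function names and the ten illustrative scores and outcomes are hypothetical and are not taken from the studied dataset.

```python
def cumulative_lift(scores, labels, n_points=10):
    """Share of all positives captured when selecting the top fraction of contacts.

    scores: predicted success scores, one per contact.
    labels: actual outcomes (1 = subscribed, 0 = did not subscribe).
    Returns a list of (fraction selected, fraction of positives captured).
    """
    # Rank the outcomes by decreasing predicted score.
    ranked = [label for _, label in
              sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)]
    total_positives = sum(ranked)
    if total_positives == 0:
        raise ValueError("at least one positive outcome is required")
    n = len(ranked)
    curve = []
    for k in range(1, n_points + 1):
        cutoff = (n * k) // n_points          # number of contacts selected
        captured = sum(ranked[:cutoff])       # buyers found among them
        curve.append((k / n_points, captured / total_positives))
    return curve


def smallest_fraction_for_full_capture(curve):
    """First selected fraction at which every positive has been reached."""
    for fraction, captured in curve:
        if captured >= 1.0:
            return fraction
    return None


# Illustrative data only: ten contacts, three of whom subscribed.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
example_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
example_curve = cumulative_lift(example_scores, example_labels)
print(smallest_fraction_for_full_capture(example_curve))
```

Under this reading, a model that reaches all buyers at a smaller selected fraction allows the campaign to contact fewer clients for the same number of successes.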
These results support the proposed approach of dividing a problem into a smaller sub-problem characterized by specific features, different from the ones that best represent the more global problem. Thus, using a "divide-and-conquer" approach, in which the inbound contacts are modelled with a distinct predictive model, adds value for the studied bank telemarketing domain. Moreover, a sensitivity analysis was executed over the best predictive model, ranking two inbound-specific features (i.e., number of days passed since another inbound contact for the same campaign and number of previous outbound calls) as the second and third most relevant features. Such knowledge is useful for bank campaign managers. Furthermore, to the best of our knowledge, this is the first study that addresses the inbound telemarketing problem within the banking domain (which is normally "closed" and does not provide data openly for research), which adds to the value of this study and its findings.
As future work, we intend to explore other feature selection methods and compare them with the efficiency achieved through the DSA. One possibility is using mutual information theory, as it is a proven method for detecting noisy or irrelevant features (Barraza, Moro, Ferreyra, & de la Peña, 2016).
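As an indication of how such a comparison might start, the sketch below estimates the mutual information between each discrete feature and the outcome from empirical frequencies and ranks the features accordingly. It is only an illustration under stated assumptions: features are assumed to be already discretized, the feature names and values are hypothetical, and the method of the cited work may differ.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), joint_count in count_xy.items():
        p_joint = joint_count / n
        p_independent = (count_x[x] / n) * (count_y[y] / n)
        mi += p_joint * log2(p_joint / p_independent)
    return mi


def rank_features(columns, target):
    """Rank features by decreasing mutual information with the target.

    columns: dict mapping a feature name to its list of discrete values.
    target: list of outcomes aligned with the feature values.
    """
    scores = {name: mutual_information(values, target)
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Illustrative data only: one informative and one uninformative feature.
outcome = [1, 1, 0, 0]
features = {
    "informative": ["a", "a", "b", "b"],
    "uninformative": ["a", "b", "a", "b"],
}
for name, score in rank_features(features, outcome):
    print(name, round(score, 3))
```

Features whose estimated mutual information with the outcome is close to zero would be candidates for removal, which could then be compared with the features discarded through the DSA relevance.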

CONFLICTS OF INTEREST
None.