Survey of Game Theory and Future Trends for Applications in Emerging Wireless Data Communication Networks

Game Theory (GT) has been used with outstanding outcomes to formulate, and either design or optimize, the operation of a huge number of representative communications and networking scenarios, each one with distinct entities aiming at different and usually contradictory goals. This paper comprehensively surveys and updates, in a tutorial style, the literature contributions that have applied a diverse set of theoretical games to wireless networks, emphasizing scenarios of upcoming Mobile Edge Computing (MEC). MEC is a promising model that aims to offer cloud services at the network periphery, reducing service latency and backhaul load, and enhancing Quality of Experience. Our literature discussion of GT is structured according to a taxonomy that has three top-level groups: classical, evolutionary, and incomplete information. These groups are then subdivided according to game aspects that have a great impact on players' final decisions, namely: rational vs. evolving strategies, cooperation level, available information, and how the game is played (single turn, repeated, or sequentially between leader and followers). This paper also reviews applications of games to develop adaptive algorithms and protocols for the efficient and intelligent deployment of some standardized use cases at the edge of heterogeneous networks. Finally, we highlight future research directions where GT can enhance the performance of wireless data communication networks in emerging MEC scenarios.


Introduction
Game Theory (GT) is a branch of applied mathematics used to study how rational players, each aiming to obtain a satisfactory share of common and scarce resources, can interact to reach a fair and stable distribution of those resources within a specific system. In this way, GT analyses the interaction among independent and self-interested players [1], [2]. More recently, there has been a considerable amount of GT research in wireless networks [3]-[5]. At the time of writing, there is an enormous interest in studying GT applications for emerging wireless data communication environments, such as cognitive radio [6], sensor networks [7], and mobile social networks [8], [9].
Data communication networks, especially wireless systems, are evolving along a common and global trend: services and associated data, initially available only at Cloud Data Centers, are also becoming directly accessible from the network edge (e.g. Base Stations, cloudlets) [10], [11]. Consequently, there is a smooth change in the network model: from the centralized form based on cloud computing to a complementary one that is much more distributed and based on Mobile Edge Computing (MEC), or Fog as it is sometimes termed (see Fig. 1). One advantage of adopting a system design that incorporates MEC devices is a lower latency in delivering both data and services to the end-users. The latency is reduced because both data and services are much nearer the consumers, so the Round Trip Time (RTT) is diminished. In addition, the backhaul link overload can be decreased because most user data requests are satisfied locally by data cached at the network periphery. Simultaneously, the appearance of the Internet of Things (IoT), which enables basically every device with a processor to communicate with other devices, makes possible the gradual integration of IoT into the network infrastructure. In this way, new terminal devices will soon show up within edge network domains. These new devices would be of distinct types, as shown in Fig. 1: smartphone sensors, Unmanned Aerial Vehicles (UAVs), or even domestic appliances.
In our opinion, the emerging changes in the network infrastructure that we have just outlined can be seen as an excellent opportunity to create a cognitive operation mode for the current Internet. We envision some "tools" that can support this more intelligent operation mode, as follows: pervasive 'sensoring' (sensors + local sensor protocols); always-connected access technologies (5G, WAVE); network edge processing and storage (MEC); coordination of all the network devices within a domain to attain specific goals (e.g. aggregating sensor data to create useful information) by "out-of-box" software in innovative and very flexible ways (SDN); and MEC agents inferring knowledge from information (Deep learning; e.g. Generative Adversarial Network) following optimum strategies (GT). Due to paper size limitations, we have selected only the following "tools" to be discussed in the current survey: 5G, WAVE, MEC, and GT. In this way, GT should be revisited to check its adequacy for use inside each network domain by adjusting the configuration parameters of the network algorithms or protocols deployed within that domain. With these parameter adjustments, the network domain would hopefully react better to unexpected situations (e.g. load variation [12], latency, jitter, cyber-attacks) in usage scenarios supported by 5G, WAVE, and MEC.
[Figure 1 labels: Cloud Computing at the network core; Fog Computing at the network edge; mobile terminals, UAVs, home appliances, etc. Annotation: reduces both latency and scale; mitigates backhaul congestion; but creates novel challenges: energy efficiency, and additional storage and processing demands at the edge. Question posed: how could GT empower new styles of networking?]

Fig. 1. MEC/Fog Emerging Environment
The main question addressed by our paper is how GT can give useful hints (i.e. learned skills) for managing wireless data networks to meet the emerging challenges imposed by MEC, namely: energy efficiency, virtualization, storage and processing at the network edge. To answer this question, we first need to highlight the most important network design aspects that have a strong impact on the supervision and orchestrated control of the resources available in upcoming networks. These design aspects are: mobile cloud computing [13], integration of Cloud Computing and the IoT [14], the forthcoming 5G access technology [15], [16], and SDN [17]. As an example, novel 5G scenarios involve extending the network operator infrastructure with Femtocells, cellular relays or middle-boxes. In addition, with the imminent appearance of IoT, mobile networks need to be extended at their edge with some capillary (multi-hop) networks, connecting a huge diversity of devices, namely: sensors, actuators, high-level peer-to-peer data caches, terrestrial vehicles, under-water sensors, under-ground sensors, and aerial vehicles. In this context, a new type of Wireless Sensor Network (WSN), often termed a Shared Sensor Network (SSN), is being investigated [18]. Further, the mobile operators are studying solutions where virtualized machines at the edge of their networks offer customers data storage and application hosting with a lower RTT than the alternative based on a remote cloud. This solution is called Mobile Edge Computing (MEC) [19] (or Fog Computing). As an example, [13] studies a three-tier architecture for mobile computation offloading.
Fig. 2 shows some heterogeneous small-cell (e.g. Pico cell, Femtocell) mobile networks deployed within a macrocell. Each small cell is like a cluster entity with a Cluster Header and diverse cluster nodes (e.g. sensors). In this way some advantages are obtained: i) improving the network coverage by exploiting spatial/temporal reuse of the spectrum; ii) enhancing the capacity of the cellular network by offloading traffic from the BS; iii) managing data/service location at the network edge (i.e. FC) to diminish the network latency/jitter and increase the data rate; iv) load balancing among the diverse backhaul links; v) flow routing in the optimum conditions (e.g. low energy consumption, high rate, low delay, low interference, high robustness against failures). Other potential advantages can include [20]: i) combining wireless and backhaul resources in the network selection; ii) orchestrating flow admission and flow rate policing to guarantee end-to-end QoS/QoE. Other edge access networks shown in Fig. 2 are discussed in subsequent sections.

Fig. 2. Design of a Heterogeneous Wireless Access Network involving Mobile Edge Computing Scenarios
Although there is some recent work related to MEC [21], it is mainly focused on economic approaches, including contract theory [22], which is very popular for emerging wireless network or micro-grid scenarios. By contrast, our work aims to offer an updated review of literature that applies GT in wireless data communication networks, with a main focus on allocating mobile network resources (satisfactorily but in a fair way) in use cases of incomplete information where the aspects of learning, cooperation, and social connections are very relevant. In particular, we study realistic scenarios well aligned with MEC, such as heterogeneous access, small cells, D2D communications, vehicular networks, micro-grid, IoT, energy consumption, energy harvesting, and mobile social networks. To the best of our knowledge, the current work is the first one to offer a comprehensive discussion about GT applied to MEC, including background material for non-specialist GT readers as well as future perspectives on using GT in MEC. Our contributions are as follows:
• We briefly introduce the key concepts of a classical game as well as an evolutionary game, comparing both approaches in terms of their main characteristics;
• We discuss diverse game solutions and their stability, including realistic limitations such as self-interested behaviour, incomplete information, and game parameters with random values;
• We study with some formalism how an evolutionary game can be analysed;
• We classify the surveyed contributions using a proposed taxonomy (Fig. 7), which also defines the structure of section 3, where we discuss most of the reviewed work;
• We point out open research issues in GT to address novel challenges imposed by expected MEC scenarios.
The rest of the paper is organized as follows. Section 2 briefly presents the more fundamental concepts and models of GT, including other facets of the game formulation, possible solutions, and analytical considerations which are required to follow our discussion in later sections of the paper. We also provide a background on MEC. In section 3, we review the literature related to the application of GT to wireless communications and networking, using the structure shown in Fig. 7 of section 2.9. In section 4, we discuss relevant research challenges for applying GT to emerging MEC scenarios. Finally, section 5 concludes our publication. Readers already specialized in GT/MEC could go directly to section 3.

Basics of Game Theory and Mobile Edge Computing
This section provides useful information to support our discussion about GT/MEC surveyed work in the next sections. Readers already familiar with GT/MEC could go directly to section 3, where we start the literature review.

Selecting the Right Technique to Solve a Network Problem
Before finding the optimum solution of a network problem (e.g. an optimization goal), the researcher first needs to select the right technique to find that solution. For this, she/he needs to verify whether the problem involves players with common or different objectives. When the players share a common set of objectives, the researcher could select the most convenient technique from the ones listed in the left branch of Fig. 3. Alternatively, if the problem involves players with different objectives, or even with completely opposite goals, the researcher should choose the most suitable theoretical game for that problem, as shown in the right branch of Fig. 3.

Fig. 3. Taxonomy of Optimization Techniques
GT is not only about optimization; it can also be useful to design a system, algorithm, or protocol satisfying a set of functional requirements. These design goals could be the deployment and management of intelligent, opportunistic, and heterogeneous network topologies. To illustrate the difference between optimization and design goals, we point out the following work: i) the optimization of transmit strategies in hierarchical mobile networks [23]; ii) the design of an algorithm to dynamically offload the traffic of busy cells to adjacent underutilized cells [24].
The authors of [25] stated that game theoretic methods are now central to much investigation. They suggest areas where further advances are important, and argue that models of learning are a promising route for improving and widening GT's predictive power, hopefully in polynomial time, while preserving the successes of GT where it already works well. They also emphasize two facets needed for GT to become more useful: i) the game results should be not only valid but also precise; ii) the GT models should have limited complexity, and be easily reproduced and tested by the research community.

Classical and Evolutionary Game Theory with Cooperation Incentives
As already stated, GT is a mathematical tool that studies the interaction among the players participating in a specific use case. There are two distinct high-level perspectives within GT (see Fig. 4): classical and evolutionary. Classical GT essentially requires that all the players make rational choices among a pre-defined set of static strategies. Consequently, it is fundamental in classical GT that each player considers the strategic analysis that her/his opponents are making when determining whether her/his own static strategic choice is appropriate. By contrast, a more recent branch of GT, Evolutionary GT (EGT), states that the players are not completely rational, that the players have limited information about available choices and consequences, and that the strategies are not static (the strategies evolve). The players have a preferred strategy that is continuously ranked against alternative strategies. If a player finds a better strategy, then she/he prefers to change to that strategy to get a better reward (so-called fitness). The decision to change the preferred strategy can also be influenced by neighbouring players belonging to the same population (by observation and learning). In this way, the strategy with the highest selection score inside a group of individuals forming a community will become the predominant strategy for that generation of individuals, and it will be transferred to the next generation of individuals (evolutionary aspect). During this transference, some individual characteristics can be changed through selection, crossover or mutation operations. A major potential advantage of evolutionary GT compared to the classical one is that the system's equilibrium could be achieved more quickly. The system's equilibrium (i.e. game solution) in both classical and evolutionary models means a stable system configuration that satisfies diverse system requirements and from which no player aims to deviate. If a player deviates from the system's equilibrium, that player is penalized by some form of degradation of her/his payoff (classical game) or fitness (evolutionary game). Table II compares the main elements of a classical game with the corresponding facets of an evolutionary game.
The classical GT approach can be divided into two dominant branches, namely non-cooperative and cooperative (e.g. coalitional). The essential distinction between these branches is that in non-cooperative GT the basic modeling unit is the individual player, while in cooperative GT the basic modeling unit is a group of several players sharing a common goal but in competition with other groups. The game benefit is also divided in different ways. In a non-cooperative game, each participant selfishly gains an individual benefit. By contrast, in a cooperative game the benefit obtained by a coalition group is shared among the diverse elements forming that group.
In non-cooperative games, incentives for normally self-interested nodes to collaborate towards a common goal can be introduced by pricing [26]-[30], reputation [31]-[33], or other mechanisms [22], [33]-[37]. This collaboration is useful because the network resources are scarce for satisfying high loads. In this way, the available resources should be fairly distributed among the diverse accepted flows. A pricing scheme that gives a strong incentive for collaboration is discussed in [30]. The authors studied the problem of local cooperative application execution for mobile cloud computing, as opposed to moving the task execution to a remote cloud. The main expectation behind local task execution is to reduce the latency associated with the complete execution of that task. To encourage mobile devices to share their unused resources, they designed an incentive scheme that benefits both the task owners and the mobile devices, encouraging local task execution. To model this, they proposed a non-cooperative (NC) Stackelberg game in which the task owner (leader) decides the price offered to mobile devices for application execution, and each device (follower) decides the amount of execution units it aims to provide, which is constrained by its battery autonomy. The authors of [29] analysed a spectrum oligopoly market with primaries and secondaries, where secondaries select a channel depending on the price quoted by the primary and the transmission rate the channel offers.
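The leader-follower pricing interaction described above can be illustrated with a minimal sketch solved by backward induction. It is not the model of [30]: the quadratic energy cost, the leader's per-unit `value`, the grid search, and all function and parameter names are assumptions introduced only for illustration.

```python
def follower_best_response(price, cost, battery_cap):
    """Units offered by a device that maximizes price*q - cost*q**2
    subject to 0 <= q <= battery_cap (hypothetical utility)."""
    return max(0.0, min(battery_cap, price / (2.0 * cost)))


def leader_best_price(value, cost, battery_cap, steps=1000):
    """Grid search over prices in [0, value]; the leader anticipates
    the follower's best response (backward induction)."""
    best = (0.0, 0.0, 0.0)  # (leader utility, price, units)
    for k in range(steps + 1):
        price = value * k / steps
        units = follower_best_response(price, cost, battery_cap)
        utility = (value - price) * units  # hypothetical leader utility
        if utility > best[0]:
            best = (utility, price, units)
    return best


# Large battery: the interior solution price = value / 2 is selected.
utility, price, units = leader_best_price(value=10.0, cost=1.0, battery_cap=100.0)

# Small battery: the leader only pays enough to exhaust the battery.
capped = leader_best_price(value=10.0, cost=1.0, battery_cap=1.0)
```

In this toy setting the battery constraint lowers the price the leader needs to offer, which reflects how a follower-side resource limit can shape the equilibrium.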
A valid alternative to pricing is reputation. Reputation is defined as the amount of trust inspired by a particular node of a network in a specific community or domain [3]. It is a hybrid mechanism because it combines incentives and punishments. Members with a good reputation, as they positively contribute to the community, can use the network resources without any significant limitation, while nodes with a bad reputation, as they usually refuse to collaborate, are gradually restricted from using the network resources. A significant disadvantage of a reputation-based system is the considerable node battery and communication bandwidth consumed by the continuous reporting of node behavior.
Reference [34] discusses further mechanisms besides pricing and reputation to enable positive interaction among users: auctions, lotteries, bargaining games, contract theory, and market-driven solutions. The authors of [35] study how incentives can be provided to the participants' social friends to stimulate cooperation, rather than directly incentivizing the participants themselves. In addition, repeated games [33] can enforce coordination among their rational players via either a short-term (e.g. tit-for-tat) or a long-term (e.g. cartel maintenance framework) punishment for the selfish players [38]. Other new and more efficient reciprocal altruistic strategies can be studied using the Axelrod-Python library. A potential drawback of cooperation is the extra coordination messages among players, which overload resource-constrained networks. To mitigate this, [39] reduces overhead while keeping the desired control. An EGT-based mechanism enforces the evolution of cooperation in mobile social networks [36]. The stimulus for cooperation in multihop wireless networks is studied in [37].
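The short-term punishment idea behind tit-for-tat can be sketched in a few lines of plain Python. This sketch does not use the Axelrod-Python library; the payoff values (5, 3, 1, 0) and all names are illustrative assumptions.

```python
# Hypothetical Prisoner's Dilemma payoffs: (row player, column player).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own_history, opponent_history):
    """Cooperate first, then repeat the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(own_history, opponent_history):
    """Selfish strategy: defect in every round."""
    return "D"


def play(strategy_1, strategy_2, rounds):
    """Play a repeated game and return the accumulated payoffs."""
    history_1, history_2 = [], []
    score_1 = score_2 = 0
    for _ in range(rounds):
        move_1 = strategy_1(history_1, history_2)
        move_2 = strategy_2(history_2, history_1)
        gain_1, gain_2 = PAYOFF[(move_1, move_2)]
        history_1.append(move_1)
        history_2.append(move_2)
        score_1 += gain_1
        score_2 += gain_2
    return score_1, score_2


mutual = play(tit_for_tat, tit_for_tat, rounds=10)
punished = play(tit_for_tat, always_defect, rounds=10)
```

With these values, a defector gains only in the first round against tit-for-tat and then earns far less than two reciprocating players, which is the punishment effect described above.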

Definition of a Classical Game
A classical game can be represented in normal form. This representation assumes a tuple with three elements, (N, A, u), where:
• N is a finite set of n players, indexed by i; these players make the relevant decisions (i.e. action choices) when the game is being played;
• A = A1 x … x An, where Ai is a finite set of actions available to player i. Each vector a = (a1, …, an) in A is designated an action profile and contains one action per player;
• u = (u1, …, un), where ui is a real-valued utility (or payoff) function for player i. The payoff is a kind of reward a specific player will receive at the end of the game, constrained by the decisions of the other players.
A very popular game is the Prisoner's Dilemma between two players. Fig. 5 shows its normal (or strategic) form representation. Here each of the two players has two possible strategies: action 1 (to cooperate) and action 2 (to defect). Any payoff values respecting the condition c > a > d > b define an instance of this game. As an example of how the game is played: if player 1 chooses to cooperate and player 2 to defect, then player 1 receives a benefit "b" and player 2 receives a benefit "c". In addition, each player, before choosing its strategy, does not have any information about which decision the other is going to make. In these conditions, regardless of the other player's strategy, each player always selects defect and the equilibrium status of this game is {(d,d)}. Although reciprocal cooperation would give both players a better payoff, we can conclude that self-interest leads to an inefficient outcome. This also shows that the lack of information about the opponent's decision leads each player to a bad decision. In other words, the solution {(d,d)} obtained above is not a social optimum. To correct this and obtain a fair, socially optimal solution for this game, we should add pricing to the current game model. This new game parameter encourages players to cooperate and increases the system revenue. In this way, the game equilibrium will change from {(d,d)} to {(a,a)}, the latter being the expected social optimum of this game. This outcome, obtained from a very simple model, remains valid for the diverse equivalent scenarios presented and discussed throughout the current paper. Using the normal form representation, the game typically involves a single stage. There is an alternative representation, designated as the extensive form, where a dynamic game has several stages and each player chooses a strategy at each stage. Details on the extensive form are out of the scope of this paper. The Game Theory Explorer (GTE) framework allows the creation of games in either extensive or strategic form and computes their equilibria [41]. In addition, a web resource with a large number of simulation applets on various facets of GT is available in [42].
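The Prisoner's Dilemma reasoning above can be checked with a short sketch that enumerates pure-strategy equilibria of a two-player normal form game. The numeric payoffs (a=3, b=0, c=5, d=1) and the pricing rule (a fixed charge on every defection) are assumptions chosen for illustration; they are not taken from Fig. 5.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """payoffs[(a1, a2)] = (u1, u2). Return the action profiles from which
    no player can gain by deviating unilaterally."""
    actions_1 = sorted({a1 for a1, _ in payoffs})
    actions_2 = sorted({a2 for _, a2 in payoffs})
    equilibria = []
    for a1, a2 in product(actions_1, actions_2):
        u1, u2 = payoffs[(a1, a2)]
        best_1 = all(u1 >= payoffs[(alt, a2)][0] for alt in actions_1)
        best_2 = all(u2 >= payoffs[(a1, alt)][1] for alt in actions_2)
        if best_1 and best_2:
            equilibria.append((a1, a2))
    return equilibria


def with_defection_price(payoffs, price):
    """Hypothetical pricing: charge `price` to any player who defects."""
    priced = {}
    for (a1, a2), (u1, u2) in payoffs.items():
        priced[(a1, a2)] = (
            u1 - price if a1 == "defect" else u1,
            u2 - price if a2 == "defect" else u2,
        )
    return priced


a, b, c, d = 3, 0, 5, 1  # any values with c > a > d > b define the game
prisoners_dilemma = {
    ("cooperate", "cooperate"): (a, a),
    ("cooperate", "defect"): (b, c),
    ("defect", "cooperate"): (c, b),
    ("defect", "defect"): (d, d),
}

without_price = pure_nash_equilibria(prisoners_dilemma)
with_price = pure_nash_equilibria(with_defection_price(prisoners_dilemma, price=3))
```

In this sketch, a price larger than both c - a and d - b makes cooperation the best reply to every opponent action, so the equilibrium moves from mutual defection to mutual cooperation.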

Analysing a Classical Game: from Optimality to Equilibrium
The question that now arises is how to reason about a normal form game. Depending on how the game outcomes are analysed, there are two possible solutions: Pareto optimality and Nash Equilibrium (NE). When the game analysis is made by an external observer, that observer is interested in finding the strategy profile that maximizes the sum of the players' utilities. Such a maximizing strategy profile is Pareto optimal. As an example, one can imagine a system allocating a specific resource among a set of players. Pareto efficiency, or Pareto optimality, is a state of allocation of resources in which it is impossible to make any one player better off without making at least one player worse off. It measures the system efficiency. Alternatively to the Pareto study, when the game evaluation is made individually by each player, the relevant solution is the NE, which was proposed by John Nash after von Neumann's Minimax Theorem, the latter being restricted to two-player zero-sum games.
Both theorems rely on the existence of a game equilibrium obtained from the mixed strategies of selfish players. A mixed strategy of a player is a probability distribution over her/his set of static strategies. The number of strategies should normally be finite. At a game equilibrium (e.g. NE), no player can improve her/his expected payoff by unilaterally changing her/his current strategy. The set of non-deviating strategies (one strategy per player) forms a strategy vector (i.e. strategy profile or NE) with n positions, where n is the number of game players. Depending on the scope (more general or specific) of the equilibrium, the type of game, the amount of available information, or the correlation among players' choices of strategies, there are a few equilibrium types that could be considered as results of a game: these are the correlated equilibrium, Bayesian Nash equilibrium, and subgame perfect Nash equilibrium. Definitions of these are available in [43]. There is also an interesting solution, designated as Mechanism Design, to increase the NE efficiency. Mechanism design alters the original game, e.g. by applying an affine transformation to the utility functions and tuning the associated parameters, to obtain an NE that is more efficient than the one of the original game [44]. This efficiency gain is ensured by incentivizing strategic agents to maximize their sum of utilities, but it does not account for a fair allocation of network resources among them. A very recent work [45] proposes social utilities to achieve both fairness of allocation and a reduction in tax variation among strategic agents, which goes beyond the standard maximization of sum utility.
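The equilibrium condition for mixed strategies can be illustrated with a minimal sketch for two-player games. It relies on the standard fact that checking deviations to pure strategies is sufficient; the Matching Pennies payoffs and all function names are assumptions used only as an example.

```python
def expected_payoff(matrix, p_row, p_col):
    """Expected payoff when the row and column players mix with
    probabilities p_row and p_col over their pure strategies."""
    return sum(
        p_row[i] * p_col[j] * matrix[i][j]
        for i in range(len(p_row))
        for j in range(len(p_col))
    )


def pure(index, size):
    """Degenerate mixed strategy that always plays one pure strategy."""
    return [1.0 if k == index else 0.0 for k in range(size)]


def is_mixed_nash(u_row, u_col, p_row, p_col, tol=1e-9):
    """True if no player gains by deviating to any pure strategy."""
    value_row = expected_payoff(u_row, p_row, p_col)
    value_col = expected_payoff(u_col, p_row, p_col)
    for i in range(len(p_row)):
        if expected_payoff(u_row, pure(i, len(p_row)), p_col) > value_row + tol:
            return False
    for j in range(len(p_col)):
        if expected_payoff(u_col, p_row, pure(j, len(p_col))) > value_col + tol:
            return False
    return True


# Matching Pennies: a zero-sum game without a pure-strategy equilibrium.
u_row = [[1, -1], [-1, 1]]
u_col = [[-1, 1], [1, -1]]

uniform_is_ne = is_mixed_nash(u_row, u_col, [0.5, 0.5], [0.5, 0.5])
pure_is_ne = is_mixed_nash(u_row, u_col, [1.0, 0.0], [1.0, 0.0])
```

Here only the uniform mixing passes the check, which matches the role of mixed strategies in guaranteeing that an equilibrium exists in finite games.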
Comparing the NE and Pareto system designs, one can also conclude that, on the one hand, the NE is typically obtained using the same algorithm instantiated independently at each player. This design normally involves a high cost to obtain the required result because players behave selfishly, leading to inefficient NE solutions. On the other hand, Pareto evaluation requires a more centralized design and assumes a certain degree of cooperation among the players. Due to this cooperation, the cost to obtain the intended result is minimized. So, there is a need to measure how inefficient the NE solution is in comparison to the Pareto scenario. This inefficiency is evaluated as the ratio between the cost of the worst NE and the optimum cost, i.e. the Price of Anarchy. Additionally, in scenarios with very high complexity and dynamics, where different types of agents with various characteristics and requirements aim to interact with each other, it is very hard to deploy conventional GT models to achieve cooperation among agents. There is an alternative technique, designated Matching Theory, that is out of the scope of our publication. For further information, the interested reader can consult [46], [47]. A comprehensive GT review for complex interactions among agents is in [48].
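As a worked illustration of the Price of Anarchy, the sketch below computes the ratio between the total cost of the worst pure NE and the minimum total cost for a two-player cost game. The cost table (a Prisoner's Dilemma written as costs) and the function name are illustrative assumptions.

```python
from itertools import product


def price_of_anarchy(costs):
    """costs[(a1, a2)] = (cost1, cost2). Ratio between the total cost of
    the worst pure NE and the minimum total cost over all profiles."""
    actions_1 = sorted({profile[0] for profile in costs})
    actions_2 = sorted({profile[1] for profile in costs})
    equilibria = [
        (a1, a2)
        for a1, a2 in product(actions_1, actions_2)
        # Each player minimizes its own cost given the other's action.
        if all(costs[(a1, a2)][0] <= costs[(alt, a2)][0] for alt in actions_1)
        and all(costs[(a1, a2)][1] <= costs[(a1, alt)][1] for alt in actions_2)
    ]
    if not equilibria:
        raise ValueError("no pure-strategy NE in this game")
    worst_equilibrium_cost = max(sum(costs[p]) for p in equilibria)
    optimum_cost = min(sum(costs[p]) for p in costs)
    return worst_equilibrium_cost / optimum_cost


# Hypothetical costs: mutual defection is the only NE (total cost 4),
# while mutual cooperation is the social optimum (total cost 2).
dilemma_costs = {
    ("cooperate", "cooperate"): (1, 1),
    ("cooperate", "defect"): (3, 0),
    ("defect", "cooperate"): (0, 3),
    ("defect", "defect"): (2, 2),
}

poa = price_of_anarchy(dilemma_costs)
```

A ratio of 1 would mean that selfish play loses nothing compared with the coordinated optimum; larger values quantify the inefficiency discussed above.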

Discussion on Nash Equilibrium
The NE can be evaluated in both game types: non-cooperative and cooperative [49]. The authors of [30] have studied the probability of finding the NE in each game. In non-cooperative games with incomplete information about the strategies of the opponents, the probability of finding the NE decreases as the available information shrinks. This occurs because the lack of available information means less data about the strategy space of the game opponents. Consequently, the moves of the other players are treated as random and it is more difficult to obtain the game's NE. In contrast, when a cooperative game is played, the probability of finding the more convenient set of NE increases. In fact, in a cooperative model, each player has more complete information about the strategy space of the others. In this way, a specific player can select the more suitable strategy almost without any conflict, which results in achieving the more convenient NE points with higher probability than in the non-cooperative case [49].
Another problem associated with a game's NE concerns its uniqueness. In fact, some games with static strategies have multiple NEs. One method proposed to address this problem is to use a mixed-strategy game, i.e. each strategy has an associated probability that defines the odds of that strategy being selected by a game player; note, however, that in finite games mixed strategies guarantee the existence of an NE rather than its uniqueness. For a comprehensive contribution discussing the uniqueness of NE in network games, see [50].
There is a generalization of NE, designated as Correlated Equilibrium (CE), that describes a condition of competitive optimality that can be found more easily than NE. It requires observation of the history of decisions (or their associated outcomes) to naturally correlate players' future decisions. This coordination among players in a CE can potentially lead to higher social welfare than making decisions independently as in an NE. Reaching a CE implies the capability of players to coordinate their decisions in a distributed way, as if there were an external observer that they all trust sending recommendations to the players. These recommendations are associated with a global probability distribution, created by the external entity, over the set of strategy vectors. Another, more practical, reason for a game having a CE is that the players use the same random events (e.g. due to interference in wireless communications) for choosing their strategies.
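The recommendation-based view of a CE can be made concrete with a small check for two-player games: a distribution over action profiles is a CE if no player gains, on average, by replacing a recommended action with another one. The Chicken game payoffs below are a textbook-style example chosen as an assumption, and the function name is hypothetical.

```python
def is_correlated_equilibrium(payoffs, distribution, tol=1e-9):
    """payoffs[(a1, a2)] = (u1, u2); distribution[(a1, a2)] is the probability
    that the trusted observer recommends that action profile."""
    actions = [sorted({profile[k] for profile in payoffs}) for k in range(2)]
    for player in range(2):
        for recommended in actions[player]:
            for deviation in actions[player]:
                gain = 0.0
                for other in actions[1 - player]:
                    if player == 0:
                        profile, deviated = (recommended, other), (deviation, other)
                    else:
                        profile, deviated = (other, recommended), (other, deviation)
                    probability = distribution.get(profile, 0.0)
                    gain += probability * (
                        payoffs[deviated][player] - payoffs[profile][player]
                    )
                if gain > tol:  # ignoring the recommendation would pay off
                    return False
    return True


# Chicken game: "dare" or "chicken" (hypothetical payoff values).
chicken = {
    ("chicken", "chicken"): (6, 6),
    ("chicken", "dare"): (2, 7),
    ("dare", "chicken"): (7, 2),
    ("dare", "dare"): (0, 0),
}

# The observer never recommends (dare, dare).
signal = {
    ("chicken", "chicken"): 1 / 3,
    ("chicken", "dare"): 1 / 3,
    ("dare", "chicken"): 1 / 3,
}

signal_is_ce = is_correlated_equilibrium(chicken, signal)
expected_payoff_per_player = sum(p * chicken[a][0] for a, p in signal.items())
```

With these values the expected payoff per player under the signal is 5, above the 14/3 each player obtains in the symmetric mixed NE of the same game, which illustrates the welfare gain mentioned above.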
When a game is deployed in the network via a distributed algorithm, there is always the need to tune it to find the right balance between network optimization and service fairness (e.g. among customers). A hybrid solution based on a Stackelberg game (e.g. the network operator is the leader) combined with a CG (e.g. each coalition, formed by local customers with similar service expectations, is a follower) can be very useful in these scenarios.
Computing an NE is generally hard in a repeated game (RG), because the strategy space of an RG is much larger than that of a one-shot game. In addition, in an RG, due to the many possible strategies the opponents of each player might be using, it is hard to learn which strategies will do well. In [51], the authors proposed a new refinement of NE in RGs, namely the level-k equilibrium, in order to develop tractable and general algorithms to compute the NE of RGs. Based on this concept, they show that the computational effort to find the NE of an RG can be diminished by aggregating players within groups of k elements. Inside each group, there is complete coordination among its elements, which enables a group evolution (learning) to select a stable payoff profile from which no element of that group has any incentive to deviate. This proposal is only discussed for symmetric RGs and it uses neither mutation nor crossover.
To discover the solution of an NC game, two classes should be considered [52]. The first is the class of Nash Equilibrium Problems (NEPs), where the interactions among players take place at the level of objective functions only. The second is the class of Generalized NEPs (GNEPs), where the choices available to each player also depend on the actions taken by her/his rivals. The NEP is, by far, better studied and "easier". The GNEP has a wider range of applicability, but sparser results are available for studying it. In this context, the usage of Variational Inequality (VI) can be very useful. VI is a discipline in the field of mathematical programming which provides a unifying framework to study optimization and equilibrium problems [53]. Further information on VI is in [52], and on game solutions in [44], [54].
Fig. 6 summarizes how a classical game can be represented (strategic vs. coalition) to find a problem's solution (NE, Pareto, Core, Shapley) [44]. Then, the set of available solutions can be studied from diverse perspectives (existence, uniqueness, characterization, efficiency). As a final step, an algorithmic solution is designed and its behaviour in solving the initial problem is studied (e.g. Best-Response Dynamics). The definition of Best-Response Dynamics is available in [44], among other very relevant game solution concepts.
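To give an intuition for Best-Response Dynamics, the following sketch applies round-robin best responses to a hypothetical channel-selection game, in which the cost of a player is the number of other players sharing its channel. The game, the update order, and all names are assumptions introduced for illustration and are not taken from [44].

```python
def best_response_dynamics(n_players, n_channels, max_rounds=100):
    """Players take turns moving to the least loaded channel until
    nobody can strictly reduce its own cost (a pure NE)."""
    choice = [0] * n_players  # everyone starts on channel 0
    for _ in range(max_rounds):
        changed = False
        for player in range(n_players):
            # Load that the other players put on each channel.
            load = [0] * n_channels
            for other, channel in enumerate(choice):
                if other != player:
                    load[channel] += 1
            best = min(range(n_channels), key=lambda channel: load[channel])
            if load[best] < load[choice[player]]:
                choice[player] = best
                changed = True
        if not changed:
            break  # fixed point: every choice is a best response
    return choice


allocation = best_response_dynamics(n_players=4, n_channels=2)
```

In this toy game the dynamics stop at a balanced allocation, where no player can lower its own interference by switching channel.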

Definition of an Evolutionary Game
We discuss evolutionary game theory (EGT) to model dynamic games [4]. The term "dynamic" applies to the strategy of each player. EGT has been developed as a mathematical framework to study the interaction among rational biological agents in a population. In evolutionary games, the agent evolves its chosen strategy based on its payoff. In this way, both the static and the dynamic behaviour of the game can be analysed [3]. On the one hand, Evolutionary Stable Strategies (ESSs) are used to study a static evolutionary game. On the other hand, replicator dynamics is used to study a dynamic evolutionary game (see more about this below). EGT usually considers a set of players that interact within a game and then die, giving birth to a new player generation that fully inherits its ancestors' knowledge. The new player's strategy is evaluated against that of its ancestors and its current environmental context. Also, through mutation, a slightly distinct strategy may be selected, probably offering better payoffs. Next, the player competes with the other players within the evolutionary game using a strategy that increases its payoff. In this way, strategies with high payoffs will survive inside the system as more players will tend to choose them, while weak strategies will eventually disappear. In the following, we present a mini tutorial on how EGT can be applied to wireless networks [4].
Formally, within an evolutionary game we consider an infinite population of individuals that react to changes in their environment using a finite set of $m$ pure strategies $S = \{s_1, s_2, \ldots, s_m\}$. There is also a population profile, $X = \{x_1, x_2, \ldots, x_m\}$, which denotes the popularity of each strategy $s_i \in S$ among the individuals. This means that $x_i$ is the probability that strategy $s_i$ is played by the individuals. For this reason, $X$ is also designated as the set of mixed strategies.
Consider an individual in a population with profile $X$. Its expected payoff when choosing to play strategy $s_i$ is given by $f(s_i, X)$. In a two-player game, if an individual chooses strategy $s_i$ and its opponent responds with strategy $s_j$, the payoff of the former player is given by $f(s_i, s_j)$. More generally, the expected payoff of strategy $s_i$ is evaluated by $f_i = \sum_{j=1}^{m} x_j \, f(s_i, s_j)$, whereas the average payoff is given by $f_X = \sum_{i=1}^{m} x_i \, f_i$.

Analysing an Evolutionary Game: from Population Profile (game internal aspect) towards Evolutionary Stable Strategy (result obtained by an external entity)
To analyse an evolutionary game, it is fundamental to study the behaviour of a special function designated as replicator dynamics. The replicator dynamics is a differential equation that describes the dynamics of an evolutionary game without mutation [4]. According to this differential equation, the rate of growth of a specific strategy is proportional to the difference between the expected payoff of that strategy and the overall average payoff of the population, as follows: $\dot{x}_i = x_i \, (f_i - f_X)$, where $f_i$ is the expected payoff of strategy $s_i$ and $f_X$ is the average payoff of the population. Using this equation, if a strategy has a better payoff than the average, the share of individuals in the population that choose it increases. On the contrary, a strategy with a lower payoff than the average is preferred less and is eventually eliminated from $X$.
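The replicator equation can be explored numerically. The sketch below integrates it with a simple forward-Euler scheme; the step size, the function names and the Hawk-Dove payoff matrix (resource value 2, fight cost 4) are illustrative assumptions and do not come from [4].

```python
def replicator_step(x, payoff, dt=0.01):
    """One Euler step of dx_i/dt = x_i * (f_i - f_avg)."""
    m = len(x)
    # Expected payoff of each pure strategy against the current profile.
    f = [sum(payoff[i][j] * x[j] for j in range(m)) for i in range(m)]
    # Average payoff of the whole population.
    f_avg = sum(x[i] * f[i] for i in range(m))
    x_new = [x[i] + dt * x[i] * (f[i] - f_avg) for i in range(m)]
    total = sum(x_new)  # renormalise to guard against rounding drift
    return [v / total for v in x_new]

def simulate(x0, payoff, steps=5000, dt=0.01):
    """Iterate the replicator step from the initial profile x0."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, payoff, dt)
    return x

# Illustrative Hawk-Dove game: payoff[i][j] = f(s_i, s_j), index 0 = hawk, 1 = dove.
HAWK_DOVE = [[-1.0, 2.0], [0.0, 1.0]]

final_profile = simulate([0.1, 0.9], HAWK_DOVE)
```

Starting from 10% hawks, the profile drifts towards an even mix of hawks and doves, because the hawk strategy earns more than the average while it is rare and less than the average once it is common.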
Considering now mutation, suppose that a small group of mutants, making up a fraction $\varepsilon \in [0,1]$ of the population and having a profile $X' \neq X$, invades the population. The profile of the newly formed population is given by $X_{\varepsilon} = \varepsilon X' + (1 - \varepsilon) X$. Hence, the average payoff of non-mutants will be $f(X, X_{\varepsilon}) = \sum_{j=1}^{m} x_j \, f(s_j, X_{\varepsilon})$ and the average payoff of mutants will be $f(X', X_{\varepsilon}) = \sum_{j=1}^{m} x'_j \, f(s_j, X_{\varepsilon})$. A profile $X$ is called an ESS if, for any $X' \neq X$, there exists $\varepsilon_{X'} \in (0,1)$ such that for all $\varepsilon \in (0, \varepsilon_{X'}]$ the following inequality holds: $f(X, X_{\varepsilon}) > f(X', X_{\varepsilon})$. In this way, when an ESS is reached, the population is immune to invasion by other groups with different population profiles. In other words, an ESS is a mixed strategy that is "resistant to invasion" by new strategies. One can also conclude that the ESS is obtained after a successive set of payoff analyses made by an external observer of the game being studied.
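The invasion condition can be checked numerically for a candidate profile. The following sketch tests it on a grid of mutant profiles for a two-strategy game; the fixed invasion fraction of 0.01, the grid resolution and the Hawk-Dove payoffs are assumptions for illustration, so a positive answer is only numerical evidence and not a proof that the profile is an ESS.

```python
def payoff_against(profile, population, payoff):
    """Average payoff f(profile, population) of a group playing `profile`."""
    m = len(profile)
    return sum(profile[i] * payoff[i][j] * population[j]
               for i in range(m) for j in range(m))

def resists_invasion(x, mutant, payoff, eps=0.01):
    """True if non-mutants earn strictly more than mutants in the mixed population."""
    mixed = [eps * mu + (1 - eps) * xi for xi, mu in zip(x, mutant)]
    return payoff_against(x, mixed, payoff) > payoff_against(mutant, mixed, payoff)

def looks_like_ess(x, payoff, eps=0.01, grid=50):
    """Test the invasion condition against mutants (p, 1 - p) sampled on a grid."""
    for k in range(grid + 1):
        mutant = [k / grid, 1 - k / grid]
        if max(abs(a - b) for a, b in zip(mutant, x)) < 1e-12:
            continue  # skip the mutant that coincides with x
        if not resists_invasion(x, mutant, payoff, eps):
            return False
    return True

# Illustrative Hawk-Dove game: index 0 = hawk, 1 = dove.
HAWK_DOVE = [[-1.0, 2.0], [0.0, 1.0]]

mixed_is_ess = looks_like_ess([0.5, 0.5], HAWK_DOVE)
all_hawks_is_ess = looks_like_ess([1.0, 0.0], HAWK_DOVE)
```

With these payoffs the even mix of hawks and doves passes the test, whereas a population of hawks only is invaded by any mutant group that contains doves.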
A very interesting practical use of the ESS result arises when an operator aims to keep a network domain protected from a DDoS attack. If that domain follows a strategy very near to an ESS, the network should continue to operate without problems after a DDoS attack originating in some compromised internal nodes.

Behavioral Facets of Game Theory
The design of wireless networking is challenging due to the highly dynamic environmental condition that makes model parameter optimization a complex task.Due to the dynamic, and often unknown, network status, modern wireless networking proposals increasingly rely on artificial intelligence algorithms.Genetic algorithms (GAs) are well known for their remarkable generality and flexibility and have been applied in a wide diversity of scenarios in wireless networks.A comprehensive survey of the applications of GAs in wireless networks is available in [55].
A GA models how a population of entities with similar characteristics (i.e. a species) can evolve, after several generations and through a process of genetic inheritance constrained by mutation, crossover and natural selection, to a future generation of that species formed by a majority of entities fully specialized in solving a specific problem. Applying the GA methodology to wireless networks, discovering the best solution for a wireless network problem can have a huge impact on the system's performance; this is related to the time needed to find the optimum solution of that problem, as well as the amount of energy depleted from each mobile device's battery to discover that solution in the typical distributed architecture of a wireless networking infrastructure. As a more concrete example, a GA that optimizes either the routing or the localization function in a WSN may not be efficient for real-time applications, due to both the high delay and the resource consumption (e.g. battery) induced by the GA's operation. In this context, alternative learning techniques such as Reinforcement Learning (RL) perform better than GAs, thus increasing the WSN lifespan. RL is a learning technique where each entity does not inherit any knowledge from the previous generation and learns only by itself through direct experience and interaction with the environment.
Despite the GA issues just discussed, a strong advantage of GAs is their easy deployment in digital signal processors (DSPs) or field programmable gate arrays (FPGAs) [55]. Aligned with this, a recent contribution uses a GA to enhance the WSN lifespan by employing a new rotated crossover combined with diverse mutation operators [56].

Our Proposed Taxonomy and Brief Introduction on each Game Type
This section discusses the basic concepts of the most representative games found in the literature, organized in two topmost distinct perspectives: classical and evolutionary (Fig. 7). Firstly, we can expand the classical GT perspective as follows. In non-cooperative GT, the fundamental modelling unit is the individual player, including her/his knowledge about the game status, expectations, and possible static strategies. In this game type, the main goal is to evaluate whether there exists a reasonable solution for that game. This solution implies a set of strategies that the players would rationally select to optimize their own utility or payoff. At this point, it should be clear that a non-cooperative (NC) game is defined as a model in which any desired cooperation must be self-enforced. In addition, the players make their decisions simultaneously. So, each player normally has no information about the decisions of the others. An exception to this characteristic occurs in a special game designated as a Stackelberg game (SG) (or Leader vs. Followers). The initial idea behind a SG is that one player, denoted as the leader, has the right to take the first action. This action is selected so that the leader optimizes its reward after observing the strategies of the other players (i.e. the followers). Then, the leader announces its preferred strategy to the followers. The followers observe the leader's action and adapt their strategies so as to minimize their own cost. After this step, the followers announce their strategies again to the leader. In this way, we can define a SG as a sequential model that analyses the interaction between a leader and a set of followers so that at least one specific model goal can be achieved (Fig. 8). The final aim of this game type is to discover the Stackelberg Equilibrium (SE), given by the vector (Strategy_leader*, Strategy_follower*).
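The sequential structure of a SG lends itself to backward induction: the leader evaluates each of its actions by first computing how the follower would react. The sketch below does this by exhaustive search on discrete action grids. The linear-demand utilities (price equal to a minus the total quantity, zero production cost), the grids and all names are hypothetical and serve only to make the leader-follower ordering concrete.

```python
def follower_best_response(q_leader, follower_actions, a=12.0):
    """Follower maximises its own utility after observing the leader's action."""
    return max(follower_actions, key=lambda q: q * (a - q_leader - q))

def stackelberg_equilibrium(leader_actions, follower_actions, a=12.0):
    """Backward induction: the leader anticipates the follower's reaction."""
    best = None
    for q_l in leader_actions:
        q_f = follower_best_response(q_l, follower_actions, a)
        u_leader = q_l * (a - q_l - q_f)
        if best is None or u_leader > best[2]:
            best = (q_l, q_f, u_leader)
    return best[0], best[1]

# Hypothetical action grids; the follower grid is finer so that its best
# response to every leader action lies exactly on the grid.
LEADER_ACTIONS = [k * 0.5 for k in range(25)]      # 0.0, 0.5, ..., 12.0
FOLLOWER_ACTIONS = [k * 0.25 for k in range(49)]   # 0.0, 0.25, ..., 12.0

strategy_leader, strategy_follower = stackelberg_equilibrium(
    LEADER_ACTIONS, FOLLOWER_ACTIONS)
```

Under these assumed utilities the search returns the pair (6.0, 3.0): by moving first, the leader commits to a larger quantity than the follower, which is the usual first-mover advantage in this kind of model.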
A repeated game (RG) is a NC game that is played through a finite number of turns. A historical log is also kept, with all the players' decisions through the several iterations of that game. We can also classify repeated GT as an extensive form of GT, meaning that a player has to take into account the impact of her/his current action on the future actions of others [57]. This special behaviour can enforce some level of cooperation among the players, including players with no initial intention to cooperate with others (i.e. selfish players). A RG is also sometimes designated as a dynamic game. The authors of [33] propose a taxonomy of applications of RGs in wireless networks that we summarize in Table III.
CGs prove to be very appealing for designing fair, robust, practical, and efficient cooperation strategies in communication networks. The main idea behind a CG is that several players choose to form an alliance because they share a common objective and can do better as a group than by acting alone. We can envision several interesting applications of CGs, namely: heterogeneous wireless network access, small cells, D2D communications, vehicular networks, dynamic spectrum access, Cloud Radio Access Networks, delay-tolerant networks, and wireless sensor networks. A key design issue in CGs is the trade-off between the stability of the coalitions and the network efficiency [3]. Notably, there are important differences in how the payoff is allocated to each player, depending on whether the game is a NC game with a specific incentive to cooperate or a CG. On one hand, when a NC game is used, each player receives his own payoff. On the other hand, in a CG the payoff of a coalition (i.e. a group of players forming a single cluster) is divided among all the elements of that group when the game has a Transferable Utility (TU) [58], [59]. Alternatively, the payoff is not divided among the coalition members when the game has a Non-Transferable Utility (NTU). This means that the individual payoff of each player cannot be given or transferred arbitrarily to other players due to practical impairments [60][61] [9]. Regardless of this, each agent of a specific coalition gets a benefit that depends on the actions chosen by the remaining agents of that coalition.
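For a TU game, one well-known rule for dividing the coalition payoff is the Shapley value, which gives each player its average marginal contribution over all possible joining orders. The sketch below computes it by brute force for a three-player game. The characteristic function values are invented for illustration, and the enumeration of all orderings is only practical for a handful of players.

```python
from itertools import permutations

def shapley_values(players, value):
    """Average marginal contribution of each player over all joining orders."""
    shares = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when it joins the current coalition.
            shares[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: s / len(orders) for p, s in shares.items()}

# Hypothetical characteristic function of a three-player TU game.
COALITION_VALUE = {
    frozenset(): 0.0,
    frozenset("A"): 1.0, frozenset("B"): 1.0, frozenset("C"): 1.0,
    frozenset("AB"): 4.0, frozenset("AC"): 3.0, frozenset("BC"): 2.0,
    frozenset("ABC"): 6.0,
}

shares = shapley_values("ABC", COALITION_VALUE.__getitem__)
```

For these values the shares are 2.5 for A, 2.0 for B and 1.5 for C. They add up to the value of the grand coalition, and the player that contributes most to the pairwise coalitions receives the largest share.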
There are three types of CGs.The first is designated by canonical games.The main goal of this game is to get the grand coalition of all users.The major potential problem is how to stabilize that grand coalition.The second type is designated by coalition formation games.These games assume that forming a coalition brings advantage to its members but the gains are limited by the cost for forming that coalition.So, the key question is how to form an adequate coalitional structure (topology) and how to study its properties?In addition, the algorithm used by a coalition formation game has typically two functions, i.e. to merge and split coalitions.The convergence time of these functions is very critical for the game scalability.In the third and last type, we have coalitional graph games.In these games the players' interactions are ruled by a communication graph structure.The key question associated with this game type is how to stabilize the graph structure.In this game, the interconnection between the players strongly affects the characteristics as well as the outcome of the game [62].
In an evolutionary game, the game is played repeatedly among selected agents with evolving strategies. The two major mechanisms associated with the evolutionary process along the successive population generations are mutation and selection. The mutation (static) mechanism is used to guarantee the diversity of the population. The selection (dynamic) mechanism is used to promote the genetic code of agents with higher fitness over agents with lower fitness. In this way, the latter agents tend to disappear as the evolutionary process continues. Applying evolutionary algorithms to theoretical games allows players with limited rationality to learn from the environment and take individual decisions for attaining each game's equilibrium with minimum control exchange. We have found in the literature some important contributions that illustrate how evolutionary game-theoretic (EGT) models may be successfully applied to analyse a huge variety of networking functional aspects. The authors of [4] review the literature concerning the applications of EGT to distinct network types such as wireless sensor networks, delay-tolerant networks, peer-to-peer networks and wireless networks in general, including heterogeneous 4G networks and cloud environments. In addition, [3] discusses selected applications of EGT in wireless communications and networking, including congestion control, contention-based (i.e. Aloha) protocol adaptation, power control in CDMA, routing, cooperative sensing in cognitive radio, TCP throughput adaptation, and service-provider network selection. A comprehensive discussion about finding stable states in evolutionary games is available in [51]. The authors of [63] discuss the stability of an RL-based distributed mechanism for strategy and payoff learning in 4G networks based on evolutionary game dynamics. Further, a broader perspective on how evolutionary games can be very useful in future wireless networks is available in [64] [65].
We have used the open-source Axelrod library¹ to model a population with natural evolution and a mutation rate of 0.05. In each round, every player interacts with every other player via a prisoner's dilemma game, using one of the following strategies: Cooperator, Defector, Stochastic WSLS, and ZD-GTFT-2. After the scores are summed for each player, one player is chosen to reproduce with probability proportional to its score (fitness-proportionate selection) and another player is chosen to die, keeping the population size constant at 30 players (i.e. the Moran model). The simulation is stopped after 1000 rounds because the game never converges to a state with a single type of player. Fig. 9 shows averaged values from 200 turns per round. The final winning strategy is Zero Determinant Generous Tit For Tat. A reason for this is that strategies that are slow to anger and fast to forgive show the best performance, because they recover more efficiently from errors made in previous rounds regarding the players' intention to collaborate. The results also suggest that players who chose Always Defect earn substantially less than those who chose conditionally cooperative strategies such as the variants of Tit-for-Tat. That is why the former player type has almost disappeared at the end.
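The mechanics of such a Moran process can be sketched with the Python standard library alone. The code below is not the Axelrod library and does not reproduce Fig. 9: the three memory-one strategies, their cooperation probabilities, the payoff values and the short default match length are simplifying assumptions chosen to keep the example small.

```python
import random

# Memory-one strategies: probability of cooperating after the previous
# (own move, opponent move) was (C,C), (C,D), (D,C) or (D,D). Assumed values.
STRATEGIES = {
    "Cooperator": (1.0, 1.0, 1.0, 1.0),
    "Defector": (0.0, 0.0, 0.0, 0.0),
    "GenerousTFT": (1.0, 0.3, 1.0, 0.3),
}
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
STATE = {("C", "C"): 0, ("C", "D"): 1, ("D", "C"): 2, ("D", "D"): 3}

def play_match(a, b, turns, rng):
    """Total scores of strategies a and b in an iterated prisoner's dilemma."""
    pa, pb = STRATEGIES[a], STRATEGIES[b]
    ma = "C" if pa[0] > 0 else "D"  # opening move
    mb = "C" if pb[0] > 0 else "D"
    score_a = score_b = 0
    for _ in range(turns):
        score_a += PAYOFF[(ma, mb)]
        score_b += PAYOFF[(mb, ma)]
        next_a = "C" if rng.random() < pa[STATE[(ma, mb)]] else "D"
        next_b = "C" if rng.random() < pb[STATE[(mb, ma)]] else "D"
        ma, mb = next_a, next_b
    return score_a, score_b

def moran_process(population, rounds, turns=20, mutation=0.05, seed=0):
    """Birth-death process with fitness-proportionate selection and mutation."""
    rng = random.Random(seed)
    pop = list(population)
    names = list(STRATEGIES)
    history = []
    for _ in range(rounds):
        scores = [0] * len(pop)
        for i in range(len(pop)):
            for j in range(i + 1, len(pop)):
                si, sj = play_match(pop[i], pop[j], turns, rng)
                scores[i] += si
                scores[j] += sj
        # One player reproduces with probability proportional to its score.
        parent = rng.choices(range(len(pop)), weights=scores)[0]
        child = pop[parent]
        if rng.random() < mutation:
            child = rng.choice(names)  # mutate to a uniformly random strategy
        # One uniformly chosen player dies, so the population size is constant.
        pop[rng.randrange(len(pop))] = child
        history.append({name: pop.count(name) for name in names})
    return history

initial = ["Cooperator"] * 10 + ["Defector"] * 10 + ["GenerousTFT"] * 10
history = moran_process(initial, rounds=10)
```

Each entry of history records how many of the 30 players use each strategy after a round. Raising rounds to 1000 and turns to 200 approaches the scale of the experiment described above, at a correspondingly higher running time.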

Fig. 9. Average Number of Individuals for each Type along 1000 Rounds of an Evolutive Game with Mutation
A Bayesian game (BG) is characterized by game information that is not common knowledge, allowing concepts such as private information and secret information to appear [4]; alternatively, the game players do not have complete information on the environment they face, due to practical physical impairments that prevent the global dissemination among the nodes of, e.g., channel gain information [66]. Following John C. Harsanyi's framework, a BG is modelled by introducing Nature as a player in that game. This framework changes the game type from Incomplete to Imperfect Information, where Imperfect Information means that the history of the game is not available to all players. These games are called Bayesian because of the probabilistic analysis inherent in them. Players have initial beliefs about others' payoff functions. A belief is a probability distribution over the possible types of a player. Then, the initial beliefs might change based on the actions the players have played. Comparing a BG with a non-BG (both in normal form), the latter only requires the specification of strategy spaces and payoff functions, whereas the former requires the additional specification of beliefs for every player.
The players in BGs select their strategies according to Bayes' Rule [67]. A comprehensive contribution that introduces Bayesian mechanism design is available in [68]. When a game with incomplete information is repeated, the folk theorem [69] can be very useful to find a stable solution for that RG. This theorem proves that the individually rational payoffs of a one-shot game (repeated infinitely) can be approximated by sequential equilibrium payoffs of a long but finite RG of incomplete information, where players' payoffs are almost certainly as in the one-shot game.
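The way a belief over the possible types of another player is revised after observing an action can be written directly from Bayes' Rule. In the sketch below, the two types, the two actions and all probabilities are hypothetical values chosen for illustration.

```python
def update_belief(prior, likelihood, action):
    """Posterior belief over opponent types after observing `action`."""
    joint = {t: prior[t] * likelihood[t][action] for t in prior}
    evidence = sum(joint.values())
    if evidence == 0:
        return dict(prior)  # action impossible under every type: keep the prior
    return {t: v / evidence for t, v in joint.items()}

# Hypothetical initial belief about the type of a neighbouring node.
prior = {"cooperative": 0.5, "selfish": 0.5}

# Hypothetical probability of each observable action for each type.
likelihood = {
    "cooperative": {"share": 0.9, "hoard": 0.1},
    "selfish": {"share": 0.2, "hoard": 0.8},
}

posterior = update_belief(prior, likelihood, "hoard")
```

After a single observation of the hoard action, the belief that the neighbour is selfish rises from 1/2 to 8/9; applying the function repeatedly, with each posterior used as the next prior, tracks the belief along the history of play.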

MEC Foundational Aspects
The emerging MEC paradigm aims to support, at the network periphery, ubiquitous and efficient access to a small number of operational resources, such as data and/or service storage (e.g. distributed caches, proxies), computational capabilities, and data dissemination. MEC can address the following aspects: enhancing network coverage and capacity, performing load balancing at backhaul links, orchestrating the usage of available network/computing/storage resources according to the QoS/QoE of end-users, increasing network lifetime by saving energy mainly at battery-operated devices, and optimizing data routing based on customers' social connections.
The models for distinct entities of MEC systems are comprehensively discussed in [19], including for computational tasks, wireless data communication channels and networks, as well as the computation latency, delay on data availability and energy consumption of mobile devices or MEC servers operating at the network edge.
Further advantages of using MEC are highlighted in [21]:
• It reduces the data traffic, cost, and latency and improves QoS/QoE, since cloud resources and services are located close to users.
• It alleviates the major bottleneck and the risk of a potential point of failure, since it has a decentralized design.
• It enhances security since data is encrypted as the data is moved towards the network edge.
• It provides high levels of scalability, reliability, and automation.
The reader can obtain further information about MEC from different perspectives, such as: standardization [70], industry [71], or academia [72].As already stated before, our current main contribution is to envisage and discuss how GT can be used to empower the gradual deployment of emerging MEC scenarios without penalizing the satisfaction of players' expectations.In addition, a remarkable aspect involved in creating a game model is the design of utility functions to evaluate the players' payoffs associated to the available set of strategies.These utility functions should be very well aligned with the final outcomes of each game as well as the specific scenario requisites to be satisfied.A taxonomy and research challenges of utility functions for strategic radio resource management games is available in [73].Additionally, a recent work [74] proposes an abstract algorithm, designated by BLMA, to pair strategic agents in a generic two-sided market without the conventional allocation of utilities among users.This solution is considered in the scenario of cognitive radios and ensures privacy among users.

Game Theory for Modeling and Analysis of Wireless Data Communication Networks
This section comprehensively discusses work that has investigated key topics such as resource sharing in wireless data communication networks, assuming scenarios aligned with the MEC vision, which is further debated in section 4. The analysed scenarios include hierarchical cellular, D2D/M2M, vehicular, and smart grid communications. The network resources, as well as model parameters, system constraints, trade-offs among players' interests and optimization objectives, are identified, modelled, and investigated. The section also studies how GT can achieve all the above-mentioned objectives.

Generic Perspective
We have found in the literature a significant number of surveys about GT being used as an analytical tool for enhancing notable networking features. These surveys cover the following wireless network aspects: single-hop [75], vehicular [93], and D2D communications [94]. There is also a survey related to telecommunications [95]. Among these, we should mention the following work: a comprehensive discussion of GT usage from a network-layered perspective is given in [75]; multiple access games are analysed in [43]; GT models for random access with CSMA are covered in [76]; games about resource management and admission control are addressed by [77]; game theoretical contributions for network selection and resource allocation are available in [31][79] [80]; opportunistic communications in hierarchical cognitive networks are the scope of [90]; cooperation stimulation mechanisms for wireless multihop networks are investigated in [37], including their strengths and weaknesses; and evolutionary CGs for wireless networking and communications are addressed in [64] [96].
Several game theoretical contributions that are highly relevant for spectrum sharing are discussed in [97][61] [98][99][100][101] [102], including a theoretical framework (taxonomy) to systematically understand and tackle the issue of the economic viability of cooperation based on dynamic spectrum management [102]. The authors of [103] propose a constrained coalition formation game, where each UE is a player whose cost is identified as the content upload time. The solution of the game determines the stable feasible partition for the UEs in the cell. The proposed cooperative content uploading scheme then guarantees lower upload delays than the traditional cellular operation mode. In addition, the authors of [26] review GT for existing cooperation stimulation mechanisms. They also discuss important issues in this field, such as false judgment and node collusion, and argue that the root of these problems lies in the inability to accurately evaluate the behaviour of a node. This requires further investigation.
For absolute beginners in GT, a recent work [104] offers code to reproduce some of the discussed results, whereas we suggest [38][3][4] [5] for readers interested in a thorough discussion of GT applied to wireless communications. The authors of [104] discuss a two-player strategic-form game designated as the near-far effect (NFE) game, in which two transmitters interfere with each other in the attempt to reach their own receivers. This simple scheme is the foundational model of many scenarios discussed throughout our publication, namely: i) communications among neighbouring cells; ii) hierarchical networks formed by small cells within macrocells; iii) cognitive radio systems joining primary and secondary users; iv) device-to-device communications. The utility function of each terminal expresses a degree of satisfaction that depends both on the success of its transmission and on the energy spent to transmit at a specific power. In addition, due to the selfish behaviour of the players (in the sense of self-optimization of the utility), this game has the problem of the NE being socially inefficient, since the system distributes most of the available resources to the devices that can achieve higher throughputs. To solve this problem, they continue their discussion by presenting three popular methods that can improve the NE efficiency of the discussed game, as follows: i) modifying the utility functions by pricing the strategies (i.e. introducing a pricing factor as an externality in the utility function); ii) repeating the game; and iii) letting the players cooperate using a bargaining algorithm (note: when this solution is scaled out to games with more than two players, coalitional GT is needed). Of all these methods to enhance the NE efficiency, bargaining ensures the highest level of fairness among the users. In addition, the distributed method that finds a NE nearer to the social-optimum point is the repeated game. This last result is very interesting because it shows that a distributed solution can manage the system and obtain similar results to a centralized one. Nevertheless, the former can have two practical drawbacks: unacceptable latency due to a high number of repetitions, and failure to find the social-optimum solution due to incomplete system information. These challenges are very appealing for further investigation. Figs. 10-13 illustrate how pricing guarantees NFE social efficiency for various values of gamma (the other parameters are static), which is associated with the access technology, including receiver processing [104]. The optimum price of each scenario is also shown. In the next section, we narrow our survey to the distinct types of games for optimizing the efficient operation of wireless networking scenarios, more specifically the ones well aligned with the current vision of MEC.

Non-cooperative Games
From the available literature, we have selected non-cooperative (NC) games whose results can be potentially more appealing for the upcoming MEC scenarios, as follows: network resource sharing [77].
In the following, we analyse the literature in diverse scenarios with selfish players. The discussed work is related to spectrum sharing [108], power control in either one-hop [112] or multi-hop [118] uplink cellular communications, media access protocols [40][28], business models [142], multirate opportunistic routing [128], energy efficiency [110], scheduling of beacons (discovery signals) for competing drones [130], and allocating resources according to the expected load [141].
Communication at the network edge requires mitigating the interference not only within a cell but also among cells. This interference mitigation is very important to optimize the usage of network resources. The authors of [108] study spectrum sharing for D2D communications in scenarios such as public safety and vehicular communications. The pool of shared spectrum is obtained from distinct mobile operators involved in a NC game that analyses possible negotiation strategies among them. The mobile operators submit proposals to each other in parallel until a consensus is found. The authors reach the following conclusions: i) a unique equilibrium point exists when every mobile operator has a concave utility function on the box-constrained region and all eigenvalues of the derivatives of the iterative response process are less than unity. If a mobile operator does not fulfil these two conditions, then it cannot be accepted within the network, since its participation could cause the existence of multiple equilibria, which is undesirable; ii) the iterative algorithm based on the mobile operators' best response might not converge to the equilibrium point, because each operator myopically overreacts to the responses of the other mobile operators; iii) alternatively, when the Jacobi-play iterative algorithm for updating the mobile operators' strategies is used with a proper smoothing parameter, all mobile operators experience performance gains compared with the scheme without spectrum sharing. The Jacobi-play strategy update assumes that all players adjust access probabilities in a specific direction, depending on the measured congestion level of the network, and towards the best-response strategy; iv) in this game, asymmetric mobile operators could contribute unequal amounts of resources to the spectrum pool; v) the authors also assume that the iteration converges much faster than any change detected in the channel.
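The contrast between conclusions ii) and iii) can be reproduced with a toy model. The sketch below compares a plain simultaneous best-response update with a smoothed, Jacobi-style update in which every player moves only part of the way towards its best response. The three-player linear best-response function, its coefficients and the smoothing value 0.5 are assumptions made for illustration and are not the utilities of [108].

```python
def best_responses(x, a=1.0, b=0.6):
    """Hypothetical linear best responses, clipped to the box [0, 1]."""
    total = sum(x)
    return [min(1.0, max(0.0, a - b * (total - xi))) for xi in x]

def jacobi_play(x0, alpha, iters=200):
    """All players update in parallel, moving a fraction alpha towards
    their best response to the previous iterate."""
    x = list(x0)
    for _ in range(iters):
        targets = best_responses(x)
        x = [(1 - alpha) * xi + alpha * ti for xi, ti in zip(x, targets)]
    return x

# Interior equilibrium of the toy model: x = a / (1 + 2b) for three players.
EQUILIBRIUM = 1.0 / 2.2

start = [0.2, 0.5, 0.9]
plain = jacobi_play(start, alpha=1.0)     # pure simultaneous best response
smoothed = jacobi_play(start, alpha=0.5)  # smoothed update
```

With alpha equal to 1, every player overreacts to the others and the iterates end up oscillating between the two extremes of the box. With alpha equal to 0.5, the same starting point converges to the interior equilibrium, which is the role of the smoothing parameter in this toy setting.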
At the network edge, it is also relevant to control the transmission power of cognitive radio nodes. The authors of [112] propose a new chaos-based cost function to design a power control algorithm and analyse the dynamic spectrum sharing issue among secondary users for their uplink communications through cellular cognitive radio networks. They specify utility/cost functions considering the interference from, and the interference tolerance of, the primary users. They show that their power control game rapidly converges to a single NE. This result is obtained together with a reduction in the power consumption of cognitive radio terminals, at the expense of a small drift (1-3%) from the target SINR in comparison with other existing game algorithms. The authors of [118] use NC game theory paired with RL methods to investigate the problem of power allocation in the uplink communication at both source and relay nodes in a multiple-access, multiple-relay scenario. This model achieves a good compromise between energy efficiency and overall data rate.
We have witnessed a significant increase in the number of wireless networks operating at the network edge. This requires a more intelligent and flexible MAC that adjusts MAC system parameters (e.g. sensor wakeup period) to reach a fair equilibrium point, minimizing both the system latency and energy consumption [40]. It also means a denser spectrum usage, which requires innovation to reuse that spectrum and minimize the interference among the wireless transmitters in case of competitive transmissions [28].
A very good example of a game whose players are not directly associated with network entities is available in [40]. In fact, the players of this game are network performance metrics, namely energy and delay. These performance metrics have conflicting operational trends. On one hand, the delay is minimized at the cost of increasing energy consumption. On the other hand, the energy consumption is minimized at the cost of increasing delay. Consequently, these two metrics have been formalized as the players of a NC game to solve a conflicting multi-objective optimization problem in energy-constrained, delay-sensitive wireless sensor networks. Increasing the sensor wakeup period reduces the energy consumption but increases the end-to-end (e2e) delay. The authors solve their game over a duty-cycled MAC protocol with the length of the wake-up period as the single parameter to be optimized. They have found the Nash bargaining solution (NBS) [44] to balance energy consumption and e2e delay. As a final game result, given the two performance requirements (i.e. the maximum latency tolerated by the application and the initial energy budget of the nodes), the proposed game framework allows tuneable MAC system parameters (e.g. the sensor wakeup period) to be set so as to reach a fair equilibrium that minimizes both latency and energy consumption.
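The idea of bargaining between the energy and delay "players" over a single wake-up period can be sketched as a one-dimensional search that maximizes the Nash product, i.e. the product of each player's gain over its worst acceptable value. The energy and delay models below, their constants and the grid search are hypothetical and are not the models used in [40].

```python
def energy(period, c_e=1.0):
    """Hypothetical model: energy use falls as the wake-up period grows."""
    return c_e / period

def delay(period, c_d=1.0):
    """Hypothetical model: end-to-end delay grows with the wake-up period."""
    return c_d * period

def nash_bargaining_period(e_max, d_max, step=0.01, t_max=10.0):
    """Grid search for the wake-up period that maximises the Nash product
    (e_max - energy) * (d_max - delay) over the feasible periods."""
    best_t, best_product = None, 0.0
    for k in range(1, int(t_max / step) + 1):
        t = k * step
        gain_e = e_max - energy(t)  # margin with respect to the energy budget
        gain_d = d_max - delay(t)   # margin with respect to the latency bound
        if gain_e > 0 and gain_d > 0 and gain_e * gain_d > best_product:
            best_t, best_product = t, gain_e * gain_d
    return best_t

# Hypothetical requirements: energy budget 4 and tolerated latency 4.
balanced_period = nash_bargaining_period(e_max=4.0, d_max=4.0)
```

With these symmetric assumptions the search settles on a period of 1, where both metrics keep the same margin with respect to their limits; tightening one requirement shifts the selected period in favour of the corresponding metric.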
The authors of [28] studied the scenario of Network Utility Maximization (NUM) applied to a mobile cell with interference among the terminals within that cell, when some cooperation among the users in the form of (pricing) message passing is allowed.The NUM is a distributed algorithm for the resource allocation in a constrained network with the global objective of maximizing the total utility of users connected to that network.Using NUM, they aim to maximize the sum-throughput of the users with respect to the transmission thresholds.In this way, they propose a distributed pricing-based algorithm that converges to a stationary solution of the NUM while requiring only limited signalling among the users.This algorithm is more suitable in "collaborative" contexts, where users are willing to exchange some (limited) information (e.g.accepting some overhead and privacy loss) in favour of better performance.From their experiments, they have also concluded that in networking scenarios with "high" interference, the cooperation among the users via pricing enhances the performance of network.The authors envisage further related work in multi-hop wireless networks.
The work available in [142] proposes and analyses a business model for a likely scenario in the Internet of Things (IoT), which is made up of wireless sensor networks, service providers and users. The service providers compete against each other in the intermediation between the virtualized WSNs and the users that benefit from enhanced services built on the sensed data. The service providers pay the WSNs for the data and charge the users for the service. The model is analysed by using oligopoly theory [29] and GT, the conditions for the existence and uniqueness of the NE are established, and the equilibrium and the social optimum are evaluated. This work could be enhanced by adopting a more realistic pricing scheme instead of the simple linear pricing scheme on the WSN side.
Article [128] explores the combination of two possible features of wireless networks: multi-rate and opportunistic routing. The multi-rate feature is related to the multiple transmission bit rates specified by IEEE 802.11 protocols. Opportunistic routing allows any node that overhears a prior packet transmission to participate actively in packet forwarding. When these two features are combined and every node follows the routing and incentive protocol, the system performance is optimized and each node maximizes its payoff. Specifically, the incentive protocol works as follows. The authors add to the network some probe messages, which measure the link loss probabilities and carry a cryptographic component to prevent them from being forged, and design a payment scheme to guarantee that the nodes cannot benefit from manipulating the link loss probability measuring process or from deviating from the routing decision. So, the spirit behind this incentive protocol is to accept some network overhead, due to the extra probing traffic among the nodes, in order to maximize the global network performance. Further work is required to protect this proposal from collusion.
In [110] the authors have proposed a framework to develop centralized and decentralized power control algorithms for energy efficiency optimization in multi-cell massive MIMO networks in the presence of relay nodes involved in multicarrier communications. From their results, they have concluded that centralized algorithms are quite robust to the enforcement of demanding rate constraints. The distributed algorithms, instead, are more sensitive to rate constraints, especially for increasing maximum feasible powers, due to the lack of centralized interference management. Also, the centralized algorithms perform better than their distributed counterparts, both with and without rate constraints, at the expense of higher computational complexity and feedback requirements. A challenging problem remains open: a suitable optimization framework is needed to effectively determine the global solution of energy-efficient optimization in interference-limited networks.
The authors of [130] study the scheduling of beacons from an energy efficiency perspective, based on a NC game for two competing drones that cover (e.g. using 3G, 4G, WiFi) two small cells with distinct device densities, as visualized in Fig. 14. The backhaul communication of each drone is provided by a satellite C-band link. Each drone has two possible strategies: it is either in "idle mode" or sending beacons to mobile users on the ground. The UAV's payoff is the difference between the positive outcome of a successful first contact with mobile users on the ground and the energy cost of achieving it. They aim to study network configurations that maximize the likelihood of getting in contact with the mobile users on the ground with the minimum energy consumption, which is vital for drones to fly as long as possible and so support wireless connectivity for a long time.

Fig. 14. Drone small cells (figure labels: high density area, low density area, UAVi, UAVj)
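The two-strategy beaconing game of [130] can be illustrated with a small bimatrix example. The sketch below is only indicative: the payoff values are hypothetical and are not taken from [130]; they merely encode "contact reward minus energy cost", with a penalty when both drones beacon at the same time. The function enumerates the pure-strategy NE by checking that no drone gains from a unilateral deviation.

```python
from itertools import product


def pure_nash_equilibria(payoff_a, payoff_b):
    """Return all pure-strategy Nash equilibria of a two-player game.

    payoff_a[i][j] and payoff_b[i][j] are the payoffs of players A and B
    when A plays row i and B plays column j.
    """
    n_rows, n_cols = len(payoff_a), len(payoff_a[0])
    equilibria = []
    for i, j in product(range(n_rows), range(n_cols)):
        # A must not gain by switching rows while B keeps column j.
        a_best = all(payoff_a[i][j] >= payoff_a[k][j] for k in range(n_rows))
        # B must not gain by switching columns while A keeps row i.
        b_best = all(payoff_b[i][j] >= payoff_b[i][k] for k in range(n_cols))
        if a_best and b_best:
            equilibria.append((i, j))
    return equilibria


# Hypothetical payoffs: strategy 0 = idle, strategy 1 = send beacons.
# A lone beaconing drone nets 3; if both beacon, each nets -1.
BEACON_A = [[0, 0], [3, -1]]
BEACON_B = [[0, 3], [0, -1]]

print(pure_nash_equilibria(BEACON_A, BEACON_B))
```

With these assumed payoffs the game has two pure NE, in each of which exactly one drone sends beacons while the other stays idle.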
In the literature, a vast number of games assume that the number of players is fixed and known by everyone. However, in some scenarios the initial number of players competing for the same network resources is completely unknown. To model this situation, the authors of [141] propose a NC game where the strategic behavior of customers in making advance reservations is driven not only by the prices, but also by their beliefs about other customers' decisions. This game has several NE, in some of which the provider could obtain a null profit. To avoid this, the provider may opt to be risk-averse and set a price yielding sub-optimal but guaranteed revenue.

Summary
Before choosing a NC game to support the correct operation of wireless networks, a system designer should be aware that this choice offers advantages but may also imply some disadvantages. On the one hand, the main advantages are: it models wireless nodes competing for a limited set of network resources; it yields a more robust distributed control algorithm than the centralized option; and it reduces the signalling traffic thanks to local processing in each node. On the other hand, the main issues are: the absence of learning, as the game is played simultaneously by all the users during a single time interval; self-interested user behaviour, which can be naturally induced by the lack of information about other players; and the difficulty in attaining a global system optimization due to the distributed system design.

Stackelberg Games
NC games have a relevant drawback: the absence of player learning, which can impair the discovery of a game equilibrium and diminish the game efficiency. Stackelberg games (SGs) try to enhance the efficiency of game results by playing the game in a sequential way. The papers surveyed here that rely on SGs involve diverse aspects, namely: offloading mobile computation [30], D2D communications [143][144][38], the effect of social networks on network congestion [145], opportunistic access in Delay Tolerant Networks [146], maximization of the channel capacity in hierarchical cellular networks [53][66][132], micro-grid management [137][147], P2P streaming systems [32], and resource allocation involving a media cloud [148] and mobile social networks [9]. In what follows, we refrain from discussing papers already covered in other parts of our work.
With high scalability, high video streaming quality, and low bandwidth requirements, peer-to-peer (P2P) systems have become a popular way to exchange files and deliver multimedia content over the internet. However, current P2P systems suffer from "free-riding" due to the peers' selfish nature. In [32] the authors propose a credit-based incentive mechanism to encourage peers to cooperate with each other in a heterogeneous network consisting of wired and wireless peers. The proposed mechanism can provide differentiated service to peers with different credits through biased resource allocation. A SG is formulated to obtain the optimal pricing and purchasing strategies, which can jointly maximize the revenue of the uploader and the utilities of the downloaders. Their results show that the proposed resource allocation scheme is effective in providing service differentiation for peers and stimulating them to contribute to the P2P streaming.
Device-to-device (D2D) communication technology is a promising add-on component for cellular networks that offers the following advantages: it increases spectral and energy efficiency, can decrease transmission delay, offloads traffic from the macro BS, and reduces the load on the backhaul link. In this context, the authors of [143] propose a dynamic SG in which the BS and the potential D2D UEs act as the leader and the followers, respectively. Specifically, the adaptive mode selection of potential D2D UEs is formulated as a follower evolutionary game, and an evolutionary stable strategy is presented as its solution. The dynamic control of spectrum partitioning by the BS is formulated as a leader optimal control problem. They also extend the model formulation by considering information delays in control and state. Their evaluation results suggest that, although the mode selection is performed in a distributed and user-controlled way, the dynamic spectrum partitioning can be viewed as an effective incentive mechanism to drive the user distribution close to the optimal one. In [144] the authors propose a SG to study a distributed resource allocation scheme. In this game-theoretic model, the base station (BS), which is formalized as the leader, coordinates the interference from the D2D transmission to the cellular users (CUs) by pricing the interference. Subsequently, the D2D pairs, as followers, compete for the spectrum in a non-cooperative fashion. Sufficient conditions for the existence of the NE and its uniqueness are presented, and an iterative algorithm is proposed to solve the problem. Their numerical results show that the distributed scheme is effective for the resource allocation and can protect the CUs with limited signaling overhead. The authors of [38] have studied the problem of time-domain scheduling in D2D communications. They have developed a SG in which a cellular user (leader) and a D2D user (follower) are grouped into a leader-follower pair. On this basis, they have proposed an algorithm that combines power control, time-domain scheduling, and spectrum resource allocation for D2D communications. In their study, they have considered the system metrics of throughput, interference, and fairness. Their results show that the proposed algorithm can offer good throughput for both cellular and D2D users, and that the D2D users can also be fairly served. They identified two parameters (i.e. scaling, fairness) which have important effects on the performance of the proposed algorithm.
The rapid growth of online social services has strengthened wireless users' social relationships, which in turn has resulted in more data traffic for sustaining the "social connections". Nevertheless, the increasing demand for wireless social services may challenge the limited capacity of the wireless infrastructure. To build a thorough understanding, the work in [145] models the interplay between mobile users and a wireless provider as a SG, by jointly considering the social network effect in the social domain and the congestion effect in the physical wireless medium. Real data are used to evaluate the performance of the proposed algorithms and to draw useful engineering hints for wireless providers.
In resource sharing networks, opportunistic resources with dynamic quality are available to users to be exploited. As many user tasks are delay-tolerant, this potentially allows the network users to wait for and access the opportunistic resource at the time of its best quality. Aligned with this idea, [146] discusses a leader-follower game. In particular, this solution consists of two subgames. The first subgame models the interactions between the resource seller (leader) and all the network users (followers) as a single-leader multi-follower game, and the second subgame deals with the competition among network users, taking into consideration the influence of the resource seller's and the network users' actions on the dynamics of resource quality. The SE of the proposed solution is then derived, which can guide the behaviors of the leader and the followers for enhanced network performance.
Consumer electricity consumption can be controlled through electricity prices, which is called demand response. Under demand response, retailers determine their electricity prices, and customers respond accordingly with their electricity consumption levels. In a more detailed analysis, the demands of customers who own electric vehicles (EVs) are elastic with respect to price. The interaction between retailers and customers, both visualized in Fig. 15, can be formulated as a game because both attempt to maximize their own payoffs. Accordingly, the authors of [147] model an at-home EV charging scenario as a SG and show that this game reaches an equilibrium point at which the EV charging requirements are satisfied and retailer profits are maximized when customers use their proposed utility function. They give some engineering insights into how the equilibrium of their game varies with the weighting factor of each customer's utility function, resulting in various strategic choices.
Due to the rapid increases in both the population of mobile social users and the demand for quality of experience (QoE), providing mobile social users with satisfactory multimedia services has become an important issue. The media cloud has been shown to be an efficient solution to this issue, by allowing mobile social users to connect to it through a group of distributed brokers. However, as the resources in the media cloud are limited, how to allocate them among the media cloud, brokers, and mobile social users becomes a new challenge. In this context, the authors of [148] propose a four-stage SG for the media cloud to allocate resources to mobile social users through brokers. In this way, the media cloud can dynamically determine the price of the resource and allocate its resources to brokers. A mobile social user can select a broker to connect to the media cloud by adjusting its strategy to achieve the maximum revenue, based on the social features in the community. They propose an iterative algorithm to implement the proposed scheme and obtain a stable SE.
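The leader-follower structure shared by the SGs above can be sketched through backward induction: the followers' best responses are derived first, and the leader then optimizes its price anticipating them. The example below is a minimal sketch under assumptions of our own (a logarithmic follower utility, a linear price and a per-unit leader cost); it does not reproduce the model of any surveyed paper, and all names are hypothetical.

```python
import math


def follower_demand(price, valuation):
    """Best response of a follower with utility valuation*log(1+x) - price*x."""
    # Setting the derivative valuation/(1+x) - price to zero gives x below.
    return max(0.0, valuation / price - 1.0)


def leader_profit(price, valuations, unit_cost):
    """Leader profit when every follower plays its best response to `price`."""
    total_demand = sum(follower_demand(price, v) for v in valuations)
    return (price - unit_cost) * total_demand


def stackelberg_price(valuations, unit_cost, steps=20000):
    """Backward induction by grid search over candidate leader prices."""
    top = max(valuations)  # above this price no follower buys anything
    grid = [unit_cost + (top - unit_cost) * k / steps for k in range(1, steps)]
    return max(grid, key=lambda p: leader_profit(p, valuations, unit_cost))


price = stackelberg_price([4.0, 9.0], unit_cost=1.0)
print(round(price, 2), [round(follower_demand(price, v), 2) for v in (4.0, 9.0)])
```

For this assumed utility, when all n followers buy a positive amount the optimal price has the closed form sqrt(c * sum(a_i) / n), so the grid search above should land close to sqrt(6.5), roughly 2.55.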

Summary
The diverse emerging network scenarios, as we just discussed, provide strong evidence that a hierarchical network infrastructure seems very suitable to be modelled within a Stackelberg framework, with the top-level architecture entity being the leader and the bottom-level entities the followers of the associated SG. The next sub-section reviews the existing repeated (learning) games for cognitive cellular networks where notably energy efficiency is a major concern.

Repeated Games
A repeated game (RG) is an effective tool to avoid potential conflicts among wireless nodes. The occurrence of these conflicts leads to selfish behaviors, resulting in poor network performance and detrimental individual payoffs. Contrary to static one-shot games, which model interactions among players during a single time interval, in RGs the interactions of players repeat over multiple time intervals. Thus, the players become aware of other players' past behaviors and of their own future benefits, and they can adapt their strategies accordingly, resulting in better long-term system performance. In addition, the use of RGs with the aid of a convenient incentive mechanism can encourage wireless nodes to adopt more cooperative strategies, thereby improving network performance and increasing players' benefits. Aligned with these goals, the authors of [33] survey a considerable number of RGs in different wireless networks. A very important advantage of using a RG is that it enables learning at cognitive nodes in a NC game with incomplete and imperfect information [149]. With this learning, the game evolves to a state where the interference is tamed and QoS requirements are fulfilled. In addition, [7] reviews RGs within a more specific scope related to security. We divide the surveyed literature on RGs into two parts, namely non-cooperative and cooperative, as discussed below.
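The way repetition can sustain cooperation may be made concrete with the textbook grim-trigger argument. The sketch below assumes a prisoner's-dilemma stage game with temptation T, reward R and punishment P (T > R > P), and a discount factor between 0 and 1; these symbols and the numerical values are illustrative assumptions, not taken from the surveyed papers. Cooperating forever is worth R/(1-d), whereas deviating once and being punished forever is worth T + d*P/(1-d), so cooperation is sustained when d >= (T-R)/(T-P).

```python
def cooperation_value(reward, discount):
    """Discounted payoff of cooperating in every stage."""
    return reward / (1.0 - discount)


def deviation_value(temptation, punishment, discount):
    """Discounted payoff of defecting once and being punished forever after."""
    return temptation + discount * punishment / (1.0 - discount)


def min_discount_for_cooperation(temptation, reward, punishment):
    """Smallest discount factor for which grim trigger sustains cooperation."""
    return (temptation - reward) / (temptation - punishment)


def grim_trigger_sustains(temptation, reward, punishment, discount):
    """True when no player gains from a one-shot deviation."""
    return cooperation_value(reward, discount) >= deviation_value(
        temptation, punishment, discount
    )


# Hypothetical stage payoffs: T = 5, R = 3, P = 1.
print(min_discount_for_cooperation(5, 3, 1))
print(grim_trigger_sustains(5, 3, 1, discount=0.6))
print(grim_trigger_sustains(5, 3, 1, discount=0.4))
```

With these assumed payoffs the threshold is 0.5: patient nodes (discount 0.6) keep cooperating, while impatient ones (discount 0.4) prefer to defect.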
Discussion about recent work using non-cooperative repeated games
From the literature in the scope of NC RGs, we have selected some recent contributions with the following main goals: radio spectrum sharing [150][29]; multimedia distribution [151][152]; D2D communication [153][154]; and UAVs airborne communications [130].
The NC RGs we have selected are discussed more comprehensively as follows. The radio spectrum sharing among primary and secondary users is studied from two distinct perspectives: cognitive radio channel access among nodes with imperfect information [150], and a spectrum oligopoly market where each primary user individually defines a channel price based on its quality [29]. In [150] the authors study how autonomous cognitive nodes (CNs) can opportunistically use the unused capacity of a channel normally assigned to Primary Users (PUs). The main goal is to find an efficient and fair opportunistic channel access policy among CNs without disrupting the connection quality offered to PUs. They also consider that CNs have two limitations when the channel status is monitored: i) imperfect observation due to both sensing and channel errors; ii) computational constraints. In this context, they propose a repeated game implemented by a two-state machine (automaton) to fulfil the following requisites: i) maximizing the total network payoff; ii) ensuring fairness among CNs; iii) reducing the likelihood of collisions among CNs; and iv) requiring a small number of sensing attempts to find a channel free of any PU activity. They discuss and compare their results against the ones obtained from a completely centralized model. The authors of [29] investigate a spectrum oligopoly market where each primary user seeks to sell its channel to a secondary user. On the one hand, each primary chooses a price depending on the transmission rate of its channel. On the other hand, a secondary player aims to select a channel depending on the price and the transmission rate of the channel. It is assumed that the transmission rate evolves randomly (e.g. due to fading). Using their NC game, they have obtained results illustrating that a primary user prices its channel so as to render a channel that provides a high transmission rate more preferable. They also show that the primaries suffer a decrease in their profit due to the non-cooperative behaviour among them. Additionally, they studied a repeated version of their game. From this last analysis, they have concluded that a primary user can attain a payoff arbitrarily close to the payoff it would obtain when primaries cooperate.
We have also found some work related to multimedia dissemination through either a content distribution network [151] or a mobile multi-flow multicast approach [152]. The article [151] considers two aspects of the problem of distributed caching with limited capacity in a content distribution network. Firstly, a nonparametric learning algorithm is proposed to estimate the request probability of videos from past user choices. Secondly, using these estimated request probabilities, the adaptive caching problem is formulated as a NC RG in which servers autonomously select which videos to cache. The utility function makes a trade-off between the placement cost of caching videos locally and the latency cost of delivering the video to the users from a neighbouring server. The game is nonstationary, as the preferences of users in each region evolve over time. The authors then propose an adaptive popularity-based video caching algorithm that has two timescales: i) the slow timescale corresponds to learning user preferences; ii) the fast timescale is a regret-matching algorithm that provides individual servers with caching prescriptions. It is shown that, if all servers follow simple regret minimization for caching, then the network achieves a correlated equilibrium. This means that servers can coordinate their caching strategies in a distributed fashion, as if they were following a centralized coordinating device that they all trust. The authors of [152] address the resource allocation problem in the multi-flow multicast scenario in a cellular networking environment. To configure the multicast and achieve optimal resource allocation, the BS requires feedback of the channel-quality information (CQI) from the users. An incentive mechanism, based on both pricing and a water-filling algorithm, encourages users to report the channel quality truthfully. A water-filling algorithm tries to maximize the network capacity, allocating network resources to traffic types in a differentiated way, according to their priority. The priority of a traffic type could depend on its own demand. As an example, a traffic type with a low demand has a high priority in the allocation of the network resources it will be using.
A very recent mobile scenario is mobile D2D communication [153][154]. In [153] the authors have investigated a cellular network consisting of a BS, multiple cellular users and a D2D pair of users that communicate using licensed spectrum resources. They have introduced a new spectrum trading scheme based on supply and demand curves for service providers (SPs) and the D2D user. The game model has been formulated, and it has been shown that the game has a unique NE point under a specific condition. Furthermore, a gradient-based learning method has been proposed for the decision making of the SPs and the D2D user in the incomplete information case, when they only have historical information about the price. The gradient-based nature of the learning method ensures that, even when small learning rates are selected, the learning game still converges to the NE point. Their results also show that increasing the channel quality results in more profit for SPs. This is a suitable incentive for SPs to offer bandwidth in high quality channels to improve their economic performance and satisfy their users. It is also shown that higher learning rates may lead to faster game convergence. Nevertheless, increasing the learning rate beyond a certain threshold makes the game unstable. Article [154] proposes a game-theoretic resource allocation scheme to address inter-cell interference in multi-cell settings in a mobile networking environment, including the coexistence of D2D and cellular transmissions that can mutually interfere when D2D reuses the available cellular spectrum. They characterize Base Stations (BSs) as players competing for the resource allocation quota of D2D demand and, as a novelty, define the utility of each player as the payoff gained from both cellular and D2D.
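The regret-matching idea mentioned for the caching game of [151] can be sketched for two servers and two videos. This is a simplified illustration and not the algorithm of [151]: the payoff table is hypothetical, and the code uses the unconditional variant of regret matching, whose empirical play is known to approach the set of coarse correlated equilibria; the conditional variant of Hart and Mas-Colell is the one associated with correlated equilibrium.

```python
import random


def strategy_from_regrets(regrets):
    """Play each action with probability proportional to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0.0:
        return [1.0 / len(regrets)] * len(regrets)  # no regret yet: uniform
    return [p / total for p in positive]


def regret_matching(payoffs, rounds, seed=0):
    """Two servers repeatedly choose which of the videos to cache.

    payoffs[i][a0][a1] is the utility of server i for the joint choice
    (a0, a1). Returns the empirical count of each joint choice.
    """
    rng = random.Random(seed)
    n_actions = len(payoffs[0])
    regrets = [[0.0] * n_actions for _ in range(2)]
    counts = {}
    for _ in range(rounds):
        joint = tuple(
            rng.choices(range(n_actions),
                        weights=strategy_from_regrets(regrets[i]))[0]
            for i in range(2)
        )
        counts[joint] = counts.get(joint, 0) + 1
        for i in range(2):
            realised = payoffs[i][joint[0]][joint[1]]
            for alt in range(n_actions):
                # Utility server i would have had by playing `alt` instead.
                other = list(joint)
                other[i] = alt
                regrets[i][alt] += payoffs[i][other[0]][other[1]] - realised
    return counts


# Hypothetical utilities: a server gains 1 when the two caches differ.
CACHE_PAYOFFS = [[[0, 1], [1, 0]], [[0, 1], [1, 0]]]
print(regret_matching(CACHE_PAYOFFS, rounds=500))
```

Each server only needs its own payoff table and the observed joint choice, which is what makes this kind of learning attractive for distributed caching.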
Another very recent scenario involves the usage of drones. Aligned with this aspect, the authors of [130] aim to fix the beaconing time period of two UAVs acting as airborne access points to optimize both energy efficiency and rate. They study the scheduling of beacons transmitted by two UAVs acting as drone small cells for two distinct scenarios, temporary events and disaster-relief activities. To study the first scenario, they proposed a NC game and characterized the equilibrium beaconing period durations for the competing drones. Next, to study the second scenario, they described a fully distributed mechanism that allows each drone to self-discover its equilibrium beaconing strategy without any knowledge of its opponent's scheduling decision. The equilibrium point of each game allows drones to efficiently optimize their energy consumption while maximizing the likelihood of getting in contact with the mobile users on the ground. Some aspects of this work that can be enhanced are as follows: i) the UAV backhaul is a satellite link with high latency and limited rate; ii) they do not discuss the energy consumption in each UAV associated with the satellite interface.
Discussion about recent work using cooperative repeated games
From the literature in the scope of cooperative RGs, we have selected some recent contributions with the following goals: trade-off between load balancing and energy efficiency [155]; wireless mesh networks [156][157]; and delay tolerant networks [60].
The cooperative RGs we have selected are discussed more comprehensively as follows. In [155] the authors have suggested a multi-objective optimization problem, which solves the trade-off between load balancing and energy efficiency. They use a repeated Nash bargaining game [44] between two players, each representing a system objective (e.g. load balancing, energy) to optimize. The model also considers a threat mechanism among the players to not only guarantee an agreement but also generate a fair solution.
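The Nash bargaining step underlying such a two-objective trade-off can be sketched as follows. This is a generic illustration, not the formulation of [155]: the feasible set, the disagreement (threat) point and the function name are hypothetical. The Nash bargaining solution picks the feasible payoff pair that maximizes the product of the players' gains over the disagreement point.

```python
def nash_bargaining(feasible, disagreement):
    """Pick the feasible payoff pair maximising the Nash product."""
    d1, d2 = disagreement
    # Only agreements that leave both players at least as well off as the
    # disagreement point are acceptable.
    acceptable = [(u1, u2) for u1, u2 in feasible if u1 >= d1 and u2 >= d2]
    if not acceptable:
        return disagreement
    return max(acceptable, key=lambda u: (u[0] - d1) * (u[1] - d2))


# Hypothetical trade-off frontier: load-balancing utility k, energy utility 10 - k.
frontier = [(k, 10 - k) for k in range(11)]
print(nash_bargaining(frontier, disagreement=(2, 0)))
```

With the assumed frontier and disagreement point (2, 0), the selected agreement is (6, 4): the player with the stronger threat point obtains the larger share, which illustrates how a threat mechanism shapes the outcome.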
We have found some work applying cooperative RGs to share available spectrum [156] as well as to schedule network access among wireless nodes [157]. The authors of [156] consider a repeated cooperative game that shares the available bandwidth among diverse relay nodes using LTE-Advanced. Their scheme encourages cooperation under the threat of punishment. It aims to improve the utilization of the spectrum and the total throughput achieved in the system. Their results also demonstrate that Pareto efficiency is achieved. They assume only constant bit rate traffic and static values for transmission power. In [157], an incompletely cooperative game is proposed for wireless mesh networks to model the behavior of a hybrid CSMA/CA protocol. Their results suggest that their solution can increase the throughput of a mesh network, as well as decrease delay, jitter, and packet loss rate. The estimate of the total number of mesh nodes is accurate only under saturated system conditions (i.e., nodes always have frames to transmit, either real or virtual).
In [60] a Bayesian CG with incomplete information is considered for delay tolerant networks (DTNs). The proposal is a non-transferable utility CG, since the individual payoff of each mobile node (i.e., utility as a function of packet delivery delay minus the cost of helping other nodes to deliver packets) cannot be given or transferred arbitrarily to other mobile nodes. The main objective of this CG is to find, in a distributed way, a stable coalitional structure such that, despite the lack of information (real scenario), the actual payoff of each mobile node is close to the one obtained when all the information is completely known (theoretical scenario). The formation of coalitions was studied in the context of DTNs. Each node decides to form a specific coalition as it expects that the other nodes of that coalition will cooperate with it by forwarding its packets; otherwise, the coalition is terminated. In this way, the coalitions in the network vary dynamically. They also formulate a discrete-time Markov-chain-based model [38] to analyse and find the utility and average cost (and hence the expected payoff) of each mobile node under uncertainty about other mobile nodes' types. The Bayesian game was also extended to a dynamic game. In this game, each mobile device refreshes its beliefs about other mobile devices' decisions as the cooperative game is played repeatedly. These beliefs about other devices' behaviour (e.g. selfish, available to cooperate) are relevant for a device when deciding whether to stay in or leave its current coalition. For this game, the actual payoff of each mobile device is very similar to the one obtained when all the information is well known. In addition, the payoffs of the mobile devices when these are organized in several coalitions are, in the worst situation of limited information, equal to the payoffs of the case with no coalitions.
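The belief refresh described for the dynamic Bayesian game can be sketched with a plain Bayes update. The code below is only an illustration under assumptions of our own: two neighbour types (cooperative, selfish), hypothetical forwarding likelihoods for each type, and a simple stay-or-leave rule comparing the expected benefit with the cost of helping. None of these values or names come from [60].

```python
def update_belief(prior_cooperative, observed_forwarding,
                  p_forward_if_cooperative=0.9, p_forward_if_selfish=0.2):
    """Posterior probability that a neighbour is of the cooperative type."""
    if observed_forwarding:
        like_coop = p_forward_if_cooperative
        like_selfish = p_forward_if_selfish
    else:
        like_coop = 1.0 - p_forward_if_cooperative
        like_selfish = 1.0 - p_forward_if_selfish
    evidence = (like_coop * prior_cooperative
                + like_selfish * (1.0 - prior_cooperative))
    return like_coop * prior_cooperative / evidence


def stays_in_coalition(belief_cooperative, benefit, cost):
    """Hypothetical rule: stay while the expected benefit covers the cost."""
    return belief_cooperative * benefit >= cost


belief = 0.5
for forwarded in (True, True, False):
    belief = update_belief(belief, forwarded)
print(round(belief, 3), stays_in_coalition(belief, benefit=1.0, cost=0.4))
```

Each observed forwarding (or refusal) shifts the belief, so the decision to stay in or leave a coalition can change as the game is repeated.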

Summary
In this sub-section, we have reviewed the literature on RGs and divided it into two groups: NC and cooperative. A very important aspect that remains to be solved is how to mitigate the negative impact on network performance imposed by misbehaving individual players, which select a defector strategy. Another open issue is finding a stable and optimum network configuration in the presence of incomplete game information.
In the following sub-section, we review the current CGs for diverse aspects of wireless access networks, including resource management in a scalable way after enforcing some level of collaboration among the players.

Coalitional Games
CGs have been applied in wireless networks to design fair, reliable, and efficient cooperation algorithms. Note that incentivizing cooperation among the players of a game does not necessarily require a CG (see our discussion in section 2.2 about Reputation and other incentives for cooperation). In what follows, we discuss recent contributions of CGs to diverse aspects of wireless access networks, such as: small cell networks [97][176], and cloud computing [177][178][179]. All these scenarios are very relevant for the areas of MEC/FC.

Small cell networks
We start our discussion with small cell networks. A possible solution to the problem imposed by high demands of multimedia data over an already deployed mobile cell is to use a deployment choice commonly proposed in 3G and beyond systems, mainly in areas with a huge concentration of users. This choice is based on the usage of femtocells sharing the same spectrum as the macrocells, which raises new challenges, such as interference mitigation [97][160], spectrum management [159], uplink user association [158], traffic offloading [97], and optimization of the network throughput considering not only network status but also the social ties among users, especially when D2D communications are available [161]. Considering the hierarchical system formed by macro and small cells, interference mitigation becomes more complex to manage because it has two components [97]: cross-tier and co-tier interference. The cross-tier interference is from the macrocell BS to the small cell BSs. The co-tier interference is among small cell BSs. The authors of [97] propose a CG for the mitigation of both co-tier and cross-tier interference, as well as for resource allocation in dense heterogeneous networks, which allows, in case of congestion, traffic offloading from the macrocell base station to small cell base stations. Alternatively, [160] focuses specifically on reducing co-tier interference.
In [159] the performance of cognitive femtocell networks was enhanced through the hybrid overlay/underlay paradigm. This hybrid paradigm combines the benefits of both the overlay and underlay techniques. More exactly, the hybrid overlay/underlay paradigm utilizes both the non-utilized and under-utilized spectra. In addition, as a novelty from prior work, the authors aim to explore the potential of spatial and frequency reuse of the network to improve the performance of cognitive femtocell networks. They formulate the sub-channel allocation problem as a coalition formation game among femtocell users under the hybrid access scheme with negative externalities, to effectively characterize and tame the interference. They propose an algorithm that offers a core solution [44] with a stable and efficient allocation.
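The notion of a core solution can be illustrated with a membership check for a transferable-utility game in characteristic form. This is a simplification made for illustration only: it ignores the externalities handled in [159], and the three-player game and the function names are hypothetical. An allocation is in the core when it distributes the full value of the grand coalition and no sub-coalition could obtain more on its own.

```python
from itertools import combinations


def in_core(allocation, value):
    """Check whether `allocation` (player -> payoff) lies in the core.

    `value` maps every non-empty coalition (as a frozenset) to its worth.
    """
    players = sorted(allocation)
    grand = frozenset(players)
    if abs(sum(allocation.values()) - value[grand]) > 1e-9:
        return False  # the grand coalition's worth is not fully shared
    for size in range(1, len(players)):
        for coalition in combinations(players, size):
            share = sum(allocation[p] for p in coalition)
            if share < value[frozenset(coalition)] - 1e-9:
                return False  # this coalition would rather deviate
    return True


def symmetric_game(players, worth_by_size):
    """Hypothetical game in which a coalition's worth depends on its size."""
    return {
        frozenset(c): worth_by_size[len(c)]
        for size in range(1, len(players) + 1)
        for c in combinations(players, size)
    }


game = symmetric_game(["f1", "f2", "f3"], {1: 0, 2: 4, 3: 9})
print(in_core({"f1": 3, "f2": 3, "f3": 3}, game))
print(in_core({"f1": 7, "f2": 1, "f3": 1}, game))
```

In this assumed game the equal split is stable, whereas the uneven split is blocked by the pair f2 and f3, who together could secure 4 instead of 2.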
The authors in [158] suggested a solution based on an analogy of uplink user association as an admission game between mobile users and a hierarchical cellular infrastructure formed by both small-cell and macrocell base stations.They combined matching theory [47] and CG [180] to solve some potential conflicts in this scenario, e.g. by giving strong incentives to the small cells to extend the macrocell coverage while maintaining the users' QoS.The user mobility and multihoming aspects are out of the scope of [158].
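The matching-theory ingredient of such an admission game can be sketched with the standard deferred-acceptance procedure with quotas. This is a generic Gale-Shapley-style sketch and not necessarily the algorithm of [158]; the preference lists, quotas and names are hypothetical, and every base station is assumed to rank every user that may propose to it.

```python
def deferred_acceptance(user_prefs, bs_prefs, quota):
    """Users propose; each base station keeps its best `quota` proposers."""
    rank = {bs: {u: r for r, u in enumerate(order)}
            for bs, order in bs_prefs.items()}
    next_choice = {u: 0 for u in user_prefs}
    admitted = {bs: [] for bs in bs_prefs}
    free = list(user_prefs)
    while free:
        user = free.pop()
        if next_choice[user] >= len(user_prefs[user]):
            continue  # the user exhausted its list and stays unassociated
        bs = user_prefs[user][next_choice[user]]
        next_choice[user] += 1
        admitted[bs].append(user)
        admitted[bs].sort(key=lambda u: rank[bs][u])
        if len(admitted[bs]) > quota[bs]:
            free.append(admitted[bs].pop())  # reject the least preferred user
    return admitted


# Hypothetical preferences of three users and two base stations.
USER_PREFS = {"u1": ["macro", "small"],
              "u2": ["macro", "small"],
              "u3": ["small", "macro"]}
BS_PREFS = {"macro": ["u2", "u1", "u3"], "small": ["u1", "u3", "u2"]}
QUOTA = {"macro": 1, "small": 2}
print(deferred_acceptance(USER_PREFS, BS_PREFS, QUOTA))
```

In this example the macrocell keeps its preferred user u2, and u1, once rejected, is admitted by the small cell together with u3, so no user and base station would both prefer to be matched with each other instead.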

Device-to-Device Communications
Only recently, GT and CGs have also been successfully applied to address the new challenges imposed by D2D communications in mobile scenarios (i.e. these challenges also involve 5G, IoT and mobile edge computing use cases). In particular, in [9] game-theoretic models are applied to D2D radio resource allocation and numerous open issues are identified. They study a coalition formation game with Non-Transferable Utility (NTU), in which the mobiles are partitioned into many coalitions, each of which applies a cooperative scheme to maximize its profit. In [163] the authors propose a CG in a scenario of wireless content distribution, in an effort to minimize the overall energy consumed whenever a number of mobile terminals seek to download a common content of interest. Article [164] investigates the scenario of uplink radio resource allocation when multiple D2D pairs and cellular users share the available resources in an optimized way, considering cross-tier interference among primary and secondary users. In [165] a simple CG is studied for energy-efficient D2D communications in public safety networks. In [103], a constrained coalition formation model considers the content uploading time and also includes the coverage constraints for the D2D links, so that only feasible coalitions are formed. The authors of [61] introduce a new Bayesian non-transferable overlapping coalition formation (BOCF) game to study spectrum sharing by D2D communications in cellular networks. This work was extended in [174] to the case in which a mobile device can belong simultaneously to multiple coalitions. Article [8] develops a coalition game-theoretic framework to devise social-tie-based cooperation strategies for D2D communications, which can achieve a significant performance gain over the case without D2D cooperation. The authors of [166] design and evaluate a dynamic distributed resource sharing scheme that jointly considers mode selection, resource allocation, and power control in a unified framework, with the goal of maximizing the available rate under a series of practical constraints in a mobile D2D cellular network that has multiple potential D2D pairs and cellular users. In addition, the authors of [167] are concerned with reducing the energy consumption of the cellular network with D2D communications. In [168] the authors proposed a trust-based and social-aware coalition formation game for multi-hop data uploading in 5G systems. A survey of interference management for D2D communication and its challenges in 5G networks is available in [181]. Article [9] comprehensively reviews game-theoretic resource allocation methods for D2D.

Vehicular networks
The main aspect of a vehicular network is the great difficulty (due to vehicle mobility and limitations on the network coverage) in always maintaining high-quality coverage by means of a convenient access technology, e.g. WAVE (IEEE 802.11p), as visualized in Fig. 16. Consequently, a way to mitigate this limitation is to enable multi-hop communications in this emerging scenario. To support multi-hop communications in an efficient way, the cooperation among the vehicles needs to be improved. Coalitional formation games can be very useful to attain those goals. Normally the players of these games are the vehicles.
The authors of [169] proposed a cooperative Bayesian Coalition Game (BCG) as-a-service for content distribution amongst vehicles.This work is complemented in [49] by proposing a NC Bayesian CG.The last two contributions [169][49] could have convergence time issues due to a exponentially increasing complexity in the algorithm to form the coalitions of vehicles when the number of vehicles attached to the network is relatively high.Alternatively, in [170] the algorithm to form the coalitions only requires a single iteration, making it a scalable solution in terms of the number of vehicles, each of which searching for the best access network.The limitation imposed by this proposal is that at a given time, a vehicle is only allowed to use one single network interface.The authors of [171] investigated a reliable message delivery in Vehicular Ad Hoc Networks (VANETs).They model the cooperative service-based message sharing problem in the VANET as a coalition formation game among nodes.Some nodes within a coalition operate as relays.In this way, their solution, for each case, chooses the more adequate relay to enhance rate and reduce delay.
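As a rough illustration of why exhaustive coalition formation scales poorly, the following sketch counts the number of ways in which n vehicles can be partitioned into disjoint coalitions (the Bell number). This is only an illustration of the size of the search space under the assumption of non-overlapping coalitions; it is not the algorithm of [169] or [49], and the function name is hypothetical.

```python
def bell_number(n: int) -> int:
    """Number of partitions of n players into disjoint coalitions (Bell triangle)."""
    row = [1]
    for _ in range(n):
        # Each new row starts with the last entry of the previous row.
        nxt = [row[-1]]
        for value in row:
            nxt.append(nxt[-1] + value)
        row = nxt
    return row[0]


# Growth of the coalition-structure search space with the number of vehicles.
for vehicles in (5, 10, 15, 20):
    print(vehicles, bell_number(vehicles))
```

Even for a few tens of vehicles the number of candidate coalition structures is far beyond what an exhaustive search could visit, which motivates single-iteration or otherwise bounded formation rules.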

Wireless sensor networks
We have found in the literature several CGs to enhance the operation of wireless sensor networking environments in the following aspects: network lifetime [84], security [172][173], and MAC access [157]. This work is detailed as follows. The authors of [84] review the recent literature about using GT, including cooperative and cooperation enforcement games, in wireless sensor networks to achieve a trade-off between maximizing the network lifetime and providing the required service.
In [172] a CG with a reinforcement-based learning algorithm is proposed for WSNs. It is a three-player strategy game consisting of sink nodes, a BS, and an attacker. The proposed model implements a cooperative security game to mitigate DDoS attacks. Article [173] suggests a security strategy to predict the attacks and their effect using cooperating camera sensors. The model is based on a threshold for the probability of error of the captured scene. This approach provides a solution for false alarms, attack prediction, and selfish behavior. This work can be extended to tame the following issues: harsh environmental conditions, wrong calibrations, or component defects. Nevertheless, RL in large networks could converge slowly to the optimum operation, because estimating all possible actions and states of the players is a very complex problem. In these cases, assuming players with bounded rationality could restore tractability to the problem analysis.
The authors of [157] proposed a game-theoretic MAC based on an incompletely cooperative game. This game can easily be implemented in mesh nodes. They propose a MAC access algorithm called V-CSMA/CA that operates like CSMA/CA but only manages virtual frames. In this way, when a node decides to transmit a virtual frame, no real frame is transmitted. V-CSMA/CA estimates the collision probability by assuming the real transmission of the virtual frame. A collision of a virtual frame is detected when other nodes transmit their real frames in the same time slot. In the case of a collision, V-CSMA/CA goes into back-off mode. Since no real frame is transmitted in V-CSMA/CA, it has no effect on the contention of other nodes, and it consumes neither bandwidth nor extra energy. If the node has no real packet to transmit, it estimates the game state through V-CSMA/CA. Therefore, nodes always have packets, either real or virtual, to transmit. This means the nodes are always in saturation mode, and they can use the above-mentioned relations to estimate the number of competing nodes and to adjust their strategies to the dynamic BG state, so that they compete for channel access with an optimal strategy when transmitting their real packets. With this method, nodes are always ready to transmit their real packets and use the channel efficiently. Their results suggest that their solution can increase the throughput of a mesh network, as well as decrease delay, jitter, and packet loss rate. The estimate of the total number of mesh nodes is accurate only under saturated system conditions (i.e., nodes always have frames to transmit, either real or virtual).
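The estimation step described above can be sketched as follows. We assume, purely for illustration, the standard saturation relation p = 1 - (1 - tau)^(n-1) between the collision probability p seen by a node, the per-slot transmission probability tau, and the number of contending nodes n. This relation and the names `estimate_competitors` and `observed_collision_prob` are our assumptions and are not taken from [157].

```python
import math
import random


def estimate_competitors(collision_prob: float, tau: float) -> float:
    """Invert p = 1 - (1 - tau)**(n - 1) to recover n (assumed saturation model)."""
    if not 0.0 <= collision_prob < 1.0 or not 0.0 < tau < 1.0:
        raise ValueError("need 0 <= collision_prob < 1 and 0 < tau < 1")
    return 1.0 + math.log(1.0 - collision_prob) / math.log(1.0 - tau)


def observed_collision_prob(n_nodes: int, tau: float, slots: int, seed: int = 0) -> float:
    """Fraction of a tagged node's attempts (virtual or real) that collide."""
    rng = random.Random(seed)
    attempts = collisions = 0
    for _ in range(slots):
        tagged_sends = rng.random() < tau
        # A collision occurs if any other node transmits in the same slot.
        other_sends = any(rng.random() < tau for _ in range(n_nodes - 1))
        if tagged_sends:
            attempts += 1
            collisions += other_sends
    return collisions / attempts if attempts else 0.0


# A node counts collisions of its (virtual) frames and infers the contention level.
p_hat = observed_collision_prob(n_nodes=10, tau=0.05, slots=200_000)
n_hat = estimate_competitors(p_hat, tau=0.05)
print(f"collision probability ~ {p_hat:.3f}, estimated nodes ~ {n_hat:.1f}")
```

The sketch also shows why the saturation assumption matters: if some nodes stay silent because they have nothing to send, the observed collision probability drops and the estimate undercounts the nodes.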

Cooperative cellular communications
Cooperative packet transmission can enhance the throughput of wireless networking infrastructures by taking advantage of the mobility of the users' devices, mainly in networks with intermittent connectivity, high delay, and high error rates, such as the typical case of delay-tolerant networks (DTNs). Here, DTNs (with D2D communications) are directly related to a decentralized mobile social network [9], where data (or context information such as location) can be transferred locally among users or devices by means of their own mobility patterns, which enable the formation of coalitions among the local devices, as visualized in Fig. 17. As an example, node 1.1 forms a coalition after moving to a location near node 1.2. Within that coalition, the two nodes can establish D2D cooperative communications.

Fig. 17. D2D Communications
We have also found in the literature some important contributions that review game-theoretic models, including CGs, applied to several distinct wireless scenarios, as follows: wireless and communication networks with evolutionary games [43] and without this natural selection perspective [180], cognitive radios [89], radio resource allocation in D2D communication [9], energy efficiency [84] or clustering protocols [85] in wireless sensor networks, and enhancing the collaboration among players via either pricing [157], [184] or reputation [156]. In the following sub-section, we review the available evolutionary games in very diverse networking scenarios.

Evolutionary Games
Applying evolutionary algorithms to theoretical games allows players with limited rationality to learn from the environment and take individual decisions for attaining each game's equilibrium with minimum control message exchange. We have found in the literature games that use evolutionary algorithms in very diverse networking scenarios.
From these, we have selected a few with the following main goals: developing strategies for wired [185] or wireless [186][64] cases to efficiently use available resources, distributed management of networking environments with small cells for optimizing the usage of radio capabilities [96][187], distributed learning to mitigate interference in wireless topologies with Femtocells [94][188], and enhancing cooperation in VANETs [189]. The evolutionary games we have selected are discussed as follows.
Developing more efficient strategies for wired/wireless networks
Evolutionary algorithms have been proposed for either wired [185] or wireless [186][190][64] use cases to use the available networking resources in a more efficient way. The authors of [185] use evolutionary theoretical games to study the network conditions under which concurrent transport protocols may or may not coevolve when dealing with the congestion problem. In the latter case, a single dominant protocol would be operating in the network. In addition, the authors propose some directions for upgrading the protocols to enable the network to operate at an optimum stable state. A further step in this line of work could be how to deploy the protocol upgrades in a real networking system in a non-disruptive way.
The authors of [186] discuss an evolutionary game for vehicle-to-vehicle (V2V) caching. In V2V caching, only users who actively participate in the caching can obtain cached contents from others, and users who participate in V2V caching must keep the contents in storage after watching/using/consuming them. In this work, an evolutionary stable strategy is a mixed strategy for the whole population (i.e., the winner strategy), such that the perturbation caused by a small proportion of other strategies gradually disappears in the long term. Examples of these minority strategies are related to a reduced number of users leaving or joining the system.
The authors of [190] consider NC mobiles, each faced with the problem of which subset of WLAN access points (APs) to connect to (i.e., multihoming) and how to split its traffic among them. Considering the many-users regime, they obtain a potential game model and study its equilibrium. The authors analysed the user performance in terms of UDP/TCP throughput with varying frame lengths over WLAN. They also obtain the pricing for which the total throughput is maximized at equilibrium and study the convergence to equilibrium under various evolutionary dynamics. They also study the case where the Internet Service Provider (ISP) could charge prices greater than that of the cost price mechanism and show that, even in this case, multihoming is desirable. They assume only a single ISP offering multihoming connectivity.
Article [64] offers a comprehensive literature review of evolutionary CG theory applied to wireless networking and communications use cases such as opportunistic and cognitive radio networks. The authors assume a homogeneous network.
Distributed management of networking environments with small cells
Evolutionary games have been proposed for optimizing the management of radio resources in networking environments with small cells [96][187].
The authors of [96] study the spectral coexistence between a macrocell and Femtocells using tools from evolutionary GT and RL. These tools are investigated in diverse scenarios. In the first scenario, Femtocell Base Stations (FBSs) exchange information through a central controller and adapt their strategies based on their instantaneous payoffs and the average payoffs of the Femtocell population. In the second scenario, when information exchange among Femtocells is no longer possible, each Femtocell gradually learns by interacting with its local environment through trial and error, and adapts its strategies accordingly. In addition, a variant of the evolutionary game approach (referred to as replication by imitation) is also investigated, where Femtocells probabilistically review their strategies and imitate other Femtocells in the network. They conclude that the spectral efficiency and the convergence to a stable system state are driven by the type of information available at the Femtocells. The performance of the evolutionary-game-based algorithms under information delay is highly dependent on system and network parameters such as channel gains and the number of devices in the network. Nonetheless, evolutionary games may not be capable of capturing uncertainty and the stochastic nature of parameters such as queue dynamics and non-guaranteed energy supply. Moreover, evolutionary games assume homogeneity of players. Therefore, analysing the interconnection of different types of mobile devices may be difficult.
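To make the first scenario more concrete, the sketch below iterates a discretized replicator dynamic, in which the share of the population using a strategy grows when its payoff exceeds the population average. It is a minimal illustration and not the algorithm of [96]: the two-band setting, the congestion payoff, and the names `replicator_step` and `congestion_payoffs` are assumptions made here for the example.

```python
def replicator_step(shares, payoff_fn, step=0.1):
    """One Euler step of x_i' = x_i * (f_i(x) - average payoff)."""
    payoffs = payoff_fn(shares)
    average = sum(x * f for x, f in zip(shares, payoffs))
    updated = [x + step * x * (f - average) for x, f in zip(shares, payoffs)]
    total = sum(updated)  # renormalise to absorb rounding error
    return [x / total for x in updated]


def congestion_payoffs(shares, capacities=(2.0, 1.0), population=20):
    """Hypothetical payoff: band capacity shared with the cells on the same band."""
    return [c / (1.0 + population * x) for c, x in zip(capacities, shares)]


# Start with the Femtocell population split evenly between two sub-bands.
shares = [0.5, 0.5]
for _ in range(5000):
    shares = replicator_step(shares, congestion_payoffs)

print([round(x, 4) for x in shares])
print([round(f, 4) for f in congestion_payoffs(shares)])
```

At the rest point both sub-bands yield the same payoff, so no Femtocell gains by switching. Note that the update needs the population-average payoff, which is exactly the information that the central controller supplies in the first scenario.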
The authors of [187] propose a new game-theoretic framework, where fast interference suppression is decoupled from the relatively slow frequency allocation process to tolerate delayed control. The key idea is to cast Femtocell clustering as an outer-loop evolutionary game coupled with bankruptcy channel allocation, which drives the cells to spontaneously switch to less interfered clusters. Within each cluster, they design an inner-loop NC power control game, such that the requirement of prompt control is eliminated. The two loops interact recursively with an analytically confirmed stability. Simulations show that their framework can improve the throughput by 13.2% in a network of 200 cells, compared to the prior art. They assume that the signalling exchange among service gateways (S-GWs) is delay-free.
Distributed learning to tame interference in wireless topologies with Femtocells
Evolutionary solutions have been proposed for enhancing distributed learning to mitigate interference in wireless topologies with Femtocells [94][188].
The authors of [94] introduce two mechanisms for interference mitigation, inspired by evolutionary GT and machine learning, to support the coexistence of a macrocell network underlaid with self-organized Femtocell networks. In the first approach, stand-alone Femtocells choose their strategies, observe the behavior of other players, and make the best decision based on their instantaneous payoff, as well as the average payoff of all other Femtocells. They formulate the interactions among self-interested Femtocells using evolutionary games and demonstrate how the system converges to equilibrium. In contrast, in the Reinforcement-Learning (RL) approach, information exchange among Femtocells is no longer possible and hence each Femtocell adapts its strategy and gradually learns by interacting with its environment (i.e., neighboring interferers) through trial and error. Their investigations reveal that, through learning, Femtocells can self-organize by relying only on local information, while mitigating the interference towards the macrocell network. Besides, a trade-off exists: faster convergence is obtained in the evolutionary case than in the RL approach, at the expense of more side information. Finally, it is shown that Femtocells face an interesting trade-off between exploration and exploitation in their learning processes. The evolutionary approach requires instantaneous information to be exchanged among Femtocells, which is hard to achieve in practice.
Article [188] studies the strategic coexistence between macro and Femtocell tiers using tools from evolutionary GT and RL. In the first case, Femtocell base stations (FBSs) exchange information through a central controller and adapt their strategies based on their instantaneous payoffs and the average payoffs of the Femtocell population. A fictitious play formulation is also examined, where FBSs maximize their payoffs given the empirical frequency of other Femtocells' actions. In the second case, when information exchange among Femtocells is no longer possible, each Femtocell gradually learns by interacting with its local environment through trial and error, and adapts its strategies. A variant of the evolutionary game approach (referred to as replication by imitation) is also investigated, where Femtocells probabilistically review their strategies and imitate other Femtocells in the network. Finally, the overall performance of the network in terms of spectral efficiency and convergence is shown to be strongly driven by the type of information available at the Femtocells. The effect of information exchange delay on the convergence performance of the distributed resource allocation algorithm was not studied.

Enhancing cooperation in VANETs
The authors of [189] suggest an evolutionary game to support cooperation in VANETs. They analyse how networking properties can impact the diffusion of cooperation. Simulation results show that higher network connectivity induces higher clustering in the network. This influences the probability of nodes receiving common packets from the neighbourhood. The average path length, which is proportional to clustering, impacts the benefit sharing in the neighbourhood. Results show that cooperation diffusion in these networks cannot be forced but evolves with different networking conditions. They assume a minimum number of twenty seeders to ensure a correct operation of the network.

Summary
In this sub-section, we have reviewed the existing literature on evolutionary approaches at the network edge. The covered topics were the following: efficient usage of network resources, small cells, distributed learning to tame the interference issue, and enhancing cooperation in VANETs. One puzzling issue in this game type is how the players should evolve their actions to allow the network to reach a stable and optimum operation in polynomial time.
Most of the games discussed so far in the current publication are models where the players have access to a complete set of information about the game they are involved in. However, in a significant number of practical scenarios, the players can only access a limited amount of game information. Consequently, there is a strong demand to correctly infer the missing information. To support this, a new type of game is required, which is discussed in the next sub-section for diverse scenarios, including hierarchical small cells and D2D communications.

Games with Incomplete Information (Bayesian)
This section discusses BGs for wireless networking environments. These are games of incomplete information. We have found in the literature very recent games that use Bayesian algorithms in very diverse networking scenarios. From these, we have selected a few covering the following aspects: hierarchical small cells [53][132][150][66][191], D2D communications [153][192][193], vehicular scenarios [137][194], and efficient resource allocation for IoT / wireless mesh networks [157][195]. The BGs we have selected are discussed more comprehensively as follows.
Hierarchical small cells
To enhance the performance of a mobile cell, the main idea is to reduce the over-the-air distance between the mobile terminal and the radio equipment of the mobile operator. As this distance is reduced, the connection quality perceived by the users improves and the battery autonomy of the terminals increases significantly. Nevertheless, as small cell base stations (i.e., Femtocells) are densely deployed in the same area already covered by macrocell base stations, interference mitigation becomes a priority, together with spectrum reuse. Finally, the practical absence of complete information about the channel state urgently needs to be addressed by the algorithms that control the network edge. We now discuss some GT proposals that aim to improve the performance of cognitive small cells in scenarios with limited information about the network context. In [53][132] Bayesian Stackelberg games have been discussed to maximize the capacity of a hierarchical cellular network, considering both imperfect channel state information and interference power constraints. Article [66] studies a power allocation BG for hierarchical small cells. The authors formulate a Bayesian Stackelberg game to maximize the transmission capacity of the Femtocell networks while guaranteeing that the interference experienced at the macro base station does not exceed an interference constraint.
The authors of [150] investigate the issue of how autonomous cognitive nodes (CNs) can arrive at an efficient and fair opportunistic channel access policy in scenarios where channels may be non-homogeneous in terms of primary user (PU) occupancy. In their model, a CN that adapts to its environment is limited in two ways. First, a CN makes imperfect observations of its context (e.g., due to sensing and channel errors). Second, a CN has imperfect memory due to constraints in computational tasks. For efficient opportunistic channel access, they present a simple adaptive win-shift lose-randomize (WSLR) strategy that can be executed by a two-state machine (i.e., a finite automaton with two states). Using the framework of repeated games (with imperfect observations and limited memory), they show that the proposed strategy enables the CNs (without any explicit coordination) to reach an outcome that: 1) maximizes the total network payoff and ensures fairness among the CNs; 2) reduces the likelihood of collisions among CNs; and 3) requires a small number of sensing steps (attempts) to find a channel free of PU activity. They compared the performance of the proposed autonomous strategy with a centralized strategy and tested it with real spectrum data. The authors state that the statistics of the PU's duty cycle are assumed to be known to the autonomous CNs. In practice, the autonomous CNs may obtain these statistics using geo-location databases. This could burden the CNs and the network with extra control traffic.
In [191], the dynamic spectrum access (DSA) problem of the secondary users (SUs) is modelled as a BG, referred to as the DSA game. The authors formalize the primary user (PU) network as a forest where the roots represent the operators and the leaves represent the operators' sub-bands. This forest analogy is useful to study the market interaction between the SUs and the PU network. In this structure, a set of SUs can first be matched to a set of operators, and the SUs matched to the same operator can then be associated with the corresponding sub-bands. In this scenario, the SUs can neither exchange information among themselves nor know the PUs' preferences. The authors of [191] propose a distributed algorithm that results in a stable forest matching structure, which coincides with the optimal Bayesian NE of the DSA game. They prove that the Bayesian hierarchical mechanism associated with the proposed algorithm incentivizes truth-telling by SUs. They also assume that each sub-band (either occupied or unoccupied by a PU) can be accessed by at most one SU.

D2D communications
We now discuss some BG proposals for analysing D2D communications, which are an essential tool to alleviate the problems of congestion and spectrum scarcity in mobile networks.
A mobile operator has the opportunity to increase the profit from its mobile infrastructure, including the allocated spectrum, if some mobile traffic is offloaded from the channels established between users' terminals and macro base stations to other channels that allow direct communication among users' terminals, avoiding any intervention of macro base stations [192]. In this way, a mobile cell can support sporadic high loads without any cost/overload increase. To avoid causing interference to the neighbouring cell, the authors of [192] assume that each D2D link can only share spectrum with the subscribers in its local cell. The authors of [193] study a cooperative spectrum sharing scenario by allowing secondary users (SUs) to dynamically and opportunistically share the licensed bands with primary users (PUs). In return, an SU relays the PU's traffic to improve the PU's effective data rate. For this scenario, they propose a dynamic NC bargaining game with incomplete information. This lack of information occurs because the PU does not have complete information about the SU's energy cost. Their results indicate that the proposed scheme can lead to a win-win situation, where both the PU and the SU obtain data rate improvements via a bargaining-based mechanism. They assume that the channel gains remain fixed across time slots. With this simplification, they only consider the average channel condition. Consequently, the network could go through transitory periods of unwanted operation. Article [153] proposes a spectrum auction where a D2D transmitter bids a price-bandwidth demand curve and each SP offers a price-bandwidth supply curve. In addition, a repeated game enables players to learn even though they have access to only a limited set of information. All the auction activities are conducted by a central entity, called the broker, which is located at the base station. This solution has some weaknesses, namely a bottleneck and low reliability, because only a single node decides how spectrum is allocated.

Vehicular scenarios
The main characteristic of a vehicular environment is the great difficulty of maintaining high-quality coverage at all times by means of the current access technologies available via a heterogeneous (typically cellular and WiFi) network infrastructure. Consequently, a way to mitigate this limitation is to deploy multi-hop communications in this emerging scenario. Other very interesting aspects related to this emerging use case are how to dynamically manage the network locations where data is stored and computation power is available (at cloud, cloudlet, access points, base stations, each vehicle) to optimize the system performance, and how the electrical grid should be controlled to charge the batteries of electrical vehicles in the most efficient way. We now discuss some BG proposals in the context of vehicular environments.
The authors of [137] study energy charging in a power system composed of an aggregator and multiple electrical vehicles (EVs) in the presence of demand uncertainty. They propose a Stackelberg game, where the aggregator is the leader and the EVs are the followers. They propose two different approaches under demand uncertainty: a NC optimization and a cooperative design. In the robust NC case, they present the energy charging problem as a competitive game among self-interested EVs, where each EV chooses its demand strategy to maximize its benefit. In the robust cooperative model, they formulate an optimal distributed energy scheduling algorithm that maximizes the sum benefit of the connected EVs. They theoretically prove the existence and uniqueness of the robust Stackelberg equilibrium for the two approaches and develop distributed algorithms that converge to the global optimal solution independently of the demand uncertainty. Moreover, they extend the two models to handle slow variations in the system power. They considered all possible energy variations using the robust game optimization technique to obtain the (NC and cooperative) game results. However, according to the original authors of robust games [196], these can produce stable results only for a bounded payoff uncertainty set. So, the question that arises is how accurate the game solution of [137] (obtained from a limited set of possible decisions the opponents could choose) could be in assuming the most correct adversarial realizations of uncertainty in a realistic application.
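The leader-follower structure described above can be illustrated with a deliberately simplified, deterministic sketch. It ignores demand uncertainty and robust optimization, so it is not the model of [137]. We assume a hypothetical logarithmic EV utility a*ln(1+d) - p*d, a linear aggregator cost, and a grid search for the leader's price; all function names and numeric values are illustrative.

```python
def ev_best_response(price, preference):
    """Demand maximising preference*ln(1+d) - price*d (hypothetical utility)."""
    return max(0.0, preference / price - 1.0)


def aggregator_revenue(price, preferences, unit_cost):
    """Leader's profit when every EV plays its best response to the price."""
    demand = sum(ev_best_response(price, a) for a in preferences)
    return (price - unit_cost) * demand


def stackelberg_price(preferences, unit_cost, grid_step=0.001):
    """Leader searches a price grid, anticipating the followers' best responses."""
    n_points = int((max(preferences) - unit_cost) / grid_step)
    prices = [unit_cost + k * grid_step for k in range(1, n_points + 1)]
    return max(prices, key=lambda p: aggregator_revenue(p, preferences, unit_cost))


preferences = [4.0, 6.0, 8.0]  # illustrative EV preference parameters
price = stackelberg_price(preferences, unit_cost=1.0)
demands = [ev_best_response(price, a) for a in preferences]
print(round(price, 3), [round(d, 3) for d in demands])
```

The followers' problem is solved in closed form and substituted into the leader's objective, which is the usual backward-induction way of computing a Stackelberg equilibrium. A robust variant would replace each best response by one that holds for every realization inside a bounded uncertainty set.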
The authors of [194] investigate a Bayesian CG as-a-service for intelligent context-switching of virtual machines (VMs) in a vehicular cloud to reduce the energy consumption, so that the clients of the VMs can execute their services without performance degradation. In the proposed scheme, they use the concepts of LA and GT, in which the LA are assumed to be the players, such that each player has an individual payoff based upon the energy consumption and the load on the VM. Players interact with the stochastic environment by taking actions such as selecting appropriate VMs and, based upon the feedback received from the environment, they update their action probability vector. The performance of the proposed scheme is evaluated using various metrics such as context-switching delay, overhead generated, execution time, and energy consumption. The results obtained show that the proposed scheme performs well with respect to these metrics. Specifically, using the proposed scheme there is a reduction of 10 percent in energy consumption, 12 percent in network delay, 5 percent in overhead generation, and 10 percent in execution time. Data privacy is an open issue that still needs to be addressed in vehicular clouds.
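The update of an action probability vector from environment feedback can be sketched with a classical linear reward-inaction automaton. This is a generic textbook scheme used here only as an illustration, and the exact update rule of [194] may differ. The success probabilities of the three hypothetical VMs, the learning rate, and the function names are assumptions.

```python
import random


def lri_update(probs, action, rewarded, rate=0.05):
    """Linear reward-inaction: reinforce the chosen action only when rewarded."""
    if not rewarded:
        return list(probs)  # inaction on penalty
    return [p + rate * (1.0 - p) if i == action else p * (1.0 - rate)
            for i, p in enumerate(probs)]


def run_automaton(success_prob, steps=5000, rate=0.01, seed=1):
    """Let one automaton learn which action the stochastic environment rewards most."""
    rng = random.Random(seed)
    probs = [1.0 / len(success_prob)] * len(success_prob)
    for _ in range(steps):
        action = rng.choices(range(len(probs)), weights=probs)[0]
        rewarded = rng.random() < success_prob[action]
        probs = lri_update(probs, action, rewarded, rate)
    return probs


# Hypothetical probabilities that selecting each VM yields a favourable response.
final_probs = run_automaton([0.1, 0.9, 0.2])
print([round(p, 3) for p in final_probs])
```

The update keeps the vector normalized at every step, and a small learning rate trades convergence speed for a lower risk of locking onto a suboptimal action.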

Efficient resource allocation for Internet of Things
With the advent of the Internet of Things (IoT), the algorithms and protocols that are used at the network edge need to be revisited so that they rapidly learn how to adapt to an emerging context that is very dynamic and heterogeneous. An example of an algorithm that requires urgent intervention is the one that controls the access to the communications medium for the efficient use of network capabilities. We now discuss some BG proposals in the context of IoT.
The authors of [157] present a novel concept of incompletely cooperative GT and use it to improve the performance of MAC protocols in WMNs. In this game, first, each node estimates the current game state (e.g., the number of competing nodes). Second, the node adjusts its equilibrium strategy by tuning its local contention parameters (e.g., the minimum contention window) to the estimated game state. Finally, the game is repeated several times to get the optimal performance. To use the game effectively in WMNs, the authors present a hybrid CSMA/CA protocol by integrating a proposed virtual CSMA/CA and the standard CSMA/CA protocol. When a node has no packet to send, it contends for the channel in virtual CSMA/CA mode. In this way, the node can estimate the game state and obtain the optimal strategy. When a node has packets to send, it contends for the channel in standard CSMA/CA mode with the optimal strategy obtained in virtual CSMA/CA mode, switching smoothly from virtual to standard CSMA/CA mode. At the same time, the node keeps adjusting its strategy to the variable game state. In addition, the authors propose a simplified game-theoretic MAC protocol (G-CSMA/CA) by designing an auto backoff mechanism based on the incompletely cooperative game. G-CSMA/CA can easily be implemented in mesh nodes. Finally, simulation results show that the incompletely cooperative game can increase system throughput, decrease delay, jitter, and packet loss rate, and support the game effectively.
The authors of [195] propose a Bayesian CG for an IoT environment. The players of this game have variable learning rates to optimize their decisions about the most suitable coalition to stay in. This leads to discovering the game NE more quickly than preceding proposals that used static learning rates. Their proposal can be enhanced to support the collection of data from various RFID tags deployed in the IoT environment.
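The second step of the game, tuning the contention window to the estimated number of competing nodes, can be sketched as follows. The two relations used here are assumptions borrowed from the classical saturation analysis of CSMA/CA and are not taken from [157]: a near-optimal per-slot transmission probability tau of about 1 / (n * sqrt(Tc / 2)), with Tc the collision duration in slots, and tau = 2 / (W + 1) for a constant contention window W. Function names and numeric values are illustrative.

```python
import math


def optimal_tau(n_nodes: float, collision_slots: float) -> float:
    """Approximate throughput-maximising transmission probability (assumed model)."""
    return 1.0 / (n_nodes * math.sqrt(collision_slots / 2.0))


def contention_window(tau: float) -> int:
    """Constant window W that yields transmission probability tau = 2 / (W + 1)."""
    return max(1, round(2.0 / tau - 1.0))


# A node re-tunes its window each time its estimate of the contention changes.
for estimated_nodes in (5, 10, 20):
    tau = optimal_tau(estimated_nodes, collision_slots=50)
    print(estimated_nodes, round(tau, 4), contention_window(tau))
```

The window grows roughly linearly with the estimated number of contenders, which is the qualitative behaviour expected from an equilibrium strategy that adapts to the game state.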

Summary
In this sub-section, we have reviewed the existing literature on Bayesian approaches at the network edge. The covered topics were the following: hierarchical small cells, D2D communications, vehicular communications, and efficient resource allocation for wireless mesh networks with sensor devices. Some challenging issues for limited-information games are learning and solution scalability in scenarios at the network edge with limited capabilities of connectivity, processing, and data storage. In the next section, we steer our survey towards solid future perspectives at the network edge. More particularly, we discuss how GT can bring positive outcomes to the new paradigm designated as FC or MEC.

Future Trends in Game Theory for Mobile Edge Computing
In the last few years, the huge popularity of mobile devices and the exponential increase in mobile Internet traffic have been pushing many innovations in wireless communications and networking. More precisely, the deployment of denser wireless cells and the advent of the 5G [16][12] access technology promise to offer mobile users gigabit wireless access to data stored at the cloud infrastructure. However, as an illustrative example, this remote access to data has an inherent limitation due to propagation delay values in the range of 50-200 ms. These delay values are very high for latency-critical mobile applications, such as real-time and interactive ones. Recently, a new paradigm has emerged that aims to reduce the latency to values around 1 ms or less. This new research area is called "Fog Computing" (FC) [10] (as proposed by Cisco) or "Mobile Edge Computing" (MEC) [19] (proposed by ETSI). In the remainder of our paper, the designation MEC is used. This proposal aims to integrate the concept of Cloud Computing, namely both its storage and (virtualized) elastic computation aspects, into the edge of the network infrastructure, offering low latency to applications where time-efficiency is very critical. Article [19] presents a research outlook consisting of a set of promising directions for MEC research, including its deployment, caching, computation offloading, mobility management, green computing, and privacy-aware services.
The remaining part of this section discusses the same concept from two distinct perspectives: standardization (the two initial parts) and academia (the remaining parts). The first part presents diverse categories extracted from the current MEC standardization. Then, we proceed to a second part that outlines distinct use cases associated with the various categories. In the third part, we discuss the most prominent contributions found in the literature whose goals are very well aligned with those of MEC. Finally, the section ends with two parts that highlight trends in how GT, assisted by other technologies, can develop disruptive services at the network periphery, such as Software Defined-MEC (SD-MEC).

MEC Usage Categories from Standardization
ETSI has defined several main categories for use cases [70]. The potential benefit of these categories is that the requirements on the architecture are generally quite similar for use cases within the same category, and quite different between categories. However, [70] argues that all these categories need to be supported to make MEC a robust methodology that can be deployed across a heterogeneous set of use cases and support global seamless operation with more capabilities, speed and energy efficiency than current networking solutions. The heterogeneous set of use cases is as follows: connected vehicles including unmanned aerial ones, satellite networks, railway backhaul communications, future smart offices, remote real-time health services and maintenance of industrial factories controlled by robots, ultra-dense networks for smart cities, smart metering in the grid obtained via sensors, home automation, and media & entertainment services.
There are a few very recent contributions discussing the use cases where MEC can have a significant positive impact [70] [71]. As an example, diverse MEC categories are proposed in [70]: consumer-oriented services, operator and third-party services, and network performance and QoE improvements. First, consumer-oriented services are innovative services that generally benefit the end-user directly, i.e. the owner of the mobile device. Second, operator and third-party services are innovative services that take advantage of computing and storage facilities close to the edge of the operator's network. Third, network performance and QoE improvement services usually do not benefit the end-user directly, but can be operated in conjunction with third-party service companies. These services are generally aimed at improving the performance of the network, via either application-specific or generic improvements. As services are increasingly developed with users' real-life needs in mind, the user experience of available services should consistently improve, but in a way that is completely transparent to end-users.

MEC Use Cases from Standardization
The purpose now is to describe the most relevant use cases of each MEC usage category in order to derive useful requirements and system design constraints. These constraints will be useful in the next sub-section to identify the MEC challenges that theoretical games can address. The discussion below is organized by MEC usage category.

Consumer-oriented services
The MEC use cases grouped in the consumer-oriented services are listed in Table V. We also highlight the system design constraints associated with each use case.

Use case | System design constraints
Gaming and low latency cloud applications | Latency, jitter, computing capacity, storage, host mobility support, transfer of user session
Augmented reality, assisted reality, virtual reality, cognitive assistance | Latency, jitter, dynamic virtualization, dynamic contents, user's location, environment sensorial data, response time, transfer of user session, cross-layer programming

By placing game server applications closer to the radio equipment, at the edge of the network, a new kind of low-latency game will become available to end-users. For that, the network device where the game server is running should have enough resources in terms of storage and computing power. While the game is running, one or more users might move around and become connected to a different radio node (handover). While this occurs, the connectivity between the UE and the application needs to be maintained. As the user moves away from the original location, the latency between the UE and the application is likely to increase and exceed the maximum delay value. To avoid this, the user session might have to be relocated to another game server located closer to the current user location than the initial game server.
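The relocation decision described above can be illustrated with a minimal sketch. It assumes that the UE-to-server latency and the spare resources of each candidate edge server are already measured; the class `EdgeServer`, the function `pick_server`, and all field names are hypothetical and are not taken from the ETSI documents.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class EdgeServer:
    """Hypothetical view of one game server at the network edge."""
    name: str
    latency_ms: float    # measured latency between the UE and this server
    free_cpu: float      # spare computing capacity (arbitrary units)
    free_storage: float  # spare storage (arbitrary units)


def pick_server(current: EdgeServer,
                candidates: List[EdgeServer],
                max_delay_ms: float,
                cpu_needed: float,
                storage_needed: float) -> Optional[EdgeServer]:
    """Return the server that should host the session after a handover.

    The session stays where it is while the delay bound holds. Otherwise
    the lowest-latency candidate with enough resources is chosen, or None
    if no candidate can satisfy the constraints.
    """
    if current.latency_ms <= max_delay_ms:
        return current
    feasible = [s for s in candidates
                if s.latency_ms <= max_delay_ms
                and s.free_cpu >= cpu_needed
                and s.free_storage >= storage_needed]
    if not feasible:
        return None
    return min(feasible, key=lambda s: s.latency_ms)
```

The sketch covers only the placement decision; transferring the session state itself, and doing so without a visible disruption, is the harder part and is not modelled here.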
Augmented reality permits end-users to have additional information from their environment by collecting sensor data, device location, and/or camera information. This information is sent to a server located at the network edge. This server then derives the semantics of the scene, augments it with additional knowledge provided by databases, and feeds it back to the user device within a very short time.
Assisted reality is like augmented reality, but its purpose is to actively inform the user of any matter of interest to her/him. This might be used, for example, to support people with disabilities or the elderly to facilitate the interaction with their environment.
Virtual reality is similar to augmented reality, but its purpose is to render the entire field of view with a virtual environment either generated or based on recorded/transmitted environments. This might for example be used to support gaming implementations or remote viewing while using the most natural input device available.
Cognitive assistance takes the concept of augmented reality one step further, by providing personalized feedback to the user on any activity the user might be performing. As an example, the server located at the network edge can send the user useful advice or information that helps them perform the activity more effectively. Consequently, both the analysis of the scene and the delivery of the advice to the user need to be completed within a very short time.
For all these cases, i.e. augmented reality, assisted reality, virtual reality and cognitive assistance applications, the session between the user and the application needs to be personalized, and continuity of the session needs to be maintained at all times as the user moves. In addition, users are not necessarily going to use the mobile network environment permanently for running their applications. In some cases (e.g. at home or at work), they might access their applications located in a cloud environment through the local WiFi. However, when a user moves from her/his indoor environment to an outdoor environment, the session needs to be transferred from the cloud to the MEC server co-located with the mobile cell covering the outdoor position of that user, without the user noticing any session disruption.

Operator and third party services
The MEC use cases grouped in the operator and third party services are listed in Table VI, including constraints. MEC can be used to extend the connected vehicle cloud into the highly distributed mobile base station environment, allowing data and applications to be placed close to the vehicles. This can reduce the round-trip time (RTT) of data. MEC applications can run on MEC virtualized servers that are deployed at the cell radio base station to provide the roadside functionality. The MEC applications can receive local messages directly from the applications in the vehicles and the roadside sensors, analyse them and then propagate (with extremely low delay and in a reliable way) hazard warnings and other latency-sensitive messages to other cars in the same geographical area. This enables a nearby car to receive data in a matter of milliseconds, allowing the driver to react immediately.
MEC opens services to consumers and enterprise customers as well as to adjacent industries so that they can deliver their mission-critical services over the mobile network. The goal is to develop favourable market conditions that create sustainable business for all players in the value chain, and to support global market growth. To this end, a standardized, open environment needs to be created to allow the efficient and seamless integration of such applications across multi-vendor MEC platforms. These real-time applications should be offered with low latency and high throughput, in a reliable and elastic way, and with proficient awareness of location and other environmental information about users. The fulfilment of all these requirements ensures that the majority of enterprise customers can be better served.

Network performance and QoE improvements
The MEC use cases grouped in the service category "Network performance and QoE improvements" are listed in Table VII. The table also highlights the system design constraints associated with each use case, which should preferably be met with the available network resources. At this point, GT can optimize the usage of these resources (see 4.3-4.4). The video management application transcodes and stores captured video streams from cameras received on the mobile cell uplink. The video analytics application running at the MEC server processes the video data to detect and notify specific configurable events, e.g. object movement, lost child, abandoned luggage, etc. The application sends low bandwidth video metadata to the central operations and management server for database searches to fulfil the needs of certain applications. Applications range from public security (e.g. authorized human access with face recognition) to smart cities (e.g. car park monitoring).
The dense-urban network deployment can be useful to support services in the following areas: eHealth, Media & Entertainment, factory, and Enterprise [71]. First, the area of eHealth will require remote monitoring of health or wellness data, smarter medication, and grid access. Second, the area of Media & Entertainment will require on-site live event experience and collaborative gaming. Third, the factory will require an always-connected supply chain, increased level of automation, energy management, remote monitoring, and proactive maintenance. Fourth, in the Enterprise area we expect seamless intra-/inter-enterprise communication, allowing the monitoring of assets distributed in larger areas, the efficient orchestration of cross value chain activities and the optimization of logistic flows. The final goal is to facilitate the creation of new value added services. As a partial conclusion, the use case of dense-urban network requires low latency and the mitigation of wireless network congestion. To satisfy these requirements, MEC can enable D2D communication or traffic offloading. Another useful deployment is the use of Relay Nodes as mobile edge hosts. The MEC can manage these Relay Nodes similarly to other mobile edge hosts, allowing the system to have further options to fulfil the application requirements (notably latency, compute resources, storage resources, and of course throughput).

MEC Main Requirements from the Literature
In our opinion, the emergence of MEC seems very well aligned with a set of networking use cases already present in the literature. These scenarios are: cognitive small cells, wireless sensor networks, unmanned vehicles, vehicular networks, micro-grid, and tactile Internet. For each of these scenarios, we discuss below the main requirements to be fulfilled. We think GT can be a very useful tool to analyse and optimize the network infrastructure involved in each of those use cases so that these requirements are met. We also point out further contributions from the current literature that can serve as solid references for future developments on the novel paradigm of MEC.

Cognitive small cells
The deployment of low-cost and high-capacity cognitive small cells over existing cellular networks has been discussed as a promising solution for offloading traffic from a macrocell to small cells. To achieve this, several problems induced by this hierarchical design need to be successfully addressed, such as: spectrum sharing among macrocell and femtocell base stations, as well as among macrocell and femtocell users [197][198][199][200][201]; energy consumption [202][203][204] [205]; transmission power control and interference mitigation [206][207] [208][209]; security [210][211]; and enhancing cooperation via pricing [212]. A comprehensive literature review on applications of model-free strategy learning in cognitive wireless networks is available in [6]. The authors of [213] discuss relevant trends and challenges in the deployment of millimetre wave, massive MIMO, and small cells to prepare the evolution of current mobile networks towards fifth generation (5G) wireless networks. During this evolution, the capacity of mobile networks should increase to support the exponential growth in data traffic. Several deployment strategies can increase the capacity of mobile networks [214]: i) use of larger bandwidth by exploiting higher spectrum frequencies [135] (i.e. more spectrum available); ii) increasing spectrum efficiency by exploiting multi-antenna transmission/reception [133] (i.e. MIMO, cooperative communications); and iii) spatial reuse of spectrum by deploying D2D communications [192], small cells [199], and heterogeneous access networks [132]. One open issue for D2D communications is the prevention of denial-of-service (DoS) attacks, particularly in D2D LANs [9]. In addition, user identity is also important for security in D2D networks, so there is a need for a novel method to generate and check user identity. Additionally, Machine-to-Machine (M2M) communications is a promising technology for next generation communication systems [215]. It permits direct interaction among a large number of intelligent devices (e.g. sensors, actuators) without any human intervention.

Wireless sensor networks
Wireless sensor networks (WSNs) are highly significant for environmental surveillance and remote monitoring, since sensors can be placed in locations that are difficult for humans to access. However, the limited network lifetime is a major deployment barrier for the traditional battery-operated WSN. Consequently, energy-efficient algorithms and protocols should be developed, e.g. energy harvesting [216][217]. In addition, the data need to be disseminated with low latency because, in most cases, they signal a fault in the monitored environment that must be resolved urgently [218][219].
The authors of [220] propose a localization scheme named Opportunistic Localization by Topology Control (OLTC), specifically for sparse Underwater Sensor Networks (UWSNs). In [7], GT is used to tame security threats in WSNs. Economic and pricing models associated with WSNs are studied in [61] [62]. Future research trends in WSNs are available in [18][221].

Unmanned vehicles
Some examples of unmanned vehicles are aerial [130][222] [65] and maritime [223][224] ones. Unmanned aerial vehicles (UAVs) were initially developed for military monitoring and surveillance tasks but have since found several interesting applications in the civilian domain.
A promising application/technology is to use drone small cells (DSCs) to expand wireless communication coverage on demand [130]. This requires wireless communication links with sufficient QoS metrics (e.g. throughput, delay, jitter), the fulfilment of security requirements (e.g. avoiding fake DSCs), and support for energy consumption optimization [138]. An interesting idea is to use UAVs to deploy a crowd surveillance system [222]. Another appealing scenario is the use of unmanned vehicles for maritime tasks [223][224], such as surveillance and patrolling, aquaculture inspection, or wildlife monitoring. There is a clear need for a completely distributed solution that allows each vehicle to learn (e.g. in an evolutionary process) how to engage more effectively with other vehicles so that together they can fulfil a specific mission. To support system scalability, the vehicles can be organized in clusters, allowing cooperative learning inside each cluster.

Vehicular networks
Interest in intelligent transportation systems and vehicular ad hoc networks has increased in recent years. As a fundamental building block for the development of applications for vehicular networks, the resource management and sharing problem for bandwidth and computing resources should be revisited to support mobile applications in cloud-enabled vehicular networks [225]. The challenges to solve are, namely: finding the most suitable communication link in an intelligent and efficient way (i.e. with adequate data rate, low jitter and few errors), supporting multi-hop communications, ensuring data privacy, giving incentives for vehicles to cooperate and either relay traffic from others or cache data that can be used by others, supporting distinct mobility patterns, using parked cars as serving nodes (i.e. like BSs or APs) for the cars circulating on the road, and increasing the comfort of passengers of self-driving vehicles by giving them all the Internet content they need (e.g. streaming movies, live concerts). Finally, new techniques and solutions are needed to handle data appropriately in vehicular networks [226].

Micro-grid
Recently, electric vehicles (EVs) and plug-in hybrid EVs have been considered as natural components of future electricity power systems, due to their efficient integration, cost savings, and significant environmental advantages. This will lead to significant new challenges in the design of power systems. These challenges include proposing energy charging scheduling plans for connected EVs, guaranteeing the energy demands of EVs during peak hours, and managing information exchange between EVs and the grid (or aggregators). Another interesting area to investigate is the security of these systems.
In fact, intrusion detection/prevention and mitigation of DDoS attacks need to be addressed in micro-grids. Recent literature on micro-grids is available in [227][228] [229][230][231] [147]. Considering the widespread penetration of plug-in hybrid electric vehicles (PHEVs), the overall demand on micro-grids may increase manifold in the near future. In this context, [227] proposes a charging-discharging scheme to manage the micro-grid's load using electric vehicles. In addition, [229] comprehensively reviews the diverse on-air technologies for charging battery-operated devices. Driven by the need for sustainable "green communications", the management of the available energy within a network domain is a challenging and very active research topic [228][230] [231]. Among these contributions, [228] investigates a decentralized algorithm that allocates energy for wireless networks with Base Stations powered by renewable energy. The authors of [230] study dynamic energy trading through a stochastic game. Their solution assists devices to trade the harvested energy with one another based on the service requirement. Reference [231] proposes a double-auction-based relay selection scheme to stimulate the relay nodes to forward packets for others, strengthening the network connectivity. For the in-home charging of electric vehicles, consumer electricity consumption can be controlled through electricity prices, using a method known as demand response. Under demand response, retailers determine their electricity prices, and customers respond accordingly with their electricity consumption levels. In particular, the demands of customers who own electric vehicles are elastic with respect to price. In this context, the interaction between retailers and customers can be seen as a game because both attempt to maximize their own payoffs. Considering this fact, the authors of [147] propose a Stackelberg game based on demand response to control the electricity consumption. Their results show that this game reaches an equilibrium point (NE) at which the electric vehicle charging requirements are satisfied and retailer profits are maximized, when customers use their own utility function. In addition, they conclude that the NE of this game can vary according to the weighting factor for the utility function of each customer, resulting in various strategic choices. Their numerical results confirm that the NE of the proposed game lies somewhere between the minimum generation-cost solution and the result of the equal-charging scheme.
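To make the leader-follower structure of such a demand-response game concrete, the following is a minimal sketch. It is not the model of [147]: the customer utility w*ln(1+x) - p*x, the single retailer with a constant unit cost, and the grid search over prices are all assumptions chosen for illustration, and every function name is hypothetical.

```python
from typing import List


def follower_demand(price: float, weight: float) -> float:
    """Best response of one customer (follower).

    Assumed utility: weight * ln(1 + x) - price * x, maximized over x >= 0.
    Setting the derivative to zero gives x = weight / price - 1.
    """
    return max(0.0, weight / price - 1.0)


def retailer_profit(price: float, weights: List[float], unit_cost: float) -> float:
    """Profit of the retailer (leader) once every customer has best-responded."""
    total_demand = sum(follower_demand(price, w) for w in weights)
    return (price - unit_cost) * total_demand


def stackelberg_price(weights: List[float], unit_cost: float,
                      step: float = 0.01, max_price: float = 20.0) -> float:
    """Leader's move: search a price grid, anticipating the followers' responses."""
    n_steps = int(round((max_price - unit_cost) / step))
    grid = [unit_cost + k * step for k in range(1, n_steps + 1)]
    return max(grid, key=lambda p: retailer_profit(p, weights, unit_cost))
```

For a single customer, this toy model has the closed-form leader price sqrt(unit_cost * weight), so a customer with a larger weighting factor leads to a higher equilibrium price. This mirrors, in a very simplified setting, the dependence of the equilibrium on the customers' weighting factors.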

Tactile Internet
The Tactile Internet (TI) involves human-to-machine (H2M) communications to remotely supervise tactile/haptic devices. These devices have local sensors and actuators and are required to be controlled via the Internet in a completely reliable and deterministic way [232]. The TI seems to converge towards a well-defined list of design goals [233]: very low latency on the order of 1 ms; ultra-high reliability; H2H/M2M coexistence; security. Other important challenges for the TI are as follows [233][234]: resource management; task allocation and orchestration; mobility of robots; remote robot steering and control applications. In addition, the authors of [72] report on the first initiative in designing LANs to facilitate the converged delivery of latency-sensitive TI traffic and bandwidth-intensive applications.

GT Future Trends Aligned with MEC Use Cases
We now present some thoughts about possible future GT research directions related to the MEC use cases discussed above. We can summarize the challenges in the following areas: augmented/assisted/virtual reality, cognitive assistance, dynamic and fair (virtual) resource allocation, seamless session transfer, and congestion control. Augmented/assisted/virtual reality as well as cognitive assistance have a strong demand for location and other environmental information obtained via cameras/sensors. Dynamic and fair (virtual) resource allocation has a prominent demand for elastic system capabilities such as storage and computing power. Seamless session transfer has a strong demand for efficient support of host mobility as well as of the life cycle of virtualized resources. Congestion control has a strong demand for two distinct networking aspects: technical and business. We focus the discussion that follows on how to control network congestion. The technical networking aspects that are relevant to mitigate congestion and improve the performance of MEC scenarios are as follows: traffic offloading [235][13][236], D2D communication [154][38][9], relaying [237][156] [118], and a recent topic designated as Full Duplex (FD) communications [238]. This last topic has a very high potential to enhance the performance of mobile networks, including the upcoming 5G [181]. In fact, an FD system allows a node to send and receive signals at the same time and on the same frequency resource, in principle doubling the spectral efficiency of each wireless link. The authors of [238] have used both GT and matching theory to analyse diverse centralized and distributed FD communication networks. All the technical aspects highlighted above for attenuating congestion in cellular networks can be deployed using CGs to enhance the cooperation among the devices [192][164] [143]. Nevertheless, to the best of our knowledge, only a small number of papers have proposed cooperative strategies among users or providers that consider the cost of cooperation [167]. This cost can be the power and terminal battery energy required for negotiation, the additional delay due to the information exchange, the security threats, and the fairness among nodes. In a practical scenario, it is not reasonable to neglect this cost in spectrum management, especially in a resource-constrained environment such as future networks dealing with the near-exponential increase in data traffic. Therefore, investigating the overhead, delay, security threats and fairness issues caused by the communication needed to form coalition groups for cooperation can be an interesting direction for future research in the area of GT. In addition, trust and security are key issues for future networks. The presence of malicious (or colluding) entities and their possible attacks against incentive mechanisms for cooperation need further research. Using GT to investigate issues such as how to detect and counteract such attacks, and how the reliability of the network can be maintained despite the presence of misbehaving nodes, can be a very fruitful research area [239]. Finally, we turn to the business aspect of taming network congestion. Reference [240] proposes the usage of a dynamic congestion-sensitive tariff, where prices are set in real time following load or interference to maximize efficiency and fairness.
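The idea of a congestion-sensitive tariff can be illustrated with a minimal sketch in which the operator raises the price when the offered load exceeds capacity and lowers it otherwise. This is not the scheme of [240]: the demand curve a/p for each user, the multiplicative update rule, and all names and default values are assumptions made for illustration.

```python
from typing import List


def demand(price: float, willingness: float) -> float:
    """Assumed elastic demand of one user: traffic falls as the price rises."""
    return willingness / price


def congestion_price(willingness_list: List[float], capacity: float,
                     price: float = 1.0, step: float = 0.5,
                     rounds: int = 50) -> float:
    """Iteratively adjust the tariff until the offered load matches capacity.

    With the assumed demand curve, each update moves the price a fraction
    `step` of the way towards sum(willingness) / capacity, so the iteration
    converges for 0 < step < 1.
    """
    for _ in range(rounds):
        load = sum(demand(price, a) for a in willingness_list)
        price *= 1.0 + step * (load - capacity) / capacity
    return price
```

For instance, with willingness values 2, 3 and 5 and a capacity of 5 units, the fixed point of this update is a price of 2, at which the total demand equals the capacity. A game-theoretic treatment would additionally model the users as strategic players who anticipate the price update.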

Final Reflection about GT Applied to MEC
Our discussion focused on identifying important topics where further work is needed to enhance the performance of future wireless networks in the face of the requirements that MEC is expected to impose. The topics that GT would need to address are, namely, heterogeneity, novel flow requirements, scarce resources for network connectivity, traffic offloading, and spectrum sharing. In our view, the GT aspects of cooperation and learning will be the most relevant for using a common pool of network resources in the most efficient way among the diverse players. Future games for MEC use cases should cross-fertilize ideas among the following features: incentives for the players to cooperate (e.g. pricing, reputation, auctions, lotteries, bargaining games, contract theory, mechanism design, artificial intelligence, and social interactions); the players should play the game in a way that integrates both learning and optimum operation of the available network resources; one should realistically assume that, in order to make a decision, the players have access to only a limited part of the total system status; and evolutionary algorithms and virtualization will offer flexibility and elasticity to the global system operation in spite of variable and diverse traffic loads.
In terms of the deployment of theoretical games at the network edge, using a single game to control the operation of such a heterogeneous and dynamic data communication system seems difficult, because the system has dynamic behaviour, constrained resources and uncertain information, and there are several simultaneous critical trade-offs to balance, namely: available energy vs. relaying traffic, computation offloading vs. channel interference, and data consistency vs. latency. So, we envision an alternative way to control the system, breaking the original highly complex problem into several lower-complexity sub-problems. To support this, at the MEC servers, we point out the existence of several virtual machines or "lighter" namespaces (e.g. Docker containers) with agents, where each agent can run its own theoretical game with very specific goals, acting as a decision-maker. In this architecture, the terminals should also have some namespaces to play the games in which those terminals need to be involved. The architecture would require, at the edge of the network, some orchestration among the distinct games played by the same terminal. A strong advantage of this coordination is the fully integrated distributed control of network, computational, and storage resources at the network edge. SDN [17] and NFV [241] can both abstract and orchestrate this control in quicker and more effective ways, resolving the various trade-offs in the best way possible [242]. This system can be designated as Software Defined Mobile Edge Computing (SD-MEC). In addition, SDN controllers can combine header fields from any stack layer, creating a cross-layer design that is very suitable for emerging wireless scenarios such as D2D communications [243], vehicular networks [244], and sensor networks [245]. The system could also be hierarchical to support flexibility, reliability and scalability.
Fig. 18 illustrates a network design of what we have just discussed. There are several communities; each one could be formed by people, devices, or systems. These entities are within communities because every entity has relationships with others in one form or another. This is the basis for another emerging networking concept designated as mobile social networks [9] [246]. A mobile social network is a user-centric system in which the devices not only process data, but also deal with context information (e.g. location) to either react (passively) to the current network status or predict (actively) an expected network status. In this scenario, the presence of agents at the network periphery (e.g. at either SDN controllers or applications running over the SDN controllers via Northbound APIs) can transform information into network knowledge, creating a cognitive network. This cognitive network, using theoretical games fuelled by evolutionary algorithms and artificial intelligence, can infer valid and precise human behavioural patterns from the aggregated data previously acquired via local sensors. Using the inferred knowledge, the agents run an election, essentially among neighbouring agents, that produces a ranked list of viable automatic actions. They then select the top-ranked actions, apply them to the monitored systems, including network resources and services, and learn from the obtained results [247]. In this way, a digital shared intelligence is created at the network periphery, hopefully with precise and valid operational equilibria, to solve the initial paradox of offering the highest viable service quality at minimum cost, as well as supporting fairness among the diverse competing flows with heterogeneous requirements in the presence of high and unexpected fluctuations in the traffic load.
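The election step among neighbouring agents can be sketched as follows. The text does not specify a voting rule, so this example assumes a Borda count, in which each agent submits a best-first ranking of the candidate actions. The function name `elect_actions` and the action labels in the usage note are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def elect_actions(rankings: List[List[str]], top_k: int) -> List[str]:
    """Aggregate the agents' best-first rankings with a Borda count.

    An action placed at position r in a ranking of n actions earns
    n - 1 - r points. Ties are broken alphabetically so that the
    result is deterministic.
    """
    scores: Dict[str, int] = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, action in enumerate(ranking):
            scores[action] += n - 1 - position
    ordered = sorted(scores, key=lambda a: (-scores[a], a))
    return ordered[:top_k]
```

For three agents ranking the actions "offload", "cache" and "relay" as (offload, cache, relay), (cache, offload, relay) and (offload, relay, cache), the scores are 5, 3 and 1 respectively, so the two top-ranked actions are "offload" and "cache". The learning step, in which agents update their rankings from the observed results, is outside the scope of this sketch.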

Conclusion
This paper first gathers background information that may help readers who are not GT specialists to follow the subsequent sections. The main focus of the paper is to comprehensively review and update the literature on applying GT in wireless data communication networks, particularly for scenarios aligned with the emerging MEC model. Finally, it covers open GT research issues for addressing emerging MEC challenges.

Fig. 18. Design of Mobile Social Network controlled and orchestrated by SD-MEC

Table I presents a comprehensive list of acronyms used throughout our survey.

Table II. Comparing Main Characteristics of Classical and Evolutionary Games
Columns: Classical Game | Evolutionary Game
Row "Players": they could be either system entities or optimization objectives with conflicting trends (e.g. delay vs. energy [40]).

Table III. Taxonomy of Applications of Repeated Games in Wireless Networks [33]
Networking Scenario | Main Goals
Cellular and WLANs | Multi-access; security; QoS
Ad hoc networks | Packet forwarding; energy efficiency; media streaming
Cognitive Radio | Spectrum sensing; spectrum usage; spectrum trading
Other Scenarios | Wireless network coding; fiber wireless access; wireless multicast