Frequency Domain Equalization for Single and Multiuser Generalized Spatial Modulation Systems in Time Dispersive Channels

In this letter, a low-complexity iterative detector with frequency domain equalization is proposed for generalized spatial modulation (GSM) aided single carrier (SC) transmissions operating in frequency selective channels. The detector comprises three separate tasks, namely multiple-input multiple-output (MIMO) equalization, active antenna detection per user, and symbol-wise demodulation. This approach makes the detector suitable for a broad range of MIMO configurations, including single-user and multiuser scenarios, as well as arbitrary signal constellations. Simulation results show that the receiver can cope with the intersymbol interference induced by severely time dispersive channels and operate in difficult underdetermined scenarios.


I. INTRODUCTION
Generalized spatial modulation (GSM) [1], [2] is a multiple-input multiple-output (MIMO) scheme that offers a tradeoff between the high spectral efficiency (SE) of full spatial multiplexing MIMO and the low complexity of the single radio frequency (RF) transmitter chain of spatial modulation (SM) [3]. GSM relies on the use of multiple RF chains in order to support transmission over multiple active antenna elements (AEs). The information is mapped onto a transmit antenna combination (TAC) and onto the modulated symbols, thus increasing the SE. Due to the transmission of multiple streams, GSM detection is more complex than SM detection. While the same symbol can be transmitted on all active AEs [1], in this letter we are concerned with the higher SE approach where a different symbol is sent on each active AE [2].
Considerable research effort has focused on SM and GSM schemes operating in flat fading channels [1]-[6]. However, in broadband systems the channel is often severely time dispersive and leads to high intersymbol interference (ISI) levels. Although orthogonal frequency division multiplexing (OFDM) is very popular for frequency selective environments, the combination with SM sacrifices most of its benefits [7]. A better suited alternative is single carrier (SC) transmission, which can potentially avoid the limitations of SM-OFDM while also providing higher frequency diversity. Motivated by this, the combination of SC with SM [7]-[9] and GSM [10]-[14] has recently started to attract substantial research effort. Regarding GSM-SC, which is the focus of this letter, a compressed-sensing (CS) based approach was studied in [10], but since it is applied in the time domain it can result in high complexity. A low complexity detection scheme was proposed in [11] which can achieve good performance in several scenarios. It was designed for zero-padded SC (ZP-SC) systems and has a complexity that grows directly with the GSM constellation size, making its application in large-scale systems difficult. In [12], several tree search algorithms for ZP-SC were proposed and evaluated. They can also achieve good performance but rely on a search over the whole GSM set, making them impractical for large-scale systems. In [13], several time-domain turbo equalization detectors were proposed, but they were designed specifically for application to ZP-SC systems. Regarding cyclic-prefixed (CP) aided SC transmissions, most of the research has been restricted to SM only [7], [9], where the special structure of CP-SC is exploited in order to implement part of the processing in the frequency domain. In [14], message passing (MP) based algorithms were proposed for multiuser (MU) MIMO with GSM. It was shown that the approach could be extended to CP-SC, but the complexity can become very high.
Compared to single-user (SU) scenarios, receivers for MU GSM systems face the major challenge of having to detect several GSM symbols simultaneously while still maintaining a reduced complexity. This problem is aggravated by the fact that, when considering all the users, the large total number of transmit antennas can often make the system large-scale and underdetermined. The challenge becomes even greater when the receiver has to mitigate the high ISI resulting from the operation of SC systems in multipath channels. Against this background, the main contributions of this letter are summarized as follows: 1) We develop an iterative detector for CP-aided GSM-SC systems which separates the tasks of MIMO equalization, active AE detection per user and symbol-wise demodulation. This splitting is accomplished through the alternating direction method of multipliers (ADMM), which was previously applied in a simpler form in [6], within the context of SU GSM transmissions in flat fading channels. In this letter the detector is designed to cope with the more challenging MU scenarios and with ISI-inducing frequency selective channels.

II. SYSTEM MODEL AND PROBLEM STATEMENT
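Let us consider an SC system where a base station with $N_{rx}$ receive antennas serves $N_u$ users. Each user is equipped with $N_{tx}$ transmit antennas, of which only $N_a$ AEs are active at any given time. This allows a total of $N_{comb} = 2^{\lfloor \log_2 \binom{N_{tx}}{N_a} \rfloor}$ TACs available per user. Every active AE transmits a different $M$-QAM modulated symbol, resulting in a total of $\lfloor \log_2 \binom{N_{tx}}{N_a} \rfloor + N_a \log_2 M$ bits mapped to each GSM symbol. A frequency selective channel with $L$ resolvable paths is assumed for each pair of transmitter-receiver antennas. We consider that the system operates with $N$-sized blocks employing a CP with length $N_{CP}$ ($N_{CP} \geq L-1$), and that the channel is time invariant during a block. The GSM signal vector transmitted by user $p$ ($p = 0, \ldots, N_u-1$) during channel use $t$ ($t = -N_{CP}, \ldots, N-1$) is denoted as $\mathbf{s}_t^p \in \mathcal{A}_0^{N_{tx} \times 1}$. It contains $N_a$ nonzero elements, placed at the positions given by the selected TAC and drawn from $\mathcal{A}$, with $\mathcal{A}_0 = \mathcal{A} \cup \{0\}$ and $\mathcal{A}$ denoting the $M$-sized complex valued constellation set. The received signal vector in the time domain can be written as

$$\mathbf{y}_t = \sum_{i=0}^{L-1} \mathbf{\Omega}_i \mathbf{s}_{t-i} + \mathbf{n}_t,$$

where $\mathbf{s}_t = [(\mathbf{s}_t^0)^T \cdots (\mathbf{s}_t^{N_u-1})^T]^T$ and $\mathbf{n}_t \in \mathbb{C}^{N_{rx} \times 1}$ is the vector containing independent zero-mean circularly symmetric Gaussian noise samples with covariance $2\sigma^2 \mathbf{I}_{N_{rx}}$. Matrix $\mathbf{\Omega}_i \in \mathbb{C}^{N_{rx} \times N_u N_{tx}}$ contains all the channel coefficients of the $i$th tap: its entry in row $r$ and in the column associated with transmit antenna $u$ of user $p$ is $h_{r,u}^{i,p}$, which represents the complex-valued channel gain between transmit antenna $u$ of user $p$ and receive antenna $r$. Dropping the CP, we can concatenate the received vectors as $\mathbf{y} = [\mathbf{y}_0^T \cdots \mathbf{y}_{N-1}^T]^T$ and write

$$\mathbf{y} = \mathbf{\Omega}\mathbf{s} + \mathbf{n},$$

with $\mathbf{s}$ and $\mathbf{n}$ obtained by stacking $\mathbf{s}_t$ and $\mathbf{n}_t$ in the same manner. The block circulant structure of the channel matrix $\mathbf{\Omega} \in \mathbb{C}^{N N_{rx} \times N N_u N_{tx}}$ allows it to be factorized as

$$\mathbf{\Omega} = (\mathbf{F}^H \otimes \mathbf{I}_{N_{rx}})\, \mathbf{H}\, (\mathbf{F} \otimes \mathbf{I}_{N_u N_{tx}}),$$

where $\mathbf{F}$ represents the unitary $N \times N$ discrete Fourier transform (DFT) matrix with elements $[\mathbf{F}]_{k,t} = \omega^{kt}/\sqrt{N}$, $\mathbf{H}$ is a block diagonal matrix whose $k$th block is $\mathbf{H}_k = \sum_{i=0}^{L-1} \mathbf{\Omega}_i\, \omega^{ki}$, and $\omega$ denotes a primitive $N$th root of unity. The received block can then be expressed in the frequency domain as

$$\mathbf{Y} = (\mathbf{F} \otimes \mathbf{I}_{N_{rx}})\mathbf{y} = \mathbf{H}\mathbf{S} + \mathbf{N},$$

with $\mathbf{N} = (\mathbf{F} \otimes \mathbf{I}_{N_{rx}})\mathbf{n}$. The maximum likelihood detection (MLD) problem can then be formulated as

$$\min_{\mathbf{S},\, \mathbf{s}} \; \|\mathbf{Y} - \mathbf{H}\mathbf{S}\|_2^2 \tag{9}$$

subject to

$$\mathbf{S} = (\mathbf{F} \otimes \mathbf{I}_{N_u N_{tx}})\mathbf{s}, \tag{10}$$

$$\mathrm{supp}(\mathbf{s}_t^p) \in \mathcal{S}, \quad t = 0, \ldots, N-1, \; p = 0, \ldots, N_u-1, \tag{11}$$

$$\mathbf{s} \in \mathcal{A}_0^{N N_u N_{tx}}, \tag{12}$$

where $\mathrm{supp}(\cdot)$ returns the set of indices of the nonzero elements and $\mathcal{S}$ denotes the set of valid TACs. Due to constraints (11) and (12), finding the exact solution requires a computational complexity that grows exponentially with the problem size, making it most often impractical.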
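The frequency domain model relies on the fact that, once the CP is dropped, the channel acts as a circular convolution over the block, so each frequency sees an independent MIMO channel. A minimal numerical check of this property is sketched below in plain Python. The function names, the toy dimensions and the choice $\omega = e^{-j2\pi/N}$ are assumptions made for illustration only, and noise is omitted.

```python
import cmath
import random

def unitary_dft(blocks):
    """Unitary DFT over the time index of a list of N equal-length complex vectors."""
    n = len(blocks)
    dim = len(blocks[0])
    return [
        [sum(cmath.exp(-2j * cmath.pi * k * t / n) * blocks[t][d] for t in range(n)) / n ** 0.5
         for d in range(dim)]
        for k in range(n)
    ]

def matvec(a, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def circular_channel(taps, s):
    """Noise-free y_t = sum_i taps[i] @ s[(t - i) mod N] (CP already removed)."""
    n = len(s)
    n_rx = len(taps[0])
    y = []
    for t in range(n):
        acc = [0j] * n_rx
        for i, omega_i in enumerate(taps):
            contrib = matvec(omega_i, s[(t - i) % n])
            acc = [a + c for a, c in zip(acc, contrib)]
        y.append(acc)
    return y

def frequency_response(taps, n):
    """H_k = sum_i taps[i] * exp(-j 2 pi k i / N) for k = 0, ..., N-1."""
    n_rx, n_in = len(taps[0]), len(taps[0][0])
    return [
        [[sum(taps[i][r][c] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(len(taps)))
          for c in range(n_in)] for r in range(n_rx)]
        for k in range(n)
    ]

def max_mismatch(n=8, n_rx=3, n_in=4, n_taps=3, seed=0):
    """Largest |Y_k - H_k S_k| entry for a random toy channel and random input block."""
    rng = random.Random(seed)

    def cgauss():
        return complex(rng.gauss(0, 1), rng.gauss(0, 1))

    taps = [[[cgauss() for _ in range(n_in)] for _ in range(n_rx)] for _ in range(n_taps)]
    s = [[cgauss() for _ in range(n_in)] for _ in range(n)]
    big_y = unitary_dft(circular_channel(taps, s))
    big_s = unitary_dft(s)
    h = frequency_response(taps, n)
    return max(abs(y_kr - hs_kr)
               for k in range(n)
               for y_kr, hs_kr in zip(big_y[k], matvec(h[k], big_s[k])))
```

Calling `max_mismatch()` should return a value at the level of floating point rounding, which confirms that equalization can be carried out frequency by frequency with matrices of size $N_{rx} \times N_u N_{tx}$ instead of a single matrix of size $N N_{rx} \times N N_u N_{tx}$.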

III. FREQUENCY DOMAIN GSM DETECTOR
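In this section, we apply a generalized version of ADMM [14] as a heuristic to provide good quality solutions with reduced complexity for the MLD problem. Firstly, we encode constraints (11) and (12) into (9) and rewrite the problem using a mixture of frequency domain and time domain variables as

$$\min_{\mathbf{S},\, \mathbf{x},\, \mathbf{z}} \; \|\mathbf{Y} - \mathbf{H}\mathbf{S}\|_2^2 + \sum_{t=0}^{N-1} \sum_{p=0}^{N_u-1} I_{\mathcal{S}}(\mathbf{x}_t^p) + I_{\mathcal{A}_0^{N N_u N_{tx}}}(\mathbf{z}) \tag{13}$$

subject to

$$\mathbf{S} = (\mathbf{F} \otimes \mathbf{I}_{N_u N_{tx}})\mathbf{x}, \tag{14}$$

$$\mathbf{S} = (\mathbf{F} \otimes \mathbf{I}_{N_u N_{tx}})\mathbf{z}, \tag{15}$$

where $I_{\mathcal{D}}(\cdot)$ is the indicator function of set $\mathcal{D}$ (zero if the argument belongs to $\mathcal{D}$ and $+\infty$ otherwise), $\mathbf{x} = [\mathbf{x}_0^T \cdots \mathbf{x}_{N-1}^T]^T$ and $\mathbf{x}_t = [(\mathbf{x}_t^0)^T \cdots (\mathbf{x}_t^{N_u-1})^T]^T$. The first term in (13) concerns the channel equalization, the summation in the second term refers to the active antenna detection for each time instant and individual user, while the third term deals with the individual alphabet symbols (including 0) on each antenna. The three terms use three different variables, $\mathbf{S}$, $\mathbf{x}$ and $\mathbf{z}$, which are related through (14) and (15). With these auxiliary variables, the objective function becomes separable over the three terms, which allows us to split the main problem. The augmented Lagrangian function (ALF), $L_{\mathbf{P}_x,\mathbf{P}_z}(\mathbf{S}, \mathbf{z}, \mathbf{x}, \mathbf{U}, \mathbf{W})$, is then built from (13)-(15), where $\mathbf{U}, \mathbf{W} \in \mathbb{C}^{N N_u N_{tx} \times 1}$ are the scaled dual variables associated with the equality constraints (14) and (15), and $\mathbf{P}_x$ and $\mathbf{P}_z$ are the penalty matrices. The gradient ascent method is then applied to the dual problem [15], resulting in the following sequence of iterative steps.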
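Step 1 (Minimization of the ALF Over S): The frequency domain estimate at iteration $q+1$ can be obtained from $\nabla_{\mathbf{S}^H} L_{\mathbf{P}_x,\mathbf{P}_z}(\mathbf{S}, \mathbf{z}, \mathbf{x}, \mathbf{U}, \mathbf{W}) = 0$ which, exploiting the block diagonal structure of $\mathbf{H}$, results in a separate update (17) for each frequency, where $\mathbf{S}_k^{(q+1)}$, $\mathbf{Y}_k$, $\mathbf{X}_k^{(q)}$, $\mathbf{Z}_k^{(q)}$, $\mathbf{U}_k^{(q)}$, $\mathbf{W}_k^{(q)}$, $\mathbf{P}_{x,k}$ and $\mathbf{P}_{z,k}$ represent slices of $\mathbf{S}^{(q+1)}$, $\mathbf{Y}$, $\mathbf{X}^{(q)}$, $\mathbf{Z}^{(q)}$, $\mathbf{U}^{(q)}$, $\mathbf{W}^{(q)}$, $\mathbf{P}_x$ and $\mathbf{P}_z$ matching the $k$th frequency. $\mathbf{X}$ and $\mathbf{Z}$ are the frequency domain representations of $\mathbf{x}$ and $\mathbf{z}$, i.e., $\mathbf{X} = (\mathbf{F} \otimes \mathbf{I}_{N_u N_{tx}})\mathbf{x}$ and $\mathbf{Z} = (\mathbf{F} \otimes \mathbf{I}_{N_u N_{tx}})\mathbf{z}$.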
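To make the per-frequency structure of Step 1 concrete, the sketch below solves one regularized least-squares problem per frequency in plain Python. It is only an illustration under stated assumptions: the ALF is taken in the usual scaled ADMM form with quadratic terms $\rho_x\|\mathbf{S}_k - \mathbf{X}_k + \mathbf{U}_k\|^2$ and $\rho_z\|\mathbf{S}_k - \mathbf{Z}_k + \mathbf{W}_k\|^2$, the penalty matrices are reduced to scalars `rho_x` and `rho_z`, and the helper names `solve` and `s_update` are hypothetical. The exact expression used in the letter may differ in scaling and sign conventions.

```python
def solve(a, b):
    """Solve a @ x = b for a small square complex system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [list(row) + [b_i] for row, b_i in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [m_rj - factor * m_cj for m_rj, m_cj in zip(m[r], m[col])]
    x = [0j] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def s_update(h_k, y_k, x_k, z_k, u_k, w_k, rho_x, rho_z):
    """Minimize ||y - H s||^2 + rho_x ||s - x + u||^2 + rho_z ||s - z + w||^2 over s.

    Assumed scaled-ADMM form; all arguments are slices for a single frequency k.
    """
    n_rx, n_in = len(h_k), len(h_k[0])
    # Left-hand side: H^H H plus the (assumed scalar) penalties on the diagonal.
    lhs = [[sum(h_k[r][i].conjugate() * h_k[r][j] for r in range(n_rx))
            + ((rho_x + rho_z) if i == j else 0.0)
            for j in range(n_in)] for i in range(n_in)]
    # Right-hand side: matched filter output plus the penalized targets.
    rhs = [sum(h_k[r][i].conjugate() * y_k[r] for r in range(n_rx))
           + rho_x * (x_k[i] - u_k[i]) + rho_z * (z_k[i] - w_k[i])
           for i in range(n_in)]
    return solve(lhs, rhs)
```

Because the left-hand side matrix depends only on the channel and on the penalties, its inverse (or factorization) can be computed once and reused in later iterations. The penalty terms also keep the system well conditioned when $N_u N_{tx} > N_{rx}$, where $\mathbf{H}_k^H \mathbf{H}_k$ alone is singular.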
Step 3 (Minimization of the ALF Over z): In this case, $\mathbf{z}^{(q+1)}$ is obtained with projection (19) over $\mathcal{A}_0^{N N_u N_{tx}}$.

Step 4 (Dual Variable Update): The dual variables $\mathbf{U}$ and $\mathbf{W}$ are then updated.

Algorithm 1 (excerpt)
7: Obtain $\mathbf{z}^{(q+1)}$ with projection (19).
10: If $f(\hat{\mathbf{s}}^{candidate}) < f_{best}$ then
11: $\hat{\mathbf{s}}_{\bar{I}} \leftarrow \mathbf{0}$, $\hat{\mathbf{s}}_I \leftarrow \hat{\mathbf{s}}_I^{candidate}$.
12: $f_{best} = f(\hat{\mathbf{s}}^{candidate})$.
13: end if

Algorithm 1 summarizes all the required steps, with $\hat{\mathbf{s}}$ denoting the final estimate and $Q$ the maximum number of iterations. In lines 11-14, $I$ is the support of $\mathbf{x}^{(q+1)}$, $\bar{I}$ is the respective complement (i.e., $\bar{I} = \{1, \ldots, N N_u N_{tx}\} \setminus I$), and $\hat{\mathbf{s}}_I$ ($\hat{\mathbf{s}}_I^{candidate}$) is the reduced $N_a N_u N \times 1$ vector containing the nonzero elements of $\hat{\mathbf{s}}$ ($\hat{\mathbf{s}}^{candidate}$) given by the support $I$. For initialization of the algorithm we can perform a random selection of a vector $\mathbf{s}$ with elements constrained within the constellation limits, followed by the projection over $\mathcal{S}$ and $\mathcal{A}_0^{N N_u N_{tx}}$ (using (18) and (19)) in order to obtain $\mathbf{x}^{(0)}$ and $\mathbf{z}^{(0)}$. $\mathbf{U}^{(0)}$ and $\mathbf{W}^{(0)}$ can be set to $\mathbf{0}$. Alternatively, we can apply the same procedure but starting with an initial vector $\mathbf{s}$ computed using the MMSE-FDE (as in [7]). The penalty coefficients, $\rho_i^x$ and $\rho_i^z$, are used as tuning parameters for achieving the best performance for a specific problem setting. Regarding the implementation of the algorithm, the matrix multiplications by $(\mathbf{F} \otimes \mathbf{I}_{N_u N_{tx}})$ and $(\mathbf{F}^H \otimes \mathbf{I}_{N_u N_{tx}})$, which are required in lines 6, 7 and 14, can be efficiently performed through $N_u N_{tx}$ fast Fourier transforms (FFTs). The matrix inverses in (17) only need to be computed in the first iteration of the algorithm. Taking this into account, the complexity in real-valued floating point operations (flops) per subcarrier is of cubic order, which is similar to that of the MMSE based FDE receiver from [7].
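The two projections and the dual update are simple element-level operations, which is what keeps the cost per iteration low. The sketch below illustrates them in plain Python for a single user and time instant. It assumes plain Euclidean projections (equal penalty weights on all elements) and the usual scaled-form dual update; the function names and the toy TAC table are hypothetical and not taken from the letter.

```python
from itertools import combinations

def project_tac(v, tacs):
    """Euclidean projection of one user's vector onto vectors supported on a valid TAC.

    The closest point keeps the entries of the TAC capturing the most energy
    and zeroes the remaining ones.
    """
    best = max(tacs, key=lambda tac: sum(abs(v[i]) ** 2 for i in tac))
    return [v_i if i in best else 0j for i, v_i in enumerate(v)]

def project_alphabet(v, alphabet):
    """Map every entry to the nearest point of the alphabet (which includes 0)."""
    return [min(alphabet, key=lambda a: abs(v_i - a)) for v_i in v]

def dual_update(u, s, x):
    """Scaled dual ascent step for the constraint s = x (assumed sign convention)."""
    return [u_i + s_i - x_i for u_i, s_i, x_i in zip(u, s, x)]

# Hypothetical toy setting: N_tx = 4 and N_a = 2 give 6 antenna pairs,
# of which 2**floor(log2(6)) = 4 are kept as valid TACs.
TACS = list(combinations(range(4), 2))[:4]

# QPSK with unit average energy, extended with the zero symbol.
QPSK_0 = [0j] + [complex(re, im) / 2 ** 0.5 for re in (1, -1) for im in (1, -1)]
```

For example, `project_tac([0.1, 0.2, 1.0, 1.0], TACS)` does not simply keep the two largest entries, because the pair (2, 3) is not in the toy TAC table; it keeps positions 1 and 2 instead, which is the valid combination with the highest energy.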
Compared with the complexity of the MP-GSM receiver [14] and of the CS-ZF approach from [10] (assuming the use of an interior-point method as in [17]), the complexity of the proposed MU-ADMM-FDE tends to grow much more slowly with the size of the problem setting.

IV. NUMERICAL RESULTS
In this section, we evaluate the performance of the proposed detector using Monte Carlo simulations. An uncoded MU SC system with $N = 128$, a block duration of 67 µs and a CP of 16.7 µs was considered. The adopted channel model was the Extended Typical Urban (ETU) model [18] (similar conclusions could be drawn for other severely time-dispersive channels). All the channel coefficients were independently drawn according to a zero-mean complex Gaussian distribution. Randomly selected modulated symbols were transmitted on the active AEs with $E[|s_i|^2] = 1$. Fig. 1 plots the bit error rate (BER) as a function of the signal-to-noise ratio (SNR) per user and receive antenna, for a MU scenario with $N_u = 12$, $N_{tx} = 7$, $N_a = 2$ and $N_{rx} = 42$, and for different configurations of the MU-ADMM-FDE. Curves for QPSK and 64-QAM modulations are shown, which correspond to SEs of 8 and 16 bits per channel use (bpcu). The penalty parameter values were $\rho_i^x = \rho_i^z = 28$. It can be observed that both the initialization and the number of iterations influence the behavior of the detector. For example, starting the algorithm with the MMSE based initialization tends to achieve better performance than the random initialization with the same number of iterations. We can also see that at least 30-50 iterations may be needed before the performance gains become small. While this may seem a large number, it is important to remember that the complexity cost per iteration is small. Furthermore, it is possible to reduce the number of iterations effectively used by adopting a stopping criterion based on the primal and dual residuals [15], or by detecting stall conditions where the variables do not show relevant change after several iterations.
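The SEs quoted for Fig. 1 can be checked with a few lines of Python. The sketch assumes the usual GSM convention in which only $2^{\lfloor \log_2 \binom{N_{tx}}{N_a} \rfloor}$ antenna combinations are used, so that the TAC carries an integer number of bits; the function name is hypothetical.

```python
from math import comb, log2

def gsm_bits_per_symbol(n_tx, n_a, m):
    """Bits per GSM symbol: TAC index bits plus n_a symbols from an m-ary constellation."""
    # floor(log2(c)) for a positive integer c, computed exactly with bit_length.
    tac_bits = comb(n_tx, n_a).bit_length() - 1
    # m is assumed to be a power of two (e.g. 4 for QPSK, 64 for 64-QAM).
    return tac_bits + n_a * int(log2(m))
```

With $N_{tx} = 7$ and $N_a = 2$ there are 21 antenna pairs, of which 16 are usable, giving 4 TAC bits. Adding two QPSK symbols gives 4 + 2·2 = 8 bpcu, and adding two 64-QAM symbols gives 4 + 2·6 = 16 bpcu, matching the values quoted in the text.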
In Fig. 2 we present the BER and the complexity in real-valued flops of the MU-ADMM-FDE (with $Q = 30$ iterations) and compare them against the MMSE-FDE [7], CS-ZF [10] and MP-GSM [14]. The setting corresponds to a system with an SE of 4 bpcu, where $N_u = 10$, $N_{tx} = 4$, $N_a = 1$, $N_{rx} = 42$ and the adopted modulation is QPSK. This scenario corresponds to an underdetermined system which, as expected, the MMSE-FDE has more difficulty coping with. The other three detectors are capable of operating more reliably, with the MU-ADMM-FDE being the most effective in coping with the ISI induced by the channel and detecting the GSM symbols, providing gains over CS-ZF and MP-GSM of 6 dB and 1.3 dB, respectively, at a BER of $10^{-4}$. It can also be seen that the advantage is not only in terms of BER performance but also in terms of complexity, which is substantially lower than that of CS-ZF and MP-GSM. Fig. 3 illustrates the impact of changing the loading factor (defined as $N_u/N_{rx}$) on the SNR required to achieve a target BER of $10^{-4}$ when $N_{rx} = 32$. Three different configurations with the same SE of 6 bpcu per user are considered. The proposed receiver is employed for all setups, including the case $N_{tx} = N_a = 1$ (conventional MU MIMO). For low loads, the use of GSM has a clear performance advantage over the conventional MU system. For loads above 0.25, the GSM systems become underdetermined ($N_u N_{tx} > N_{rx}$) and, even though the proposed receiver is still able to perform well, the SNR degradation becomes sharper until the point where the BER of $10^{-4}$ becomes unreachable. In this high-load region, the conventional MU setup becomes a better performing solution (and it can be handled by the same receiver).
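The underdetermination condition can be restated directly in terms of the loading factor, since $N_u N_{tx} > N_{rx}$ is equivalent to $N_u/N_{rx} > 1/N_{tx}$. The short sketch below illustrates this. Note that the value $N_{tx} = 4$ used in the usage example is an inference from the quoted threshold of 0.25 and is not stated explicitly for Fig. 3; the function names are hypothetical.

```python
def is_underdetermined(n_u, n_tx, n_rx):
    """True when the total number of transmit antennas exceeds the receive antennas."""
    return n_u * n_tx > n_rx

def load_threshold(n_tx):
    """Largest loading factor N_u / N_rx for which the system is not yet underdetermined."""
    return 1.0 / n_tx
```

With $N_{rx} = 32$ and an assumed $N_{tx} = 4$, `is_underdetermined(8, 4, 32)` is false (load 0.25) while `is_underdetermined(9, 4, 32)` is true. For the conventional MU setup, `load_threshold(1)` equals 1, so that system only becomes underdetermined when there are more users than receive antennas.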
These results reveal that, when there is a sufficient number of receive antennas at the base station, increasing the spectral efficiency through the use of GSM might be a better strategy than increasing the modulation order. In terms of additional SNR required, it tends to be "easier" to reliably infer the active transmit antennas than to detect the correct modulation symbols from a larger constellation alphabet. It is important to note that this SNR advantage of GSM schemes is only possible as long as the number of receive antennas is sufficiently higher than the existing sparsity ($N_u N_a$). Otherwise, reliable recovery of the sparse signal is not possible, which is why the advantage is only achieved in the low-load regime.

V. CONCLUSION
This letter presented a novel iterative detector for SU and MU SC-GSM transmissions in frequency-selective channels, which achieves a reduced complexity implementation through frequency domain equalization. Numerical simulations show that the proposed receiver can effectively cope with the ISI induced by severely time dispersive channels and operate in difficult underdetermined scenarios. The inherent splitting-based design of the algorithm allows it to easily deal with GSM based transmissions, which can be more attractive in low-load scenarios, and to switch to conventional MU detection whenever the load becomes high.