Multiple Antenna Enhancements for a High Rate CDMA Packet Data System


Journal of VLSI Signal Processing 30, 55–69, 2002
© 2002 Kluwer Academic Publishers. Manufactured in The Netherlands.

HOWARD HUANG, HARISH VISWANATHAN, ANDREW BLANKSBY AND MOHAMED A. HALEEM
Lucent Technologies, 791 Holmdel-Keyport Rd., Holmdel, NJ 07733, USA

Received September 25, 2000; Revised November 14, 2001

Abstract. A High Data Rate (HDR) system has been proposed for providing downlink wireless packet service by using a channel-aware scheduling algorithm to transmit to users in a time-division multiplexed manner. In this paper, we propose using multiple antennas at the transmitter and/or at the receiver to improve the performance of an HDR system. We consider the design tradeoffs between scheduling and multi-antenna transmission/detection strategies and investigate the average Shannon capacity throughput as a function of the number of antennas, assuming ideal channel estimates and rate feedback. The highest capacities are achieved using multiple antennas at both the transmitter and receiver. For such systems, the best performance is achieved using a multi-input multi-output capacity-achieving transmission scheme such as BLAST (Bell Labs Layered Space-Time), in which the transmitted signal is coded in space and time and the receive antennas are used to resolve the spatial interference. In the second part of the paper, we discuss practical transmitter and receiver architectures using BLAST for approaching the theoretical gains promised by the capacity analysis. Because the terminal receivers will be portable devices with limited computational and battery power, we perform a computational complexity analysis of the receiver and make a high-level assessment of its feasibility. We conclude that the overall computational requirements are within the reach of current hardware technology.

Keywords: BLAST, high data rate, CDMA, multiple antennas

1. Introduction

As the demand for wireless packet data services increases and radio spectrum becomes scarcer, communication engineers face the challenge of designing systems which are increasingly efficient in their spectrum use and which are tailored to the characteristics of packet data services. By exploiting the spatial domain, diversity techniques using antenna arrays are known to provide improved performance compared to conventional single antenna systems. A recent innovation known as BLAST (Bell Labs Layered Space-Time) [1] uses arrays at both the transmitter and receiver, providing potentially enormous gains compared to diversity systems. What follows is a brief overview of multiple antenna diversity systems, BLAST, and the design challenges of wireless communication systems with antenna arrays. We then discuss a high data rate (HDR) system [2] designed specifically for wireless packet data services and provide motivation for a combined HDR-BLAST system.

In wireless communication systems, the transmitter and/or receiver are often surrounded by objects such as buildings, trees, pedestrians, and cars which scatter and attenuate the transmitted signal. The scattered signals arrive at the receiver and, depending on their relative phases, add constructively or destructively. Subtle movements of the objects, the transmitter, or the receiver can cause wide variations in the phases, resulting in a received signal whose amplitude varies in time. The channel through which the signal traverses is known as a time-varying fading channel, and it presents one of the major challenges in wireless communication system design. A well-known technique for combating fading channels is diversity. There are many types of diversity; however, the overall concept is that the receiver capitalizes on a signal which traverses multiple independent realizations of the fading channel. For example, with transmit diversity, multiple transmit antennas are used to transmit the same data. Compared to a single transmitter system transmitting with the same total power, this system has an advantage since it is unlikely that the signals associated with each of the antennas will fade simultaneously. A single transmit antenna and multiple receive antennas result in similar gains due to receive diversity. However, if signals are coherently combined among the antennas, there is the additional benefit of increased signal-to-noise ratio (SNR). The increase in average SNR is directly proportional to the number of receive antennas. As the number of transmit and receive antennas increases, the post-combiner SNR increases, but the gains due to diversity saturate so that the only channel impairment becomes additive Gaussian noise.

In this paper, we will use Shannon capacity as a measure of the link and system performance. The Shannon capacity of a communication link, measured in bits per second per Hertz, is the theoretical limit of information that can be transmitted and reliably decoded at the receiver [3]. For a system with transmit and receive diversity, the capacity increases logarithmically as the number of receive antennas increases. For example, in a diversity system with 2 transmit and 2 receive antennas at 15 dB average SNR, the average capacity is about 5.80 bps/Hz. Doubling the number of receive antennas results in a capacity increase of about one bps/Hz to 6.91 bps/Hz. By further doubling the number of transmit antennas, the capacity increases only slightly to 6.94 bps/Hz because the diversity gains are already near saturation.
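The saturation in these figures can be checked against a simple bound. The sketch below is illustrative and not the authors' derivation: it assumes that the diversity-combined SNR of a link with M transmit and N receive antennas is $(\rho/M)\sum_{m}\sum_{n}|h_{m,n}|^2$ with independent unit-variance coefficients $h_{m,n}$ (the form used in the diversity bound developed later in the paper), and it applies Jensen's inequality to the concave logarithm.

```latex
\begin{align*}
\bar{C}_{\mathrm{div}(M,N)}
  &= \mathbb{E}\!\left[\log_2\!\Big(1 + \frac{\rho}{M}\sum_{m=1}^{M}\sum_{n=1}^{N}|h_{m,n}|^2\Big)\right]
   \le \log_2\!\Big(1 + \frac{\rho}{M}\,MN\Big) = \log_2(1 + N\rho), \\[4pt]
\rho = 15\ \mathrm{dB} \approx 31.62:\quad
  & N = 2:\ \log_2(64.25) \approx 6.01, \qquad
    N = 4:\ \log_2(127.49) \approx 6.99 .
\end{align*}
```

The bound depends on N but not on M, and at high SNR doubling N adds at most about one bit. This is consistent with the quoted averages of 5.80, 6.91, and 6.94 bps/Hz, which lie just below 6.01 and 6.99.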
While diversity systems use the spatial dimension to increase the capacity through improved link SNR, recent information theory results show that significantly larger capacity gains are achievable if the spatial dimension is used differently. In particular, the theory tells us that the capacity can increase linearly with respect to the number of transmitters or receivers (whichever is lower). For example, a system with 2 transmit and 2 receive antennas at 20 dB can achieve an average capacity of about 8.28 bps/Hz. In doubling both the number of transmit and receive antennas, the average capacity becomes 16.23 bps/Hz. Bell Labs Layered Space-Time (BLAST) [1] is an architecture for achieving a significant fraction of the potential capacity gains of these multiple antenna systems. In contrast to diversity systems where the same data is sent through multiple antennas, BLAST transmits different data streams through the antennas and, in its most general form, uses coding to introduce redundancy in both space and time. The signals share the same frequency band, but they can be resolved at the receiver by using multiple antennas and by relying on the distinct spatial signatures induced by the fading channel in a rich scattering environment.

BLAST is a promising technology with potential applications to a number of wireless multiple access systems. In [4], the authors studied a time-division multiple access (TDMA) system employing BLAST techniques, fixed power, rate adaptation, and capacity-approaching coding. A related study [5] investigates the use of specific modulation formats. In [6], the authors apply BLAST and diversity techniques to a code-division multiple access (CDMA) system and evaluate the system capacity in terms of users per sector supported at a given data rate and bit error rate. In the context of CDMA, BLAST uses the same spreading code to transmit independent data from different antennas. Hence each code can be ‘reused’ up to M times, where M is the number of transmit antennas.

These traditional TDMA and CDMA cellular systems have been designed for voice traffic and are characterized by low tolerance for latency and by equal rate service over the entire system (except perhaps for a few areas of outage where the minimum SNR requirement is not met). In future wireless systems, there will be demands for non-real-time data applications such as email and web browsing with data rates significantly higher than those associated with voice service.
Recognizing that these data applications have a much higher tolerance for latency, the authors of [2] designed a time-division multiplexed high data rate (HDR) system where each base station transmits to a single user at a time and where the data rate depends on the link quality. In theory, to maximize the total throughput, the base station would transmit only to those users with the best link quality at any given time. These users are typically near the base station at the cell center. In practice, however, to ensure fairness to users at the cell edge, the base uses a scheduling algorithm to transmit to a user when its time-varying fading channel is in some sense better than its average channel [7].

Figure 1 shows the block diagram of the HDR transmitter. The downlink data stream is encoded with a serially concatenated code, and the output bits are scrambled and mapped to either a QPSK, 8-PSK, or 16-QAM constellation. These modulated symbols are channel interleaved and punctured and/or repeated as necessary. They are then demultiplexed into 16 substreams and modulated with mutually orthogonal Walsh covers. During each transmission slot, a pilot signal is time-division multiplexed with the data traffic signal to allow the terminals to make channel estimates and measurements. A pseudonoise (PN) sequence modulates the resulting sum of data substreams or the pilot signal. The signal is then filtered, modulated by the carrier, and transmitted.

Figure 1. Conventional HDR transmitter.

In this conventional HDR system, the data rate is varied using a combination of symbol repetition, variable coding rates, and variable data constellation sizes. Multiple antennas provide additional options such as transmit diversity for improving the link performance and BLAST transmission for increasing the maximum data rate via code reuse [8]. In this paper, we evaluate the system performance of an HDR system with multiple antennas in an idealized setting using Shannon capacity as a link metric and perfect scheduling at the base station. While such idealized results may be difficult to achieve in practice, it is of interest to study trends in the nature of improvements that are possible with multiple antennas.

We evaluate both diversity systems and BLAST-type systems, and we use a scheduling algorithm given in [7] to ensure fairness, assuming perfect and instantaneous SNR knowledge. As expected, the system gains attained using BLAST are significant; however, these gains can be realized only if the mobile receiver possesses sufficient processing power for BLAST signal detection. In the second part of the paper, we discuss an architecture for such a BLAST receiver and perform a high-level complexity analysis to assess its feasibility using current hardware technology. Because of the similarities between the transmitted signals for the HDR-BLAST and CDMA-BLAST systems, we note that this general receiver architecture can be used for either system.

The paper is organized as follows. In Section 2, we present the various issues concerning multiple antennas in a system with efficient scheduling that motivate much of the study carried out in this paper. In Section 3, we present the link level calculations for determining the achievable Shannon capacity versus received SNR and use these results with system level simulations to determine the system throughput using the various antenna architectures. In Section 4, we propose a practical system architecture using BLAST for approaching the predicted capacities. Because the terminal receivers will be portable devices with limited computational and battery power, we perform a computational complexity analysis of the receiver and make a high-level assessment of its feasibility.

2. Multiple Antennas and Scheduling

Using multiple antennas with HDR provides three advantages over conventional HDR. First, with multiple receive antennas, the gains from antenna combining reduce the power required to achieve a given rate. Alternatively, one can achieve higher rates using the same power. Second, if multiple antennas are available at both the transmitter and receiver so that BLAST transmission can be used, the maximum achievable data rate is M times the rate achievable with a single transmit antenna (where M is the number of transmit antennas). Higher peak throughputs imply not only better average throughputs but also better throughput-delay characteristics. Third, with BLAST transmission, some intermediate data rates can be achieved with a combination of BLAST and small data constellations. Compared to a single antenna transmission scheme that uses a larger constellation to achieve the same rate, the BLAST technique may require a smaller SNR, resulting in overall improved system performance.

In the HDR system, the base station serves multiple users in a time-division multiplexed manner and uses a scheduling algorithm to ensure fairness to users at the cell edge. Pilot bursts are embedded in each slot transmission to allow mobile terminals to measure the SNR of the strongest base’s signal. This value is mapped to a data rate corresponding to the maximum rate at which the mobile can reliably demodulate the signal. The data rate value is transmitted from the mobile to the base as often as once every 1.67 ms. Because of the high frequency of the data rate updates, the scheduling algorithm can take advantage of favorable channel fades for each of the users. By transmitting to users when their channel is favorable, the scheduler provides a form of multiuser diversity.

With multiple antennas at the transmitter, the diversity gains may not be significant in the HDR system with efficient scheduling, since the multiuser diversity gains that already exist might outweigh the benefits from transmit antenna diversity. However, the multiuser diversity gains depend on the number of users and the parameters of the scheduling algorithm, for example the delay requirements of the various users. Hence transmit diversity may still be useful to a limited extent in some situations. When multiple antennas are available at both the transmitter and receiver, space-time coding schemes such as BLAST that trade off transmit diversity for higher throughput might be very efficient, and the benefit of dual antenna arrays that is observed in point-to-point links might carry over to the multiple user system with scheduling. It is also likely that other scheduling algorithms are superior when multiple antennas are available. For example, transmitting to a single user in each slot may not necessarily be optimal, especially when a large number of antennas is available.

We explore the design tradeoffs in using antenna diversity, BLAST, and scheduling through detailed simulations. In the system evaluation part of the paper, we consider the average throughput per cell sector assuming an ideal feedback channel and also assuming that the base station transmits at the maximum data rate given by the Shannon capacity as a function of the measured SNR.

3. Simulation Study of HDR System

3.1. Link Level Simulations

The Shannon capacity of a communication link is the theoretical limit of information that can be transmitted and reliably decoded at the receiver. In theory, this capacity is achieved for the multiple antenna system with Rayleigh fading in additive Gaussian noise and channel knowledge at the receiver by encoding with Gaussian distributed codewords with arbitrarily long block lengths. In this section, we consider the Shannon capacity of the individual links in order to obtain upper bounds on the overall system throughput. In the next section, we consider how to approach these capacities in practice.

The Shannon capacity of an unrestricted multi-input multi-output (MIMO) system and that of a MIMO system restricted to diversity coding were studied in [9]. For completeness, we review these results here. Consider a link with M transmit antennas and N receive antennas, denoted as (M,N). If the channel is flat fading and richly scattering, the normalized complex channel coefficients between the mth transmit and nth receive antenna can be modeled as independent and identically distributed unit-variance complex Gaussian random variables $h_{m,n}$ (m = 1, ..., M, n = 1, ..., N). The link Shannon capacity for a given channel realization is

$$C_{M,N} = \log_2 \det\left( I + \frac{\rho}{M} H H^{H} \right) \ \text{bps/Hz} \tag{1}$$

where ρ is the average SNR at each receive antenna, “det” denotes the matrix determinant, I denotes the identity matrix, and the (m,n)th component of the matrix H is $h_{m,n}$. The SNR is divided by M because the transmit power is normalized to be independent of the number of transmit antennas (that is, the total average transmit power from the base is kept constant).

Alternatively, one could use the multiple antennas to provide only diversity. In this case, the goal of the space-time encoding scheme is to achieve maximum transmit diversity without regard to the number of receive antennas. In essence, the space-time coding achieves transmit diversity of order M. For such diversity-restricted coding schemes, the link capacity of an (M,N) system is upper-bounded by the Shannon capacity of a single-transmit, single-receive system with an equivalent SNR of $\frac{\rho}{M}\sum_{m=1}^{M}\sum_{n=1}^{N}|h_{m,n}|^2$ as follows:

$$C_{\mathrm{div}(M,N)} \le \log_2\left( 1 + \frac{\rho}{M}\sum_{m=1}^{M}\sum_{n=1}^{N} |h_{m,n}|^2 \right) \ \text{bps/Hz}. \tag{2}$$

Equality is achieved when there is only N = 1 receive antenna or M = 1 transmit antenna. For M = 2 transmit antennas, the bound can be achieved using space-time spreading (STS), which provides transmit diversity in a flat-fading CDMA channel without incurring bandwidth penalties [10]. Note that the upper bound in (2) and the capacity in (1) are equivalent when either M = 1 or N = 1.

For a given SNR and antenna architecture, we can numerically derive cumulative distribution functions of the unrestricted and diversity-restricted MIMO capacities from (1) and (2), respectively. For practical considerations, we study M = 1, 2, 4 antennas at the base station transmitter and N = 1, 2, 4 antennas at the mobile receiver. These link level results are used in the system level simulations to obtain the system throughputs. Figure 2 shows the distributions of the capacities for 10 dB SNR and various architectures. For the MIMO systems restricted to diversity, the distribution curves become more vertical as the number of antennas increases, indicating the saturation of the diversity benefits. For the unrestricted MIMO systems, the Shannon capacity increases more significantly. For a given architecture and SNR value, one can compute the average capacity from the corresponding distribution function. Figure 3 shows the average capacities as a function of SNR.
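The link level quantities in (1) and (2) can be evaluated numerically with a short Monte Carlo sketch such as the one below. It is a minimal illustration that uses only the Python standard library, not the simulator used for the figures; the function names, the trial count, and the seed are assumptions made for the example.

```python
import math
import random


def rayleigh_channel(rng, m, n):
    """M x N list of i.i.d. unit-variance complex Gaussian coefficients h[m][n]."""
    s = math.sqrt(0.5)
    return [[complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]
            for _ in range(m)]


def determinant(a):
    """Determinant of a small square complex matrix (Gaussian elimination with pivoting)."""
    a = [row[:] for row in a]
    size = len(a)
    det = complex(1.0)
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0.0:
            return complex(0.0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det


def mimo_capacity(h, snr):
    """Eq. (1): log2 det(I + (snr/M) H H^H) in bps/Hz; snr is linear, not dB."""
    m = len(h)
    gram = [[(1.0 if i == j else 0.0)
             + (snr / m) * sum(x * y.conjugate() for x, y in zip(h[i], h[j]))
             for j in range(m)] for i in range(m)]
    return math.log2(determinant(gram).real)


def diversity_capacity_bound(h, snr):
    """Eq. (2): capacity of the equivalent single-antenna link, in bps/Hz."""
    m = len(h)
    energy = sum(abs(x) ** 2 for row in h for x in row)
    return math.log2(1.0 + (snr / m) * energy)


def average_capacities(m, n, snr_db, trials=2000, seed=0):
    """Monte Carlo averages of Eq. (1) and Eq. (2) over independent fading draws."""
    rng = random.Random(seed)
    snr = 10.0 ** (snr_db / 10.0)
    total_mimo = total_div = 0.0
    for _ in range(trials):
        h = rayleigh_channel(rng, m, n)
        total_mimo += mimo_capacity(h, snr)
        total_div += diversity_capacity_bound(h, snr)
    return total_mimo / trials, total_div / trials
```

Calling `average_capacities(2, 2, 10.0)` returns one pair of averages of the kind plotted against SNR; sorting the per-realization values instead of averaging them would give an empirical distribution function.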

Figure 2. Cumulative distribution function of link level Shannon capacity for diversity-restricted and unrestricted MIMO systems, SNR = 10 dB.

Figure 3. Average link capacity.

We emphasize that the results derived in this section are for a flat fading channel. Wideband CDMA systems will most likely encounter frequency selective channels which result in loss of orthogonality between spreading codes due to multipath delays. If no measures are taken to address the multipath fading, the capacity would be reduced. A portion of this capacity could be recouped by extending the equalizer techniques described in [11] to multiple antenna systems. However, this study is beyond the scope of this paper.

3.2. System Level Simulations

Link level results are used in the system level simulations to obtain system throughputs and to study the tradeoffs between antenna and multiuser diversity. A fixed number of users K is placed uniformly in the sector of interest. For each user, one determines the average measured SNR, corresponding to the signal power of the strongest received base divided by the sum power of the remaining bases and thermal noise. Note that we are implicitly modeling the base station interference received by the mobile terminals as spatially white Gaussian noise. This is a reasonable assumption since each base’s signal is the sum of code-multiplexed signals, resulting in a sufficiently large number of contributing terms to the interference. Pathloss and shadow fading are used to compute the received signal powers from each of the bases. The distribution of the SNRs for a large network of 3-sector cells and frequency reuse of one is obtained from [2] and shown in Fig. 4.

Figure 4. Cumulative distribution function of measured SNR.

Each user’s signal is also assumed to experience Rayleigh fading due to scattering. When multiple antennas are considered, the scattering is assumed to be sufficiently rich so that the fading is independent across the antennas. Each user determines the corresponding supportable capacity $R_r$, according to (1) or (2), as a function of the instantaneous fading channel realization. The fading is assumed to remain constant over the duration of the scheduling interval or slot and to be independent across slots. The base then transmits to the user with the highest $R_r/R_a$, where $R_r$ is the requested rate fed back by the user, and $R_a$ is the average rate received by the mobile over a window of time.

This scheduling algorithm, first described in [12], ensures that a user is served when its channel realization is better than it has been in the recent past. Because the average rate decreases as the amount of time that a user is not served increases, this user is more likely to be served even if its channel does not improve significantly. Note that the algorithm is, in some sense, fair to users regardless of their location with respect to the base station. More specifically, as recognized in [12], this algorithm satisfies the proportional fairness criterion [13], which states that the percentage increase in throughput to any particular user is less than the sum of percentage decreases to all other users under any other scheduler.

We assume that the rates of all K users are fed back to the base with no errors, and that the channel is static between the time of request and transmission. In practice, the rate $R_r$ would be drawn from a discrete set; however, we obtain an upper bound on system throughput by assuming a continuous rate set drawn from the link level distributions.

3.3. Simulation Results

In our system simulations, the positions (average SNRs) of the K users are fixed for 10000 slots, and we assume independent fading realizations for each user from slot to slot. We study the average sector throughput derived by averaging the rates over 50 independent realizations of average SNRs for each data point. The throughput is a function of K, the number of antennas, and the feedback technique. We consider two feedback techniques. In the first technique, the rate $R_r$ is computed from either (1) or (2) assuming all M antennas are transmitting, and this value is fed back to the base station transmitter. In the performance figures, these variants are labeled as “rate feedback (FB)” and “rate feedback (FB), div(ersity) b(ou)nd,” respectively. In the second feedback technique, the rate is computed by assuming that only the antenna with the highest capacity is used. In other words,

$$R_r = \max_{m} \log_2\left( 1 + \rho \sum_{n=1}^{N} |h_{m,n}|^2 \right).$$
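The best-antenna rate above and the $R_r/R_a$ selection rule of Section 3.2 can be sketched together as follows. This is a minimal illustration under stated assumptions: the average rate $R_a$ is tracked with an exponential window (the text only says "a window of time"), the window length, slot count, and function names are hypothetical, and any other rate function, such as one implementing (1), could be passed in place of the two shown.

```python
import math
import random


def best_antenna_rate(h, snr):
    """Rate/antenna feedback: (rate in bps/Hz, index of the best transmit antenna).

    h[m][n] is the coefficient from transmit antenna m to receive antenna n,
    and snr is the linear average SNR; all power goes to the selected antenna.
    """
    gains = [sum(abs(x) ** 2 for x in row) for row in h]
    best = max(range(len(h)), key=lambda m: gains[m])
    return math.log2(1.0 + snr * gains[best]), best


def diversity_bound_rate(h, snr):
    """Rate feedback under the diversity bound of Eq. (2); no antenna index is needed."""
    m = len(h)
    energy = sum(abs(x) ** 2 for row in h for x in row)
    return math.log2(1.0 + (snr / m) * energy), None


def simulate_sector(avg_snr_db, m, n, rate_fn, slots=2000, window=100.0, seed=0):
    """Average sector throughput (bps/Hz) when each slot serves the largest R_r / R_a."""
    rng = random.Random(seed)
    snrs = [10.0 ** (s / 10.0) for s in avg_snr_db]
    avg_rate = [1e-3] * len(snrs)  # R_a; small positive start avoids division by zero
    s = math.sqrt(0.5)
    total = 0.0
    for _ in range(slots):
        # Independent Rayleigh fading for every user in every slot.
        requested = []
        for snr in snrs:
            h = [[complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]
                 for _ in range(m)]
            requested.append(rate_fn(h, snr)[0])
        user = max(range(len(snrs)), key=lambda k: requested[k] / avg_rate[k])
        # Assumed exponential averaging: unserved users see their R_a decay.
        for k in range(len(snrs)):
            served = requested[k] if k == user else 0.0
            avg_rate[k] += (served - avg_rate[k]) / window
        total += requested[user]
    return total / slots
```

For example, `simulate_sector([0.0, 5.0, 10.0], 2, 1, best_antenna_rate)` and the same call with `diversity_bound_rate` give one pair of throughputs for three users with fixed average SNRs; averaging over many draws of the average SNRs would be needed to approximate a data point in the figures.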

In addition to the rate feedback, the terminal must also feed back the index of the transmit antenna that achieves this maximum rate. Hence this technique is labeled as “rate/ant(enna) feedback (FB)”. The base station then transmits all the power from this antenna to achieve the determined capacity. While this scheme requires additional feedback bandwidth, we will see that there is a significant increase in throughput for the multiple transmit antenna, single receive antenna case.

Figure 5 shows the average sector throughput as a function of the number of users for M = 2 or 4 transmit antennas and N = 1 receive antenna. Notice that all the curves increase with an increasing number of users due to multiuser diversity. When there is only one user in the system, the throughput increases in going from one antenna to two and four antennas. However, with more users in the system, the trend is actually reversed for rate-only feedback. (Recall that with a single receive antenna, the Shannon capacity and diversity bound capacity are equivalent.) This is because transmit diversity reduces the variations in the SNR (a high SNR becomes less probable), and hence the gains from efficient scheduling (multiuser diversity gains) are actually reduced. This shows that with rate-only feedback, the gains from multiuser diversity with a sufficient number of users are actually superior to those from transmit diversity. The performance with rate/antenna feedback is uniformly better than rate-only feedback since the best antenna is used to transmit to any user. Essentially, each user appears as M different users, where M is the transmit diversity order, and hence there is greater efficiency from scheduling compared to the single transmit antenna case for any number of users.

Figure 5. Benefit from multiple transmit antennas.

Figure 6 shows the throughput results for M = 2 transmit antennas and N = 2 or 4 receive antennas. Comparing these results with those in Fig. 5, it is clear that the gains from receive antennas are superior to those from transmit antennas. This is as expected since, in addition to receive diversity, receive antennas provide antenna combining gain. Nevertheless, the gains from receive antennas also decrease with an increasing number of users. Note that rate and antenna feedback now performs worse than rate-only feedback. This shows that when two or more receive antennas are available, it is better to use both transmit antennas simultaneously than to transmit from only the best antenna. The additional capacity of the (2,2) system over the (1,2) system more than compensates for the multiuser diversity gains. As expected, the capacity of the diversity bound (which can actually be achieved for N = 2 receive antennas) is inferior to rate and antenna feedback. This is because these techniques both achieve transmit diversity, and the technique with antenna feedback uses more information to achieve a higher rate.

Figure 6. Benefit from multiple receive antennas (M = 2).

For completeness, Fig. 7 shows the throughput results for M = 4 transmit antennas and N = 4 receive antennas. The relationships and trends are the same as in Fig. 6; however, the throughputs are higher with respect to M = 2 transmit antennas for rate-only feedback and rate/antenna feedback. Note that for the diversity bound, the capacity is lower for (4,4) than (2,4) because of the reduced variation of SNR and reduced efficiency of the scheduler in the former case.

Figure 7. Benefit from multiple receive antennas (M = 4, N = 4).

Figure 8 shows the normalized sector throughput with an increasing number of antennas for the cases of only one user in the sector and 16 users in the sector. The solid line corresponds to a system with 1 receive antenna at each terminal and multiple (1, 2, or 4) antennas at the base station. The dashed line corresponds to a system with a single transmit antenna at the base and multiple receive antennas (1, 2, or 4) at each of the terminals. Finally, the dotted line corresponds to multiple antennas at both the base station and the terminals ((1,1), (2,2), or (4,4)) using rate-only feedback. For the one user case, the trends are the same as the average or outage capacity results in [14]. Transmit diversity gives the least improvement and eventually saturates with an increasing number of antennas. The most dramatic improvement occurs when multiple antennas are available at both the base station and the terminals. When there are 16 users, the multiple antenna gains are uniformly reduced for all schemes. Nevertheless, the throughput gains with dual arrays are still significant over the case with only receive antennas, and the gains appear to grow linearly with the number of antennas as before.

Figure 8. Throughput trends for multiple transmit and multiple receive antennas.

4. Implementation of an HDR System with BLAST

The system performances derived in the previous section were based on a Shannon capacity analysis and could be achieved in theory using Gaussian distributed codewords and arbitrarily long block lengths. In practice, for single antenna transmitters, turbo codes and iterative decoding techniques can approach the Shannon capacity if the interleaver depth is sufficiently large [15]. The latency tolerance for packet data allows these coding techniques to be used. Hence in the HDR proposal, a family of turbo codes based on serially concatenated convolutional codes is used to provide powerful error correcting capability at low SNRs [16]. The encoder structure is shown in Fig. 9 and includes an interleaver between the outer and inner encoders. The outer convolutional code is rate-1/2 and has 16 states, while the inner code, also rate-1/2, has 4 states. The overall concatenated code rate is 1/4, but by puncturing the outer and/or the inner convolutional code, concatenated codes of rates 3/8 and 1/2 are also supported.

Figure 9. Turbo Encoder.

Unfortunately, the success of turbo codes has not been extended to systems with multiple transmit antennas (except in the (2,1) case as noted in [9]). However, using the BLAST technique [1] with single-dimensional turbo codes, a significant fraction of the capacity is achieved by encoding the data in space and time and transmitting the streams simultaneously over multiple antennas. At the receiver, multiple antennas are required to distinguish the streams based on their spatial characteristics.

Our proposed architecture with M transmit antennas extends the original HDR architecture using an M-ary demultiplexer following the channel encoder. These M parallel data streams are modulated and transmitted simultaneously through the M antennas. Details of this BLAST transmitter are given in the following subsection.

At the receiver, the number of receive antennas must be at least as large as the number of transmit antennas for BLAST demodulation. These antennas must be spaced sufficiently far apart so that the correlation of the received signals across antennas is small. This spacing is on the order of half a wavelength, which for a 2 GHz carrier is 7.5 cm. Because high data rate applications will most likely target personal digital assistants and laptop computers, it is possible to have up to four antennas with sufficient spacing. In Subsection 4.2, we describe the receiver architecture and address its computational complexity.

4.1. Transmitter Architecture

The proposed HDR transmitter with M antennas is shown in Fig. 10. Compared to the conventional single antenna transmitter in Fig. 1, the encoded data stream is now demultiplexed into M streams, and the channel interleaver is replaced with a generalized space-time interleaver for distributing the coded symbols in time and across antennas. Each of the M streams is modulated with the same set of 16 Walsh covers. These signals are summed, modulated with the same PN sequence, and transmitted simultaneously over the M antennas. There are a total of 16M substreams, and the M substreams which share the same code are distinguishable at the receiver only through their spatial channel characteristics.

Figure 10. HDR transmitter, M transmit antennas.

4.2. Receiver Architecture

We now describe and perform a complexity analysis for an HDR BLAST receiver architecture with N receive antennas, as shown in Fig. 11. The purpose of this section is to obtain a high level estimate of the processing requirements to assess the receiver’s feasibility. Let C be the number of chips per symbol, M be the number of transmit antennas, L be the number of resolvable multipath components. We assume that the timings of the L multipath delays and the MLN channel coefficients have been estimated. For a given symbol period, let rn,l be the C-dimensional complex vector representing the sampled baseband signal at the nth receive antenna corresponding to the lth multipath.

Figure 11. HDR receiver, N receive antennas.

4.2.1. PN Sequence Descrambling. The received signal is first descrambled using the complex conjugates of the PN sequence. Let the descrambling sequence be represented by a C-dimensional complex vector $p$ whose components are the complex conjugates of the scrambling sequence. Descrambling is performed by taking the component-wise product of the descrambling vector with the received signal: $p \otimes r_{n,l}$. Because the components of the descrambling sequence are $\pm 1 \pm j$, each component-wise multiplication consists of 2 real additions. Hence there are a total of 2C additions per vector per symbol, and a total of 2CLN additions for all LN received signals.

4.2.2. Walsh Code Despreading. The descrambled signals are despread with the C Walsh code sequences. Let $w_k$ be the kth Walsh code sequence (k = 1, ..., C). Then the despreading corresponds to taking the inner product between the code and the descrambled signal:

$$x_{k,n,l} = \langle w_k, \; p \otimes r_{n,l} \rangle. \tag{3}$$
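As an illustration of descrambling followed by the despreading in (3), the following minimal Python sketch evaluates the inner products directly for a single received vector. The function names, the toy scrambling chips and the data symbol are hypothetical and chosen only for this example; the Walsh codes are generated with the standard Sylvester recursion, and noise and multipath are omitted.

```python
# Sketch of Sections 4.2.1-4.2.2 for one received vector (illustrative names only).

def walsh_matrix(c):
    """Return the C-by-C Walsh (Sylvester-Hadamard) matrix; c must be a power of 2."""
    w = [[1]]
    while len(w) < c:
        w = [row + row for row in w] + [row + [-v for v in row] for row in w]
    return w

def descramble(p, r):
    """Component-wise product of the descrambling vector p with the received vector r."""
    return [pc * rc for pc, rc in zip(p, r)]

def despread_direct(d):
    """Inner product of the descrambled vector d with each Walsh code (column k of W)."""
    c = len(d)
    w = walsh_matrix(c)
    return [sum(w[i][k] * d[i] for i in range(c)) for k in range(c)]

# Toy check with C = 4: one symbol is sent on the code with zero-based index 2.
C = 4
s = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]        # assumed scrambling chips, +/-1 +/- j
p = [chip.conjugate() for chip in s]          # descrambling vector
a = 0.5 - 0.25j                               # hypothetical data symbol
W = walsh_matrix(C)
r = [a * W[i][2] * s[i] for i in range(C)]    # noiseless received chips
x = despread_direct(descramble(p, r))         # x[2] carries the symbol, scaled by 2C
```

In this noiseless toy case only the output for the transmitted code is nonzero, and it equals the data symbol scaled by 2C, since each scrambling chip has squared magnitude 2.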

Because the Walsh sequences are binary, the inner product consists of 2C real additions. For each of the LN descrambled signals, there are C inner products (one for each code) consisting of 2C real additions, for a total of $2C^2LN$ real additions. We can reduce this number of computations if we consider the special structure of the Walsh codes. For codes of length $C = 2^n$ (n = 1, 2, 3, ...), the $2^n$ orthogonal codes are given by the columns of $W_{2^n}$, which is given by the following recursive expression:

$$W_{2^n} = \begin{bmatrix} W_{2^{n-1}} & W_{2^{n-1}} \\ W_{2^{n-1}} & -W_{2^{n-1}} \end{bmatrix}, \qquad n = 1, 2, 3, \ldots$$

Figure 12. Fast Walsh-Hadamard Transform, order 4.

where $W_1 = 1$. The inner products between a C-dimensional vector and the C Walsh codes can be obtained using a Fast Walsh-Hadamard Transform discussed in [17]. An example is shown in Fig. 12 for C = 4, where $x_1, \ldots, x_4$ are the elements of the input vector, and $y_1, \ldots, y_4$ are the resulting inner products of the vector with the four Walsh codes given by the columns of $W_4$. Using this technique, the number of total real additions required for processing each descrambled signal is reduced from $2C^2$ to $2C \log_2 C$; hence the Walsh code despreading requires a total of $2LNC \log_2 C$ real additions per symbol.

4.2.3. Space-Time Combiner. The signal components corresponding to each of the MC substreams are distributed among LN components of the despreader outputs. For the kth code and nth receive antenna, we collect these components given by (3) to form an L-dimensional vector: $x_{k,n} = [x_{k,n,1}, \ldots, x_{k,n,L}]^T$. Because each code is transmitted from all M antennas, there are M channel coefficients corresponding to each element of $x_{k,n}$. For example, for the component $x_{k,n,l}$, the M channel coefficients are $h_{1,l,n}, \ldots, h_{M,l,n}$, corresponding to the channels from the M antennas over the lth multipath to the nth receive antenna. The L-dimensional vector of channel coefficients corresponding to the vector $x_{k,n}$ over the mth transmitter is $h_{m,n} = [h_{m,1,n}, \ldots, h_{m,L,n}]^T$. There are MLN channel coefficients in total, which we assume have been estimated during a training phase. The space-time combining operation weights and combines each of the despreader outputs with the complex conjugate of its corresponding channel. For the kth code, the space-time combiner output is an M-dimensional vector given by

$$y_k = \sum_{n=1}^{N} H_n^H x_{k,n},$$

where the mth column of the L-by-M channel matrix $H_n$ is $h_{m,n}$. For each code, the space-time combiner requires MLN complex multiplications and MLN complex additions. Each complex multiplication requires 6 real operations (4 real multiplications and 2 real additions), and each complex addition requires 2. Therefore the entire operation requires 8CMLN real operations per symbol.

4.2.4. V-BLAST Detector. Each component of the vector $y_k$ is corrupted by spatial interference due to the other M − 1 components. In addition, in frequency selective channels (i.e., L > 1), there is also interference due to the substreams spread by the other codes. One could choose to mitigate this other-code interference, using for example a decorrelating detector [6]. However, this multipath interference may be ignored since it is typically less severe than the spatial interference among the code-sharing substreams. In general, there would be less multipath interference if higher order Walsh sequences were used. To eliminate the spatial interference, one could use a maximum-likelihood detector. However, the complexity of this technique grows exponentially with M. The V-BLAST detector is a computationally efficient alternative which is comparable to the maximum-likelihood detector in terms of performance [18].
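The two front-end stages that produce the vector $y_k$ discussed above, fast Walsh-Hadamard despreading (Section 4.2.2) and space-time combining (Section 4.2.3), can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the receiver implementation: the function names and the nested-list layouts `x_k[n][l]` and `H[n][l][m]` are hypothetical, and the toy channel is flat (L = 1) and noiseless.

```python
# Illustrative sketch: fast Walsh-Hadamard despreading and space-time combining.

def fwht(d):
    """Fast Walsh-Hadamard transform (Hadamard order) of a list of length 2^n."""
    y = list(d)
    h = 1
    while h < len(y):
        for start in range(0, len(y), 2 * h):
            for i in range(start, start + h):
                # One butterfly: a complex add and a complex subtract.
                y[i], y[i + h] = y[i] + y[i + h], y[i] - y[i + h]
        h *= 2
    return y

def fwht_real_additions(c):
    """2*C*log2(C) real additions for a complex input of length C (a power of 2)."""
    return 2 * c * (c.bit_length() - 1)

def space_time_combine(x_k, H):
    """Return y_k = sum over n of H_n^H x_{k,n} as a list of M complex values.

    x_k[n][l]  : despreader output for one code, receive antenna n, multipath l
    H[n][l][m] : channel from transmit antenna m over path l to receive antenna n
    """
    N, L, M = len(H), len(H[0]), len(H[0][0])
    return [sum(H[n][l][m].conjugate() * x_k[n][l]
                for n in range(N) for l in range(L))
            for m in range(M)]

# Order-4 transform of a small test vector.
y4 = fwht([1, 2, 3, 4])

# Toy (2,2) flat-fading example: x_{k,n} = H_n a, so the combiner returns R a.
H = [[[1 + 0j, 1j]], [[1 + 0j, -1j]]]        # assumed channel, N = 2, L = 1, M = 2
a = [1 + 1j, 1 - 1j]                          # QPSK symbols on one code
x_k = [[sum(H[n][0][m] * a[m] for m in range(2))] for n in range(2)]
y_k = space_time_combine(x_k, H)
```

For C = 16 the transform needs `fwht_real_additions(16)` = 128 real additions per descrambled signal, compared with $2C^2 = 512$ for direct inner products. In the toy channel the columns of the stacked channel matrix happen to be orthogonal, so the combiner output is simply twice the transmitted symbols; in general the components of $y_k$ interfere with each other, which is what the detector described in this subsection resolves.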
A single V-BLAST detector algorithm can resolve the interference among a set of M substreams given by the vector $y_k$. Therefore a bank of C V-BLAST detectors is needed in the receiver. The V-BLAST algorithm requires the M-by-M code-channel correlation matrix

$$R_k = \sum_{n=1}^{N} H_n^H F_k H_n,$$

where $F_k$ is the L-by-L code correlation matrix for the kth code. The (i, j)th element of $F_k$ is the inner product of the ith delayed PN/Walsh sequence (the component-wise product of the PN scrambling sequence with the kth Walsh code: $w_k \otimes p$) with the jth delayed PN/Walsh sequence. For example, if the delay of the l = 2 multipath relative to the l = 1 multipath is a single chip, then the (1,2) element of $F_k$ is

$$\begin{bmatrix} w_k \otimes p \\ 0 \end{bmatrix}^H \begin{bmatrix} 0 \\ w_k \otimes p \end{bmatrix}.$$
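The construction of the code correlation matrix and the code-channel correlation matrix can be sketched in Python as below. This is only an illustration under simplifying assumptions that go beyond the text: the multipath delays are taken to be whole chips, neighboring symbols are ignored (as in the single-chip example above), and the function names, the toy sequence and the channel values are hypothetical.

```python
# Illustrative sketch: code correlation matrix F_k and code-channel matrix R_k.

def code_correlation(seq, delays):
    """Return the L-by-L matrix F with F[i][j] = <delayed seq i, delayed seq j>.

    seq    : PN/Walsh sequence for one code (component-wise product of w_k and p)
    delays : integer chip delays of the L multipaths (assumption: chip-spaced)
    """
    span = len(seq) + max(delays)
    shifted = [[0j] * d + list(seq) + [0j] * (span - len(seq) - d) for d in delays]
    return [[sum(u.conjugate() * v for u, v in zip(si, sj)) for sj in shifted]
            for si in shifted]

def code_channel_correlation(H, F):
    """Return R = sum over n of H_n^H F H_n, where H[n] is the L-by-M matrix H_n."""
    N, L, M = len(H), len(H[0]), len(H[0][0])
    return [[sum(H[n][i][a].conjugate() * F[i][j] * H[n][j][b]
                 for n in range(N) for i in range(L) for j in range(L))
             for b in range(M)] for a in range(M)]

# Toy example: C = 4 chips, two paths one chip apart, one transmit/receive antenna.
seq = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]     # assumed PN/Walsh sequence
F = code_correlation(seq, delays=[0, 1])
R = code_channel_correlation([[[1 + 0j], [1 + 0j]]], F)
```

The diagonal entries of `F` equal the sequence energy and `F` is Hermitian, which is why only the off-diagonal entries on one side of the diagonal need to be computed.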

Each term of $F_k$ requires 4C real operations, and there are a total of $L^2$ terms. However, since each diagonal element of $F_k$ is the energy per symbol of the PN/Walsh sequence, the diagonal elements do not require computation. Also, since $F_k$ is a Hermitian symmetric matrix, the total number of operations for calculating each $F_k$ is upper-bounded by $2CL^2$. Under the assumption of flat fading, the vector $y_k$ is a sufficient statistic vector for the substreams spread by the kth code [6]. Dropping the subscript k for simplicity, the vector can be written as

$$y = Ra + n, \tag{4}$$

where R is the code-channel correlation matrix for the kth code, a is the M-dimensional vector of coded data symbols corresponding to the kth code, and n is the associated complex-valued additive Gaussian noise vector. Given the correlation matrix R and the vector y, the V-BLAST algorithm [19] successively detects the data symbols of a using the following steps:

1. Determine the component of y with the highest signal-to-noise ratio (SNR).
2. Correlate the vector y with a vector which satisfies either the minimum mean-squared error or zero-forcing criterion so that the result corresponds to the component with the highest SNR and is free from interference due to the other M − 1 components.
3. Use a slicer to estimate the symbol.
4. Using the estimated symbol, remove the contribution of this component from the vector y.
5. Repeat steps 1 through 4 until all M components have been detected.

Let the ordered set $S = \{j_1, j_2, \ldots, j_M\}$ be a permutation of the integers 1, 2, ..., M specifying the order in which the components of $a_k$ are extracted. The V-BLAST algorithm can be written in the following pseudo-code:

for m = 1 to M
    Calculate R^{-1}                                                          (5)
    j_m = arg min_i [R^{-1}]_(i,i)                               (Step 1)     (6)
    w_{j_m} = j_m-th column of R^{-1}                            (Step 2)     (7)
    z_{j_m} = w_{j_m}^H y                                                     (8)
    â_{j_m} = slice(z_{j_m})                                     (Step 3)     (9)
    y = y − â_{j_m} × (j_m-th column of R); take out j_m-th term (Step 4)     (10)
    R = [R] with the j_m-th column and row taken out                          (11)
end

The matrix inverse in (5) can be computed using the Gauss-Jordan technique [20]. For M = 2, 3, 4, this computation requires respectively 100, 376 and 792 real operations. In (6) the component index of y with the strongest SNR is assigned to $j_m$ and corresponds to the index of the minimum noise variance given by the diagonal elements of $R^{-1}$. The vector $w_{j_m}$ is the zero-forcing vector. The correlation in (8) requires M − m + 1 complex multiplications and M − m complex additions, resulting in 8(M − m) + 6 real operations. For a QPSK data constellation, the slicing operation in (9) requires 2 real operations. Reconstructing the contribution of the $j_m$th component in (10) requires M − m complex multiplications, and removing the contribution from y requires M − m complex additions. The $j_m$th term of y is removed in (10), and the matrix R is deflated in (11) by removing the $j_m$th row and column. For the Mth iteration, only the slicing operation in (9) is required for the QPSK constellation. For M = 2 and 4, the total number of operations per symbol per code for the V-BLAST algorithm ((5) through (11)) is, respectively, 126 and 1350. The operation count for the V-BLAST algorithm can be reduced significantly if the correlation matrices $R_k$ can be reused among several codes or if they do not vary significantly from symbol to symbol, so that they do not need to be recomputed each symbol period. Future studies will address these potential simplifications. Additional reductions in complexity can be achieved by using an efficient algorithm for nulling and cancellation which avoids calculating the matrix inverse of each deflated matrix R [21].

After the bank of C V-BLAST detectors, the signals are demapped, deinterleaved and passed to the turbo decoder. These processing blocks require memory but do not require arithmetic operations.
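The zero-forcing form of the pseudo-code in (5) through (11) can be sketched in Python as follows. This is a minimal sketch for small M, not an optimized detector: the function names, the unnormalized QPSK slicer and the 2-by-2 test matrix are assumptions made for the example, and the inverse is recomputed for every deflated matrix exactly as in the pseudo-code.

```python
# Illustrative zero-forcing V-BLAST detector following steps (5)-(11).

def invert(A):
    """Gauss-Jordan inverse of a small square complex matrix (partial pivoting)."""
    n = len(A)
    aug = [list(map(complex, row)) + [complex(i == j) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def slice_qpsk(z):
    """Nearest point of the (unnormalized) QPSK constellation {+/-1 +/- j}."""
    return complex(1 if z.real >= 0 else -1, 1 if z.imag >= 0 else -1)

def vblast_zf(R, y):
    """Detect a from y = R a + n; returns the estimates in the original order."""
    R = [list(row) for row in R]
    y = list(y)
    active = list(range(len(y)))            # original indices not yet detected
    a_hat = [None] * len(y)
    while active:
        Rinv = invert(R)                                              # (5)
        j = min(range(len(active)), key=lambda i: Rinv[i][i].real)    # (6)
        w = [Rinv[i][j] for i in range(len(active))]                  # (7)
        z = sum(wi.conjugate() * yi for wi, yi in zip(w, y))          # (8)
        sym = slice_qpsk(z)                                           # (9)
        a_hat[active[j]] = sym
        y = [yi - sym * R[i][j] for i, yi in enumerate(y)]            # (10)
        del y[j], active[j]
        R = [[v for c, v in enumerate(row) if c != j]                 # (11)
             for r, row in enumerate(R) if r != j]
    return a_hat

# Noiseless toy example with M = 2 and an assumed Hermitian correlation matrix.
R = [[2, 0.5 + 0.5j], [0.5 - 0.5j, 1]]
a = [1 - 1j, -1 + 1j]
y = [sum(R[i][m] * a[m] for m in range(2)) for i in range(2)]
detected = vblast_zf(R, y)
```

In the noiseless example the detector returns the transmitted symbols, starting with the component whose diagonal entry of $R^{-1}$ is smallest.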

Figure 13. Turbo decoder.

4.2.5. Turbo Decoder. The decoder structure is shown in Fig. 13. The optimal decoding algorithm is the Maximum A Posteriori Probability (MAP) algorithm proposed by Bahl et al. [22]. However, the complexity of implementing the MAP algorithm directly is prohibitive and hence suboptimal decoding algorithms, the most common being the Soft Output Viterbi Algorithm (SOVA) [23], are used in practice. It was shown in [24] that the computational cost of a SOVA decoder iteration per single bit is $3(K + 1) + 2^K$ maximum operations, $2^{K+1} + 8$ additions, and $6(K + 1)$ bit comparisons, where K is the constraint length of the convolutional code. For the HDR system K is 4 for the outer convolutional code and 2 for the inner code, yielding 101 and 47 operations per decoder iteration and per bit respectively. Hence the total operations count for decoding a block of length B bits is given by 148MBD, where D is the number of decoder iterations. For example, using M = 4 transmit antennas, a packet length of 4096 bits, D = 4 decoder iterations, and a packet duration of 1.66 ms, the turbo decoder requires

$$\frac{148\ \tfrac{\text{ops}}{\text{bit}} \times 4 \times 4096\ \tfrac{\text{bits}}{\text{packet}} \times 4}{1.66\ \tfrac{\text{ms}}{\text{packet}}} = 5.8 \times 10^{9}\ \frac{\text{ops}}{\text{s}}.$$
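The arithmetic behind the decoder operation count can be checked with a short Python sketch. The function and parameter names are hypothetical; the per-bit cost simply adds the three terms quoted from [24], and the throughput follows the 148MBD expression divided by the packet duration.

```python
# Illustrative check of the SOVA operation counts and the turbo decoder load.

def sova_ops_per_bit(K):
    """Operations per decoded bit and per iteration for constraint length K."""
    max_ops = 3 * (K + 1) + 2 ** K
    additions = 2 ** (K + 1) + 8
    bit_comparisons = 6 * (K + 1)
    return max_ops + additions + bit_comparisons

def turbo_ops_per_second(M, B, D, packet_s, K_outer=4, K_inner=2):
    """Decoder load in operations per second for M antennas, B bits, D iterations."""
    per_bit = sova_ops_per_bit(K_outer) + sova_ops_per_bit(K_inner)
    return per_bit * M * B * D / packet_s

outer = sova_ops_per_bit(4)    # 15 + 16 + 32 + 8 + 30
inner = sova_ops_per_bit(2)    # 9 + 4 + 8 + 8 + 18
rate = turbo_ops_per_second(M=4, B=4096, D=4, packet_s=1.66e-3)
```

With these inputs `outer` is 101, `inner` is 47, and `rate` is approximately 5.8 × 10^9 operations per second, in agreement with the figures in the text.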

Note that this simplified analysis has neglected the substantial number of memory operations associated with SOVA and interleaving. Table 1 gives the number of operations per second for the following three systems: a (1,2) system with only space-time combining (no additional processing for interference suppression), a (2,2) system with V-BLAST detection, and a (4,4) system with V-BLAST detection. The values are based on using C = 16 chips per symbol, L = 3 resolvable multipath components, a block length of B = 4096 bits, and a symbol rate of 76.8K symbols per second. Turbo decoding uses the majority of the processing cycles. Comparing a (1,2) and (2,2) system, the additional processing required for V-BLAST detection is an order of magnitude less than the additional processing required for turbo decoding. For a (4,4) system, the V-BLAST processing accounts for about 22% of the total processing while the turbo decoding accounts for about 73%. In this complexity analysis, we have ignored the processing required for estimating the path delays and channel coefficients. However, one can assume that these operations, which are performed during a time-division multiplexed training phase, require less processing than for the data. Hence, because we have assumed data processing for the full duty cycle, the values obtained are upper bounds on the actual processing requirements.

Table 1. Number of operations per second for an HDR receiver with multiple antennas.

                        (1,2)         (2,2)         (4,4)
PN descrambling         1.4 × 10^7    1.4 × 10^7    2.9 × 10^7
Walsh despreading       5.9 × 10^7    5.9 × 10^7    1.2 × 10^8
Space-time combiner     5.9 × 10^7    1.2 × 10^8    2.4 × 10^8
V-BLAST                 -             1.5 × 10^8    1.7 × 10^9
Turbo decoding          1.5 × 10^9    2.9 × 10^9    5.8 × 10^9
Total                   1.6 × 10^9    3.3 × 10^9    7.9 × 10^9
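As a rough cross-check, the PN descrambling and Walsh despreading rows of Table 1 can be regenerated from the per-symbol counts 2CLN and $2LNC \log_2 C$ and the symbol rate, and the quoted percentage shares follow from the (4,4) column. The Python sketch below is only an illustration; the constant and function names are hypothetical, and the table entries are copied as printed.

```python
# Illustrative check of two Table 1 rows and of the (4,4) processing shares.
from math import log2

SYMBOL_RATE = 76.8e3    # symbols per second
C, L = 16, 3            # chips per symbol, resolvable multipath components

def pn_descrambling_rate(N):
    """2*C*L*N real additions per symbol, scaled to operations per second."""
    return 2 * C * L * N * SYMBOL_RATE

def walsh_despreading_rate(N):
    """2*L*N*C*log2(C) real additions per symbol, scaled to operations per second."""
    return 2 * L * N * C * log2(C) * SYMBOL_RATE

# Entries of the (4,4) column as printed in Table 1.
VBLAST_44, TURBO_44, TOTAL_44 = 1.7e9, 5.8e9, 7.9e9
vblast_share = VBLAST_44 / TOTAL_44
turbo_share = TURBO_44 / TOTAL_44

rates = {N: (pn_descrambling_rate(N), walsh_despreading_rate(N)) for N in (2, 4)}
```

For N = 2 this gives roughly 1.5 × 10^7 and 5.9 × 10^7 operations per second, and for N = 4 roughly 2.9 × 10^7 and 1.2 × 10^8, which matches the table to within the rounding of its entries; the two shares evaluate to about 0.22 and 0.73.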

5. Conclusions

The impact of multiple antennas at the transmitter and receiver for a packet data system with channel-aware scheduling was studied. We showed that the relative gains from multiple antennas are considerably reduced compared to a point-to-point system with the same number of antennas. Nevertheless, the trends in the gains are similar and we continue to see a linear increase in average throughput with increasing number of transmit and receive antennas. For single antenna receivers, we showed that multiuser diversity from efficient scheduling often outweighs the benefits of transmit diversity. Allowing for additional feedback regarding the strongest received antenna, selection diversity achieves better performance. For multiple antenna receivers, the best performance is achieved using a multi-input multi-output capacity-achieving transmission scheme such as BLAST, in which the transmitted signal is coded in space and time, and the receive antennas are used to resolve the spatial interference. The actual gains achieved by this transmission technology remain to be studied through detailed link simulations. A receiver architecture for the BLAST transmission scheme was outlined and a complexity analysis was performed. The complexity is dominated by the turbo decoder, and the overall processing requirements are within the range of current hardware technology.

References

1. G.J. Foschini, “Layered Space-Time Architecture for Wireless Communication in a Fading Environment When Using Multi-Element Antennas,” Bell Labs Technical Journal, vol. 1, no. 2, Autumn 1996, pp. 41–59.
2. P. Bender, P. Black, M. Grob, R. Padovani, N. Sindhushayana, and A. Viterbi, “CDMA/HDR: A Bandwidth-Efficient High-Speed Wireless Data Service for Nomadic Users,” IEEE Communications Magazine, vol. 38, no. 7, 2000, pp. 70–77.
3. C.E. Shannon, “A Mathematical Theory of Communication,” Bell System Technical Journal, vol. 27, 1948, pp. 379–423, 623–656.
4. F.R. Farrokhi, A. Lozano, G.J. Foschini, and R.A. Valenzuela, The 11th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, 2000 (PIMRC 2000), vol. 1, pp. 373–377.
5. S. Catreux, P.F. Driessen, and L.J. Greenstein, IEEE Transactions on Communications, vol. 49, no. 8, 2001, pp. 1307–1311.
6. H. Huang, H. Viswanathan, and G.J. Foschini, “Multiple Antennas in Cellular CDMA Systems: Transmission, Detection, and Spectral Efficiency,” IEEE J. Selected Areas in Commun., to appear.
7. P. Viswanath, D. Tse, and R. Laroia, “Opportunistic Beamforming using Dumb Antennas,” IEEE Transactions on Information Theory, submitted.
8. H. Huang and H. Viswanathan, “Multiple Antennas and Multiuser Detection in High Data Rate CDMA Systems,” in Proceedings of the IEEE Vehicular Technology Conference, Tokyo, Japan, May 2000.
9. C. Papadias, “On the Spectral Efficiency of Space-Time Spreading for Multiple Antenna CDMA Systems,” in 33rd Asilomar Conference on Signals and Systems, Monterey, CA, Nov. 1999.
10. C. Papadias, B. Hochwald, T. Marzetta, M. Buehrer, and R. Soni, “Space-Time Spreading for CDMA Systems,” in 6th Workshop on Smart Antennas in Wireless Mobile Communications, Stanford, CA, July 22–23, 1999.
11. I. Ghauri and D. Slock, “Linear Receivers for the DS-CDMA Downlink Exploiting Orthogonality of Spreading Sequences,” in 32nd Asilomar Conference on Signals and Systems, Monterey, CA, Nov. 1998.
12. A. Jalali, R. Padovani, and R. Pankaj, “Data Throughput of CDMA-HDR: A High Efficiency High Data Rate Personal Communication Wireless System,” in Proceedings of the IEEE Vehicular Technology Conference, Tokyo, Japan, May 2000.
13. F. Kelly, “Charging and Rate Control for Elastic Traffic,” European Transactions on Telecommunications, vol. 8, 1997, pp. 33–37.
14. G.J. Foschini and M.J. Gans, “On Limits of Wireless Communication in a Fading Environment when Using Multiple Antennas,” Wireless Personal Communications, vol. 6, no. 3, March 1998, pp. 311–335.
15. C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon Limit Error-Correcting Coding and Decoding,” in Proceedings of the International Conference on Communication ’93, May 1993, pp. 1064–1070.
16. G. Karmi, F. Ling, and R. Pankaj, “HDR Air Interface Specification (HAI),” Qualcomm Inc., Jan. 2000.
17. C.-L. I, C.A. Webb, H. Huang, S. ten Brink, S. Nanda, and R.D. Gitlin, “IS-95 Enhancements for Multimedia Services,” Bell Labs Technical Journal, vol. 1, no. 2, Autumn 1996, pp. 60–87.
18. G.J. Foschini, G.D. Golden, R.A. Valenzuela, and P.W. Wolniansky, “Simplified Processing for High Spectral Efficiency Wireless Communication Employing Multi-Element Arrays,” IEEE Journal on Selected Areas in Communications, vol. 17, no. 11, 1999, pp. 1841–1851.
19. P.W. Wolniansky, G.J. Foschini, G.D. Golden, and R.A. Valenzuela, “V-BLAST: An Architecture for Realizing Very High Data Rates Over the Rich-Scattering Wireless Channel,” in Proc. ISSSE, Pisa, Italy, Sept. 1998.
20. G. Strang, Linear Algebra and Its Applications, San Diego: Harcourt Brace Jovanovich, 1988.
21. B. Hassibi, “An Efficient Square-Root Algorithm for BLAST,” in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2000, Istanbul, Turkey, June 2000, pp. 3129–3134.
22. L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal Decoding of Linear Codes for Minimizing Symbol Error Rate,” IEEE Transactions on Information Theory, vol. IT-20, March 1974, pp. 284–287.
23. J. Hagenauer and P. Hoeher, “A Viterbi Algorithm with Soft Decision Outputs and Its Applications,” in Proceedings of GLOBECOM ’89, Nov. 1989, pp. 1680–1686.
24. P. Robertson, E. Villebrun, and P. Hoeher, “A Comparison of Optimal and Sub-Optimal MAP Decoding Algorithms Operating in the Log Domain,” in Proceedings of the International Conference on Communications ’95, 1995, pp. 1009–1013.

Howard Huang received a B.S.E.E. degree from Rice University in 1991 and a Ph.D. in electrical engineering from Princeton University in 1995. Since graduating, he has been a member of technical staff in the Wireless Communications Research Department, Bell Labs, Holmdel, NJ. His interests include multiuser detection, multiple antenna communication systems, and applications of these technologies to third generation mobile communication systems.

Harish Viswanathan was born in Trichy, India, on August 14, 1971. He received the B. Tech. degree from the Department of Electrical Engineering, Indian Institute of Technology, Madras, India in 1992 and the M.S. and Ph.D. degrees from the School of Electrical Engineering, Cornell University, Ithaca, NY in 1995 and 1997, respectively. He is presently with Lucent Technologies Bell Labs, Murray Hill, NJ. His research interests include information theory, communication theory, wireless networks and signal processing. Dr. Viswanathan was awarded the Cornell Sage Fellowship during the academic year 1992–1993.

Andrew Blanksby received the bachelor’s degree and Ph.D. in Electrical and Electronic Engineering from the University of Adelaide in 1993 and 1999 respectively. In July 1998 he joined the DSP & VLSI Systems Research Department, Bell Laboratories, Lucent Technologies, Holmdel, NJ, as a Member of Technical Staff. Since March 2001 Andrew has been with the High Speed Communications VLSI Research Department, Agere Systems, Holmdel NJ. His professional interests include VLSI design, communication system design, and signal processing.

Mohamed A. Haleem has been with the Wireless Communications Research Department, Bell Laboratories, Lucent Technologies, Holmdel, NJ since July 1996. He received a B.Sc. degree from the Department of Electrical & Electronic Engineering, University of Peradeniya, Sri Lanka in 1990 and the M.Phil. degree from the Department of Electrical & Electronic Engineering, Hong Kong University of Science & Technology in 1995. From March 1990 to August 1993 he was with the academic staff of the Department of Electrical & Electronic Engineering, University of Peradeniya, Sri Lanka. His professional interests include dynamic resource assignment to wireless communication systems, high speed wireless systems, and communication systems simulation.
