Multi-Processor System-on-Chip 2. Liliana Andrade

processing part of wireless communication. The main tasks of baseband processing are signal detection, parameter estimation, demodulation and channel decoding. Figure 2.1 shows a state-of-the-art commercial baseband System-on-Chip (SoC) from the company Octasic (http://www.octasic.com). This SoC supports a wide range of communication standards, from 2G to 5G. Supporting so many standards demands flexibility; power and energy efficiency, on the other hand, require dedicated, optimized accelerators. The design therefore calls for a careful trade-off between flexibility and implementation efficiency. Specialized hardware is 2 to 3 orders of magnitude more efficient than processor-based solutions, which offer the greatest flexibility (Horowitz 2014). Hence, the SoC features a heterogeneous architecture composed of standard CPUs such as ARM cores, highly optimized low-power Digital Signal Processor (DSP) cores and dedicated accelerators.

      Figure 2.1. State-of-the-art commercial system-on-chip baseband architecture

      The question is what contribution microelectronics has made in the past to improving throughput and implementation efficiency in channel decoding. As a case study, we consider two Turbo code decoders. Both decoders were designed with the same design methodology and have a very similar state-of-the-art architecture that exploits spatial parallelism by processing several sub-blocks in parallel on corresponding Maximum a Posteriori (MAP) decoders:

       – the first decoder is a fully UMTS-compliant Turbo decoder implemented in a 180 nm technology. Under worst-case Process, Voltage and Temperature (PVT) conditions, it achieves a maximum frequency of 166 MHz, which results in a throughput of 71 Mbit/s at 6 decoding iterations. The total area is 30 mm² (Thul et al. 2005);

       – the second decoder is a fully LTE-compliant Turbo decoder implemented in a 65 nm technology, achieving a maximum frequency of 450 MHz under worst-case PVT conditions. It yields a throughput of 2.15 Gbit/s at 6 decoding iterations and occupies an area of 7.7 mm² (Ilnseher et al. 2012).

      Three semiconductor technology nodes separate the 180 nm and 65 nm technologies. We observe a throughput increase of 30×, although the frequency, which is limited by the critical path inside the MAP decoder, improves by only about 3× (from 166 MHz to 450 MHz). The improvement in area efficiency (throughput/area) is 118×. Hence, progress in microelectronics contributed to a huge improvement in area efficiency but much less to the frequency increase and, thus, to the throughput increase. The throughput increase originates mainly from code design, i.e. conflict-free Turbo code interleavers that enable efficient implementation with a high degree of parallelism, and from advanced algorithmic and architectural features such as next-iteration initialization, an optimized radix-4 kernel, re-computation and advanced normalization to reduce internal bit widths. We see that microelectronics alone could not keep up with the increased requirements coming from communication systems. Thus, the design of communication systems is no longer just a matter of spectral efficiency or bit error rate/frame error rate (BER/FER). When it comes to implementation, channel coding requires a cross-layer approach covering information theory, algorithms, parallel hardware architectures and semiconductor technology to achieve excellent communications performance, high throughput, low latency, low power consumption and high energy and area efficiency (Scholl et al. 2016; Kestel et al. 2018a).
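The improvement factors quoted for the two decoders can be checked with a few lines of arithmetic. The following Python sketch is only an illustration: the `Decoder` class, the `gains` function and the `non_frequency` figure (throughput gain divided by frequency gain) are hypothetical names introduced here and do not come from the cited papers. Only the frequency, throughput and area values are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decoder:
    """Key figures of one Turbo decoder (hypothetical helper type)."""
    name: str
    node_nm: int
    f_max_mhz: float        # worst-case PVT maximum frequency
    throughput_mbps: float  # at 6 decoding iterations
    area_mm2: float

    @property
    def area_efficiency(self) -> float:
        """Throughput per unit area, in Mbit/s per mm^2."""
        return self.throughput_mbps / self.area_mm2


# Values as quoted in the text (Thul et al. 2005; Ilnseher et al. 2012).
umts = Decoder("UMTS Turbo decoder", 180, 166.0, 71.0, 30.0)
lte = Decoder("LTE Turbo decoder", 65, 450.0, 2150.0, 7.7)


def gains(old: Decoder, new: Decoder) -> dict:
    """Improvement factors of `new` relative to `old`."""
    result = {
        "throughput": new.throughput_mbps / old.throughput_mbps,
        "frequency": new.f_max_mhz / old.f_max_mhz,
        "area_efficiency": new.area_efficiency / old.area_efficiency,
    }
    # Assumption: the share of the throughput gain that is not explained
    # by clock frequency is taken as throughput gain / frequency gain.
    result["non_frequency"] = result["throughput"] / result["frequency"]
    return result


factors = gains(umts, lte)
for key, value in factors.items():
    print(f"{key:>16}: {value:6.1f}x")
```

Under these assumptions, the frequency gain is closer to 2.7× than to 3×, and roughly an order of magnitude of the throughput gain remains unexplained by clock frequency. This is consistent with the text attributing most of the throughput increase to code design and to algorithmic and architectural advances.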
