At ISSCC 2008, held recently in San Francisco, Intel presented 14 technical papers covering processors, wireless communication, memory, terascale computing and other fields. What will these results mean for the development of information technology? Our reporter spoke by telephone with Zhang, an Intel Fellow, to give readers an in-depth interpretation.
Processor: deep evolution
Processors are Intel's core strength. This time the company disclosed technical details of the Silverthorne and Tukwila processors, both of which had already attracted wide attention. The former is a low-power IA processor for mobile internet devices; the latter is the next-generation Itanium processor for the high-end and RISC markets.
According to Zhang, the Silverthorne processor, which Intel released on March 3rd, uses the latest 45nm high-k metal gate manufacturing process, and power consumption across the series is kept below 2.5W. The processor was developed specifically for Intel's first generation of mobile internet devices (MIDs), and it also targets other, similar ultra-portable devices.
Intel has designed a new microarchitecture for it. The architecture is fully compatible with the Core 2 Duo instruction set, is based on two-wide, dual-issue, in-order execution, and has a 16-stage pipeline. The microarchitecture also adopts upgraded power management techniques, such as the deep power-saving C6 state, non-grid clock distribution, a register file optimized for power, clock gating, a CMOS bus mode and split I/O power supplies. Together, these improvements effectively reduce both dynamic and leakage power.
Compared with the ULV single-core processor that Intel introduced in 2006, the TDP of the Silverthorne processor is expected to fall to about 1/10. At the same time, Silverthorne can run at up to 2GHz, enough to deliver a full Internet experience and run mainstream application software, paving the way for the rapid development of mobile internet devices.
Tukwila is a quad-core Itanium processor built on a 65nm manufacturing process and integrating 2 billion transistors. Its first version is expected to launch at the end of this year. Itanium targets mission-critical applications. With its high level of integration, Tukwila delivers twice the performance of the dual-core Itanium 9100 series, along with more advanced RAS (reliability, availability and serviceability) features. Tukwila's total on-chip cache reaches 30MB, 10% more than the current product. The QuickPath interconnect and integrated memory controllers bring 9 times the interconnect bandwidth and 6 times the memory bandwidth, a clear sign of how far the Itanium processor has evolved.
Wireless: integrating and reducing power consumption
We also learned about Intel's latest results in low-cost digital multi-radio wireless access. At present, wireless access is still at the discrete stage: WLAN and WWAN are designed separately, which is both expensive and bulky. Zhang said that several components presented by Intel achieve higher integration on wireless chips, moving wireless access from the discrete stage to the integrated stage. In other words, integrating both the WLAN and WWAN standards on a single chip improves performance and reduces power consumption on small devices of all kinds.
Among the components on display, one is a MIMO multi-band transceiver for 802.11a/g/n applications, built in a 90nm CMOS process to achieve low power consumption, compact size and low cost. Another is a class-E CMOS power amplifier, built in a 65nm CMOS process, that supports multi-radio access and delivers 28.6dBm of output power. Its significance is that long-distance communication (such as WiMAX) needs a high-power amplifier of about 1W. This device can give WWAN an RF output of nearly 1W for wide coverage, and it uses new techniques to achieve the precise modulation needed for high data rates.
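To relate the 28.6dBm figure to the "about 1W" requirement, the standard dBm-to-watt conversion can be worked through in a few lines. This is only an illustrative sketch: the function name `dbm_to_watts` is ours, not Intel's, and the only input taken from the article is the 28.6dBm output level.

```python
def dbm_to_watts(dbm: float) -> float:
    """Convert a power level in dBm to watts: P[W] = 10**(dBm / 10) / 1000."""
    return 10 ** (dbm / 10) / 1000


if __name__ == "__main__":
    # 28.6 dBm is the output power quoted in the article.
    print(f"{dbm_to_watts(28.6):.3f} W")  # prints "0.724 W"
    # 30 dBm corresponds to exactly 1 W, the target mentioned for WiMAX.
    print(f"{dbm_to_watts(30.0):.3f} W")  # prints "1.000 W"
```

By this conversion, 28.6dBm is roughly 0.72W, about 1.4dB below a full 1W (30dBm).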
In addition, Intel demonstrated a high-sampling-rate analog-to-digital converter that can measure every channel across the entire Wi-Fi band, sense interference from other wireless signals in the same band, adjust itself for the best power-to-performance ratio, and provide optimized channel selection. When the signal is strong, it can reduce power consumption and support Wi-Fi/WiMAX bandwidths in an energy-efficient way. These results are aimed at the vision of a single chip that handles multiple wireless standards. By then, performance will improve significantly, and smaller chips will help make portable devices more compact.
Storage: pushing density higher
Phase change memory (PCM) is a new storage technology with great potential, and Intel has invested heavily in it. PCM is one of the technology directions of Numonyx, a joint venture that is about to be established. Through joint development, Intel and STMicroelectronics demonstrated a major breakthrough in PCM: the first demonstrable multi-level cell (MLC) device built with PCM technology.
In simple terms, PCM stores data by changing the state of a chalcogenide material, which allows fast reads and writes, lower power consumption than traditional flash memory and more stable data retention. Previously, a single-level PCM cell had only two states for recording data. This time, using a unique algorithm, the researchers created two additional states between the amorphous and crystalline states of the chalcogenide, so that each cell has four states for recording data. Moving from 1 bit per cell to MLC raises storage density and lowers the cost per byte.
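The density gain follows directly from the number of distinguishable states: a cell with N states holds log2(N) bits. The sketch below is illustrative only; `bits_per_cell` and `cells_needed` are hypothetical helper names, and nothing here models how the actual device programs or senses its states.

```python
import math


def bits_per_cell(states: int) -> int:
    """Return the bits stored by a cell with `states` distinguishable states."""
    # Only powers of two map cleanly onto a whole number of bits.
    if states < 2 or states & (states - 1):
        raise ValueError("states must be a power of two and at least 2")
    return states.bit_length() - 1


def cells_needed(num_bits: int, states: int) -> int:
    """Return how many cells are needed to store `num_bits` bits."""
    return math.ceil(num_bits / bits_per_cell(states))


if __name__ == "__main__":
    print(bits_per_cell(2))    # single-level cell: prints 1
    print(bits_per_cell(4))    # four-state MLC: prints 2
    print(cells_needed(8, 2))  # one byte in single-level cells: prints 8
    print(cells_needed(8, 4))  # one byte in four-state cells: prints 4
```

Going from two to four states therefore halves the number of cells needed for the same data.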
Based on the 45nm high-k metal gate manufacturing process, Intel also developed a high-performance, low-power SRAM. A small SRAM cell makes it easier to integrate a larger cache in the processor. The SRAM supports a 50% larger on-chip L2 cache (6MB), used for the rapid volume ramp of Intel's second-generation dual-core and quad-core processors. The SRAM design, combined with efficient power management circuitry, makes the circuit more tolerant of process variation and helps improve production yield.
Terascale: three fronts advance together
Multi-core terascale computing (trillions of floating-point operations per second) spans computation, memory and communication. From a technical point of view, to support emerging data-intensive applications, the I/O bandwidth of a terascale system should scale beyond 100Gbps, which means each channel should exceed 10Gbps. Raising the speed of an I/O channel requires a precise clock to time data transmission and reception, which not only consumes a lot of energy but also takes up area for filter components and complex noise-reduction circuits. An experimental chip shown by Intel this time achieves data rates of up to 27Gbps per link. Its simplified circuit omits some filter components yet can still filter timing noise. According to the calculations, the chip achieves a high energy efficiency of 1.6mW/Gbps at 20Gbps.
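The efficiency figure can be turned into per-link power with simple arithmetic, as sketched below. The helper names are hypothetical, and the aggregate estimate assumes, purely for illustration, that the 1.6mW/Gbps efficiency measured at 20Gbps holds on every link; the article does not state this. Note that 1mW/Gbps is the same as 1pJ per bit.

```python
import math


def link_power_mw(efficiency_mw_per_gbps: float, rate_gbps: float) -> float:
    """Power drawn by one link at a given data rate and energy efficiency."""
    return efficiency_mw_per_gbps * rate_gbps


def links_needed(total_gbps: float, per_link_gbps: float) -> int:
    """Smallest number of links whose combined rate reaches the target."""
    return math.ceil(total_gbps / per_link_gbps)


if __name__ == "__main__":
    # Figures from the article: 1.6 mW/Gbps at 20 Gbps, 100 Gbps aggregate target.
    per_link = link_power_mw(1.6, 20)
    links = links_needed(100, 20)
    print(f"{per_link:.1f} mW per link")                  # prints "32.0 mW per link"
    print(f"{links} links for 100 Gbps")                  # prints "5 links for 100 Gbps"
    # Assumption: every link runs at the same efficiency.
    print(f"{links * per_link:.1f} mW for the whole I/O")  # prints "160.0 mW for the whole I/O"
```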
Breaking the memory bandwidth limit of terascale computing also deserves attention. Application analysis shows that future terascale workloads will run many threads on many cores, placing extremely high demands on memory bandwidth. Today, on-chip SRAM is fast but too expensive; DRAM is dense but slow, and manufacturing process constraints prevent it from being integrated on the chip. Although DRAM can be tightly coupled to the processor through 3D stacking, it still lags behind on-chip memory in speed. Intel has therefore designed a new integrated DRAM, which offers a new option for faster on-chip memory and better application performance. Like other dynamic memories, it needs to be refreshed periodically; it offers twice the storage density of on-chip SRAM, is much faster than DRAM, and reaches a bandwidth of 128GB/s at 2GHz.
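As a rough sanity check on the last figure, bandwidth divided by clock frequency gives the amount of data moved per cycle. The sketch below assumes 1GB = 10^9 bytes and one transfer per clock cycle; neither assumption is stated in the article, and `bytes_per_cycle` is a hypothetical helper name.

```python
def bytes_per_cycle(bandwidth_gb_per_s: float, clock_ghz: float) -> float:
    """Bytes transferred per clock cycle, assuming one transfer per cycle."""
    # GB/s divided by GHz: the factors of 10**9 cancel, leaving bytes per cycle.
    return bandwidth_gb_per_s / clock_ghz


if __name__ == "__main__":
    width = bytes_per_cycle(128, 2)  # 128 GB/s at 2 GHz, as quoted in the article
    print(f"{width:.0f} bytes per cycle")     # prints "64 bytes per cycle"
    print(f"{width * 8:.0f} bits per cycle")  # prints "512 bits per cycle"
```

Under these assumptions, the quoted bandwidth corresponds to a 64-byte (512-bit) transfer each cycle.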