Cloud AI made NVIDIA a great success. If Edge AI is a brand-new opportunity, which companies have a chance to be winners?
The debate between general-purpose and special-purpose computing has run in the industry for a long time, and it is inseparable from the diverse, fragmented application scenarios at the edge. If scenarios are fragmented and products are fragmented, where does that leave the chip? Can chips only be fragmented and specialized, never universal? Will the CPU and GPU, the stalwarts of general-purpose computing, be marginalized? How will the relationship between general-purpose and special-purpose computing evolve?
Yu Xin, president of Shiqing Technology, told us that designing architectures for specific applications is essential: domain-specific architecture (DSA) processors and chips exist precisely to strike the balance between generality and specialization.
"There are two main premises. One is fragmentation of end-side applications, and the other is that it often requires high power consumption and cost. Under these two premises, how to ensure sufficient competitiveness relative to a certain scenario, meet the requirements of cost and power consumption, and at the same time take into account enough market space-this is a challenge that every company has to face, and it is also a test of product definition ability, "he stressed.
Although a general-purpose computing chip can cover all the operations an edge computing workload requires, its architecture cannot scale, in either extensibility or performance, to keep pace with the rapid growth of edge demand. General-purpose and special-purpose computing chips are therefore converging. Moreover, the differing computing characteristics of edge and cloud workloads mean that edge chips need architectures optimized and customized for them.
Hua, deputy general manager of Lingxi Technology, said the two should complement each other and are best integrated. Special-purpose computing chips will include general-purpose cores, such as Arm or RISC-V IP cores; chips built on new computing architectures, such as brain-inspired computing chips, will likewise pair their neuromorphic and neural-network cores with general-purpose Arm cores. At the same time, general-purpose chips, including the latest Arm designs, are building in IP cores for conventional neural-network acceleration. A heterogeneous, fused chip architecture is the inevitable direction of development.
Following this trend, the special-purpose units responsible for acceleration need to be folded into the general programming model, and the pressure to build general-purpose processors never goes away. Rob Fisher, director of business product management at Imagination, said this is driven mainly by the ease of programming general-purpose processors; the approach hits its limits when the scale of the task or the required performance goes far beyond what a general-purpose solution can offer.
He pointed to the GPU as a good example: in practice, the benefits of offloading graphics workloads to a GPU are obvious, which is what drove efficient graphics processors to develop as independent devices. Vector processors, by contrast, have become ever more tightly integrated with the CPU architecture, enabling instruction-level acceleration of compute tasks.
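To make the contrast concrete, here is a minimal C sketch of the "integrated into the CPU" end of the spectrum: the same multiply-accumulate loop written once as plain scalar code and once with Arm NEON intrinsics, so the acceleration happens at the instruction level inside the core rather than by offloading to a separate device. The function names and the 4-wide vector width are illustrative assumptions, not details from the article.

```c
#include <arm_neon.h>   /* Arm's SIMD intrinsics; build for an AArch64 target */
#include <stddef.h>

/* Scalar baseline: y[i] = a * x[i] + y[i] */
static void axpy_scalar(float a, const float *x, float *y, size_t n) {
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Instruction-level acceleration: the vector unit sits inside the CPU core,
 * so the loop simply issues 4-wide SIMD instructions instead of offloading
 * the work to a separate accelerator. */
static void axpy_neon(float a, const float *x, float *y, size_t n) {
    float32x4_t va = vdupq_n_f32(a);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        float32x4_t vx = vld1q_f32(x + i);
        float32x4_t vy = vld1q_f32(y + i);
        vst1q_f32(y + i, vmlaq_f32(vy, va, vx));  /* vy + va * vx */
    }
    for (; i < n; ++i)            /* scalar tail for leftover elements */
        y[i] = a * x[i] + y[i];
}
```

The design point is exactly Fisher's: for work of this scale, keeping the accelerator inside the general programming model is easy to use, whereas a GPU-class offload only pays off when the task grows far beyond what a few vector instructions can handle.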
Zhao Xiaowu, vice president of Xuehu Technology, said that because different scenarios make different demands on function and performance, and requirements at the edge are more complicated, a single general architecture or platform can hardly cover most of them, so dedicated architectures will be designed for specific application scenarios. General-purpose chips such as CPUs suit the parts with low performance requirements and fast-changing algorithms; special-purpose chips such as ASICs suit the parts with high performance requirements and relatively fixed algorithms; programmable FPGAs suit the parts that need both a degree of performance and algorithmic flexibility.
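Zhao's rule of thumb can be captured in a small decision sketch. The C snippet below is purely illustrative — the stage descriptions, field names and the GOPS threshold are assumptions of this sketch, not anything Xuehu Technology has published — but it encodes the mapping he describes: low-performance, fast-changing work stays on the CPU, high-performance stages with frozen algorithms go to an ASIC, and high-performance stages whose algorithms are still evolving go to an FPGA.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Where a pipeline stage ends up under the rule of thumb described above. */
typedef enum { TARGET_CPU, TARGET_FPGA, TARGET_ASIC } target_t;

/* Hypothetical description of one stage of an edge pipeline. */
typedef struct {
    const char *name;
    double gops_required;     /* rough compute demand */
    bool   algorithm_stable;  /* does the algorithm still change frequently? */
} stage_t;

static target_t place_stage(const stage_t *s) {
    const double HIGH_PERF = 1.0;                /* assumed cutoff, in GOPS */
    if (s->gops_required < HIGH_PERF)
        return TARGET_CPU;                       /* low perf, fast-changing: CPU */
    return s->algorithm_stable ? TARGET_ASIC     /* high perf, frozen algorithm */
                               : TARGET_FPGA;    /* high perf, still evolving */
}

int main(void) {
    stage_t stages[] = {
        { "protocol parsing",       0.2, false },
        { "CNN inference",         40.0, true  },
        { "custom pre-processing",  5.0, false },
    };
    const char *names[] = { "CPU", "FPGA", "ASIC" };
    for (size_t i = 0; i < sizeof stages / sizeof stages[0]; ++i)
        printf("%-22s -> %s\n", stages[i].name, names[place_stage(&stages[i])]);
    return 0;
}
```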
Taking edge computing in intelligent transportation as an example, he said the deployments are mostly outdoors in complex, harsh environments, so they must deliver large AI computing power with low latency while also running reliably and stably. Most edge computers today cannot meet that bar; only by using chips rated for outdoor base-station conditions and building high-compute-power machines customized for the scenario can these special requirements be met.
With the rise of high-performance computing and machine learning, workloads running on heterogeneous processors have grown dramatically, making open, ecosystem-wide cooperation across the semiconductor industry all the more important.
Not long ago, Intel, AMD, Arm, Qualcomm, TSMC and others jointly founded a chiplet standards alliance and launched UCIe, a universal high-speed interconnect standard for chiplets. Under the UCIe framework, the interconnect interface is unified, so chiplets built on different processes and with different functions can be integrated through 2D, 2.5D and 3D packaging, and diverse processing engines can be combined into very large, complex chip systems.
At GTC22 last month, NVIDIA announced support for the UCIe specification and, at the same time, opened its NVLink-C2C interconnect for semi-custom chips: a die-to-die and chip-to-chip interconnect technology that supports memory coherence. The move makes NVIDIA's commitment to heterogeneous computing clear. Under this plan, it is even theoretically possible to put NVIDIA's chips and a competitor's chips in the same package.
Jensen Huang said he favors PCIe first and UCIe second, and predicted that UCIe's benefits would only gradually emerge over the next five years. As for NVIDIA's own NVLink interconnect, he stressed that its advantage is direct connection: UCIe cannot reach directly into the chip and remains more of a peripheral interface, whereas NVLink connects directly, almost like being wired straight into the brain. That makes integration somewhat more complicated, and partners and customers have to know NVLink very well; but once they do, they can exploit all the resources inside the chip as if everything sat on a single die.
The answer shows that NVIDIA has no intention of excluding itself from the UCIe alliance, while also signaling absolute confidence in its own NVLink interconnect; it is reasonable to expect NVLink to become the key to NVIDIA's heterogeneous ecosystem.
The huge potential of the edge computing market naturally draws the cloud chip giants into the contest. They are laying out comprehensively through heterogeneous computing, advanced process technology and advanced packaging. Coupled with their high ecosystem barriers, do domestic AI chip makers still have a chance to compete?
"Those who can build skyscrapers are not necessarily good at carving beams and painting columns. Of course, compared with the current highly monopolized and centralized pattern of the cloud, there is no definite pattern at the edge, everyone has opportunities, and manufacturers with stronger technical capabilities and landing capabilities will have greater opportunities to stand out from the competition. " Yu Xin of Shiqing Technology said, "It is meaningful for the cloud edge to integrate and cooperate in some scenarios, but from the perspective of chip design, it will still be very different."
Hua of Lingxi Technology believes heterogeneous computing, advanced process technology and advanced packaging are all just means; they cannot fundamentally solve the problems of high energy efficiency, few-shot learning and online learning. Driven by industry needs and market demand, the heterogeneous integration of von Neumann and non-von Neumann architectures will become the core engine driving technological innovation in edge computing and the high-quality development of future industries.
On the one hand, chips based on the von Neumann architecture still pursue brute-force computing, relying on the most advanced processes and packaging to raise compute power. On the other hand, non-von Neumann architectures rely on architectural innovation, prioritizing large-scale use of biological neural networks, brain-inspired approaches and new hybrid neural networks. New computing architectures represented by brain-inspired computing will fuse deeply with traditional architectures and lead a new round of technological change.
Zhao Xiaowu of Xuehu Technology said that leading vendors have begun assembling large chips out of chiplets to round out their product lines and meet the computing needs of different scenarios. Apple and NVIDIA, for example, have adopted this building-block approach, and the trend is unmistakable.
In the past two years the domestic market has been hot and has grown fast, but few vendors have real scale and competitiveness. "Chips remain an industry that demands accumulation, and the supply chain is long," said Zhao Xiaowu. "The domestic landscape today is many small players, which is not conducive to winning bargaining power with upstream and downstream partners. A wave of AI chip vendors will likely be weeded out in the next one to two years."
Qiu, founder, chairman and CEO of Ai Xin Yuan Zhi, expressed a similar view. This is a golden window for the domestic chip industry, she said: market demand and state support have given rise to many startups. Seen against the macro environment, China's chip industry is still in its infancy, with a hundred flowers blooming, but as the industry continues to develop and mature, a subsequent round of consolidation will be inevitable.
She stressed that this matches how the chip industry has developed over the past few decades. After such consolidation, leading companies will certainly emerge, which matters greatly for the country's industrial development as a whole; only then will Chinese companies have the chance to compete with international vendors on the same stage.
Hua of Lingxi Technology said that the edge AI chip market is still open, with no absolute dominant player. Emerging and diverse application scenarios have created huge market opportunities for domestic AI chips, especially in increasingly fragmented markets such as autonomous driving, intelligent security, the intelligent Internet of Things and wearables. Domestic AI chip makers are on the same starting line as the international giants, and in some fields even hold an advantage.
As a famous scientist in the field of computer architecture has said, this is an unprecedented golden age for chip architecture. The CPU and GPU will keep innovating and remain indispensable for certain computing tasks, but the new markets spawned by AI-accelerated computing and the data explosion are bound to be huge and diverse, and that brings new opportunities to AI chip companies.
On the CPU side, x86's dominance of the PC and server is showing signs of loosening, Arm is gradually moving from phones and IoT into the PC and server, and RISC-V is starting from IoT devices and spreading to more. The heterogeneous integration of von Neumann and non-von Neumann architectures is moving toward scale. ...
Every wave of technology produces new leading companies. Will edge AI be the next?