How do domestic computing power chips empower AI large models?
In November 2022, the advent of ChatGPT set off a wave of AI technology innovation and commercial application. Riding this wave, AIGC has become one of the most eye-catching tracks in the industry: technology giants have successively joined the AI race, and AI large models have sprung up like bamboo shoots after a spring rain. After the large models represented by OpenAI's GPT and DALL-E took the limelight, the domestic market also entered the era of the "war of a hundred models," and competition for the AIGC high ground is becoming increasingly fierce.

With the rapid rise and widespread application of large models, market demand for computing power has been ignited, and the "arms race" in AI is shifting from competition over algorithms and data to competition over underlying computing power. Nvidia has become the biggest beneficiary of this wave of AI dividends. As Nvidia CEO Huang Renxun (Jensen Huang) put it, "We are at the iPhone moment of AI," and the boom in large model applications has pushed Nvidia directly into the trillion-dollar market capitalization club. The explosion of generative AI applications and the rush to release large models may also open a new blue-ocean market for computing chip manufacturers. Under this trend, countless veterans and newcomers are flocking into the competition for computing power chips, and domestic computing power chip manufacturers are likewise facing new development opportunities.

Recently, at the 2023 World Artificial Intelligence Conference (WAIC), themed "Intelligently Connecting the World, Generating the Future," Shanghai Tian Shuzi Xin Semiconductor Co., Ltd. (hereinafter "Tian Shuzi Xin") appeared with a number of highlight products and technical solutions. It demonstrated significant progress in large model training and inference, along with application cases such as image recognition, 3D modeling, smart retail, intelligent computing centers, and target detection, fully presenting the "core" strength of domestic general-purpose GPUs.

How does Tian Shuzi Xin grasp the new opportunities of AI large models?

It is reported that over the past few years Tian Shuzi Xin has released the general-purpose GPU training product "Tian Gao 100" and the inference product "Zhi Kai 100," and has achieved significant results in real-world deployment after adaptation and verification with multiple partners. In the currently hot field of large models in particular, Tian Shuzi Xin built a 40-PFLOPS (40P) computing power cluster of Tian Gao 100 accelerator cards in the first half of this year and completed the training of a 7-billion-parameter large model for the Zhiyuan Research Institute; it is currently the only domestic general-purpose GPU product that can support complete large model training. Tian Shuzi Xin Chairman and CEO Gai Lu Jiang said in an interview with Semiconductor Industry Observation and other media: "At present, Tian Gao 100 has successfully run the ChatGLM large model from Tsinghua's Zhipu AI and the LLaMA model developed by Meta. In addition, Tian Shuzi Xin is building an independent 200-PFLOPS (200P) computing cluster to support the training of a large model with 65 billion parameters, which is expected to be completed in October."
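To put those cluster figures in rough perspective, the sketch below is a back-of-envelope estimate of what training a 7-billion-parameter model on such a cluster might involve, taking "40P" to mean 40 PFLOPS of peak compute and using the common "6 x parameters x tokens" approximation for training FLOPs. The token count and utilization are illustrative assumptions, not figures from the article.

```python
# Rough, illustrative estimate only -- token count and utilization are assumed,
# not reported in the article.
params = 7e9            # 7B parameters (from the article)
tokens = 1.0e12         # assumed number of training tokens (illustrative)
cluster_flops = 40e15   # 40 PFLOPS peak, reading "40P" as petaFLOPS
utilization = 0.4       # assumed sustained fraction of peak

total_flops = 6 * params * tokens                  # ~4.2e22 FLOPs of training compute
seconds = total_flops / (cluster_flops * utilization)
print(f"~{seconds / 86400:.0f} days of training")  # on the order of a month
```

Under these assumptions the training run lands on the order of a month of wall-clock time, which is why cluster scale, utilization, and stability matter so much in the discussion that follows.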
Beyond training, in inference applications in vertical domains, the actual performance of Tian Shuzi Xin's "Zhi Kai 100" is also strong and comparable to that of international mainstream products. Gai Lu Jiang said that subsequent products are continuously evolving, and some large model algorithms are being optimized at the hardware level to greatly improve the generality and performance of the computing power, meeting the needs of domestic large model development. "If there are customers who urgently need computing power and want to migrate back from foreign cloud platforms, we can also work with partners to build a computing power platform to support their development."

It is clear that, in the commercialization of GPU products, Tian Shuzi Xin is at the forefront of domestic manufacturers. Gai Lu Jiang attributes this rapid commercial progress to the hardware capabilities of the product on the one hand and to the software ecosystem on the other. Tian Gao 100 did not follow the dedicated-GPU development path commonly chosen by other domestic GPU manufacturers, but was instead designed as a general-purpose GPU. A general-purpose architecture chip is one that provides generic computing power and can support a wide range of users across industries. "If you don't use a general architecture, the threshold for customers to switch platforms is relatively high," Gai Lu Jiang said, explaining that Tian Shuzi Xin took a route compatible with the international mainstream ecosystem in its first stage: because it uses a general-purpose architecture, customers can stay compatible with the international mainstream ecosystem at the API level, which reduces migration costs.

"In our R&D team of more than 500 people, the software team is twice the size of the hardware team," Gai Lu Jiang said, adding that only with thorough software optimization can hardware performance be improved severalfold. In the era of large models, beyond requiring general-purpose GPUs to offer high computing power, support for multiple data precisions, and high-bandwidth interconnects, the software ecosystem is also crucial. Judging from current progress, although domestic computing power chip manufacturers have collectively picked up the pace, there is still a gap between domestic high-computing-power chip products and the international state of the art, especially in the software ecosystem. Nvidia's core advantage, for example, lies not only in superior hardware performance but also in the complete CUDA software ecosystem, which remains the weak point of domestic computing power manufacturers. On the software side, Gai Lu Jiang emphasized that Tian Shuzi Xin will keep an open attitude, maintain close cooperation with partners and customers in the large model industry ecosystem, form a positive "application-optimization-feedback-iteration" cycle, and accelerate the improvement of its software ecosystem. To that end, and to help users better evaluate and use general-purpose computing power so that more end industries can benefit from it, Tian Shuzi Xin also released the DeepSpark open platform of 100 major applications last year.
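As a generic illustration of why API-level compatibility lowers migration costs, the sketch below writes model code against a device-agnostic framework API (PyTorch here), so that only the device selection changes when the backend changes. This is an example under stated assumptions; the article does not describe Tian Shuzi Xin's actual software stack or how its backend integrates with frameworks.

```python
# Minimal sketch: device-agnostic model code. Only pick_device() would differ
# between backends; the model code stays the same, which is what keeps migration
# costs low. How a vendor backend registers with the framework is not shown and
# is a hypothetical detail.
import torch
import torch.nn as nn

def pick_device() -> torch.device:
    # Fall back from CUDA to CPU so this sketch runs anywhere; a vendor backend
    # would be selected in this same single place.
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")

device = pick_device()
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
x = torch.randn(4, 128, device=device)   # toy input batch
print(model(x).shape)                     # torch.Size([4, 10])
```

The same idea applies further down the stack: when kernels and communication libraries expose familiar interfaces, existing framework integrations and user code carry over with little change.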
The platform draws on Tian Shuzi Xin's extensive deployment experience and builds a systematic evaluation framework across six major dimensions: speed, power consumption, accuracy, linearity, memory usage, and stability. It helps users quickly and efficiently identify the computing power that is most effective for their own business, improves algorithm development efficiency, and shortens the deployment cycle. On the road to commercialization, Tian Shuzi Xin has clearly taken an early lead. Looking ahead, Gai Lu Jiang said the company will, on the one hand, continue to strengthen its own capabilities, iterate on the next generation of products, and raise chip performance and computing power; on the other hand, it will actively build its ecosystem and keep improving the software stack, aiming to cut costs, increase efficiency, and improve cost-effectiveness for customers.

In this process, as the parameter scale of large models keeps growing, demand for computing power is growing by orders of magnitude and computing clusters are becoming ever larger. Gai Lu Jiang further pointed out that when the training computing power of a single card is insufficient to support a larger parameter scale, scaling out is one approach: stacking computing power into a cluster and then scheduling it in a unified way through software. It should be noted that a training cluster may need tens of thousands of cards running at the same time, and they must work continuously without failure throughout training, which places very strict requirements on product stability and reliability. Elastic scaling is also needed so that computing power can expand on demand, along with solid mechanisms to quickly locate faults and recover when failures do occur. To this end, Tian Shuzi Xin has independently developed the IXCCL distributed communication technology, which significantly improves high-speed interconnect performance across multiple machines and multiple cards. It has built a computing cluster solution based on its own general-purpose GPUs and continuously optimizes parallel acceleration strategies such as automatic mixed precision training, pipeline parallelism, tensor parallelism, data parallelism, and model parallelism, making large model training and inference more efficient.

Overall, Tian Gao 100 has taken the lead in completing the full training of a large model with billions of parameters, an important step in applying independent general-purpose GPUs to large models. This achievement demonstrates that Tian Gao products can support large model training, breaking through a key chokepoint in the innovation and development of domestic large models, which is of great significance for building an independent domestic large model ecosystem and securing the industrial chain. Together with "Zhi Kai 100," it outlines the initial layout of Tian Shuzi Xin's "training + inference" product line.
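As a concrete illustration of one of the acceleration strategies named above, the sketch below shows a minimal automatic mixed precision (AMP) training loop in PyTorch. It is a generic example, not Tian Shuzi Xin's IXCCL stack; the model, data, and hyperparameters are toy placeholders, and in a real multi-machine, multi-card cluster this loop would additionally be wrapped in a distributed data-parallel setup backed by a collective communication library.

```python
# Minimal sketch of automatic mixed precision (AMP) training with loss scaling.
# Generic PyTorch example under stated assumptions -- not the vendor's actual stack.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512)).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
# Loss scaling guards against fp16 gradient underflow; it is disabled (a no-op) on CPU.
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

for step in range(10):
    x = torch.randn(32, 512, device=device)       # toy input batch
    target = torch.randn(32, 512, device=device)  # toy regression target

    optimizer.zero_grad(set_to_none=True)
    # Run the forward pass in reduced precision where it is numerically safe.
    with torch.autocast(device_type=device,
                        dtype=torch.float16 if device == "cuda" else torch.bfloat16):
        loss = nn.functional.mse_loss(model(x), target)

    scaler.scale(loss).backward()  # scale the loss, then backpropagate
    scaler.step(optimizer)         # unscale gradients and take the optimizer step
    scaler.update()                # adapt the loss scale for the next iteration
```

Pipeline, tensor, and data parallelism would then partition this same workload across cards and machines, which is where a collective communication library such as IXCCL comes into play.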
From the conversation with Gai Lu Jiang, it is clear that, as a general-purpose GPU manufacturer riding the development trend of large models, Tian Shuzi Xin relies on its general-purpose GPU architecture to support customers in both training and inference, and is committed to building a cost-effective, general-purpose, full-stack cluster solution that provides a strong computing power base for the era of large models. Next, Tian Shuzi Xin will continue to work closely with partners to build larger Tian Gao 100 computing clusters, complete the training of large models with larger parameter scales, better support domestic large model innovation with independent general-purpose GPU products, further consolidate China's computing power foundation, and help build an independent ecosystem for the artificial intelligence industry.

In the new era of AI, the torrent of massive data and the explosion of large model applications will keep driving computing power demand to grow exponentially. Although Nvidia still leads mainstream computing power solutions, in the long run China's general-purpose GPU companies have plenty of room to act. The surge in demand for computing power offers a huge market, and the current tight supply of foreign computing chips, together with export restrictions, will create more opportunities for domestic chip companies. This is undoubtedly an excellent window for domestic chip makers to build their own independent, innovative architectures and meet the market's diverse demands for cost-effectiveness and energy efficiency. Whoever takes the lead in providing a complete domestic alternative will be able to claim a share of the enormous AI computing power market.

Against this backdrop, as a representative domestic general-purpose GPU company, Tian Shuzi Xin is committed to developing independent, controllable, and internationally leading high-performance general-purpose GPU products, continuously upgrading its computing power solutions, and adapting to large models with more parameters, larger data sets, and more complex algorithms, so as to provide more solid computing power support for China's large model innovation and deployment. At a moment when large models have triggered a new round of global AI innovation, having made the significant breakthrough from 0 to 1 in China's general-purpose GPUs, Tian Shuzi Xin will further accelerate the breakthrough from 1 to 100 in the domestic GPU industry ecosystem.