As artificial intelligence spreads rapidly across all industries, the structure of computing power is shifting profoundly from training-oriented to inference-oriented. As the core support for deploying AI in practice, inference computing power is becoming a key force driving the commercialization of large models and the expansion of intelligent ecosystems.
PART.01
Definition and service coverage of inference computing power in China
Inference computing power handles the inference tasks of AI models: it executes trained models, processes real-time data, and returns prediction results. Inference requires computing resources that respond quickly, so real-time performance is critical. As the underlying hardware for these tasks, inference chips prioritize low latency and low power consumption to ensure efficient response. Inference-oriented AI computing centers can be equipped with optimized inference hardware, high-performance servers, and network devices to deliver fast response times and stable service, with greater emphasis on processing speed and reliability.

Source: Analysis by Frost & Sullivan
PART.02
Market size and share of inference computing power in China
In AI infrastructure, computing power is the core driving force behind innovation. As of 2023, general computing power and intelligent computing power stood at 171 and 59 EFLOPS respectively, and they are expected to reach 330 and 240 EFLOPS by 2027, with an overall growth rate of 39%. China's average daily token consumption rose from 100 billion at the beginning of 2024 to more than 30 trillion by the end of June 2025, an increase of more than 300-fold in one and a half years, reflecting the rapid growth in the scale of AI applications in China. The market size of China's inference computing power is expected to reach 43.83 billion yuan in 2025.
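The capacity and token figures quoted above can be turned into two derived quantities: the share of intelligent computing power in total capacity, and the average monthly growth implied by a 300-fold increase over 18 months. The Python sketch below is only a rough illustration. The function names are hypothetical, and the monthly rate assumes smooth compounding, which the source does not claim.

```python
def intelligent_share(general_eflops: float, intelligent_eflops: float) -> float:
    """Fraction of total capacity that is intelligent computing power."""
    return intelligent_eflops / (general_eflops + intelligent_eflops)


def implied_monthly_growth(multiple: float, months: int) -> float:
    """Constant monthly growth rate that compounds to `multiple` over `months`.

    Assumes smooth compounding; real growth is unlikely to be this regular.
    """
    return multiple ** (1.0 / months) - 1.0


# Figures quoted in the text (EFLOPS).
share_2023 = intelligent_share(171, 59)
share_2027 = intelligent_share(330, 240)

# Daily token consumption: 100 billion -> 30 trillion.
token_multiple = 30e12 / 100e9
monthly_rate = implied_monthly_growth(token_multiple, 18)

print(f"Intelligent share 2023: {share_2023:.1%}")
print(f"Intelligent share 2027: {share_2027:.1%}")
print(f"Token growth multiple: {token_multiple:.0f}x")
print(f"Implied monthly growth: {monthly_rate:.1%}")
```

On these inputs, intelligent computing power grows from roughly a quarter of total capacity in 2023 to over two fifths in 2027.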

Source: Analysis by Frost & Sullivan
PART.03
Core technologies of inference computing power
The development of inference computing power focuses on meeting requirements for real-time response, low latency, and high concurrency. The key technological breakthrough is the P/D (prefill/decode) separation architecture, which divides the work between prefill instances and decode instances and synchronizes the KV cache between them over high-performance RoCE networks. This architecture balances low first-token latency against the efficiency of subsequent token generation. It effectively supports application scenarios that demand real-time performance and low latency, such as intelligent customer service, real-time financial analysis, intelligent driving, and smart healthcare. First-token latency can be kept within 1 second, and the latency of each subsequent token below 50 milliseconds.
However, inference at massive user scale still faces core challenges: how to sustain user experience and high-concurrency access at low cost, how to keep both first-token latency and subsequent-token latency consistently low, and how to cope with the added latency caused by the distance between computing centers and end devices.
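The division of labor described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not any vendor's implementation. `toy_next_token` is a hypothetical stand-in for a model forward pass, `KVCache` holds plain token ids instead of real key/value tensors, and `transfer` is a placeholder for KV cache synchronization over the network.

```python
from dataclasses import dataclass
from typing import List, Tuple


def toy_next_token(context: List[int]) -> int:
    """Hypothetical placeholder for a model forward pass."""
    return (sum(context) + len(context)) % 1000


@dataclass
class KVCache:
    """Stand-in for per-request key/value tensors (here just token ids)."""
    request_id: str
    tokens: List[int]


class PrefillInstance:
    """Processes the whole prompt once and emits the first output token."""

    def run(self, request_id: str, prompt: List[int]) -> Tuple[int, KVCache]:
        cache = KVCache(request_id, list(prompt))
        first_token = toy_next_token(cache.tokens)
        cache.tokens.append(first_token)
        return first_token, cache


class DecodeInstance:
    """Generates the remaining tokens one at a time from a received cache."""

    def run(self, cache: KVCache, new_tokens: int) -> List[int]:
        output = []
        for _ in range(new_tokens):
            token = toy_next_token(cache.tokens)
            cache.tokens.append(token)
            output.append(token)
        return output


def transfer(cache: KVCache) -> KVCache:
    """Placeholder for KV cache synchronization between instances."""
    return KVCache(cache.request_id, list(cache.tokens))


def serve(request_id: str, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Prefill on one instance, hand the cache over, decode on another."""
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be at least 1")
    first_token, cache = PrefillInstance().run(request_id, prompt)
    remote_cache = transfer(cache)
    rest = DecodeInstance().run(remote_cache, max_new_tokens - 1)
    return [first_token] + rest


result = serve("req-1", [1, 2, 3], 4)
print(result)
```

In this sketch the prefill instance alone determines how quickly the first token appears, while the decode instance determines the pace of every later token, which is why the two stages can be scaled and tuned separately.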
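To get a feel for how distance interacts with the latency targets quoted in this section (first token within 1 second, each subsequent token under 50 milliseconds), the Python sketch below adds a propagation-only network round trip to the first-token latency. It is a simplified model under several assumptions: signals travel through optical fiber at roughly 200,000 km/s, routing and queuing delays are ignored, and subsequent tokens are streamed so that distance mainly affects the first token. All function names are hypothetical.

```python
FIBER_KM_PER_S = 200_000.0  # approximate signal speed in optical fiber


def round_trip_s(distance_km: float) -> float:
    """Propagation-only round-trip time between terminal and computing center."""
    return 2.0 * distance_km / FIBER_KM_PER_S


def perceived_first_token_s(server_ttft_s: float, distance_km: float) -> float:
    """First-token latency seen by the user: server latency plus round trip."""
    return server_ttft_s + round_trip_s(distance_km)


def response_time_s(
    output_tokens: int,
    distance_km: float,
    server_ttft_s: float = 1.0,   # upper bound quoted in the text
    per_token_s: float = 0.050,   # upper bound quoted in the text
) -> float:
    """Upper-bound time to receive a full streamed response."""
    if output_tokens < 1:
        raise ValueError("output_tokens must be at least 1")
    first = perceived_first_token_s(server_ttft_s, distance_km)
    return first + (output_tokens - 1) * per_token_s


for km in (50, 1000, 3000):
    rtt_ms = round_trip_s(km) * 1000
    total = response_time_s(201, km)
    print(f"{km:>5} km: RTT {rtt_ms:.1f} ms, 201-token response <= {total:.2f} s")
```

Under these assumptions, pure propagation delay is small next to a 1-second first-token bound, but it is comparable to the 50-millisecond per-token bound once distances reach several thousand kilometers, which matters for interactive, non-streamed exchanges.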

Source: Analysis by Frost & Sullivan
PART.04
Development trends and challenges of inference computing power in China
With national strategy placing great emphasis on the development of artificial intelligence, China's inference computing power is entering a phase of rapid growth. As large models and multimodal models are widely adopted, demand for efficient, low-latency inference computing power continues to rise. In terms of technical trends, inference computing power is developing in four main directions: the continuous expansion and upgrading of computing infrastructure; the optimization of inference for long sequences and super-large models; the spread of multi-machine parallel inference to support super-large models and multimodal applications; and the maturing of hardware-software co-design and the surrounding ecosystem.
Domestic computing power is steadily strengthening its overall capabilities through technological breakthroughs, ecosystem building, and coordinated development of the industrial chain. Domestic chips, represented by Huawei Ascend, are iterating at an accelerating pace, and breakthroughs in system-level computing power have been achieved through "super-node" clusters and multi-card interconnection. At the same time, an open ecosystem has attracted more enterprises to join, forming an independently controllable computing infrastructure system and laying a solid foundation for the development of inference computing power in China.

Source: Analysis by Frost & Sullivan
Intelligent computing power in China is growing rapidly, but it also faces many challenges, including tight power supply, a shortage of high-power cabinets, and the lack of mechanisms for edge data security and cross-level collaboration. The industry is accelerating the construction of green, high-density infrastructure, optimizing the layout of computing centers, and exploring safer and more efficient ways of collaborating on data to ensure the sustained and efficient development of computing power.

