

Li Qing, Director of Frost & Sullivan Greater China
On August 30th, at the AI Reconstructing the Digital Economy sub-forum, Li Qing, Director of Frost & Sullivan Greater China, released 'The Mid-Year Evaluation of China's Large Model Research and Development Capabilities in 2024', which sorted out and evaluated several leading large models in the market, and conducted an in-depth analysis of their current comprehensive research and development capabilities.
Since OpenAI publicly released ChatGPT at the end of 2022, AI technology has moved from closed-door exploration inside technology companies into worldwide public view. After more than a year of development, GPT-style large model technology has become a key strategic element of national technology and industrial policy and has drawn significant international attention. Against this backdrop, hundreds of pre-trained large language models have emerged in China, with participants including top academic research institutions and internet technology companies. To map the capability tiers of China's large models and the companies behind them, Frost & Sullivan, in collaboration with LeadLeo Research Institute, conducted its first multi-dimensional comprehensive evaluation of large model research capabilities in December 2023.
Half a year later, after continuous iterative upgrades to large model capabilities and several reshuffles of the competitive landscape, the large model market looks different. Internet giants such as Baidu, Alibaba, and Tencent continue to lead the market, while startups such as Moonshot AI, 01.AI, and Baichuan AI have emerged as formidable competitors challenging the established internet brands. To reflect the current competitive situation as accurately as possible, Frost & Sullivan, in collaboration with LeadLeo Research Institute, surveyed and analyzed the research-assistance capabilities of Chinese large models in the first half of 2024. Based on this research, they reviewed and evaluated several leading large models in the market and analyzed their current comprehensive research capabilities in depth.
Background and Methods of Large Model Research Capability Evaluation
1. Overview of Research Background - Industry research provides key insights through multi-level in-depth analysis, supporting corporate strategic decision-making and market positioning.
Industry research is a comprehensive process that analyzes the current development status and market dynamics of specific industries, covering key dimensions such as industry definition, classification, competitive landscape, and market capacity. Analysts provide profound insights and valuable perspectives through in-depth research, offering important support for various fields such as corporate strategic planning, policy formulation, financial investment decisions, and education and training.

Source: Analysis by Frost & Sullivan, LeadLeo Research Institute
In industry research, the industry level, sector level, and product level represent different levels of granularity, from macro to micro: the industry level encompasses groups of industries with similar characteristics, the sector level focuses on the market dynamics and corporate conditions of a specific industry, and the product level examines the design, functionality, and market positioning of specific products or services. Research methodologies are adjusted according to these macro-to-micro differences. The macro level attends to factors such as policy, economy, and environment, while the micro level covers more detailed content such as development history and industrial chain analysis.

Source: Analysis by Frost & Sullivan, LeadLeo Research Institute
2. Pain Points in Traditional Industry Research Development - Traditional industry research faces challenges such as outdated tools, difficulties in knowledge transfer, and complex quality control, which seriously affect its output efficiency and innovation capabilities.
The output process of traditional industry research typically includes three key steps: the first is basic research, focusing on the collection of primary and secondary industry data; the second is data processing, involving organizing data logic, verifying data authenticity, and visualizing key information; finally, it is the result output, ensuring that the report has consistent logic, clear visualization, and reasonable viewpoints.
In practice, however, industry research faces many challenges. Tools lag behind: industry research has long relied on web search and office software, with no significant innovation in the past 20 years. Team knowledge is hard to pass on: high personnel turnover and long training cycles for new members make it difficult for analysts to transfer their experience and knowledge continuously. Information traceability and compliance are complex: under the pressure of massive information volumes and tight deadlines, ensuring that sources are reliable and compliant becomes a problem. Quality control is difficult: quality control personnel lack professional writing skills, and professional analysts find it hard to participate fully in quality control because of time constraints. Together, these challenges reduce the output efficiency and innovation capabilities of industry research and limit its further development.

Source: Analysis by Frost & Sullivan, LeadLeo Research Institute
3. Large Model Empowering Industry Research - Large models effectively empower industry research through innovation and accuracy, improving the quality and efficiency of analysts' content creation and information retrieval.
Large models, through their core functions such as creation, generation, rewriting, and retrieval, comprehensively drive the advancement of industry research. Firstly, large models act as third-party AI experts, assisting analysts in framework construction and content creation at the initial stages of research, effectively reducing the burden of desk work. Secondly, through effective interaction with analysts, large models help generate structured content and insights, significantly improving the output efficiency of basic content. Furthermore, by reducing text errors and repetitive content, optimizing the proofreading process, and enhancing output quality, large models improve the quality of production. Finally, large models can quickly process massive data, provide real-time information retrieval, and enhance analysts' ability to obtain comprehensive information within limited time.

Source: Analysis by Frost & Sullivan, LeadLeo Research Institute
In assisting industry research, large models empower industry analysis through "two innovations" and "three accuracies". The "two innovations" are creativity in proposing analysis dimensions and creativity in forming viewpoints, which give analysts broader perspectives and support research content that is original and insightful. The "three accuracies" are the accuracy of information and data, the accuracy of prompt understanding, and the accuracy of sub-industry knowledge. With rigorous and precise content output, analysts can grasp industry dynamics more comprehensively and judge the overall development trend of the industry more accurately.
This evaluation assesses how large models differ in innovation and accuracy through capability tests in three dimensions: report writing ability, industry understanding ability, and foundational industry research ability. Ultimately, it aims to identify the large models that most effectively help analysts generate high-quality industry research content.

Source: Analysis by Frost & Sullivan, LeadLeo Research Institute
4. Evaluation of Large Model Participants - In 2024, Frost & Sullivan, in collaboration with LeadLeo Research Institute, conducted a comprehensive assessment of the industry research capabilities of 16 leading Chinese large models, revealing their latest applications in the field of industry research.
Since the launch of ChatGPT, generative AI has sparked a global craze and gradually penetrated into daily life and work scenarios. After conducting the first evaluation of the research capabilities of large models in 2023, Frost & Sullivan, in collaboration with LeadLeo Research Institute, released updated evaluation results for mid-year 2024. They selected 16 leading large models on the Chinese market for a comprehensive assessment to gain insights into the latest applications of Chinese large models in the research field.

Source: Analysis by Frost & Sullivan, LeadLeo Research Institute
5. Evaluation Methods and Metrics - This large model capability test used 3,540 questions to examine the large models' writing ability, foundational skills, and industry understanding through both manual evaluation by analysts and automatic evaluation by large models.
This large model capability test covers three core areas: report writing ability, model foundational capabilities, and industry understanding. Report writing covers 20 reports from different industries, comprising 300 questions, plus more than 3,000 questions that analysts have accumulated through long-term tracking of report issues. Model capabilities cover six core text generation abilities, comprising 60 questions. Industry understanding covers 15 core industries with 12 questions each, totaling 180 questions. Together, the three dimensions comprise 3,540 questions. The analyst team consists of senior analysts from various teams of LeadLeo Research Institute, all with more than 16 months of experience using large models.
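The question counts above can be tallied with a short sketch. The constant and function names are illustrative only, and the accumulated pool is taken as exactly 3,000 (the text says "over 3,000") because that is the value consistent with the stated total of 3,540.

```python
# Question counts per dimension, as quoted in the text.
REPORT_QUESTIONS = 300        # questions across 20 industry reports
ACCUMULATED_QUESTIONS = 3000  # assumption: "over 3,000" treated as exactly 3,000
MODEL_CAPABILITY_QUESTIONS = 60  # six core text generation abilities
INDUSTRIES = 15
QUESTIONS_PER_INDUSTRY = 12


def industry_questions() -> int:
    """Industry understanding: 15 industries with 12 questions each."""
    return INDUSTRIES * QUESTIONS_PER_INDUSTRY


def total_questions() -> int:
    """Sum of all three evaluation dimensions."""
    return (
        REPORT_QUESTIONS
        + ACCUMULATED_QUESTIONS
        + MODEL_CAPABILITY_QUESTIONS
        + industry_questions()
    )


print(industry_questions())  # 180
print(total_questions())     # 3540
```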
Two evaluation methods are used: manual evaluation by analysts and automatic evaluation by large models acting as referees. On the analyst side, a double-blind mechanism is adopted to maximize fairness. Each tester is randomly assigned N models for answer collection, and information sharing is prohibited during this period so that the later answer evaluation stage remains fair. During the evaluation phase, the answers from the 16 models to each question are presented in randomly shuffled order so that evaluators cannot be biased toward any model's answers. On the referee model side, ten leading Chinese and international large models are used as referee models for scoring. To reduce model bias and improve fairness, each referee model generates three scoring versions and their average is taken. The final referee score is the average of the scores from these ten referee models.
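The two mechanisms described above, shuffled answer order for blind review and two-stage averaging of referee scores, can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluators' actual tooling: the function names, data shapes, and example scores are all hypothetical.

```python
import random
from statistics import mean


def blind_answers(answers: dict[str, str], rng: random.Random) -> tuple[list[str], list[str]]:
    """Shuffle one question's answers so evaluators cannot tell which model wrote which.

    Returns the answer texts in random order, plus a hidden key of model
    names in the same order for mapping scores back afterwards.
    """
    items = list(answers.items())
    rng.shuffle(items)
    hidden_key = [model for model, _ in items]
    blinded = [text for _, text in items]
    return blinded, hidden_key


def referee_score(versions_by_referee: dict[str, list[float]]) -> float:
    """Average each referee's scoring versions, then average across referees."""
    per_referee = [mean(versions) for versions in versions_by_referee.values()]
    return mean(per_referee)


# Hypothetical example with two referees (the evaluation used ten),
# each producing three scoring versions for one answer.
example = {
    "referee_a": [80.0, 82.0, 84.0],  # averages to 82.0
    "referee_b": [70.0, 75.0, 80.0],  # averages to 75.0
}
print(referee_score(example))  # 78.5
```

Averaging within each referee first gives every referee equal weight in the final score, even if one referee were to produce more scoring versions than another.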

Source: Analysis by Frost & Sullivan, LeadLeo Research Institute
6. Sub-evaluation Dimensions of Research Capability - This large model capability test is conducted across three core areas: research report writing ability, model basic capabilities, and industry understanding ability. The evaluation results of large models in research capability are ultimately derived from their performance in these three core areas.
Research report writing ability refers to the professional level that large models demonstrate in the actual process of report writing. The 8-D methodology, developed jointly by Frost & Sullivan and LeadLeo Research Institute, consists of eight core modules that form a systematic and comprehensive framework for in-depth industry analysis. Under this methodology, detailed data combined with precise analysis yields insightful conclusions, improving both the clarity of industry research and the rigor of its data. More than a hundred analysts collaborated intensively for eight months, through multiple rounds of optimization, to build an efficient 8-D modular questioning framework for large models. This questioning system has been turned into an evaluation tool that uses targeted questions to test and assess the quality and effectiveness of model-generated reports.

Source: Analysis by Frost & Sullivan, LeadLeo Research Institute
The basic capabilities of industry research refer to six core competency dimensions summarized based on analysts' long-term practice of using large models in the process of AI-assisted writing of industry research reports. These dimensions include logical reasoning, which ensures the report content is structured rigorously and the conclusions are reliable by analyzing and inferring the logical relationships between data and facts; summarization and refinement, which extracts key points from a large amount of information and presents important conclusions and insights concisely and clearly; knowledge reserve, which uses extensive industry and market knowledge, combined with a multidisciplinary background, to write in-depth analysis reports; long text generation, which produces long reports with complete structures and detailed content, ensuring that each part has sufficient argumentation and data support; intention understanding, which accurately grasps the needs and intentions of clients or readers, ensuring that the report content meets their expectations; role-playing, which simulates different perspectives to deeply analyze and predict market behavior and industry trends, providing multi-dimensional insights to meet the needs of specific readers.

Source: Analysis by Frost & Sullivan, LeadLeo Research Institute
Industry understanding capability refers to the model's accuracy in recognizing different sub-sectors and the depth of its insight. Since its establishment, Frost & Sullivan, in collaboration with LeadLeo Research Institute, has accumulated over 140,000 registered users and more than 7,000 industry and enterprise research reports, covering 15 major industry categories and thousands of sub-fields. Its analyst team and research findings are widely recognized by users in finance, manufacturing, internet technology, and other fields, making LeadLeo one of China's largest industry research platforms. Its core advantages include broad industry coverage, a large number of reports, high writing efficiency, and precise knowledge. For this large model industry research capability evaluation, LeadLeo Research Institute, together with its Shanghai, Nanjing, and Shenzhen offices, gathered senior analysts from across industries. Drawing on their understanding of key areas such as competitive landscape, development trends, constraints, and industry barriers, combined with extensive report writing experience, they posed in-depth questions to the models on 15 major industries. Vertical evaluation of each model within each sub-sector and horizontal comparison across all industries revealed the depth and capability of the 16 large models in industry understanding and content output.

Source: Analysis by Frost & Sullivan, LeadLeo Research Institute
Results of the Chinese Large Model Capability Evaluation

Source: Analysis by Frost & Sullivan, LeadLeo Research Institute
Based on the three evaluation dimensions of research capability, 16 mainstream large models in the market were reviewed. The results of the 2024 mid-year review of large model research capability show that SenseTime SenseNova, Tencent Hunyuan, Tongyi Qianwen, Wenxin Yiyan (ERNIE Bot), and Doubao rank in the first tier, with outstanding comprehensive performance and strong capabilities.
SenseTime SenseNova: SenseTime's SenseNova large model system, including SenseNova · SenseChat, introduced an "Edge-to-Cloud" full-stack large model product matrix in its latest version, SenseNova 5.0. This enables seamless integration of AI capabilities at the cloud, edge, and device levels, further enhancing its application capabilities across various industries. The SenseNova system continues to focus on strong knowledge coverage, reasoning capabilities, and cross-modal interaction, supporting context windows of up to 128K and maintaining performance comparable to GPT-4.
Tencent Hunyuan: The Tencent Hunyuan large model is a general-purpose large language model independently developed by Tencent, with over 200 billion parameters and a pre-training corpus exceeding 3 trillion tokens. Hunyuan offers strong Chinese understanding and creation capabilities, logical reasoning abilities, and reliable task execution. It supports multi-turn dialogue, content creation, logical reasoning, knowledge enhancement, and other functions, as well as multimodal image generation. Hunyuan has been used in multiple businesses and products such as Tencent Cloud, Tencent Advertising, Tencent Games, Tencent Fintech, Tencent Meeting, Tencent Docs, WeChat Search, and QQ Browser.
Tongyi Qianwen: Tongyi Qianwen is a large model with hundreds of billions of parameters launched by Alibaba Cloud, whose comprehensive performance has been comparable to GPT-4 in multiple authoritative evaluations. The Tongyi Qianwen model has made significant improvements in complex instruction understanding, literary creation, general mathematics, knowledge memory, and hallucination resistance. Tongyi Qianwen 2.5 is more mature and user-friendly, with technical optimizations to better meet the integration needs of downstream application scenarios. In addition, the official Tongyi Qianwen website has launched multimodal and plugin functions, supporting subtasks such as image input and document parsing, and has introduced 10 industry models trained on the Tongyi large model to support applications in different fields.
Wenxin Yiyan (ERNIE Bot): ERNIE Bot is a large-scale pre-trained language model developed by Baidu based on the Transformer architecture. As a new member of the Wenxin large model family, it offers core functions such as intelligent dialogue, content creation, multimodal generation, and knowledge enhancement. Through pre-training on massive data, the model can handle various natural language processing tasks such as text classification and sentiment analysis, and supports multi-language applications. Since its upgrade to version 4.0, ERNIE Bot has been available to enterprise customers on the Baidu Intelligent Cloud platform and is widely used in academic research and business scenarios, improving work efficiency and user satisfaction.
Doubao: ByteDance's Doubao large model, released in May 2024, is a multi-functional AI model covering fields such as natural language processing, knowledge Q&A, and language translation. With efficient processing capabilities and highly competitive pricing, the Doubao large model has been applied in more than 30 industries, including mobile phones, automobiles, and finance. It processes an average of 120 billion tokens of text and generates 30 million images per day. With successful applications in various business scenarios, the Doubao large model has become one of the most used large models in China.


