Inspur AI Servers Achieve Record-Breaking Results in the Latest MLPerf v2.1 Inference Benchmarks
Performance in multiple tasks, including natural language processing and medical image segmentation, improved by almost 100% compared to previous results
SAN JOSE, Calif.–(BUSINESS WIRE)–Inspur Systems, a leading data center, cloud computing and AI solutions provider, announced that Inspur AI servers achieved record-breaking results with large performance gains in the newly released MLPerf™ Inference v2.1 AI benchmark results. Inspur AI servers took the lead in more than half of the tasks in the Closed division, posting performance improvements of nearly 100% in multiple tasks compared to previous results.
Inspur AI servers were top ranked in 19 out of 30 tasks in the Closed division, which offers an apples-to-apples performance comparison between submitters. Of these, Inspur AI servers won 12 titles out of 16 tasks in the datacenter category and 7 titles out of 14 tasks in the edge category. Inspur successfully defended 11 performance records and saw performance improvements of approximately 100% in several tasks, such as BERT (natural language processing) and 3D U-Net (medical image segmentation).
Strong lead in BERT, greatly improving Transformer performance
Twenty-one global companies and research institutions submitted more than 10,000 performance results for the Inference v2.1 benchmarks. The Inspur NF5468M6J AI Server has a pioneering design with 24 GPUs in a single machine. Inspur improved inference performance for BERT, which is based on the Transformer architecture, with strategies including in-depth optimization of round-robin scheduling across GPUs to make full use of the performance of each GPU, enabling the completion of 75,000 question-and-answer tasks per second. This is a 93.81% jump compared with the previous best performance in the v2.0 results. It also marked the fourth time that an Inspur AI server was the benchmark leader for the MLPerf Inference BERT task.
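The round-robin scheduling mentioned above can be illustrated with a minimal sketch. This is not Inspur's implementation, which is not described in detail here; the function name `round_robin_assign` and the query labels are hypothetical, and the sketch only shows the basic idea of handing requests to GPUs in strict rotation so that no device sits idle while another is overloaded.

```python
from collections import defaultdict
from itertools import cycle


def round_robin_assign(requests, num_gpus):
    """Assign each request to a GPU index in strict rotation.

    Hypothetical helper for illustration only: a real inference server
    would also account for batching, queue depth, and device memory.
    """
    assignment = defaultdict(list)
    gpu_ids = cycle(range(num_gpus))  # 0, 1, ..., num_gpus - 1, 0, 1, ...
    for request in requests:
        assignment[next(gpu_ids)].append(request)
    return dict(assignment)


# 48 illustrative queries spread over 24 GPUs (the GPU count cited for the NF5468M6J).
queries = [f"query-{i}" for i in range(48)]
plan = round_robin_assign(queries, num_gpus=24)
```

With 48 queries and 24 GPUs, each GPU receives exactly two queries, which is the even load distribution that round-robin dispatch aims for.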
The Inspur NF5468M6J AI Server achieved record-breaking performance that was 20% higher than the runner-up in the BERT task. The success of the NF5468M6J is due to its system design: it supports up to 24 A100 GPUs with a layered and scalable computing architecture, and it earned 8 titles. Among the participating high-end mainstream models utilizing 8 GPUs with NVLink technology, Inspur AI servers achieved top results in 7 out of 16 tasks in the datacenter category, showing leading performance among high-end models. Among them, the NF5488A5, Inspur's flagship high-performance AI server, supports 8 A100 GPUs interconnected by third-generation NVLink and 2 AMD Milan CPUs in a 4U space. The NF5688M6 is an AI server with extreme scalability optimized for large-scale data centers. It supports 8 A100 GPUs and 2 Intel Ice Lake CPUs, along with up to 13 PCIe Gen4 I/O expansion cards.
Optimization on algorithm and architecture, further enhancing performance
Inspur is the first to apply a hyperparameter optimization solution in MLPerf Training, which greatly improves performance. Inspur also pioneered a ResNet convergence optimization solution: on the ImageNet dataset, the target accuracy was reached using only 85% of the original iteration steps, improving training performance by 15%. Inspur is also the first to use a self-developed convolution merging algorithm, implemented as a plugin operator, in the MLPerf Inference benchmarks. The algorithm improves performance from 123 TOPS to 141 TOPS, a gain of 14.6%.
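The percentages quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text (123 TOPS, 141 TOPS, and 85% of the original iteration steps); the helper name `percent_gain` is hypothetical, and the throughput-equivalent figure assumes a constant cost per training step, which the text does not state.

```python
def percent_gain(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Convolution merging: 123 TOPS -> 141 TOPS.
tops_gain = percent_gain(123, 141)

# ResNet convergence: target accuracy reached in 85% of the original steps.
steps_fraction = 0.85
steps_saved = (1 - steps_fraction) * 100            # reduction in iteration steps
# Assumption: every step costs the same, so fewer steps means proportionally less time.
equivalent_speedup = (1 / steps_fraction - 1) * 100

print(f"TOPS gain: {tops_gain:.1f}%")
print(f"Steps saved: {steps_saved:.1f}%")
print(f"Equivalent speedup at constant step cost: {equivalent_speedup:.1f}%")
```

The TOPS gain works out to about 14.6%, matching the stated figure. The 15% figure for ResNet corresponds to the reduction in iteration steps; expressed as a speedup under the constant-step-cost assumption, the same saving would be roughly 17.6%.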
In terms of architecture optimization, Inspur took the lead in using the JBOG (Just a Bunch of GPUs) solution to greatly improve the ability of Inspur AI servers to accommodate a large number of GPUs in a single node. In addition, high-load multi-GPU collaborative task scheduling and data transmission between NUMA nodes and GPUs are deeply optimized. This enables CPU and GPU utilization to scale linearly and multiple concurrent tasks to run simultaneously, which greatly improves performance.
Inspur is committed to full-stack innovation across its AI computing platform, resource platform and algorithm platform, and works with its MetaBrain ecosystem partners to accelerate AI industrialization and the intelligent development of various industries.
As a member of MLCommons, Inspur has actively promoted the development and innovation of the MLPerf benchmark suite, participating in the benchmarks 10 times and winning multiple performance titles. Inspur continues to innovate in areas such as overall system optimization, synergistic software and hardware optimization, and reduction of the energy consumption ratio, repeatedly breaking MLPerf performance records. Inspur shares this technology with the MLCommons community, where it has been adopted by a large number of participating manufacturers and is widely used in subsequent MLPerf benchmarks.