Intel® Select Solutions: AI Inferencing

Summary

Inspur is an important strategic partner of Intel. After Intel launched the 2nd Generation Intel Xeon Scalable processor, which offers stronger computing performance and a better user experience, Inspur and Intel jointly developed an AI inference solution based on this processor. The solution accelerates AI inference and helps industry and enterprise customers reduce deployment time and cost while maintaining the performance needed to meet their growing demands. Intel Select Solutions for AI Inferencing gives customers an efficient AI inference solution built on proven Intel architecture that can be deployed quickly, so they can innovate and get to market faster. To accelerate both inference and time to market, Intel Select Solutions for AI Inferencing also combines several Intel and third-party software and hardware technologies.

Background

In the context of artificial intelligence and machine learning, training is the stage where a neural network learns from data. Inference puts that learning into practice: a trained model is used to infer and predict outcomes, that is, to classify, identify, and process new input data based on what the model has learned. This solution is a deep learning inference solution for some of the fastest growing areas of artificial intelligence, such as video, natural language processing, and image processing. As a “turnkey platform” built from proven Intel architecture (IA) building blocks, it lets partners innovate and bring solutions based on these integrated building blocks to market, making AI simple and efficient. Once a neural network model has been trained, deploying it for inference becomes the challenge. This solution provides a starting point for customers to deploy efficient AI inference. It uses OpenVINO to accelerate inference, which shortens the time from enterprise data to strategic decisions, provides low latency and high end-to-end throughput, and reduces enterprise costs.

Solution Introduction: NF5280M5 AI Server

Industry: Cross-industry solutions, mainly for retail, finance, medical and other fields.
Use cases: Natural language processing, image classification, object detection and object tracking.
Deployment: Edge and data center deployments.

To qualify for Intel Select Solutions for AI Inferencing, the server vendor or data center solution provider must meet or exceed the defined minimum configuration components and reference minimum benchmark performance thresholds listed below.

Basic Configuration

Component / Model | Description | Qty
NF5280M5 | 2U dual-socket rackmount server | 1
6240/6248 | Intel Xeon Gold 6240 CPU at 2.6 GHz / 18C / 36T, or 6248 at 2.5 GHz / 20C / 40T | 2
Memory | 192 GB with 12 x 16 GB or higher, 2933 MHz or higher, DDR4 ECC RDIMM | 12
S4510 | Intel SSD D3-S4510 Series (240 GB, 2.5in SATA 6Gb/s, 3D2, TLC) or Intel SSD D3-S4510 Series (240 GB, M.2 80mm SATA 6Gb/s, 3D2, TLC) or higher | 1
P4610 | Intel SSD DC P4610 Series (2.5in PCIe 3.1 x4, 3D2, TLC) @ 3.2 TB or higher | 1
25Gb Network | 56Gb InfiniBand / 25Gb Ethernet | 1

Upgrade Configuration

Component / Model | Description | Qty
NF5280M5 | 2U dual-socket rackmount server | 1
8280 | Intel Xeon Platinum 8280 CPU at 2.7 GHz / 28C / 56T | 2
Memory | 384 GB with 12 x 32 GB or higher, 2933 MHz or higher, DDR4 ECC RDIMM | 12
S4510 | Intel SSD D3-S4510 Series (240 GB, 2.5in SATA 6Gb/s, 3D2, TLC) or Intel SSD D3-S4510 Series (240 GB, M.2 80mm SATA 6Gb/s, 3D2, TLC) or higher | 1
P4610 | Intel SSD DC P4610 Series (2.5in PCIe 3.1 x4, 3D2, TLC) @ 3.2 TB or higher | 1
25Gb Network | 56Gb InfiniBand / 25Gb Ethernet | 1
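
The requirement to meet or exceed the minimum configuration can be read as a field-by-field comparison against the Basic Configuration. The following is a minimal sketch of such a check, not an Intel validation tool: the `NodeSpec` class, its field names, and `meets_minimum` are hypothetical, and it covers only hardware values from the Basic and Upgrade Configuration tables (the benchmark thresholds are not given in this document, so they are left out).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical description of one server node (field names are illustrative)."""
    cpu_sockets: int
    cores_per_cpu: int
    dimm_count: int
    dimm_size_gb: int
    memory_mhz: int
    ethernet_gbps: int

    @property
    def memory_gb(self) -> int:
        # Total memory is the DIMM count times the size of each DIMM.
        return self.dimm_count * self.dimm_size_gb


# Values taken from the Basic Configuration table (Xeon Gold 6240 variant).
BASIC_MINIMUM = NodeSpec(
    cpu_sockets=2, cores_per_cpu=18, dimm_count=12,
    dimm_size_gb=16, memory_mhz=2933, ethernet_gbps=25,
)


def meets_minimum(node: NodeSpec, minimum: NodeSpec = BASIC_MINIMUM) -> List[str]:
    """Return a list of shortfalls; an empty list means the node qualifies."""
    checks = {
        "cpu_sockets": (node.cpu_sockets, minimum.cpu_sockets),
        "cores_per_cpu": (node.cores_per_cpu, minimum.cores_per_cpu),
        "memory_gb": (node.memory_gb, minimum.memory_gb),
        "memory_mhz": (node.memory_mhz, minimum.memory_mhz),
        "ethernet_gbps": (node.ethernet_gbps, minimum.ethernet_gbps),
    }
    return [
        f"{name}: {have} < {need}"
        for name, (have, need) in checks.items()
        if have < need
    ]


# The Upgrade Configuration (Xeon Platinum 8280, 12 x 32 GB) exceeds the minimum.
upgrade = NodeSpec(cpu_sockets=2, cores_per_cpu=28, dimm_count=12,
                   dimm_size_gb=32, memory_mhz=2933, ethernet_gbps=25)
print(meets_minimum(upgrade))
```

Returning the list of shortfalls rather than a single boolean makes it clear which component would need to be upgraded.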

Single Node Spec

CPU | 2 x Intel Xeon Gold 6230/6240/6248 or Intel Xeon Platinum 8280
Memory (min) | 192 GB (12 x 16 GB 2666 MHz DDR4 ECC RDIMM)
Boot Drive | 1 x Intel SSD DC S4500 Series >= 240 GB (P4101)
Storage | Intel SSD D3-S4510 Series (240 GB, 2.5in SATA 6Gb/s, 3D2, TLC) or Intel SSD D3-S4510 Series (240 GB, M.2 80mm SATA 6Gb/s, 3D2, TLC) or higher; Intel SSD DC P4610 Series (2.5in PCIe 3.1 x4, 3D2, TLC) @ 3.2 TB or higher
Data Network | InfiniBand (IB) 56Gb and Ethernet 25Gb
Management Network | Integrated 1 GbE port 0/RMM port

Software

Software | Required or Recommended | Version
Linux Distribution | Recommended | CentOS* Linux* release 7.6.1810 (Core)
Intel Math Kernel Library (MKL) | Required | 2018 Update 3
Intel MKL-DNN | Required | 0.17
OpenVINO Toolkit | Required | 2018 R5
OpenVINO Model Server | Recommended | 0.41
TensorFlow | Required | 1.14
PyTorch | Recommended | 1.0.1
MXNet | Recommended | 1.3.1
Intel Python | Recommended | 2019 Update 1
Intel Perf & Monitoring Tools | Recommended | Cores per Node (VTune Platform Profiler)

Customer Benefits

Test Results

In tests comparing the Intel Xeon Gold 6240 with the Intel Xeon Gold 6130, training performance improves by 1.35 to 1.43 times.

  • ResNet50 training performance is 1.35 times faster
  • Inception_v3 training performance is 1.43 times faster
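
The speedup figures in this section are ratios of a metric measured on the newer platform to the same metric on the baseline. A minimal sketch of that arithmetic follows, assuming throughput (for example, images per second) or elapsed time as the metric. The helper names and the sample numbers are illustrative placeholders, not measurements from these tests.

```python
def speedup(new_throughput: float, baseline_throughput: float) -> float:
    """Ratio of new to baseline throughput; values above 1.0 mean faster."""
    if baseline_throughput <= 0:
        raise ValueError("baseline throughput must be positive")
    return new_throughput / baseline_throughput


def speedup_from_times(baseline_seconds: float, new_seconds: float) -> float:
    """The same ratio expressed with elapsed times for a fixed amount of work."""
    if new_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return baseline_seconds / new_seconds


# Placeholder numbers chosen only to illustrate the calculation.
print(round(speedup(135.0, 100.0), 2))           # 1.35
print(round(speedup_from_times(10.0, 7.0), 2))   # 1.43
```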

Inference performance (FP32) on the Intel Xeon Gold 6240 is 1.3 to 3 times faster than on the Intel Xeon Gold 6130, with the best results coming from multi-stream inference (BS = 1).

  • ResNet50 multi-stream inference is up to 3 times faster
  • Inception_v3 multi-stream inference is up to 2.6 times faster
  • A3C multi-stream inference is up to 1.32 times faster
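
Multi-stream inference at batch size 1 means that several independent streams each process one input at a time, and the reported figure is their combined throughput. The sketch below shows that structure with Python's `concurrent.futures`. `fake_infer` is a stand-in for a real model call and `run_streams` is a hypothetical helper, so the timings it produces say nothing about the results reported here.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def fake_infer(sample: Sequence[float]) -> int:
    """Stand-in for a model call: returns the index of the largest value."""
    return max(range(len(sample)), key=lambda i: sample[i])


def run_streams(
    infer: Callable[[Sequence[float]], int],
    samples: List[Sequence[float]],
    num_streams: int,
) -> Tuple[List[int], float]:
    """Run `infer` on each sample (batch size 1) across `num_streams` workers.

    Returns the predictions in input order and the aggregate throughput
    in samples per second.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_streams) as pool:
        # Each worker picks up one sample at a time, i.e. batch size 1.
        predictions = list(pool.map(infer, samples))
    elapsed = time.perf_counter() - start
    throughput = len(samples) / elapsed if elapsed > 0 else float("inf")
    return predictions, throughput


samples = [[0.1, 0.7, 0.2], [0.9, 0.05, 0.05], [0.2, 0.3, 0.5]] * 4
predictions, throughput = run_streams(fake_infer, samples, num_streams=4)
print(predictions[:3], f"{throughput:.0f} samples/s")
```

Because `fake_infer` is pure Python, the threads here share the interpreter lock and will not scale with the stream count. In a real deployment each stream would call the inference framework, typically with its own share of CPU cores; how streams are mapped to cores is outside this sketch.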

Compared with the FP32 model, INT8 multi-stream inference with VNNI-optimized TensorFlow on the Intel Xeon Gold 6240 is 1.9 to 3.2 times faster.

  • Under 40 parallel streams, ResNet50 INT8 performance is up to 3.5 times faster
  • Under 40 parallel streams, Inception INT8 performance is up to 3.2 times faster
  • Under 40 parallel streams, Wide & Deep (W&D) INT8 performance is up to 1.9 times faster
  • Under 20 parallel streams, SSD-MobileNet INT8 performance is doubled
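
INT8 inference replaces 32-bit floating-point values with 8-bit integers plus a scale factor, the data type that VNNI instructions are designed to accelerate. As a rough illustration, here is a generic symmetric per-tensor quantization scheme in plain Python. It is an assumption chosen for clarity, not the calibration method used by TensorFlow or OpenVINO, and the function names are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # The largest magnitude maps to 127; an all-zero tensor keeps scale 1.0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integers and the scale."""
    return [q * scale for q in quantized]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The rounding error per value is at most about half the scale, which is why INT8 models can trade a small loss of precision for the speedups listed above.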

Compared with TensorFlow, OpenVINO further improves INT8 performance:

  • On the Intel Xeon Gold 6240, ResNet50 INT8 performance with OpenVINO is 1.3 times that of TensorFlow
  • On the Intel Xeon Gold 6240, Inception_v3 INT8 performance with OpenVINO is 1.4 times that of TensorFlow

Intel, the Intel logo and Xeon are trademarks of Intel Corporation in the United States and/or other countries.
