Intel® Select Solutions: BigDL on Apache Spark

Accelerate and simplify deep learning development and deployment on an optimized, verified infrastructure based on Apache Spark.

In the past few years, organizations have seen a convergence of massive amounts of data with the compute power and large-capacity storage needed to process it all. The right infrastructure can provide modern businesses with new ways of harnessing data for innovative apps and services built on artificial intelligence (AI). The opportunities are nearly infinite and stretch across almost every field—from financial services to manufacturing to healthcare and beyond.

But organizations with on-premises infrastructures or using hybrid cloud models face several challenges on the road to AI. They need to research, select, deploy, and optimize infrastructure that can provide efficient resource utilization while scaling on demand to meet changing business requirements. Beyond scalability, organizations seek easier ways to implement AI initiatives. Many businesses lack sufficient in-house expertise and infrastructure to get started with AI, particularly for deep learning (DL). The road to deploying DL in production environments is time-intensive and complex. Managing the data for AI initiatives can also be a challenge: organizations struggle to extract value from their “data swamps,” and it can be complex and resource-intensive to move data from on premises to the cloud for analytics.

The Intel® Select Solution for BigDL on Apache Spark* can help businesses overcome these key challenges to achieve their AI initiatives faster and more easily. The pre-tested and tuned solution eliminates the need for organizations to research and manually optimize infrastructure to efficiently pursue their AI initiatives. The solution reduces the need for specialized in-house expertise to deploy and manage AI infrastructure. And it can help IT organizations improve infrastructure utilization, while ensuring scalability to meet the growing needs of their companies.

BigDL

Apache Spark helps address the IT challenges of DL, data management, and specialized expertise by providing standardized big-data storage and compute. It scales by adding nodes, up to hundreds, without degrading performance and without changing the fundamental architecture.

BigDL, a distributed DL library that augments the storage and compute capabilities of Apache Spark, provides efficient, scalable, and optimized DL development. BigDL enables the development of new DL models for training and serving on the same big data cluster. It also supports models from other frameworks, including TensorFlow*, Keras*, and others, so you can import other trained models into the BigDL framework or use BigDL-trained models in other frameworks. BigDL is supported by Analytics Zoo, which provides a unified AI platform and pipeline with built-in reference use cases to further simplify your AI solutions development.

BigDL is optimized for Intel®-based platforms with software libraries like Intel® Math Kernel Library (Intel® MKL) and Intel® Math Kernel Library for Deep Learning Networks (Intel® MKL-DNN) to increase computational performance. Other supporting software includes the Intel® Distribution for Python*, which accelerates popular machine learning libraries such as NumPy*, SciPy*, and scikit-learn* with integrated Intel® Performance Libraries such as Intel MKL and Intel® Data Analytics Acceleration Library (Intel® DAAL). On the hardware side, the Intel Select Solution for BigDL on Apache Spark uses Intel® Xeon® Scalable processors for high performance and Intel® Solid State Drives (SSDs) for better performance and improved reliability compared to traditional hard-disk drives (HDDs).

The Intel Select Solution for BigDL on Apache Spark

The Intel Select Solution for BigDL on Apache Spark helps optimize price/performance while significantly reducing infrastructure evaluation time. It combines Intel Xeon Scalable processors, Intel SSDs, and Intel® Ethernet Network Adapters so that enterprises can quickly deploy a reliable, comprehensive solution that delivers:

  • The ability to prepare your machine learning (ML)/DL infrastructure investments for the future with scalable storage and compute
  • Excellent total cost of ownership (TCO) with multi-purpose hardware that your IT organization is used to managing, in a verified, tested solution that simplifies deployment
  • Accelerated time to market with a turnkey solution that includes a rich development toolset and that is optimized for crucial software libraries
  • The ability to run analytics on data where it is stored

BigDL Application Scenarios

  • Analyze large amounts of data on the same big data (Spark) cluster where the data is stored, for example in HDFS, Apache HBase, or Hive
  • Add deep learning capabilities (training or inference) to big data (Spark) programs or workflows
  • Run deep learning applications on existing Hadoop/Spark clusters and easily share those clusters with other workloads (e.g., extract-transform-load, data warehousing, feature engineering, classic machine learning, graph analytics)

Inspur BigDL Solution

This test is based on the Inspur NF5280M5 server.

Network Topology

Configuration for the Inspur Solution Based on Intel BigDL

To refer to a solution as an Intel Select Solution, a server vendor or data center solution provider must meet or exceed the defined minimum configuration ingredients and reference minimum benchmark-performance thresholds listed below.

One Master Node

Processor: Intel® Xeon® Platinum 8160 processor (2.10 GHz, 24 cores, 48 threads)
Memory: 384 GB or higher (12 x 32 GB DDR4-2666)
Boot Drive: 1 x 240 GB Intel® SSD DC S4510
Data Tier: 1 x 1.92 TB Intel® SSD DC S4510
Data Network: 10 Gb Intel® Ethernet Converged Network Adapter X520-SR2
Management Network per Node: Integrated 1 GbE port 0/RMM port

Four Worker Nodes

Processor: Intel® Xeon® Platinum 8280 processor (2.60 GHz, 28 cores, 56 threads)
Memory: 384 GB (12 x 32 GB DDR4-2933)
Boot Drive: 1 x 240 GB Intel® SSD DC S4510
Data Tier: 1 x 1.92 TB Intel® SSD DC S4510
Data Network: 10 Gb Intel® Ethernet Converged Network Adapter X520-SR2
Management Network per Node: Integrated 1 GbE port 0/RMM port

Network Switches

Top of the Rack (TOR) Switch: 10 Gbps 48-port switch
Management Switch: 1 Gbps 48-port switch

Software

Linux OS
Apache Spark
Apache Hadoop
Java Development Kit (JDK)
BigDL
Analytics Zoo
Intel® Distribution for Python
Intel® Math Kernel Library (Intel® MKL)

Applies to All Nodes

Trusted Platform Module (TPM): TPM 1.2 discrete or firmware TPM (Intel® Platform Trust Technology [Intel® PTT])
Firmware and Software Optimizations:
  • Intel® Hyper-Threading Technology (Intel® HT Technology) disabled
  • Intel® Turbo Boost Technology enabled
  • P-states enabled**
  • C-states enabled**
  • Power-management settings set to performance**
  • Workload configuration set to balanced**
  • Mid-level cache (MLC) streamer enabled**
  • MLC spatial prefetch enabled**
  • Data Cache Unit (DCU) data prefetch enabled**
  • DCU instruction prefetch enabled**
  • Last-level cache (LLC) prefetch disabled**
  • Uncore frequency scaling enabled**

BigDL

Dataset: ImageNet-2012
Model: Inception V1
Benchmark: Training, Inference
Spark Cores: 50 per worker node
Batch Size: 800 images (4 x cores x executors)
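
The batch-size rule in the table can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name and the `per_core` parameter are hypothetical, and it assumes one Spark executor per worker node, which is consistent with the four worker nodes and the 800-image batch listed above but is not stated explicitly.

```python
def global_batch_size(cores_per_executor: int, executors: int, per_core: int = 4) -> int:
    """Global mini-batch size using the 4 x cores x executors rule from the table."""
    if cores_per_executor <= 0 or executors <= 0 or per_core <= 0:
        raise ValueError("all arguments must be positive")
    return per_core * cores_per_executor * executors


# Values taken from the configuration table.
SPARK_CORES_PER_WORKER = 50
# Assumption: one executor per worker node, so four executors in total.
EXECUTORS = 4

batch = global_batch_size(SPARK_CORES_PER_WORKER, EXECUTORS)
total_cores = SPARK_CORES_PER_WORKER * EXECUTORS

print(f"total Spark cores: {total_cores}")
print(f"global batch size: {batch} images")
print(f"images per core per iteration: {batch // total_cores}")
```

Under these assumptions the batch divides evenly across the 200 Spark cores, so each core processes four images per iteration.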

Performance

ImageNet Training Throughput: 453 images/sec with Top-5 Accuracy of 85.7%
ImageNet Inference Throughput: 1358 images/sec with Top-5 Accuracy of 85.7%

Test Results

  • The CPU cores of the worker nodes were not fully used in this test.
  • During Inception V1 model training, the average throughput (453 images/sec) and Top-5 accuracy (85.7%) both exceeded the Intel Select Solution certification thresholds of 375 images/sec and 85%.
  • In inference, the average throughput (1358 images/sec) was about three times that of training.
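
The comparison against the certification thresholds can be reproduced from the figures reported above. This is a minimal sketch; the constant and function names are hypothetical, and the numbers are copied from this document rather than measured by the code.

```python
# Measured results reported in this document.
TRAIN_IMAGES_PER_SEC = 453.0
INFER_IMAGES_PER_SEC = 1358.0
TOP5_ACCURACY_PCT = 85.7

# Intel Select Solution certification thresholds cited in the test results.
MIN_TRAIN_IMAGES_PER_SEC = 375.0
MIN_TOP5_ACCURACY_PCT = 85.0


def meets_thresholds(throughput: float, accuracy: float,
                     min_throughput: float, min_accuracy: float) -> bool:
    """Return True when both throughput and accuracy reach their minimums."""
    return throughput >= min_throughput and accuracy >= min_accuracy


passed = meets_thresholds(TRAIN_IMAGES_PER_SEC, TOP5_ACCURACY_PCT,
                          MIN_TRAIN_IMAGES_PER_SEC, MIN_TOP5_ACCURACY_PCT)

# Throughput headroom over the threshold, as a percentage of the threshold.
headroom_pct = (TRAIN_IMAGES_PER_SEC - MIN_TRAIN_IMAGES_PER_SEC) / MIN_TRAIN_IMAGES_PER_SEC * 100

# Ratio of inference throughput to training throughput.
speedup = INFER_IMAGES_PER_SEC / TRAIN_IMAGES_PER_SEC

print(f"training meets thresholds: {passed}")
print(f"training throughput headroom: {headroom_pct:.1f}%")
print(f"inference vs. training throughput: {speedup:.2f}x")
```

With the reported figures, training throughput is roughly 21% above the threshold, and the inference-to-training ratio rounds to 3.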

 
