AEWIN Edge AI Server Performs GPU Scaling with InfinitiesSoft AI-Stack


As AI technology advances rapidly, it is being integrated into a wide range of verticals, including Smart City, ITS, Smart Manufacturing, Smart Medical, and more. In this blog, we test a solution that integrates AEWIN hardware with the AI-Stack software from our partner InfinitiesSoft to accelerate GPU load balancing and scaling for inferencing.

Test Setup
AEWIN’s SCB-1932 MEC is a 2U rack-mount network appliance. Based on dual 3rd Gen Intel® Xeon® Scalable processors, the high-performance platform supports eight-channel DDR4 registered ECC RDIMMs (up to 3200 MHz) with a maximum memory capacity of 1TB per CPU. It accommodates up to eight Network Expansion Modules, or four Network Expansion Modules plus two full-height, full-length PCIe x16 slots. The maximum Ethernet bandwidth is 800GbE.

Integrating the AEWIN Edge AI Server with InfinitiesSoft's AI-Stack creates a high-performance platform for developing and orchestrating machine learning workloads. The resulting Edge AI Appliance gives industry users a ready-made development environment for building valuable applications.

System: AEWIN SCB-1932C
Processor: 2x Intel® Xeon® Gold 5318S CPU @ 2.10GHz
DIMM Slot: 16x DDR4 32GB = 512GB
OS: Ubuntu Linux 18.04 (Kernel: 5.4.0-Generic)
BIOS: C1932A003
BIOS Settings: “Above 4G Decoding”: Enable
               “Above 4GB MMIO BIOS assignment”: Enable
PCIe Accelerator: 2x NVIDIA T4

Figure 1: Test system set-up
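
Before running a test like this, it can be useful to confirm that both T4 cards listed in the set-up are actually visible to the driver. The following is a minimal sketch, not part of the original demo: it calls the `nvidia-smi` CLI with its `--query-gpu` and `--format=csv,noheader` options and parses the result. The helper names (`parse_gpu_list`, `visible_gpus`, `has_expected_gpus`) are our own, hypothetical names, and the expected count of 2 is taken from the table above.

```python
import subprocess

# nvidia-smi query that prints one CSV line per GPU: index, name, total memory
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.total",
    "--format=csv,noheader",
]


def parse_gpu_list(text):
    """Parse lines of the form 'index, name, memory' into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        index, name, memory = [field.strip() for field in line.split(",", 2)]
        gpus.append({"index": int(index), "name": name, "memory": memory})
    return gpus


def visible_gpus():
    """Run nvidia-smi and return the parsed GPU list (needs the NVIDIA driver)."""
    # stdout=PIPE / universal_newlines=True also work on the Python 3.6
    # that ships with Ubuntu 18.04.
    result = subprocess.run(
        QUERY, stdout=subprocess.PIPE, universal_newlines=True, check=True
    )
    return parse_gpu_list(result.stdout)


def has_expected_gpus(gpus, expected=2):
    """Return True when the number of visible GPUs matches the test set-up."""
    return len(gpus) == expected
```

On the test system, `has_expected_gpus(visible_gpus())` should return True. If it returns False, the BIOS settings listed above are a reasonable first thing to check.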

Figure 2: GPU load balancing and scaling

Figure 3: Run the script to activate the demo

Figure 4: Increase the number of GPUs

Figure 5: Increase the workload against the additional allocated GPUs, e.g. GPU-to-concurrent-user ratios of 2 to 1, 2 to 4, 2 to 8, and 2 to 10.
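
To make the idea of balancing concurrent users across GPUs concrete, here is a small illustrative sketch. It is not AI-Stack's scheduler, whose internals are not described in this blog. It assumes a simple least-loaded policy, where each new user goes to the GPU that currently serves the fewest users, and the function name `assign_users` is hypothetical.

```python
import heapq


def assign_users(num_gpus, num_users):
    """Assign each user to the GPU with the fewest users so far.

    Ties are broken in favour of the lowest GPU index. Returns a dict
    mapping GPU index -> list of user ids.
    """
    # Min-heap of (current load, gpu index)
    heap = [(0, gpu) for gpu in range(num_gpus)]
    heapq.heapify(heap)
    assignment = {gpu: [] for gpu in range(num_gpus)}
    for user in range(num_users):
        load, gpu = heapq.heappop(heap)
        assignment[gpu].append(user)
        heapq.heappush(heap, (load + 1, gpu))
    return assignment


# The GPU-to-user ratios from Figure 5: 2 GPUs against 1, 4, 8 and 10 users
for users in (1, 4, 8, 10):
    loads = [len(ids) for ids in assign_users(2, users).values()]
    print(f"2 GPUs, {users} users -> users per GPU: {loads}")
```

Under this assumed policy, the two GPUs never differ by more than one user, which is the kind of even spread a load balancer aims for as concurrency grows.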

The results presented in Figure 5 show that the AEWIN SCB-1932C Edge AI Server with AI-Stack can perform GPU load balancing and scaling for inferencing, and that it delivers a linear relationship as we increase the workload and the number of GPUs.
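
One way to quantify a "linear relationship" is to fit a straight line to the measurements and look at the coefficient of determination (R²), where a value close to 1 means the points lie close to a line. The sketch below uses an ordinary least-squares fit in plain Python. The function name `linear_fit` is hypothetical, and the numbers in the usage lines are placeholders chosen to be perfectly linear, not measurements from this test.

```python
def linear_fit(xs, ys):
    """Least-squares fit of y = slope * x + intercept; returns (slope, intercept, r2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Placeholder values only: concurrent users vs. a made-up load metric
users = [1, 4, 8, 10]
load = [10, 40, 80, 100]
slope, intercept, r2 = linear_fit(users, load)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.3f}")
```

With real measurements in place of the placeholders, an R² well below 1 would suggest the system is no longer scaling linearly at the higher ratios.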

In this test, we demonstrated a solution that integrates hardware (the AEWIN Edge AI Server, based on Intel Xeon Ice Lake-SP) and software (InfinitiesSoft AI-Stack) to accelerate GPU load balancing and scaling for inferencing. The integration of the AEWIN Edge AI Server and AI-Stack makes more efficient use of GPU resources and makes AI development and management quick and easy through the platform's interface. With AI-Stack installed, the AEWIN Edge AI Server can manage multiple servers and pool their resources for efficient operation, creating win-win solutions for AI applications.