Harnessing the Potential of InfiniBand: Solutions for Modern Networking Challenges

InfiniBand (IB) is an advanced computer network communication standard developed by the InfiniBand Trade Association (IBTA). It enjoys a strong reputation in HPC interconnects for supercomputers, storage systems, and even LAN networks. InfiniBand offers numerous advantages, including simplified management, high bandwidth, full CPU offloading, ultra-low latency, cluster scalability and flexibility, quality of service (QoS), and SHARP in-network computing support, among others.

InfiniBand is a critical communication technology for data transmission and suits a wide range of applications. It dominates at network speeds of 100G EDR and 200G HDR and is progressing toward even faster rates such as 400G NDR and 800G XDR. InfiniBand meets strict latency requirements, with end-to-end latency in the microsecond range. It excels in applications that demand rapid, precise data processing and is commonly used in supercomputing for large-scale data analysis, machine learning, deep learning training and inference, conversational AI, prediction, and forecasting.

InfiniBand HDR Product Solutions in AI

With the advancements in ChatGPT and artificial intelligence (AI), the pace of global data center construction has accelerated, changing the scale and pattern of optical module procurement. AI data centers adopt a Fat Tree network topology, in which each node has an equal number of uplink and downlink ports. Hence, compared to traditional data centers, AI data centers require considerably more switches.

Taking NVIDIA’s AI cluster design as an example, the SuperPOD serves as the foundational unit, comprising 140 DGX A100 GPU servers, HDR InfiniBand 200G network cards, and 170 NVIDIA Quantum QM8790 switches. With a 1:1 (non-blocking) connection approach and 40 ports per switch, the port interconnection requirement is calculated as 40 × 170 / 2 = 3,400, which is adjusted to roughly 4,000 ports for actual deployment scenarios. Meanwhile, meeting entry-level requirements comparable to GPT-4.0 would necessitate approximately 3,750 NVIDIA DGX A100 servers and 110,000 optical modules.
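The arithmetic above can be reproduced with a short sizing sketch. The figures (170 QM8790 switches, 40 ports per switch, 1:1 connectivity, a ~4,000-port deployment margin) are the ones quoted in this article; this is an illustrative back-of-the-envelope calculation, not an official NVIDIA sizing tool.

```python
# Back-of-the-envelope sizing for an HDR SuperPOD-style fabric.
# Figures are the ones quoted above; illustrative only.

SWITCHES = 170            # NVIDIA Quantum QM8790 switches per SuperPOD
PORTS_PER_SWITCH = 40     # 200G HDR QSFP56 ports on each QM8790

total_ports = SWITCHES * PORTS_PER_SWITCH   # 6800 switch ports in total
interconnects = total_ports // 2            # each link consumes a port at both ends -> 3400

print(f"Total switch ports:        {total_ports}")
print(f"1:1 interconnect links:    {interconnects}")

# The article rounds this figure up to ~4000 ports to allow for
# real-world deployment margins.
planned_ports = 4000
print(f"Planned ports with margin: {planned_ports}")
```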

FS provides products for the SuperPOD's HDR InfiniBand 200G interconnect, including InfiniBand optical devices that deliver high-speed, low-latency data transmission with low power consumption and high stability. NVIDIA’s solution also calls for AOC and DAC cables, which are typically employed for short-distance data transmission and offer high bandwidth, low latency, and low power consumption. HDR AOC and DAC cables can therefore be used in the SuperPOD to meet various distance and application requirements.

For example, the 2x200G HDR splitter cable provided by FS, the QSFP-2Q200G-2QAO05, is an active splitter cable based on QSFP56 VCSEL technology. It supports 2x 200Gb/s data transmission and complies with standards such as SFF-8665, RoHS, and SFF-8636. Each end of the cable carries an EEPROM that exposes product and status-monitoring information to the host system. The cable is primarily used in Fat Tree topologies to cross-connect 200G leaf and spine switches, allowing the QSFP56 ports of HDR InfiniBand switches to operate as 2x HDR100. In large-scale server clusters, the splitter cable maximizes port access capacity, expands network scalability, and reduces cost, enhancing the overall network capability of IB switches.
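As a rough illustration of how a host can read that EEPROM, the sketch below dumps the SFF-8636 management pages of one cable end on a Linux host. The interface name (`ib0`), the use of `ethtool -m`, and the byte offsets are assumptions based on the generic SFF-8636 layout, not FS- or NVIDIA-specific behaviour.

```python
# Minimal sketch: dump and decode a QSFP56 cable EEPROM on a Linux host.
# Assumes the port is visible as a netdev (here "ib0") and that the
# driver exposes the module EEPROM via `ethtool -m`; offsets follow the
# standard SFF-8636 flat 256-byte layout (lower page 00h + upper page 00h).
import subprocess

def read_module_eeprom(iface: str = "ib0") -> bytes:
    """Return the raw SFF-8636 EEPROM contents for one cable end."""
    result = subprocess.run(
        ["ethtool", "-m", iface, "raw", "on"],
        capture_output=True, check=True,
    )
    return result.stdout

eeprom = read_module_eeprom("ib0")

# Lower page 00h, bytes 22-23: module temperature, signed 1/256 degC steps.
temp_c = int.from_bytes(eeprom[22:24], "big", signed=True) / 256
# Upper page 00h, bytes 148-163: vendor name, ASCII, space padded.
vendor = eeprom[148:164].decode("ascii", errors="replace").strip()

print(f"Vendor: {vendor}, module temperature: {temp_c:.1f} degC")
```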

InfiniBand HDR Product Solutions in Supercomputing

NVIDIA GPUs and networking products, particularly Mellanox HDR Quantum QM87xx switches and BlueField DPUs, have established a dominant position, providing the interconnect for over two-thirds of supercomputers.

InfiniBand HDR Switch

NVIDIA offers two types of InfiniBand HDR switches: the HDR CS8500 modular chassis switch and the QM87xx series fixed switches. The 200G HDR QM87xx switches come in two models, MQM8700-HS2F and MQM8790-HS2F. QM8700 and QM8790 switches typically serve two connection applications. One is to connect directly to a 200G HDR network interface card (NIC) using 200G-to-200G AOC/DAC cables.

Another common application is to connect to a 100G HDR NIC, which requires a 200G-to-2x100G splitter cable to split one physical 200G (4x50G) QSFP56 port of the switch into two virtual 100G (2x50G) ports. After splitting, the port symbol changes from x/y to x/y/z, where "x/y" is the original symbol of the port before splitting and "z" (1 or 2) is the number of the single-lane sub-port; each sub-physical port is treated as a separate port.
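The sketch below simply illustrates that naming convention: one physical HDR port "x/y" is exposed as two HDR100 sub-ports "x/y/1" and "x/y/2" after splitting. It mirrors the description above and is illustrative, not a vendor configuration tool.

```python
# Illustrative helper for the split-port naming scheme described above:
# a physical HDR port "x/y" becomes two HDR100 sub-ports "x/y/1" and "x/y/2".

def split_hdr_port(port: str, lanes: int = 2) -> list[str]:
    """Return the sub-port names created by splitting one HDR port."""
    return [f"{port}/{i}" for i in range(1, lanes + 1)]

print(split_hdr_port("1/5"))   # ['1/5/1', '1/5/2'] -> two HDR100 ports
```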

InfiniBand HDR Network Interface Cards (NICs)

Compared to HDR switches, HDR Network Interface Cards (NICs) come in a greater variety of types. In terms of speed, there are two options: HDR100 and HDR. Beyond the data rate, NICs of each speed can also be selected as single-port or dual-port models and in different PCIe variants, according to business requirements.

The HDR InfiniBand network architecture is straightforward yet offers a range of hardware options. For 100Gb/s speeds, there are solutions such as 100G EDR and 100G HDR100; at 200Gb/s, options include 200G HDR and 200G NDR200. The switches, NICs, and accessories used differ significantly across applications.

Conclusion

InfiniBand high-performance HDR and EDR switches, Smart NIC cards, and solutions combining NADDOD/Mellanox/Cisco/HPE AOC, DAC, and optical module products deliver advantageous, high-value optical network products and comprehensive solutions for data centers, high-performance computing, edge computing, and artificial intelligence. They significantly enhance customers' business acceleration capabilities while offering low cost and excellent performance.

Click to read more related content: Exploring InfiniBand Network, HDR and Significance of IB Applications in Supercomputing | FS Community

Advantages and Applications of 2x200G HDR Splitter | FS Community

Introducing InfiniBand HDR Products for AI | FS Community
