The Power of InfiniBand: Revolutionizing AI Training and Autonomous Vehicles

In recent years, large-scale artificial intelligence (AI) models have garnered widespread attention in the AI community due to their exceptional capabilities in natural language understanding, cross-media processing, and the potential to advance towards general artificial intelligence. Leading models in the industry have reached parameter scales of trillions or even tens of trillions.

Network Bottlenecks in Large GPU Clusters

In large-scale model training tasks involving hundreds or even thousands of GPUs, the large number of server nodes and the volume of inter-server communication make network bandwidth a bottleneck for GPU cluster systems. As the cluster scale increases, the demands on network performance rise sharply. Once a GPU cluster reaches a certain scale, ensuring the stability of the cluster system becomes another challenge to address, alongside performance optimization.

The reliability of the network plays a crucial role in determining the computational stability of the entire cluster. This is due to the following reasons: large-scale network failure domains and significant fluctuations in network performance. Addressing these considerations is essential for maintaining the robustness and consistent performance of large-scale GPU clusters.

Empowering High-Performance AI Training Networks

In the realm of large-scale model training, extensive communication is required for compute iterations and gradient synchronization, with the data exchanged in a single iteration often reaching several hundred gigabytes. Additionally, the parallel patterns and communication requirements introduced by acceleration frameworks mean that traditional low-speed networks cannot keep GPU clusters fully utilized.

To fully harness the potent computational capabilities of GPUs, NVIDIA InfiniBand (IB) networking stands out, providing ultra-high communication bandwidth of up to 1.6Tbps per compute node. This represents over a tenfold improvement compared to traditional networks. Key features of NVIDIA InfiniBand networking include non-blocking Fat-Tree topology, network scalability, and high-bandwidth access.

Applications of InfiniBand in Autonomous Vehicles

Autonomous vehicles (AVs) rely on sophisticated communication networks with high-speed and low-latency capabilities to facilitate real-time decision-making and seamless communication between various onboard systems. InfiniBand network technology has emerged as a notable AV solution, offering an appealing combination of high bandwidth and low-latency communication.

In the realm of autonomous driving, InfiniBand has proven beneficial in establishing connections between various onboard systems, including sensors, cameras, and control systems. It can be utilized to create networks among multiple autonomous vehicles, enabling seamless communication and coordination. One notable application of InfiniBand in autonomous vehicles involves offloading compute-intensive tasks.

The Impact of InfiniBand on Autonomous Vehicles

The excellent combination of inherent low latency and high bandwidth in InfiniBand greatly contributes to ensuring that autonomous vehicles can make real-time decisions based on the latest information. This is crucial for coping with dynamic and unpredictable environments. The high bandwidth of InfiniBand also facilitates effective communication between multiple autonomous vehicles connected in the network. This network coordination is particularly useful in scenarios requiring collaborative actions, enhancing the overall efficiency of autonomous vehicle fleets.

Conclusion

In the era of artificial intelligence, high-bandwidth, low-latency, scalable networks will become the standard. These attributes are crucial for providing robust support for large-scale model training and facilitating real-time decision-making. Let’s join hands to address the challenges of the AI era and collectively write a new chapter for the intelligent future.

How FS Can Help

Explore FS’s range of InfiniBand modules and switches, covering configurations from 100G to 800G, to meet various speed requirements such as NDR, HDR, EDR, and FDR. Whenever you need it, our knowledgeable team at FS.com is here to provide expert assistance.

Posted in Fiber Optic Network

Elevating Network Performance: Insights into Protocols, Architectures, and Solutions

In the ever-evolving field of computer networking, protocols play a crucial role in managing data exchange. One cornerstone is the OSI Seven-Layer Model, which standardizes communication between computers by dividing its complexity into layers. From the hardware-centric Physical Layer to the application-centric Application Layer, each layer contributes to seamless communication.

Understanding OSI Protocol and the Transition to RDMA in HPC

A protocol is a set of rules, standards, or agreements established for data exchange within computer networks. Formally, the OSI (Open Systems Interconnection) Seven-Layer Model is an international standard designed to meet the requirements of open networks through its seven-layer network model. Each layer has specific functions and responsibilities that facilitate communication and data exchange between computers. It is worth noting that real-world network protocols may deviate from the OSI model based on practical needs and network architecture design and implementation.

TCP/IP is a protocol suite composed of various protocols, roughly divided into four layers: the Application Layer, Transport Layer, Network Layer, and Data Link Layer. It is often regarded as a simplified, practical counterpart to the seven-layer model.

Against the backdrop of high-performance computing (HPC) and its demand for high throughput and low latency, networks have shifted from TCP/IP towards RDMA (Remote Direct Memory Access). TCP/IP has some drawbacks, including added latency and significant CPU overhead, caused by multiple context switches and CPU involvement in packet encapsulation during transmission.

RDMA, as a technology, allows direct access to memory data through the network interface without involving the operating system kernel. It enables high-throughput, low-latency network communication, making it ideal for large-scale parallel computing clusters.

Spine-Leaf Architecture vs. Traditional Three-Layer Networks

Traditional data centers typically employ a three-tier architecture, consisting of the access layer, aggregation layer, and core layer. However, traditional three-tier network architectures have significant drawbacks, which become more apparent with the development of cloud computing: bandwidth waste, large failure domains, and high latency.

The spine-leaf architecture offers significant advantages, including a flat design, low latency, and high bandwidth. In a spine-leaf network, the role of leaf switches is similar to traditional access switches, while spine switches act as core switches. When uplink capacity matches downlink capacity, this architecture can achieve non-blocking performance. Since each leaf in the structure is connected to every spine, any issue with one spine only results in a slight decrease in throughput performance for the data center.
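The effect of a spine failure can be estimated with simple arithmetic. The sketch below assumes every leaf has one equal-speed uplink to each spine and that traffic is spread evenly across them; the function name and the example figures (4 spines, 100 Gb/s uplinks) are illustrative assumptions, not values from a specific deployment.

```python
def leaf_uplink_capacity_gbps(num_spines, uplink_gbps, failed_spines=0):
    """Aggregate uplink capacity of one leaf, assuming one uplink per spine."""
    if not 0 <= failed_spines <= num_spines:
        raise ValueError("failed_spines must be between 0 and num_spines")
    return (num_spines - failed_spines) * uplink_gbps


# Hypothetical fabric: 4 spines, one 100 Gb/s uplink from each leaf to each spine.
healthy = leaf_uplink_capacity_gbps(4, 100)
degraded = leaf_uplink_capacity_gbps(4, 100, failed_spines=1)
print(f"healthy: {healthy} Gb/s, one spine down: {degraded} Gb/s "
      f"({degraded / healthy:.0%} of normal)")
```

With more spines, each individual failure removes a smaller share of the capacity, which is why the loss is described as slight.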

A Deep Dive into NVIDIA SuperPOD Architecture

SuperPOD refers to a cluster of computing nodes (servers) interconnected to provide high-throughput performance. Taking the NVIDIA DGX A100 SuperPOD as an example, the recommended configuration utilizes the QM8790 switch, offering 40 ports, each operating at 200G. The architecture employed follows a fat-tree (non-blocking) structure.

In terms of switch performance, the QM9700 introduced in the DGX H100 SuperPOD recommended configuration incorporates SHARP technology. This technology utilizes an aggregation manager to construct Streaming Aggregation Trees (SATs) within the physical topology. Multiple switches in the tree execute parallel computation, thereby reducing latency and enhancing network performance. The QM8700/8790+CX6 supports up to 2 SATs, while the QM9700/9790+CX7 supports up to 64 SATs. As the number of ports per switch increases, the number of switches needed for a cluster of a given size decreases.
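The relationship between port count and switch count can be illustrated with a rough estimate. The sketch below assumes a two-tier non-blocking fat tree in which each leaf switch uses half of its ports for endpoints and half for uplinks; the function name and the 800-endpoint example are assumptions for illustration, and real SuperPOD reference designs may differ.

```python
import math


def two_tier_fat_tree(endpoints, ports_per_switch):
    """Rough switch count for a two-tier non-blocking fat tree.

    Assumes each leaf splits its ports evenly between endpoints and uplinks.
    """
    down = ports_per_switch // 2
    max_endpoints = ports_per_switch * down  # at most `ports_per_switch` leaves
    if endpoints > max_endpoints:
        raise ValueError(f"two tiers support at most {max_endpoints} endpoints")
    leaves = math.ceil(endpoints / down)
    # Every leaf uplink must terminate on a spine port.
    spines = math.ceil(leaves * down / ports_per_switch)
    return {"leaves": leaves, "spines": spines, "total": leaves + spines}


for ports in (40, 64):  # 40-port HDR switch vs. 64-port NDR switch
    print(ports, two_tier_fat_tree(800, ports))
```

Under these assumptions, the same 800 endpoints need 60 switches with 40-port devices and 38 switches with 64-port devices.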

Switch Choices: Ethernet, InfiniBand, and RoCE Compared

The fundamental difference between Ethernet switches and InfiniBand switches lies in the distinction between the TCP/IP protocol and RDMA (Remote Direct Memory Access). Currently, Ethernet switches are more commonly used in traditional data centers, while InfiniBand switches are more prevalent in storage networks and high-performance computing (HPC) environments.

Modern data centers demand underlying interconnects with maximum bandwidth and extremely low latency. In this scenario, traditional TCP/IP network protocols prove inadequate, resulting in CPU processing overhead and high latency.

For enterprises deciding between RoCE and InfiniBand, careful consideration of unique requirements and cost factors is crucial. Those prioritizing the highest performance network connections may find InfiniBand preferable, while those seeking optimal performance, ease of management, and cost-effectiveness may choose RoCE in their data centers.

Inquiry and Answers on InfiniBand Technology

With the advancement of big data and artificial intelligence technologies, the demand for high-performance computing continues to rise. To meet this demand, the NVIDIA Quantum-2 InfiniBand platform provides users with outstanding distributed computing performance, achieving high-speed, low-latency data transmission, and processing capabilities.

FS’s InfiniBand solutions include AOC/DAC cables and modules with speeds of 800G, 400G, 200G, 100G, and 56/40G, as well as NVIDIA InfiniBand adapters and NVIDIA InfiniBand switches. In IB network cluster solutions, FS’s professional team will provide hardware connectivity solutions tailored to your needs and network scale, ensuring network stability and high performance.

For more inquiries and answers regarding InfiniBand technology, please read Inquiries and Answers about Infiniband Technology.

How FS Can Help

FS offers a rich array of products supporting RoCE or InfiniBand. Regardless of your choice, it provides lossless network solutions based on these two network connectivity options. These solutions enable users to build high-performance computing capabilities and lossless network environments. Sign up now to improve your connectivity or request a customized consultation for high-speed solution design.

Posted in data center, Networking

Accelerating Data Centers: Unveiling the Power of InfiniBand Technology

Data centers are undergoing a decisive shift towards accelerated computing, driven by the momentum of AIGC. To meet the growing demands of high-performance computing (HPC), artificial intelligence, and expanded infrastructure, there is a clear need to accelerate interconnectivity and deploy smarter network solutions. Against this backdrop, InfiniBand products have emerged as a focal point of industry attention, meticulously addressing these pressing needs.

Basics of InfiniBand

InfiniBand is a high-speed, low-latency interconnect technology primarily used in data centers and high-performance computing (HPC) environments. It provides a high-performance fabric for connecting servers, storage devices, and other network resources within clusters or data centers. Overall, InfiniBand technology offers the following advantages: high speed and scalability, low latency, and low power consumption.

InfiniBand in HPC Networking

In the field of high-performance computing (HPC), high-speed interconnect (HSI) networks play a crucial role in system performance and efficiency. As one of the fastest-developing HSI technologies, InfiniBand offers bandwidth of up to 200Gbps and point-to-point latency of less than 0.6 microseconds, providing robust support for building high-performance computing clusters.

With the high-speed networking capabilities of InfiniBand, HPC systems can effectively combine multiple servers to achieve linear performance scalability. Therefore, the importance of InfiniBand technology in the HPC field is not only reflected in enhancing the performance of computing clusters but also in providing essential support for data centers of different scales, driving the overall development of the HPC ecosystem.

For more information about InfiniBand, please refer to Getting to Know about InfiniBand.

Tips for Choosing InfiniBand Product

InfiniBand products play a crucial role in high-performance computing data centers, and choosing the right products is essential for operational success. Factors to consider include bandwidth and distance requirements, connectors, budget, compatibility, reliability, and future needs, all of which contribute to choosing the appropriate IB product.

Regarding InfiniBand network interconnect products: DAC high-speed copper cables provide an economical solution for short-distance high-speed interconnects. AOC active cables utilize optical technology to cover longer distances than DAC.

Optical modules are typically used for long-distance, high-speed interconnects. Understanding the different product categories, rates, and packaging modules helps make informed decisions, while selecting the right vendor ensures receiving high-quality InfiniBand products that meet performance and budget requirements.

200G Data Centers: Choosing Between QSFP56 and QSFP-DD as the Dominant Standard

With the rapid development of optical communication and the internet, the demand for networks has correspondingly increased, leading to a significant annual growth rate of 50% to 80% in telecom backbone network traffic. To meet user demands, the transmission rates of optical communication have continuously evolved, progressing from 10G, 25G, and 40G to the current 100G, 200G, 400G, and beyond. Currently, there are two main forms of 200G optical modules in the market: 200G QSFP56 and 200G QSFP-DD.

FS offers a full range of next-generation 200G InfiniBand and 200G Ethernet products, including 200G QSFP56 SR4, 200G QSFP56 FR4, 200G QSFP56 LR4, 200G QSFP-DD 2SR4, 200G QSFP56 AOC, 200G QSFP-DD AOC, and 200G QSFP56 DAC. Both DAC and AOC support “breakout” applications.

The 200G QSFP56 SR4 optical module is suitable for data centers, high-performance computing networks, enterprise core, and distribution layer applications.

The 200G QSFP56 FR4 transceiver is suitable for 200G Ethernet, data centers, and cloud networks.

The 200G QSFP56 LR4 transceiver is suitable for 200G Ethernet, data centers, and 5G backhaul.

The 200G QSFP-DD 2SR4 transceiver is suitable for 2×100GBASE-SR4 Ethernet, data centers, as well as switch and router connections.

200G AOC and DAC are typically used for connections between access switches and servers. In basic interconnection scenarios between access switches and servers, breakout (branch) DAC and AOC can meet diverse requirements beyond standard direct DAC and AOC connections. FS provides a range of 200G to 4x50G, 200G to 8x25G, and 200G to 2x100G DAC and AOC products, offering data centers more flexible and adaptable solutions.
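Each breakout option listed above splits the same 200G aggregate into equal branches. As a quick sanity check, the following sketch multiplies out the three combinations named in the text; the dictionary and function names are illustrative only.

```python
AGGREGATE_GBPS = 200

# Breakout options named in the text: (number of branches, rate per branch in Gb/s).
BREAKOUTS = {"4x50G": (4, 50), "8x25G": (8, 25), "2x100G": (2, 100)}


def branch_total_gbps(branches, gbps_per_branch):
    """Total bandwidth carried by all branches of a breakout cable."""
    return branches * gbps_per_branch


for name, (branches, rate) in BREAKOUTS.items():
    total = branch_total_gbps(branches, rate)
    status = "ok" if total == AGGREGATE_GBPS else "mismatch"
    print(f"200G to {name}: {branches} x {rate}G = {total}G ({status})")
```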

A Closer Look at 200G Active Optical Cables (AOC) in Data Centers

In the data center environment, a 200G AOC specifically refers to AOC designed to handle a 200 Gbps data rate. The core principle of a 200G AOC is to utilize lasers at one end to convert electrical signals into optical signals and then convert them back into electrical signals at the other end. This process ensures long-distance, high-speed, and reliable data transmission within the data center.

The landscape of 200G AOC includes variants such as 200G QSFP-DD AOC and 200G QSFP56 AOC. The former is based on the Quad Small Form-factor Pluggable Double Density (QSFP-DD) standard, known for its high density, with each port supporting 8 channels of 25G or 50G. The latter is based on the QSFP56 standard, providing an economical and efficient solution for 200G connections. A fundamental feature of 200G AOC is its ability to branch into multiple lower-speed channels, providing flexibility in various network scenarios.

The versatility of 200G AOC makes it suitable for various applications. Specific use cases include data center networking, high-performance computing (HPC), cloud computing, supercomputing and research, video production, and broadcasting.

Advantages of AOC over DAC

Compared to 200G Direct Attach Cables (DAC), 200G Active Optical Cables offer several advantages, making them a preferred choice in certain scenarios:

Longer Reach:

AOC can transmit data over longer distances compared to DAC, making them suitable for applications where the endpoints are not close.

Lighter Weight:

AOC is generally lighter than DAC, contributing to easier cable management and reduced strain on equipment.

Electromagnetic Interference (EMI) Immunity:

AOC is less susceptible to electromagnetic interference, ensuring a more stable and reliable signal transmission in environments with high interference.

How FS Can Help

FS is an elite partner of NVIDIA® and offers a rich variety of InfiniBand products on its official website, including NVIDIA® InfiniBand Switches, InfiniBand modules, InfiniBand cables, and NVIDIA® InfiniBand Adapters. FS website maintains an ample stock of InfiniBand products and ensures swift delivery. If you wish to purchase InfiniBand products or obtain InfiniBand solutions, you can contact FS for assistance.

For detailed information on purchasing 200G InfiniBand products, you can read:

200G Data Centers: Choosing Between QSFP56 and QSFP-DD as the Dominant Standard | FS Community

InfiniBand 200Gbps QSFP56 DAC/AOC Cable and Transceiver Solutions | FS Community

200G Active Optical Cables (AOC) in Data Centers | FS Community

Tips on Choosing InfiniBand Products for HPC Computing | FS Community

Posted in 200G Network, data center

Data Centre Connectivity: The Surge of Coherent Optical Transceiver Technology

According to the optical transceiver report from the Yole Group, the revenue generated by optical transceivers in 2022 was approximately $11 billion. Forecasts indicate substantial growth in this field, with projections reaching $22.2 billion by 2028.
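The two figures above imply a compound annual growth rate that can be worked out directly. The sketch below uses only the revenue numbers quoted in the text; the function name is illustrative, and the result is an implied rate derived here, not a figure taken from the report.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


# $11 billion in 2022 growing to $22.2 billion by 2028 (six years).
rate = cagr(11.0, 22.2, 2028 - 2022)
print(f"Implied CAGR 2022-2028: {rate:.1%}")
```

In other words, a doubling over six years corresponds to growth of roughly 12% per year.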

With increased investments in data centres, rapid growth in data centre traffic, and the mainstream adoption of silicon photonics technology, the data centre optical module market is undergoing a transformative phase. The shift towards silicon photonics is evident in the development and deployment of optical transceivers with higher data rates and greater efficiency. As data centre operators seek to maximize the capabilities of their infrastructure, the mainstream adoption of silicon photonics in optical transceivers has become a key trend driving the ongoing evolution of data centres. Click to learn more about the trends in the data centre optical module market: New Trends of Optical Transceiver Market in Data Centers | FS Community

Advancements in Coherent Optical Module Technology and Standardization Trends

Coherent technology has become the mainstream solution for Data Center Interconnect (DCI) applications, covering distances of 80 to 120 km in the field of data communication. Evolving applications have presented new demands for coherent optical transceiver systems, shifting the development of coherent transceiver units from initial integration with line cards and Multi-Source Agreements (MSA) transceivers to independent, standardized pluggable optical transceivers.

The latest advancements in Complementary Metal-Oxide-Semiconductor (CMOS) digital signal processor (DSP) chips and integrated photonics technology have paved the way for developing smaller, lower-power pluggable coherent optical transceivers. The trajectory of coherent optical modules applied in metropolitan and backbone networks is characterized by high speed, miniaturization, low power consumption, and standardization of interoperability.

Commercial coherent technology has advanced to enable single-wavelength 800G transmission. However, the industry still lacks standardization specifications for 800G. In comparison, 400G coherent technology has matured and follows standards such as 400ZR, OpenROADM, and OpenZR+. In the realm of standardization evolution, the next generation of beyond-400G coherent pluggable products is expected to adopt a single-wavelength 800G rate. The Optical Internetworking Forum (OIF) is deliberating on the development of the next-generation coherent technology standard, tentatively named 800ZR.

Coherent Modulation vs. PAM4 in 800G Optical Transmission

Coherent modulation used in coherent optical communication involves altering the frequency, phase, and amplitude of the optical carrier to transmit signals. Unlike intensity detection, coherent modulation requires coherent light with clear frequency and phase, primarily used for high-speed and long-distance transmission. PAM4 is suitable for high-speed, medium-short distance transmission, making it ideal for internal connections in next-generation data centres.
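One way to see the difference between the two approaches is to count how many bits each transmitted symbol carries. The sketch below compares PAM4 (four amplitude levels) with dual-polarization 16QAM, the modulation format used by 400ZR and chosen here as an example of a coherent format. The function names are illustrative, and FEC and framing overhead are ignored, so real symbol rates are higher than the values computed.

```python
import math


def bits_per_symbol(constellation_points, polarizations=1):
    """Bits carried per symbol: log2(points) on each polarization."""
    return int(math.log2(constellation_points)) * polarizations


def symbol_rate_gbaud(line_rate_gbps, constellation_points, polarizations=1):
    """Symbol rate needed for a given line rate, ignoring FEC overhead."""
    return line_rate_gbps / bits_per_symbol(constellation_points, polarizations)


print("PAM4:", bits_per_symbol(4), "bits/symbol")
print("DP-16QAM:", bits_per_symbol(16, polarizations=2), "bits/symbol")
print("100G PAM4 lane:", symbol_rate_gbaud(100, 4), "GBd")
print("400G DP-16QAM wavelength:", symbol_rate_gbaud(400, 16, 2), "GBd")
```

Because a coherent symbol uses both amplitude and phase on two polarizations, a single wavelength can carry four times the data of a PAM4 lane at the same symbol rate, at the cost of a more complex transmitter, receiver, and DSP.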

In the context of long-distance Data Center Interconnect (DCI) scenarios, PAM4 faces competition from coherent modulation based on the 400ZR protocol. As data centre speeds enter the era of 800G, the differences between PAM4 and coherent technology are gradually diminishing. The competitiveness of each technology depends on factors such as cost and power consumption.

Choosing Between InP and Silicon Photonics

In the context of coherent technology, the choice between InP (Indium Phosphide) and silicon photonics for I/Q modulators and receivers becomes crucial. Silicon photonics is cost-effective but exhibits lower performance, with a high peak voltage and limited bandwidth. In contrast, InP offers lower peak voltage and superior bandwidth but at a higher cost. In both PAM4 and coherent technologies, InP transceivers are often more expensive, while silicon photonics provides a more economical alternative.

Coherent vs. PAM4 in High-Speed Transmission

Regarding power consumption, with the evolution of chip technology from 7nm to 5nm and even 3nm, the enhancement is not limited to an increase in DSP processing rates. It also extends to superior power reduction performance.

Conclusion

Several companies have validated these methods through experiments. FS believes that with increased production and reduced costs, coherent methods can achieve cost competitiveness with PAM4 by requiring only a laser, modulator, and receiver. This remains true even as optical equipment becomes more complex. Coherent solutions also offer higher flexibility and performance, which sets them apart. In conclusion, the competition between coherent transmission technology and PAM4 transmission technology continues, with future developments determining the mainstream approach.

Read more about the detailed content on coherent modules: Advancements in Coherent Optical Module Technology and Standardization Trends | FS Community

Coherent Modulation vs. PAM4 in 800G Optical Transmission | FS Community

Posted in data center, Fiber Optic Transceiver

Harnessing the Potential of InfiniBand: Solutions for Modern Networking Challenges

InfiniBand (IB) is an advanced computer network communication standard developed by the InfiniBand Trade Association (IBTA). InfiniBand technology enjoys a high reputation in HPC connections for supercomputers, storage, and even LAN networks. InfiniBand has numerous advantages, including simplified management, high bandwidth, complete CPU offloading, ultra-low latency, cluster scalability and flexibility, quality of service (QoS), and SHARP support, among others.

InfiniBand is a critical communication technology for data transmission, suitable for various applications. It has evolved to dominate network speeds of 100G EDR or 200G HDR and is progressing towards even faster speeds like 400G NDR and 800G XDR. InfiniBand is designed to meet strict latency requirements, keeping latency very low. It excels in applications requiring rapid and precise data processing, and is commonly used for tasks such as extensive data analysis, machine learning, deep learning training, inference, conversational AI, prediction, and forecasting in supercomputing.

InfiniBand HDR Product Solutions in AI

With the advancements in ChatGPT and artificial intelligence (AI), the pace of global data center construction has accelerated, leading to changes in the scale and pattern of optical module procurement. AI data centers adopt a Fat Tree network structure, where the number of uplink and downlink ports is equal at each node. Therefore, compared to traditional data centers, there are more switches in AI data centers.

Taking NVIDIA’s AI cluster model as an example, the SuperPOD serves as the foundational unit, comprising 140 DGX A100 GPU servers, HDR InfiniBand 200G network cards, and 170 NVIDIA Quantum QM8790 switches. With an increase in the number of switches, employing a 1:1 connection approach, the requirement for port interconnection is calculated as 40×170/2=3400 ports. Considering actual deployment scenarios, this is adjusted to 4000 ports. Meanwhile, meeting entry-level requirements similar to GPT-4.0 would necessitate approximately 3,750 NVIDIA DGX A100 servers and 110,000 optical modules.
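The port calculation above can be reproduced in a couple of lines. This is a minimal sketch of the article's own estimate (switch count times ports per switch, halved because each link uses a port at both ends); the function name is illustrative.

```python
def interconnect_ports(num_switches, ports_per_switch):
    """Estimate from the text: switches x ports per switch / 2."""
    return num_switches * ports_per_switch // 2


# 170 QM8790 switches with 40 ports each.
print(interconnect_ports(170, 40))  # 3400
```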

FS provides products for the SuperPOD utilizing HDR InfiniBand 200G network interconnection. These products include InfiniBand optical devices offering high-speed, low-latency data transmission, low power consumption, and high stability. NVIDIA’s solution also mentions the use of AOC and DAC cables, typically employed for short-distance data transmission. These cable types offer high-bandwidth, low-latency, and low-power transmission solutions. Therefore, HDR AOC and DAC cables can be used in the SuperPOD to meet various distance and application requirements.

For example, the 2x200G HDR splitter cable, specifically the QSFP-2Q200G-2QAO05 cable provided by FS, is an active splitter cable based on QSFP56 VCSEL technology. It supports 2x 200Gb/s data transmission and complies with standards such as SFF-8665, RoHS, and SFF-8636. Each end of the cable is equipped with EEPROM, providing product and status monitoring information accessible to the host system. It is primarily used in Fat Tree topologies to connect 200G leaf switches and spine switches to facilitate cross-connection functionality. This allows the ports of HDR InfiniBand QSFP56 switches to operate as 2xHDR100. The 2x200G HDR splitter cable also offers numerous advantages and applications in large-scale server clusters. By maximizing port access capacity, expanding network scalability, and reducing costs, it enhances the network capabilities of IB switches.

InfiniBand HDR Product Solutions in Supercomputing

NVIDIA GPUs and networking products, particularly Mellanox HDR Quantum QM87xx switches and BlueField DPUs, have established a dominant position in the interconnect of over two-thirds of supercomputers.

InfiniBand HDR Switch

NVIDIA offers two types of InfiniBand HDR switches, namely the HDR CS8500 modular chassis switch and the QM87xx series fixed switches. The 200G HDR QM87xx switches come in two models: MQM8700-HS2F and MQM8790-HS2F. QM8700 and QM8790 switches typically serve two connection applications. One is to connect with a 200G HDR network interface card (NIC) using 200G-to-200G AOC/DAC cables for direct connection.

Another common application is to connect with a 100G HDR NIC, requiring the use of a 200G-to-2x100G cable to split one physical 200G (4x50G) QSFP56 port of the switch into two virtual 100G (2x50G) ports. After splitting, the port symbols change from x/y to x/y/z, where “x/y” represents the original symbol of the port before splitting, and “z” represents the port number (1, 2) of the resulting sub-port, with each sub-port considered as a separate port.
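The naming scheme after a split can be sketched as a small helper. The function name and the port label "1/5" are hypothetical and only illustrate the x/y to x/y/z pattern described above; this is not a switch CLI command.

```python
def split_port_names(port, sub_ports=2):
    """Return the sub-port labels x/y/z for a split port labelled x/y."""
    return [f"{port}/{z}" for z in range(1, sub_ports + 1)]


def lanes_per_sub_port(total_lanes=4, sub_ports=2):
    """A 4x50G QSFP56 port split in two leaves 2x50G per sub-port."""
    return total_lanes // sub_ports


print(split_port_names("1/5"))         # ['1/5/1', '1/5/2']
print(lanes_per_sub_port() * 50, "G")  # 100 G per sub-port
```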

InfiniBand HDR Network Interface Cards (NICs)

Compared to HDR switches, HDR Network Interface Cards (NICs) come in various types. In terms of speed, there are two options: HDR100 and HDR. In addition to the data rates for each interface, NICs of each speed can also be selected based on business requirements for single-port, dual-port, and PCIe types.

The HDR InfiniBand network architecture is straightforward yet offers a range of hardware options. For 100Gb/s speed, there are solutions like 100G EDR and 100G HDR100. At 200Gb/s speed, options include HDR and 200G NDR200. There are significant differences in the switches, NICs, and accessories used in various applications.

Conclusion

InfiniBand high-performance HDR and EDR switches, Smart NIC cards, as well as solutions combining NADDOD/Mellanox/Cisco/HPE AOC & DAC & optical module products, provide more advantageous and valuable optical network products and comprehensive solutions for applications such as data centers, high-performance computing, edge computing, and artificial intelligence. This significantly enhances customers’ business acceleration capabilities while offering low cost and excellent performance.

Click to read more related content: Exploring InfiniBand Network, HDR and Significance of IB Applications in Supercomputing | FS Communi

Advantages and Applications of 2x200G HDR Splitter | FS Community

Introducing InfiniBand HDR Products for AI | FS Community

Posted in data center, Fiber Optic Transceiver, Networking

Enhancing Data Center Networks with InfiniBand Solutions

With the rapid growth of data centers driven by expansive models, cloud computing, and big data analytics, there is an increasing demand for high-speed data transfer and low-latency communication. In this complex network ecosystem, InfiniBand (IB) technology has become a market leader, playing a vital role in addressing the challenges posed by the training and deployment of expansive models. Constructing high-speed networks within data centers requires essential components such as high-rate network cards, optical modules, switches, and advanced network interconnect technologies.

NVIDIA Quantum™-2 InfiniBand Switch

When selecting switches, NVIDIA’s QM9700 and QM9790 series stand out as the most advanced devices. Built on NVIDIA Quantum-2 architecture, they offer 64 NDR 400Gb/s InfiniBand ports within a standard 1U chassis. This translates to an individual switch providing a total bidirectional bandwidth of 51.2 terabits per second (Tb/s), along with a handling capacity exceeding 66.5 billion packets per second (BPPS).

The NVIDIA Quantum-2 InfiniBand switches extend beyond their NDR high-speed data transfer capabilities, incorporating extensive throughput, on-chip compute processing, advanced intelligent acceleration features, adaptability, and sturdy construction. These attributes make them a typical choice for high-performance computing (HPC), artificial intelligence, and expansive cloud-based infrastructures. Additionally, the integration of NDR switches helps minimize overall cost and complexity, promoting the development of data center network technologies.

Also check: Revolutionizing Data Center Networks: 800G Optical Modules and NDR Switches | FS Community
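The 51.2 Tb/s figure quoted for the switch follows from the port count and port speed. A minimal sketch of that arithmetic, with illustrative names:

```python
def switch_bandwidth_tbps(ports, port_gbps):
    """Return (unidirectional, bidirectional) aggregate bandwidth in Tb/s."""
    unidirectional = ports * port_gbps / 1000
    return unidirectional, 2 * unidirectional


# 64 NDR ports at 400 Gb/s each.
one_way, both_ways = switch_bandwidth_tbps(64, 400)
print(f"{one_way} Tb/s per direction, {both_ways} Tb/s bidirectional")
```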

ConnectX®-7 InfiniBand Card

The NVIDIA ConnectX®-7 InfiniBand network card (HCA) ASIC delivers data throughput of 400Gb/s and supports a 16-lane PCIe 5.0 or PCIe 4.0 host interface. Using SerDes technology running at 100Gb/s per lane, 400Gb/s InfiniBand is carried over OSFP connectors on both the switch and HCA ports. The OSFP connector on the switch supports two 400Gb/s or two 200Gb/s InfiniBand ports, while the HCA features one 400Gb/s InfiniBand port. The supporting interconnect range includes active and passive copper cables, transceivers, and MPO fiber cables. Notably, although both sides use OSFP packaging, their physical dimensions differ: the switch-side OSFP module is equipped with heat fins for cooling.

OSFP 800G Optical Transceiver

The OSFP-800G SR8 Module is designed for 800Gb/s 2xNDR InfiniBand systems, with a reach of up to 30m over OM3 or 50m over OM4 multimode fiber (MMF) using a wavelength of 850nm via dual MTP/MPO-12 connectors. The dual-port design is a key innovation that incorporates two internal transceiver engines, fully unleashing the potential of the switch. This allows the 32 physical interfaces to provide up to 64 400G NDR interfaces. This high-density, high-bandwidth design enables data centers to meet the growing network demands of applications such as high-performance computing, artificial intelligence, and cloud infrastructure.

FS’s OSFP-800G SR8 Module offers strong performance and dependability, giving data centers a solid optical interconnection option. It lets data centers harness the full performance of the QM9700/9790 series switches, supporting data transmission with both high bandwidth and low latency.

NDR Optical Connection Solution

To address the NDR optical connection challenge, the NDR switch ports use OSFP with eight channels per interface, each employing 100Gb/s SerDes. This allows for three mainstream connection speed options: 800G to 800G, 800G to 2X400G, and 800G to 4X200G. Additionally, each channel can be downgraded from 100Gb/s to 50Gb/s, which enables interoperability with previous-generation HDR devices. The 400G NDR series cables and transceivers offer diverse product choices for configuring network switch and adapter systems, covering data center reaches of up to 500 meters to accelerate AI computing systems. The various interconnect types, including passive copper cables (DAC), active optical cables (AOC), and optical modules with jumpers, cater to different transmission distances and bandwidth requirements, ensuring low latency and an extremely low bit error rate for high-bandwidth AI and accelerated computing applications. For deployment details, see the article Infiniband NDR OSFP Solution from FS community.
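
The three breakout options and the HDR downgrade all come from dividing the eight lanes of one OSFP interface in different ways. The following is a minimal sketch of that arithmetic, using only the lane count and lane rates stated above; the function name and structure are hypothetical.

```python
from typing import Tuple

LANES_PER_OSFP = 8    # channels per switch-side OSFP interface
NDR_LANE_GBPS = 100   # SerDes rate per lane at NDR
HDR_LANE_GBPS = 50    # downgraded lane rate for HDR interoperability


def breakout(lanes_per_port: int, lane_gbps: int = NDR_LANE_GBPS) -> Tuple[int, int]:
    """Return (number of ports, speed per port in Gb/s) for one OSFP interface."""
    if lanes_per_port <= 0 or LANES_PER_OSFP % lanes_per_port != 0:
        raise ValueError("lanes per port must evenly divide the 8 available lanes")
    return LANES_PER_OSFP // lanes_per_port, lanes_per_port * lane_gbps


# 8, 4, and 2 lanes per port give the 800G, 2X400G, and 4X200G options.
for lanes in (8, 4, 2):
    ports, speed = breakout(lanes)
    print(f"{ports} x {speed}G")

# Running the lanes at 50Gb/s yields two 200G ports for HDR devices.
print(breakout(4, HDR_LANE_GBPS))
```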

RoCE Technology for Data Transmission in HPC Networks

RDMA (Remote Direct Memory Access) enables direct data transfer between devices in a network, and RoCE (RDMA over Converged Ethernet) is a leading implementation of this technology. It improves data transmission with high speed and low latency, making it ideal for high-performance computing and cloud environments.

Definition

As a type of RDMA, RoCE is a network protocol defined in the InfiniBand Trade Association (IBTA) standard that allows RDMA over a converged Ethernet network. In short, it can be regarded as the application of RDMA technology in hyper-converged data centers, cloud, storage, and virtualized environments. It combines the benefits of RDMA technology with the familiarity of Ethernet. For a deeper look, see the article RDMA over Converged Ethernet Guide | FS Community.

Types

Generally, there are two RDMA over Converged Ethernet versions: RoCE v1 and RoCE v2. Which version can be used depends on the network adapter or card.

RoCE v1 retains the interface, transport layer, and network layer of InfiniBand (IB), and substitutes the link layer and physical layer of IB with the link layer and physical layer of Ethernet. In the link-layer data frame of a RoCE data packet, the Ethertype field value is specified by IEEE as 0x8915, which identifies it unambiguously as a RoCE data packet. However, because RoCE v1 does not adopt the IP network layer, its data packets carry no IP header. Consequently, they cannot be routed at the network layer and are restricted to forwarding within a single Layer 2 network.

Building on the original RoCE protocol, RoCE v2 introduces substantial enhancements. It replaces the InfiniBand (IB) network layer used by the original protocol with the IP network layer and adds a transport layer based on UDP. It uses the DSCP and ECN fields in the IP header to implement congestion control. Because the packets carry an IP header, RoCE v2 packets can be routed, which gives better scalability. Since RoCE v2 supersedes the original protocol, references to RoCE generally denote RoCE v2 unless the first generation is explicitly specified. Also Check- An In-Depth Guide to RoCE v2 Network | FS Community
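
The difference between the two versions is visible in the packet headers. The sketch below classifies a raw Ethernet frame as RoCE v1 or RoCE v2 and reads the DSCP and ECN bits. It makes several assumptions: frames are untagged (no VLAN header), only IPv4 is handled, and RoCE v2 is recognized by UDP destination port 4791, the IANA-registered port, which the article itself does not mention. All function names are illustrative.

```python
import struct
from typing import Tuple

ETHERTYPE_ROCE_V1 = 0x8915   # Ethertype quoted in the text above
ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17
ROCE_V2_UDP_PORT = 4791      # assumption: IANA-registered RoCE v2 port
ETH_HEADER_LEN = 14          # untagged Ethernet header (no VLAN tag)


def classify(frame: bytes) -> str:
    """Return 'RoCE v1', 'RoCE v2', or 'not RoCE' for an untagged frame."""
    if len(frame) < ETH_HEADER_LEN:
        return "not RoCE"
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype == ETHERTYPE_ROCE_V1:
        return "RoCE v1"
    if ethertype == ETHERTYPE_IPV4 and len(frame) >= ETH_HEADER_LEN + 20:
        ip_header_len = (frame[ETH_HEADER_LEN] & 0x0F) * 4
        protocol = frame[ETH_HEADER_LEN + 9]
        udp_start = ETH_HEADER_LEN + ip_header_len
        if protocol == IPPROTO_UDP and len(frame) >= udp_start + 8:
            (dst_port,) = struct.unpack("!H", frame[udp_start + 2:udp_start + 4])
            if dst_port == ROCE_V2_UDP_PORT:
                return "RoCE v2"
    return "not RoCE"


def dscp_ecn(frame: bytes) -> Tuple[int, int]:
    """Read (DSCP, ECN) from the IPv4 header of an untagged IPv4 frame."""
    tos = frame[ETH_HEADER_LEN + 1]
    return tos >> 2, tos & 0x03


def eth_header(ethertype: int) -> bytes:
    """Build a dummy Ethernet header with zeroed MAC addresses."""
    return bytes(12) + struct.pack("!H", ethertype)


def ipv4_udp(dst_port: int, dscp: int = 0, ecn: int = 0) -> bytes:
    """Build a minimal IPv4 + UDP header pair (checksums left at zero)."""
    tos = (dscp << 2) | ecn
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, tos, 28, 0, 0, 64,
                     IPPROTO_UDP, 0, bytes(4), bytes(4))
    udp = struct.pack("!HHHH", 49152, dst_port, 8, 0)
    return ip + udp


v1_frame = eth_header(ETHERTYPE_ROCE_V1)
v2_frame = eth_header(ETHERTYPE_IPV4) + ipv4_udp(ROCE_V2_UDP_PORT, dscp=26, ecn=2)
print(classify(v1_frame), classify(v2_frame), dscp_ecn(v2_frame))
```

A v1 frame is identified at Layer 2 only, which is why it cannot cross a router, whereas a v2 frame looks like ordinary UDP/IP traffic to the network.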

InfiniBand vs. RoCE

Compared with InfiniBand, RoCE offers greater versatility and relatively lower cost. It can be used to build high-performance RDMA networks and also works in traditional Ethernet networks. However, configuring switch parameters such as Headroom, PFC (Priority-based Flow Control), and ECN (Explicit Congestion Notification) can be complex. In extensive deployments, especially those featuring numerous network cards, the overall throughput of RoCE networks may be slightly lower than that of InfiniBand networks. In actual business scenarios, the two also differ considerably in business performance, scale, and operation and maintenance. For a detailed comparison, see the article InfiniBand vs. RoCE: How to choose a network for AI data center from FS community.

Benefits

RDMA over Converged Ethernet delivers low-latency, high-performance data transmission by providing direct memory access through the network interface. This minimizes CPU involvement, improving bandwidth utilization and scalability, because remote switch or server memory can be accessed without consuming CPU cycles. The zero-copy feature moves data efficiently to and from remote buffers, which further improves latency and throughput. Notably, RoCE runs on Ethernet, so it avoids the need to replace the existing Ethernet infrastructure, which can yield substantial cost savings for companies dealing with massive data volumes.

How FS Can Help

In the fast-evolving landscape of AI data center networks, selecting the right solution is paramount. Drawing on a skilled technical team and vast experience in diverse application scenarios, FS utilizes RoCE to tackle the formidable challenges encountered in High-Performance Computing (HPC).

FS offers a range of products, including NVIDIA® InfiniBand Switches, 100G/200G/400G/800G InfiniBand transceivers and NVIDIA® InfiniBand Adapters, establishing itself as a professional provider of communication and high-speed network system solutions for networks, data centers, and telecom clients.

Take action now – register for more information and experience our products through a Free Product Trial.

Revolutionize High-Performance Computing with RDMA

To address the efficiency challenges of rapidly growing data storage and retrieval within data centers, Ethernet-converged distributed storage networks are becoming increasingly popular. However, in storage networks where traffic is dominated by large flows, packet loss caused by congestion reduces data transmission efficiency and aggravates the congestion further. RDMA technology emerged to solve this series of problems.

What is RDMA?

RDMA (Remote Direct Memory Access) is an advanced technology designed to reduce the latency associated with server-side data processing during network transfers. It allows user-level applications to read from and write to remote memory directly: the kernel is bypassed, the CPU is not involved in multiple memory copies, and data is handed straight to the network card. This achieves high throughput, ultra-low latency, and minimal CPU overhead. Presently, RDMA’s transport protocol over Ethernet is RoCEv2 (RDMA over Converged Ethernet v2). RoCEv2, a connectionless protocol based on UDP (User Datagram Protocol), is faster and consumes fewer CPU resources than the connection-oriented TCP (Transmission Control Protocol).

Building Lossless Network with RDMA

RDMA networks achieve lossless transmission by deploying PFC and ECN. PFC controls the traffic of RDMA-specific queues on a link: when congestion builds at a switch’s ingress port, it applies backpressure to the upstream device. ECN provides end-to-end congestion control: packets are marked during congestion at the egress port, prompting the sending end to reduce its transmission rate.

Optimal network performance is achieved by tuning the buffer thresholds for ECN and PFC so that ECN triggers before PFC. This allows the network to keep forwarding data at full speed while the servers actively reduce their transmission rate to relieve congestion.
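
The "ECN before PFC" rule can be illustrated with a deliberately simplified model. The sketch below assumes a single queue whose depth is compared against two thresholds; real switches track ingress and egress buffers separately and often mark packets probabilistically, so this is not how any particular switch is configured. The class name, field names, and threshold values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class QueueThresholds:
    ecn_mark_kb: int   # queue depth at which packets start being ECN-marked
    pfc_xoff_kb: int   # queue depth at which a PFC pause is sent upstream

    def __post_init__(self) -> None:
        # Enforce the ordering described above: ECN must trigger first.
        if self.ecn_mark_kb >= self.pfc_xoff_kb:
            raise ValueError("ECN threshold must be lower than the PFC threshold")


def congestion_actions(queue_depth_kb: int, t: QueueThresholds) -> List[str]:
    """Return the congestion actions taken at a given queue depth."""
    actions = []
    if queue_depth_kb >= t.ecn_mark_kb:
        actions.append("ecn-mark")    # sender is asked to slow down
    if queue_depth_kb >= t.pfc_xoff_kb:
        actions.append("pfc-pause")   # upstream link is paused
    return actions


thresholds = QueueThresholds(ecn_mark_kb=200, pfc_xoff_kb=800)  # illustrative values
for depth in (100, 300, 900):
    print(depth, congestion_actions(depth, thresholds))
```

With this ordering, moderate congestion is handled by senders slowing down, and the link-level pause is reached only if the queue keeps growing.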

Accelerating Cluster Performance with GPU Direct-RDMA

The traditional TCP network relies heavily on CPU processing for packet management and often struggles to fully utilize the available bandwidth. Therefore, in AI environments, RDMA has become an indispensable network transfer technology, particularly during large-scale cluster training. It provides high-performance network transfers not only for user-space data stored in CPU memory but also for GPU-to-GPU transfers across multiple servers within a GPU cluster. GPU Direct-RDMA is a key component in optimizing HPC/AI performance, and NVIDIA enhances the performance of GPU clusters by supporting it.

Streamlining RDMA Product Selection

In building high-performance RDMA networks, essential elements like RDMA adapters and powerful servers are necessary, but success also hinges on critical components such as high-speed optical modules, switches, and optical cables. As a leading provider of high-speed data transmission solutions, FS offers a diverse range of top-quality products, including high-performance switches, 200/400/800G optical modules, smart network cards, and more. These are precisely designed to meet the stringent requirements of low-latency and high-speed data transmission.

InfiniBand: Powering High-Performance Data Centers

Driven by the booming development of cloud computing and big data, InfiniBand has become a key technology and plays a vital role at the core of the data center. But what exactly is InfiniBand technology? What attributes contribute to its widespread adoption? The following guide will answer your questions.

What is InfiniBand?

InfiniBand is an open industry standard that defines a high-speed network for interconnecting servers, storage devices, and more. It uses point-to-point bidirectional links to enable seamless communication between processors located on different servers. It is compatible with various operating systems such as Linux, Windows, and ESXi.

InfiniBand Network Fabric

InfiniBand, built on a channel-based fabric, comprises key components like HCA (Host Channel Adapter), TCA (Target Channel Adapter), InfiniBand links (connecting channels, ranging from cables to fibers, and even on-board links), and InfiniBand switches and routers (integral for networking). Channel adapters, particularly HCA and TCA, are pivotal in forming InfiniBand channels, ensuring security and adherence to Quality of Service (QoS) levels for transmissions.

InfiniBand vs Ethernet

InfiniBand was developed to address data transmission bottlenecks in high-performance computing clusters. The primary differences with Ethernet lie in bandwidth, latency, network reliability, and more.

High Bandwidth and Low Latency

It provides higher bandwidth and lower latency, meeting the performance demands of large-scale data transfer and real-time communication applications.

RDMA Support

It supports Remote Direct Memory Access (RDMA), enabling direct data transfer between node memories. This reduces CPU overhead and improves transfer efficiency.

Scalability

The fabric allows for easy scalability by connecting a large number of nodes and supporting high-density server layouts. Additional InfiniBand switches and cables can expand network scale and bandwidth capacity.

High Reliability

InfiniBand Fabric incorporates redundant designs and fault isolation mechanisms, enhancing network availability and fault tolerance. Alternate paths maintain network connectivity in case of node or connection failures.

Conclusion

The InfiniBand network has undergone rapid iterations, progressing through SDR 10Gbps, DDR 20Gbps, QDR 40Gbps, FDR 56Gbps, and EDR 100Gbps to today's HDR 200Gbps and NDR 400Gbps/800Gbps InfiniBand. For those considering deployment in their high-performance data centers, further details are available from FS.com.
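
The generation names map to link rates in a regular way. The sketch below assumes the figures above are the common 4-lane (4x) link rates, which the article does not state explicitly, and derives the per-lane rate for each generation. The 800Gbps NDR figure corresponds to two 400Gbps ports sharing one connector and is left out of the table.

```python
# Nominal 4x link rates in Gb/s, as listed in the text above.
LINK_RATE_GBPS_4X = {
    "SDR": 10,
    "DDR": 20,
    "QDR": 40,
    "FDR": 56,
    "EDR": 100,
    "HDR": 200,
    "NDR": 400,
}
LANES_PER_LINK = 4  # assumption: the quoted figures are 4-lane link rates


def per_lane_gbps(generation: str) -> float:
    """Per-lane rate implied by the nominal 4x link rate."""
    return LINK_RATE_GBPS_4X[generation] / LANES_PER_LINK


for name, rate in LINK_RATE_GBPS_4X.items():
    print(f"{name}: {rate} Gb/s link, {per_lane_gbps(name):g} Gb/s per lane")
```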

Mastering the Basics of GPU Computing

Large models are trained on clusters of machines, preferably with many GPUs per server. This article introduces the professional terminology and common network architecture of GPU computing.

Exploring Key Components in GPU Computing

PCIe Switch Chip

In high-performance GPU computing, key components such as CPUs, memory modules, NVMe storage, GPUs, and network cards are connected to one another via the PCIe (Peripheral Component Interconnect Express) bus or specialized PCIe switch chips.

NVLink

NVLink is a wire-based serial multi-lane near-range communications link developed by NVIDIA. Unlike PCI Express, a device can have multiple NVLinks, and devices use mesh networking to communicate instead of a central hub. The protocol was first announced in March 2014 and uses a proprietary high-speed signaling interconnect (NVHS). The technology supports full mesh interconnection between GPUs on the same node. The progression from NVLink 1.0 through NVLink 2.0 and NVLink 3.0 to NVLink 4.0 has significantly increased bidirectional bandwidth and improved the performance of GPU computing applications.

NVSwitch

NVSwitch is a switching chip developed by NVIDIA, designed specifically for high-performance computing and artificial intelligence applications. Its primary function is to provide high-speed, low-latency communication between multiple GPUs within the same host.

NVLink Switch

Unlike the NVSwitch, which is integrated into GPU modules within a single host, the NVLink Switch serves as a standalone switch specifically engineered for linking GPUs in a distributed computing environment.

HBM

Several GPU manufacturers have taken an innovative approach to the memory speed bottleneck: stacking multiple DRAM dies to form so-called high-bandwidth memory (HBM) and integrating it with the GPU. With this design, a GPU no longer needs to traverse the PCIe switch chip to reach its dedicated memory. As a result, data transfer speeds increase significantly, potentially by an order of magnitude or more.

Bandwidth Unit

In large-scale GPU computing training, performance is directly tied to data transfer speeds, involving pathways such as PCIe, memory, NVLink, HBM, and network bandwidth. Different bandwidth units are used to measure these data rates.
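
A common source of confusion is that network rates are usually quoted in bits per second (Gb/s), while memory and interconnect bandwidths are typically quoted in bytes per second (GB/s). The sketch below shows the conversion and a rough transfer-time estimate; the 100 GB payload is a made-up example, and protocol overhead is ignored.

```python
BITS_PER_BYTE = 8


def gbps_to_gigabytes_per_s(gigabits_per_s: float) -> float:
    """Convert a rate in Gb/s (bits) to GB/s (bytes)."""
    return gigabits_per_s / BITS_PER_BYTE


def transfer_seconds(payload_gigabytes: float, link_gbps: float) -> float:
    """Ideal time to move a payload over a link, ignoring all overhead."""
    return payload_gigabytes / gbps_to_gigabytes_per_s(link_gbps)


print(gbps_to_gigabytes_per_s(400))   # a 400 Gb/s NDR port in GB/s
print(transfer_seconds(100, 400))     # hypothetical 100 GB payload
```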

Storage Network Card

The storage network card in GPU architecture connects to the CPU via PCIe, enabling communication with distributed storage systems. It plays a crucial role in efficient data reading and writing for deep learning model training. Additionally, the storage network card handles node management tasks, including SSH (Secure Shell) remote login, system performance monitoring, and collecting related data. These tasks help monitor and maintain the running status of the GPU cluster.

For a more in-depth exploration of these terms, see the article Unveiling the Foundations of GPU Computing-1 from FS community.

High-Performance GPU Fabric

NVSwitch Fabric

In a full mesh network topology, each node is connected directly to all the other nodes. Usually, 8 GPUs are connected in a full-mesh configuration through six NVSwitch chips, also referred to as an NVSwitch fabric. This fabric provides bidirectional bandwidth between the GPUs, enabling efficient communication and supporting parallel computing tasks. The bandwidth per link depends on the NVLink generation used, such as NVLink3, and contributes to the overall performance of large-scale GPU clusters.
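
To see why switch chips are attractive for building a full mesh, it helps to count how many direct point-to-point links a mesh would otherwise need. The sketch below uses the standard n(n-1)/2 formula; the function name is illustrative.

```python
def full_mesh_links(nodes: int) -> int:
    """Number of direct links needed to connect every pair of nodes."""
    if nodes < 0:
        raise ValueError("node count must be non-negative")
    return nodes * (nodes - 1) // 2


for n in (2, 4, 8, 16):
    print(n, full_mesh_links(n))
```

The count grows quadratically, so 8 GPUs would need 28 dedicated pairwise links, which is one reason a switched fabric is used to give every GPU a path to every other GPU.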

IDC GPU Fabric

The fabric mainly consists of a computing network and a storage network. The computing network connects GPU nodes and supports the collaboration of parallel computing tasks. This involves transferring data between multiple GPUs, sharing calculation results, and coordinating the execution of massively parallel computing tasks. The storage network connects GPU nodes to storage systems to support large-scale data read and write operations. This includes loading data from the storage system into GPU memory and writing calculation results back to the storage system.

Want to know more about GPU fabric? See the article Unveiling the Foundations of GPU Computing-2 from FS community.
