The Consequences Of Congestion: Why Modern High Performance Workloads Need A Modern Network Architecture

Whitepaper

Published May 2022

Today, every organization running Machine Learning, AI or HPC application workloads faces the same crippling issue: congestion in the network. Congestion can delay time-to-results for crucial scientific and enterprise research and analysis, making systems unpredictable and leaving high-cost cluster resources idle while they wait for data to arrive. Despite various brute-force attempts to resolve the congestion issue, the problem has persisted. Until now.

In this paper, Matthew Williams, CTO at Rockport Networks, explains how recent innovations in networking technologies have led to a new network architecture that targets the root causes of network congestion, specifically:

  • Why today’s network architectures are not a sustainable approach to advanced Machine Learning, AI and HPC workloads
  • How congestion and latency issues are directly tied to the network architecture
  • Why a direct interconnect network architecture minimizes congestion and tail latency