Intra-Pod Physical Topology Design and Routing Polarization Mitigation

Investigate and develop principled design methodologies for intra-Pod physical topologies, together with strategies to mitigate routing polarization, in three-tier leaf–spine–OCS GPU clusters. In these clusters, optical circuit switches (OCSes) provide only one-to-one inter-spine circuits, so inter-Pod bandwidth requirements may be fulfilled unevenly across spines, which can bottleneck leaf-to-spine links.

Background

OCS-based GPU clusters typically adopt a three-tier leaf–spine–OCS architecture in which OCSes interconnect Pods via circuit-based paths. Because each circuit connects exactly one pair of spine switches, this design can create routing polarization: the bandwidth required between a given pair of Pods is fulfilled unevenly across the spine switches, which leads to bottlenecks on intra-Pod leaf-to-spine links.
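To make the bottleneck concrete, the following Python sketch models a single Pod under deliberately simplified assumptions that are not taken from the paper: every leaf has one uplink of equal capacity to every spine, the demand toward one destination Pod is spread evenly over the leaves, and each leaf splits that demand across spines in proportion to the number of circuits each spine holds toward the destination. The function names and the numbers are hypothetical and chosen only for illustration.

```python
def leaf_uplink_loads(circuits_per_spine, demand, num_leaves):
    """Load that one leaf places on its uplink to each spine.

    Assumption (illustrative only): traffic toward the destination Pod is
    split across spines in proportion to their circuit counts.
    """
    total_circuits = sum(circuits_per_spine)
    if total_circuits == 0:
        raise ValueError("no circuits toward the destination Pod")
    per_leaf_demand = demand / num_leaves
    return [per_leaf_demand * c / total_circuits for c in circuits_per_spine]


def max_uplink_utilization(circuits_per_spine, demand, num_leaves,
                           uplink_capacity=1.0):
    """Worst-case leaf-to-spine utilization; values above 1.0 mean a bottleneck."""
    loads = leaf_uplink_loads(circuits_per_spine, demand, num_leaves)
    return max(loads) / uplink_capacity


# Same total of 8 circuits toward the destination Pod, placed differently.
balanced = max_uplink_utilization([2, 2, 2, 2], demand=8.0, num_leaves=4)
polarized = max_uplink_utilization([8, 0, 0, 0], demand=8.0, num_leaves=4)

print(f"balanced:  {balanced:.2f}")
print(f"polarized: {polarized:.2f}")
```

In this toy setting both placements offer the same aggregate inter-Pod bandwidth, yet concentrating all circuits on one spine forces every leaf to push its full share through a single uplink, which is the leaf-to-spine bottleneck described above.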

Prior work has examined inter-Pod physical topologies and traffic-aware logical topology design, but intra-Pod physical topology design and concrete strategies for mitigating routing polarization remain open. The authors note that these aspects are unresolved and warrant further study to ensure contention-free scheduling and robust performance in large-scale OCS-based GPU clusters.

References

Works like Interleaved Wiring have investigated the design of inter-Pod physical topologies; however, the design of intra-Pod physical topologies and strategies to mitigate routing polarization remain open for further investigation.

Leaf-centric Logical Topology Design for OCS-based GPU Clusters (2603.28168 - Han et al., 30 Mar 2026) in Section 6.2 Designing of Current OCS-based GPU Cluster (Related Work)