CSIT/csit tg servers

FD.io CSIT recommended server specification for Traffic Generator (TRex) used in FD.io performance tests.

TODAY - Deployed Traffic Generator (TG) Servers

  1. All TG/TRex instances are based on Xeon servers
    • SuperMicro 2 socket servers.
    • Motherboards with maximum PCIe I/O for NICs.
  2. For Xeon SUTs, same Xeon generation for TG
  3. For Arm SUT ThunderX2
    • 2n-tx2 => shared SKX TG.
    • The shared TG runs two TRex instances in parallel, one per NUMA node.
  4. For AMD SUTs, same AMD generation for TG
  5. For DNV
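
The shared-TG arrangement in item 3 (two TRex instances in parallel, one per NUMA node) can be illustrated with a minimal sketch. It is not CSIT's actual provisioning code: the `TrexInstancePlan` type, the `plan_instances` helper, the core ranges and the PCI addresses are all hypothetical, chosen only to show that each instance gets cores and NICs local to a single NUMA node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrexInstancePlan:
    """Hypothetical description of one TRex instance pinned to one NUMA node."""
    numa_node: int
    cores: tuple
    nic_pci_addrs: tuple


def plan_instances(cores_by_numa, nics_by_numa):
    """Build one plan per NUMA node that has at least one local NIC."""
    plans = []
    for node in sorted(cores_by_numa):
        nics = tuple(nics_by_numa.get(node, ()))
        if not nics:
            # No NIC attached to this node, so no TRex instance is planned there.
            continue
        plans.append(TrexInstancePlan(node, tuple(cores_by_numa[node]), nics))
    return plans


# Assumed example topology of a 2-socket TG server (values are illustrative).
cores_by_numa = {0: range(0, 4), 1: range(4, 8)}
nics_by_numa = {
    0: ["0000:18:00.0", "0000:18:00.1"],
    1: ["0000:86:00.0", "0000:86:00.1"],
}

plans = plan_instances(cores_by_numa, nics_by_numa)
for plan in plans:
    print(plan)
```

Keeping the core and NIC sets of the two instances disjoint is the point of the split: neither instance crosses the inter-socket link, so the two testbeds sharing the TG do not compete for the same resources.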

FUTURE - Recommended "standardised" approach

  1. Standardise on a reference TG server
    • Reduce amount of calibration work involved in making TRex fit for purpose for CSIT tests
      • Note that separate calibration is required for TRex STL (stateless) and ASTF (advanced stateful) APIs.
    • Improve TG/TRex determinism of behaviour and performance.
    • Get max performance from TRex.
  2. Proposal
    • Use Xeon ICX as the main TG server platform (2 socket servers)
      • Processor: ICX high-end SKU (assume 8380, or another SKU recommended by the vendor), similar to what is used for SKX (8180) and CLX (8280).
    • Server/Motherboard: OEM motherboards with maximum PCIe Gen4 I/O (as per current practice, i.e. SuperMicro).
    • NICs
      • Separate onboard management connectivity.
      • 4p10GbE, 2p25GbE Intel FVL.
      • 2p100GbE Mellanox and Intel CVL.
      • Separate 2p100GbE NIC to enable b2b calibration tests.
    • Approach for lower-end SUTs
      • Use the "standardised" 2-socket TG server, allocating one NUMA node per lower-end SUT testbed.
      • Note: if the SUT vendor prefers to use lower-power Xeon builds, the SUT vendor would need to take on responsibility for calibrating TRex.
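
The "one NUMA node per lower-end SUT testbed" approach above amounts to treating each socket of a standardised TG server as an allocatable slot. A minimal sketch, assuming 2-socket servers and hypothetical testbed and server names (`allocate_numa_slots` is an illustrative helper, not part of CSIT):

```python
from itertools import product


def allocate_numa_slots(testbeds, tg_servers, numa_nodes_per_server=2):
    """Map each testbed to a (tg_server, numa_node) slot, filling servers in order."""
    slots = list(product(tg_servers, range(numa_nodes_per_server)))
    if len(testbeds) > len(slots):
        raise ValueError(
            f"{len(testbeds)} testbeds need more than the {len(slots)} NUMA slots available"
        )
    return dict(zip(testbeds, slots))


# Hypothetical names: three lower-end testbeds sharing two standardised TG servers.
allocation = allocate_numa_slots(
    ["lowend-tb1", "lowend-tb2", "lowend-tb3"],
    ["tg-server-1", "tg-server-2"],
)
for testbed, (server, node) in allocation.items():
    print(f"{testbed}: {server}, NUMA node {node}")
```

With this scheme two lower-end testbeds share one TG server, which is how the ThunderX2 testbed already shares an SKX TG today.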

POINTS FOR DISCUSSION

  1. CLOSED What is the status of TRex support for Arm, and expected performance?
    • CSIT project doesn't have any experience running TRex on Arm.
    • TRex documentation claims generic support for Arm, but no prebuilt binaries are provided; TRex must be compiled from source.
    • No / limited documented use of TRex on Arm servers.
  2. OPEN If CSIT recommends ICX Xeon for new builds in 2021, SPR Xeon will still be faster once it arrives.
    • Meaning we always end up with a mix of TG server platforms, as is the case today.
    • But we should strive to minimize the number of TRex server variations, to reduce calibration/maintenance work.
  3. OPEN What about AMD?
    • CSIT should encourage AMD to provide AMD based servers for TG.
    • But AMD should share responsibility for TRex calibration.
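
The calibration concern in point 2 can be made concrete with a rough count. The sketch below assumes, purely for illustration, that every combination of TG platform, NIC model and TRex API (STL and ASTF, which this page states need separate calibration) is calibrated separately; the platform and NIC lists are examples, not a statement of what CSIT calibrates today.

```python
from itertools import product

# STL and ASTF are calibrated separately, as noted in the FUTURE section.
TREX_APIS = ("STL", "ASTF")


def calibration_matrix(tg_platforms, nic_models, apis=TREX_APIS):
    """List every (platform, NIC, API) combination assumed to need calibration."""
    return list(product(tg_platforms, nic_models, apis))


# Illustrative inputs only: a mixed TG fleet versus a single reference platform.
mixed = calibration_matrix(["SKX", "CLX", "ICX"], ["FVL", "CVL"])
standardised = calibration_matrix(["ICX"], ["FVL", "CVL"])

print(len(mixed), len(standardised))  # 12 4
```

Under this assumption each extra TG platform adds one calibration per NIC model and API, which is why limiting the number of TG server variations reduces the work.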