CSIT/csit tg servers
FD.io CSIT recommended server specification for Traffic Generator (TRex) used in FD.io performance tests.
TODAY - Deployed Traffic Generator (TG) Servers
- All TG/TRex instances are based on Xeon servers
- SuperMicro 2 socket servers.
- Motherboards with max PCIe I/O for NIC cards.
- For Xeon SUTs, same Xeon generation for TG
- For Arm SUTs (ThunderX2)
- 2n-tx2 => shared SKX TG.
- The shared TG runs two TRex instances in parallel, one per NUMA node.
- For AMD SUTs, same AMD generation for TG
- 2n-zn2 => ZN2 TG.
- For DNV SUTs
- 2n-dnv => shared SKX TG.
- 3n-dnv => shared SKX TG.
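The testbed-to-TG assignments listed above can be written as a small lookup table. This is only an illustrative sketch: the names `TG_FOR_TESTBED` and `tg_label` are hypothetical and not part of CSIT; only the testbed names and TG platforms are taken from the list.

```python
# Hypothetical lookup table built from the list above.
# "shared" marks TG servers described as shared in the list.
TG_FOR_TESTBED = {
    "2n-tx2": {"platform": "SKX", "shared": True},
    "2n-zn2": {"platform": "ZN2", "shared": False},
    "2n-dnv": {"platform": "SKX", "shared": True},
    "3n-dnv": {"platform": "SKX", "shared": True},
}


def tg_label(testbed: str) -> str:
    """Return a human-readable TG description for a testbed name."""
    entry = TG_FOR_TESTBED[testbed]
    prefix = "shared " if entry["shared"] else ""
    return f"{prefix}{entry['platform']} TG"


if __name__ == "__main__":
    for name in TG_FOR_TESTBED:
        print(f"{name} => {tg_label(name)}")
```

Xeon SUT testbeds are left out of the table because the list only states the rule for them (same Xeon generation for the TG) without naming individual testbeds.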
FUTURE - Recommended "standardised" approach
- Standardise on a reference TG server
- Reduce amount of calibration work involved in making TRex fit for purpose for CSIT tests
- Note that separate calibration is required for TRex STL (stateless) and ASTF (advanced stateful) APIs.
- Improve TG/TRex determinism of behaviour and performance.
- Get max performance from TRex.
- Proposal
- Use Xeon ICX as the main TG server platform (2 socket servers)
- Processor: high-end ICX SKU (assume 8380, or another SKU recommended by the vendor), similar to the SKUs used for SKX (8180) and CLX (8280).
- Server/Motherboard: OEM motherboards with max PCIe Gen4 I/O (as per current practice, i.e. SuperMicro).
- NICs
- Separate onboard management connectivity.
- 4p10GbE, 2p25GbE Intel FVL.
- 2p100GbE Mellanox and Intel CVL.
- Separate 2p100GbE NIC to enable b2b calibration tests.
- Approach for lower-end SUTs
- Use the "standardised" TG 2-socket server, allocating one NUMA node per lower-end SUT testbed.
- Note: if the SUT vendor prefers to use lower-power Xeon builds, then the SUT vendor would need to take on the responsibility for calibration of TRex.
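The per-NUMA allocation for lower-end SUTs can be sketched as follows. This is a minimal illustration under stated assumptions: the 2-socket TG server is assumed to expose one NUMA node per socket (nodes 0 and 1), the function name `allocate_numa` is hypothetical, and the testbed names in the usage example are placeholders rather than real CSIT testbeds.

```python
from typing import Dict, List

# Assumption: a 2-socket TG server exposes one NUMA node per socket.
NUMA_NODES_PER_TG = 2


def allocate_numa(testbeds: List[str]) -> Dict[str, int]:
    """Assign each lower-end SUT testbed its own NUMA node on one TG server."""
    if len(testbeds) > NUMA_NODES_PER_TG:
        raise ValueError(
            f"one TG server can serve at most {NUMA_NODES_PER_TG} testbeds, "
            f"got {len(testbeds)}"
        )
    # Testbeds are assigned NUMA nodes in the order given.
    return {name: node for node, name in enumerate(testbeds)}


if __name__ == "__main__":
    # Placeholder testbed names, for illustration only.
    print(allocate_numa(["testbed-a", "testbed-b"]))
```

Each testbed would then be served by its own TRex instance on the assigned NUMA node, in the same way the currently deployed shared TG runs one TRex instance per NUMA node.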
POINTS FOR DISCUSSION
- CLOSED What is the status of TRex support for Arm, and expected performance?
- CSIT project doesn't have any experience running TRex on Arm.
- TRex documentation claims generic support for Arm, but no builds are provided, so TRex needs to be compiled from source.
- No or limited documented use of TRex on Arm servers.
- OPEN If CSIT recommends ICX Xeon for the new builds in 2021, SPR Xeon will still be faster once it arrives in the future.
- This means we will always end up with a mix of TG server platforms, as is the case today.
- But we should strive to minimize the number of TRex server variations, to reduce calibration/maintenance work.
- OPEN What about AMD?
- CSIT should encourage AMD to provide AMD based servers for TG.
- But AMD should share the responsibility for TRex calibration.