CSIT/csit tg servers
DRAFT FD.io CSIT recommended server specification for Traffic Generator (TRex) used in FD.io performance tests.
TODAY - Deployed Traffic Generator (TG) Servers
- All TG/TRex instances are based on Xeon servers
- SuperMicro 2-socket servers.
- Motherboards with max PCIe I/O for NIC cards.
- For Xeon SUTs, same Xeon generation for TG
- For Arm ThunderX2 SUTs
- 2n-tx2 => shared SKX TG.
- The shared TG runs two TRex instances in parallel, one per NUMA node.
- For AMD SUTs, same AMD generation for TG
- 2n-zn2 => ZN2 TG.
- For DNV SUTs
- 2n-dnv => shared SKX TG.
- 3n-dnv => shared SKX TG.
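The testbed-to-TG assignments listed above can be captured as a small lookup table. The following is only a sketch: the names `TgAssignment`, `TG_BY_TESTBED` and `shared_testbeds` are hypothetical and not part of the CSIT code base, and the ZN2 TG is treated as not shared only because the list does not mark it as shared.

```python
from typing import NamedTuple


class TgAssignment(NamedTuple):
    """Hypothetical record describing the TG server behind one testbed."""
    platform: str  # TG server platform, e.g. "SKX" or "ZN2"
    shared: bool   # True when the list above marks the TG as shared


# Entries copied from the list above; the structure itself is illustrative.
TG_BY_TESTBED = {
    "2n-tx2": TgAssignment("SKX", True),
    "2n-zn2": TgAssignment("ZN2", False),  # assumption: not marked as shared
    "2n-dnv": TgAssignment("SKX", True),
    "3n-dnv": TgAssignment("SKX", True),
}


def shared_testbeds(platform):
    """Return the sorted testbed names that use a shared TG of this platform."""
    return sorted(
        name
        for name, tg in TG_BY_TESTBED.items()
        if tg.platform == platform and tg.shared
    )


print(shared_testbeds("SKX"))
```

A table like this makes it easy to see which testbeds would be affected if a shared SKX TG were replaced by a standardised reference TG server.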
FUTURE - Recommended "standardised" approach
- Standardise on a reference TG server
- Reduce amount of work involved in TRex calibration for tests with STL and ASTF APIs.
- Improve TG/TRex determinism of behaviour and performance.
- Get max performance from TRex.
- Proposal
- Use Xeon ICX as the main TG server platform
- Processor: ICX high-end SKU (assume 8380), similar to what is used for SKX (8180) and CLX (8280).
- Server/Motherboard: OEM motherboards with max PCIe Gen4 I/O (as per current practice).
- Optionally consider lower-power Xeon builds for lower-end SUTs
- e.g. for new Intel Atom / Snowridge SUTs?
- e.g. for new Arm / Ampere SUTs?
- The risk here is that an underpowered TG may not be able to stress an Ampere 80-core processor, in case we ever go that high on Arm SUTs.
- Agreed, this is a risk. We should look for a DUT vendor contribution to mitigate it; otherwise the vendor assumes the risk.
POINTS FOR DISCUSSION
- Should the CSIT project continue to share TG/TRex servers, e.g. one TRex instance per NUMA node/socket? If yes, we need to work out which "slower" SUTs and low-speed NICs we expect in the project going forward.
- Can we use virtualization to ensure a higher degree of separation?
- What is the status of TRex support for Arm, and expected performance?
- The CSIT project has no experience running TRex on Arm.
- TRex documentation claims generic support for Arm, but no builds are provided, so TRex must be compiled from source.
- There is little or no documented use of TRex on Arm servers.
- If CSIT recommends ICX Xeon for the new builds in 2021, those TGs will be outpaced once the faster SPR Xeon arrives.
- Meaning we always end up with a mix of TG server platforms, as is the case today.
- Is this de-risked by running more TG cores than DUT cores? If 8 TG cores are blasting traffic at 4 DUT cores, does it matter that the DUT cores are 10-20% faster? Aren't the DUT cores inevitably doing more work than the TG cores in any case?
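The question above about 8 TG cores driving 4 DUT cores can be illustrated with back-of-the-envelope arithmetic. The sketch below is not a measurement: the function `tg_headroom` is hypothetical, it assumes throughput scales linearly with core count, and it assumes a TG core and a DUT core spend the same effort per packet, which is pessimistic for the TG if the DUT cores really do more work per packet.

```python
def tg_headroom(tg_cores, dut_cores, dut_core_speedup):
    """Return aggregate TG capacity divided by aggregate DUT capacity.

    Hypothetical model: linear scaling with core count, equal per-packet
    effort on both sides, and each DUT core `dut_core_speedup` times
    faster than a TG core (1.2 means 20% faster).
    """
    if tg_cores <= 0 or dut_cores <= 0 or dut_core_speedup <= 0:
        raise ValueError("core counts and speedup must be positive")
    return tg_cores / (dut_cores * dut_core_speedup)


# The 10-20% range from the discussion point above.
for speedup in (1.10, 1.20):
    ratio = tg_headroom(tg_cores=8, dut_cores=4, dut_core_speedup=speedup)
    print(f"DUT cores {speedup - 1:.0%} faster: TG headroom {ratio:.2f}x")
```

Under these assumptions the TG keeps roughly 1.8x headroom when the DUT cores are 10% faster and roughly 1.7x when they are 20% faster, so in this simplified model the core-count ratio outweighs the per-core speed difference. Real TRex scaling would still need to be confirmed by calibration.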