Difference between revisions of "CSIT/csit tg servers"
From fd.io
Revision as of 05:56, 1 April 2021
DRAFT FD.io CSIT recommended server specification for Traffic Generator (TRex) used in FD.io performance tests.
TODAY - Deployed Traffic Generator (TG) Servers
-  All TG/TRex instances are based on Xeon servers
- SuperMicro 2 socket servers.
- Motherboards with max PCIe I/O for NIC cards.
 
- For Xeon SUTs, same Xeon generation for TG
-  For Arm SUT ThunderX2
- 2n-tx2 => shared SKX TG.
- The shared TG runs two TRex instances in parallel, one per NUMA node.
 
-  For AMD SUTs, same AMD generation for TG
- 2n-zn2 => ZN2 TG.
 
-  For DNV SUTs
- 2n-dnv => shared SKX TG.
- 3n-dnv => shared SKX TG.
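
The testbed-to-TG assignments listed above can be written down as a small lookup table. This is only an illustrative sketch: the names `TgAssignment`, `TG_BY_TESTBED` and `trex_instances` are hypothetical and not part of CSIT, and the assumption that a non-shared TG runs a single TRex instance is ours, not stated on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TgAssignment:
    """Hypothetical record describing the TG server behind one testbed."""
    tg_platform: str  # CPU platform of the TG server, e.g. "skx" or "zn2"
    shared: bool      # True if the TG server is shared (one TRex instance per NUMA node)


# Entries copied from the list above; Xeon testbeds (same generation for TG) are omitted.
TG_BY_TESTBED = {
    "2n-tx2": TgAssignment("skx", shared=True),
    "2n-zn2": TgAssignment("zn2", shared=False),
    "2n-dnv": TgAssignment("skx", shared=True),
    "3n-dnv": TgAssignment("skx", shared=True),
}


def trex_instances(testbed: str) -> int:
    """Return the number of TRex instances on the TG server for a testbed.

    A shared TG runs two instances (one per NUMA node). Assumption: a
    non-shared TG runs a single instance.
    """
    return 2 if TG_BY_TESTBED[testbed].shared else 1
```

For example, `trex_instances("2n-tx2")` reports two instances because that testbed sits behind a shared SKX TG.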
 
FUTURE - Recommended "standardised" approach
-  Standardise on a reference TG server
- Reduce amount of work involved in TRex calibration for tests with STL and ASTF APIs.
- Improve TG/TRex determinism of behaviour and performance.
- Get max performance from TRex.
 
-  Proposal
-  Use Xeon ICX as the main TG server platform
- Processor: ICX high-end SKU (assume 8380), similar to what is used for SKX (8180) and CLX (8280).
- Server/Motherboard: OEM motherboards with max PCIe Gen4 I/O (as per current practice).
- NIC: Separate onboard management connectivity; 40Gb/100Gb/200Gb NIC cards (Nvidia, formerly Mellanox; Intel) directly connected to the DUT; a separate 100Gb card for back-to-back (b2b) calibration tests.
 
-  Optionally consider lower power Xeon builds for lower end SUTs
- e.g. for new Intel Atom / Snowridge SUTs?
-  e.g. for new Arm / Ampere SUTs?
- The risk here is that an underpowered TG may not be able to stress an Ampere 80-core processor, should we ever go that high on Arm SUTs.
- Agreed, this is a risk. We should look for a DUT vendor contribution to mitigate it; otherwise let the vendor assume the risk.
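
To put the "max PCIe Gen4 I/O" requirement next to the 40Gb/100Gb/200Gb NIC speeds in the proposal, the raw link bandwidth can be estimated from the PCIe Gen4 signalling rate (16 GT/s per lane, 128b/130b encoding). The sketch below is a rough sizing aid only: the function names are hypothetical, and it ignores TLP/DLLP protocol overhead, so real usable throughput is lower than the figures it computes.

```python
# PCIe Gen4 signals at 16 GT/s per lane and uses 128b/130b line encoding.
GEN4_GT_PER_S = 16.0
ENCODING_EFFICIENCY = 128 / 130


def pcie_gen4_gbps(lanes: int) -> float:
    """Raw per-direction bandwidth of a PCIe Gen4 link in Gb/s.

    Protocol (TLP/DLLP) overhead is not subtracted.
    """
    return GEN4_GT_PER_S * ENCODING_EFFICIENCY * lanes


def slot_fits(nic_gbps: float, lanes: int) -> bool:
    """True if the raw link bandwidth covers the NIC's total line rate."""
    return pcie_gen4_gbps(lanes) >= nic_gbps


if __name__ == "__main__":
    for nic_gbps in (40, 100, 200):
        for lanes in (8, 16):
            verdict = "ok" if slot_fits(nic_gbps, lanes) else "too slow"
            print(f"{nic_gbps}Gb NIC in Gen4 x{lanes}: "
                  f"{pcie_gen4_gbps(lanes):.1f} Gb/s raw, {verdict}")
```

A Gen4 x8 link offers roughly 126 Gb/s raw and a x16 link roughly 252 Gb/s, so a 200Gb NIC needs a full x16 slot, which is one reason to prefer motherboards that expose as many Gen4 x16 slots as possible.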
 
 
 
POINTS FOR DISCUSSION
- Is the CSIT project happy to continue sharing TG/TRex servers, e.g. one TRex instance per NUMA node/socket? If yes, we need to work out which "slower" SUTs and low-speed NICs we expect in the project going forward.
-  What is the status of TRex support for Arm, and expected performance?
- The CSIT project has no experience running TRex on Arm.
- TRex documentation claims generic support for Arm, but no builds are provided; it needs to be compiled from source.
- There is little or no documented use of TRex on Arm servers.
 
-  If CSIT recommends ICX Xeon for new builds in 2021, SPR Xeon will be faster once it arrives.
-  Meaning we always end up with a mix of TG server platforms, as is the case today.
- Is this de-risked by running more TG cores than DUT cores? If 8 TG cores are blasting traffic at 4 DUT cores, does it matter that the DUT cores are 10-20% faster? The DUT cores are inevitably doing more work than the TG cores in any case.
- Consider making better use of the PCI bus by running a 2p2nic configuration.
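
The 8 TG cores versus 4 DUT cores question above can be made concrete with a back-of-the-envelope ratio. This is a simplified sketch, not a performance model: `tg_headroom` is a hypothetical helper, and it assumes throughput scales linearly with core count and that TG and DUT cores do comparable per-packet work. The page itself notes that DUT cores do more work per packet, so the real headroom should be larger than this estimate.

```python
def tg_headroom(tg_cores: int, dut_cores: int, dut_core_speedup: float) -> float:
    """Ratio of aggregate TG core capacity to aggregate DUT core capacity.

    Capacity is measured in units of one TG core. dut_core_speedup is the
    per-core speed of a DUT core relative to a TG core, e.g. 1.2 means each
    DUT core is 20% faster. Linear scaling with core count is assumed.
    """
    return tg_cores / (dut_cores * dut_core_speedup)


if __name__ == "__main__":
    for speedup in (1.0, 1.1, 1.2):
        ratio = tg_headroom(tg_cores=8, dut_cores=4, dut_core_speedup=speedup)
        print(f"DUT cores {speedup:.1f}x TG core speed: TG headroom {ratio:.2f}x")
```

Under these assumptions, even with DUT cores 20% faster the TG side still has about 1.67x the aggregate core capacity, which supports the idea that a 2:1 core ratio absorbs a one-generation CPU gap.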
 
 
