Difference between revisions of "CSIT/csit1810 plan"
From fd.io
Latest revision as of 10:33, 21 November 2018
Introduction
This page tracks release information for FD.io CSIT-18.10. It is updated regularly by hand. Real-time information is available in the FD.io CSIT code repository and auto-generated docs.
Release Milestones
Milestone | Date | Deliverables |
---|---|---|
F0 | 2018-10-03 | Test case keywords code complete. Only low-risk changes accepted. |
RC1 | 2018-10-10 (F0+7) | Code complete. Pull first release branch. Only bug fixes accepted in release branch. Date aligned with VPP RC1. |
RC2 | 2018-10-17 (RC1+7) | Dry-run testing of VPP RC2 begins, performance and functional. Date aligned with VPP RC2. |
CSIT Release | 2018-10-24 (RC2+7) | CSIT release complete. VPP release testing starts. Date aligned with VPP Formal Release. |
Report Publish | 2018-11-07 (Rls+14) | CSIT report published for VPP release. |
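The day offsets in the table above (F0+7, RC1+7, RC2+7, Rls+14) can be checked with a short date calculation. This is a minimal sketch only: the helper name `milestone_dates` and the `STEPS` list are illustrative and not part of any CSIT tooling. The F0 date and offsets are taken from the table.

```python
from datetime import date, timedelta

# F0 date and day offsets as listed in the Release Milestones table.
F0 = date(2018, 10, 3)
STEPS = [
    ("RC1", 7),              # F0 + 7
    ("RC2", 7),              # RC1 + 7
    ("CSIT Release", 7),     # RC2 + 7
    ("Report Publish", 14),  # Rls + 14
]


def milestone_dates(start, steps):
    """Return a dict of milestone name -> date, chaining each offset
    from the previous milestone (hypothetical helper)."""
    dates = {"F0": start}
    current = start
    for name, days in steps:
        current = current + timedelta(days=days)
        dates[name] = current
    return dates


schedule = milestone_dates(F0, STEPS)
for name, day in schedule.items():
    print(f"{name}: {day.isoformat()}")
```

Each milestone is derived from the previous one, so moving F0 shifts every later date by the same amount.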
Release Deliverables
Name | Status | Jira Category | Description |
---|---|---|---|
3n-skx Tests | Done | 1810-Test | Add performance tests for 3n-skx (3-Node Xeon Skylake) testbeds: VM vhost and Container memif tests. |
2n-skx Tests | Done | 1810-Test | Add performance tests for 2n-skx (2-Node Xeon Skylake) testbeds: focus on baseline and scale tests. |
VXLAN Scale Tests | Done | 1810-Test | Add performance tests for VXLAN scale with dot1q and l2bd. |
AVF Driver Tests | Done | 1810-Test | Add performance tests for i40e AVF driver on VPP, DPDK-free. |
QAT | Done | 1810-Test | Fix recurring issues with QAT crypto accelerator cards. |
VM Vhost Virtio Params Combinations | Done | 1810-Test | Add performance tests for VM vhost with different virtio parameters combinations: indirect buffers, mergeable buffers. |
Soak Tests Experimental | WIP | 1810-Framework | Add soak performance tests infrastructure for extended test duration and throughput discovery at given PLR and total long soak time e.g. minutes, hours, days. |
VPP_Device | Done | 1810-Framework | Container based functional VPP device tests: CI/CD system design integrated into LF, running on 1-Node testbeds (1n-skx, 1n-arm). Relying on VF+dot1q for external loopback cable packet passing. Focus on few baseline tests. |
VPP_Path | Next Rls | 1810-Framework | Continuing migration of CSIT VIRL tests to VPP-make_test VPP integration tests for functional acceptance of VPP feature path(s) driven by use case(s). See P1 and P2 markup in CSIT_VIRL migration progress. |
Trending Tests BMRR | Done | 1810-Framework | Use new Burst MRR tests for daily trending. |
K8s/Ligato in Trending | Done | 1810-Test | Add K8s/Ligato Container memif tests to daily trending. |
Per VPP Patch Performance Checks | Done | 1810-Framework | Per VPP gerrit patch vs. parent performance tests, anomaly detection, no voting (-1/0/+1) yet. Manual trigger. Not "marketed" yet, as it impacts physical testbed processing capacity. |
Patch-on-Patch Infra | Done | 1810-Framework | Add capability to run performance tests using CSIT gerrit patch code testing VPP gerrit patch code, i.e. before any code is merged into git branch. |
CSIT PAPI Support | WIP | 1810-Framework | Implementation of PAPI L1 KWs in CSIT. Required for migrating away from VAT. ETA EOW, Confidence level 70%. |
Graphs Layout Improvements | Done | 1810-PAL | Improve performance graphs layout for better readability and maintenance: test grouping, axis labels, descriptions, other informative decoration. Master report generated. 744 graphs(!) |
Clock cycles per VPP node into CSIT-PAL | Next Rls | 1810-PAL | Use the new VPP stats infra available via PAPI to retrieve runtime counters instead of using "show run". Blocked by no PAPI support in FD.io CSIT. |
3n-dnv Tests (3rd Party) | WIP | 1810-Test | Publish performance tests for 3n-dnv (3-Node Atom Denverton) from 3rd party testbeds. |
Jira Task Tracking
All CSIT release deliverables should be tracked in FD.io CSIT Jira using one of the following Jira Epic categories:
- CSIT Framework
- Operations
- Test
- PAL
- VIRL
- HoneyComb
- DMM
Multi-Release Work Areas
Work Area | Description |
---|---|
Xeon Skx testbeds | Make Skylake performance test coverage complete: i) Boost tests in 2-Node setups, complete 3-Node setups; ii) Complete Memif/Container and Vhost-user/VM with latest greatest QEMU etc; iii) Push vpp-dev to Ubuntu 18.04. |
Arm testbeds | Introduce Arm performance tests. |
Atom testbeds | Introduce Atom performance tests. |
Better vhost, memif coverage | Make CSIT produce more complete test data for scaled-out Vhost-user/VM and Memif/Container: i) Complete same packet paths and topologies for a low number of VMs and Containers, then scale-up VM and Container numbers; ii) See if we can isolate the actual cost of Vhostuser-virtio and Memif-Memif virtual interfaces based on the test and system telemetry. |
VPP per patch performance tests | Productise per VPP patch performance tests with change detection, prepare for voting: i) Improve detection accuracy and precision; ii) Nail down current results variance; iii) Apply improvements to continuous trending and (future) git auto-bisection. |
Trending Improved Detection | Make trending job use new Burst MRR trending tests for better anomaly detection. Currently postponed, as the algorithm detects performance changes not related to VPP code. We need heavy workarounds or way more predictable SUT behavior. |
More VPP telemetry reported and analysed | API based consumption of VPP telemetry including existing general counters, and future extended per node counters. |
Evolve throughput search | Build upon MLRsearch experience vs. ordinary binary search: i) New POC for extended soak test for validating NDR (zero packet-loss-ratio PLR) and(?) PDR (non-zero PLR). |
General enhancements | General CSIT and VPP performance test and infrastructure enhancements: i) Productize VPP_Device container-based functional tests in 1-Node Skylake testbeds, assist with the same for Arm; ii) Add proper packet latency measurements with T-Rex HDRhistogram, push T-Rex to productize HDRhistogram; iii) Start using the new VPP stats infra for per test counters and "gauges" collection incl. "show runtime", instead of VPP show CLI; iv) Start migration from VAT to VPP Python API; v) Nail down "broken"/not-performing VPP data plane feature arcs (incl. multi-threading) indicated by CSIT-18.07 results data. |
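For context on the "Evolve throughput search" work area, the "ordinary binary search" baseline it mentions can be sketched as below. This is not MLRsearch and not CSIT code: the function `binary_search_rate`, the simulated device `simulated_loss_ratio`, its capacity, and the 0.5% PDR target are all assumptions made for illustration. The search finds the highest offered rate whose measured packet loss ratio (PLR) stays at or below a target, with target 0 corresponding to NDR and a non-zero target to PDR.

```python
def binary_search_rate(measure_loss_ratio, min_rate, max_rate,
                       target_plr=0.0, precision=1000.0):
    """Plain binary search for the highest rate (packets per second)
    whose measured loss ratio is <= target_plr.

    measure_loss_ratio is a hypothetical callable: rate -> loss ratio.
    Returns None if even min_rate exceeds the target loss ratio.
    """
    if measure_loss_ratio(max_rate) <= target_plr:
        return max_rate
    if measure_loss_ratio(min_rate) > target_plr:
        return None
    lo, hi = min_rate, max_rate  # invariant: lo passes, hi fails
    while hi - lo > precision:
        mid = (lo + hi) / 2.0
        if measure_loss_ratio(mid) <= target_plr:
            lo = mid
        else:
            hi = mid
    return lo


# Simulated device under test (assumption): drops everything offered
# above a fixed capacity, with no measurement noise.
CAPACITY_PPS = 7.3e6


def simulated_loss_ratio(rate):
    return max(0.0, (rate - CAPACITY_PPS) / rate)


ndr = binary_search_rate(simulated_loss_ratio, 1e5, 1e7, target_plr=0.0)
pdr = binary_search_rate(simulated_loss_ratio, 1e5, 1e7, target_plr=0.005)
print(f"NDR ~ {ndr:.0f} pps, PDR ~ {pdr:.0f} pps")
```

A real measurement is noisy and each trial costs testbed time, which is why the work area looks at longer soak trials rather than relying on a single pass of this kind.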
External Dependencies
- Intel FD.io Team 3rd Party Benchmarking of the Intel Denverton Platform