Difference between revisions of "CSIT/csit2001 plan"
Dwallacelf (Talk | contribs) (→Release Deliverables) |
Mackonstan (Talk | contribs) |
Revision as of 15:01, 15 January 2020
Introduction
This page tracks release information for FD.io CSIT-2001. It is updated regularly by hand. Real-time information is available in the FD.io CSIT code repository and the auto-generated docs.
Release Milestones
Milestone | Date | Deliverables |
---|---|---|
F0 | 2020-01-08 | Test case keywords code complete. Only low-risk changes accepted. |
RC1 | 2020-01-15 (F0+7) | Code complete. Pull first release branch. Only bug fixes accepted in release branch. Date aligned with VPP RC1. Start dry-runs to identify CSIT gaps on less frequently run tests. |
RC2 | 2020-01-22 (RC1+7) | Dry-run testing begins of VPP RC2, performance and functional. Date aligned with VPP RC2. |
CSIT Release | 2020-01-29 (RC2+7) | CSIT release complete. VPP release testing starts. Date aligned with VPP Formal Release. |
Report Publish | 2020-02-12 (Rls+14) | CSIT report published for VPP release. |
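The offsets in the Date column (F0+7, RC1+7, RC2+7, Rls+14) chain from the F0 date. A minimal sketch of that arithmetic, assuming only the dates and offsets shown in the table above; the names `OFFSETS` and `milestone_dates` are illustrative and not part of CSIT:

```python
from datetime import date, timedelta

# Anchor date taken from the milestone table.
F0 = date(2020, 1, 8)

# (milestone, base milestone, offset in days), as annotated in the Date column.
OFFSETS = [
    ("RC1", "F0", 7),
    ("RC2", "RC1", 7),
    ("CSIT Release", "RC2", 7),
    ("Report Publish", "CSIT Release", 14),
]


def milestone_dates(f0):
    """Derive every milestone date from the F0 date by chaining the offsets."""
    dates = {"F0": f0}
    for name, base, days in OFFSETS:
        dates[name] = dates[base] + timedelta(days=days)
    return dates


if __name__ == "__main__":
    for name, day in milestone_dates(F0).items():
        print(f"{name}: {day.isoformat()}")
```

Because each milestone is defined relative to the previous one, moving F0 moves every later date with it.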
Release Deliverables
Name | Status | Jira Category | Description |
---|---|---|---|
VPP API checks | WIP | Framework | Improving the VPP API change process to make it more reliable and reduce false positives. |
VAT to PAPI | Done | Framework | Complete VAT to PAPI migration - address the API execution efficiency for scale tests. TODO: FIB scale performance tests are still on VAT. |
Python migration | Done | Framework | Python 2.7 to 3.x migration, .md analysis and migration plan coming in gerrit. |
Performance bisection | WIP | Framework | Job for bisecting performance regressions (leveraging per patch perf test work). |
Data backend | Next Release | Framework | A standalone test data processing backend - datastore, analytics/query engine. Stop relying on Nexus as results file store. |
HDRhistogram | Done | Framework | Making use of HDRhistogram in TRex, and higher resolution of latency data for performance tests. |
Reconf tests | WIP | Methodology | Reconf tests methodology - see if any methodology improvements are required based on feedback, add more test cases. |
Perfmon | Next Release | Framework | Per VPP node efficiency - today we store elog data capturing thread barriers. For perfmon we are missing an API to catch two values for the run; we need to check whether this has been resolved. |
Per packet path telemetry | Next Release | Tools | Start with a new telemetry approach - per packet path analysis, similar to how it's done in NFVbench. See how this could be applied to NFV density tests and to all other tests. |
Emails with regressions | Done | Presentation | Trending regressions - add announce emails to csit-report. |
Improved anomaly detection | WIP | Methodology | Anomaly detection - still seeing some noise, more data doesn't seem to be helping, no pattern. Need more inside knowledge, white-box, need more telemetry data from tests to see if any correlation can be found. Affects trending anomaly detection, per patch perf, perf bisecting. |
IPsec in container | Done | Performance | vhost/memif - adding vpp-in-container with ipsec. |
More Arm testbeds for vpp_device | WIP | Device | Testbeds - Arm - adding more ThunderX machines for vpp_device to run csit-vpp and vpp-csit device tests. |
More vpp_device tests | WIP | Device | Add more vpp_device tests for better VPP API coverage, as those are executed per VPP patch and per CSIT patch. |
Arm vpp_Device per VPP patch voting | WIP | CI process | Productize per VPP patch (with voting?) vpp-csit device tests for Arm. |
HostStack Tests | WIP | Performance | New Performance Tests - Iperf3+LDP with WRK, Nginx+VCL with WRK, Quic Transport |
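The "Performance bisection" deliverable refers to a job that bisects performance regressions. The general idea can be sketched as a binary search over an ordered commit range. This is an illustration only, not the CSIT job: `first_regressed`, `measure` and `threshold` are hypothetical names, and the sketch assumes the first commit is known good, the last is known bad, and a single measurement per commit is trustworthy. The noise described under "Improved anomaly detection" means that last assumption rarely holds in practice.

```python
def first_regressed(commits, measure, threshold):
    """Return the first commit whose measured rate falls below threshold.

    Assumes commits[0] is good, commits[-1] is bad, and that there is a
    single point in the range where performance drops.
    """
    good, bad = 0, len(commits) - 1  # indices of known-good and known-bad commits
    while bad - good > 1:
        mid = (good + bad) // 2
        if measure(commits[mid]) < threshold:
            bad = mid   # regression is at mid or earlier
        else:
            good = mid  # regression is after mid
    return commits[bad]


if __name__ == "__main__":
    # Hypothetical data: the rate drops from 10.0 to 8.0 starting at commit "c6".
    commits = [f"c{i}" for i in range(10)]
    rates = {c: (10.0 if i < 6 else 8.0) for i, c in enumerate(commits)}
    print(first_regressed(commits, rates.get, threshold=9.0))
```

Each step halves the candidate range, so the number of test runs grows with the logarithm of the number of commits rather than linearly.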
Jira Task Tracking
All CSIT release deliverables should be tracked in FDio CSIT Jira using one of the following Jira Epic categories:
- Framework
- CI process
- Performance
- Device
- Methodology
- Telemetry
- Tools
- Presentation
- Honeycomb
- Aarch64
Multi-Release Work Areas
Work Area | Description |
---|---|
Xeon Skx testbeds | Make Skylake performance test coverage complete: i) Boost tests in 2-Node setups, complete 3-Node setups; ii) Complete Memif/Container and Vhost-user/VM with latest QEMU; iii) Push vpp-dev to Ubuntu 18.04. |
Arm testbeds | Introduce Arm performance tests. |
Atom testbeds | Introduce Denverton and Rangeley performance tests. |
Better vhost, memif coverage | Produce more complete test data for NFV service density: i) Scaled-out Vhost-user/VM and Memif/Container tests; ii) Test the same packet paths and NF topologies: service chains, service pipelines; iii) See if we can isolate the actual cost of Vhostuser-virtio and Memif-Memif virtual interfaces based on the test and system telemetry; iv) Test with VMs and Containers running on a single Processor (single socket), with and without core oversubscription; v) Extend the test over two Processors to quantify the impact of UPI latency (and bandwidth). |
VPP per patch performance tests | Productize per VPP patch performance tests with change detection, prepare for voting: i) Improve detection accuracy and precision; ii) Nail down current results variance; iii) Apply improvements to continuous trending and (future) git auto-bisection. |
Trending Improved Detection | Make trending job use new Burst MRR trending tests for better anomaly detection: i) Currently postponed, as the algorithm detects performance changes not related to VPP code; ii) We need heavy workarounds or way more predictable SUT behavior. |
More VPP telemetry reported and analysed | API based consumption of VPP telemetry including existing general counters, and future extended per node counters. |
Evolve throughput search | Build upon MLRsearch and PLRsearch experience vs. ordinary binary search: i) Compare MLRsearch with PLRsearch soak test results. |
General enhancements | General CSIT and VPP performance test and infrastructure enhancements: i) Productize VPP_Device container-based functional tests in 1-Node Skylake testbeds, assist with the same for Arm; ii) Add proper packet latency measurements with TRex HDRhistogram, push TRex to productize HDRhistogram; iii) Start using the new VPP stats infra for per test counters and "gauges" collection incl. "show runtime", instead of VPP show CLI; iv) Start migration from VAT to VPP Python API; v) Nail down "broken"/not-performing VPP data plane feature arcs (incl. multi-threading) indicated by CSIT-18.10 results data. |
External Dependencies
- No known external dependencies.