Difference between revisions of "CSIT/TestFailuresTracking"

From fd.io
(Created page with "== CSIT Test Failure Clasification == All known CSIT failures grouped and listed in the following order: * Always failing followed by sometimes failing. * Always failing test...")
 
Line 86: Line 86:
 
* All tests with 9000B payload frames not forwarded over vhostuser interfaces.
 
* All tests with 9000B payload frames not forwarded over vhostuser interfaces.
 
** work-to-fix: hard
 
** work-to-fix: hard
** rca: VPP code: [34839: dpdk: cleanup MTU handling](https://gerrit.fd.io/r/c/vpp/+/34839)
+
** rca: VPP code: [https://gerrit.fd.io/r/c/vpp/+/34839 34839: dpdk: cleanup MTU handling]
 
** test: 9000B - vhostuser
 
** test: 9000B - vhostuser
 
** frequency: always
 
** frequency: always
 
** testbed: 2n-skx, 3n-skx, 2n-clx
 
** testbed: 2n-skx, 3n-skx, 2n-clx
** examples: [3n-skx vhostuser](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2202-3n-skx/67/log.html.gz#s1-s1-s1-s1-s1)
+
** examples: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2202-3n-skx/67/log.html.gz#s1-s1-s1-s1-s1 3n-skx vhostuser]
 
** ticket: [https://jira.fd.io/browse/CSIT-1809 CSIT-1809]
 
** ticket: [https://jira.fd.io/browse/CSIT-1809 CSIT-1809]
 
** note:
 
** note:
Line 96: Line 96:
 
* All tests with 9000B payload frames not forwarded over memif interfaces.
 
* All tests with 9000B payload frames not forwarded over memif interfaces.
 
** work-to-fix: hard
 
** work-to-fix: hard
** rca: VPP code: [34839: dpdk: cleanup MTU handling](https://gerrit.fd.io/r/c/vpp/+/34839)
+
** rca: VPP code: [https://gerrit.fd.io/r/c/vpp/+/34839 34839: dpdk: cleanup MTU handling]
 
** test: 9000B - memif
 
** test: 9000B - memif
 
** frequency: always
 
** frequency: always
 
** testbed: 2n-skx, 3n-skx, 2n-clx
 
** testbed: 2n-skx, 3n-skx, 2n-clx
** examples: [2n-skx Memif](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2202-2n-skx/33/log.html.gz#s1-s1-s1-s1-s1)
+
** examples: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2202-2n-skx/33/log.html.gz#s1-s1-s1-s1-s1 2n-skx Memif]
 
** ticket: [https://jira.fd.io/browse/CSIT-1808 CSIT-1808]
 
** ticket: [https://jira.fd.io/browse/CSIT-1808 CSIT-1808]
 
** note:
 
** note:
Line 106: Line 106:
 
* 9000B payload frames not forwarded over tunnels due to violating supported Max Frame Size (VxLAN, LISP, SRv6)
 
* 9000B payload frames not forwarded over tunnels due to violating supported Max Frame Size (VxLAN, LISP, SRv6)
 
** work-to-fix: hard
 
** work-to-fix: hard
** rca: VPP code: [34839: dpdk: cleanup MTU handling](https://gerrit.fd.io/r/c/vpp/+/34839)
+
** rca: VPP code: [https://gerrit.fd.io/r/c/vpp/+/34839 34839: dpdk: cleanup MTU handling]
 
** test: 9000B - IP4 tunnels VXLAN, IP4 tunnels LISP, Srv6
 
** test: 9000B - IP4 tunnels VXLAN, IP4 tunnels LISP, Srv6
 
** frequency: always
 
** frequency: always
 
** testbed: 2n-icx, 3n-icx
 
** testbed: 2n-icx, 3n-icx
** examples: [2n-icx VXLAN](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-2n-icx/10/log.html.gz), [3n-icx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-3n-icx/22/log.html.gz#s1-s1-s1-s1-s1-t6)
+
** examples: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-2n-icx/10/log.html.gz 2n-icx VXLAN], [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-3n-icx/22/log.html.gz#s1-s1-s1-s1-s1-t6 3n-icx]
 
** ticket:
 
** ticket:
 
** note:
 
** note:
Line 120: Line 120:
 
** frequency: always
 
** frequency: always
 
** testbed: 3n-icx
 
** testbed: 3n-icx
** examples: [3n-icx ip4base](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-3n-icx/13/log.html.gz#s1-s1-s1-s1-s1-t6)
+
** examples: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-3n-icx/13/log.html.gz#s1-s1-s1-s1-s1-t6 3n-icx ip4base]
 
** ticket: [https://jira.fd.io/browse/CSIT-1885 CSIT-1885]
 
** ticket: [https://jira.fd.io/browse/CSIT-1885 CSIT-1885]
 
** note:
 
** note:
Line 130: Line 130:
 
** frequency: always
 
** frequency: always
 
** testbed: 2n-clx, 2n-icx, 2n-zn2
 
** testbed: 2n-clx, 2n-icx, 2n-zn2
** example: [2n-clx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-dpdk-perf-report-iterative-2210-2n-clx/1/log.html.gz#s1-s1-s1-s3-t6), [2n-icx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-dpdk-perf-report-iterative-2210-2n-icx/3/log.html.gz#s1-s1-s1-s1-t6)
+
** example: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-dpdk-perf-report-iterative-2210-2n-clx/1/log.html.gz#s1-s1-s1-s3-t6 2n-clx], [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-dpdk-perf-report-iterative-2210-2n-icx/3/log.html.gz#s1-s1-s1-s1-t6 2n-icx]
 
** ticket: [https://jira.fd.io/browse/CSIT-1870 CSIT-1870]
 
** ticket: [https://jira.fd.io/browse/CSIT-1870 CSIT-1870]
 
** note:
 
** note:
Line 142: Line 142:
 
** frequency: always
 
** frequency: always
 
** testbed: 2n-skx, 2n-clx, 2n-icx
 
** testbed: 2n-skx, 2n-clx, 2n-icx
** example: [2n-icx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-2n-icx/10/log.html.gz#s1-s1-s1-s1-s1)
+
** example: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-2n-icx/10/log.html.gz#s1-s1-s1-s1-s1 2n-icx]
 
** ticket: [https://jira.fd.io/browse/CSIT-1800 CSIT-1800]
 
** ticket: [https://jira.fd.io/browse/CSIT-1800 CSIT-1800]
 
** note:
 
** note:
Line 154: Line 154:
 
** frequency: always
 
** frequency: always
 
** testbeds: 2n-skx, 2n-clx, 2n-icx
 
** testbeds: 2n-skx, 2n-clx, 2n-icx
** example: [2n-icx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-2n-icx/18/log.html.gz#s1-s1-s1-s1-s11-t3), [2n-clx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-2n-clx/9/log.html.gz#s1-s1-s1-s1-s11-t1)
+
** example: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-2n-icx/18/log.html.gz#s1-s1-s1-s1-s11-t3 2n-icx], [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-2n-clx/9/log.html.gz#s1-s1-s1-s1-s11-t1 2n-clx]
 
** ticket: [https://jira.fd.io/browse/CSIT-1799 CSIT-1799]
 
** ticket: [https://jira.fd.io/browse/CSIT-1799 CSIT-1799]
 
** note:
 
** note:
Line 166: Line 166:
 
** frequency: always
 
** frequency: always
 
** testbed: 2n-skx, 2n-clx, 2n-icx
 
** testbed: 2n-skx, 2n-clx, 2n-icx
** example: [2n-icx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-2n-icx/18/log.html.gz#s1-s1-s1-s1-s2-t4)
+
** example: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-2n-icx/18/log.html.gz#s1-s1-s1-s1-s2-t4 2n-icx]
 
** ticket: [https://jira.fd.io/browse/CSIT-1884 CSIT-1884]
 
** ticket: [https://jira.fd.io/browse/CSIT-1884 CSIT-1884]
 
** note:
 
** note:
Line 182: Line 182:
 
** frequency: medium
 
** frequency: medium
 
** testbed: 2n-icx
 
** testbed: 2n-icx
** example: [2n-icx mrr](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-weekly-master-2n-icx/47/log.html.gz#s1-s1-s1-s1-s1-s1-s1), [2n-icx ndrpdr](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-icx/48/log.html.gz#s1-s1-s1-s5-s8-t1)
+
** example: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-weekly-master-2n-icx/47/log.html.gz#s1-s1-s1-s1-s1-s1-s1 2n-icx mrr], [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-icx/48/log.html.gz#s1-s1-s1-s5-s8-t1 2n-icx ndrpdr]
 
** ticket: [https://jira.fd.io/browse/CSIT-1881 CSIT-1881]
 
** ticket: [https://jira.fd.io/browse/CSIT-1881 CSIT-1881]
 
** note: Once VPP breaks, all subsequent tests fail. Even all subsequent builds will be failing until Peter makes TB working again. Although it's failing with medium frequency when it happens it breaks all subsequent builds on the TB therefore [H] priority.
 
** note: Once VPP breaks, all subsequent tests fail. Even all subsequent builds will be failing until Peter makes TB working again. Although it's failing with medium frequency when it happens it breaks all subsequent builds on the TB therefore [H] priority.
Line 194: Line 194:
 
** frequency: high
 
** frequency: high
 
** testbed: 2n-clx
 
** testbed: 2n-clx
** example: [2n-clx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/167/log.html.gz#s1-s1-s1-s2-s8-t1)
+
** example: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/167/log.html.gz#s1-s1-s1-s2-s8-t1 2n-clx]
 
** ticket: [https://jira.fd.io/browse/CSIT-1864 CSIT-1864]
 
** ticket: [https://jira.fd.io/browse/CSIT-1864 CSIT-1864]
 
** note:
 
** note:
Line 206: Line 206:
 
** frequency: high
 
** frequency: high
 
** testbed: 3n-snr
 
** testbed: 3n-snr
** example: [3n-snr](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-snr/45/log.html.gz#s1-s1-s1-s3-s12-t1)
+
** example: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-snr/45/log.html.gz#s1-s1-s1-s3-s12-t1 3n-snr]
 
** ticket: [https://jira.fd.io/browse/CSIT-1871 CSIT-1871]
 
** ticket: [https://jira.fd.io/browse/CSIT-1871 CSIT-1871]
 
** note: Sometimes 'TwentyFiveGigabitEthernetec/0/0' goes down and all subsequent tests fail.
 
** note: Sometimes 'TwentyFiveGigabitEthernetec/0/0' goes down and all subsequent tests fail.
Line 218: Line 218:
 
** frequency: high
 
** frequency: high
 
** testbed: 3n-icx
 
** testbed: 3n-icx
** examples: [3n-icx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-3n-icx/23/log.html.gz#s1-s1-s1-s1-s1-t2)
+
** examples: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2210-3n-icx/23/log.html.gz#s1-s1-s1-s1-s1-t2 3n-icx]
 
** ticket: [https://jira.fd.io/browse/CSIT-1886 CSIT-1886]
 
** ticket: [https://jira.fd.io/browse/CSIT-1886 CSIT-1886]
 
** note:
 
** note:
Line 230: Line 230:
 
** frequency: high
 
** frequency: high
 
** testbed: 3n-tsh
 
** testbed: 3n-tsh
** example: [3n-tsh](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-tsh/710/log.html.gz#s1-s1-s1-s7-s2-t1)
+
** example: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-tsh/710/log.html.gz#s1-s1-s1-s7-s2-t1 3n-tsh], [https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-master-3n-tsh/123/ 3n-tsh]
 
** ticket: [https://jira.fd.io/browse/CSIT-1877 CSIT-1877]
 
** ticket: [https://jira.fd.io/browse/CSIT-1877 CSIT-1877]
** note: 3n-alt testbed was fixed. 3n-tsh still failing. fixed: by rebuild initrd .37 on TB, [3n-tsh test log](https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-master-3n-tsh/123/)
+
** note: 3n-alt testbed was fixed. 3n-tsh still failing. fixed: by rebuild initrd .37 on TB,
 +
 
  
 
=== in trending - lower frequency failures ===
 
=== in trending - lower frequency failures ===
Line 244: Line 245:
 
** frequency: low
 
** frequency: low
 
** testbed: 3n-skx, 3n-icx, 3n-snr
 
** testbed: 3n-skx, 3n-icx, 3n-snr
** example: [3n-icx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/148/log.html.gz#s1-s1-s1-s1-s4-t1), [3n-snr](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-snr/32/log.html.gz#s1-s1-s1-s1-s4-t1), [3n-icx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-3n-icx/43/log.html.gz#s1-s1-s1-s1-s4-t1)
+
** example: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/148/log.html.gz#s1-s1-s1-s1-s4-t1 3n-icx], [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-snr/32/log.html.gz#s1-s1-s1-s1-s4-t1 3n-snr], [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-3n-icx/43/log.html.gz#s1-s1-s1-s1-s4-t1 3n-icx]
 
** ticket: [https://jira.fd.io/browse/CSIT-1827 CSIT-1827]
 
** ticket: [https://jira.fd.io/browse/CSIT-1827 CSIT-1827]
 
** note:
 
** note:
Line 256: Line 257:
 
** frequency: low
 
** frequency: low
 
** testbed: 3n-tsh, 3n-alt, 2n-clx
 
** testbed: 3n-tsh, 3n-alt, 2n-clx
** example: [2n-icx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-icx/47/log.html.gz#s1-s1-s1-s2-s37-t1)
+
** example: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-icx/47/log.html.gz#s1-s1-s1-s2-s37-t1 2n-icx]
 
** ticket: [https://jira.fd.io/browse/CSIT-1804 CSIT-1804]
 
** ticket: [https://jira.fd.io/browse/CSIT-1804 CSIT-1804]
 
** note:
 
** note:
Line 268: Line 269:
 
** frequency: low
 
** frequency: low
 
** testbed: 2n-clx, 2n-skx, 2n-tx2, 2n-icx
 
** testbed: 2n-clx, 2n-skx, 2n-tx2, 2n-icx
** example: [2n-skx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-skx/202/log.html.gz#s1-s1-s1-s2-s4-t3), [2n-clx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/152/log.html.gz#s1-s1-s1-s5-s12-t3)
+
** example: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-skx/202/log.html.gz#s1-s1-s1-s2-s4-t3 2n-skx], [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/152/log.html.gz#s1-s1-s1-s5-s12-t3 2n-clx]
 
** ticket: [https://jira.fd.io/browse/CSIT-1802 CSIT-1802]
 
** ticket: [https://jira.fd.io/browse/CSIT-1802 CSIT-1802]
 
** note: This is mainly observed in iterative and coverage. It's very low frequency ~ 1 out of 100
 
** note: This is mainly observed in iterative and coverage. It's very low frequency ~ 1 out of 100
Line 280: Line 281:
 
** frequency: low
 
** frequency: low
 
** testbed: all testbeds
 
** testbed: all testbeds
** example: [2n-zn2](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-zn2/639/log.html.gz#s1-s1-s1-s2-s18-t3), [3n-icx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/152/log.html.gz#s1-s1-s1-s5-s1-t2)
+
** example: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-zn2/639/log.html.gz#s1-s1-s1-s2-s18-t3 2n-zn2], [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/152/log.html.gz#s1-s1-s1-s5-s1-t2 3n-icx]
 
** ticket: [https://jira.fd.io/browse/CSIT-1782 CSIT-1782]
 
** ticket: [https://jira.fd.io/browse/CSIT-1782 CSIT-1782]
 
** note: A long standing issue without a final permanent fix.
 
** note: A long standing issue without a final permanent fix.
Line 292: Line 293:
 
** frequency: high
 
** frequency: high
 
** testbed: 2n-dnv and 3n-dnv
 
** testbed: 2n-dnv and 3n-dnv
** example: [2n-dnv](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1264/log.html.gz#s1-s1-s1-s1-s3-t1), [3n-dnv](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-dnv/1274/log.html.gz#s1-s1-s1-s2-s1-t1)
+
** example: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1264/log.html.gz#s1-s1-s1-s1-s3-t1 2n-dnv], [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-dnv/1274/log.html.gz#s1-s1-s1-s2-s1-t1 3n-dnv]
 
** ticket: [/VPP-2010](https://jira.fd.io/browse/VPP-2010)
 
** ticket: [/VPP-2010](https://jira.fd.io/browse/VPP-2010)
 
** note: TODO VPP to fix speed_capability.
 
** note: TODO VPP to fix speed_capability.
Line 304: Line 305:
 
** frequency: low
 
** frequency: low
 
** testbed: 2n-zn2, 2n-skx, 2n-icx, 2n-clx
 
** testbed: 2n-zn2, 2n-skx, 2n-icx, 2n-clx
** example: [2n-icx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-icx/160/log.html.gz#s1-s1-s1-s2-s35-t1), [2n-clx](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/164/log.html.gz#s1-s1-s1-s2-s54-t1)
+
** example: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-icx/160/log.html.gz#s1-s1-s1-s2-s35-t1 2n-icx], [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-ndrpdr-weekly-master-2n-clx/164/log.html.gz#s1-s1-s1-s2-s54-t1 2n-clx]
 
** ticket: [https://jira.fd.io/browse/CSIT-1795 CSIT-1795]
 
** ticket: [https://jira.fd.io/browse/CSIT-1795 CSIT-1795]
 
** note:
 
** note:
Line 316: Line 317:
 
** frequency: low
 
** frequency: low
 
** testbeds: 2n-dnv
 
** testbeds: 2n-dnv
** examples: [2n-dnv](https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1264/log.html.gz#s1-s1-s1-s1-s7-t4)
+
** examples: [https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-dnv/1264/log.html.gz#s1-s1-s1-s1-s7-t4 2n-dnv]
 
** ticket: [https://jira.fd.io/browse/CSIT-1850 CSIT-1850]
 
** ticket: [https://jira.fd.io/browse/CSIT-1850 CSIT-1850]
 
** note:
 
** note:

Revision as of 12:59, 5 December 2022

CSIT Test Failure Classification

All known CSIT failures grouped and listed in the following order:

  • Always failing followed by sometimes failing.
  • Always failing tests:
    • Most common use cases followed by less common.
  • Sometimes failing tests:
    • Most frequently failing followed by less frequently failing.
    • Within each sub-group: most common use cases followed by less common.

CSIT Test Fixing Priorities

  • Test fixing work priorities are defined as follows:
    • (H)igh priority, most common use cases and most common test code.
    • (M)edium priority, specific HW and pervasive test code issues.
    • (L)ow priority, corner cases and external dependencies.
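The ordering and priority rules above can be sketched as a sort key. This is a minimal illustration only, not tooling used by CSIT: the `Failure` record, the rank tables and `order_failures` are hypothetical names, and using the (H)/(M)/(L) label as a stand-in for "most common use cases first" is an assumption. The split between "in trending" and "not in trending" is not modelled.

```python
from dataclasses import dataclass

# Labels are taken from the page; the numeric ranks are assumptions
# made for this sketch ("always" sorts ahead of every "sometimes" level).
FREQUENCY_RANK = {"always": 0, "high": 1, "medium": 2, "low": 3}
PRIORITY_RANK = {"H": 0, "M": 1, "L": 2}


@dataclass
class Failure:
    """Hypothetical record holding a few of the fields listed per failure."""
    ticket: str
    priority: str   # "H", "M" or "L"
    frequency: str  # "always", "high", "medium" or "low"


def order_failures(failures):
    """Sort by failure frequency first, then by fixing priority."""
    return sorted(
        failures,
        key=lambda f: (FREQUENCY_RANK[f.frequency], PRIORITY_RANK[f.priority]),
    )


# A handful of entries from this page, deliberately listed out of order.
SAMPLE = [
    Failure("CSIT-1782", "L", "low"),
    Failure("CSIT-1864", "M", "high"),
    Failure("CSIT-1799", "L", "always"),
    Failure("CSIT-1827", "M", "low"),
    Failure("CSIT-1882", "H", "always"),
]

if __name__ == "__main__":
    for failure in order_failures(SAMPLE):
        print(failure.ticket, failure.priority, failure.frequency)
```

Because `sorted` is stable, entries with the same frequency and priority keep the order in which they were listed.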

Always Failing Tests

In Trending

(H) 2n-clx, 2n-zn2: VPP RDMA tests no traffic forwarded

  • (H) 2n-clx, 2n-zn2: all RDMA tests failing with cli_inband clear runtime command
    • work-to-fix: easy
    • rca:
    • test: all RDMA with CX556A NIC
    • frequency: always
    • testbed: 2n-clx, 2n-zn2
    • example: 2n-clx, 2n-zn2, 2n-clx
    • ticket: CSIT-1882
    • note:

(M) 3n-snr: hwasync Wireguard failing to verify device

  • (M) 3n-snr: All hwasync wireguard tests failing when trying to verify device
    • work-to-fix: easy
    • rca: Failed to bind PCI device 0000:f4:00.0 to c4xxx on host 10.30.51.93
    • test: hwasync wireguard
    • frequency: always
    • testbed: 3n-snr
    • example: 3n-snr
    • ticket: CSIT-1883
    • note:

(M) 1n-aws: TRex mlrsearch fails to find NDR & PDR due to AWS rate limiting (5min total test duration)

  • (M) 1n-aws: TRex NDR PDR ALL IP4 scale and L2 scale tests failing with 50% packet loss
    • work-to-fix: hard
    • rca:
    • test: ip4scale2m
    • frequency: always
    • testbed: 1n-aws
    • example: 1n-aws
    • ticket: CSIT-1876
    • note: The root cause may be the shared environment in the AWS cloud.

(M) 3n-alt, 3n-snr: testpmd no traffic forwarded

  • (M) 3n-alt, 3n-snr: testpmd tests fail with no traffic
    • work-to-fix: hard
    • rca:
    • test: testpmd
    • frequency: always
    • testbed: 3n-alt, 3n-snr
    • example: 3n-alt, 3n-snr, 3n-snr
    • ticket: CSIT-1848
    • note:

not in trending

(H) 3n-icx: vpp hoststack QUIC vppecho tests failing

  • (H) 3n-icx: QUIC vppecho BPS tests failing on timeout when checking hoststack finished
    • work-to-fix: easy
    • rca:
    • test: Quic vppecho BPS
    • frequency: always
    • testbed: 3n-skx, 3n-icx
    • example: 3n-icx
    • ticket: CSIT-1835
    • note:

(M) all testbeds: vpp 9000B tests with vhostuser, memif, tunnels, avf

  • 9000B payload frames not forwarded over tunnels due to violating supported Max Frame Size (VxLAN, LISP, SRv6)
  • (M) 3n-icx: 9000b ip4 ip6 l2 NDRPDR AVF tests are failing to forward traffic
    • work-to-fix: hard
    • rca:
    • test: 9000B - IP4, IP6, l2 - base and scale
    • frequency: always
    • testbed: 3n-icx
    • examples: 3n-icx ip4base
    • ticket: CSIT-1885
    • note:
  • (M) 2n-clx, 2n-icx, 2n-zn2: DPDK testpmd 9000b tests on xxv710 nic are failing with no traffic
    • work-to-fix: hard
    • rca:
    • test: DPDK testpmd 9000b tests on xxv710 nic
    • frequency: always
    • testbed: 2n-clx, 2n-icx, 2n-zn2
    • example: 2n-clx, 2n-icx
    • ticket: CSIT-1870
    • note:

(M) 2n-clx, 2n-icx: all Geneve tests with 1024 tunnels fail

  • (M) All Geneve L3 mode scale tests (1024 tunnels) are failing
    • work-to-fix: hard
    • rca: VPP crash, Failed to add IP neighbor on interface geneve_tunnel258
    • test: avf-ethip4--ethip4udpgeneve-1024tun-ip4base 64B 1518B IMIX 1c 2c 4c
    • frequency: always
    • testbed: 2n-skx, 2n-clx, 2n-icx
    • example: 2n-icx
    • ticket: CSIT-1800
    • note:

(L) 2n-clx, 2n-icx: nat44ed cps 16M sessions scale fail

  • (L) All NAT44-ED 16M sessions CPS scale tests fail while setting NAT44 address range.
    • work-to-fix: hard
    • rca: VPP crash, Failed to set NAT44 address range on host 10.30.51.44 (connections-per-second tests only)
    • test: 64B-avf-ethip4tcp-nat44ed-h262144-p63-s16515072-cps-ndrpdr 1c 2c 4c, 64B-avf-ethip4udp-nat44ed-h262144-p63-s16515072-cps-ndrpdr 1c 2c 4c
    • frequency: always
    • testbeds: 2n-skx, 2n-clx, 2n-icx
    • example: 2n-icx, 2n-clx
    • ticket: CSIT-1799
    • note:

(L) 2n-clx, 2n-icx: nat44det imix 1M sessions fails to create sessions

  • (L) 2n-clx, 2n-icx: All NAT44DET NDR PDR IMIX over 1M sessions BIDIR tests failing to create enough sessions
    • work-to-fix: hard
    • rca:
    • test: IMIX over 1M sessions bidir
    • frequency: always
    • testbed: 2n-skx, 2n-clx, 2n-icx
    • example: 2n-icx
    • ticket: CSIT-1884
    • note:

Sometimes failing tests

in trending - high frequency failures

(H) 2n-icx: NFV density VPP does not start in container

  • (H) 2n-icx: NFV density tests break VPP, which then fails to start (re-opened)
    • work-to-fix: hard
    • rca:
    • test: all subsequent
    • frequency: medium
    • testbed: 2n-icx
    • example: 2n-icx mrr, 2n-icx ndrpdr
    • ticket: CSIT-1881
    • note: Once VPP breaks, all subsequent tests fail, and all subsequent builds keep failing until Peter gets the TB working again. Although the failure occurs with medium frequency, each occurrence breaks all subsequent builds on the TB, hence the (H) priority.

(M) 2n-clx: e810 mlrsearch tests packets forwarding in one direction

  • (M) 2n-clx: half of the packets lost on PDR tests (re-opened)
    • work-to-fix: hard
    • rca:
    • test: e810Cq ip4base, ip6base
    • frequency: high
    • testbed: 2n-clx
    • example: 2n-clx
    • ticket: CSIT-1864
    • note:

(M) 3n-snr: 25GE links randomly going down between snr/sut and icx/tg-trex

  • (M) 3n-snr: 25GE interface between SUT and TG/TRex goes down randomly
    • work-to-fix: hard
    • rca:
    • test: all subsequent
    • frequency: high
    • testbed: 3n-snr
    • example: 3n-snr
    • ticket: CSIT-1871
    • note: Sometimes 'TwentyFiveGigabitEthernetec/0/0' goes down and all subsequent tests fail.

(M) 3n-icx: wireguard 1k tunnels mlrsearch tests failing with 2c and 4c

  • (M) 3n-icx: Wireguard tests with 100 or more tunnels are failing PDR criteria
    • work-to-fix: easy
    • rca:
    • test: wireguard 100 tunnels and more
    • frequency: high
    • testbed: 3n-icx
    • examples: 3n-icx
    • ticket: CSIT-1886
    • note:

(M) 3n-tsh: vpp in VM not starting

  • (M) 3n-tsh: VM tests failing to boot VM
    • work-to-fix: easy
    • rca:
    • test: 3n-tsh: sporadic VM vhost
    • frequency: high
    • testbed: 3n-tsh
    • example: 3n-tsh, 3n-tsh
    • ticket: CSIT-1877
    • note: 3n-alt testbed was fixed. 3n-tsh is still failing. Fixed by rebuilding initrd .37 on TB.


in trending - lower frequency failures

(M) 3n-icx, 3n-snr: 1518B IPsec packets not passing

  • (M) 3n-icx, 3n-skx, 3n-snr: all 1518B AVF crypto tests failed with no traffic, all IMIX AVF crypto with excessive packet loss
    • work-to-fix: hard
    • rca:
    • test: all AVF crypto
    • frequency: low
    • testbed: 3n-skx, 3n-icx, 3n-snr
    • example: 3n-icx, 3n-snr, 3n-icx
    • ticket: CSIT-1827
    • note:

(M) all testbeds: mlrsearch fails to find NDR rate

  • (M) 3n-tsh, 3n-alt, 2n-clx testbed (Taishan, Altra, Cascade-lake): NDR tests failing from time to time.
    • work-to-fix: hard
    • rca:
    • test: Crypto, Ip4, L2, Srv6, Vm Vhost (all packet sizes, all core configurations affected)
    • frequency: low
    • testbed: 3n-tsh, 3n-alt, 2n-clx
    • example: 2n-icx
    • ticket: CSIT-1804
    • note:

(M) all testbeds: AF_XDP mlrsearch fails to find NDR rate

  • (M) all testbeds: AF-XDP - NDR tests failing from time to time
    • work-to-fix: hard
    • rca:
    • test: af-xdp multicore tests
    • frequency: low
    • testbed: 2n-clx, 2n-skx, 2n-tx2, 2n-icx
    • example: 2n-skx, 2n-clx
    • ticket: CSIT-1802
    • note: This is mainly observed in iterative and coverage jobs. The frequency is very low, roughly 1 out of 100.

(L) all testbeds: vpp create avf interface failure in multi-core configs

  • (L) multicore AVF tests are failing when trying to create interface
    • work-to-fix: hard
    • rca: issue in Intel FVL driver
    • test: multicore AVF
    • frequency: low
    • testbed: all testbeds
    • example: 2n-zn2, 3n-icx
    • ticket: CSIT-1782
    • note: A long-standing issue without a final permanent fix.

(L) 2n-dnv, 3n-dnv: x557 auto-negotiating 1ge instead of 10ge

  • (L) T-Rex STL runtime error
    • work-to-fix: hard
    • rca: VPP code - X557 speed_capability set 1GE instead of 10GE
    • test: all tests
    • frequency: high
    • testbed: 2n-dnv and 3n-dnv
    • example: 2n-dnv, 3n-dnv
    • ticket: VPP-2010 (https://jira.fd.io/browse/VPP-2010)
    • note: TODO VPP to fix speed_capability.

(L) all testbeds: nat44det 4M and 16M scale 1 session not established

  • (L) Not all DET44 sessions have been established: 4128767 != 4128768
    • work-to-fix: hard
    • rca:
    • test: nat44det udp 4m and 16m (64k and 1m are ok)
    • frequency: low
    • testbed: 2n-zn2, 2n-skx, 2n-icx, 2n-clx
    • example: 2n-icx, 2n-clx
    • ticket: CSIT-1795
    • note:

(L) 2n-dnv: nat44ed 1518B 64k sessions not establishing all sessions

  • (L) 2n-dnv: sporadic 1518B tput tests failing to establish required sessions
    • work-to-fix: hard
    • rca:
    • test: 1518B tput
    • frequency: low
    • testbeds: 2n-dnv
    • examples: 2n-dnv
    • ticket: CSIT-1850
    • note: