Difference between revisions of "VPP/AArch64"

From fd.io
< VPP
(Meeting Minutes)
(Meeting Minutes)
 
(468 intermediate revisions by 14 users not shown)
Line 4: Line 4:
 
=== Meeting Details ===
 
=== Meeting Details ===
  
* Regular AArch64 meeting: [https://zoom.us/my/fastdata Tuesdays at 06:00 PT (Pacific Time)] (weekly). [http://www.thetimezoneconverter.com/?t=06:00&tz=PT%20%28Pacific%20Time%29 Convert to your timezone.]
+
* Regular AArch64 meeting: 1st and 3rd Tuesdays of every month at 06:00 PT (Pacific Time) (biweekly). [http://www.thetimezoneconverter.com/?t=06:00&tz=PT%20%28Pacific%20Time%29 Convert to your timezone.]
** [https://zoom.us/my/fastdata FD.io Zoom Meeting room ]
+
** [https://zoom.us/my/fastdata?pwd=Z3Z0UnJyUmRIMlU3eTJLcGF6VEptQT09 FD.io Zoom Meeting room ]
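The biweekly schedule above (1st and 3rd Tuesdays of every month) can be turned into concrete dates with a few lines of Python. This is only a convenience sketch: the function name <code>first_and_third_tuesdays</code> is made up for illustration, and it computes calendar dates only, leaving the 06:00 PT conversion to the timezone link above.

```python
import calendar
from datetime import date
from typing import List


def first_and_third_tuesdays(year: int, month: int) -> List[date]:
    """Return the 1st and 3rd Tuesday of the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    tuesdays = [
        date(year, month, day)
        for day in range(1, days_in_month + 1)
        if date(year, month, day).weekday() == calendar.TUESDAY
    ]
    # Every month has at least four Tuesdays, so indexes 0 and 2 always exist.
    return [tuesdays[0], tuesdays[2]]


for meeting_day in first_and_third_tuesdays(2023, 11):
    print(meeting_day.isoformat(), "06:00 PT")
```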
  
 
=== IRC Channel ===
 
=== IRC Channel ===
  
 
'''<code>#fdio-arm</code>''' on <code>freenode.net</code>
 
'''<code>#fdio-arm</code>''' on <code>freenode.net</code>
 +
 +
=== Slack ===
 +
 +
Request invitation at https://slack.fd.io/
  
 
=== Jira ===
 
=== Jira ===
Line 18: Line 22:
  
 
* [https://schd.ws/hosted_files/fdiominisummitatkubeconeu20/aa/kubecon_fdio_brooks.pdf The path to Fast Data on Arm] [pdf] - FD.io Mini-Summit at KC+CNC EU 2018
 
* [https://schd.ws/hosted_files/fdiominisummitatkubeconeu20/aa/kubecon_fdio_brooks.pdf The path to Fast Data on Arm] [pdf] - FD.io Mini-Summit at KC+CNC EU 2018
 +
* [https://www.youtube.com/watch?v=T7za89oBZtw&t=79s Vector Packet Processing (VPP) Arm Story: Now and Beyond] [youtube] - FD.io Mini-summit at KC+CNC NA 2018
  
 
== Release Milestones ==
 
== Release Milestones ==
Line 41: Line 46:
 
* [https://jenkins.fd.io/computer/ '''CI build servers'''] integrated into Jenkins
 
* [https://jenkins.fd.io/computer/ '''CI build servers'''] integrated into Jenkins
  
* [https://wiki.fd.io/view/CSIT/fdio_csit_lab_ext_lld_draft '''CSIT test beds'''] (''under construction'')
+
* [https://github.com/FDio/csit/blob/master/docs/lab/testbed_specifications.md '''CSIT testbed specifications''']
  
 
{| class="wikitable"
 
{| class="wikitable"
Line 56: Line 61:
 
! Distro
 
! Distro
 
|-
 
|-
| [https://softiron.com/development-tools/overdrive-1000/ SoftIron OverDrive 1000] || CI build server || Running in CI || softiron-1 || 10.30.51.12 || N/A || 4 || 8GB || || openSUSE
+
| [https://www.marvell.com/server-processors/thunderx-arm-processors/ Marvell ThunderX] || VPP dev debug server|| Running || vpp-marvell-dev || 10.30.51.38 || 10.30.50.38 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 18.04.4
 +
|-
 +
| || CI build server|| Running in Nomad || s53-nomad || 10.30.51.39 || 10.30.50.39 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 18.04.4
 +
|-
 +
| || CI build server|| Running in Nomad || s54-nomad || 10.30.51.40 || 10.30.50.40 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 18.04.4
 +
|-
 +
| || CI build server || Running in Nomad || s52-nomad || 10.30.51.65 || 10.30.50.65 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
 +
|-
 +
| || CI build server || Running in Nomad || s51-nomad || 10.30.51.66 || 10.30.50.66 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
 
|-
 
|-
| || CI build server || Running in CI || softiron-2 || 10.30.51.13 || N/A || 4 || 8GB || || openSUSE
+
| || CI build server || Running in Nomad || s49-nomad || 10.30.51.67 || 10.30.50.67 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
 
|-
 
|-
| || CI build server || Running || softiron-3 || 10.30.51.14 || N/A || 4 || 8GB || || openSUSE
+
| || CI build server || Running in Nomad || s50-nomad || 10.30.51.68 || 10.30.50.68 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
 
|-
 
|-
| [https://cavium.com/product-thunderx-arm-processors.html Cavium ThunderX] || CI build server || Running in CI || nomad3arm || 10.30.51.38 || 10.30.50.38 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 16.04
+
| [https://www.marvell.com/server-processors/thunderx2-arm-processors/ Marvell ThunderX2] || Perf DUT candidate || Running || s27-t13-sut1 || 10.30.51.69 || 10.30.50.69 || 224 || 128GB || 3x40GbE QSFP+ XL710-QDA2 || Ubuntu 18.04.2
 
|-
 
|-
| || CI build server || Running in CI || nomad4arm || 10.30.51.39 || 10.30.50.39 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 16.04
+
| || VPP device server || Running in Nomad || s55-t36-sut1 || 10.30.51.70 || 10.30.50.70 || 256 || 256GB || 2x40GbE QSFP+ XL710-QDA2 / 2x10/25GE SFP+ ConnectX5 || Ubuntu 18.04.4
 
|-
 
|-
| || CI build server || Running in CI || nomad5arm || 10.30.51.40 || 10.30.50.40 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 16.04
+
| || VPP device server || Running in Nomad || s56-t37-sut1 || 10.30.51.71 || 10.30.50.71 || 256 || 256GB || 2x40GbE QSFP+ XL710-QDA2 / 2x10/25GE SFP+ ConnectX5 || Ubuntu 18.04.4
 
|-
 
|-
| || CI build server || IP not configured || c2-n2 || 10.30.51.65 || 10.30.50.65 || 96 || || 2xQSFP+ / USB Ethernet || Centos7
+
| Huawei TaiShan 2280 || CSIT testbed || Running in CI || s17-t33-sut1 || 10.30.51.36 || 10.30.50.36 || 64 || 128GB || 2x10GbE SFP+ Intel X520-DA2 / 2x25GbE SFP28 Mellanox CX-4 || Ubuntu 18.04.1
 
|-
 
|-
| || CI build server || Running || fdio-cavium5 || 10.30.51.66 || 10.30.50.66 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 16.04.4
+
| || CSIT testbed || Running in CI || s18-t33-sut2 || 10.30.51.37 || 10.30.50.37 || 64 || 128GB || 2x10GbE SFP+ Intel X520-DA2 / 2x25GbE SFP28 Mellanox CX-4 || Ubuntu 18.04.1
 
|-
 
|-
| || CI build server || Running || fdio-cavium6 || 10.30.51.67 || 10.30.50.67 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 16.04.1
+
| [http://macchiatobin.net/ Marvell MACCHIATObin] || N/A || Decommissioned || s20-t34-sut1 || 10.30.51.41 || 10.30.51.49, then connect to /dev/ttyUSB0 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.4
 
|-
 
|-
| || CI build server || Running || fdio-cavium7 || 10.30.51.68 || 10.30.50.68 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 16.04.1
+
| || N/A || Decommissioned || s21-t34-sut2 || 10.30.51.42 || 10.30.51.49, then connect to /dev/ttyUSB1 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.5
 
|-
 
|-
| Huawei TaiShan 2280 || CSIT testbed || Running || s15-t33-sut1 || 10.30.51.36 || 10.30.50.36 || 64 || 128GB || 2x10GbE SFP+ Intel 82599 / 2x25GbE SFP28 Mellanox CX-4 || Ubuntu 17.10
+
| || N/A || Decommissioned || fdio-mcbin3 || 10.30.51.43 || 10.30.51.49, then connect to /dev/ttyUSB2 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.5
 
|-
 
|-
| || CSIT testbed || Running || s16-t33-sut2 || 10.30.51.37 || 10.30.50.37 || 64 || 128GB || 2x10GbE SFP+ Intel 82599 / 2x25GbE SFP28 Mellanox CX-4 || Ubuntu 17.10
+
| || Power Cycler || Operational || || 10.30.50.80 || || || || ||
 
|-
 
|-
| [http://macchiatobin.net/ Marvell MACCHIATObin] || CSIT testbed || Need TG connections || s20-t34-sut1 || 10.30.51.41 || 10.30.51.49, then connect to /dev/ttyUSB0 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.4
+
| [https://softiron.com/development-tools/overdrive-1000/ SoftIron OverDrive 1000] || N/A || Decommissioned || softiron-1 || 10.30.51.12 || N/A || 4 || 8GB || || openSUSE
 
|-
 
|-
| || CSIT testbed || Need TG connections || s21-t34-sut2 || 10.30.51.42 || 10.30.51.49, then connect to /dev/ttyUSB1 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.4
+
| || N/A || Decommissioned || softiron-2 || 10.30.51.13 || N/A || 4 || 8GB || || openSUSE
 
|-
 
|-
| || CSIT testbed || Need Port0-Port1 connection || fdio-mcbin3 || 10.30.51.43 || 10.30.51.49, then connect to /dev/ttyUSB2 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.5
+
| || N/A || Decommissioned || softiron-3 || 10.30.51.14 || N/A || 4 || 8GB || || openSUSE
 
|-
 
|-
 
|}
 
|}
  
Note: to get lab access, open a ticket at https://rt.linuxfoundation.org/
+
Note: to get lab access, create a GPG key, upload it to a keyserver, and have it signed by a trusted anchor in a video call (the fingerprint will be needed). An Arm authority (Tina) then needs to send an e-mail to helpdesk@fd.io with your name, e-mail, keygrip, and key fingerprint.
  
 
== CI ==
 
== CI ==
Line 122: Line 135:
  
 
https://wiki.fd.io/view/CSIT/AArch64
 
https://wiki.fd.io/view/CSIT/AArch64
 +
 +
== Contiv-VPP ==
 +
 +
This Kubernetes network plugin uses FD.io VPP to provide network connectivity between pods.
 +
 +
https://github.com/contiv/vpp
 +
 +
The installation guide for Contiv-VPP on the Arm64 platform is:
 +
 +
https://github.com/contiv/vpp/blob/master/docs/arm64/MANUAL_INSTALL_ARM64.md
 +
 +
== Porting and Tuning Roadmap ==
 +
 +
* VPP Vectorization: Expanding the Neon Library for IPv4 forwarding code path - Sirshak/Lijian
 +
* Tuning the quad loop/dual loop for small cores - Lijian
 +
* General performance analysis and tuning of various graph nodes for IPv4 forwarding test case - Sirshak/Lijian
 +
* Memory Ordering - Sirshak
 +
* CSIT Performance Test - Khemendra
 +
* CSIT Device Test - Juraj
 +
* CSIT Path Test - Juraj
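
The dual loop mentioned in the roadmap above processes two packets per iteration and then drains any leftover packet in a single loop. In VPP this is written in C, where the gain comes from prefetching and instruction-level parallelism; the Python sketch below only illustrates the control-flow shape. The names <code>process_dual_loop</code> and <code>decrement_ttl</code> are hypothetical and are not VPP functions.

```python
from typing import Callable, List, Sequence


def process_dual_loop(packets: Sequence[int],
                      handle: Callable[[int], int]) -> List[int]:
    """Apply `handle` to every packet, two per iteration where possible."""
    out: List[int] = []
    i = 0
    n = len(packets)
    # Dual loop: handle two packets per iteration while at least two remain.
    while n - i >= 2:
        p0, p1 = packets[i], packets[i + 1]
        out.append(handle(p0))
        out.append(handle(p1))
        i += 2
    # Single loop: drain the odd packet left over, if any.
    while i < n:
        out.append(handle(packets[i]))
        i += 1
    return out


def decrement_ttl(ttl: int) -> int:
    # Stand-in for per-packet work such as an IPv4 rewrite step.
    return ttl - 1


print(process_dual_loop([64, 64, 128, 255, 2], decrement_ttl))
```

A quad loop follows the same pattern with four packets per iteration; which variant wins on a given core depends on its pipeline width and cache behaviour, which is why the roadmap lists it as a tuning item for small cores.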
  
 
== Known Issues ==
 
== Known Issues ==
Line 130: Line 163:
 
=== Recent Patches ===
 
=== Recent Patches ===
 
{| class="wikitable"
 
{| class="wikitable"
| [https://gerrit.fd.io/r/#/c/13850/ Add support for shuffle vector intrinsic via Neon in ARM] || || || Sirshak Das
 
 
|-
 
|-
| [https://gerrit.fd.io/r/#/c/13696/ Improve cpu { coremask-% } configure option] || || || Yi He
+
| [https://gerrit.fd.io/r/c/vpp/+/34716 misc: vppctl fix heap-buffer-overflow & memleaks] || Merged 12/14 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/34634 crypto-native: fix build error on Arm using clang-13] || Merged 12/14 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33306 snort: fix unused result warning for gcc-10] || Merged 11/06 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33307 l2: fix array-bounds error for prefetch on Arm] || Merged 11/07 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33422 ip6: fix IPv6 address calculation error using "ip route add" CLI] || Merged 10/21 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31694 ipsec: Performance improvement of ipsec4_output_node using flow cache] || Merged 10/13 || || Govindarajan Mohandoss
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33999 build: fix centos rpm build] || Merged 10/08 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33324 vppinfra: fix potential memory access error in _pool_init_fixed] || Merged 10/05 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32885 svm: fix asan check failed @svm_map_region on arm ] || Merged 06/24 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32638 l2: fix vrrp prefix mac comparison ] || Merged 06/09 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32565 build: fix build error after make wipe ] || Merged 06/04 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32367 memif: fix input node buffer prefetch ] || Merged 05/21 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32366 memif: fix gcc-10 build error on arm platform ] || Merged 05/21 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31972 papi: fix ubuntu 1804 make test socket.close error] || Merged 04/16 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31960 rdma: fix skip_ipv4_cksum behavior in scalar path] || Merged 04/15 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31985 vppinfra: correct intrinsic called by u16x16_from_u8x16] || Merged 04/15 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31421 vppinfra: fix compiling error due to incompatible udphdr field names] || Merged 03/05 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/30458 avf: optimized with NEON SIMD instruction] || Merged 12/18 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/28252 ip: fix compiling error with gcc-10] || Merged 09/01 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/28044 build: Fix 'make install-deps' errors on aarch64 CentOS 7] || Merged 07/29 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/28034 acl: correct acl vat help message] || Merged 07/24 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/27417 build: add libssl-dev library for ubuntu 20.04] || Merged 06/04 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/26949 dpdk: fix compiling issue with clang] || Merged 05/08 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/26950 vppinfra: fix u32x4_byte_swap on Arm] || Merged 05/08 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/26804 build: support arch-specific compiling for Neoverse N1] || Merged 04/30 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/26023 dpdk: false link down issue with ixgbe NIC] || Merged 03/23 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/25896 vlib: fix error when creating avf interface on SMP system] || Merged 03/21 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/25906 vlib: leave SIGPROF signal with its default handler] || Merged 03/21 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/25259 build: add libssl-dev for ubuntu 16.04 and 18.04] || Merged 03/11 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/25195 vlib: fix code of getting numa node with specific cpu_id] || Merged 02/17 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/23083 docs: add physmem section in configuration parameters] || Merged 12/19 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/23082 vlib: add max-size configuration parameter for pmalloc] || Merged 12/18 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/23075 crypto: not use vec api with opt_data[VNET_CRYPTO_N_OP_IDS]] || Merged 11/13 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/23084 acl: add missing square brackets to vat_help option in acl api] || Merged 10/31 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21968 dpdk: apply dual loop unrolling in DPDK TX] || Merged 09/12 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21969 ip: apply dual loop unrolling in ip4_rewrite] || Merged 09/12 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21970 ip: apply dual loop unrolling in ip4_input] || Merged 09/12 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21940 build: fix running error with vmxnet3_test_plugin.so] || Merged 09/11 || || Jianlin Lv
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21741 build: fix unsupported CMake comparison operation] || Merged 09/05 || || Jianlin Lv
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21469 tap: fix tap interface not working on Arm issue] || Merged 09/04 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20379 build: fix vpp compilation failure on ThunderX2 and Amp] || Merged 08/19 || || Jianlin Lv
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/18564 vppinfra: Update "show cpu" output for AArch64 chips] || Merged 08/19 || || Nitin Saxena
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20861 vppinfra: refactor test_and_set spinlocks to use clib_spinlock_t] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20862 vppinfra: added performance test for clib_rwlock_t (test_rwlock.c)] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20863 vppinfra: refactor clib_rwlock_t to use single condition variable] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20860 vppinfra: refactor clib_spinlock_t to use compare and swap] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20859 vppinfra: added lock performance test for clib_spinlock_t (test_spinlock.c)] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20857 vppinfra: refactor use of CLIB_MEMORY_BARRIER ()] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20856 vppinfra: conformed spinlocks to use CLIB_PAUSE] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/20272/ vppinfra: add u64x2_scatter/u32x4_scatter] || Merged 06/21 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/20271/ vppinfra: add u64x2_gather/u32x4_gather] || Merged 06/21 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/20064/ fix compiling error with marvell pp2 plugin] || Merged 06/11 || || Jianlin Lv
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/19930/ Switch atomic release API from __sync to __atomic builtin] || Merged 06/05 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/19929/ Switch atomic test and set API from __sync to __atomic builtin] || Merged 06/05 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/18278/ Build packages for generic Arm architecture] || Merged 05/15 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/19135/ Enable NEON instructions in memcpy_le] || Merged 05/01 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/18223/ svm_fifo rework to avoid contention on cursize] || Merged 04/17 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/18405/ Re-enable aarch64 neon instruction in vlib_buffer_free_inline] || Merged 03/20 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/18077/ sctp chunk_len fix] || Merged 03/06 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/16184/ Use acquire/release ordering when accessing svm_fifo shared variable cursize] || Merged 11/29 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/15756/ Optimize xxx_zero_byte_mask NEON function.] || Merged 11/07 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/15618/ Enable atomic swap and store macro with acquire and release ordering.] || Merged 11/03 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/15606/ Add and enable msb mask vector intrinsic for aarch64.] || Merged 10/31 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/15181/ vppinfra: add atomic macros for __sync builtins] || Merged 10/19 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/15196/ vppinfra: Fix extendto_high aarch64 NEON api.] || Merged 10/09 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/14905/ Support dynamic dual/quad loop selection on aarch64] || Merged 10/01 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/14963/ Enable verbose output during VPP cmake compiling] || Merged 9/25 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/14980/ dpdk_plugin: fix mlx5 build and runtime issues] || Merged 9/27 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/14606/ Add and enable u32x4_extend_to_u64x2_high for aarch64 NEON intrinsics.] || Merged 9/12 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/14604/ Add horizontal add (hadd) vector intrinsic via NEON.] || Merged 9/11 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/14608/ Add u32x4_extend_to_u64x2 for aarch64 using NEON intrinsics] || Merged 9/11 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/14607/ Replacing vtbl NEON intrinsic with rev NEON intrinsic for byte_swap.] || Merged 9/11 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/14570/ Fix array bound failure in api_sr_localsid_add_del] || Merged 8/30 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/14539/ cmake: fix marvell plugin build] || Merged 8/28 || || Brian Brooks
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/14300/ fix dpdk_plugin.so load failure with DPDK 18.08] || Merged 8/23 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/14243/ Fix a bug in function pipe_rx] || Merged 8/17 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/14244/ fix compiling warnings with GCC] || Merged 8/17 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/14273/ Update AArch64 CSIT machines into FD.io VPP docs] || Merged 8/17 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/13850/ Add support for shuffle vector intrinsic via Neon in ARM] || Merged 8/1 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/13696/ Improve cpu { coremask-% } configure option] || Merged 8/1 || || Yi He
 
|-
 
|-
| [https://gerrit.fd.io/r/#/c/13695/ Fix undefined symbol: fformat_append_cr in vat plugins loading] || || || Yi He
+
| [https://gerrit.fd.io/r/#/c/13695/ Fix undefined symbol: fformat_append_cr in vat plugins loading] || Merged 7/31 || || Yi He
 
|-
 
|-
 
| [https://gerrit.fd.io/r/#/c/13376/ pp2: increase recycle batch size] || Merged 7/10 || || Brian Brooks
 
| [https://gerrit.fd.io/r/#/c/13376/ pp2: increase recycle batch size] || Merged 7/10 || || Brian Brooks
Line 237: Line 427:
 
|}
 
|}
  
=== Meeting Minutes ===
+
== Meeting Minutes ==
'''8/14/2018'''
+
'''11/21/2023'''
 
* Attendees
 
* Attendees
 +
** Lijian Zhang
 
** Juraj Linkes
 
** Juraj Linkes
 +
** Niyaz Murshed
 +
** Jieqiang Wang
 +
 +
* CSIT
 +
** Status
 +
*** Dave Wallace helps monitor the AArch64 CI/CD status, which looks fine
 +
*** Replace old ThunderX2 with Ampere Altra; budget got approved, still in progress
 +
**** Sync with CSIT folks in the call when possible -- Juraj
 +
*** Maciek asked about the availability of N2-based hardware
 +
**** Plans to ship N2-based servers (Nvidia Grace (V2) / Ampere One (in-house design by Ampere)) to the FD.io lab next year
 +
**** Timeline TBD
 +
*** IPSec test cases
 +
**** Patch already merged
 +
**** QAT cards in Austin labs, plan to ship them to FD.io lab
 +
*** RDMA test cases
 +
**** MLX DPDK test cases are enabled, RDMA are not on AArch64
 +
 +
* VPP
 +
** Detailed planning for VPP projects in the next call
 +
** Refactor OpenSSL usage in VPP IPsec -- Lijian
 +
*** Move key generation and initialization steps out of data plane to control plane, see performance boost
 +
** Investigate make test framework in VPP -- Lijian
 +
*** Patch broke WireGuard test cases, so need to figure out the workflow
 +
** VPP ramp-up -- Niyaz
 +
*** Investigate VPP graph node mechanism and how to add nodes to the group
 +
** IPSec scalability tests -- Jieqiang
 +
*** Try to figure out dpdk-rss-flows.py and how to generate balanced rss flows for IPSec tests
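
As background for the balanced RSS flow item above, the sketch below shows the general idea: compute the Toeplitz hash a NIC would use for RSS and keep candidate source addresses until every queue has the same number of flows. This is not the <code>dpdk-rss-flows.py</code> script and does not reproduce its options. The assumptions are: the widely used 40-byte Microsoft default RSS key, hashing on the IPv4 source/destination pair only, and a simplified queue mapping of <code>hash % n_queues</code> (real NICs index a redirection table with the low hash bits). All function names are hypothetical.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, List

# Assumption: the default Microsoft RSS key; a real setup may program another key.
RSS_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """32-bit Toeplitz hash: XOR the key window at each set input bit."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                shift = key_bits - 32 - (i * 8 + bit)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def queue_for(src: str, dst: str, n_queues: int) -> int:
    # Simplification: modulo instead of a NIC redirection table lookup.
    data = ipaddress.IPv4Address(src).packed + ipaddress.IPv4Address(dst).packed
    return toeplitz_hash(RSS_KEY, data) % n_queues


def balanced_flows(base_src: str, dst: str, n_queues: int, per_queue: int,
                   max_candidates: int = 65536) -> Dict[int, List[str]]:
    """Pick source addresses so that each queue receives `per_queue` flows."""
    flows: Dict[int, List[str]] = defaultdict(list)
    base = int(ipaddress.IPv4Address(base_src))
    for offset in range(max_candidates):
        src = str(ipaddress.IPv4Address(base + offset))
        queue = queue_for(src, dst, n_queues)
        if len(flows[queue]) < per_queue:
            flows[queue].append(src)
        if len(flows) == n_queues and all(
                len(v) == per_queue for v in flows.values()):
            return dict(flows)
    raise ValueError("not enough candidates to balance all queues")


for queue, sources in sorted(balanced_flows("10.0.0.1", "20.0.0.1", 4, 2).items()):
    print(queue, sources)
```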
 +
 +
'''07/18/2023'''
 +
* Attendees
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj Linkes
 +
* CSIT
 +
** The timeout issue occurs periodically on the Taishan server, even in release testing.
 +
*** Setting CPU affinity only after VMs boot up fully.
 +
**** https://gerrit.fd.io/r/c/csit/+/38550
 +
*** Another issue may be related to the Taishan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
*** Increasing the timeout bypasses the issue and has no effect on VPP VM perf
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
*** QAT cards are planned to be shipped
 +
*** Need to pay attention to the execution time for IPSec release testing
 +
*** Need to investigate further on performance degradation issue
 +
** Verify job, merge job, device testing, and release testing are all fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
*** ConnectX6 NIC info will be updated in doc first
 +
*** Decide which test cases to run on the testbeds (time consideration/iterative test/coverage test)
 +
** MRR failed cases
 +
*** Probably due to latest DPDK upgrade, not an arm-specific issue.
 +
** New test cases list on 3n-alt
 +
*** NAT tests cannot be added because they are running on 2-node testbed only
 +
*** Enable IPSec flow cache (Arm)/IPSec SPD fast path feature
 +
** Release testing
 +
*** 23.06 release testing is done
 +
*** New CSIT page https://csit.fd.io/
 +
** Plan to replace TX2 with Altra as VPP device testing testbed
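
The idea noted above of setting CPU affinity only after the VMs have booted fully can be sketched as a poll-then-pin helper. This is not the CSIT change linked in the notes; <code>pin_when_ready</code>, <code>is_ready</code> and <code>apply_affinity</code> are hypothetical names, and the readiness check (for example, waiting for the guest to answer over SSH) is left to the caller. By default the helper uses <code>os.sched_setaffinity</code>, which is available on Linux only.

```python
import os
import time
from typing import Callable, Iterable, Optional, Set


def pin_when_ready(is_ready: Callable[[], bool],
                   pid: int,
                   cpus: Iterable[int],
                   apply_affinity: Optional[Callable[[int, Set[int]], None]] = None,
                   timeout_s: float = 60.0,
                   poll_s: float = 0.5) -> bool:
    """Poll `is_ready`; pin `pid` to `cpus` once it returns True.

    Returns True if the affinity was applied, False on timeout.
    """
    if apply_affinity is None:
        apply_affinity = os.sched_setaffinity  # Linux-only default
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_ready():
            # The process may use all cores while booting; restrict it only now.
            apply_affinity(pid, set(cpus))
            return True
        time.sleep(poll_s)
    return False
```

Passing <code>apply_affinity</code> explicitly makes the helper easy to exercise without touching a real process; in real use the default would pin the QEMU process (or its vCPU threads) identified by <code>pid</code>.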
 +
 +
'''06/20/2023'''
 +
* Attendees
 
** Lijian Zhang
 
** Lijian Zhang
** Andy Wang
+
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj Linkes
 +
* CSIT
 +
** The timeout issue occurs periodically on the Taishan server, even in release testing.
 +
*** Setting CPU affinity only after VMs boot up fully.
 +
**** https://gerrit.fd.io/r/c/csit/+/38550
 +
*** Another issue may be related to the Taishan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
*** Increasing the timeout bypasses the issue and has no effect on VPP VM perf
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
*** QAT cards are planned to be shipped
 +
*** need to pay attention to the execution time for IPSec release testing
 +
*** Need to investigate further on performance degradation issue
 +
** Verify job, merge job, device testing, and release testing are all fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
*** ConnectX6 NIC info will be updated in doc first
 +
*** Decide which test cases to run on the testbeds (time consideration/iterative test/coverage test)
 +
** MRR failed cases
 +
*** Probably due to latest DPDK upgrade, not an arm-specific issue.
 +
** New test cases list on 3n-alt
 +
*** NAT tests cannot be added because they are running on 2-node testbed only
 +
*** enable IPSec flow cache(arm)/IPSec SPD fast path feature
 +
 
 +
'''05/16/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
* CSIT
 +
** The timeout issue occurs periodically on the Taishan server, even in release testing.
 +
*** Setting CPU affinity only after VMs boot up fully.
 +
**** https://gerrit.fd.io/r/c/csit/+/38550
 +
*** Another issue may be related to the Taishan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
*** Increasing the timeout bypasses the issue and has no effect on VPP VM perf
 +
** The link issue in DPDK testpmd test cases on Ampere Altra is still there.
 +
*** Try switching cables while upgrading NIC firmware and drivers
 +
*** Try to reproduce the tests after the NIC firmware upgrade
 +
*** Try different port pairs of the same two NICs
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
*** QAT cards are planned to be shipped
 +
*** need to pay attention to the execution time for IPSec release testing
 +
*** Need to investigate further on performance degradation issue
 +
** Verify job, merge job, device testing, and release testing are all fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
*** ConnectX6 NIC info will be updated in doc first
 +
*** Decide which test cases to run on the testbeds (time consideration/iterative test/coverage test)
 +
** MRR failed cases
 +
*** Probably due to latest DPDK upgrade, not an arm-specific issue.
 +
* VPP
 +
'''04/18/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
* CSIT
 +
** The timeout issue occurs periodically on the Taishan server, even in release testing.
 +
*** Setting CPU affinity only after VMs boot up fully.
 +
**** https://gerrit.fd.io/r/c/csit/+/38550
 +
*** Another issue may be related to the Taishan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
** The link issue in DPDK testpmd test cases on Ampere Altra is still there.
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
*** QAT cards are planned to be shipped
 +
*** need to pay attention to the execution time for IPSec release testing
 +
*** Need to investigate further on performance degradation issue
 +
** Verify job, merge job, device testing, and release testing are all fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
*** ConnectX6 NIC info will be updated in doc first
 +
* VPP
 +
 
 +
'''04/04/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
* CSIT
 +
** The timeout issue occurs periodically on the Taishan server, even in release testing.
 +
*** Setting CPU affinity only after VMs boot up fully.
 +
**** https://gerrit.fd.io/r/c/csit/+/38550
 +
*** Another issue may be related to the Taishan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
** The link issue in DPDK testpmd test cases on Ampere Altra is still there.
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
***
 +
** Verify job, merge job, device testing, and release testing are all fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
* VPP
 +
 
 +
'''03/07/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
* CSIT
 +
** The timeout issue occurs periodically on the Taishan server, even in release testing.
 +
** The link issue in DPDK testpmd test cases on Ampere Altra is still there.
 +
** Verify job, merge job, device testing, and release testing are all fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
 +
* VPP
 +
 
 +
'''2/21/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** DPDK testpmd failure: XL710 interface does not come up (2 Ampere servers back to back); the same interface works fine with VPP and other apps.
 +
****** Not reproduced on the local setup; need to schedule a debug session to reproduce the issue in the FD.io lab.
 +
****** Confirm with Vexxhost people if replacing intel NICs is feasible
 +
****** Tried waiting longer, but the interface still does not come up, and a restart is not enough either; need to find a reliable workaround.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Could it be related to the NIC's speed, duplex, and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
****** Will talk to dpdk i40e maintainer to seek their help
 +
******* DPDK port/link status is broken; l3fwd has the same issue
 +
******* Sent a detailed email to the i40e maintainer on the dpdk-dev mailing list; waiting for a response
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
****** Compiler version change seems to be one of factors for perf degradation
 +
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on 3-tsh testbed
 +
****** The timeout issue occurred when starting VPP inside the VM, but not when starting testpmd
 +
****** The isolcpus kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj will try
 +
******* isolcpus seems to be working fine
 +
******* Still need to root-cause the timeout issue; startup is sometimes slower
 +
******* run dpdk build, just use the non-isolated cores for build
 +
******* both VM and VPP start slower than before
 +
******* VPP loading plugins and timeout happens
 +
******* Is VPP crashing? - not crash
 +
******* Is the VM bound with isolated core? - need to check
 +
******* Will set up a live debug session for Tianyu and Juraj
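
For the open question above about whether the VM is bound to isolated cores, a small check can compare a process's allowed CPUs with the kernel's isolated set. The sketch assumes a Linux host that exposes <code>/sys/devices/system/cpu/isolated</code> in the usual <code>2-5,8</code> list form (stride notation is not handled); the helper names are hypothetical.

```python
from pathlib import Path
from typing import Set


def parse_cpu_list(text: str) -> Set[int]:
    """Parse a kernel CPU list such as '2-5,8' into a set of CPU numbers."""
    cpus: Set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty file means no isolated CPUs
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def isolated_cpus(path: str = "/sys/devices/system/cpu/isolated") -> Set[int]:
    # Reads the kernel's view of isolated CPUs (set via isolcpus= at boot).
    return parse_cpu_list(Path(path).read_text())


def runs_only_on_isolated(allowed: Set[int], isolated: Set[int]) -> bool:
    """True if the allowed CPU set is non-empty and fully isolated."""
    return bool(allowed) and allowed <= isolated
```

On Linux, the allowed set for a running VM process can be obtained with <code>os.sched_getaffinity(pid)</code> and passed as <code>allowed</code>.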
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Enable VPP device testing per patch
 +
**** Voting right for VPP device testing on Arm is enabled
 +
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
** MLX NICs Planning
 +
*** CX6 and CX7 - CX7 is hard to get on market - MLX Nics will be used and reported
 +
*** CX6 vpp native rdma driver has issues, dpdk mlx driver is fine.
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to memory barriers
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
 
 +
'''2/7/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Confirm with Vexxhost people if replacing intel NICs is feasible
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
****** Will talk to dpdk i40e maintainer to seek their help
 +
******* DPDK port/link status is broken - l3fwd has the same issue
 +
******* Sent a detailed email to the i40e maintainer on the dpdk-dev mailing list; waiting for a response
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
****** Compiler version change seems to be one of factors for perf degradation
 +
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on 3-tsh testbed
 +
****** Timeout issue occurred when starting VPP inside the VM, but not when starting testpmd
 +
****** Configuring isolcpus in the kernel boot parameters is deprecated; Tianyu proposed a solution that Juraj would try
 +
******* isolcpus seems to be working fine
 +
******* Still need to root-cause the timeout issue - sometimes slower
 +
******* run dpdk build, just use the non-isolated cores for build
 +
******* both VM and VPP start slower than before
 +
******* VPP loading plugins and timeout happens
 +
******* Is VPP crashing? - No crash
 +
******* Is the VM bound to isolated cores? - need to check
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Enable VPP device testing per patch
 +
**** Voting right for VPP device testing on Arm is enabled
 +
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
** MLX NICs Planning
 +
*** CX6 and CX7 - CX7 is hard to get on market - MLX Nics will be used and reported
 +
*** CX6 vpp native rdma driver has issues, dpdk mlx driver is fine.
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to memory barriers
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
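
The open question under the VM testcase timeout item (whether the VM is bound to isolated cores) could be checked along the following lines. This is a minimal Python sketch, not part of CSIT. It assumes a Linux host that exposes the isolated CPU list at <code>/sys/devices/system/cpu/isolated</code>. The helper names are hypothetical, and the PID used in the example is a placeholder for the QEMU or VPP process.

```python
import os


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '2-5,8' into a set of ints.

    Only plain numbers and 'a-b' ranges are handled (a simplification).
    """
    cpus = set()
    for part in text.strip().split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def isolated_cpus(path="/sys/devices/system/cpu/isolated"):
    """Return the CPUs the kernel reports as isolated (empty set if unreadable)."""
    try:
        with open(path) as f:
            return parse_cpu_list(f.read())
    except OSError:
        return set()


def affinity_report(pid, isolated):
    """Split the CPUs a task may run on into isolated and non-isolated ones."""
    allowed = set(os.sched_getaffinity(pid))
    return {
        "on_isolated": sorted(allowed & isolated),
        "on_shared": sorted(allowed - isolated),
    }


if __name__ == "__main__":
    # PID 0 means "the calling process"; replace it with the PID under test.
    print(affinity_report(0, isolated_cpus()))
```

Affinity is a per-thread property, so for a VM it is the vCPU thread IDs listed under <code>/proc/&lt;pid&gt;/task</code> that matter; each of them can be passed to <code>affinity_report</code> in place of the PID.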
 +
 
 +
'''1/17/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 
** Tina Tsou
** Khemendra Kumar
+
 
** Honnappa Nagarahalli
+
* Miscellaneous
** Brian Brooks
+
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
* FD.io lab
+
 
** SFP+ cables shipment showing as delivered
+
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Confirm with Vexxhost people if replacing intel NICs is feasible
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
****** Will talk to dpdk i40e maintainer to seek their help
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
****** Compiler version change seems to be one of factors for perf degradation
 +
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on 3-tsh testbed
 +
****** Timeout issue occurred when starting VPP inside the VM, but not when starting testpmd
 +
****** Configuring isolcpus in the kernel boot parameters is deprecated; Tianyu proposed a solution that Juraj would try
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Enable VPP device testing per patch
 +
**** Voting right for VPP device testing on Arm is enabled
 +
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 
* VPP
** VPP-1339 - Mellanox NIC not working with VPP
+
** VPP SVE implementation - Lijian
*** Lijian noticed DPDK version updated to 18.08 and might help - https://gerrit.fd.io/r/#/c/14154/
+
*** SVE validation on FPGA platform - Confluence page ready
*** Tina helping find someone from Mellanox to help
+
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan servers
+
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
*** Khem looking into this
+
**** Investigate SVE vs NEON packet checksum comparison
** No updates on crypto
+
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
** No updates on vectorization
+
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
** Tuning dual/quad loop
+
** Investigate One Terabit throughput test on Arm platform
*** DaveB suggests looking at MULTIARCH macros
+
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to memory barriers
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''12/20/2022'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
 +
 
 
* CSIT
** CSIT-1139 - parallelize 'make test'
+
** VPP Performance Test
*** Juraj updated patch with comments from Klement
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Khem seeing failures with jumbo frames
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** Khem noticed new CSIT machines using tag to run a subset of tests
+
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
* Documentation
+
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
** Lijian working on patch to add Arm to Architecture section and Arm-based CSIT testbeds to CSIT section
+
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
* Action Items - Next Week
+
****** Confirm with Vexxhost people if replacing intel NICs is feasible
** [Sirshak] Create LF RT ticket for power cycling mcbins
+
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
** [Honnappa] Add module owners list and performance analysis items to wiki page
+
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
** [Lijian] Check if DPDK 18.08 helps Mellanox NIC issues
+
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
** [Sirshak] Create Jira ticket to see impact of Florin's patch
+
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
** [Sirshak] Create Jira ticket for msb
+
****** Will talk to dpdk i40e maintainer to seek their help
** [Khem] Try dual loop ip4_lookup_inline patch to see if it helps on A72-based D05
+
**** New links for VPP perf trending/report pages
** [Brian] Help resolve VPP build failure on mcbins in FD.io lab
+
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
** [Juraj] Enable VPP Device on 1-node SoC now that SFP+ cables have arrived
+
***** Release report: https://s3-docs.fd.io/csit/master/report/
** [Sirshak] Follow up with Cavium regarding Ubuntu installation on cavium-4
+
***** Need to investigate 22.10 release testing result
** [Khem] Create Jira ticket for CSIT failures with jumbo frames
+
****** Compiler version change seems to be one of factors for perf degradation
** [Khem] Create Jira ticket for running a subset of tests via a tag
+
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on 3-tsh testbed
 +
****** Timeout issue occurred when starting VPP inside the VM, but not when starting testpmd
 +
****** Configuring isolcpus in the kernel boot parameters is deprecated; Tianyu proposed a solution that Juraj would try
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Enable VPP device testing per patch
 +
**** Voting right for VPP device testing on Arm is enabled
 +
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to memory barriers
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
  
'''8/7/2018'''
+
'''12/06/2022'''
 
* Attendees
** Sirshak Das
 
 
** Juraj Linkes
** Lijian
+
** Lijian Zhang
** Andrew Pinski
+
** Jieqiang Wang
** Andy Wang
+
** Tianyu Li
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Confirm with Vexxhost people if replacing intel NICs is feasible
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
****** Will talk to dpdk i40e maintainer to seek their help
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
****** Compiler version change seems to be one of factors for perf degradation
 +
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on 3-tsh testbed
 +
****** Timeout issue occurred when starting VPP inside the VM, but not when starting testpmd
 +
****** Configuring isolcpus in the kernel boot parameters is deprecated; Tianyu proposed a solution that Juraj would try
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Enable VPP device testing per patch
 +
**** Voting right for VPP device testing on Arm is enabled
 +
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to memory barriers
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''11/15/2022'''
 +
* Attendees
 +
** Juraj Linkes
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 9 am for Juraj and 3 pm for Shanghai folks
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Confirm with Vexxhost people if replacing intel NICs is feasible
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
**** Good news, No more slow down after 200 rounds of testing.
 +
***** Thunderx2 servers need to upgrade from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
 +
***** Distro upgrade to ubuntu 22.04 is still ongoing - no ETA yet
 +
****** Server configuration will remain the same, already integrated in ansible playbook
 +
***** Re-enable voting IF no more issue with 22.04 device testing
 +
****** Submit a patch to enable voting right after meeting
 +
*** Test meltdown/spectre vulnerabilities
 +
**** CSIT maintainers ask whether tools exist to test these vulnerabilities on Arm platforms (not limited to Arm)
 +
**** Will confirm this issue with support team - Lijian
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** VM cases failed only on 3n-alt performance testbed, error log report some file missing, likely configuration issue
 +
**** Another intermittent VM failure happens on tx2 and alt; need to figure out the above case first
 +
**** Rebooted the server to recover; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver, VLAN stripping configured on VF, PF driver returns not allowed.
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to memory barriers
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
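
The /dev/vfio item under VPP Device above notes that mounting only the <code>/dev/vfio/xxx</code> entries in use, instead of the whole directory, avoided the race. The idea can be illustrated with a minimal Python sketch. It does not reproduce the merged CSIT change, and it assumes Docker-style <code>--device</code> arguments and the standard Linux sysfs layout, where each PCI device has an <code>iommu_group</code> symlink. The function names and the PCI address in the usage note are hypothetical.

```python
import os


def iommu_group(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the IOMMU group number of a PCI device as a string."""
    link = os.path.join(sysfs_root, pci_addr, "iommu_group")
    if not os.path.islink(link):
        raise FileNotFoundError(f"no iommu_group link for {pci_addr}")
    # The link points at .../iommu_groups/<N>; the last component is the group.
    return os.path.basename(os.path.realpath(link))


def vfio_device_args(pci_addrs, sysfs_root="/sys/bus/pci/devices"):
    """Build --device arguments for only the VFIO groups the given devices use."""
    groups = sorted({iommu_group(a, sysfs_root) for a in pci_addrs}, key=int)
    # /dev/vfio/vfio is the VFIO container node needed alongside any group node.
    args = ["--device", "/dev/vfio/vfio"]
    for group in groups:
        args += ["--device", f"/dev/vfio/{group}"]
    return args
```

For example, <code>vfio_device_args(["0000:01:00.0"])</code> returns the container node plus one entry for that device's group, so group nodes belonging to other containers are never touched.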
 +
 
 +
 
 +
'''10/18/2022'''
 +
* Attendees
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
* Miscellaneous
 +
** Reschedule the meeting to 9 am for Juraj and 3 pm for Shanghai folks
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR/PDR data difference - deep dive needed, waiting ampere folks engagement
 +
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Replace XL710 NIC? - try asking tomorrow.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** NUMA issue
 +
***** Will run performance report on Arm testbed once the patch to resolve the NUMA issue is merged
 +
***** Dave will help merge the patch into the corresponding branches
 +
 
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
**** Good news, No more slow down after 200 rounds of testing.
 +
***** Thunderx2 servers need to upgrade from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
 +
***** Suggest to rerun test after upgrade to 22.04
 +
***** Re-enable voting once there are no more issues with 22.04 device testing
 +
 
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping is configured on the VF and the PF driver returns "not allowed".
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''9/20/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR/PDR data difference - deep dive needed, waiting ampere folks engagement
 +
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Replace XL710 NIC? - try asking tomorrow.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
**** Good news, No more slow down after 200 rounds of testing.
 +
***** Thunderx2 servers need to upgrade from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
 +
***** Suggest to rerun test after upgrade to 22.04
 +
***** Re-enable voting once there are no more issues with 22.04 device testing
 +
 
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping is configured on the VF and the PF driver returns "not allowed".
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
**** QAT enabled Kernel patch release about October, upgrade kernel required.
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
** VPP build servers - 2 new ampere altra server, 2 old thunder x1 servers decommission or not - need to confirm
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
 
 +
'''9/6/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Lijian Zhang
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR/PDR data difference - deep dive needed, waiting ampere folks engagement
 +
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Replace XL710 NIC? - try asking tomorrow.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
**** Good news, No more slow down after 200 rounds of testing.
 +
***** Thunderx2 servers need to upgrade from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
 +
***** Suggest to rerun test after upgrade to 22.04
 +
***** Re-enable voting once there are no more issues with 22.04 device testing
 +
 
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping is configured on the VF and the PF driver returns "not allowed".
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
**** QAT enabled Kernel patch release about October, upgrade kernel required.
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
** VPP build servers - 2 new ampere altra server, 2 old thunder x1 servers decommission or not - need to confirm
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''8/16/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Masksym Vynnvk
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Lijian Zhang
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR/PDR data difference - deep dive needed
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping is configured on the VF and the PF driver returns "not allowed".
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
** VPP build servers - 2 new ampere altra server, 2 old thunder x1 servers decommission or not - need to confirm
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''8/2/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Masksym Vynnvk
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR PDR data difference - deep dive needed, MRR is
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping is configured on the VF and the PF driver returns "not allowed".
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
** VPP build servers - 2 new ampere altra server, 2 old thunder x1 servers decommission or not - need to confirm
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact NDR/PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''7/19/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures for setting up Ampere Altra in the FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
 
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting for release testing to finish - ETA 1 or 2 weeks
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; now monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns not allowed.
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
 +
******* Patch has been merged
 +
******* Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx entries in use
 +
******* Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration aligned with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week, then enable voting
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when the per-patch job is enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact NDR/PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX NIC
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
 
 +
 
 +
'''7/5/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures for setting up Ampere Altra in the FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
 
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting for release testing to finish - ETA 1 or 2 weeks
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; now monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns not allowed.
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
 +
******* Patch has been merged
 +
******* Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx entries in use
 +
******* Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration aligned with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week, then enable voting
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when the per-patch job is enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact NDR/PDR results - Jieqiang
 +
*** Intern to help benchmark VPP on N1 platforms
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** Optimize the reassembly node by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bumping DPDK version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Tested perfmon patch - Jieqiang
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
 
 +
'''6/21/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 
** Tina Tsou
 
** Tina Tsou
** Nitin Saxena
 
** Khemendra
 
** Sachin Saxena
 
  
* General Topic
 
* Action Items - Last Week
 
** [Khem] make verify on Taishan failure Status: No Status. Khem to create a Jira Tkt.
 
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quanta Switch. Status: [Andy] Cables to be sent today.
 
** [Sirshak] Open Jira tkt look at Florin's patch. Status: To be done next week
 
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status: Failing in a different place, like rx-error; reported to Mellanox people by Lijian. [Lijian] To send the mail to vpp-dev. [Honnappa] To talk to the Mellanox DPDK community.
 
** [Sirshak] Share Mellanox settings with nitin.
 
** [Sirshak] to send email to yi and lijian for documentation. Status: Lijian has done the documentation under internal review.
 
** [Honnappa/Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status: [Sachin] To include Nitin suggestions and upstream.
 
** [Sirshak] Send mail to LF for power cycler access for mcbin due to lack of IPMI interface Status: Sirshak to open LF Tkt
 
** [Nitin] Create a Jira Tkt - ip cksum 128 bit vector support. Status: Nitin using ip incremental cksum.
 
** [Sirshak] To create a LF tkt for Ubuntu 16.04 installation on cavium-4. Status: Anton tried 16.04 but it didn't work, sent mail to Cavium contact for help.
 
* VPP
 
** [Sirshak] Vectorization
 
*** msb is already implemented; verifying correctness and performance.
 
*** [Sirshak] To raise a Jira Tkt for msb changes.
 
*** Have communicated to ARM compiler team related to vtbl performance.
 
*** planning to add cvt (extend_to) and hadd(horizontal) equivalents.
 
** [Brian/Sirshak] Tuning Dual or Quad loop.
 
*** [Khem/Sachin] Updates on the changes seen after applying Brian's Patch. Status: No Updates.
 
** [Sachin] To create Jira Card, DPDK IOVA issue. (Created VPP-1377)
 
** [Honnappa] Module Ownership Discussion. Status: To come back to discussion next time. Community feedback to move to more use-case based approach.
 
 
* CSIT
 
* CSIT
** [Juraj] Parallelizing the make test(CSIT-1139) Status: Scheduling Done, Waiting for community review, got some internal comments Juraj working on it. To try this patch on jenkins sandbox.  
+
** VPP Performance Test
** [Sirshak] replying to cavium regarding Ubuntu 18.04/16.04 installation problem cavium-4. Done. Status: Following up with Cavium
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** [Khem] Performance Suite: 64B, 9000Jumbo. Jumbo Frames is failing.(khem to jira tkt: startup.conf, Frame size, NIC Card, Hugepages configuration).
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** [Khem] Have a subset of tests running with tag.
+
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
** [Juraj/Sirshak] VPP Device SoC one node topology constraints Status: Orchestration still under discussion.  
+
***** CSIT perf numbers VS local perf numbers
* fd.io lab
+
****** VPP cloud image in CSIT VS native built VPP in local env
** [Juraj] mcbin access Status: Accessible, mcbin build failing, waiting for Brian for help.
+
****** One DPDK patch introduced perf degradation on Arm platform
** [Sirshak] cavium blades. Status: [Sirshak] Following up with cavium
+
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
* Documentation
+
***** Check whether customer support can help with the PEX installation issue - Juraj
** Need to update the working ARM boards in the documentation section.
+
***** Juraj will write down the procedures for setting up Ampere Altra in the FD.io lab
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
****** And the procedures for developing test cases in CSIT (performance & device testing)
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
****** Juraj should have already sent to Jieqiang previously.
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
***** 22.06 release testing will happen soon
** Subscribe to: docs@lists.fd.io
+
 
* Action Items - Next Week
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** [Khem] make verify on Taishan failure, Khem to create a Jira Tkt. Status:
+
**** IPsec SPD input/output test case ongoing - Juraj
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quanta Switch. Status:
+
***** Enable the SPD outbound tests
** [Sirshak] Open Jira tkt look at Florin's patch. Status: Not done; to be done next week
+
****** Patches ready, waiting for release testing to finish - ETA 1 or 2 weeks
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status:
+
****** Outbound SPD test patch has been enabled
** [Lijian] To send the mail to vpp-dev (VPP-1339) Status:
+
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
** [Honnappa] To talk to the Mellanox DPDK community. Status:
+
****** Investigate Inbound SPD test cases - Juraj
** [Sachin] To include Nitin suggestions and upstream.(ARMv8 Crypto changes) Status:
+
******* Juraj will commit the patch and get it confirmed with Zachary
** [Sirshak] To Open a LF Tkt regarding power cycler remote access for mcbin. Status:
+
***** Release testing for 21.10 is done
** [Sirshak] To raise a Jira Tkt for msb changes. Status:
+
**** New links for VPP perf trending/report pages
'''7/30/2018'''
+
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; now monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns not allowed.
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
 +
******* Patch has been merged
 +
******* Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx entries in use
 +
******* Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration aligned with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week, then enable voting
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when the per-patch job is enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** Optimize the reassembly node by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bumping DPDK version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Tested perfmon patch - Jieqiang
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
 
 +
'''6/7/2022'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Govindarajan Mohandoss
 
** Juraj Linkes
 
** Juraj Linkes
** Lijian
+
** Lijian Zhang
** Andrew Pinski
+
** Jieqiang Wang
** Andy Wang
+
** Tianyu Li
 
** Tina Tsou
 
** Tina Tsou
** Nitin Saxena
 
** Khemendra
 
** Sachin Saxena
 
  
* General Topic
 
* Action Items - Last Week
 
** [Khem] make verify on Taishan failure Status: No Status
 
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quanta Switch. Status: [Andy] Still working internally, expecting to be done this week.
 
** [Sirshak] mlnx tx non vector version used for no-multiseg. Status: Jira Tkt VPP-1355
 
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status: Yet to be verified whether fixed.
 
** [Sirshak] look at Florin's patch. Status: No status, [Sirshak] Open Jira tkt.
 
** [Tina] to get back on New ARMv8 Crypto. Status: Bob to schedule meeting with Cavium. To be tracked by Nitin, bob, tina.
 
** [Sirshak] Why Quad to Dual loop improves performance. Status: VPP-1356
 
** [Sirshak] To update VPP documentation with fd.io lab devices. Status: Not yet done. [Sirshak] to send email to yi and lijian.
 
** [Sirshak] VPP Vectorization Jira Tkt. Status: VPP-1357
 
** [Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status: Waiting for Nitin to help on changes for Internal DPDK. External DPDK support patch done. [Sachin : created VPP-1378] To create a Jira Tkt for Internal Tkt. [Honnappa] To comment on current gerrit item to get it moving.
 
** [Sirshak] replying to cavium regarding Ubuntu 18.04 installation problem cavium-4. Status: Sent
 
** [Sirshak] Get credentials from Brian for mcbin Status: Done
 
** [Sirshak] Send mail to LF for power cycler access for mcbin due to lack of IPMI interface Status: Done
 
* VPP
 
** [Sirshak] Vectorization
 
*** Almost done with shuffle.
 
*** Will get to working with msb.
 
*** AARCH32 compilation to be discussed.(Shuffle Vector Intrinsic AARCH64 ARMv8 specific)
 
*** There are no specific requirements on aarch32 at this time.
 
** [Lijian && Yi] To continue effort on analyzing IPv4 nos on available platforms with Intel and Mellanox NICs
 
*** [Sirshak] Why is Mellanox NIC not used in CSIT ? Performance Suite Designed for Intel and Cisco NICs.
 
** [Brian/Sirshak] Tuning Dual or Quad loop.
 
*** [Khem/Sachin] Updates on the changes seen after applying Brian's Patch. Status: No Updates.
 
** [Khem] Updates on Benchmarking on taishan. Status: Held up hardware.
 
** [Nitin] Any new findings from IPv4 VPP test case. Status: Working HW offloading.
 
** [Sachin] To create Jira Card, DPDK IOVA issue. (Created VPP-1377)
 
** [Lijian] ipcksum - No Degradation on Qualcomm.
 
** [Nitin] Create a Jira Tkt - ip cksum 128 bit vector support.
 
 
* CSIT
 
* CSIT
** [Juraj] Parallelizing the make test(CSIT-1139) Status: All VPP instances running on same core. Tried scheduling cores. Dynamically finding available cores. Sweetspot currently:  8 containers with 96 core.
+
** VPP Performance Test
** [Juraj] Test features listed by talking to dave.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** [Sirshak] replying to cavium regarding Ubuntu 18.04 installation problem cavium-4. Done. Status: To open a new LF tkt to ask for 16.04 installation.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** [Juraj/Sirshak] VPP Device SoC one node topology constraints Status: Orchestration still under discussion.
+
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
** [Sirshak] to ask brian about mcbin credentials. Status: Done.
+
***** CSIT perf numbers VS local perf numbers
* fd.io lab
+
****** VPP cloud image in CSIT VS native built VPP in local env
** [Juraj] mcbin access Status: Created LF tkt.
+
****** One DPDK patch introduced perf degradation on Arm platform
** [Sirshak] cavium blades. Status: [Sirshak] To create a LF tkt for Ubuntu 16.04 installation on cavium-4.
+
****** Configuration difference between CSIT env and local env(Hugepage size, startup.conf parameters and etc)
* Documentation
+
***** Check if there is customer support that can help with the PEX installation issue - Juraj
** Need to update the working ARM boards in the documentation section.
+
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
*** [Lijian/Yi] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).
+
****** And the procedures of developing/developing test cases in CSIT (performance & device testing)
*** [Lijian/Yi] Add only fd.io lab devices.
+
****** Juraj should have already sent to Jieqiang previously.
*** [Sirshak] To send email with details.
+
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
** Subscribe to: docs@lists.fd.io
+
* Action Items - Next Week
+
** [Khem] make verify on Taishan failure Status:
+
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status: [Andy]
+
** [Sirshak] Open Jira tkt look at Florin's patch. Status:
+
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status:
+
** [Sirshak] to send email to yi and lijian for documentation.
+
** [Honnappa/Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status:
+
** [Sirshak] Send mail to LF for power cycler access for mcbin due to lack of IPMI interface Status:
+
** [Nitin] Create a Jira Tkt - ip cksum 128 bit vector support.
+
** [Sirshak] To create a LF tkt for Ubuntu 16.04 installation on cavium-4.  
+
  
'''7/24/2018'''
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Tested perfmon patch - Jieqiang
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan has been sent upstream and is awaiting review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
 
 +
'''5/17/2022'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Govindarajan Mohandoss
 
** Juraj Linkes
 
** Juraj Linkes
** Lijian
+
** Lijian Zhang
** Andrew Pinski
+
** Andy Wang
+
 
** Tina Tsou
 
** Tina Tsou
** Nitin Saxena
 
** Khemendra
 
  
* General Topic
 
** .
 
* Action Items - Last Week
 
** [Honnappa/Nitin] Aarch64 rpms not building - DPDK Neon Build Break. Status: Nitin, compiler does not support arm neon intrinsics. Honnappa working with compiler team: neon intrinsics is supported #defines not present. Tmp solution available. Honnappa to follow up in DPDK.
 
** [Nitin/Sachin] Follow up: Add Virtual addressing support in IOVA dmap Status: patch committed by sachin merged.
 
** [Khem] make verify on Taishan failure Status: No updates.
 
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status: PO Approved. Should get going in few days.
 
** [Sirshak] mlnx tx non vector version used for no-multiseg. Status: Waiting for x86 hotspots for confirmation and will then open a ticket.
 
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status: Seems to be fixed but have not tried all the test cases to confirm.
 
** [Sirshak] look at Florin's patch. Status: Not done yet
 
** [Tina] to get back on New ARMv8 Crypto. Status: No updates. Close to complete but not upstreamed yet.
 
** [Sirshak] Why Quad to Dual loop improves performance. Status: Not saturating no of outstanding prefetches. AI to raise a Jira Bug.
 
** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin). Status: Not done yet.
 
* VPP
 
** [Sirshak] vectorization patch effects
 
*** Made a few changes; no visible effect.
 
*** Plan to read mlnx drivers DPDK to understand how neon intrinsics accelerate the vectors.
 
*** Add Jira Tkt.
 
** [Sirshak] Anomalies with mlx5 and VPP.
 
** [Honnappa <-> Nitin] Nitin okay with ARM contacting Customer Support for help on TX2 optimal settings.
 
** [Brian/Sirshak] Tuning Dual or Quad loop.
 
*** Visible change in A72.
 
*** Sirshak sent patch to Sachin and Khem to analyze if they see any improvement.
 
*** [Khem->Sirshak] Why moving from Quad to Dual improves performance.
 
** [Lijian] x86 nos reported: 9.5 Mpps is not the same as reported by Nitin. Status: Could be because of the Broadwell and Skylake difference.
 
** [Khem] Updates on IPv4 Benchmarking on taishan. Status: CSIT performance bringup on fd.io lab. 18.04 gcc 7.3 trex. Workaround done. DUT VPP crashing. Plan for running L2 test cases.
 
** [Nitin] Any new findings from IPv4 VPP test case. Status: not available to discuss
 
**
 
 
* CSIT
 
* CSIT
** [Juraj] Parallelizing the make test(CSIT-1139) Status: Sent for review, Figuring out optimal no of threads.
+
** VPP Performance Test
** [Sirshak] replying to cavium regarding Ubuntu 18.04 installation problem cavium-4.  
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** [Juraj/Sirshak] VPP Device SoC one node topology constraints Status: [Sirshak] Access to one of the three consoles.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** [Sirshak] to ask brian about mcbin credentials.
+
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
** [Adarsh] VPP Path/Device Efforts: Nested Container, trying VM inside a container facing some issues. Status: Work on hold as adarsh moved out of the project.
+
***** CSIT perf numbers VS local perf numbers
* fd.io lab
+
****** VPP cloud image in CSIT VS native built VPP in local env
** [Sirshak] Installation of TG pending. Status: Done
+
****** One DPDK patch introduced perf degradation on Arm platform
** [Juraj] mcbin access Status: Two of them can be accessed; the other one can't.
+
****** Configuration difference between CSIT env and local env(Hugepage size, startup.conf parameters and etc)
** [Sirshak] cavium blades connected need SFP and DACs. Status: Up and running, still need SFP and DACs
+
***** Check if there is customer support that can help with the PEX installation issue - Juraj
* Documentation
+
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
** Need to update the working ARM boards in the documentation section.
+
****** And the procedures of developing/developing test cases in CSIT (performance & device testing)
*** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).
+
****** Juraj should have already sent to Jieqiang previously.
*** [Sirshak] Add only fd.io lab devices.
+
 
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
**** IPsec SPD input/output test case ongoing - Juraj
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
***** Enable the SPD outbound tests
** Subscribe to: docs@lists.fd.io
+
****** Patches ready, waiting release testing done - ETA 1 week or 2
* Action Items - Next Week
+
****** Outbound SPD test patch has been enabled
** [Khem] make verify on Taishan failure Status:  
+
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status:  
+
****** Investigate Inbound SPD test cases - Juraj
** [Sirshak] mlnx tx non vector version used for no-multiseg. Status: Jira Tkt VPP-1355
+
******* Juraj will commit the patch and get it confirmed with Zachary
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status:  
+
***** Release testing for 21.10 is done
** [Sirshak] look at Florin's patch. Status:
+
**** New links for VPP perf trending/report pages
** [Tina] to get back on New ARMv8 Crypto. Status:
+
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
** [Sirshak] Why Quad to Dual loop improves performance. Status: VPP-1356
+
***** Release report: https://s3-docs.fd.io/csit/master/report/
** [Sirshak] To update VPP documentation with fd.io lab devices. Status:
+
***** Release testing for 22.02 is done
** [Sirshak] VPP Vectorization Jira Tkt. Status: VPP-1357
+
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
** [Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Last Status: Waiting for Nitin to help on changes for Internal DPDK. Current Status:
+
**** 3 nodes Taishan crypto test case failed - related to CSIT change
** [Sirshak] replying to cavium regarding Ubuntu 18.04 installation problem cavium-4. Status: Sent
+
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
** [Sirshak] Get credentials from Brian for mcbin Status: Done
+
** VPP Path
** [Sirshak] Send mail to LF for power cycler access for mcbin due to lack of IPMI interface Status: Done
+
*** Voting and working fine.
'''7/17/2018'''
+
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.  
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan has been sent upstream and is awaiting review
 +
******* Try patched VPP to verify QAT card usage
 +
 
 +
'''4/5/2022'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Govindarajan Mohandoss
** Sachin Saxena
+
** Khemendra Kumar
+
 
** Juraj Linkes
 
** Juraj Linkes
** Lijian
+
** Lijian Zhang
 +
** Tina Tsou
  
* General Topic
 
** Austin Folks leaving early meeting. If needs be somebody can takeover after 1 hour (9 am CT).
 
* Action Items - Last Week
 
** [Honnappa/Nitin] Aarch64 rpms not building - DPDK Neon Build Break
 
** [Nitin/Sachin] Follow up: Add Virtual addressing support in IOVA dmap Status: No updates.
 
** [Khem] make test on Taishan timings: Status: Done. To look at why make verify.
 
** [Sirshak] cavium USB-Ethernet adapters to Quantta Switch. Status: Andy waiting for cables to reach him.
 
** [Sirshak] mlnx tx non vector version used for no-multiseg. Jira Tkt Status: Not yet done. Will do this week.
 
** [Sirshak] DPDK 18.05 mlnx bug. Status: Sirshak to open Jira Tkt - VPP-1339
 
** [Sirshak] look at Florin's patch. Status: Not yet done.
 
** [Tina] to get back on New ARMv8 Crypto.
 
* VPP
 
** [Sirshak] vectorization patch effects
 
*** Made a few changes; no visible effect.
 
*** Plan to read mlnx drivers DPDK to understand how neon intrinsics accelerate the vectors.
 
** [Brian/Sirshak] Tuning Dual or Quad loop.
 
*** Visible change in A72.
 
*** None in Qualcomm because of pfrm not being hotspot.
 
*** [Khem->Sirshak] Why moving from Quad to Dual improves performance.
 
*** Community-wide investigation needed.
 
** [Lijian] x86 nos reported: 9.5 Mpps is not same as reported by Nitin Status: Investigation.
 
** [Khem] Updates on IPv4 Benchmarking on taishan. Status: Stuck with pktgen.
 
** [Nitin] Any new findings from IPv4 VPP test case. Status:
 
** [Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status: Waiting for Nitin to help on changes for Internal DPDK.
 
**
 
 
* CSIT
 
* CSIT
** [Juraj] Parallelizing the make test(CSIT-1139) Status: Almost done, need to work on polishing.
+
** VPP Performance Test
** [Juraj/Sirshak] SoC devices as non voting VPP device targets. Status: [Sirshak] pending on TG credentials.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** [Adarsh] VPP Path/Device Efforts: Nested Container, trying VM inside a container facing some issues. Status: No update, Adarsh replaced on the project; postponed
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
* fd.io lab
+
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
** [Sirshak] Installation of TG pending. Status: No update from LF - Anton
+
***** Check if there is customer support that can help with the PEX installation issue - Juraj
** [Sirshak] cavium blades connected need SFP and DACs. Status: Up and running, still need SFP and DACs
+
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
* Documentation
+
****** And the procedures of developing/developing test cases in CSIT (performance & device testing)
** Need to update the working ARM boards in the documentation section.
+
****** Juraj should have already sent to Jieqiang previously.
*** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).
+
*** [ARM community] Waiting for feedback from Khem and other companies
+
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
** Subscribe to: docs@lists.fd.io
+
* Action Items - Next Week
+
** [Honnappa/Nitin] Aarch64 rpms not building - DPDK Neon Build Break Status:
+
** [Nitin/Sachin] Follow up: Add Virtual addressing support in IOVA dmap Status:
+
** [Khem] make test on Taishan failure Status:
+
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status:
+
** [Sirshak] mlnx tx non vector version used for no-multiseg. Status:
+
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status: 
+
** [Sirshak] look at Florin's patch. Status:
+
** [Tina] to get back on New ARMv8 Crypto.
+
** [Sirshak] Why Quad to Dual loop improves performance.
+
** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).
+
  
'''7/10/2018'''
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally firstly.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
 
 +
'''3/15/2022'''
 
* Attendees
 
** Sirshak Das
+
** Govindarajan Mohandoss
** Sachin Saxena
+
** Juraj Linkes
** Khemendra Kumar
+
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 
** Tina Tsou
 
** Nitin Saxena
+
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns not allowed.
 +
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
'''3/1/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns not allowed.
 +
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Paper work for shipment is done
 +
*** Build servers will arrive at end of Jan
 +
*** Performance servers will arrive in Feb
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
 
 +
'''1/25/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 
** Juraj Linkes
 
** Brian Brooks
+
** Tianyu Li
** Lijian
+
** Tina Tsou
** Tom Herbert
+
 
* General Topic
+
** Austin Folks leaving early meeting. If needs be somebody can takeover after 1 hour (9 am CT).
+
** [Tom] Aarch64 rpms not building - anyone can help?
+
* Action Items - Last Week
+
** [Nitin/Sachin]Follow up: Add Virtual addressing support in IOVA dmap Status: No updates.
+
** [Nitin] make test on Thunderx2 timings Status: Send error report of make test.
+
** [Khem] make test on Taishan timings: Status: 22 mins. Try make verify.
+
** [Sirshak] cavium USB-Ethernet adapters to Quanta Switch. Status: Done for cavium 1,2,3. Need cables for 4,5,6,7. Cables ordered
+
** [Khem] to update on nested VMs on performance test cases. Status: No updates. Could be a naming problem.
+
** [Sirshak] Q to Maciek: buildroot image with VPP device(within container)? Status: No updates. Check with Brian to see if buildroot works on arm.
+
** [Sirshak] mlnx tx non vector version used for no-multiseg. Reason ? Status: No updates. Sirshak to open Jira Tkt.
+
** [Sirshak] DPDK 18.05 mlnx bug. Status: Asked in the community need to look at backtrace as pointed by damjan. Sirshak to open Jira Tkt.
+
* VPP
+
** [Sirshak] vectorization patch effects. https://gerrit.fd.io/r/#/c/13229/
+
*** I see around 15% in qualcomm with mellanox based on some patch which is not vectorization patch need find that.
+
*** Do others see similar improvement in past 2 weeks.
+
*** [Sirshak] look at Florin's patch.
+
** [Lijian] x86 numbers, checking with Nitin for sync on configuration. Skylake Single Core Single Thread: IPv4 forwarding 64B 15 Mpps.
+
** [Khem] Updates on IPv4 Benchmarking on taishan. Status: No Updates
+
** [Nitin] Any known comparison between AVF numbers on aarch64 and DPDK numbers? On Intel it's ~25% and on Arm ~20%.
+
** [Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status: Internal DPDK changes effort. Wait for status on New ARMv8 Crypto.
+
** [Sirshak->Nitin] Thunderx2(high core count)coremask for DPDK config in VPP startup conf.
+
** [Tina] to get back on New ARMv8 Crypto.
+
 
* CSIT
 
** [Juraj] Parallelizing the make test(CSIT-1139) Discussion: On Plan and if anybody wants to join hands.
+
** VPP Performance Test
** [Juraj/Sirshak] SoC devices as non-voting VPP device targets. Discussion: mcbin console access will be available once TG credentials are available.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** [Adarsh] VPP Path/Device Efforts: Nested Container, trying VM inside a container facing some issues.  
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
* fd.io lab
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** [Sirshak] Taishan connected need to verify once we get TG credentials. [Khem] Checked from Taishan side ports connected to TG are up.
+
**** IPsec SPD input/output test case ongoing - Juraj
** [Sirshak] mcbin connected need to verify once we get TG credentials.
+
***** Enable the SPD outbound tests
** [Sirshak] cavium blades connected need to switch the network adapters before using it for CI.
+
****** Patches ready, waiting release testing done - ETA 1 week or 2
* Documentation
+
****** Outbound SPD test patch has been enabled
** Need to update the working ARM boards in the documentation section.
+
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
*** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).  
+
***** Release testing for 21.10 is done
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
**** New links for VPP perf trending/report pages
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
***** Release report: https://s3-docs.fd.io/csit/master/report/
** Subscribe to: docs@lists.fd.io
+
**** 3 nodes Taishan crypto test case failed - related to CSIT change
* Action Items - Next Week
+
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
** [Honnappa/Nitin] Aarch64 rpms not building - DPDK Neon Build Break
+
** VPP Path
** [Nitin/Sachin]Follow up: Add Virtual addressing support in IOVA dmap Status: No updates.
+
*** Voting and working fine.
** [Khem] make test on Taishan timings: Status:
+
*** CentOS-8 jobs have been removed.
** [Sirshak] cavium USB-Ethernet adapters to Quanta Switch.
+
** VPP Device
** [Sirshak] mlnx tx non vector version used for no-multiseg. Jira Tkt Status:
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
** [Sirshak] DPDK 18.05 mlnx bug. Status: Sirshak to open Jira Tkt.
+
*** VM cases failed only on Arm
** [Sirshak] look at Florin's patch.
+
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns not allowed.
 +
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Paper work for shipment is done
 +
*** Build servers will arrive at end of Jan
 +
*** Performance servers will arrive in Feb
  
'''7/3/2018'''
+
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
 
 +
'''1/18/2022'''
 
* Attendees
 
** Sirshak Das
+
** Govindarajan Mohandoss
** Sachin Saxena
+
** Lijian Zhang
** Khemendra Kumar
+
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tianyu Li
 
** Tina Tsou
 
** Nitin Saxena
+
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns not allowed.
 +
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
 
 +
'''1/11/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 
** Juraj Linkes
 
** Brian Brooks
+
** Tina Tsou
** Ed Kern
+
 
** Song
+
* CSIT
** Lijian
+
** VPP Performance Test
* General Topic
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Architecture Section in Documentation.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
* Action Items - Last Week
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** Khem: Ipv4 layer investigation. To Share some findings next week on parameters for CSIT Status: Done. If yes cover in VPP section.
+
**** IPsec SPD input/output test case ongoing - Juraj
** Nitin Follow up: Sachin: Upstreaming ARMv8 Crypto Changes with external DPDK. Status: Nitin to provide help on using Internal DPDK
+
***** Enable the SPD outbound tests
** Nitin Follow up: Add Virtual addressing support in IOVA dmap Status: Waiting for response from Damjan
+
****** Patches ready, waiting release testing done - ETA 1 week or 2
** Nitin make test on Thunderx2 timings :
+
****** Outbound SPD test patch has been enabled
** Khem: status on make test failures: CSIT-1148 Status: Fixed.
+
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
** Khem: make test on Taishan timings: Status: No status
+
***** Release testing for 21.10 is done
** Sirshak: cavium USB-Ethernet adapters to Quanta Switch. Status: Still working with LF guys
+
**** New links for VPP perf trending/report pages
** Khem to update on nested VMs on performance test cases. Status: No Updates
+
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
** Sirshak & Khem: Documentation review. Status: Done. continuous effort.  
+
***** Release report: https://s3-docs.fd.io/csit/master/report/
** Sirshak: Q to Maciek: buildroot image with VPP device(within container) ? Status: No updates.
+
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver, VLAN stripping configured on VF, PF driver returns "not allowed".
 +
***** Not related to iavf driver, AVF interface - VPP native driver has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 
* VPP
 
** Sirshak: Investigate mlnx_burst_rx_vec used in case of no multi-seg but plain mlnx_tx_burst used. Movement of hotspot seen for rx. Probable reason SRIOV(VFs) used. Root cause yet to be found.
+
** VPP SVE implementation - Lijian
** Sirshak: VPP DPDK 18.05 change done by Damjan. mlnx drivers on Qualcomm are a problem. Everyone is urged to run sanity tests on their own setups. set interface state <InterfaceName> up - stuck
+
*** SVE validation on FPGA platform - Confluence page ready
** Khem: Discuss various parameters in CSIT for IPv4 Testing.
+
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
** Sirshak: TCP termination performance nos ?
+
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
** Sirshak: vectorization patch effects. https://gerrit.fd.io/r/#/c/13229/
+
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
 
 +
'''12/14/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
 
 
* CSIT
 
** Juraj Make test bottlenecks: Updates: One plausible solution available. Parallelizing the make test(CSIT-1139)
+
** VPP Performance Test
** Juraj to start looking at SoC devices as non voting VPP device targets.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Adarsh: openssl issues ? Issue still persists.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** Adarsh: VPP Path Tasks.
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** Tkt updates:
+
**** IPsec SPD input/output test case ongoing - Juraj
*** CSIT-1021: Handle Scapy pcap limit Khem(brief on patch, updates): Status: To check with CSIT team for Jenkins build failure. Status: No Updates. Not a priority.
+
***** Enable the SPD outbound tests
* fd.io lab
+
****** Patches ready, waiting release testing done - ETA 1 week or 2
** Sirshak: Update from LF guys
+
****** Outbound SPD test patch has been enabled
* Documentation
+
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
***** Release testing for 21.10 is done
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
**** New links for VPP perf trending/report pages
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
** Subscribe to: docs@lists.fd.io
+
***** Release report: https://s3-docs.fd.io/csit/master/report/
* Action Items - Next Week
+
**** 3 nodes Taishan crypto test case failed - related to CSIT change
** [Nitin/Sachin]Follow up: Add Virtual addressing support in IOVA dmap Status:
+
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591
** [Nitin] make test on Thunderx2 timings :
+
** VPP Path
** [Khem] make test on Taishan timings: Status:
+
*** Voting and working fine.
** [Sirshak] Cavium USB-Ethernet adapters to Quanta switch. Status:
+
*** CentOS-8 jobs have been removed.
** [Khem] to update on nested VMs on performance test cases. Status:
+
** VPP Device
** [Sirshak] Q to Maciek: buildroot image with VPP device(within container)? Status:  
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
** [Sirshak] mlnx tx non vector version used for no-multiseg. Reason ? Status:
+
*** VM cases failed only on Arm
** [Sirshak] DPDK 18.05 mlnx bug. Status:
+
**** Tried increasing the timeout to see if it fixes the issue
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver, VLAN stripping configured on VF, PF driver returns "not allowed".
 +
***** Not related to iavf driver, AVF interface - VPP native driver has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.  
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
  
'''6/26/2018'''
+
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
** VPP IPv6 Benchmarking and Profiling - Jieqiang
 +
*** IPv6 profiling
 +
**** No perf bump for lookup_x2 function in Fd.io gerrit
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
 
 +
'''12/07/2021'''
 
* Attendees
 
** Sirshak Das
+
** Tianyu Li
** Sachin Saxena
+
** Govindarajan Mohandoss
** Khemendra Kumar
+
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 
** Tina Tsou
 
** Nitin Saxena
+
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver, VLAN stripping configured on VF, PF driver returns "not allowed".
 +
***** Not related to iavf driver, AVF interface - VPP native driver has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
** VPP IPv6 Benchmarking and Profiling - Jieqiang
 +
*** IPv6 profiling
 +
**** No perf bump for lookup_x2 function in Fd.io gerrit
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
 
 +
'''11/30/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 
** Juraj Linkes
 
** Brian Brooks
+
** Tina Tsou
** Ed Kern
+
 
** Song
+
* CSIT
* General Topic
+
** VPP Performance Test
** Introduce Song, Yi and Lijian
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
* Action Items - Last Week
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** Adarsh: Updates on Jira tkt for openssl issues. Updates: none
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** Adarsh: Update on topology for Kubernetes Functional Tests. Updates: Kubernetes, Docker
+
**** IPsec SPD input/output test case ongoing - Juraj
** Sirshak Tuning Section - Not Done
+
***** Enable the SPD outbound tests
** Khem: Ipv4 layer investigation. CSIT: IPv4. To Share some findings next week on parameters for CSIT
+
****** Patches ready, waiting release testing done - ETA 1 week or 2
** Nitin: Send old dpdk input node patch - Done
+
****** Outbound SPD test patch has been enabled
** Sachin: Upstreaming ARMv8 Crypto Changes with external DPDK. - Nitin to send mail
+
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
** Add Virtual addressing support in IOVA dmamap: Updates - nitin to send mail
+
***** Release testing for 21.10 is done
** Nitin: Measure make and make test on ThunderX2
+
**** New links for VPP perf trending/report pages
** Khem: measure make and make test on Taishan (Juraj tested it; it failed: https://jira.fd.io/browse/CSIT-1148)
+
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
** Sirshak: try to switch eth-usb for regular eth ports on ThunderXs - Created a LF tkt have follow up meeting today.
+
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver, VLAN stripping configured on VF, PF driver returns "not allowed".
 +
***** Not related to iavf driver, AVF interface - VPP native driver has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
** New Arm servers shipment to the FD.io lab
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 
* VPP
 
** Discuss vec_en_rx/tx=1 parameters.
+
** VPP SVE implementation - Lijian
** Discuss Vectorized rx and tx functions in mlx5 (in case of no multi-seg)
+
*** SVE validation on FPGA platform - Confluence page ready
** rxd,txd nos in VPP config.
+
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
** mbcache any configuring done from VPP side ?
+
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
** VPP IPv6 Benchmarking and Profiling - Jieqiang
 +
*** IPv6 profiling
 +
**** No perf bump for lookup_x2 function in Fd.io gerrit
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
'''11/23/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
 
 
* CSIT
 
** make test failures Taishan Khem/adarsh (https://jira.fd.io/browse/CSIT-1148)
+
** VPP Performance Test
** Juraj Make test bottlenecks: Updates: Ran 4 containers (85 mins) (CSIT-1139)
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** mcbin, OD(1000/3000), cavium thunderX as one of the targets for VPP Device Test.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** Future role of devices. Status: Existing Taishan Servers to be used for performance suite only.
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** Khem to update on nested VMs on performance test cases.
+
**** IPsec SPD input/output test case ongoing - Juraj
** buildroot image with VPP device(within container) ? Sirshak to ask maciek
+
***** Enable the SPD outbound tests
** Tkt updates:
+
****** Patches ready, waiting release testing done - ETA 1 week or 2
*** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Juraj Updates: Merged and Closed
+
****** Outbound SPD test patch merged and running, expected report shows next week.
*** CSIT-990 (buildroot package) Juraj Updates: Postponed
+
****** Inbound patch pending on merge, need maintainer's review
*** CSIT-1021: Handle Scapy pcap limit Khem(brief on patch, updates): Status: To check with CSIT team for jenkins build failure.
+
****** https://gerrit.fd.io/r/c/csit/+/34256
* fd.io lab
+
***** Release testing for 21.10 is done
** Sirshak to have follow up LF guys.
+
**** New links for VPP perf trending/report pages
* Documentation
+
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
***** Release report: https://s3-docs.fd.io/csit/master/report/
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
**** 3 nodes Taishan crypto test case failed - related to CSIT change
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
** VPP Path
** Subscribe to: docs@lists.fd.io
+
*** Voting and working fine.
** Sirshak and Khem to try doing some reviews this week.
+
*** CentOS-8 jobs have been removed.
* Action Items - Next Week
+
** VPP Device
** Khem: Ipv4 layer investigation. To Share some findings next week on parameters for CSIT
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
** Nitin Follow up: Sachin: Upstreaming ARMv8 Crypto Changes with external DPDK.
+
*** AVF interface creation issue:
** Nitin Follow up: Add Virtual addressing support in IOVA dmap
+
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
** Nitin make test on Thunderx2 timings :
+
***** Related to PF i40e driver, VLAN stripping configured on VF, PF driver returns "not allowed".
** Khem: status on make test failures: CSIT-1148
+
***** Not related to iavf driver, AVF interface - VPP native driver has this issue
** Khem: make test on Taishan timings:
+
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
** Sirshak: Cavium USB-Ethernet adapters to Quanta switch.
+
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
** New Arm servers shipment to the FD.io lab
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
*** Intel NIC firmware upgrade on Arm - not supported
  
** Sirshak: try to switch eth-usb for regular eth ports on ThunderXs - Created a LF tkt have follow up meeting today.
+
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Performance degradation with quad loop unrolling applied on ip6_lookup_inline
 +
**** Patch the current kernel to enable perfmon plugin on VPP
 +
**** Need to check performance for IPv6 subnet routing
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
  
'''6/19/2018'''
+
 
 +
'''11/16/2021'''
 
* Attendees
 
** Sirshak Das
+
** Tianyu Li
** Sachin Saxena
+
** Govindarajan Mohandoss
** Khemendra Kumar
+
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 
** Tina Tsou
 
** Nitin Saxena
+
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Inbound patch pending on merge, need maintainer's review
 +
****** https://gerrit.fd.io/r/c/csit/+/34256
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver, VLAN stripping configured on VF, PF driver returns "not allowed".
 +
***** Not related to iavf driver, AVF interface - VPP native driver has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
***** Enable VPP device testing per patch
 +
** New Arm servers shipment to the FD.io lab
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Performance degradation with quad loop unrolling applied on ip6_lookup_inline
 +
**** Patch the current kernel to enable perfmon plugin on VPP
 +
**** Need to check performance for IPv6 subnet routing
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
 
 +
'''11/09/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 
** Juraj Linkes
 
** Juraj Linkes
** Brian Brooks
+
** Tina Tsou
** Ed Kern
+
 
** Song
+
* CSIT
* General Topic
+
** VPP Performance Test
** Introduce Yi, Lijian and Song
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
* Action Items - Last Week
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** Brian: mcbin Status:
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** Sirshak: Follow up clang changes. Status: Merged updated wiki.
+
**** IPsec SPD input/output test case ongoing - Juraj
** Khem: Update Kernel Image based on Sanil's input to move past known VM issues. Try this internally.
+
***** Enable the SPD outbound tests
** Khem: LF tkt for Taishan BIOS updates.
+
****** Patches ready, waiting for release testing to finish - ETA 1 or 2 weeks
*** No update for the ticket
+
****** Inbound patch pending merge, needs maintainer's review
** Adarsh: openssl updates. Status:
+
****** https://gerrit.fd.io/r/c/csit/+/34256
*** Raised Jira ticket, needs to be discussed with VPP folks
+
**** New links for VPP perf trending/report pages
** Adarsh: Kubernetes
+
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
*** Working with K8s folks, planning on creating topology from containers for functional tests
+
***** Release report: https://s3-docs.fd.io/csit/master/report/
** Khem: VM(s) in container, VFs for containers
+
** VPP Path
** Sirshak: Summarize tkts in the Tuning Section. Status: Not Done
+
*** Voting and working fine.
** Khem: Investigation on ipv4 layer. Status: Not Done
+
*** CentOS-8 jobs have been removed.
** Nitin: Send old patch on dpdk_input node tuning
+
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs when mounting /dev/vfio
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
 +
******* Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx entries in use
 +
******* Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
 +
******* Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Will enable voting soon after the patch gets merged
 +
** New Arm servers shipment to the FD.io lab
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 
* VPP
 
* VPP
** Sachin: Upstreaming armv8 crypto changes. Status: Sachin will try to upstream a patch related to external DPDK
+
** VPP SVE implementation - Lijian
** Sirshak: Vectorization - Presentation.
+
*** SVE validation on FPGA platform - Confluence page ready
** Any new findings on hotspots or optimizations. Brian: adjusting queue sizes seems to have an effect
+
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
** https://gerrit.fd.io/r/#/c/12932/ discussion: Need to understand the usecase(s) for iommu inside VPP
+
**** Performance numbers: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
*** VPP IPv4 fragmentation
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Performance degradation with quad loop unrolling applied on ip6_lookup_inline
 +
**** Patch the current kernel to enable perfmon plugin on VPP
 +
**** Need to check performance for IPv6 subnet routing
 +
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
 
 +
'''11/02/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
 
 
* CSIT
 
* CSIT
** Discuss current make test time bottleneck.
+
** VPP Performance Test
** AI Nitin: measure make and make test on ThunderX
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** AI Khem: measure make and make test on Taishan
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** AI Sirshak: try to switch eth-usb for regular eth ports on Thunderxs
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** Future role of devices. Status: will be decided when we have more info (performance on different devices etc.)
+
**** IPsec SPD input/output test case ongoing - Juraj
** Question to Nitin/Anyone of how to individually run one test case of the performance suite. Status: no performance testcase can run on 2-node topologies
+
***** Enable the SPD outbound tests
** Tkt updates:
+
****** https://gerrit.fd.io/r/c/csit/+/34256
*** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Juraj Updates: Sent a patch. Status: Patch is waiting to be merged
+
**** New links for VPP perf trending/report pages
*** CSIT-990 (buildroot package) Juraj Updates: No updates
+
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
*** CSIT-1021: Handle Scapy pcap limit Khem(brief on patch, updates): Submitted. Jobs still failing, Khem to investigate. Patch related to Jumbo pkts.
+
***** Release report: https://s3-docs.fd.io/csit/master/report/
* fd.io lab
+
** VPP Path
** mcbin get them up, discuss with LF. Status: Brian - No Updates
+
*** Voting and working fine.
** Cavium Blades LF ticket #56713 Status: Tina - Need to have a meeting
+
*** CentOS-8 jobs have been removed.
* Documentation
+
** VPP Device
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
*** AVF interface creation issue:
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
** Subscribe to: docs@lists.fd.io
+
***** Race condition occurs
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
 +
******* Addressed comments, waiting for Peter's review.
 +
******* Will enable voting soon after the patch gets merged
 +
** New Arm servers shipment to the FD.io lab
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
*** Intel NIC firmware upgrade on Arm - not supported
  
* Action Items - Next Week
+
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance numbers: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Performance degradation with quad loop unrolling applied on ip6_lookup_inline
 +
**** Patch the current kernel to enable perfmon plugin on VPP
 +
**** Need to check performance for IPv6 subnet routing
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
  
'''6/12/2018'''
+
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
 
 +
 
 +
'''10/26/2021'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Juraj Linkes
** Brian Brooks
+
** Tianyu Li
** John Bromhead
+
** Govindarajan Mohandoss
** Sachin Saxena
+
** Lijian Zhang
** Khemendra Kumar
+
** Jieqiang Wang
** Adarsh
+
** Andy Wang
+
 
** Tina Tsou
 
** Tina Tsou
** Andrew Pinski
 
** Nitin Saxena
 
** Natalie Samsonov
 
  
* Action Items - Last Week
+
* CSIT
** Brian: mcbin status: Updates from Trishan LF tkt #54490. - No updates
+
** VPP Performance Test
** Sirshak: Follow up clang changes. Sent: Follow up patch.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Khem: Update Kernel Image based on Sanil's input to move past known VM issues. Try this internally and then do it fd.io lab.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** Khem: LF tkt for Taishan BIOS updates. LF #56898 Status: Not done. Will follow up.
+
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week. - closed
** Adarsh: openssl updates. Status: IPSEC SA add entry error. To open a Jira tkt tracking this.  
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** Sirshak: Summarize tkts in the Tuning Section. Didn't get a chance this week; will try to complete it by next week.
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** Sirshak: Schedule a Meeting between Juraj and Khem. Done
+
***** Inbound IPsec: reproduced and need to investigate - Juraj
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** IPsec SPD input/output case ongoing
 +
***** Adding IPsec SPD outbound test cases 64B 1, 100 and 1k SPD entries, 1, 2, 4 cores, on tx2 testbed - clarified
 +
****** Flow cache on and off cases need to be measured.
 +
***** L2 BD 20k test cases take too long to execute; removed on taishan.
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 3n-tsh testbed unreachable, investigating right now - Juraj
 +
***** TG firmware is being upgraded
 +
***** Server unreachable due to firmware & driver update - resolved - update all done
 +
**** Release testing for 21.10 starts
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before, only happens on the AArch64 platform
 +
***** Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Narrow down whether some of those arguments are the reason behind this and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
****** Race condition occurs
 +
****** try mounting a part of /dev/vfio to see if issue can be resolved
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
 +
******* Addressed comments, waiting for Peter's review.
 +
******* Will enable voting soon after the patch gets merged
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
**** Need vexxhost to confirm Ethernet / power cable type info.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket to the DevOps team
 +
**** New servers are in the procurement process
 +
**** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
****** CPU not fully utilized on Arm, need further investigation
 +
** Intel NIC firmware upgrade on Arm - not supported
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang have VPN access now
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 
* VPP
 
* VPP
** Brian: Talk on mcbin perf analysis. Nitin to send an old patch on tuning prefetch on dpdk_input node.
+
** VPP SVE implementation - Lijian
** Sirshak: VPP Multi-arch optimizations Guidelines
+
**** SVE validation on FPGA platform - Confluence page ready
** Sirshak: Vectorization - Plan to present something next week. Any thoughts ?
+
***** Run unit tests from DPDK and VPP bihash on FPGA
** Nitin: anybody willing to take up ipv4 layer ? Khem to take a look.
+
***** Try Lijian's SVE patch to see any cycle count improvement
** Sachin: Upstreaming armv8 crypto changes.
+
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
** Nitin: memcpy updates ?
+
****** Run standalone SVE test cases on FPGA
** Sirshak: clang patch status
+
****** Ask for DMC 620 images to run for FPGA
 +
****** Enabling DMC 620 is closer to a real system, but performance will drop
 +
****** Build a system using VPP memif and pktgen
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
**** Plan to try quad loop unrolling for ip6_lookup_inline function
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
 +
 
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Try to use ansible to deploy VPP automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from maintainer and Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''10/19/2021'''
 +
* Attendees
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
 
 
* CSIT
 
* CSIT
** Sirshak: Explain VPP Path and VPP Device
+
** VPP Performance Test
** Open Questions and Answers surrounding VPP Device
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
*** Q. Do the Intel onboard NICs support VFs via SRIOV on machiattobin boards ?
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
*** A.[Natalie] We support it but it’s not formally released yet. Will be formally delivered in 18.09.
+
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week. - closed
*** BB - Kernel bypass uses UIO possible to do. [natalie] check support for VF for onboard NICs
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
*** Q. If Yes, is it a hardware level support or supported in musdk also ?
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
*** A.[Natalie] MUSDK is not relevant here. Intel NICs are using DPDK and ARM infrastructure directly. We support PCIE SR-IOV with both v4.4 and v4.14 kernels
+
***** Inbound IPsec: reproduced and need to investigate - Juraj
*** Q. Has anybody tested containers (docker) and any container orchestration system on mcbin (e.g Docker Swarm or Kubernetes) ?
+
***** Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
*** A.[Natalie] Yes.
+
**** IPsec SPD input/output case ongoing
*** Q. K8s or Docker Swarm ?
+
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
*** A. [Bin Arm Internal] K8s is a good choice (version 1.9.4). Use kubeadm to install the k8s cluster.
+
**** 3n-tsh testbed unreachable, investigating right now - Juraj
*** Q. VM inside a container works on ARM ?
+
***** TG firmware is being upgraded
*** A. [Bin ARM Internal] Use Kata and Runv. Kata/Runv is the solution of hardware-virtualized containers.
+
***** Server unreachable due to firmware & driver update - resolved - update all done
*** Q. Container within a Container(nested) works on ARM ?
+
**** Release testing for 21.10 starts
*** A.[Bin ARM Internal] ‘Docker in docker’ or ‘Docker of Docker’ can work well on the Arm platform.
+
** VPP Path
** Sirshak: Explain the proposed role of Cavium Blades for functional tests.
+
*** Voting and working fine.
** Tkt updates:
+
*** CentOS-8 jobs have been removed.
*** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Juraj Updates: Sent a patch.  
+
** VPP Device
*** CSIT-990 (buildroot package) Juraj Updates:
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
*** CSIT-1021: Handle Scapy pcap limit Khem(brief on patch, updates): Submitted. Jobs failing; Khem to investigate. Patch related to Jumbo pkts.
+
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
*** Sachin: To open tkt to track ARMv8 crypto.
+
***** Tried to reproduce with another set of firmware etc., but the issues still exist
* fd.io lab
+
***** https://doc.dpdk.org/guides/nics/i40e.html
** mcbin Status: Brian - No Updates
+
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.  
** Cavium Blades #56713 Status: Tina
+
****** This is fixed in DPDK 21.05 by making iavf PMD the default.
*Documentation
+
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
*** New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
***** Results in the same failure as before, only happens on the AArch64 platform
** Subscribe to: docs@lists.fd.io
+
***** Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Narrow down whether some of those arguments are the reason behind this and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
****** Race condition occurs
 +
****** try mounting a part of /dev/vfio to see if issue can be resolved
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
 +
******* Will enable voting soon after the patch gets merged
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
**** Need vexxhost to confirm Ethernet / power cable type info.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket to the DevOps team
 +
**** New servers are in the procurement process
 +
**** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
****** CPU not fully utilized on Arm, need further investigation
 +
** Intel NIC firmware upgrade on Arm - not supported
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang have VPN access now
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  
* Action Items - Next Week
+
* VPP
 +
** VPP SVE implementation - Lijian
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
****** Enabling DMC 620 is closer to a real system, but performance will drop
 +
****** Build a system using VPP memif and pktgen
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
**** Plan to try quad loop unrolling for ip6_lookup_inline function
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
  
** Brian: mcbin Status:
+
** VPP memif - Tianyu
** Sirshak: Follow up clang changes. Status: Merged updated wiki.
+
*** CNF PoC proposal preparation
** Khem: Update Kernel Image based on Sanil's input to move past known VM issues. Try this internally.
+
***** Add support for VPP aarch64 docker image build
** Khem: LF tkt for Taishan BIOS updates.
+
***** Calico use cases exploration on VPP
** Adarsh: openssl updates. Status:
+
***** Try to use ansible to deploy VPP automatically
** Sirshak: Summarize tkts in the Tuning Section. Status: Not Done
+
** VPP IPsec on Arm - Govind
** Khem: Investigation on ipv4 layer. Status:
+
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from maintainer and Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
  
'''6/4/2018'''
+
'''10/12/2021'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Lijian Zhang
** Brian Brooks
+
** Juraj Linkes
** John Bromhead
+
** Tianyu Li
** Sachin Saxena
+
** Govindarajan Mohandoss
** Khemendra Kumar
+
** Adarsh
+
** Andy Wang
+
 
** Tina Tsou
 
** Tina Tsou
** Andrew Pinski
+
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week. - closed
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** Inbound IPsec: reproduced and need to investigate - Juraj
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 3n-tsh testbed unreachable, investigating right now - Juraj
 +
***** TG firmware is being upgraded
 +
***** Server unreachable due to firmware & driver update - resolved - update all done
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before, only happens on the AArch64 platform
 +
***** Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Narrow down whether some of those arguments are the reason behind this and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
****** Race condition occurs
 +
****** try mounting a part of /dev/vfio to see if issue can be resolved
 +
****** Talked with Peter, Juraj is working on prototype of mounting part of /dev/vfio
 +
****** x86 VPP device job is fine, due to its old firmware & driver
 +
****** Arm VPP device servers have drivers updated; VLAN stripping is not allowed, and the VLAN configuration cannot be removed from lab view.
 +
******  only performance testbeds have NIC drivers updated
 +
****** maintainer doesn't want to add an option to the VPP config
 +
****** may need to check whether x86 has the same issue with the same driver version before reaching out to Intel folks
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
**** Need vexxhost to confirm Ethernet / power cable type info.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket to the DevOps team
 +
**** New servers are in the procurement process
 +
**** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
****** CPU not fully utilized on Arm, need further investigation
 +
** Intel NIC firmware upgrade on Arm - not supported
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang have VPN access now
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
****** Enabling DMC 620 is closer to a real system, but performance will drop
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
 +
 
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from maintainer and Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
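
The IPv6 route item above (prefix lengths 123-128) can be checked against an independent reference. The following is a minimal sketch using Python's standard <code>ipaddress</code> module, not VPP code. The address <code>2001:db8::ff</code> and the helper name <code>expected_route</code> are hypothetical and chosen only for illustration. It prints the network that a correctly masked route should have for each affected prefix length.

```python
import ipaddress


def expected_route(addr: str, prefix_len: int) -> str:
    """Return the network a route for addr/prefix_len should cover.

    strict=False lets the module clear the host bits for us, which is
    the masking step a route-add command is expected to perform.
    """
    net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
    return str(net)


# Hypothetical address from the IPv6 documentation prefix (2001:db8::/32).
for plen in range(123, 129):
    print(plen, expected_route("2001:db8::ff", plen))
```

Comparing these values with the routes the VPP CLI installs for the same input is one way to spot a wrongly masked entry.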
 +
 
 +
'''09/28/2021'''
 +
* Attendees
 +
** Lijian Zhang
 
** Juraj Linkes
 
** Nitin Saxena
+
** Tianyu Li
** Natalie Samsonov
+
** Jieqiang Wang
  
* Action Items - Last Week
+
* CSIT
** Sirshak: To create an LF tkt for mcbin - Didn't create as Brian is handling it offline. If things remain unresolved this week, will create one. - LF Tkt created #54490. [BB]Trishan to follow up over email.
+
** VPP Performance Test
** Sirshak: Follow up on cavium-3 : It's integrated into the Arm CI job.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Sirshak: Upstream clang changes: Failing on Cavium TX1 host; upstreamed the related patch, working on review comments.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** Sirshak: Discuss with Maciek and get a signoff for moving the x86 Hosts to arm rack: Done
+
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
** Honnappa: Provide inputs on how to proceed with comments on Marvell dpdk patch.
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** Honnappa: VPP-1284: To look at this patch to provide comments on performance implications of the fix
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** Juraj: Estimate moving CSIT functional tests to make test. - 1-2 months for 1 person. Others in CSIT are looking into this. Better estimate soon.
+
***** Inbound IPsec: reproduced and need to investigate - Juraj
** Khem: Create LF tkt for Performance Suite Topology Creation. : Created LF #56736
+
***** Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
** Adarsh: Create a Jira to document Automation Task. Created Jira Tkt.
+
**** Release testing done.
** Khem: Follow up Sanil : Known taishan vm issues. Update Kernel Image
+
**** IPsec SPD input/output case ongoing
** Khem: LF tkt for Taishan BIOS updates. LF #56898
+
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
** Adarsh: openssl updates. Updated openssl dpdk. VPP is now stable. Will test soon. Adarsh to close the tkt.
+
**** 3n-tsh testbed unreachable, investigating right now - Juraj
** Nitin: VPP-1064 multiple cache line size patch. Nitin to raise to LF tkt to remove DPDK package from Nexus server.
+
***** TG firmware upgrade is in progress
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.  
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Resulting in the same failure as before, only happens on the AArch64 platform
 +
***** Still see /dev/vfio resource busy error after Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Narrowed down whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
****** A race condition occurs
 +
****** Try mounting only part of /dev/vfio to see if the issue can be resolved
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
**** Need Vexxhost to confirm the Ethernet and power cable types.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
 +
**** New servers are in the procurement process
 +
**** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian and Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  
* fd.io lab
+
* VPP
** mcbin onboarding issue. - Comments in Action Items - Last Week.
+
** VPP SVE implementation - Lijian
** new cavium boxes status - JohnB : Blade 1-4 racked. CSIT Functional.
+
**** SVE validation on FPGA platform - Confluence page ready
** Sirshak : Summarize tkts.
+
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
 +
 
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from maintainer and Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
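
The IPsec items above mention testing the flow cache for hash collisions and stale entry overwrites. The sketch below illustrates those two behaviours in general terms; it is an assumption-laden toy model, not the code in the Gerrit patches. The class <code>FlowCache</code>, its method names, and the epoch counter used to invalidate entries after a policy change are all hypothetical.

```python
class FlowCache:
    """Toy direct-mapped flow cache (hypothetical, for illustration only)."""

    def __init__(self, n_buckets: int = 8):
        self.n_buckets = n_buckets
        self.buckets = [None] * n_buckets
        self.epoch = 0  # bumped whenever the policy database changes

    def _index(self, key) -> int:
        return hash(key) % self.n_buckets

    def policy_changed(self) -> None:
        # Entries written under an older epoch are treated as stale.
        self.epoch += 1

    def add(self, key, policy) -> None:
        # A colliding or stale entry is simply overwritten.
        self.buckets[self._index(key)] = (key, policy, self.epoch)

    def lookup(self, key):
        entry = self.buckets[self._index(key)]
        if entry is None:
            return None
        cached_key, policy, epoch = entry
        if cached_key != key or epoch != self.epoch:
            return None  # collision with another flow, or stale entry
        return policy


# A single bucket forces every flow to collide.
cache = FlowCache(n_buckets=1)
flow_a = (1, 2, 1000, 2000, 17)  # hypothetical 5-tuples as plain integers
flow_b = (3, 4, 3000, 4000, 17)
cache.add(flow_a, "protect")
cache.add(flow_b, "bypass")      # overwrites flow_a's entry
print(cache.lookup(flow_a), cache.lookup(flow_b))
cache.policy_changed()
print(cache.lookup(flow_b))      # stale after the policy change
```

A miss in this model would fall back to the full policy lookup, so correctness never depends on the cache contents.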
 +
 
 +
'''09/14/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate - Juraj
 +
******** Learn more about RFC and need time to understand more - on hold - waiting for Neale's response
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
***** A few more jobs run for release 21.06 and will be finished soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Resulting in the same failure as before, only happens on the AArch64 platform
 +
***** Still see /dev/vfio resource busy error after Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Narrowed down whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
**** Need Vexxhost to confirm the Ethernet and power cable types.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian and Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  
 
* VPP
 
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied instead of referenced from src port to all dst ports
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
**** Tried IPv4 multicast and L2 flood testing, which work fine
 +
**** ip4-replicate node in IPv4 multicasting/l2-flood node in L2 flood testing
 +
***** show mbuf is copied so that ref_cnt will always be one
 +
****** DPDK 21.08 has the patches; need to verify on VPP
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Patch split into 3 components
 +
***** acl: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33597 (Merged)
 +
***** dpdk: fix prefetch assert on Arm https://gerrit.fd.io/r/c/vpp/+/33598 (Merged)
 +
***** session: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33599 (Merged)
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from maintainer and Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
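
The prefetch items above deal with prefetches that run past the end of a structure when the cache line is 128B rather than 64B. The arithmetic can be sketched as follows. This is an illustration only, not the VPP fix: the 128-byte structure size and the fixed count of two prefetched lines are hypothetical values chosen to show the effect.

```python
def cache_lines_spanned(struct_size: int, line_size: int) -> int:
    """Number of cache lines needed to cover struct_size bytes (ceiling division)."""
    return -(-struct_size // line_size)


def bytes_touched(n_lines: int, line_size: int) -> int:
    """Bytes covered when n_lines consecutive cache lines are prefetched."""
    return n_lines * line_size


STRUCT_SIZE = 128      # hypothetical structure size in bytes
HARD_CODED_LINES = 2   # a prefetch count written with 64B lines in mind

for line_size in (64, 128):
    needed = cache_lines_spanned(STRUCT_SIZE, line_size)
    touched = bytes_touched(HARD_CODED_LINES, line_size)
    out_of_bound = touched > STRUCT_SIZE
    print(line_size, needed, touched, out_of_bound)
```

Deriving the prefetch count from the actual line size, as in <code>cache_lines_spanned</code>, keeps the prefetch inside the structure for both line sizes.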
  
** memcpy patch updates/closure: Abandon. Jira to be updated with more data.
+
'''09/07/2021'''
** clang compilation Sirshak: Working on getting the patch upstreamed.
+
* Attendees
** mcbin performance analysis Brian: To talk about this next week.
+
** Lijian Zhang
** vectorization sirshak(Problem, Plausible Solution, Volunteers): SSE2NEON
+
** Govindarajan Mohandoss
** Sachin: upstreaming armv8 crypto changes.
+
** Tianyu Li
** Sirshak: Add Tuning section in Wiki
+
** Jieqiang Wang
** Sirshak: Summarize Jira Tkts
+
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate - Juraj
 +
******** Learn more about RFC and need time to understand more - on hold - waiting for Neale's response
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
***** A few more jobs run for release 21.06 and will be finished soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Resulting in the same failure as before, only happens on the AArch64 platform
 +
***** Still see /dev/vfio resource busy error after Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
**** Need Vexxhost to confirm the Ethernet and power cable types.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian and Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied instead of referenced from src port to all dst ports
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
**** Tried IPv4 multicast and L2 flood testing, which work fine
 +
**** ip4-replicate node in IPv4 multicasting/l2-flood node in L2 flood testing
 +
***** show mbuf is copied so that ref_cnt will always be one
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Patch split into 3 components
 +
***** acl: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33597 (Merged)
 +
***** dpdk: fix prefetch assert on Arm https://gerrit.fd.io/r/c/vpp/+/33598 (Under review)
 +
***** session: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33599 (Merged)
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from one Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
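
The mbuf-fast-free items above hinge on the precondition DPDK documents for this Tx offload: within a queue, every mbuf must come from the same mempool and have a reference count of 1. That is why the multicast and L2 flood cases (where refcnt could exceed 1) were tested. The check below is a minimal model of that precondition, not DPDK or VPP code; the <code>Mbuf</code> class and <code>fast_free_safe</code> helper are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Mbuf:
    """Hypothetical stand-in for a packet buffer."""
    pool: str    # name of the mempool the buffer was allocated from
    refcnt: int  # number of references currently held


def fast_free_safe(mbufs: Iterable[Mbuf]) -> bool:
    """True if all buffers share one pool and each has refcnt == 1."""
    mbufs = list(mbufs)
    if not mbufs:
        return True
    first_pool = mbufs[0].pool
    return all(m.pool == first_pool and m.refcnt == 1 for m in mbufs)


# Copied buffers keep refcnt at 1; shared (referenced) buffers would not.
copied = [Mbuf("pool0", 1), Mbuf("pool0", 1)]
shared = [Mbuf("pool0", 2), Mbuf("pool0", 1)]
print(fast_free_safe(copied), fast_free_safe(shared))
```

In this model the copied case passes because each destination port gets its own buffer, which matches the GDB observation recorded in the minutes.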
 +
 +
'''08/31/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 
* CSIT
 
** Performance Suite Roadmap(topology, work distribution(khem, juraj)):
+
** VPP Performance Test
** Sirshak to Schedule a Meeting between Juraj and Khem.  
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Juraj Updates: Seen by Juraj. Seeing the issue in ipv6 suite. happens during pcie rescan.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** CSIT-990 (buildroot package) Juraj Updates: Peter from Pantheon replied; Juraj is still looking into it.
+
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
** CSIT-1021: Handle Scapy pcap limit Khem(brief on patch, updates):
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** Sirshak : Summarize CSIT tkts
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** Sachin: To open tkt to track ARMv8 crypto.
+
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate - Juraj
 +
******** Learn more about RFC and need time to understand more - on hold - waiting for Neale's response
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
***** A few more jobs run for release 21.06 and will be finished soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.  
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Resulting in the same failure as before, only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Juraj modified script to reproduce the issue - Lijian will try it locally
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Lijian has slightly different firmware and driver versions
 +
****** Tried Mellanox card (rdma driver) multiple times - did not see the issue observed on the XL710 NIC - Juraj
 +
****** Lijian can use Juraj's script to reproduce the issue on local tx2 server
 +
******* Reducing the numa buffer allocation size resolves this issue
 +
******* Observed from the error log of numa buffer allocation
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
**** Need Vexxhost to confirm the Ethernet and power cable types.
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian and Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  
*Documentation
+
* VPP
** Special VPP installations(eg. dpaa).
+
** VPP SVE implementation - Lijian
** ARMv8 crypto needs to be documented.
+
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied instead of referenced from src port to all dst ports
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
**** Tried IPv4 multicast and L2 flood testing, both of which work fine
 +
**** ip4-replicate node in IPv4 multicasting/l2-flood node in L2 flood testing
 +
***** both show that the mbuf is copied, so ref_cnt is always one
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Patch split into 3 components
 +
***** acl: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33597 (Merged)
 +
***** dpdk: fix prefetch assert on Arm https://gerrit.fd.io/r/c/vpp/+/33598 (Under review)
 +
***** session: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33599 (Merged)
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Upstreaming the patch depends on a kernel patch; waiting for it before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Waiting for the v10 kernel patch to fix intermittently large statistic values for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
  
* Action Items - Next Week
+
'''08/24/2021'''
** Brian: mcbin status: Updates from Trishan LF tkt #54490.
+
* Attendees
** Sirshak: Follow up clang changes.
+
** Lijian Zhang
** Khem: Update Kernel Image based on Sanil's input to move past known VM issues.
+
** Govindarajan Mohandoss
** Khem: LF tkt for Taishan BIOS updates. LF #56898 Status:
+
** Juraj Linkes
** Adarsh: openssl updates.
+
** Tianyu Li
** Sirshak: Summarize tkts in the Tuning Section.
+
** Jieqiang Wang
** Sirshak: Schedule a Meeting between Juraj and Khem.  
+
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate - Juraj
 +
******** Need more time to study the RFC - on hold - waiting for Neale's response
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
***** A few more jobs are running for release 21.06 and will finish soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05, which makes the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Juraj modified script to reproduce the issue - Lijian will try it locally
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Lijian has slightly different firmware and driver versions
 +
****** Tried a Mellanox card (rdma driver) multiple times - the issue seen on the XL710 NIC did not occur - Juraj
 +
****** Lijian can use Juraj's script to reproduce the issue on local tx2 server
 +
******* Reducing the numa buffer allocation size resolves this issue
 +
******* Observed from the error log of numa buffer allocation
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
**** Need Vexxhost to confirm the Ethernet and power cable types.
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian and Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan hit some issues running SVE in a QEMU VM and is fixing them
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied rather than referenced from the src port to all dst ports
 +
**** Will try L2 flood test case & understand VPP/multicast code
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Issues about prefetch on current VPP code base
 +
***** Issue 1 support 128B/64B cache-line size in Arm image
 +
***** Issue 2 prefetch 'overflow' for native build
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Upstreaming the patch depends on a kernel patch; waiting for it before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
  
'''5/29/2018'''
+
'''08/17/2021'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Lijian Zhang
** Brian Brooks
+
** Govindarajan Mohandoss
** John Bromhead
+
** Juraj Linkes
** Sachin Saxena
+
** Tianyu Li
** Khemendra Kumar
+
** Jieqiang Wang
** Adarsh
+
** Zachary Leaf
** Andy Wang
+
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate - Juraj
 +
******** Need more time to study the RFC
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing ongoing
 +
***** A few more jobs are running for release 21.06 and will finish soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05, which makes the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Juraj modified script to reproduce the issue - Lijian will try it locally
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Lijian has slightly different firmware and driver versions
 +
****** Tried a Mellanox card (rdma driver) multiple times - the issue seen on the XL710 NIC did not occur - Juraj
 +
****** Lijian can use Juraj's script to reproduce the issue on local tx2 server
 +
******* Reducing the numa buffer allocation size resolves this issue
 +
******* Observed from the error log of numa buffer allocation
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian and Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan hit some issues running SVE in a QEMU VM and is fixing them
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied rather than referenced from the src port to all dst ports
 +
**** Will try L2 flood test case & understand VPP/multicast code
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Issues about prefetch on current VPP code base
 +
***** Issue 1 support 128B/64B cache-line size in Arm image
 +
***** Issue 2 prefetch 'overflow' for native build
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Upstreaming the patch depends on a kernel patch; waiting for it before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''08/10/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing ongoing
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05, which makes the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Juraj modified script to reproduce the issue - Lijian will try it locally
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Lijian has slightly different firmware and driver versions
 +
****** Tried a Mellanox card (rdma driver) multiple times - the issue seen on the XL710 NIC did not occur - Juraj
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian now has VPN access
 +
*** Juraj signed Jieqiang's key
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan hit some issues running SVE in a QEMU VM and is fixing them
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128, CLI issue only, CSIT's python API works fine.
 +
*** Internal patch to resolve this issue under review - upstreamed
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** 4x loop unrolling decreases performance
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Upstreaming the patch depends on a kernel patch; waiting for it before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''08/03/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing ongoing
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05, which makes the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Will try Mellanox card to see if same issue happens - Juraj
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian now has VPN access
 +
*** Juraj signed Jieqiang's key
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan hit some issues running SVE in a QEMU VM and is fixing them
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
*** Internal patch to resolve this issue under review
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** 4x loop unrolling decreases performance
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Upstreaming the patch depends on a kernel patch; waiting for it before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''07/27/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 21.06 testing partially finished. 21.01.1 testing is ongoing and should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate
 +
***** Flow cache is slower with 1 and 10 SPD entries; still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing ongoing
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 21.06 vs 21.01 see performance drop on https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
 +
***** May need to check VM and IPsec cases
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Random issue, more frequently happening on Arm
 +
******* Not seen recently, either in CI or in manual runs.
 +
**** scapy unexpected timeout issue: packet drop or slow issue?
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** Connection issue between Jenkins and the build executor in FD.io lab
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian has got VPN access now
 +
*** Juraj signed Jieqiang's key
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** 4 loop unrolling decreasing performance
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code: having some questions/comments, would like a review meeting - Lijian
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''07/20/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
***** Flow cache is slower with 1 and 10 SPD entries; still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing ongoing
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 21.06 vs 21.01 see performance drop on https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
 +
***** May need to check VM and IPsec cases
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Random issue, more frequently happening on Arm
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** Connection issue between Jenkins and the build executor in FD.io lab
 +
** Shipment of new advanced server to the FD.io lab
 +
*** Two advanced servers are planned for shipment
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian has got VPN access now
 +
*** Juraj signed Jieqiang's key
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
** VPP mbuf-fast-free tx offload
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
**** https://gerrit.fd.io/r/c/vpp/+/33062
 +
**** https://gerrit.fd.io/r/c/vpp/+/33063
 +
**** https://gerrit.fd.io/r/c/vpp/+/33061
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
***** Patches have been upstreamed and waiting for review
 +
****** https://gerrit.fd.io/r/c/vpp/+/32420
 +
***** Prepare the memif readout - Tianyu
 +
***** Add support for VPP aarch64 docker image build
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''07/13/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Expected to be merged soon
 +
***** Flow cache is slower with 1 and 10 SPD entries; still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
 +
****** Hugepage size, numa-node, core isolation etc. may need to check.
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
 +
**** Release testing ongoing
 +
**** IPsec SPD input/output case ongoing
 +
**** Juraj may share the steps how CSIT handle new configuration changes
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms. - 4-core container case failed on x86 and Arm - Fixed and passing.
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 21.06 vs 21.01 see performance drop on https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
 +
***** May need to check VM and IPsec cases
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; gave Intel a script to reproduce; Intel is busy with other work, debugging in progress.
 +
****** Intel folks debugged and tried updating firmware/drivers; the VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated. After using iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 version by making iavf PMD as default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Random issue, more frequently happening on Arm
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
** VPN access request to FD.io Arm servers
 +
*** Will remind Machiek to sign Lijian's GPG public key.
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
 
 +
'''07/06/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Expected to be merged soon
 +
***** Flow cache is slower with 1 and 10 SPD entries; still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
 +
****** Hugepage size, numa-node, core isolation etc. may need to check.
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
 +
**** Release testing ongoing
 +
**** IPsec SPD input/output case ongoing
 +
**** Juraj may share the steps how CSIT handle new configuration changes
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms. - 4-core container case failed on x86 and Arm - Fixed and passing.
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 21.06 vs 21.01 see performance drop on https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
 +
***** May need to check VM and IPsec cases
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; gave Intel a script to reproduce; Intel is busy with other work, debugging in progress.
 +
****** Intel folks debugged and tried updating firmware/drivers; the VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated. After using iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 version by making iavf PMD as default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Random issue, more frequently happening on Arm
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
** VPN access request to FD.io Arm servers
 +
*** Will remind Machiek to sign Lijian's GPG public key.
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform
 +
** VPP Prefetch
 +
*** Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
 +
**** Repeat the same test on Ampere server - PMU cache-miss less for write always
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
***** Made some NEON changes; seeing some microbenchmark improvement
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang
 +
**** There may be a CSIT case named iacldstbase
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
***** Add support for VPP aarch64 docker image build
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''06/29/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Expected to be merged soon
 +
***** Flow cache is slower with 1 and 10 SPD entries; still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
 +
****** Hugepage size, numa-node, core isolation etc. may need to check.
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
 +
**** Release testing ongoing
 +
**** IPsec SPD input/output case ongoing
 +
**** Juraj may share the steps how CSIT handle new configuration changes
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms. - 4-core container case failed on x86 and Arm - Fixed and passing.
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 21.06 vs 21.01 see performance drop on https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
 +
***** May need to check VM and IPsec cases
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** Community plans to drop the support for CentOS-8.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; gave Intel a script to reproduce; Intel is busy with other work, debugging in progress.
 +
****** Intel folks debugged and tried updating firmware/drivers; the VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated. After using iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 version by making iavf PMD as default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Debugging
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
** VPN access request to FD.io Arm servers
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform
 +
** VPP Prefetch
 +
*** Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
 +
**** Repeat the same test on Ampere server - PMU cache-miss less for write always
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
***** Made some NEON changes; seeing some microbenchmark improvement
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang
 +
**** There may be a CSIT case named iacldstbase
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
***** Add support for VPP aarch64 docker image build
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''06/22/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries
 +
****** Expected to be merged soon
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
 +
**** Release testing ongoing
 +
**** IPsec SPD input/output case ongoing
 +
**** Juraj may share the steps for how CSIT handles new configuration changes
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms. - The 4-core container case fails on x86 and Arm
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** Community plans to drop the support for CentOS-8.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a reproduction script has been provided; Intel is busy with other work; debugging in progress.
 +
****** Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
 +
***** Tried to reproduce with another set of firmware, etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** vfio-pci driver may be the root cause
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
** VPN access request to FD.io Arm servers
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform
 +
** VPP Prefetch
 +
*** Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
 +
**** Repeat the same test on Ampere server
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
***** Made some NEON changes, seeing some microbenchmark improvement
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang
 +
**** There may be a CSIT case named iacldstbase
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
***** Add support for VPP aarch64 docker image build
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
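The flow cache idea noted under "VPP IPsec on Arm" above (a hash-indexed cache in front of the linear SPD search, with tests for hash collisions and stale entry overwrites) can be illustrated with a minimal Python sketch. This is not the code from the Gerrit patch: the <code>Policy</code> and <code>Spd</code> classes, the <code>(src, dst)</code> key, and the epoch counter used to detect stale entries are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical SPD rule matching inclusive source/destination ranges."""
    policy_id: int
    src_lo: int
    src_hi: int
    dst_lo: int
    dst_hi: int
    action: str


class Spd:
    """Illustrative SPD with a direct-mapped flow cache (not VPP's data structures)."""

    def __init__(self, policies, cache_slots=1024):
        self.policies = list(policies)
        self.epoch = 0                      # bumped on every policy change
        self.cache = [None] * cache_slots   # slot holds (key, epoch, policy)
        self.linear_lookups = 0             # counts slow-path searches

    def add_policy(self, policy):
        self.policies.append(policy)
        self.epoch += 1                     # lazily marks every cached entry stale

    def _linear_search(self, src, dst):
        # Slow path: first matching rule wins, cost grows with the rule count.
        self.linear_lookups += 1
        for p in self.policies:
            if p.src_lo <= src <= p.src_hi and p.dst_lo <= dst <= p.dst_hi:
                return p
        return None

    def lookup(self, src, dst):
        key = (src, dst)
        slot = hash(key) % len(self.cache)
        entry = self.cache[slot]
        # Fast path: same flow key and the entry was cached in the current epoch.
        if entry is not None and entry[0] == key and entry[1] == self.epoch:
            return entry[2]
        # Miss, hash collision, or stale entry: search and overwrite the slot.
        policy = self._linear_search(src, dst)
        self.cache[slot] = (key, self.epoch, policy)
        return policy


spd = Spd([Policy(1, 10, 20, 100, 200, "protect")])
first = spd.lookup(15, 150)    # slow path, fills the cache
second = spd.lookup(15, 150)   # served from the cache
```

In this sketch a colliding flow simply overwrites the slot, and any policy change invalidates cached entries by comparing epochs, so a repeated flow costs one hash and one comparison instead of a walk over every rule.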
 +
 
 +
'''06/15/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
***** VPP exits with the IPsec startup config, try startup config from Zach's email - Juraj
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
 +
**** Release testing ongoing
 +
**** IPsec SPD input/output case ongoing
 +
**** Juraj may share the steps for how CSIT handles new configuration changes
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms. - The 4-core container case fails on x86 and Arm
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** Community plans to drop the support for CentOS-8.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a reproduction script has been provided; Intel is busy with other work; debugging in progress.
 +
****** Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
 +
***** Tried to reproduce with another set of firmware, etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly. - DaveW
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform
 +
** VPP Prefetch
 +
*** Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
 +
**** Repeat the same test on Ampere server
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
***** Made some NEON changes, seeing some microbenchmark improvement
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang - there may be a CSIT case named iacldstbase
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Waiting for review comments on outbound side before upstream to VPP
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
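The "VPP Prefetch" item above, refactoring prefetch per the CPU's actual cache line size, comes down to how many distinct lines a byte range touches. The Python sketch below is only an illustration of that arithmetic: <code>prefetch_addresses</code> is a hypothetical helper, not a VPP function, and the 64-byte and 128-byte line sizes are example values.

```python
def prefetch_addresses(start_addr, n_bytes, cache_line_size):
    """Return the line-aligned addresses covering [start_addr, start_addr + n_bytes).

    Hypothetical helper for illustration; one prefetch per returned address
    would touch every cache line in the range exactly once.
    """
    if n_bytes <= 0:
        return []
    if cache_line_size <= 0 or cache_line_size & (cache_line_size - 1):
        raise ValueError("cache line size must be a positive power of two")
    mask = ~(cache_line_size - 1)
    first = start_addr & mask                  # line holding the first byte
    last = (start_addr + n_bytes - 1) & mask   # line holding the last byte
    return list(range(first, last + cache_line_size, cache_line_size))


# The same 128-byte region needs two prefetches with 64-byte lines
# but only one with 128-byte lines.
lines_64 = prefetch_addresses(0, 128, 64)
lines_128 = prefetch_addresses(0, 128, 128)
```

Under these assumptions, code that hard-codes a 64-byte stride would issue two prefetches to the same line on a CPU with 128-byte lines, which is the kind of redundancy a per-CPU line size avoids.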
 +
 
 +
 
 +
'''06/08/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
***** VPP exits with the IPsec startup config, try startup config from Zach's email - Juraj
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Work in progress.
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms.
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** Community plans to drop the support for CentOS-8.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a reproduction script has been provided; Intel is busy with other work; debugging in progress.
 +
****** Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
 +
***** Tried to reproduce with another set of firmware, etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results.
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
* VPP
 +
** VPP default compiler on Arm platform
 +
*** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
**** Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
 +
***** No obvious performance improvement, keep the original default compiler
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** SVE validation on FPGA platform
 +
** VPP Prefetch
 +
*** Benchmark VPP using prefetch read always vs prefetch write always - Jieqiang
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Waiting for review comments on outbound side before upstream to VPP
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
 
 +
 
 +
'''06/01/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifier tests and use DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases (policy tests) have been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
******* New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
******* IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
 +
****** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
****** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Work in progress.
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate cabling issue on Taishan performance test-bed - resolved.
 +
***** Some container cases seem to fail on all platforms.
 +
** VPP Path
 +
*** Voting and working fine.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a reproduction script has been provided; Intel is busy with other work; debugging in progress.
 +
****** Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
 +
***** Tried to reproduce with another set of firmware, etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
*** Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
 +
** Vector length specific patch is ready
 +
** Investigating VPP classify function, use case, benchmarking - Lijian
 +
*** Start with simple use case
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE Remaining works - variable type convention - need some workaround for 256bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
*** Review memif patch
 +
**** Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
 +
*** VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case - No classify test case in CSIT. - Jieqiang
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
** Work on IPsec input/output nodes - VPP uses linear search on SPD lookups - Govind & Zach
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Waiting for review comments on outbound side before upstream to VPP
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
*** Investigated CMN-600 stats in perfmon plugin
 +
**** Abandoned, CMN-600 only gives system level view, no useful stats at node level - linux perf tool can give the same result
 +
 
 +
'''05/25/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tianyu Li
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifier tests and use DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases (policy tests) have been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
******* New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
******* IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
 +
****** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
****** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - will look into it
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate cabling issue on Taishan performance test-bed - resolved.
 +
***** Some container cases seem to fail on all platforms.
 +
** VPP Path
 +
*** Voting and working fine.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a reproduction script has been provided; Intel is busy with other work; debugging in progress.
 +
****** Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
 +
***** Tried to reproduce with another set of firmware, etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Try the new version of DPDK but it does not help
 +
****** Contact Intel devs for the possible advice
 +
****** Workaround may impact too much to all test cases
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
*** Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
 +
** Vector length specific patch is ready
 +
** Investigating VPP classify function, use case, benchmarking - Lijian
 +
*** Start with simple use case
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE Remaining works - variable type convention - need some workaround for 256bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
*** Review memif patch
 +
**** Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
 +
*** VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case - No classify test case in CSIT. - Jieqiang
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** IPsec input node optimization work in progress - Zach & Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** SPD prototype change, introducing flow cache with hash, has performance improvements, discussing with community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** IPSec unit test - make test new cases implementation
 +
***** Make test cases for IPSec policy mode - Done, included in Govind's patch, waiting for maintainer review - Zach
 +
****** Add/Remove/Add+Remove+Readd/Hash collisions/Multiple interfaces & rules
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache; discussed with Neal, and the maintainer agrees with the change
 +
** perfmon CMN-600 investigating - Zach
 +
*** VPP perfmon CMN-600 patch abandoned: it is system level, not VPP node level, and Linux perf can give the same result - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec flow cache outbound done, working on inbound side in separate patch - Zach
 +
*** IPSec decryption / input node - Zach
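
The flow cache idea noted above (a hash lookup placed in front of the linear SPD search) can be sketched roughly as follows. This is only an illustration in Python, not the VPP implementation from the Gerrit change: the names <code>Policy</code>, <code>Spd</code>, <code>lookup</code> and <code>add_policy</code> are hypothetical, and clearing the whole cache on a policy change is an assumed, simplest-possible invalidation strategy.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Policy:
    """Hypothetical SPD rule: source/destination CIDR ranges and an action."""
    src: str
    dst: str
    action: str


class Spd:
    """Toy policy database: linear search with a hash-based flow cache in front."""

    def __init__(self, policies):
        self.policies = [self._compile(p) for p in policies]
        self.cache = {}           # 5-tuple -> action
        self.linear_lookups = 0   # counts how often the slow path runs

    @staticmethod
    def _compile(policy):
        return (ip_network(policy.src), ip_network(policy.dst), policy.action)

    def _linear_search(self, src, dst):
        # Slow path: walk every rule in order until one matches.
        self.linear_lookups += 1
        s, d = ip_address(src), ip_address(dst)
        for src_net, dst_net, action in self.policies:
            if s in src_net and d in dst_net:
                return action
        return None

    def lookup(self, src, dst, proto, sport, dport):
        # Fast path: an exact 5-tuple hit in the cache skips the linear search.
        key = (src, dst, proto, sport, dport)
        if key in self.cache:
            return self.cache[key]
        action = self._linear_search(src, dst)
        if action is not None:
            self.cache[key] = action
        return action

    def add_policy(self, policy):
        # Assumed invalidation: any rule change drops all cached flows.
        self.policies.append(self._compile(policy))
        self.cache.clear()
```

With one rule covering 10.0.0.0/24 to 20.0.0.0/24, repeated packets of the same flow take the slow path only once; later packets are served from the dictionary.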
 +
 
 +
'''05/18/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Zachary Leaf
 +
** Tianyu Li
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases (policy tests) have been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
******* New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
******* IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
 +
****** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
****** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308
 +
****** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** Voting and working fine.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a script to reproduce has been provided; Intel is busy with other work; debugging in progress.
 +
***** Tried to reproduce with another set of firmware, etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Try the new version of DPDK but it does not help
 +
****** Contact Intel devs for the possible advice
 +
****** Workaround may have too much impact on all test cases
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Lab move stage 2 has started; part of the servers were moved first to keep the CI service up.
 +
**** Lab move is done; some issues with the Taishan testbed
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
*** Plan to benchmark gcc-10 vs clang-12
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE remaining work - variable type convention - need some workaround for 256-bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
*** Review memif patch
 +
**** Functional bug related to C11 atomics has been resolved by VPP maintainer.
 +
*** VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case. - Jieqiang
 +
*** Make test cases for IPSec policy mode - Zach
 +
**** Add/Remove/Add+Remove+Readd/Hash collisions/Multiple interfaces & rules - Add more test cases
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Performance improvement using loop unrolling for memif nodes
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** IPsec input node optimization work in progress - Zach & Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change introducing a flow cache with hash shows a performance improvement; discussing with community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** IPSec unit test - make test new cases implementation
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache; discussed with Neal, and the maintainer agrees with the change
 +
** perfmon CMN-600 investigating - Zach
 +
*** VPP perfmon CMN-600 patch abandoned: it is system level, not VPP node level, and Linux perf can give the same result - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec flow cache outbound done, working on inbound side in separate patch - Zach
 +
*** IPSec decryption / input node - Zach
 +
 
 +
'''05/11/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Zachary Leaf
 +
** Tianyu Li
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases (policy tests) have been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
******* New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
******* IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
 +
****** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** Voting and working fine.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a script to reproduce has been provided; Intel is busy with other work; debugging in progress.
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Try the new version of DPDK but it does not help
 +
****** Contact Intel devs for the possible advice
 +
****** Workaround may have too much impact on all test cases
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Lab move stage 2 has started; part of the servers were moved first to keep the CI service up.
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
**** Almost all servers have been moved except the performance testbed, which will be moved this week; everything is smooth so far.
 +
**** Ubuntu 18.04 -> 20.04
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE remaining work - variable type convention - need some workaround for 256-bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
*** Review memif patch
 +
*** VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case.
 +
*** Make test cases for IPSec policy mode - Jieqiang
 +
** Review memif test cases/memif cases
 +
** Finished coding of SVE string library, bihash key compare functions
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** SVE unit testing based on test_vec, fix test_vec issues
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do readout presentation with extended people - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Performance improvement using loop unrolling for memif nodes
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** IPsec input node optimization work in progress - Zach & Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change introducing a flow cache with hash shows a performance improvement; discussing with community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** IPSec unit test - make test new cases implementation
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache; discussed with Neal, and the maintainer agrees with the change
 +
** Using startup parameter to enable the IPsec flow cache feature
 +
** Adding Python test case to test IPSec node behavior - Jieqiang
 +
** perfmon CMN-600 investigating - Zach
 +
*** VPP perfmon CMN-600 patch abandoned: it is system level, not VPP node level, and Linux perf can give the same result - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec flow cache outbound done, working on inbound side in separate patch - Zach
 +
*** IPSec decryption / input node - Zach
 +
 
 +
'''04/27/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases (policy tests) have been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
***** https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Its voting right is enabled on Arm.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Try the new version of DPDK but it does not help
 +
****** Contact Intel devs for the possible advice
 +
****** Workaround may have too much impact on all test cases
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Lab move stage 2 has started; part of the servers were moved first to keep the CI service up.
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE remaining work - variable type convention - need some workaround for 256-bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** Make test cases for IPSec policy mode - Jieqiang
 +
** Review memif test cases/memif cases
 +
** Finished coding of SVE string library, bihash key compare functions
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** SVE unit testing based on test_vec, fix test_vec issues
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do readout presentation with extended people - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Performance improvement using loop unrolling for memif nodes
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** IPsec input node optimization work in progress - Zach & Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change introducing a flow cache with hash shows a performance improvement; discussing with community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** IPSec unit test - make test new cases implementation
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache; discussed with Neal, and the maintainer agrees with the change
 +
** Using startup parameter to enable the IPsec flow cache feature
 +
** Adding Python test case to test IPSec node behavior - Jieqiang
 +
** perfmon CMN-600 investigating - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec decryption / input node - Zach
 +
 
 +
'''04/13/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases (policy tests) have been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Some issues occurred during the upgrade.
 +
***** Patch to resolve the building error of DPDK on 3n-tsh testbed.
 +
***** Root cause is the change of build system of DPDK on 3n-tsh related to SOC id detection.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
***** https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Its voting right is enabled on Arm.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Try the new version of DPDK but it does not help
 +
****** Contact Intel devs for the possible advice
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE remaining work - variable type convention - need some workaround for 256-bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
*** Make test cases for IPSec policy mode - Jieqiang
 +
** Review memif test cases/memif cases
 +
** Finished coding of SVE string library, bihash key compare functions
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** SVE unit testing based on test_vec, fix test_vec issues
 +
** Test template update - Jieqiang
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do readout presentation with extended people - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change introducing a flow cache with hash shows a performance improvement; discussing with community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache; discussed with Neal, and the maintainer agrees with the change
 +
** Using startup parameter to enable the IPsec flow cache feature
 +
** Adding Python test case to test IPSec node behavior - Jieqiang
 +
** perfmon CMN-600 investigating - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec decryption - Zach
 +
 
 +
 
 +
'''03/30/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** 2 node IPsec SPD policy test case patch is ready, starting with 1 and 1k tunnels. (40, 400 tunnels in separate patch)
 +
****** https://gerrit.fd.io/r/c/csit/+/31605
 +
****** Fixed the wrong CLI commands, but the configuration still has problems.
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Some issues occurred during the upgrade.
 +
***** Patch to resolve the DPDK build error on the Arm testbed. (Taishan DPDK cases still have issues, investigating)
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
***** https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Its voting right is enabled on Arm.
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Try the new version of DPDK but it does not help
 +
****** Contact Intel devs for the possible advice
 +
**** Will try to reproduce the issue with x86 servers.
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Review memif test cases/memif cases
 +
** Finished coding of SVE string library, bihash key compare functions
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** SVE unit testing based on test_vec, fix test_vec issues
 +
** Test template update
 +
** SVE unit test in qemu-vm, met compiling issue, investigating
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do readout presentation with extended people - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Prepare the memif readout - Tianyu
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change introducing a flow cache with hash shows a performance improvement; discussing with community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache; discussed with Neal, and the maintainer agrees with the change
 +
** Using startup parameter to enable the IPsec flow cache feature
 +
** Discuss with Jieqiang adding a Python test case to test IPsec node behavior
 +
** perfmon CMN-600 investigating - Zach
 +
 
 +
'''03/16/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
*** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Its voting right is enabled on Arm.
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version.
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
**** Will try to reproduce the issue with x86 servers.
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128- and 256-bit fixed-size vector wrappers are ready and need verification
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do a readout presentation for an extended audience - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change, which introduces a hash-based flow cache, shows a performance improvement; discussing with the community.
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** Upstream patch depends on a kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
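The SPD lookup item above contrasts a per-packet linear policy search with a hash-keyed flow cache. The following is a minimal Python sketch of that idea only, not VPP's implementation: the <code>Policy</code> fields, the 5-tuple key layout, and the names <code>linear_lookup</code> and <code>FlowCache</code> are all hypothetical, and the exact-match rule stands in for real SPD selector matching.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical flow key: (src, dst, protocol, src port, dst port).
FlowKey = Tuple[str, str, int, int, int]


@dataclass(frozen=True)
class Policy:
    """Illustrative policy entry; real SPD entries match ranges, not exact hosts."""
    name: str
    src: str
    dst: str
    action: str  # e.g. "protect", "bypass", "discard"


def linear_lookup(spd: List[Policy], key: FlowKey) -> Optional[Policy]:
    """Walk the policy list in order: cost grows with the number of policies."""
    src, dst, _proto, _sport, _dport = key
    for policy in spd:
        if policy.src == src and policy.dst == dst:
            return policy
    return None


class FlowCache:
    """Remember the result of the linear search per flow key."""

    def __init__(self, spd: List[Policy]) -> None:
        self.spd = spd
        self.cache: Dict[FlowKey, Optional[Policy]] = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, key: FlowKey) -> Optional[Policy]:
        if key in self.cache:          # hash lookup, independent of SPD size
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        policy = linear_lookup(self.spd, key)  # slow path, first packet of a flow
        self.cache[key] = policy               # negative results are cached too
        return policy
```

Only the first packet of a flow pays for the linear search; later packets with the same key are resolved by the dictionary. A real cache would also need to be invalidated whenever the SPD changes, which this sketch omits.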
 +
 
 +
'''03/09/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
**** CSIT official release 21.01 is available
 +
***** https://docs.fd.io/csit/rls2101/report/
 +
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
 +
******* 20.09 vs 21.01: 'show run' vectors per call drops from 256 to 200; need to check DPDK version changes
 +
******* Perf drop only observed for VM cases
 +
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
 +
****** Check the number for CSIT 2101 release 
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers need investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
***** Will investigate IPSec test cases on 2n-tx2 - Juraj
 +
****** Maintainer confirmed that it is feasible
 +
******* Patch merged, https://gerrit.fd.io/r/c/csit/+/31309
 +
******* Patch created for daily running https://gerrit.fd.io/r/c/csit/+/31478
 +
******* crypto tests will be enabled on daily and report Jenkins job
 +
******* IPv6 / policy mode crypto test cases to be investigated and added
 +
****** Add memif test case to 2n-tx2 once the release testing is done.
 +
******* Takes about 1 to 1.5 hours for one round of memif testing.
 +
**** release testing for 2n-tx2
 +
***** Performance data added to daily trending page
 +
****** https://docs.fd.io/csit/master/trending/introduction/dashboard.html#n-tx2
 +
****** Test cases include L2/IPv4/IPv6/Classifier/ACL
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will not be supported.
 +
**** CentOS-8 will be supported by the end of this year by Red Hat.
 +
*** CentOS-8 on Arm Jenkins job is created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Job is enabled https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Running per patch and voting right is enabled
 +
***** Maintainer asked for more servers for the sake of redundancy
 +
****** Sync with Dave for ARM server requirement
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to ensure multiple VPP instances are not started at the same time - Juraj
 +
**** Dependency on maintainers to fix this issue
 +
**** Intel will ship a new NIC with latest firmware
 +
***** Shipment takes a long time empirically
 +
****** NIC has been shipped to Vexxhost; waiting for it to arrive.
 +
***** Try to reproduce the issue on this NIC on Arm platform
 +
***** Updating firmware on the current NIC is risky
 +
**** Voting rights will be enabled once this issue is fixed
 +
****** Maintainer raised a ticket to get Intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable to CSIT maintainers
 +
*** Arm is required to present its achievements and plans to the TSC.
 +
**** Govind will prepare the slides
 +
**** Will show Arm roadmap in the next TSC meeting
 +
**** Proposals have been sent to VPP maintainer on verifying SVE/SVE2
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
 
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
**** 128- and 256-bit fixed-size vector wrappers are ready and need verification
 +
**** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with Ampere or N1 team on how to enable CMN-600 counters for Ampere Altra
 +
***** Removed interrupts on Altra, but no performance improvement seen
 +
***** Instruction cache misses are higher on Altra than on N1
 +
*** I/O testing is doable with specific PCIe slots where PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
** VPP compiling error on CentOS 7 - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/31421
 +
**** CentOS 7 build issue has been fixed
 +
*** Developing NEON wrapper to SVE 128/256bit on qemu
 +
 
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change, which introduces a hash-based flow cache, shows a performance improvement; discussing with the community.
 +
**** perfmon plugin enablement on Arm - Zach
 +
***** Upstream patch depends on a kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''02/23/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
**** CSIT official release 21.01 is available
 +
***** https://docs.fd.io/csit/rls2101/report/
 +
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
 +
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
 +
****** Check the number for CSIT 2101 release 
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers need investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
****** Will investigate IPSec test cases on 2n-tx2 - Juraj
 +
******* Maintainer confirmed that it is feasible
 +
******* Patch created, https://gerrit.fd.io/r/c/csit/+/31309
 +
******* crypto tests will be enabled on daily and report Jenkins job
 +
****** Add memif test case to 2n-tx2 once the release testing is done.
 +
******* Takes about 1 to 1.5 hours for one round of memif testing.
 +
**** release testing for 2n-tx2
 +
***** Performance data added to daily trending page
 +
****** https://docs.fd.io/csit/master/trending/introduction/dashboard.html#n-tx2
 +
****** Test cases include L2/IPv4/IPv6/Classifier/ACL
 +
****** Release testing done for 2n-tx2, ongoing for 3n-tsh (due next week)
 +
****** Release report planned to be published on 10th Feb
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins job is created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Job is enabled https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Running per patch and voting right is enabled
 +
***** Maintainer asked for more servers for the sake of redundancy
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to ensure multiple VPP instances are not started at the same time - Juraj
 +
**** Dependency on maintainers to fix this issue
 +
**** Intel will ship a new NIC with latest firmware
 +
***** Shipment takes a long time empirically
 +
***** Try to reproduce the issue on this NIC on Arm platform
 +
***** Updating firmware on the current NIC is risky
 +
**** Voting rights will be enabled once this issue is fixed
 +
****** Maintainer raised a ticket to get Intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable to CSIT maintainers
 +
*** Arm is required to present its achievements and plans to the TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker - Lijian
 +
**** Latest VPP binary crashes on the QEMU docker
 +
***** System call fails inside QEMU docker when running VPP
 +
**** Verify SVE/SVE2 features inside ARM QEMU VM
 +
**** VPP maintainers want real hardware to verify SVE code
 +
***** This solution will be abandoned.
 +
**** 'make test' execution is slow
 +
**** Sync with DPDK team/VPP community to decide the solution
 +
**** Proposals have been sent to VPP maintainer on verifying SVE/SVE2
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
 
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
**** 128- and 256-bit fixed-size vector wrappers are ready and need verification
 +
**** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No positive effect from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with Ampere or N1 team on how to enable CMN-600 counters for Ampere Altra
 +
***** Removed interrupts on Altra, but no performance improvement seen
 +
***** Instruction cache misses are higher on Altra than on N1
 +
*** I/O testing is doable with specific PCIe slots where PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP memif test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
**** 3 use cases are investigated.
 +
**** Will explore the memif logic and share the progress.
 +
**** Will share the link on details about how to run VPP in container.
 +
*** Investigate VPP agent usage - Tianyu
 +
**** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
** VPP compiling error on CentOS 7 - Jieqiang
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** perfmon plugin enablement on Arm - Zach
 +
***** Upstream patch depends on a kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''02/09/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
**** CSIT official release 20.09 is available
 +
***** https://docs.fd.io/csit/rls2009/report/
 +
**** CSIT official release 21.01 is ongoing
 +
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
 +
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
 +
****** Check the number for CSIT 2101 release 
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers need investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
****** Will investigate IPSec test cases on 2n-tx2 - Juraj
 +
******* Maintainer confirmed that it is feasible
 +
****** Add memif test case to 2n-tx2 once the release testing is done.
 +
**** release testing for 2n-tx2
 +
***** Performance data added to daily trending page
 +
****** https://docs.fd.io/csit/master/trending/introduction/dashboard.html#n-tx2
 +
****** Test cases include L2/IPv4/IPv6/Classifier/ACL
 +
****** Release testing done for 2n-tx2, ongoing for 3n-tsh (due next week)
 +
****** Release report planned to be published on 10th Feb
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins job is created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Job is enabled https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Running per patch and voting right is enabled
 +
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
****** Will verify the image uploaded by Dave if it is ready.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
**** Jenkins verify job runs fine but slowly
 +
***** https://gerrit.fd.io/r/c/ci-management/+/31083
 +
***** Maintainer asked for more servers for the sake of redundancy
 +
**** 'make test' failure on Ubuntu 20.04 AArch64
 +
***** Dave has sent email for the details
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to ensure multiple VPP instances are not started at the same time - Juraj
 +
**** Dependency on maintainers to fix this issue
 +
**** Intel will ship a new NIC with latest firmware
 +
***** Shipment takes a long time empirically
 +
***** Try to reproduce the issue on this NIC on Arm platform
 +
***** Updating firmware on the current NIC is risky
 +
**** Voting rights will be enabled once this issue is fixed
 +
****** Maintainer raised a ticket to get Intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable to CSIT maintainers
 +
*** Arm is required to present its achievements and plans to the TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker - Lijian
 +
**** Latest VPP binary crashes on the QEMU docker
 +
***** System call fails inside QEMU docker when running VPP
 +
**** Verify SVE/SVE2 features inside ARM QEMU VM
 +
**** 'make test' execution is slow
 +
**** Sync with DPDK team/VPP community to decide the solution
 +
**** Proposals have been sent to VPP maintainer on verifying SVE/SVE2
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No positive effect from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with Ampere or N1 team on how to enable CMN-600 counters for Ampere Altra
 +
***** Removed interrupts on Altra, but no performance improvement seen
 +
***** Instruction cache misses are higher on Altra than on N1
 +
*** I/O testing is doable with specific PCIe slots where PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP memif test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
**** 3 use cases are investigated.
 +
**** Will explore the memif logic and share the progress.
 +
**** Will share the link on details about how to run VPP in container.
 +
*** Investigate VPP agent usage - Tianyu
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''02/02/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
**** CSIT official release 20.09 is available
 +
***** https://docs.fd.io/csit/rls2009/report/
 +
**** CSIT official release 21.01 is ongoing
 +
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
 +
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
 +
****** Check the number for CSIT 2101 release 
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers need investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
****** Will investigate IPSec test cases on 2n-tx2 - Juraj
 +
******* Maintainer confirmed that it is feasible
 +
****** Add memif test case to 2n-tx2 once the release testing is done.
 +
**** release testing for 2n-tx2
 +
***** Performance data added to daily trending page
 +
****** https://docs.fd.io/csit/master/trending/introduction/dashboard.html#n-tx2
 +
****** Test cases include L2/IPv4/IPv6/Classifier/ACL
 +
****** Release report planned to be published on 10th Feb
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins job is created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
****** Will verify the image uploaded by Dave if it is ready.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
**** Jenkins verify job runs fine but slowly
 +
***** https://gerrit.fd.io/r/c/ci-management/+/31083
 +
***** Maintainer asked for more servers for the sake of redundancy
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to ensure multiple VPP instances are not started at the same time - Juraj
 +
**** Dependency on maintainers to fix this issue
 +
**** Voting rights will be enabled once this issue is fixed
 +
***** Implementation is ready and will be tested with actual patches.
 +
***** Apply a file-locking mechanism so that only one VPP instance runs at a time.
 +
****** https://gerrit.fd.io/r/c/csit/+/30425
 +
****** Patches are under review
 +
****** Maintainer raised a ticket to get Intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable to CSIT maintainers
 +
*** Arm is required to present its achievements and plans to the TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker - Lijian
 +
**** Latest VPP binary crashes on the QEMU docker
 +
***** System call fails inside QEMU docker when running VPP
 +
**** Verify SVE/SVE2 features inside ARM QEMU VM
 +
**** 'make test' execution is slow
 +
**** Sync with DPDK team/VPP community to decide the solution
 +
**** Proposals have been sent to VPP maintainer on verifying SVE/SVE2
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No positive effect from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with Ampere or N1 team on how to enable CMN-600 counters for Ampere Altra
 +
***** Removed interrupts on Altra, but no performance improvement seen
 +
***** Instruction cache misses are higher on Altra than on N1
 +
*** I/O testing is doable with specific PCIe slots where PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP memif test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
**** 3 use cases are investigated.
 +
**** Will explore the memif logic and share the progress.
 +
**** Will share the link on details about how to run VPP in container.
 +
*** Investigate VPP agent usage - Tianyu
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
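The VPP Device workaround above serializes VPP startup with a file lock. A minimal Python sketch of that general technique follows, assuming a POSIX host; it is not the content of the CSIT patch, and the helper names <code>exclusive_lock</code> and <code>is_locked</code> as well as any lock-file path passed to them are hypothetical.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Hold an exclusive advisory lock on `path` for the duration of the block."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)      # blocks until no one else holds the lock
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)  # release before closing the descriptor
    finally:
        os.close(fd)


def is_locked(path):
    """Return True if another open file description currently holds the lock."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)  # non-blocking attempt
    except BlockingIOError:
        return True
    else:
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)
```

A test runner would wrap the VPP start-up step in <code>with exclusive_lock(...)</code>, so that a second runner on the same host waits until the first has finished. The lock is advisory: it only helps if every process that starts VPP takes the same lock file.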
 +
'''01/19/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
**** CSIT official release 20.09 is available
 +
***** https://docs.fd.io/csit/rls2009/report/
 +
***** Jieqiang will compare the performance data with release 20.09
 +
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
 +
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
 +
****** Check the number for CSIT 2101 release 
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done; Physical connection to the TG is done.
 +
**** Almost done; two steps remain
 +
***** Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Take the execution time into consideration if we want to run release testing on 2n-thx2.
 +
***** It takes 9 hours to finish one round of testing.
 +
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers need investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
****** Will investigate IPSec test cases on 2n-tx2 - Juraj
 +
****** Add memif test case to 2n-tx2 once the release testing is done.
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins job is created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
****** Will verify the image uploaded by Dave if it is ready.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** The VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to ensure that multiple VPP instances do not start at the same time - Juraj
 +
***** Implementation is ready and will be tested with actual patches.
 +
***** Apply a file-locking mechanism so that only one VPP instance is running at a time.
 +
****** https://gerrit.fd.io/r/c/csit/+/30425
 +
****** Patches are under review
 +
****** Machiek raised a ticket to get Intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** VPP device testing on TX2 currently takes around 40-45 minutes
 +
**** This is acceptable to the CSIT maintainers
 +
*** Arm is required to present its achievements and plans to the TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker
 +
**** The latest VPP binary crashes in the QEMU docker container - Lijian
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** The ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** Two SVE/SVE2 proposals are upstreamed; will discuss them with the maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with the Ampere or N1 team on how to enable CMN-600 counters for Ampere Altra
 +
*** I/O testing is doable with specific PCIe slots, where PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP memif test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
**** 3 use cases are investigated.
 +
**** Will explore the memif logic and share the progress.
 +
**** Will share the link on details about how to run VPP in container.
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
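
The file-locking workaround mentioned under VPP Device in these minutes can be illustrated with a minimal sketch. This is not the code from the CSIT patch under review; it only shows the general idea of serializing a critical section with an advisory lock, assuming a Linux host. The names <code>startup_lock</code> and <code>try_lock</code> and the lock file path are hypothetical.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def startup_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path while the block runs."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # Blocks until any other holder releases the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield fd
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def try_lock(lock_path):
    """Non-blocking probe: return True if the lock is currently free."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return False  # someone else holds the lock
    else:
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical lock file; a real setup would use a path shared by all jobs.
    path = os.path.join(tempfile.mkdtemp(), "vpp_start.lock")
    with startup_lock(path):
        # A process would start its VPP instance here, one at a time.
        print("lock free while held?", try_lock(path))
    print("lock free after release?", try_lock(path))
```

Because the lock is advisory, it only helps if every job that starts VPP takes the same lock file before doing so.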
 +
 
 +
 
 +
 
 +
'''01/05/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
**** CSIT official release 20.09 is available
 +
***** https://docs.fd.io/csit/rls2009/report/
 +
***** Jieqiang will compare the performance data with release 20.09
 +
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
 +
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP. 
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done; Physical connection to the TG is done.
 +
**** Almost done; two steps remain
 +
***** Start with basic L2/L3/IPSec/ACL/classifier tests using the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Take the execution time into consideration if we want to run release testing on 2n-thx2.
 +
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on the 2-node topology, and ACL/classifier tests need investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** A CentOS-8 on Arm Jenkins job has been created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** The VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to ensure that multiple VPP instances do not start at the same time - Juraj
 +
***** Implementation is ready and will be tested with actual patches.
 +
***** Apply a file-locking mechanism so that only one VPP instance is running at a time.
 +
****** https://gerrit.fd.io/r/c/csit/+/30425
 +
****** Patches are under review
 +
*** VPP device testing on TX2 currently takes around 40-45 minutes
 +
**** This is acceptable to the CSIT maintainers
 +
*** Arm is required to present its achievements and plans to the TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker
 +
**** The latest VPP binary crashes in the QEMU docker container - Lijian
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** The ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** Two SVE/SVE2 proposals are upstreamed; will discuss them with the maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
*** I/O testing is doable with specific PCIe slots, where PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
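
The compiler priority recorded in these minutes for Ubuntu-20.04 (gcc-10 > gcc-9.2.0 > clang-10) amounts to a first-match search over an ordered list. The sketch below only illustrates that ordering; it is not how the VPP build system selects a compiler. The executable names (in particular <code>gcc-9</code> standing in for gcc-9.2.0) and the function name <code>pick_compiler</code> are assumptions.

```python
import shutil

# Preference order taken from the minutes; the executable names are assumptions.
PREFERRED_COMPILERS = ("gcc-10", "gcc-9", "clang-10")


def pick_compiler(candidates=PREFERRED_COMPILERS, which=shutil.which):
    """Return (name, path) of the first candidate found on PATH, else None."""
    for name in candidates:
        path = which(name)
        if path:
            return name, path
    return None


if __name__ == "__main__":
    # Prints the highest-priority compiler installed on this machine, or None.
    print(pick_compiler())
```

The <code>which</code> parameter exists only so the lookup can be replaced in tests; by default it is <code>shutil.which</code>.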
 +
 
 +
'''12/22/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
** Will cancel the meeting on Dec 29th.
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
**** CSIT official release 20.09 is available
 +
***** https://docs.fd.io/csit/rls2009/report/
 +
***** Jieqiang will compare the performance data with release 20.05
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done; Physical connection to the TG is done.
 +
**** Almost done; a few steps remain
 +
***** Code to update the Jenkins job needs to be merged
 +
***** Start with basic L2/L3/IPSec/ACL/classifier tests using the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Take the execution time into consideration if we want to run release testing on 2n-thx2.
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** A CentOS-8 on Arm Jenkins job has been created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** The VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to ensure that multiple VPP instances do not start at the same time - Juraj
 +
***** Implementation is ready and will be tested with actual patches.
 +
***** Apply a file-locking mechanism so that only one VPP instance is running at a time.
 +
****** https://gerrit.fd.io/r/c/csit/+/30425
 +
*** VPP device testing on TX2 currently takes around 40-45 minutes
 +
**** This is acceptable to the CSIT maintainers
 +
*** LF will provide QSFP+ fiber switch for FD.io lab.
 +
**** Basically done. LF has just procured the existing fiber switch currently rented by Arm in the FD.io lab.
 +
**** Send the progress update to the relevant people in Arm - Lijian
 +
**** Confirm with Tina to ensure Arm is not charged - Lijian
 +
*** Arm is required to present its achievements and plans to the TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features on VPP CSIT
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** The ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** Two SVE/SVE2 proposals are upstreamed; will discuss them with the maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
*** I/O testing is doable with specific PCIe slots, where PCIe bandwidth is not the bottleneck.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
 
 +
'''12/15/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
** Will cancel the meeting on Dec 29th.
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
**** CSIT official release 20.09 is available
 +
***** https://docs.fd.io/csit/rls2009/report/
 +
***** Jieqiang will compare the performance data with release 20.05
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done; Physical connection to the TG is done.
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** A CentOS-8 on Arm Jenkins job has been created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** The VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to ensure that multiple VPP instances do not start at the same time - Juraj
 +
***** Implementation is ready and will be tested with actual patches.
 +
*** VPP device testing on TX2 currently takes around 40-45 minutes
 +
**** This is acceptable to the CSIT maintainers
 +
*** LF will provide QSFP+ fiber switch for FD.io lab.
 +
**** Basically done. LF has just procured the existing fiber switch currently rented by Arm in the FD.io lab.
 +
**** Send the progress update to the relevant people in Arm - Lijian
 +
*** Arm is required to present its achievements and plans to the TSC.
 +
**** Govind will prepare the slides
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** Two SVE/SVE2 proposals are upstreamed; will discuss them with the maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
*** I/O testing is doable with specific PCIe slots, where PCIe bandwidth is not the bottleneck.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
 
 +
'''12/08/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
*** Use the spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done.
 +
**** Physical connection to the TG is done.
 +
**** Software installation for the perf tests is pending.
 +
**** Execution is much slower due to ThunderX
 +
***** Code changes related to SSH calls give a 4x speedup.
 +
** VPP Path
 +
*** Dave will add CentOS-8 Jenkins on Arm job
 +
*** Dave is preparing scripts to generate docker images automatically on both x86 and Arm (first step)
 +
*** A CentOS-8 on Arm Jenkins job has been created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** Working with VPP/DPDK/Intel to root cause this issue. - Juraj
 +
*** VPP device testing on TX2 currently takes around 40-45 minutes
 +
**** This is acceptable to the CSIT maintainers
 +
*** LF will provide QSFP+ fiber switch for FD.io lab.
 +
**** Vexxhost just has a spare one, and LF will buy it for FD.io lab, which will probably happen this month.
 +
*** N1SDP shipment to FD.io
 +
**** Govind will track the status
 +
*** CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
 +
**** Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Machiek to provide 10G switch.
 +
**** Arm is required to present its achievements and plans to the TSC.
 +
***** Govind will prepare the slides
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** The key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** SOC id will be available on /proc entry starting from kernel version 5.9
 +
**** Will investigate the details - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Benchmarked cross-connect and TX queue is dropping packets
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** To optimize ethernet-input and avf-input node with NEON intrinsics
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals upstreamed
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** Have to repeat the testing in the future.
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** The Ampere Altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''12/01/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/ - Done
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/ - Done
 +
*** 20% perf-drop with L2 learning 1Mx flows, 4T4C, in release-2005
 +
**** Issue caused by - https://gerrit.fd.io/r/c/vpp/+/26549 - Sync up with Lijian
 +
*** Perf data capture for CSIT official release is done, so MRR testing with Taishan server is resolved.
 +
**** Huge-pages are not configured on Taishan, or previous 4K huge-pages are not enough.
 +
***** The issues are gone with 32k huge pages configured on the Taishan servers.
 +
**** Some random failed test cases due to SSH connection failures.
 +
*** Use the spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done.
 +
**** Physical connection to the TG is done.
 +
**** Software installation for the perf tests is pending.
 +
**** Execution is much slower due to ThunderX
 +
***** Code changes related to SSH calls give a 4x speedup.
 +
** VPP Path
 +
*** Dave is preparing scripts to generate docker images automatically on both x86 and Arm (first step)
 +
*** A CentOS-8 on Arm Jenkins job has been created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022 - auto-generate docker image
 +
**** Will keep the CentOS 7 with master branch.
 +
** VPP Device
 +
*** VPP device testing on TX2 currently takes around 40-45 minutes
 +
*** LF will provide QSFP+ fiber switch for FD.io lab.
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in the FD.io lab. SSH and IPMI connections are working.
 +
**** To enable voting rights for the VPP device jobs. - Juraj
 +
***** Tests failed due to a sw_interface_dump API issue. - Juraj
 +
**** VPP device job is unstable
 +
***** Race condition occurs when multiple VPP instances are starting.
 +
***** Will try to update the i40e driver & firmware.
 +
*** N1SDP shipment to FD.io
 +
**** Govind will send the shipment status to Juraj and Machiek.
 +
**** Will still have to ship N1SDP to FD.io lab, and Machiek confirmed that there will be enough rack space.
 +
*** CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
 +
**** Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Machiek to provide 10G switch.
 +
**** Arm is required to present its achievements and plans to the TSC.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** The key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** SOC id will be available on /proc entry starting from kernel version 5.9
 +
**** Will investigate the details - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** To optimize ethernet-input and avf-input node with NEON intrinsics
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 proposal
 +
*** Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
 +
*** Patches are upstreamed for comments
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** The Ampere Altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches with ipsec-out node
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''11/24/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
*** 20% perf-drop with L2 learning 1Mx flows, 4T4C, in release-2005
 +
**** Issue caused by - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** Perf data capture for CSIT official release is done, so MRR testing with Taishan server is resolved.
 +
**** Huge-pages are not configured on Taishan, or previous 4K huge-pages are not enough.
 +
*** Use the spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done.
 +
** VPP Path
 +
*** Dave is preparing scripts to generate docker images automatically on both x86 and Arm (first step)
 +
*** A CentOS-8 on Arm Jenkins job has been created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022 - auto-generate docker image
 +
** VPP Device
 +
*** VPP device testing on TX2 currently takes around 40-45 minutes
 +
*** LF will provide QSFP+ fiber switch for FD.io lab.
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in the FD.io lab. SSH and IPMI connections are working.
 +
**** To enable voting rights for the VPP device jobs. - Juraj
 +
***** Tests failed due to a sw_interface_dump API issue. - Juraj
 +
*** N1SDP shipment to FD.io
 +
**** Govind will send the shipment status to Juraj and Machiek.
 +
**** Will still have to ship N1SDP to FD.io lab, and Machiek confirmed that there will be enough rack space.
 +
*** CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
 +
**** Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Machiek to provide 10G switch.
 +
**** Arm is required to present its achievements and plans to the TSC.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** The key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** SOC id will be available on /proc entry starting from kernel version 5.9
 +
**** Will investigate the details - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** To optimize ethernet-input and avf-input node with NEON intrinsics
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 proposal
 +
*** Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
 +
*** Patches are upstreamed for comments
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** The Ampere Altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches with ipsec-out node
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''11/17/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
*** 20% perf-drop with L2 learning 1Mx flows, 4T4C, in release-2005
 +
**** Issue caused by - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** Use the spare TX2 server as 2-node topology performance test-bed.
 +
**** Juraj to check with Peter about the feasibility.
 +
** VPP Path
 +
*** Dave is preparing scripts to generate docker images automatically on both x86 and Arm (first step)
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022 - auto-generate docker image
 +
** VPP Device
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** To enable voting right for the VPP device jobs. - Juraj
 +
***** Failed tests due to sw_interface_dump api issue. - Juraj
 +
*** N1SDP shipment to FD.io
 +
**** Will still have to ship N1SDP to FD.io lab, and Maciek confirmed that there will be enough rack space.
 +
*** CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
 +
**** Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Maciek to provide a 10G switch.
 +
**** Arm is required to present its achievements and plans to the TSC.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** SOC id will be available on /proc entry starting from kernel version 5.9
 +
**** Will investigate the details - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** To optimize ethernet-input and avf-input node with NEON intrinsics
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 proposal
 +
*** Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
 +
*** Patches are upstreamed for comments
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Ampere Altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches with ipsec-out node
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
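The loop-unrolling idea for the linear SPD lookup can be sketched as follows. The policy layout and all names (<code>toy_policy_t</code>, <code>toy_spd_lookup</code>, <code>toy_spd_lookup_unrolled4</code>) are hypothetical and much simpler than VPP's real SPD structures; the sketch only shows how a 4x unrolled search can keep first-match semantics while giving the CPU four independent comparisons per iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified policy: matches an inclusive destination range.
 * This is an assumption for illustration, not VPP's SPD entry format. */
typedef struct {
    uint32_t dst_lo;
    uint32_t dst_hi;
    uint32_t action;
} toy_policy_t;

static inline int toy_match(const toy_policy_t *p, uint32_t dst)
{
    return dst >= p->dst_lo && dst <= p->dst_hi;
}

/* Baseline linear search: index of the first matching policy, or -1. */
int toy_spd_lookup(const toy_policy_t *spd, size_t n, uint32_t dst)
{
    for (size_t i = 0; i < n; i++)
        if (toy_match(&spd[i], dst))
            return (int)i;
    return -1;
}

/* 4x unrolled variant: evaluate four policies per iteration, then test
 * the results in order so the first match still wins. */
int toy_spd_lookup_unrolled4(const toy_policy_t *spd, size_t n, uint32_t dst)
{
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        int m0 = toy_match(&spd[i + 0], dst);
        int m1 = toy_match(&spd[i + 1], dst);
        int m2 = toy_match(&spd[i + 2], dst);
        int m3 = toy_match(&spd[i + 3], dst);

        if (m0) return (int)(i + 0);
        if (m1) return (int)(i + 1);
        if (m2) return (int)(i + 2);
        if (m3) return (int)(i + 3);
    }
    /* Remainder loop for the last n % 4 policies. */
    for (; i < n; i++)
        if (toy_match(&spd[i], dst))
            return (int)i;
    return -1;
}

/* Sanity helper: both variants must agree for every key in [0, max_dst). */
void toy_spd_check_equivalence(const toy_policy_t *spd, size_t n, uint32_t max_dst)
{
    for (uint32_t d = 0; d < max_dst; d++)
        assert(toy_spd_lookup(spd, n, d) == toy_spd_lookup_unrolled4(spd, n, d));
}
```

Whether 2x or 4x unrolling pays off depends on the core and on how many policies are typically scanned, so the two variants would need to be benchmarked against each other on the target server.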
 +
 
 +
'''11/10/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
***** Already done by Juraj; the data is published in the CSIT 2009 report.
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
**** Repeat tests on local N1SDP and cascade server. - Jieqiang
 +
**** Repeat the test case with latest master branch. - Jieqiang
 +
**** The patch that introduced this perf drop needs to be analyzed. - Jieqiang, Lijian
 +
**** This patch needs to be analysed on VPP 2005 and 2001 releases. - Jieqiang, Lijian
 +
**** The perf drop rate is ~5-8% on latest VPP code compared to the original data.
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
**** The iterative jobs for VPP 2009 are still running.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-report-iterative-2009-3n-tsh/
 +
**** Still running for one more week.
 +
**** Still running for more time due to Jenkins issues like Jenkins restart.
 +
**** Daily performance jobs only run MRR tests, while iterative jobs run MRR tests and NDR/PDR tests, etc, which takes longer time.
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
**** Move the thx2 to the same rack for tg and install the same nic on tg.
 +
**** 1g NIC for management installed on thx2, but cannot be net-booted.
 +
***** Able to net-boot from the built-in 10G NIC.
 +
***** The tx2 has been moved to the same rack where the tg is located.
 +
***** Plan to set up the weekly perf tests on the new topo.
 +
**** Port the robotframe configuration steps for tsh testbeds from thx1 to thx2 to speed up perf tests. - Juraj
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022 automate the generation of docker builder images.
 +
**** Dave plans to drop support for CentOS 7.
 +
**** Tried Dave's patch to generate docker image on Arm and saw some errors. - Juraj
 +
**** Test arm centos7 jenkins builder image. - Juraj.
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
 +
** VPP Device
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
 +
**** Figure out which host of two hosts to run the Jenkins job.
 +
**** Revert to old kernel version 4.15.0-55 to avoid AVF issue.
 +
**** AVF issue is common across the platform.
 +
***** Differences between avf driver versions may be the root cause of behavior changes.
 +
**** New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 45 minutes.
 +
***** Python runs slower on the new thx2 servers than on 1-node Skylake.
 +
***** Try a newer version of Python (such as 3.8) or split the device tests into two parts.
 +
***** Check how many CPUs get utilized for robot framework execution on thx2 server.
 +
***** Two thunderx2 are running fine right now and the VPP device jobs are almost done.
 +
***** Disabling hyperthreading on new thx2 will speed up the VPP device tests.
 +
***** Enable the voting right for the VPP device jobs. - Juraj
 +
****** Failed tests due to sw_interface_dump api issue. - Juraj
 +
*** N1SDP shipment to FD.io
 +
**** Get response from Maciek about the rack space and traffic generator availability.
 +
*** CSIT budget plan for 10g switch purchase in FD.io lab. - Juraj, Tina
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
 +
*** Summarize the meeting minutes and action items. - Lijian
 +
*** SOC id will be available on /proc entry from kernel version 5.9 - Lijian, Honnappa
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
*** Start benchmarking AVF PMD driver in VPP on N1SDP.
 +
*** Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
 +
*** avf_input and avf_output nodes don't consume as many CPU cycles as dpdk-related nodes do.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
*** Apply the SVE/SVE2 intrinsics on refactoring ethernet-input node. - Lijian
 +
**** SVE/SVE2 functionality to be tested on the new development platform.
 +
**** Verify SVE/SVE2 code changes on simulator.
 +
**** Try to run standalone SVE codes on the new FPGA platform.
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Ampere Altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Find out the tuned configuration for cross connect test cases using AVF PMD driver.
 +
**** Figure out corresponding configurations in CSIT scripts.
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Plans
 +
 
 +
'''11/03/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
**** Repeat tests on local N1SDP and cascade server. - Jieqiang
 +
**** Repeat the test case with latest master branch. - Jieqiang
 +
**** The patch that introduced this perf drop needs to be analyzed. - Jieqiang, Lijian
 +
**** Look into the patch to get some ideas about the code changes. - Jieqiang, Lijian
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
**** The iterative jobs for VPP 2009 are still running.
 +
**** Still running for one more week.
 +
***** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-report-iterative-2009-3n-tsh/
 +
***** Daily performance jobs only run MRR tests, while iterative jobs run MRR tests and NDR/PDR tests, etc, which takes longer time.
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
**** Move the thx2 to the same rack for tg and install the same nic on tg.
 +
**** 1g NIC for management installed on thx2, but cannot be net-booted.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022 automate the generation of docker builder images.
 +
**** Test arm centos7 jenkins builder image. - Juraj.
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
 +
** VPP Device
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
 +
**** Figure out which host of two hosts to run the Jenkins job.
 +
**** Revert to old kernel version 4.15.0-55 to avoid AVF issue.
 +
**** AVF issue is common across the platform.
 +
***** Differences between avf driver versions may be the root cause of behavior changes.
 +
**** New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 45 minutes.
 +
***** Python runs slower on the new thx2 servers than on 1-node Skylake.
 +
***** Try a newer version of Python (such as 3.8) or split the device tests into two parts.
 +
***** Check how many CPUs get utilized for robot framework execution on thx2 server.
 +
***** Two thunderx2 are running fine right now and the VPP device jobs are almost done.
 +
*** N1SDP shipment to FD.io
 +
**** Get response from Maciek about the rack space and traffic generator availability.
 +
*** CSIT budget plan for 10g switch purchase in FD.io lab. - Juraj, Tina
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
 +
*** Summarize the meeting minutes and action items. - Lijian
 +
*** SOC id will be available on /proc entry from kernel version 5.9 - Lijian, Honnappa
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
*** Start benchmarking AVF PMD driver in VPP on N1SDP.
 +
*** Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
 +
*** avf_input and avf_output nodes don't consume as many CPU cycles as dpdk-related nodes do.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
*** Apply the SVE/SVE2 intrinsics on refactoring ethernet-input node. - Lijian
 +
**** SVE/SVE2 functionality to be tested on the new development platform.
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Find out the tuned configuration for cross connect test cases using AVF PMD driver.
 +
**** Figure out corresponding configurations in CSIT scripts
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind.
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
** Plans
 +
 +
'''10/27/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 
** Tina Tsou
** Andrew Pinski
+
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
**** Repeat tests on local N1SDP and cascade server. - Jieqiang
 +
**** Look into the patch to get some ideas about the code changes. - Jieqiang, Lijian
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
**** The iterative jobs for VPP 2009 are still running.
 +
**** Still running for one or two weeks.
 +
***** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-report-iterative-2009-3n-tsh/
 +
***** Daily performance jobs only run MRR tests, while iterative jobs run MRR tests and NDR/PDR tests, etc, which takes longer time.
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
**** Move the thx2 to the same rack for tg and install the same nic on tg.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
 +
** VPP Device
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
 +
**** Figure out which host of two hosts to run the Jenkins job.
 +
**** Revert to old kernel version 4.15.0-55 to avoid AVF issue.
 +
***** Differences between avf driver versions may be the root cause of behavior changes.
 +
**** New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 40 minutes.
 +
***** Python runs slower on the new thx2 servers than on 1-node Skylake.
 +
***** Try a newer version of Python (such as 3.8) or split the device tests into two parts.
 +
***** Check how many CPUs get utilized for robot framework execution on thx2 server.
 +
 
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
 +
*** Summarize the meeting minutes and action items. - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
*** Start benchmarking AVF PMD driver in VPP on N1SDP.
 +
*** Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
 +
*** avf_input and avf_output nodes don't consume as many CPU cycles as dpdk-related nodes do.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
*** Apply the SVE/SVE2 on ethernet-input node. - Lijian
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server. - Jieqiang
 +
** Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
**** Figure out corresponding configurations in CSIT scripts
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
** Plans
 +
 
 +
'''10/20/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 
** Juraj Linkes
** Nitin Saxena
+
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
**** The iterative jobs for VPP 2009 are still running.
 +
***** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-report-iterative-2009-3n-tsh/
 +
***** Daily performance jobs only run MRR tests, while iterative jobs run MRR tests and NDR/PDR tests, etc., which takes longer time.
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
 +
**** Errors happen when running latest VPP debug image, which was introduced by https://gerrit.fd.io/r/c/vpp/+/29490 - Lijian
 +
** VPP Device
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
 +
**** Figure out which host of two hosts to run the Jenkins job.
 +
**** Two failed test cases related to AVF plugin.
 +
***** The root cause is the newer kernel version - 4.15.0-118-generic fails, 4.15.0-72-generic works.
 +
***** Downgrade the kernel version to 4.15.0-72-generic and continue the VPP device testing.
 +
***** Try the same experiment on X86 to see if this issue is arm-specific or not. - Juraj
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
*** Start benchmarking AVF PMD driver in VPP on N1SDP.
 +
*** Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Finished the benchmarking and shared the data to team. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
**** Figure out corresponding configurations in CSIT scripts
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
** Plans
  
* Action Items - Last Week
+
'''10/13/2020'''
** Sirshak: To update LF RT #54919 to follow up on cavium-2 status. - Not Needed as cavium-2 is present.
+
* Attendees
** Sirshak: Release Machine to EdK as soon as ThunderX is up. - Done
+
** Govindarajan Mohandoss
** Sirshak: Status on new ThunderXs: Will be decided after talks with Maciek. - Yet to decide.
+
** Lijian Zhang
** Sirshak: vm unresponsive issue. Tried again still got 27 errors for ipv4 handed over to Juraj for further investigation.
+
** Jieqiang Wang
** Sirshak: To ask about CSIT performance topology connection status. Didn't get time; mostly discussing VIRL job.
+
** Juraj Linkes
** Sirshak: to add OS version to fd.io lab machines. -Done by somebody else.
+
** Tina Tsou
** Sirshak: to add Porting and Tuning section. Check with Honnappa
+
** Honnappa Nagarahalli
** Sirshak: to track arm master build failure. - Damjan has sent a fix.
+
* General
** Juraj: Access to fd.io lab. - Done.
+
* CSIT
** Khem: to create a Jira tkt to document automation task of CSIT. - Still Working on it.
+
** VPP Performance Test
** Khem: to reach out to Sanil (Huawei) regarding known Taishan problems with KVM. - No response from Sanil yet.
+
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
** Khem: BIOS patch for NUMA node numbering issue. - Khem to create LF RT tkt to do this in fd.io lab.
+
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Nitin: VPP-1064 Support multiple cache line sizes per architecture. - Still in discussion with Dave.  
+
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
** Adarsh: openssl updates. VPP crashing.  
+
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
 +
** VPP Device
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
 +
**** Figure out which host of two hosts to run the Jenkins job.
 +
**** Two failed test cases related to AVF plugin.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Finished the benchmarking and shared the data to team.
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
**** Figure out corresponding configurations in CSIT scripts
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
** Plans
  
 +
'''10/06/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
** VPP Device
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs and other Perseus CPUs
 +
*** Nitin requires a generic VPP image that optimally supports both 64B and 128B cache-line CPUs at the same time - this cannot be satisfied so far.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
** SVE/SVE2 proposal
 +
*** Will send an email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Finished the benchmarking and shared the data with the team.
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Figure out corresponding configurations in CSIT scripts
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
** Plans
  
* fd.io lab
+
'''09/29/2020'''
** mcbin powering on ? Sirshak to create LF tkt. Reach out to Brian offline.
+
* Attendees
** Cavium-3 role. Make decision based on feedback Edk. Sirshak to check availability.  
+
** Govindarajan Mohandoss
** Sirshak to ask Brian to forward old LF tkt to JohnB.
+
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
** VPP Device
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires a generic VPP image that optimally supports both 64B and 128B cache-line CPUs at the same time - this cannot be satisfied so far.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
** SVE/SVE2 proposal
 +
*** Will send an email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Finished the benchmarking and shared the data with the team.
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Figure out corresponding configurations in CSIT scripts
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
** Plans
  
 +
'''09/22/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
**
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
** VPP Path
 +
*** VexxHost will replace the faulty RAM with a new one, and get the expense reimbursed by LF.
 +
**** Issue was resolved by plugging the previous RAM back in, and the server is alive now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** Add CentOS-7 on Arm - Second step;
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
** VPP Device
 +
*** 3x SoftIron servers will be decommissioned directly to free rack space for 2x ThunderX2 servers.
 +
*** Two ThunderX2 servers have been received by VexxHost and are currently in the storage warehouse.
 +
*** VexxHost people will set up the servers and provide IP connectivity.
 
* VPP
 
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
** SVE/SVE2 proposal
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Figure out corresponding configurations in CSIT scripts
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
** Plans
  
** ARMv8 crypto patch from Sachin related to dpdk_plugin only.
+
'''09/15/2020'''
** memcpy issue: going with memcpy and not hand crafted memcpy.
+
* Attendees
** clang compilation: Sirshak to upstream to clang related changes add all other aarch64 leads.
+
** Govindarajan Mohandoss
** Brian to use cache stashing result. Updates: No effect on VPP, but there is an improvement on the musdk sample application.
+
** Juraj Linkes
** VPP-1267(Marvell dpdk patch mcbin): How to move forward based on Damjan's comments. Still discussing. Honnappa to provide some inputs next week.
+
** Jieqiang Wang
** VPP-1276 (rpm issues aarch64): Not a priority. Status: No updates.
+
** Tina Tsou
** VPP-1284: TLS corruption on aarch64: Status (after Sachin's suggestion): Resolved. Might have performance implications, but it is currently the only possible solution. HN to look at this Jira card in order to talk to the compiler team if need be.
+
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
*** On L2 learning 1Mx flows, 4T4C, with release-2005, there is about 20% performance drop.
 +
**** The patch that caused this issue has been identified.
 +
***** https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
** VPP Path
 +
*** LF will pay for the expense, and VexxHost has placed or will place the order for the new RAM module.
 +
*** Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to VexxHost.
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
*** Check with Juraj with the latest news about the faulty RAMs.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - first step;
 +
**** Add CentOS-7 on Arm will be second step.
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
*** We can decommission the 3x SoftIron servers directly, but for the existing ThunderX2 servers the decommissioning could be temporary. We will probably reinstall them in the near future.
 +
*** Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
 +
** Two ThunderX2 servers have been received by VexxHost and are currently in the storage warehouse.
 +
** VexxHost people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
** Budget plan for CSIT FD.io lab.
 +
*** We have enough servers for VPP path & device tests.
 +
*** We can ask the CSIT FD.io lab folks for saving rack space for arm servers.
 +
*** We may plan to send new advanced servers for perf tests in future but we won't mention the specific server type.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** Vendor CPU server enablement in VPP - Lijian
 +
*** Ready for internal review
 +
*** Will discuss with VPP maintainer
 +
** Investigate VPP Intel AVF driver - Lijian
 +
** SVE
 +
*** SVE intrinsics wrapper is done. Proposal patch is ready for review.
 +
*** ACLE, architecture, sve-sve2-programming-example
 +
*** SVE intrinsics is preferred.
 +
*** Share SVE knowledge with the DPDK team.
 +
** Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
 +
*** Will repeat scalability testing on N1SDP.
 +
** Benchmark AVF driver between Cascade Lake and N1SDP - Jieqiang
 +
*** Will investigate AVF drivers on Arm. - Lijian
 +
** Jieqiang will figure out performance data for 1x, 10Kx flows on Cascade Lake in CSIT.
 +
*** Confirm whether the system is the same for the local Dell server and the Cascade Lake server in CSIT. - Jieqiang
 +
*** Check if there are any test cases with 1t1c/2t2c/4t4c configured for 2n-clx testbed in CSIT - Jieqiang
 +
*** Performance data; Configurations;
 +
** VM2VM
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Started system tuning on PMD TX direction.
 +
*** Investigate mempool configuration.
 +
*** Change the descriptor size by modifying the DPDK source code.
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
  
 +
'''09/08/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 
* CSIT
 
** TG status in fd.io lab and internal Huawei Lab. - Sirshak to discuss with Maciek. Khem to create LF tkt.
+
** VPP Performance Test
** CSIT-1019 (timeout of PacketVerifier.RxQueue is not working): Done.(Upstreamed Merged ?). Status: Merged.
+
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
** CSIT-1023 (Crypto Func Tests): VPP still crashing - Adarsh
+
*** On L2 learning 1Mx flows, 4T4C, with release-2005, there is about 20% performance drop.
** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Sirshak tried pinning the VMs to phy CPUs but tests still failing. Juraj to take over.
+
**** The patch that caused this issue has been identified.
** CSIT-990 (buildroot package) Brian Status: build issue with grub.  
+
***** https://gerrit.fd.io/r/c/vpp/+/26549
** Juraj: Estimate on moving CSIT Functional tests to make test. Maciek proposal does consider all the implications of letting go VIRL especially parallelization VIRL offers.
+
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
** VPP Path
 +
*** LF will pay for the expense, and VexxHost has placed or will place the order for the new RAM module.
 +
*** Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to VexxHost.
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - first step;
 +
**** Add CentOS-7 on Arm will be second step.
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
*** We can decommission the 3x SoftIron servers directly, but for the existing ThunderX2 servers the decommissioning could be temporary. We will probably reinstall them in the near future.
 +
*** Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
 +
** Two ThunderX2 servers have been received by VexxHost and are currently in the storage warehouse.
 +
** VexxHost people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** SVE
 +
*** SVE intrinsics wrapper is done. Proposal patch is ready for review.
 +
*** ACLE, architecture, sve-sve2-programming-example
 +
*** SVE intrinsics is preferred.
 +
** Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
 +
*** Will repeat scalability testing on N1SDP.
 +
** Benchmark AVF driver between Cascade Lake and N1SDP - Jieqiang
 +
*** Will investigate AVF drivers on Arm. - Lijian
 +
** Jieqiang will figure out performance data for 1x, 10Kx flows on Cascade Lake in CSIT.
 +
*** Performance data; Configurations;
 +
** VM2VM
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Started system tuning on PMD TX direction.
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
  
* Action Items - Next Week
+
'''09/01/2020'''
** Sirshak: To create a LF tkt for mcbin
+
* Attendees
** Sirshak: Follow up on cavium-3.
+
** Govindarajan Mohandoss
** Sirshak: Upstream clang changes.
+
** Juraj Linkes
** Honnappa: Provide inputs on how to proceed with comments on Marvell dpdk patch.
+
** Jieqiang Wang
** Honnappa: VPP-1284: To look at this patch to provide comments on performance implications of the fix
+
** Tina Tsou
** Juraj estimate moving CSIT functional tests to make test.
+
* General
** Sirshak: Discuss with Maciek and get a signoff for moving the x86 Hosts to arm rack.
+
* CSIT
** Khem: Create LF tkt for Performance Suite Topology Creation.
+
** VPP Performance Test
** Adarsh: Create a Jira to document Automation Task
+
*** On L2 learning 1Mx flows, 4T4C, with release-2005, there is about 20% performance drop.
** Khem: Follow up Sanil : Known taishan vm issues.
+
**** The patch that caused this issue has been identified.
** Khem: LF tkt for Taishan BIOS updates.
+
***** https://gerrit.fd.io/r/c/vpp/+/26549
** Nitin: VPP-1064 multiple cache line size patch.:
+
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
** Adarsh: openssl updates.
+
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
** VPP Path
 +
*** Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to VexxHost.
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
**** It seems that plugging working RAM modules into empty slots will resolve the problem.
 +
**** Juraj will send an email to Maciek about the ownership of the FD.io lab servers, and who should pay for the charge.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
**** IPMI IP is configured via SSH Linux prompt. It's working fine now.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
**** Pending with Vexx host to proceed further.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
*** Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
 +
** Two ThunderX2 servers have been received by VexxHost and are currently in the storage warehouse.
 +
** VexxHost people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
* VPP
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 has compiling errors with latest VPP source code.
 +
**** This issue is fixed by Jieqiang and available for internal review.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** gcc-10 compiling issue is resolved and merged.
 +
** SVE
 +
*** ACLE, architecture, sve-sve2-programming-example
 +
*** SVE intrinsics is preferred.
 +
** Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
 +
** VM2VM
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Started system tuning on PMD TX direction.
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
  
'''5/22/2018'''
+
'''08/25/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** On L2 learning 1Mx flows, 4T4C, with release-2005, there is about 20% performance drop.
 +
**** Jieqiang is trying to narrow down the patch that causes the issue.
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
**** It seems that plugging working RAM modules into empty slots will resolve the problem.
 +
**** Juraj will send an email to Maciek about the ownership of the FD.io lab servers, and who should pay for the charge.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
**** IPMI IP is configured via SSH Linux prompt. It's working fine now.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
**** Pending with Vexx host to proceed further.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
*** Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
 +
** Two ThunderX2 servers have been received by VexxHost and are currently in the storage warehouse.
 +
** VexxHost people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
* VPP
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 has compiling errors with latest VPP source code.
 +
**** This issue is fixed by Jieqiang and available for internal review.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** SVE
 +
*** ACLE, architecture, sve-sve2-programming-example
 +
*** SVE intrinsics is preferred.
 +
** Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
 +
** VM2VM
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
 
 +
'''08/18/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Jieqiang is investigating some performance drop (between 2005 and 2008 releases) cases on Taishan servers.
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** Jieqiang will share the investigation report, but so far there are no apparent performance differences.
 +
**** Some 4T4C test cases on Taishan have obvious performance drop. Will compare the trending with x86 machines.
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
**** Pending with Vexx host to proceed further.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
** Two ThunderX2 servers have been received by VexxHost and are currently in the storage warehouse.
 +
** VexxHost people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation
 +
*** Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 has compiling errors with latest VPP source code.
 +
**** This issue is fixed by Jieqiang and available for internal review.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
 
 +
'''08/11/2020'''
 
* Attendees
 
** Sirshak Das
 
** Stanislav Chlebec
 
** John Bromhead
 
** Sachin Saxena
 
** Khemendra Kumar
 
** Andy Wang
 
 
** Honnappa Nagarahalli
 
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 
** Tina Tsou
 
** Andrew Pinski
+
** Lijian Zhang
** John Bromhead
+
** Filip Varga
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Jieqiang is investigating some performance drop cases on Taishan servers.
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** Jieqiang will share the investigation report, but so far there are no apparent performance differences.
 +
**** Some 4T4C test cases on Taishan have obvious performance drop. Will compare the trending with x86 machines.
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Two ThunderX2 servers have been collected and shipped, and are expected to arrive on July 30th.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation
 +
*** Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 has compiling errors with latest VPP source code.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
'''08/04/2020'''
 +
* Attendees
 +
** Honnappa Nagarahalli
 +
** Govindarajan Mohandoss
 
** Juraj Linkes
 
** rkinsell
+
** Jieqiang Wang
** Nitin Saxena
+
** Tina Tsou
 +
** Lijian Zhang
 +
** Filip Varga
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** Jieqiang will share the investigation report, but so far there are no apparent performance differences.
 +
**** Some 4T4C test cases on Taishan have obvious performance drop. Will compare the trending with x86 machines.
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
 +
*** The second ThunderX1 has an IPMI problem, but SSH is working fine.
 +
*** Juraj will follow up on the existing vexxhost ticket or create a new one to replace the faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Two ThunderX2 servers have been collected and shipped, and are expected to arrive on July 30th.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation
 +
*** Initial benchmarking with VPP hoststack was done: 29 Gb/s on N1SDP and 26 Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 fails to compile the latest VPP source code.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** The patch is upstreamed; CSIT testing is being used to verify it.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Enabling IPsec on Arm platforms - Govind
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
 +
** Plans
  
* Action Items - Last Week
 
** Sirshak: To update LF RT #54919 to follow up on cavium-2 status. - Having trouble with login; will sort it out today.
 
** Sirshak: Release machine to EdK as soon as ThunderX is up: cavium-1 is done; cavium-2 still has issues with network connectivity.
 
** Sirshak: Status on new ThunderXs: Will be decided after talks with Maciek.
 
** Sirshak: VM unresponsive issue: No updates; didn't get time to try, will try this week.
 
** Sirshak: To ask about CSIT performance topology connection status. - TBD after call with Maciek.
 
** Nitin: VPP-1064 (Patch rejected by Dave Barach) Discuss cross compilation with Sachin. (Separate or one unified Makefile). - No Updates.
 
** HN: memcpy benchmarking updates honnappa - 2 more tests to be done based on Ola's suggestion.
 
** Adarsh openssl issues: Will communicate with Sachin to get this resolved. Made changes based on Sachin's suggestions; issues still to be resolved.
 
** Adarsh preparing a sheet updated with his progress on CSIT. - Added to the google sheets.
 
  
 +
'''07/28/2020'''
 +
* Attendees
 +
** Honnappa Nagarahalli
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Data collection with the performance testing setup is finished, and the daily MRR job has resumed.
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** Jieqiang will share the investigation report; so far there are no apparent performance differences.
 +
*** VPP performance testing is running once a week.
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
 +
*** The second ThunderX1 has an IPMI problem, but SSH is working fine.
 +
*** Juraj will follow up on the existing vexxhost ticket or create a new one to replace the faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Two ThunderX2 servers have been collected and shipped, and are expected to arrive on July 30th.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation
 +
*** Initial benchmarking with VPP hoststack was done: 29 Gb/s on N1SDP and 26 Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** The patch is upstreamed; CSIT testing is being used to verify it.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Enabling IPsec on Arm platforms - Govind
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
 +
** Plans
  
* fd.io lab
+
'''07/21/2020'''
** cavium-2 follow up via LF #54919.
+
* Attendees
** Talk to Maciek regarding TG physical placement on rack.
+
** Honnappa Nagarahalli
** Juraj : Needs access to fd.io lab. Tina to help Juraj with this.
+
** Govindarajan Mohandoss
** Juraj to send email to EdW to get access to fd.io lab.
+
** Juraj Linkes
** Sirshak to add OS version to fd.io lab machines.
+
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** VPP performance testing is running once a week.
 +
** VPP Path
 +
*** Juraj will follow up on the existing vexxhost ticket or create a new one to replace the faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in the Nomad cluster. One debugging server for VPP dev and three servers (2 TX and 1 TX2) are unreachable through IPMI; one of them is reachable through SSH. vexxhost is still investigating the IPMI unreachability. CI functionality is restored with the spare TX servers. The TX2 server is unreachable through IPMI and VPP device jobs are not running. The faulty RAM on the TX server is not fixed and is yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been verified by Jieqiang and will be sent to Dave Wallace for use in the VPP Jenkins job.
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Arm has
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation
 +
*** Initial benchmarking with VPP hoststack was done: 29 Gb/s on N1SDP and 26 Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
** N1SDP enablement. - Lijian
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** The patch is upstreamed; CSIT testing is being used to verify it.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Enabling IPsec on Arm platforms - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
 +
** Plans
  
 +
 +
'''07/14/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community has started collecting performance data with these CSIT machines.
 +
** VPP Path
 +
*** Juraj will follow up on the existing vexxhost ticket or create a new one to replace the faulty RAM.
 +
**** Will probably use 3x spare ThunderX1 servers as CI build server/Nomad cluster.
 +
**** Two of the three ThunderX1 servers cannot be accessed.
 +
**** Spare ThunderX servers are used for CI and included in the Nomad cluster. One debugging server for VPP dev and three servers (2 TX and 1 TX2) are unreachable through IPMI; one of them is reachable through SSH. vexxhost is still investigating the IPMI unreachability. CI functionality is restored with the spare TX servers. The TX2 server is unreachable through IPMI and VPP device jobs are not running. The faulty RAM on the TX server is not fixed and is yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been verified by Jieqiang and will be sent to Dave Wallace for use in the VPP Jenkins job.
 +
**** Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers on the market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 
* VPP
 
** HN->Nitin: Stick with memcpy. Nitin's concern: the SIMD unit sits idle with the new GCC. Feedback from the Arm compiler team is that vector instructions don't perform as expected on many platforms. The dpdk_input node is 1 ns better when using SIMD memcpy on ThunderX. Nitin to try using restricted on non-SIMD memcpy.
+
** VPP hoststack TCP/CPS (Connections per Second) investigation
** 1019: CSIT. Py-lint issues. Patch submitted. Khem to merge with Lucian's Patch.
+
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
** 1023: Khem, Adarsh to talk to Sachin to resolve openssl issue. - Sachin suggested some config changes resulted in VPP being unstable. Still working it out.
+
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
** 1043: No updates. Sirshak to investigate this and Khem to reach out to Sanil regarding known Taishan problems with KVM.
+
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
** 990: Brian Updates - Sirshak to get status offline.  
+
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
** 1267: l3fwd performance tuning: Status on Marvell patch: No updates. Nitin to submit his modified patch with -2.
+
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
** VPP-1276: Sachin is facing issues with building the rpm. - Any change in status? No updates. Low priority for Sachin. Needs help.
+
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
** VPP-1284: TLS corruption: Dynamic linking related to Thread local storage. Logs recorded with this tkt.
+
*** Will benchmark between gcc-9 and clang-10 to decide which should be the default compiler, will sync up with Suresh.
** Sirshak to add Porting and Tuning section.
+
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
** Sirshak to track arm master build failure.
+
** N1SDP enablement. - Lijian
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
*** Investigating various numbers of rx_q_bufs & tx_q_bufs
 +
*** Investigating various vector sizes and their effect on throughput
 +
*** Benchmark and compare PMU counters between 4x and 2x loop unrolling on n1sdp
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Investigating using SPE counters to profile the ACL plugin bottleneck
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Enabling IPsec on Arm platforms - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''07/07/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 
* CSIT
 
** Adarsh openssl issues:  
+
** VPP Performance Test
** Performance Testing Khem : NUMA node numbering issue. Last Update: Still working internally. Status: Internal patch for BIOS.
+
*** VPP performance testing is running once a week.
** Khem: to create a Jira tkt to document automation task of CSIT.
+
*** Community has started collecting performance data with these CSIT machines.
** Khem : trex installation- Having x86 TG internally. Any luck ?
+
** VPP Path
** Brian to use cache stashing result. Updates:  
+
*** Juraj will follow up on the existing vexxhost ticket or create a new one to replace the faulty RAM.
 +
**** Will probably use 3x spare ThunderX1 servers as CI build server/Nomad cluster.
 +
**** Two of the three ThunderX1 servers cannot be accessed.
 +
**** Spare ThunderX servers are used for CI and included in the Nomad cluster. One debugging server for VPP dev and three servers (2 TX and 1 TX2) are unreachable through IPMI; one of them is reachable through SSH. vexxhost is still investigating the IPMI unreachability. CI functionality is restored with the spare TX servers. The TX2 server is unreachable through IPMI and VPP device jobs are not running. The faulty RAM on the TX server is not fixed and is yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been verified by Jieqiang and will be sent to Dave Wallace for use in the VPP Jenkins job.
 +
**** Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers on the market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark between gcc-9 and clang-10 to decide which should be the default compiler, will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
*** Investigating various numbers of rx_q_bufs & tx_q_bufs
 +
*** Investigating various vector sizes and their effect on throughput
 +
*** Benchmark and compare PMU counters between 4x and 2x loop unrolling on n1sdp
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Investigating using SPE counters to profile the ACL plugin bottleneck
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Enabling IPsec on Arm platforms - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
  
* Action Items - Next Week
 
** Sirshak: To update LF RT #54919 to follow up on cavium-2 status.
 
** Sirshak: Release Machine to EdK as soon as ThunderX is up.
 
** Sirshak: Status on new ThunderXs: Will be decided after talks with Maciek.
 
** Sirshak: VM unresponsive issue: No updates; didn't get time to try, will try this week.
 
** Sirshak: To ask about CSIT performance topology connection status.
 
** Sirshak: to add OS version to fd.io lab machines.
 
** Sirshak: to add Porting and Tuning section.
 
** Sirshak: to track arm master build failure.
 
** Juraj: Access to fd.io lab.
 
** Nitin: VPP-1064 Support multiple cache line sizes per architecture.
 
** HN: memcpy benchmarking updates honnappa - 2 more tests to be done based on Ola's suggestion.
 
** Adarsh openssl updates
 
** Khem: to create a Jira tkt to document automation task of CSIT.
 
  
 +
'''06/30/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community has started collecting performance data with these CSIT machines.
 +
** VPP Path
 +
*** Juraj will follow up on the existing vexxhost ticket or create a new one to replace the faulty RAM.
 +
**** Will probably use 3x spare ThunderX1 servers as CI build server/Nomad cluster.
 +
**** Two of the three ThunderX1 servers cannot be accessed.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been verified by Jieqiang and will be sent to Dave Wallace for use in the VPP Jenkins job.
 +
**** Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers on the market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark between gcc-9 and clang-10 to decide which should be the default compiler, will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
*** Investigating various numbers of rx_q_bufs & tx_q_bufs
 +
*** Investigating various vector sizes and their effect on throughput
 +
*** Benchmark and compare PMU counters between 4x and 2x loop unrolling on n1sdp
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Investigating using SPE counters to profile the ACL plugin bottleneck
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Enabling IPsec on Arm platforms - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
  
'''5/15/2018'''
+
'''06/23/2020'''
 
* Attendees
 
** Sirshak Das
+
** Govindarajan Mohandoss
** Stanislav Chlebec
+
** Juraj Linkes
** Sachin Saxena
+
** Jieqiang Wang
** Khemendra Kumar
+
** Tina Tsou
** Andy Wang
+
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community has started collecting performance data with these CSIT machines.
 +
** VPP Path
 +
*** Juraj will follow up on the existing vexxhost ticket or create a new one to replace the faulty RAM.
 +
**** Two of the three ThunderX1 servers cannot be accessed.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been verified by Jieqiang and will be sent to Dave Wallace for use in the VPP Jenkins job.
 +
**** Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install nomad service in those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** L3FWD status
 +
** CSIT status
 +
** EPIC plan
 +
*** SVE2 investigation in VPP;
 +
*** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark between gcc-9 and clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** Profiling with NMU-600 counters.
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
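The 16K/64K page-size item above depends on knowing which base page size the running kernel actually uses before 'make test' results are compared. A minimal sketch of such a check follows; the helper names (page_size_kib, describe_page_size) are hypothetical and are not part of VPP or CSIT.

```python
import os

# Base page sizes of interest, in KiB. 16 and 64 are the sizes discussed in
# the minutes; 4 is included as the common default (an assumption here).
LARGE_PAGE_SIZES_KIB = (16, 64)


def page_size_kib() -> int:
    """Return the kernel's base page size in KiB, as reported by sysconf."""
    return os.sysconf("SC_PAGE_SIZE") // 1024


def describe_page_size(size_kib: int) -> str:
    """Label a page size so test logs can record the configuration."""
    if size_kib in LARGE_PAGE_SIZES_KIB:
        return f"{size_kib}K pages (large base page configuration)"
    return f"{size_kib}K pages"


if __name__ == "__main__":
    # Record the page size next to the test results being compared.
    print(describe_page_size(page_size_kib()))
```

Recording this value alongside each test run makes it easier to tell whether a failure is specific to the 16K/64K configuration.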
 +
 
 +
'''06/16/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community has started collecting performance data with these CSIT machines.
 +
** VPP Path
 +
*** Juraj will follow up on the existing vexxhost ticket, or create a new one, to replace faulty RAM.
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** Patch is merged.
 +
**** libssl.so is missing in dependencies in vpp Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the Dockerfile upload. The Dockerfile needs to be tested in a local VPP sandbox before uploading, and needs to be labelled by Dave Wallace before it can be used for the VPP Jenkins job.
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install nomad service in those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark between gcc-9 and clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** Profiling with NMU-600 counters.
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''06/09/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community will collect performance data with these CSIT machines.
 +
*** IPSec tunnel configuration issue.
 +
**** Issue is resolved.
 +
**** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
***** Juraj to run the IPSec regression on Taishan server with the IPSec patch.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** Patch is merged.
 +
**** libssl.so is missing in dependencies in vpp Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the Dockerfile upload. The Dockerfile needs to be tested in a local VPP sandbox before uploading, and needs to be labelled by Dave Wallace before it can be used for the VPP Jenkins job.
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install nomad service in those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark between gcc-9 and clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** Profiling with NMU-600 counters.
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''06/02/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** IPSec tunnel configuration issue.
 +
**** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
***** Juraj to run the IPSec regression on Taishan server with the IPSec patch.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** libssl.so is missing in dependencies in vpp Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the Dockerfile upload. The Dockerfile needs to be tested in a local VPP sandbox before uploading, and needs to be labelled by Dave Wallace before it can be used for the VPP Jenkins job.
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install nomad service in those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark between gcc-9 and clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''05/26/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** IPSec tunnel configuration issue.
 +
**** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
***** Juraj to run the IPSec regression on Taishan server with the IPSec patch.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** libssl.so is missing in dependencies in vpp Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the Dockerfile upload. The Dockerfile needs to be tested in a local VPP sandbox before uploading, and needs to be labelled by Dave Wallace before it can be used for the VPP Jenkins job.
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers. Jieqiang will setup a meeting with Juraj regarding this documentation.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install nomad service in those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark between gcc-9 and clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
** N1SDP enablement. - Lijian
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
 
 +
'''05/19/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 
** Honnappa Nagarahalli
 
** Honnappa Nagarahalli
 +
** Juraj Linkes
 
** Tina Tsou
 
** Tina Tsou
** Andrew Pinski
+
** Jieqiang Wang
** John Bromhead
+
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** One failure is related to the VPP image on Arm: IPSec tunnel configuration issue.
 +
**** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** libssl.so is missing in dependencies in vpp Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
** Ed Kern - Install nomad service in those two servers - Juraj & Jieqiang
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark between gcc-9 and clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
** N1SDP enablement. - Lijian
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''04/28/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 
** Juraj Linkes
 
** Juraj Linkes
** rkinsell
+
** Tina Tsou
** Nitin Saxena
+
** Jieqiang Wang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Two failures in performance testing
 +
**** One failure is related to the CSIT script: NAT44 is a common issue that also fails on x86.
 +
***** Has been fixed already.
 +
**** The other failure is related to the VPP image on Arm: IPSec tunnel configuration issue.
 +
***** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
** Ed Kern - Install nomad service in those two servers - Juraj & Jieqiang
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** Resolve VPP compiling issue with clang-6.
 +
*** Patch (https://gerrit.fd.io/r/c/vpp/+/26949) is merged.
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
** N1SDP enablement. - Lijian
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is merged.
 +
**** https://gerrit.fd.io/r/c/vpp/+/26804
 +
*** The IOMMU limitation issue is gone after upgrading the kernel and firmware.
 +
**** Share kernel/fw upgrade version to Govind
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput drops as the number of flows increases on N1SDP.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
  
* Action Items - Last Week
+
'''04/28/2020'''
** Nitin: Run a VPP performance test to understand if the memcpy neon version provides any benefits. - Able to run with l3fwd test case. Gives better numbers.
+
* Attendees
** Sirshak: Create a higher LF ticket so that it is easier for Trishan/Acton/Venessa/Mohammed to follow up on bringing up ThunderX/mcbin - Not created yet, as I think we are close to solving the issue. If it's not solved after today's call, will create the ticket.
+
** Govindarajan Mohandoss
** Nitin: start email discussion with Dave to address the creation of a single makefile for all ARMv8 devices. Still understanding how cross compilation works. Communicating with Sachin.
+
** Honnappa Nagarahalli
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
** Arthur Marshall
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Two failures in performance testing
 +
**** One failure is related to the CSIT script: the NAT44 issue is common and also fails on x86.
 +
**** The other failure is related to the VPP image on Arm: an IPSec tunnel configuration issue.
 +
*** iommu_passthrough=1 does not make any difference on the Taishan server - Lijian
 +
*** We cannot do kernel upgrade with Ubuntu-18.04.1/Ubuntu-18.04.2/Ubuntu-18.04.3/Ubuntu-18.04.4 on Taishan.
 +
**** For now, can the kernel of the Taishan server be left as it is now (linux-4.15.0.54)? - Juraj
 +
**** One possible option/improvement is to port FD.io CSIT performance testing to some more advanced Arm servers, e.g., Ampere
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will send email to community about two options to resolve gcc-7 issue with CentOS-7
 +
***** 1. update gcc-7 requirement to gcc-8 in Makefile
 +
***** 2. remove gcc-7 limitation in Makefile, and have users install gcc-8 manually
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** https://gerrit.oss.arm.com/#/c/160812/
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** To confirm with Damjan if he has plan to rewrite l2-nodes or not - Lijian
 +
*** To confirm CRC32 calculation is compiled with O2 or O3 options - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** The iova_mode == VA failure has not been root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP requires a limited memory space (less than 40 bits?).
 +
**** This issue will not be seen in the latest N1 Firmware. Upgrading to latest Firmware is pending.  
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
 +
*** The degradation is seen even when L3 cache is enabled.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
**** Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss with Juraj to run the jenkins job on CentOS.
  
* New Joinees
+
'''04/21/2020'''
** Stanislav Chlebec - pantheon
+
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** iommu_passthrough=1 does not make any difference on the Taishan server - Lijian
 +
*** We cannot do kernel upgrade with Ubuntu-18.04.1/Ubuntu-18.04.2/Ubuntu-18.04.3/Ubuntu-18.04.4 on Taishan.
 +
**** For now, can the kernel of the Taishan server be left as it is? Please confirm with Peter. - Juraj
 +
**** One possible option/improvement is to port FD.io CSIT performance testing to some more advanced Arm servers, e.g., Ampere
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** CentOS-8 is working fine. Will try CentOS-7 later.
 +
**** Is there any gcc version requirement in VPP official release?
 +
**** AES instructions in VPP source code require a gcc version newer than gcc-8.
 +
**** 'make install-deps' failure with CentOS-7 on Arm.
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** To confirm with Damjan if he has plan to rewrite l2-nodes or not - Lijian
 +
*** To confirm CRC32 calculation is compiled with O2 or O3 options - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
** gcc-10 is not working so far.
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** The iova_mode == VA failure has not been root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP requires a limited memory space (less than 40 bits?).
 +
**** This issue will not be seen in the latest N1 Firmware. Upgrading to latest Firmware is pending.
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
 +
*** The degradation is seen even when L3 cache is enabled.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
**** Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss with Juraj to run the jenkins job on CentOS.
  
* fd.io lab
+
'''04/14/2020'''
** Follow up on getting the ThunderX mgmt IP - IP addresses are assigned, but are not up yet. - Have a call today to discuss this with Mohammed
+
* Attendees
** USB to Ethernet Question: Andrew: shows up as Ethernet interface.
+
** Govindarajan Mohandoss
** Release Machine to EdK as soon as ThunderX is up. - Sirshak to set mgmt IP and handover the machine.
+
** Honnappa Nagarahalli
** Cavium has shipped more machines as well - Delivered a week back. Tina to follow up with Trishan: 2 Delivered. Sirshak to ask in today's meeting for status on new ThunderX.
+
** Juraj Linkes
** See the Taishan setup for any VM issue. - Sirshak is trying to reproduce the issue. - Reproduced still debugging.
+
** Tina Tsou
** Khemendra : Topology is correct. Sirshak to ask about CSIT performance topology connection status.
+
** Jieqiang Wang
** Khemendra: Intel NIC to be used or Mellanox. HN: Initially use Intel, later move to Mellanox.
+
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** After upgrading the kernel to version 72, the server cannot boot normally, so it has been reverted to the previous version.
 +
**** Ubuntu-18.04 lts version is supposed to be kernel 4.15.72?
 +
**** Will try fresh install with local Taishan servers.
 +
***** Will try with Ubuntu-18.04.1/Ubuntu-18.04.2/Ubuntu-18.04.3/Ubuntu-18.04.4
 +
***** Will do fresh installation with Ubuntu-18.04.2 and then install kernel 4.15.72
 +
** VPP Path
 +
*** Try iommu_passthrough=1 on Taishan servers and see if it makes any difference - Lijian
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** CentOS-8 is working fine. Will try CentOS-7 later.
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 
* VPP
 
** VPP-1064 Dave Barach rejected the patch based on the solution Damjan and Nitin had decided upon, because the current approach breaks cross compilation. - NXP has upstreamed the DPAA2 patch, which uses a separate segment makefile (dpaa.mk) for DPAA2. NXP does cross compilation most of the time. The approach could be that all platforms create a segment makefile and combine all of them into a single ARMv8 segment makefile. - Nitin still discussing with Sachin regarding cross compilation
+
** Investigate bihash operations, which are hot-spots in L2 throughput
** One solution suggested was creating a platform specific Makefile for ThunderX - Any Decisions - Same as above.
+
*** To confirm with Damjan if he has plan to rewrite l2-nodes or not - Lijian
** memcpy benchmarking updates honnappa - 2 more tests to be done based on Ola's suggestion. Nitin tested with restrict.
+
*** To confirm CRC32 calculation is compiled with O2 or O3 options - Lijian
** 1019: No update. Few rough edges to clean up.
+
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
** 1021: Is it Closed ? Closed.
+
** N1SDP enablement. - Lijian
** 1023: migrated to openssl using DPDK manual but facing failed TCs - openSSL is integrated in his local environment - VPP not stable in his environment - Updated in the ticket. Status: Aadarsh still trying to get help from community. Khem, Aadarsh to talk to Sachin regarding openssl issues.
+
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
** 1043: No updates. Sirshak to investigate this.
+
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
** 990: Brian Updates:
+
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
** 1267: l3fwd performance tuning: Marvell to upstream a patch to enable dpdk on mcbin by making changes to dpdk plugin in vpp. Updates: Natalie sent an email. Working on upstreaming changes to VPP for dpdk_plugin. Working on comparing musdk vs dpdk.
+
** The iova_mode == VA failure has not been root-caused
** Auto-detection of memory channels: Startup conf solution decided. Updates: No updates; not a priority now. Bug raised by Nitin.
+
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
** Sachin facing issues with build rpm currently on 1801; will open a Jira ticket if issues persist with 1804. Updates: Jira VPP-1276 to track this issue.
+
*** However, the IOMMU on N1SDP requires a limited memory space (less than 40 bits?).
 +
**** This issue will not be seen in the latest N1 Firmware. Upgrading to latest Firmware is pending.  
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
 +
*** The degradation is seen even when L3 cache is enabled.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
**** Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss with Juraj to run the jenkins job on CentOS.
 +
 
 +
'''04/07/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
* General
 
* CSIT
 
** Adarsh openssl issues: Will communicate with Sachin to get this resolved
+
** VPP Performance Test
** Adarsh preparing a sheet updated with his progress on CSIT.
+
*** After upgrading the kernel to version 72, the server cannot boot normally, so it has been reverted to the previous version.
** Performance Testing Khem : NUMA node numbering issue Updates: No updates. Still working internally.
+
**** Ubuntu-18.04 lts version is supposed to be kernel 4.15.72?
** Khem facing issues with trex installation on ARM hence he will try getting an x86 machine as TG. Updates: Still working on getting an x86 in internal lab.
+
**** Will try cobbler with local Taishan servers, to try fresh install.
** brian to use cache stashing result. Updates:
+
***** Jieqiang will try fresh installation of kernel 4.15.72 in local Taishan through cobbler.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
**** Jieqiang updated docker file locally to add centOS as part of CI and facing some issues.
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
**** Need 2 ThunderX2 servers to run the jobs for every VPP/CSIT patch submission instead of every half hour with a new VPP build. The current ThunderX2 server doesn't respond when the jobs are requested to run for every patch submission. No voting rights (+1 from CI) for VPP device suite.
 +
* FD.io lab
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
***** These patches are kept in backlog for now.
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** To confirm with Damjan if he has plan to rewrite l2-nodes or not - Lijian
 +
*** To confirm CRC32 calculation is compiled with O2 or O3 options - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** The iova_mode == VA failure has not been root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP requires a limited memory space (less than 40 bits?).
 +
**** This issue will not be seen in the latest N1 Firmware. Upgrading to latest Firmware is pending.  
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
 +
*** The degradation is seen even when L3 cache is enabled.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
**** Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss with Juraj to run the jenkins job on CentOS.
  
* Action Items - Next Week
+
'''03/31/2020'''
** Sirshak: To update LF RT #54919 to follow up on cavium-2 status. - having troubles with login will sort it out today.
+
* Attendees
** Sirshak: Release Machine to EdK as soon as ThunderX is up: cavium-1 done cavium-2 still has issues with network connectivity.
+
** Govindarajan Mohandoss
** Sirshak: Status on new ThunderXs: Will be decided after talks with Maciek.
+
** Honnappa Nagarahalli
** Sirshak: vm unresponsive issue: No updates, didn't get time to try; will try this week.
+
** Lijian Zhang
** Sirshak: To ask about CSIT performance topology connection status. - TBD after call with Maciek.
+
** Juraj Linkes
** Nitin: VPP-1064 (Patch rejected by Dave Barach) Discuss cross compilation with Sachin. (Separate or one unified Makefile).
+
** Tina Tsou
** HN: memcpy benchmarking updates honnappa - 2 more tests to be done based on Ola's suggestion.
+
** Jieqiang Wang
** Adarsh openssl issues: Will communicate with Sachin to get this resolved
+
** Michaela Tahiri
** Adarsh preparing a sheet updated with his progress on CSIT.
+
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** After upgrading the kernel to version 72, the server cannot boot normally, so it has been reverted to the previous version.
 +
**** Ubuntu-18.04 lts version is supposed to be kernel 4.15.72?
 +
**** Will try cobbler with local Taishan servers, to try fresh install.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** https://docs.fd.io/csit/master/trending/introduction/failures.html#n-tsh
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-tsh/161/archives/log.html.gz
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** To confirm with Damjan if he has plan to rewrite l2-nodes or not - Lijian
 +
*** To confirm CRC32 calculation is compiled with O2 or O3 options - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** The iova_mode == VA failure has not been root-caused yet
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP may require a limited address space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will retest with L3 cache enabled to see whether the performance drop seen as the number of flows increases is fixed. - Govind
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
  
   
+
'''03/24/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** After upgrading the kernel to version 72, the system could not boot normally, so it has been reverted to the previous version.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** With a software bug fixed, VPP can boot normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
***** make build/build-release TARGET_PLATFORM=n1sdp // for n1sdp cross compiling
 +
***** make build/build-release  // for generic vpp image
 +
***** make build/build-release TARGET_PLATFORM=native  // for native vpp image
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two more ThunderX2 servers have just been ordered and are expected to arrive in the Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** The iova_mode == VA failure has not been root-caused yet
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP may require a limited address space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will retest with L3 cache enabled to see whether the performance drop seen as the number of flows increases is fixed. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
  
'''5/8/2018'''
+
'''03/17/2020'''
 
* Attendees
 
* Attendees
 +
** Govindarajan Mohandoss
 
** Honnappa Nagarahalli
 
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 
** Tina Tsou
 
** Tina Tsou
** Andrew Pinski
+
** Jieqiang Wang
** Natalie Samsonov
+
** Michaela Tahiri
** John Bromhead
+
* General
** Sachin Saxena
+
* CSIT
** Khemendra Kumar
+
** VPP Performance Test
** Andy Wang
+
*** After upgrading the kernel to version 72, the system could not boot normally, so it has been reverted to the previous version.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** With a software bug fixed, VPP can boot normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two more ThunderX2 servers have just been ordered and are expected to arrive in the Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - Usage of MAP is recorded in confluence
 +
*** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Patch is upstreamed for community review
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Confirm if community agrees with patch - Lijian
 +
*** Check how DPDK is detecting numa-id for a specific NIC device - Lijian
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
**** Sending Govind the steps on installing GCC-9.2.0
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
**** Will sync up with James Yang about cache line fill buffers
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% improvement in macro-benchmarking by adding prefetches to the adj table on ThunderX2
 +
** The iova_mode == VA failure has not been root-caused yet
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP may require a limited address space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''03/10/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 
** Juraj Linkes
 
** Juraj Linkes
** rkinsell
+
** Tina Tsou
** Nitin Saxena
+
** Jieqiang Wang
** Ed Kern
+
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** After upgrading the kernel to version 72, the system could not boot normally, so it has been reverted to the previous version.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** With a software bug fixed, VPP can boot normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two more ThunderX2 servers have just been ordered and are expected to arrive in the Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - Usage of MAP is recorded in confluence
 +
*** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Check if detecting the source of SIGPROF is possible - Govind
 +
*** Confirm with Community about the possible solutions to this issue - Lijian
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Confirm if community agrees with patch - Lijian
 +
*** Check how DPDK is detecting numa-id for a specific NIC device - Lijian
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
*** Patch is upstreamed for code review - https://gerrit.fd.io/r/c/vpp/+/25259
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
**** Sending Govind the steps on installing GCC-9.2.0
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
**** Will sync up with James Yang about cache line fill buffers
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% improvement in macro-benchmarking by adding prefetches to the adj table on ThunderX2
 +
** The iova_mode == VA failure has not been root-caused yet
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP may require a limited address space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
  
* Action Items - Last Week
+
'''03/03/2020'''
** Sirshak: Follow up with Mohammed regarding ThunderX mgmt connectivity and mcbin - IP addresses allocated cavium-2 has IPMI connectivity but console still hanging. cavium-1,3 - Not able to connect to IPMI. - Create a higher LF ticket so that it is easier for Trishan/Acton/Venessa/Mohammed to follow up.
+
* Attendees
** Sirshak to hold a call with Khem and Adarsh to understand the vm_vhost issue caused by nested VMs - Contact established; still working on analyzing the setup.
+
** Govindarajan Mohandoss
** Sirshak to create consolidated ARM ecosystem xls to reflect CSIT effort. (Need to add the link to the excel sheet to AArch64 page) - Not Done will do it next week.
+
** Honnappa Nagarahalli
** Honnappa: memcpy benchmarking - Micro benchmarks run on mcbin, qualcomm - vector Load/Store usually go to the LSU unit
+
** Lijian Zhang
** Brian : CSIT-990(buildroot) - Nitin ran on mcbin, it is failing at a different place - Brian to continue next week
+
** Juraj Linkes
** Brian to publish a pictorial representation of rx queues and tx queues in multicore case for mcbin. - Moved to next week
+
** Tina Tsou
** Khem to analyze make test failure in Taishan - 1804 - Tested with the latest code (make test), all test cases passing
+
** Jieqiang Wang
** ARM - For TG for deciding connectivity - MCBin and Taishan - Sirshak/Brian working on it.
+
** Michaela Tahiri
** Sirshak/Brian to recheck validity of ASLR issue. - Not Done. Next Week.
+
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** With a software bug fixed, VPP can boot normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** The current ThunderX2 in Arm lab are pre-production servers.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - Usage of MAP is recorded in confluence
 +
*** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
*** Patch is upstreamed for code review - https://gerrit.fd.io/r/c/vpp/+/25259
 +
** Investigating memory copy in ip4-rewrite on ThunderX2 - Govind
 +
*** Check the assembly code with other Arm CPU also.
 +
*** Send Govind the memory copy with fixed length.
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
**** Sending Govind the steps on installing GCC-9.2.0
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
**** Will sync up with James Yang about cache line fill buffers
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% improvement in macro-benchmarking by adding prefetches to the adj table on ThunderX2
 +
** The iova_mode == VA failure has not been root-caused yet
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP may require a limited address space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
  
* New Joinees
+
'''02/25/2020'''
** Yuval Caduri - from Marvell responsible for MUSDK driver - packet processor 8K chips
+
* Attendees
** Natalie - responsible for network PMD DPDK driver
+
** Govindarajan Mohandoss
** Dmitri Epshtein - crypto driver expert
+
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Govind will talk with George Zhao about which Taishan firmware version addresses the Meltdown issue.
 +
*** Huawei is investigating which Taishan server firmware version addresses the Meltdown issue and will update us soon.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** With a software bug fixed, VPP can boot normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** The current ThunderX2 in Arm lab are pre-production servers.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - Usage of MAP is recorded in confluence
 +
*** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
*** Patch is upstreamed for code review - https://gerrit.fd.io/r/c/vpp/+/25259
 +
** Investigating memory copy in ip4-rewrite on ThunderX2 - Govind
 +
*** Check the assembly code with other Arm CPU also.
 +
*** Send Govind the memory copy with fixed length.
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
**** Sending Govind the steps on installing GCC-9.2.0
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
**** Will sync up with James Yang about cache line fill buffers
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% improvement in macro-benchmarking by adding prefetches to the adj table on ThunderX2
  
* fd.io lab
+
 
** Follow up on ThunderX to getting mgmt IP - IP addresses are assigned, but are not up yet.
+
'''02/18/2020'''
** Release Machine to EdK as soon as ThunderX is up.
+
* Attendees
** Cavium has shipped more machines as well - Delivered a week back. Tina to follow up with Trishan.
+
** Govindarajan Mohandoss
** See the Taishan setup for any VM issue. - Sirshak is trying to reproduce the issue.
+
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VM-VHost test failing on 3n-tsh server.
 +
**** Issue with testpmd failure in VM has been resolved and merged.
 +
*** Govind will talk with George about which Taishan firmware version addresses the Meltdown issue.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** With a software bug fixed, VPP can boot normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
*** Will discuss about the cross compilation with qemu emulation solution in the monthly VPP call tomorrow - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and with NXP.
 +
*** VPP crash issue on Taishan server is resolved and patch is resolved.
 +
**** ThunderX2 has the same issue and has been resolved also.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Customer engineers claim that ThunderX2 does not support the Intel i40e NIC, which does not seem correct.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 
* VPP
 
* VPP
** VPP-1064 Dave Barach rejected the patch based on the solution Damjan and Nitin had decided upon following the reason that current approach breaks cross compilation. - NXP has upstreamed the DPAA2 patch, uses a separate segment makefile (dpaa.mk) for DPAA2. NXP does cross compilation most of the time. The approach could be that all platforms create a segment makefile and combine all of them into a single ARMv8 segment makefile.
+
** Vectorization
** One solution suggested was creating a platform specific Makefile for ThunderX
+
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
** Honnappa suggested that, since this is not just a ThunderX issue but also a Qualcomm issue, an ARM-specific Makefile would be better. (Issue: 128-byte cache line size)
+
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
** Honnappa: no update on memcpy benchmarking; will do that next week.
+
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
** 1019: Fixed locally, will upstream soon. The patch has issues, some of which are fixed.
+
** MAP with VPP - error is resolved. Sort of working. Record the details.
** 1021: Patch submitted; CentOS environment issue, CSIT to follow up. This can be closed.
+
*** Usage of MAP is recorded in confluence
** 1023: Migrated to OpenSSL using the DPDK manual, but test cases are failing. OpenSSL is integrated in his local environment, but VPP is not stable there. Updated in the ticket.
+
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
** 1043: No updates
+
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
** 990: Brian to Retry on mcbin
+
**** Patch is updated by adding more comments. - Jieqiang
** 1267: l3fwd performance tuning: Marvell to upstream a patch to enable dpdk on mcbin by making changes to dpdk plugin in vpp.
+
** Benchmarking AVF drivers on Arm servers - Jieqiang
** Auto-detection of memory channels: per Andrew's comment there is no real way to do that, so the plan is to make it a runtime argument via startup conf instead of hard-coding it.
+
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
** Sachin is facing issues building the RPM, currently on 1801; will open a Jira ticket if the issues persist with 1804.
+
** Failed to create an AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** To confirm with Damjan whether he plans to rewrite l2-nodes - Lijian
 +
*** To confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
*** Patch is ready for code review.
 +
** Investigating memory in ip4-rewrite on ThunderX2 - Govind
 +
*** Also check the assembly on other Arm CPUs.
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance than quad loop-unrolling.
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% improvement in macro-benchmarking from adding prefetches to the adj table on ThunderX2
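
As a quick check on the AVF benchmarking figures recorded in these minutes (VPP+DPDK at 5.18/5.21 Mpps versus VPP+AVF at 8.39/8.38 Mpps on ThunderX2), the following Python sketch computes the relative throughput. The minutes do not say what the two values in each pair represent, so averaging them is an assumption, and the names <code>mean</code> and <code>speedup</code> are illustrative only.

```python
# Throughput figures copied from the minutes, in Mpps (ThunderX2).
# Assumption: each pair is two comparable measurements that may be averaged.
dpdk_mpps = (5.18, 5.21)
avf_mpps = (8.39, 8.38)


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def speedup(new, old):
    """Ratio of mean throughputs; a value above 1.0 means `new` is faster."""
    return mean(new) / mean(old)


ratio = speedup(avf_mpps, dpdk_mpps)
print(f"VPP+AVF / VPP+DPDK = {ratio:.2f}x")
```

Under that assumption, the AVF driver delivers roughly 1.6 times the packet rate of the DPDK driver in this measurement.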
 +
 
 +
 
 +
'''02/11/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
* General
 
* CSIT
 
** Adarsh is stalled by test case failures after switching to OpenSSL.
+
** VPP Performance Test
** Performance testing (Khem): NUMA node numbering issue.
+
*** VM-VHost test failing on 3n-tsh server.
** The NUMA node numbering issue is not seen on ThunderX. Khem to post the details of the issue and the workaround on Taishan.
+
*** Tina to confirm which BIOS version on the Taishan server supports Meltdown.
** Khem is facing issues with TRex installation on ARM, so he will try to get an x86 machine as TG.
+
**** NICs cannot be bound to VFIO_PCI driver in VM which caused the failure.
** Nitin: known issue with TRex on Arm with Mellanox cards.
+
**** Will try iommu-passthrough=0/1 - Juraj
** Khem to try L2BD and L2XC.
+
*** Will confirm with Joyce about this issue - Lijian
** Brian to use cache stashing and see the results.
+
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
*** Will discuss the cross-compilation with QEMU emulation solution in the monthly VPP call tomorrow - Juraj
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Customer engineers claim that ThunderX2 does not support the Intel i40e NIC, which does not seem correct.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** Failed to create an AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** To confirm with Damjan whether he plans to rewrite l2-nodes - Lijian
 +
*** To confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
** Investigating memory in ip4-rewrite on ThunderX2 - Govind
 +
*** Also check the assembly on other Arm CPUs.
  
* Action Items - Next Week
 
** Nitin: Run a VPP performance test to understand if the memcpy neon version provides any benefits.
 
** Sirshak: Create a higher LF ticket so that it is easier for Trishan/Acton/Venessa/Mohammed to follow up on bringing up ThunderX/mcbin
 
** Nitin: start email discussion with Dave to address the creation of single makefile for all ARMv8 devices
 
  
''' 5/1/2018 '''
+
'''02/04/2020'''
* New Joinees
+
* Attendees
** Natalie and Yuval from Marvell for engineering input.
+
** Govindarajan Mohandoss
* fd.io lab
+
** Honnappa Nagarahalli
** Follow up on getting a mgmt IP for ThunderX
+
** Lijian Zhang
** Release Machine to EdK as soon as ThunderX is up.
+
** Juraj Linkes
** Cavium has shipped more machines as well.
+
** Tina Tsou
** See the Taishan setup for any VM issue.
+
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VM-VHost test failing on 3n-tsh server.
 +
*** Govind to send background details about Taishan kernel upgrade to Tina to confirm with George Zhao.
 +
**** The VM-VHost test cases have never passed before as per the previous logs in Taishan server.
 +
**** Issue is not reproducible locally - VHost/Virtual Ethernet interface creation passes in Taishan server in local setup.
 +
**** Next Steps: Follow up with Peter Mikus to debug the issue in Taishan server in CSIT lab.
 +
***** Build a local test setup to run the Testpmd application in VM.
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on the Taishan server only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
***** Will send email to CSIT-dev on how to avoid similar cases/issues.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
**** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** Will investigate 4-5 test failures in 'make test' - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Cables for intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with the Mellanox NIC on ThunderX2-02 first - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab, to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
** VPP-1064: Dave Barach rejected the patch, which was based on the solution Damjan and Nitin had decided upon, because the current approach breaks cross compilation.
+
** Align Arm patches with VPP release plan.
** One solution suggested was creating a platform specific Makefile for ThunderX
+
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
** Honnappa suggested that, since this is not just a ThunderX issue but also a Qualcomm issue, an ARM-specific Makefile would be better. (Issue: 128-byte cache line size)
+
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
** Honnappa: no update on memcpy benchmarking; will do that next week.
+
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
** 1019: Fixed locally, will upstream soon.
+
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
** 1021: Patch submitted; CentOS environment issue, CSIT to follow up.
+
** Vectorization
** 1023: Migrated to OpenSSL using the DPDK manual, but test cases are failing.
+
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
** 1043: No updates
+
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
** 990: Brian to Retry on mcbin
+
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
** 1267: l3fwd performance tuning: Marvell to upstream a patch to enable dpdk on mcbin by making changes to dpdk plugin in vpp.
+
** MAP with VPP - error is resolved. Sort of working. Record the details.
** Auto-detection of memory channels: per Andrew's comment there is no real way to do that, so the plan is to make it a runtime argument via startup conf instead of hard-coding it.
+
*** Usage of MAP is recorded in confluence
** Sachin is facing issues building the RPM, currently on 1801; will open a Jira ticket if the issues persist with 1804.
+
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** Failed to create an AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
 +
*** Will try one patch to enable N1SDP board.
 +
*** Please try AVF with Mcbin if possible.
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Applied prefetches and dual loop to the l2_fwd node; failed to apply them to l2_learn
 +
*** Lock-free allocation/free gives 7%-11% improvement on ThunderX2, but no improvement on x86 and Cortex-A72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
 +
** EPIC for next quarter:
 +
 
 +
'''01/28/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
* General
 
* CSIT
 
** Adarsh is stalled by test case failures after switching to OpenSSL.
+
** VPP Performance Test
** Performance testing (Khem): NUMA node numbering issue.
+
*** VM-VHost test failing on 3n-tsh server.
** The NUMA node numbering issue is not seen on ThunderX. Khem to post the details of the issue and the workaround on Taishan.
+
**** The VM-VHost test cases have never passed before as per the previous logs in Taishan server.
** Khem is facing issues with TRex installation on ARM, so he will try to get an x86 machine as TG.
+
**** Issue is not reproducible locally - VHost/Virtual Ethernet interface creation passes in Taishan server in local setup.  
** Nitin: known issue with TRex on Arm with Mellanox cards.
+
**** Next Steps: Follow up with Peter Mikus to debug the issue in Taishan server in CSIT lab.
** Khem to try L2BD and L2XC.
+
***** Build a local test setup to run the Testpmd application in VM.
** Brian to use cache stashing and see the results.
+
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on the Taishan server only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
***** Will send email to CSIT-dev on how to avoid similar cases/issues.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
**** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** Will investigate 4-5 test failures in 'make test' - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Cables for intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with the Mellanox NIC on ThunderX2-02 first - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab, to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** Failed to create an AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
 +
*** Will try one patch to enable N1SDP board.
 +
*** Please try AVF with Mcbin if possible.
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Applied prefetches and dual loop to the l2_fwd node; failed to apply them to l2_learn
 +
*** Lock-free allocation/free gives 7%-11% improvement on ThunderX2, but no improvement on x86 and Cortex-A72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
 +
** EPIC for next quarter:
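
The 20.01 release plan listed in these minutes spaces its milestones seven days apart (F0, then RC1 = F0+7, RC2 = RC1+7, formal release = RC2+7). A minimal Python sketch, using hypothetical names such as <code>milestones</code>, reproduces the dates from the F0 date alone:

```python
from datetime import date, timedelta

# F0 date as recorded in the minutes; each later milestone is 7 days after the previous one.
f0 = date(2020, 1, 8)
step = timedelta(days=7)

# Milestone names follow the release plan; the dict itself is illustrative.
milestones = {
    "F0": f0,
    "RC1": f0 + step,
    "RC2": f0 + 2 * step,
    "Formal Release": f0 + 3 * step,
}

for name, day in milestones.items():
    print(f"{name}: {day.isoformat()}")
```

This can help when aligning Arm patches with the plan, since shifting F0 moves every later milestone by the same amount.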
  
* Action Items - Next Week
+
'''01/21/2020'''
** Sirshak: Follow up with Mohammed regarding ThunderX mgmt connectivity and mcbin.
+
* Attendees
** Sirshak to hold a call with Khem and Adarsh to understand the vm_vhost issue caused by nested VMs - Not done yet; will do it next week.
+
** Tina Tsou
** Sirshak to create consolidated ARM ecosystem xls to reflect CSIT effort. - Not Done will do it next week.
+
** Govindarajan Mohandoss
** Honnappa: memcpy benchmarking
+
** Honnappa Nagarahalli
** Brian : CSIT-990(buildroot)
+
** Michaela Tahiri
** Brian to publish a pictorial representation of rx queues and tx queues in multicore case for mcbin. - Moved to next week
+
* General
** Khem to analyze make test failure in Taishan - 1804 - Next Week
+
* CSIT
** ARM - For TG for deciding connectivity - MCBin and Taishan - Working on it.
+
** VPP Performance Test
** CSIT-990: Brian to try - Next Week
+
*** VM-VHost test failing on 3n-tsh server.
** Sirshak/Brian to recheck validity of ASLR issue. - Not Done. Next Week.
+
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on the Taishan server only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
***** Will send email to CSIT-dev on how to avoid similar cases/issues.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
**** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** Will investigate 4-5 test failures in 'make test' - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Cables for intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with the Mellanox NIC on ThunderX2-02 first - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
 +
*** Will try one patch to enable N1SDP board.
 +
*** Please try AVF with Mcbin if possible.
 +
** Investigate bihash operations, which are hot spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Lock-free allocation/free give 7%-11% improvement on ThunderX2, but no improvement on x86 and CortexA72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Done finishing single flow with L2/L3/input-ACL on N1SDP board, will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
 +
** EPIC for next quarter:
  
* Action Items - Last Week
 
** Khem to ask mohammed, anton for power clearance for 2 new taishan. - Ok for Power Clearance
 
** Sirshak to hold a call with Khem and Adarsh to understand the Vm_vhost issue caused by nested VMs - Not done yet; will do it next week.
 
** Sirshak and Brian to discuss on TG connectivity. - Done
 
** Sirshak to create consolidated ARM ecosystem xls to reflect CSIT effort. - Not Done will do it next week.
 
** Nitin: To post vlib_main 1804_rc2 issue to community. - Done
 
** Sirshak: to check if vlib_main is an issue in centriq. - Done
 
** Nitin: AI for creating Jira for number of memory channel identification. - Done
 
** Brian to publish a pictorial representation of rx queues and tx queues in multicore case for mcbin. - Moved to next week
 
** John B - 1G to USB adapters Ship to lab. - Done
 
** Khem to analyze make test failure in Taishan - 1802 rc2 - Next Week
 
** ARM - For TG for deciding connectivity - MCBin and Taishan - Working on it.
 
** CSIT 990 brian to try - Next Week
 
** Sirshak to take 1103 and 1114 - Done
 
** Nitin to Create l3fwd tkt - Done
 
** Brian to create a mcbin crash tkt. Next Week
 
** Maen to provide contact for IO Stashing on mcbin. - Contacted Brian. Brian to provide further input.
 
** Sirshak/Brian to recheck validity of ASLR issue. - Not Done. Next Week.
 
  
''' 4/25/2018 '''
+
'''01/14/2020'''
* Meeting Time
+
* Attendees
** Proposed time 6-8am Tuesday PST.
+
** Tina Tsou
** Tina to update wiki with new meeting time.
+
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on the Taishan server only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
***** Will send email to CSIT-dev on how to avoid the similar case/issue.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
**** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** Will investigate 4-5 test failures in 'make test' - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues with building VPP distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** ThunderX
+
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on ThunderX2-01.
*** OS installed on ThunderX. Switch being sent.
+
*** Cables for the Intel NICs have been ordered.
*** 1 ThunderX booted.
+
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
*** Plan to use 1G to USB adapters.
+
** Script/commands to verify the NICs are ready. Will try with the Mellanox NIC on ThunderX2-02 first - Lijian
*** Varun POC for Cavium.
+
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab - Juraj
** Taishan
+
** Current Configurations:
*** Its up and connected to Internet.
+
*** RAM: 256G
*** Build and make test 2 TCs failing (VCL TCs failing) - 1802 rc2 used.
+
*** Disk: 480G SSD
*** Brian no update for TG - Meeting on it next week.
+
*** The boxes are coming with Qlogic cards which are not supported in VPP.
*** Khem to ask mohammed, anton for power clearance for 2 new taishan.
+
** Changes required to the servers:
** MCBin
+
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
*** Maen POC - To Contact Mohammed.
+
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
*** Maen to provide engineering contact for help to Nitin.
+
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** Round Table status on Porting tkts.
+
** Align Arm patches with VPP release plan.
** Nitin: vlib_main taking a lot of time on both mcbin and thunderx2
+
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
** Sirshak to take on ARM tkts.
+
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
 +
*** Will try one patch to enable N1SDP board.
 +
*** Please try AVF with Mcbin if possible.
 +
** Investigate bihash operations, which are hot spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Lock-free allocation/free give 7%-11% improvement on ThunderX2, but no improvement on x86 and CortexA72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Done finishing single flow with L2/L3/input-ACL on N1SDP board, will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
 +
** EPIC for next quarter:
 +
 
 +
'''01/07/2020'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
* General
 
* CSIT
 
* CSIT
** Adarsh looking at IPv4 failed test cases with priority.
+
** VPP Performance Test
** Sirshak to hold a call with Khem and Adarsh to understand the Vm_vhost issue caused by nested VMs
+
*** Performance data on Arm in official release 19.08 is available.
** Cavium to publish mcbin CSIT performance numbers, but at low priority. Nitin faced a build-root issue with this.
+
**** https://docs.fd.io/csit/rls1908/report/
** Maciek to host a kick off call.
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
** Sirshak and Brian to discuss on TG connectivity.
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
** Sirshak to create consolidated ARM ecosystem xls to reflect CSIT effort.
+
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
* Performance Benchmarking
+
*** The 'show pci' command causes a crash, which affects all performance tests on the Taishan server only.
** Nitin: To post vlib_main 1804_rc2 issue to community.
+
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
** Nitin: vlib_main issue in mcbin and thunderx2 at different points within the function. Not a hotspot in x86.
+
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
** Sirshak: to check if vlib_main is an issue in centriq.
+
**** Will change the performance job from weekly to daily runs. - Juraj
** Nitin: AI for creating Jira for number of memory channel identification.
+
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
** AI for creating Jira for the crash on Mcbin – Brian
+
**** Have upgraded Python2 to Python3 successfully.
** Khem to get started on CSIT performance suite this week and publish on shared xls.
+
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
** Brian to publish a pictorial representation of rx queues and tx queues in multicore case for mcbin.  
+
***** Will send email to CSIT-dev on how to avoid the similar case/issue.
* Action Items - Last Week
+
** VPP Path
** Sirshak to add link to xls to wiki page. - Done by somebody else.
+
*** Verifying VPP on Centos/Arm - Juraj
** Brian to raise LF RT ticket about MACCHIATObins - Done. Pinged Mohammed; yet to hear back from him.
+
*** Trying to update kernel to 64K page size on CentOS - No update - Lijian
** Nitin to check 'make test' on MACCHIATObin (16GB DRAM) - Failed. Error related to Python scripts.
+
**** VPP can boot up normally with 16K/64K page size. Will investigate 4-5 test failures in 'make test' - Lijian
** Honnappa, Khem to check Clang build on arm64. - Tried the clang build on Centriq; made some changes, but it still fails. The clang build on x86 has errors but still passes. 'make test' fails on x86. Jira Card to be created - '''AI(Sirshak)'''. Khem to try.
+
**** Will try with CentOS 8 which seems to be working fine with 64K page size.
* Action Items
+
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
** John B- 1G to USB adapters Ship to lab.
+
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
** Khem to analyze make test failure in Taishan - 1802 rc2
+
*** Tried cross-compiling with DPDK only.
** ARM - For TG for deciding connectivity - MCBin and Taishan
+
*** Initial cross-compiling is working fine.
** CSIT 990 brian to try
+
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
** Sirshak to take 1103 and 1114
+
*** There are issues with building VPP distros.
** Nitin to Create l3fwd tkt
+
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
** Brian to create a mcbin crash tkt.
+
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
** Maen to provide contact for IO Stashing on mcbin.
+
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
** Sirshak/Brian to recheck validity of ASLR issue.
+
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
** Sirshak to track down issues.
+
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Cables for the Intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with the Mellanox NIC on ThunderX2-02 first - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate bihash operations, which are hot spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Lock-free allocation/free give 7%-11% improvement on ThunderX2, but no improvement on x86 and CortexA72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Done finishing single flow with L2/L3/input-ACL on N1SDP board, will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** EPIC for next quarter:
  
''' 4/18/2018 '''
+
 
 +
'''12/17/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
** Xiaoming Jiang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on the Taishan server only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
*** Trying to update kernel to 64K page size on CentOS - No update - Lijian
 +
**** Will try with CentOS 8 which seems to be working fine with 64K page size.
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues with building VPP distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** Temporarily borrow 1x ThunderX to be used for ONAP demo at OpenStack Summit (end of May)? Yes.
+
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on ThunderX2-01.
** OS exists on ThunderXs; Varun will keysign with EdW; need to resolve OS netdev connectivity over 10/40GbE
+
*** Cables for the Intel NICs have been ordered.
** OS exists on TaiShan2280; no connectivity to the Internet
+
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with the Mellanox NIC on ThunderX2-02 first - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** RC2
+
** Align Arm patches with VPP release plan.
*** 'make' passes, 'make test' fail, 'make test-all' ???  - MACCHIATObin (4GB DRAM)
+
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
*** 'make' passes, 'make test' pass, 'make test-all' fails - Centriq
+
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
*** 'make' passes, 'make test' pass, 'make test-all' fails - x86
+
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
** Build
+
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
*** Testing Verify and Merge jobs for 18.04 master on arm64 today
+
** Vectorization
*** Clang build fails on arm? 'CC=clang CXX=clang make'
+
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
*** Patches are upstreamed, but not reviewed yet.
 +
** Investigate bihash operations, which are hot spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Lock-free allocation/free give 7%-11% improvement on ThunderX2, but no improvement on x86 and CortexA72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check checking meta data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** EPIC for next quarter:
 +
 
 +
'''12/10/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
** Xiaoming Jiang
 +
* General
 
* CSIT
 
* CSIT
** Adarsh updated CSIT status in xls
+
** VPP Performance Test
** CSIT-1023: decided to go with OpenSSL instead of ARMv8 crypto library, in DPDK, due to number of algorithms supported
+
*** Performance data on Arm in official release 19.08 is available.
*** e.g. AES-GCM not supported by ARMv8 crypto library
+
**** https://docs.fd.io/csit/rls1908/report/
** Nitin updated CSIT-990 (buildroot) with more information
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
* Action Items
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
** Sirshak to add link to xls to wiki page.
+
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
** Brian to raise LF RT ticket about MACCHIATObins
+
*** The 'show pci' command causes a crash, which affects all performance tests on Taishan servers only.
** Nitin to check 'make test' on MACCHIATObin (16GB DRAM)
+
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
** Honnappa, Khem to check Clang build on arm64
+
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Have upgraded Python2 to Python3 successfully.
 +
** VPP Path
 +
*** Verifying VPP on CentOS/Arm - Juraj
 +
*** Trying to update kernel to 64K page size on CentOS - Lijian
 +
**** Will try with CentOS 8 which seems to be working fine with 64K page size.
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one ThunderX2.
 +
*** What's the preferred work method with Mellanox NIC, using DPDK pmd or RDMA? - Juraj
 +
*** Check BIOS version - Lijian
 +
*** Make sure all NICs are plugged into same PCI slot number - Lijian
 +
*** Verify Intel i40e driver/firmware version - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 is delivered to Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7) Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** MAP can give profiling data at selected points on the timeline
 +
*** MAP cannot do profiling with specific CPU cores, and cannot give assembly views
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop to the l2_fwd node; applying them to l2_learn failed
 +
*** Make ACL entries cache-line aligned, and benchmark the result.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Benchmark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check to check metadata.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy for all bihash update operations, then apply RCU to make lookups lock-free
 +
** EPIC for next quarter:
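
The 20.01 milestone dates listed above under "Align Arm patches with VPP release plan" follow a fixed seven-day cadence (F0+7, RC1+7, RC2+7). As a minimal sketch, assuming only that cadence and the F0 date from the minutes, the dates can be re-derived as follows; the function name <code>milestones</code> is illustrative and not part of any VPP tooling.

```python
from datetime import date, timedelta

# F0 date and the 7-day step come from the minutes ("F0+7", "RC1+7", "RC2+7").
F0 = date(2020, 1, 8)
STEP = timedelta(days=7)


def milestones(f0, step=STEP):
    """Return (name, date) pairs for F0, RC1, RC2 and the formal release."""
    names = ["F0", "RC1", "RC2", "Formal Release"]
    return [(name, f0 + i * step) for i, name in enumerate(names)]


if __name__ == "__main__":
    for name, day in milestones(F0):
        print(f"{name}: {day.isoformat()}")
```

Changing <code>F0</code> shifts every later milestone by the same amount, which is handy when checking whether an Arm patch can still land before the API freeze.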
  
''' 4/11/2018 '''
+
 
* Proposal to keep meeting at current time with additional overflow meeting at 8AM PST
+
'''12/03/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
** Xiaoming Jiang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** There's a Python API issue which affects all performance tests on Taishan server only.
 +
**** The failure turns out to be caused by PCI show with Mellanox NICs on Taishan servers.
 +
**** Talk to Peter to temporarily remove 'PCI dump' for Taishan servers - Juraj
 +
**** Could you try debug version of VPP with the setup and capture the traceback log? - Juraj
 +
**** Will try to root cause the problem with Taishan + Mellanox NIC - Lijian
 +
** VPP Path
 +
*** Verifying VPP on CentOS/Arm - Juraj
 +
*** Trying to update kernel to 64K page size on CentOS - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
*** VPP device failed after Python3 upgrade
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
** MACCHIATObins just arrived at VEXXHOST
+
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab - Juraj
** Nitin working on getting IPMI login credentials to provision OS on ThunderX
+
** Current Configurations:
** Need to connect Skylake TG machines to Arm machines
+
*** RAM: 256G
*** ETA: 1wk
+
*** Disk: 480G SSD
** Khem working with Aton (LF) to provision OS on TaiShan2280
+
*** The boxes are coming with Qlogic cards which are not supported in VPP.
*** ETA: 1wk, Ubuntu 17.10
+
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 is delivered to Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
** Brian to do more benchmarking on MACCHIATObin
+
** Align Arm patches with VPP release plan.
** Khem working on benchmarking clib_memcpy64_x4()
+
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** MAP can give profiling data at selected points on the timeline
 +
*** MAP cannot do profiling with specific CPU cores, and cannot give assembly views
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop to the l2_fwd node; applying them to l2_learn failed
 +
*** Make ACL entries cache-line aligned, and benchmark the result.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Benchmark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check to check metadata.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy for all bihash update operations, then apply RCU to make lookups lock-free
 +
** EPIC for next quarter:
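
The packaging items above mention putting a "proper architecture string" in the DEB package name. Debian calls 64-bit Arm <code>arm64</code>, while the kernel and the VPP platform name use <code>aarch64</code>, and Debian package files are conventionally named <code>package_version_architecture.deb</code>. The sketch below only illustrates that mapping; it is not taken from the patch under review, and the helper name and the version string are hypothetical.

```python
# Debian architecture names for common machine strings.
# "aarch64" -> "arm64" and "x86_64" -> "amd64" are the Debian names.
DEB_ARCH = {"aarch64": "arm64", "x86_64": "amd64"}


def deb_file_name(package, version, machine):
    """Build a <package>_<version>_<arch>.deb file name (Debian convention).

    Hypothetical helper for illustration; not part of the VPP build system.
    """
    try:
        arch = DEB_ARCH[machine]
    except KeyError:
        raise ValueError(f"no Debian architecture known for {machine!r}")
    return f"{package}_{version}_{arch}.deb"


if __name__ == "__main__":
    # "20.01" is a placeholder version string.
    print(deb_file_name("vpp", "20.01", "aarch64"))
```

When cross-compiling, the machine string has to describe the target rather than the build host, so it should come from the selected platform and not from the host's <code>uname -m</code>.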
 +
 
 +
'''11/26/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
** Xiaoming Jiang
 +
* General
 
* CSIT
** Lucian submitted patches for CSIT-1019, CSIT-1021
+
** VPP Performance Test
** Lucian looking for contact for ARMv8 crypto driver in DPDK for CSIT-1023
+
*** Performance data on Arm in official release 19.08 is available.
*** See CSIT-1023 for details; looks like DPDK issue?
+
**** https://docs.fd.io/csit/rls1908/report/
** Nitin to add more details to CSIT-990
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
* Action Items
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
** Sirshak to move JIRA tickets to xls
+
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
** Lucian to work with Nitin/Jerin on CSIT-1023
+
*** There's a Python API issue which affects all performance tests on Taishan server only.
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 is delivered to Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There is no crash with the latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop to the l2_fwd node; applying them to l2_learn failed
 +
*** Make ACL entries cache-line aligned, and benchmark the result.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Benchmark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check to check metadata.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy for all bihash update operations, then apply RCU to make lookups lock-free
 +
** EPIC for next quarter:
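
The bihash item above describes a two-step plan: route every update through a working copy, then use RCU so lookups need no lock. The sketch below shows that general copy-then-publish pattern only; it is not the VPP bihash code, and the class and method names are hypothetical. Python's garbage collector stands in for the RCU grace period here, whereas a C implementation must wait until no reader can still hold the old copy before freeing it.

```python
import threading


class CowTable:
    """Hypothetical table: writers copy, modify and publish; readers never lock."""

    def __init__(self):
        self._snapshot = {}  # treated as immutable once published
        self._writer_lock = threading.Lock()

    def lookup(self, key, default=None):
        # A single reference read; no lock is taken on the lookup path.
        snap = self._snapshot
        return snap.get(key, default)

    def update(self, key, value):
        # Writers are serialized against each other, not against readers.
        with self._writer_lock:
            working = dict(self._snapshot)  # the "working copy"
            working[key] = value
            self._snapshot = working  # publish by swapping the reference


if __name__ == "__main__":
    table = CowTable()
    table.update("00:11:22:33:44:55", 3)
    print(table.lookup("00:11:22:33:44:55"))
```

A reader that fetched the old snapshot before an update keeps seeing a consistent, if slightly stale, view, which is the trade-off this pattern makes in exchange for lock-free lookups.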
  
''' 4/4/2018 '''
+
'''11/19/2019'''
* Propose to move the meeting +2 hours?
+
* Attendees
* RC1 cut today
+
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** There's a Python API issue which affects all performance tests on Taishan server only.
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
** Allocate 3 ThunderX for EdK to integrate into CI
+
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab - Juraj
*** JohnB from Cavium agreed to supply 3 more ThunderX for CSIT (will pre-install FW & OS)
+
** Current Configurations:
** Brian working on provisioning SSDs for MACCHIATObins
+
*** RAM: 256G
** Khem can ping IPMI interfaces on TaiShan2280s; also needs an OS to be installed
+
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 is delivered to Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
** Discussed [https://schd.ws/hosted_files/onsna18/6c/ons_fdio_brooks.pdf ONS slides]
+
** Align Arm patches with VPP release plan.
** Khem has patch for clib_memcpy64_x4() and needs help benchmarking
+
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There is no crash with the latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop to the l2_fwd node; applying them to l2_learn failed
 +
*** Make ACL entries cache-line aligned, and benchmark the result.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Benchmark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check to check metadata.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy for all bihash update operations, then apply RCU to make lookups lock-free
 +
** EPIC for next quarter:
 +
 
 +
'''11/12/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 
* CSIT
** Lucian found and created JIRA tickets for 3 issues while running CSIT
+
** VPP Performance Test
** Nitin created JIRA ticket for buildroot issue
+
*** Performance data on Arm in official release 19.08 is available.
** Khem seeing issues with VM
+
**** https://docs.fd.io/csit/rls1908/report/
* Action Items
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
** Nitin/Varun to help provision Ubuntu 16.04 and firmware update on ThunderX machines
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 is delivered to Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Get ACL entries cache-line aligned, and benchmark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lockless
 +
** EPIC for next quarter:
  
''' 3/28/2018 '''
+
'''10/29/2019'''
* Sachin Saxena from NXP joined the call, welcome
+
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** Packaging is not finished: currently Ubuntu DEB packages are supported for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb), and this is still being sorted out.
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** Khemendra is having issues with Rudy's emails, so he has not been able to access the Taishan servers
+
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab. - Juraj
** Nitin will try to access the servers this week
+
** Current Configurations:
** MACCHIATObin setup in progress
+
*** RAM: 256G
** OD1000 is added as a Jenkins slave. The build is currently failing. The build can be triggered manually.
+
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install a 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** Discuss Single core, L3Fwd sample perf numbers and analysis next week
+
** Align Arm patches with VPP release plan.
** Sachin is working on compiling 18.01. Native compilation works fine, cross compilation is failing
+
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
** Nitin still working on patch for cache line size
+
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
** VPP-1126 is being used in DPDK input node. Khemendra will take a look at it this week.
+
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
** VPP-1129 Brian/Sirshak will take a look. Looks like it can be closed.
+
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
** VPP-1114 Patch under internal review
+
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Get ACL entries cache-line aligned, and benchmark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lockless
 +
** EPIC for next quarter:
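
The 20.01 release plan listed in these minutes spaces each milestone seven days after the previous one (F0, RC1 = F0+7, RC2 = RC1+7, release = RC2+7). As a small sanity check of that arithmetic, the sketch below derives the dates from the F0 date given in the notes. The function name <code>release_schedule</code> and its structure are illustrative assumptions, not part of any FD.io tooling.

```python
from datetime import date, timedelta

# F0 date and milestone names come from the minutes; everything else
# (function name, dictionary layout) is a hypothetical illustration.
F0 = date(2020, 1, 8)
STEP = timedelta(days=7)


def release_schedule(f0, step=STEP):
    """Return milestone dates, each one `step` after the previous one."""
    names = ["F0", "RC1", "RC2", "Release"]
    return {name: f0 + i * step for i, name in enumerate(names)}


schedule = release_schedule(F0)
for name, day in schedule.items():
    print(f"{name:8s}{day.isoformat()}")
```

Changing <code>F0</code> shifts all later milestones together, which is the property the "F0+7 / RC1+7 / RC2+7" notation in the plan expresses.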
 +
 
 +
'''10/22/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 
* CSIT
 
* CSIT
** Khemendra having issues with interface bring-up failing intermittently. Nitin suggested adding a delay.
+
** VPP Performance Test
** Nicolas/Lucian debugging TC-07
+
*** Performance data on Arm in official release 19.08 is available.
** Khemendra having issues with TG VM crashing randomly with Ubuntu 16.04, QEMU 2.10. Solved by moving to Ubuntu 17.10, QEMU 2.10
+
**** https://docs.fd.io/csit/rls1908/report/
** Nitin using Ubuntu 16.04 with 4.13 kernel
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
* Action Items
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
** Discuss Single core, L3Fwd sample perf numbers and analysis next week - Brian
+
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
** VPP-1126 Take a look this week as it affects DPDK input node - Khemendra
+
** VPP Path
** Need more attention on solution for buildroot issue, need more information on failure [https://jira.fd.io/browse/CSIT-990 CSIT-990] - Nitin
+
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
** Create an excel sheet with the test case status - Nicolas/Lucian
+
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** Packaging is not finished: currently Ubuntu DEB packages are supported for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb), and this is still being sorted out.
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab. - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install a 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Lijian
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Get ACL entries cache-line aligned, and benchmark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lockless
 +
** EPIC for next quarter:
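
The lockless bihash item in these minutes describes a two-step idea: every update first works on a private copy (the notes call this make_working_copy), and RCU is then used so that look-ups need no lock. The sketch below illustrates only the general copy-then-publish pattern in Python; it is not VPP's bihash code, and the class <code>CopyOnWriteTable</code> and its methods are hypothetical. In C, a real RCU scheme must also wait for a grace period before freeing the old copy; here Python's garbage collector stands in for that step.

```python
import threading


class CopyOnWriteTable:
    """Hypothetical table: readers never lock, writers copy then publish."""

    def __init__(self):
        self._snapshot = {}                  # never mutated once published
        self._write_lock = threading.Lock()  # serializes writers only

    def lookup(self, key, default=None):
        # A reader grabs one reference to the current snapshot and uses it;
        # later updates cannot change the snapshot it is holding.
        snapshot = self._snapshot
        return snapshot.get(key, default)

    def update(self, key, value):
        with self._write_lock:
            working_copy = dict(self._snapshot)  # copy
            working_copy[key] = value            # modify the private copy
            self._snapshot = working_copy        # publish by swapping one reference

    def delete(self, key):
        with self._write_lock:
            working_copy = dict(self._snapshot)
            working_copy.pop(key, None)
            self._snapshot = working_copy


table = CopyOnWriteTable()
table.update("aa:bb:cc:dd:ee:01", 1)
held_by_reader = table._snapshot          # what a reader in flight would hold
table.update("aa:bb:cc:dd:ee:02", 2)
print(len(held_by_reader), table.lookup("aa:bb:cc:dd:ee:02"))
```

The trade-off is that every update pays for a copy, so this pattern favors workloads where look-ups greatly outnumber updates.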
  
''' 3/21/2018 '''
+
'''10/15/2019'''
* Key signing party! Thank you Ed!
+
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** Packaging is not finished: currently Ubuntu DEB packages are supported for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb), and this is still being sorted out.
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** VEXXHOST currently working on getting another PDU because there are not enough power ports
+
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab. - Juraj
** Received SSDs for MACCHIATObins
+
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 240G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install a 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** Discuss high level plan for VPP on Arm
+
** Align Arm patches with VPP release plan.
** Nitin still working on patch for cache line size
+
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Investigating bi-hash lockless implementation - Jason
 +
** EPIC for next quarter:
 +
 
 +
'''10/08/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 
* CSIT
 
* CSIT
** Need more attention on solution for buildroot issue [https://jira.fd.io/browse/CSIT-990 CSIT-990]
+
** VPP Performance Test
** Nitin moving towards L2 & L3 perf test cases
+
*** Performance data on Arm in official release 19.08 is available.
** VM crash due to buffer overflow when multiple VMs share NVRAM; resolved in Fedora27
+
**** https://docs.fd.io/csit/rls1908/report/
''' 3/14/2018 '''
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
* Key signing party! Thank you Ed!
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** Packaging is not finished: currently Ubuntu DEB packages are supported for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb), and this is still being sorted out.
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** ToR switch issue resolved; confirm mgmt IP address assignment to racked Huawei/Cavium machines
+
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab. - Juraj
** Started provisioning MACCHIATObins; Andy ordered SSDs to go with them
+
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 240G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install a 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** No updates
+
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Investigating bi-hash lockless implementation - Jason
 +
** EPIC for next quarter:
 +
 
 +
'''10/01/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 
* CSIT
 
* CSIT
** Adarsh started running CSIT on virtual topology; moved past a paramiko issue, seeing other test failures
+
** VPP Performance Test
** Ongoing discussions on getting Adrian access to machines
+
*** Performance data on Arm in official release 19.08 is available.
''' 3/7/2018 '''
+
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** Packaging is not finished: currently Ubuntu DEB packages are supported for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb), and this is still being sorted out.
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** Trishan (LF) to help follow up on progress in FD.io lab
+
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab. - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 240G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install a 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** More discussion on patch for cache line size; use MIDR register exported by proc fs
+
** Align Arm patches with VPP release plan. - Lijian
** Decision has been made to use wrappers for atomics
+
** Vectorization
** Damjan reworked PCI handling code and added native driver for Intel AVF (XL710 i.e. Fortville)
+
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
*** Measuring 132 clocks per packet on Skylake (ip4 routing) with VLIB_FRAME_SIZE 256 (default); +1Mpps over DPDK avf/i40e PMD
+
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
** Damjan reworked memcpy() in MEMIF; achieve 2x25GbE line rate with these changes
+
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
** Sirshak working on getting VPP running on Qualcomm Centriq with Mellanox NIC
+
** MAP with VPP - error is resolved. Sort of working. Record the details.
*** Seeing issues with external DPDK; static works but not shared; is VPP build system missing -libverbs -lmlx5 in LDFLAGS?
+
*** There's no crash issue with latest VPP code.
*** Nitin noticed DPDK 17.11 Mellanox PMD does not compile
+
** Investigate bihash operations, which are hot-spots in L2 throughput
*** Mellanox recently submitted a patch to VPP to support dynamic loading of Mellanox libraries
+
*** Cache misses and CRC32 calculation are possible opportunities.  
 +
** Investigating bi-hash lockless implementation - Jason
 +
** EPIC for next quarter:
 +
 
 +
'''09/24/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info with Greeshma and Khem.
 +
** Finished PPT and demo to Pravin - Will share with Juraj and Honnappa.
 
* CSIT
 
* CSIT
** Adrian does not have machines to work with in Bucharest; machine in Paris that Gabriel was using no longer available
+
** VPP Performance Test
*** AndyW to help resolve
+
*** Investigate DPDK performance job - Juraj
** Adarsh moved past VM issues; able to launch VPP in VM with virtio interface; starting to run CSIT scripts
+
*** Performance data on Arm in official release 19.08 is available.
''' 2/28/2018 '''
+
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's a performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
 +
**** Currently, trending data can only be monitored manually.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/, takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** Ed Kern to try containerized CI on one OD1000 in parallel with Vanessa
+
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
** Received MACCHIATObins in Austin
+
** Inform Prashant in FD.io lab about the incoming ThunderX2 - Lijian
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have plenty of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** Adarsh trying to run VPP in VM but getting PCI mapping issue; trying to connect to Linux bridge on host
+
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
** Patches for build breakage were committed; arm64 build stable now
+
** Align Arm patches with VPP release plan. - Lijian
** Brian able to reproduce low PPS numbers seen on MACCHIATObin
+
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** The eth_input_adv_and_flags_x4 optimization is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
**** Vectorize the data buffer index to data buffer pointer function.
 +
**** Jieqiang has finished code reviewing. Honnappa to review the patches.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
*** Finished reviewing the patches. Honnappa to review the patches.
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixed usage of GCC vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''09/17/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info with Greeshma and Khem.
 +
** Show CSIT CI/CD, Jenkins status, log and the voting right if there's any failure - Juraj & Lijian
 +
*** Will sync up with Juraj/Stan on Thursday on CSIT demo to Arm product manager.
 
* CSIT
 
* CSIT
** Adarsh can reproduce a crash in qemu 2.10 Ubuntu 16.04; going to try Ubuntu 17.10
+
** VPP Performance Test
** Need to partition func test cases across people
+
*** Performance data on Arm in official release 19.08 is available.
''' 2/21/2018 '''
+
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's a performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
 +
**** Currently, trending data can only be monitored manually.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/, takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
 +
** Inform Prashant in FD.io lab about the incoming ThunderX2 - Lijian
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have plenty of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** The eth_input_adv_and_flags_x4 optimization is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
**** Vectorize the data buffer index to data buffer pointer function.
 +
**** Jieqiang has finished code reviewing. Honnappa to review the patches.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Crash issue is reproduced - Jieqiang
 +
*** Crash is gone after applying the patch.
 +
**** There's crash issue when executing 'show hardwares'
 +
**** https://gerrit.oss.arm.com/#/c/131831/
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
*** Finished reviewing the patches. Honnappa to review the patches.
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixed usage of GCC vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 +
'''09/10/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info with Greeshma and Khem.
 +
** Show CSIT CI/CD, Jenkins status, log and the voting right if there's any failure - Juraj & Lijian
 +
*** Talk to Song about it.
 
* CSIT
 
* CSIT
** Gabriel updated CSIT/AArch64 wiki with PASS/FAIL/OTHER list
+
** VPP Performance Test
*** OTHER - failure due to expect-like parsing of output(?)
+
*** Performance data on Arm in official release 19.08 is available.
*** FAIL - ssh timeout during PCIe rescan(?)
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
** Moved past first UEFI crash; still seeing crashing on startup (Gabriel)
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
*** Setup new Ubuntu environment
+
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
*** Continue debugging UEFI issue on Fedora with JeremyL
+
**** If there's a performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
** Ubuntu is used pretty much everywhere except for additional CentOS CSIT perf
+
**** Currently, trending data can only be monitored manually.
** Nitin working on upstreaming changes to CSIT
+
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/, takes a lot of time: 3 days, 61 hours (28 hours on x86)
** Adarsh working on getting VM interfaces working
+
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Inform Prashant in FD.io lab about the incoming ThunderX2 - Lijian
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have plenty of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** More discussion on how to handle cache line size
+
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
** Sync'd on patches for build breakage
+
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** The eth_input_adv_and_flags_x4 optimization is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Crash issue is reproduced - Jieqiang
 +
*** Crash is gone after applying the patch
 +
**** There's crash issue when executing 'show hardwares'
 +
**** https://gerrit.oss.arm.com/#/c/131831/
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches applying it to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do benchmarking and profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixed usage of GCC vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 2/14/2018 '''
+
'''09/03/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info with Greeshma and Khem.
 +
** Show CSIT CI/CD, Jenkins status, log and the voting right if there's any failure - Juraj & Lijian
 +
*** Talk to Song about it.
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's a performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
 +
**** Currently, trending data can only be monitored manually.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/, takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Issue is root-caused. Patch is in community review - https://gerrit.fd.io/r/c/vpp/+/21469
 +
** In total, 29 VPP device test cases were executed: 26 passed and only 3 tap-related tests failed.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** Working on getting access to LF lab in order to setup OD1000 environment
+
** Inform Prashant in FD.io lab about the incoming ThunderX2 - Lijian
** Check with tykeal & zxiiro on trust policy for getting others access (Brian)
+
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
** VEXXHOST
+
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
*** Mohammed says they do not have extra rack shelf - we need to send one for 3x MACCHIATObin
+
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
*** LF RT tickets: #52434 (ThunderX), #52435 (TaiShan2280), #52436 (MACCHIATObin)
+
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have plenty of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** Build, unit test, deb/rpm
+
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
*** 64B/128B cache line size - working on passing this configuration to rest of build system i.e. DPDK (Nitin)
+
** Align Arm patches with VPP release plan. - Lijian
*** RPi3 32-bit
+
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
**** Some parts of patch are 32-bit related, some RPi3 related
+
*** Will check VPP release schedule and map with Arm Quarterly plan.
**** If there is justification, look into maintaining a 32-bit build on ARM
+
*** Note down patches in community review and align them to VPP release plan.
** Porting & Tuning
+
*** It has been challenging to do that in VPP.
*** If patches need to be tested on multiple Arm chips, please use DO_NOT_MERGE and Code Review -2
+
** Vectorization
*** Two NEON-related patches merged, work in progress on others, Nitin testing CLASSIFY_USE_SSE
+
**** The eth_input_adv_and_flags_x4 optimization is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Crash issue is reproduced - Jieqiang
 +
**** https://gerrit.oss.arm.com/#/c/131831/
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches applying it to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do benchmarking and profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixed usage of GCC vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''08/27/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info with Greeshma and Khem.
 
* CSIT
 
* CSIT
** Please open JIRA ticket with details on VM crashing on startup. DONE: [https://jira.fd.io/browse/CSIT-922 CSIT-922]
+
** VPP Performance Test
** Khem working on running VPP func tests on internal setup
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's a performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
 +
**** Currently trending data could be monitored manually only.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/, takes a lots of time, 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** What is not finished is packaging: currently Ubuntu's DEB packages are supported for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb), and this is still being sorted out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Issue is root-caused. Patch is in community review - https://gerrit.fd.io/r/c/vpp/+/21469
 +
** In total, 29 VPP device test cases were executed: 26 passed and only 3 tap-related tests failed.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Inform Prashant in FD.io lab for the incoming ThunderX2 - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Run VPP with MAP and reproduce the previous crash/failures - Jieqiang
 +
*** Got latest license to install MAP on Shanghai server.
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixture usage of gcc vector extension and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 2/7/2018 '''
+
'''08/20/2019'''
* LF lab
+
* Attendees
** OD1000 - last machine was racked; Vanessa needs credentials
+
** Tina Tsou
** Taishan2280 - machines arrived at Vexxhost; confirm with Rudy/Mohammed
+
** Honnappa Nagarahalli
** ThunderX - machines arrived at Vexxhost; send board details to Mohammed
+
** Lijian Zhang
** MACCHIATObin - boards arrived in Arm SJC waiting for enclosures (Andy)
+
** Jieqiang Wang
* Build, unit test, packaging
+
** Jason Zhang
** 64B/128B cache line size - working on it (Nitin)
+
** Juraj Linkes
** Interest in ILP32 from Cavium; customer coming from MIPS32
+
** Christian Hopps
*** [https://www.slideshare.net/linaroorg/bkk16305b-ilp32-performance-on-aarch64 BKK16-305B ILP32 Performance on AArch64]
+
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info to Greeshma and Khem.
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's a performance drop in CSIT performance testing, what action will be taken and who will take care of it?
 +
**** Currently, trending data can only be monitored manually.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/ takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** on Arm, different default memory map regions for normal page and huge page;
 +
**** vring with huge-page mapped to normal page region addresses is not working.
 +
**** 1. Reserve 16G VA space for future usage, automatic, private, anonymous and without HUGETLB option.
 +
***** base = mmap (0x410000000, 16 << 30, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 +
**** 2. From the 16G VA space, pick up a 40M unused space, redo mmap() with the HUGETLB option, address fixed
 +
***** vaWithinBase = mmap (base, 40 << 20, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED | MAP_HUGETLB | MAP_LOCKED, fd, 0);
 +
**** 3. Use vaWithinBase to initialize vring and vring_desc
 +
** In total, 29 VPP device test cases were executed: 26 passed and only 3 tap-related tests failed.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
** NEON usage in vhost - sent first patch for review (Nitin)
+
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
*** Need to verify how it performs on other Arm-based machines (Brian)
+
** Align Arm patches with VPP release plan. - Lijian
*** VPP maintainers prefer to use SIMD wrappers, but it might not always be possible
+
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
**** Cavium/Arm had to rewrite the algorithm for AArch64 instead of using SIMD wrappers in DPDK
+
*** Will check VPP release schedule and map with Arm Quarterly plan.
** CLIB_HAVE_VEC128 - working on it (Gabriel)
+
*** Note down patches in community review and align them to VPP release plan.
** Discussed compiler builtins for atomics in VPP call; need to spin another patch with wrappers based on architecture (Kevin)
+
*** It has been challenging to do that in VPP.
** Seeing prefetch hotspots on TX2+MlnxCX4en (similar to Armada8040) (Nitin)
+
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Run VPP with MAP and reproduce the previous crash/failures - Jieqiang
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixture usage of gcc vector extension and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
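The tap-interface workaround recorded under VPP Device in the 08/20/2019 minutes (reserve a VA range, then remap part of it at a fixed address with the HUGETLB option) could be sketched roughly as below. This is a minimal illustration for Linux, not the actual VPP patch: the helper names <code>reserve_va</code> and <code>map_within</code> are hypothetical, the mapping is anonymous and private rather than the fd-backed <code>MAP_SHARED | MAP_LOCKED</code> mapping in the notes, the <code>align</code> parameter stands in for the huge-page size (which the notes do not state), and the fallback to normal pages is an assumption added so the sketch also runs on machines with no huge pages configured.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Step 1: reserve a span of virtual address space without committing memory.
 * `hint` is only a preferred address; the kernel may place the span elsewhere.
 * Returns NULL on failure. */
void *reserve_va(void *hint, size_t len)
{
    void *base = mmap(hint, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return base == MAP_FAILED ? NULL : base;
}

/* Step 2: map usable memory at a fixed, aligned address inside the reserved
 * span. Huge pages are tried first; if that fails (for example, no huge pages
 * are configured), normal pages are mapped at the same address instead.
 * The fallback is an assumption of this sketch, not part of the minutes.
 * `*used_huge` is set to 1 if the huge-page mapping succeeded, else 0. */
void *map_within(void *base, size_t reserved, size_t len, size_t align,
                 int *used_huge)
{
    /* align must be a non-zero power of two (e.g. the huge-page size) */
    assert(align != 0 && (align & (align - 1)) == 0);

    /* round the start address up to the requested alignment */
    uintptr_t start = ((uintptr_t)base + align - 1) & ~(uintptr_t)(align - 1);
    if (start - (uintptr_t)base + len > reserved)
        return NULL; /* the aligned window does not fit in the reservation */

    int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED;
    void *va = MAP_FAILED;
#ifdef MAP_HUGETLB
    va = mmap((void *)start, len, PROT_READ | PROT_WRITE,
              flags | MAP_HUGETLB, -1, 0);
#endif
    *used_huge = (va != MAP_FAILED);
    if (va == MAP_FAILED)
        va = mmap((void *)start, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    return va == MAP_FAILED ? NULL : va;
}
```

Following the minutes, a caller would reserve <code>16 << 30</code> bytes with a hint of <code>0x410000000</code>, map <code>40 << 20</code> bytes inside that span, and then use the returned address to initialize vring and vring_desc (step 3).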
 +
 
 +
'''08/13/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info to Greeshma and Khem.
 
* CSIT
 
** libvirt crashing on VM startup (Hierofalcon) (Gabriel)
+
** VPP Performance Test
*** Need someone who can reproduce this issue (Arm TBD)
+
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
** Huawei also seeing VM issues (Khem)
+
** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
** buildroot doesn't work on Arm (Nitin)
+
** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
*** Root issue: no support in GRUB for AArch64 in buildroot (?)
+
** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
**** Need someone who can reproduce this issue (Arm TBD)
+
** Only 1 out of 199 test cases failed, 8 test cases show random 'show hardware-interfaces' failure.
*** Peter Mikus replied to Nitin on csit-dev mail list
+
** Some failures are related with 'show hardware-interfaces'/'show vhost dump', time-out.
*** Using a temporary workaround: use a different VM image (Ubuntu Cloud) instead of one produced by buildroot
+
*** Juraj to send Lijian the commands/APIs in random dump failure.
**** Working on patching DPDK in VM image (Ubuntu Cloud) just like done in buildroot
+
*** https://jira.fd.io/browse/CSIT-1453
* Misc
+
*** SFP eeprom dump is enabled with 'show hardware-interfaces detail' only. Patch is merged.
** OpenFlow (Nitin, Damjan)
+
*** Juraj will change CSIT script with 'show hardware-interfaces verbose', https://gerrit.fd.io/r/#/c/csit/+/21085/
*** Is there an OpenFlow agent for VPP, and can VPP implement OpenFlow rules/tables?
+
**** CSIT patch is merged.
*** VPP is not flow-based like OVS is; they are different
+
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
*** Can ODL/Honeycomb be used?
+
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
*** Patch to generate daily data and trending graph is committed.
 +
**** https://gerrit.fd.io/r/#/c/csit/+/20962/
 +
**** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine. Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Have gone thru the whole patch, pmalloc module and tap interface code, but cannot identify the root-cause - Lijian
 +
**** Buffer allocate/free based pmalloc seems to be causing the problem.
 +
**** mmap() regions with normal page and huge-page have separate VA spaces.
 +
** In total, 29 VPP device test cases were executed: 26 passed and only 3 tap-related tests failed.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** All 7 patches are merged.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Jieqiang checked the video by Sirshak
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixture usage of gcc vector extension and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 1/31/2018 '''
+
'''08/06/2019'''
* LF lab
+
* Attendees
** OD1000 - 1 replacement being installed this week
+
** Tina Tsou
** Huawei & Cavium boards should arrive at colo this week; confirm with Rudy
+
** Honnappa Nagarahalli
* Build, unit test, packaging
+
** Lijian Zhang
** Kubeproxy/NAT failures
+
** Jieqiang Wang
*** Not arch related
+
** Jason Zhang
*** Part of extended unit tests, so does not block CI
+
** Juraj Linkes
** `make test` passes on D03 & D05 (Ubuntu)
+
** Christian Hopps
* MACCHIATObin
+
* General
** Seeing hotspots in VPP graph nodes
+
*** L3 forwarding - ip4 rewrite node
+
*** L2 cross-connect
+
*** Try reducing quad loop to a dual loop
+
*** dpdk-input node is highly optimized for x86 (could contribute to low perf) but hotspots still in rte_mbuf_t conversion(?)
+
** Some examples of runtime code selection based on uarch exist in the codebase
+
 
* CSIT
 
** Adrian Oanca joined from Enea
+
** VPP Performance Test
** Gabriel seeing VM crashing during boot; related to # interfaces assigned (6)
+
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
** Nitin ran into issue with buildroot on arm64; see thread on csit-dev
+
** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
** Only 1 out of 199 test cases failed, 8 test cases show random 'show hardware-interfaces' failure.
 +
** Some failures are related with 'show hardware-interfaces'/'show vhost dump', time-out.
 +
*** Juraj to send Lijian the commands/APIs in random dump failure.
 +
*** https://jira.fd.io/browse/CSIT-1453
 +
*** SFP eeprom dump is enabled with 'show hardware-interfaces detail' only. Patch is merged.
 +
*** Juraj will change CSIT script with 'show hardware-interfaces verbose', https://gerrit.fd.io/r/#/c/csit/+/21085/
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
*** Patch to generate daily data and trending graph is committed.
 +
**** https://gerrit.fd.io/r/#/c/csit/+/20962/
 +
**** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Have gone thru the whole patch, pmalloc module and tap interface code, but cannot identify the root-cause - Lijian
 +
**** Buffer allocate/free based pmalloc seems to be causing the problem.
 +
** In total, 29 VPP device test cases were executed: 26 passed and only 3 tap-related tests failed.
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** All 7 patches are merged.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Jieqiang checked the video by Sirshak
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixture usage of gcc vector extension and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 1/24/2018 '''
+
'''07/30/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
** Only 1 out of 199 test cases failed, 8 test cases show random 'show hardware-interfaces' failure.
 +
** Some failures are related with 'show hardware-interfaces'/'show vhost dump', time-out.
 +
*** Juraj to send Lijian the commands/APIs in random dump failure.
 +
*** https://jira.fd.io/browse/CSIT-1453
 +
*** Will check details on an x86 server as well. It is also slow on x86, but only 5 sec, versus 40 sec on Taishan - Lijian
 +
*** ‘show hardware-interfaces’ is quite time-consuming because it reads the SFP EEPROM via a software-emulated I2C bus.
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Have gone thru the whole patch, pmalloc module and tap interface code, but cannot identify the root-cause - Lijian
 +
**** pmalloc module test cases failed on Arm server due to sudo privilege.
 +
** In total, 35 VPP device test cases passed, and only 3 tap-related tests failed.
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
** DPDK issue with non-pci network cards
+
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
** build & test status updated
+
** Align Arm patches with VPP release plan.
** VPP-1127 (VEC_128 enable) under discussion. Should we enable this by default ?
+
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
** add Nitin to review Neon commits
+
*** Will check VPP release schedule and map with Arm Quarterly plan.
** VPP-1114 currently internal review
+
*** Note down patches in community review and align them to VPP release plan.
** VPP-1064 under rework after review by Damjan
+
*** It has been challenging to do that in VPP.
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue: optimize it with relaxed atomic intrinsics - Lijian
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
*** Spinlock with inner loop showed improvement on both x86 and Arm.
 +
*** Read/write lock showed a slight degradation with the patch.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Jieqiang checked the video by Sirshak
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do benchmarking and profiling on mcbin/Bluefield.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPUs, as there is a mixed use of GCC vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''07/23/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
* General
 
* CSIT
 
* CSIT
** first 3-nodes functional tests status list
+
** VPP Performance Test
** TODO Gabriel: share CSIT VM setup env
+
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
** nested VM: build-root package support for ARM. Create Jira ticket for Brian.
+
** Trying to fix all the failures in the daily performance test job. Almost all the tests passed locally.
 +
** Only 1 out of 199 test cases failed; 8 test cases show random 'show interface' failures.
 +
** Some failures are related to 'show hardware'/'show interface'/'show vhost dump' timeouts.
 +
*** https://jira.fd.io/browse/CSIT-1453
 +
*** Will also check the details on an x86 server. It is slow on x86 too, but takes only 5 sec there versus 40 sec on Taishan - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
**** 1. All tests are failing. 'show hardware' takes too much time. https://jira.fd.io/browse/VPP-1722
 +
**** 2. To figure out which test cases are executed
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Enlarging the duration fixes the failure, but will investigate in more detail.
 +
**** Issues have been fixed in latest master branch. Investigating the details.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Send email and current debug details to community calling for volunteer to fix it. - Lijian
 +
**** pmalloc module test cases failed on Arm server.
 +
*** Changes are uploaded to community gerrit.
 +
*** VPP VMs seem to come up well. Will work on the init script and bring up VPP.
 +
**** VM tests passed. Patches are to be submitted for community review.
 +
**** All the patches are merged and all images are built.
 +
*** Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
 +
**** Docker images for both Arm and x86 are merged and available.
 +
**** https://jenkins.fd.io/sandbox/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/1/console
 +
**** Docker image is verified on an Arm server; it still needs to be verified on an x86 server and tried in Jenkins.
 +
**** Arm and x86 have separate docker image. Arm docker image is to be built.
 +
**** 35 test cases in total; only 3 tap-related tests failed.
 +
*** Ed to help set up numad cluster with dual ThunderX and one ThunderX2
 +
* FD.io lab
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type and the same fiber SFP type, and that the NICs are plugged into the same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue: optimize it with relaxed atomic intrinsics - Lijian
 +
** Vectorization
 +
**** The patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
*** Spinlock with inner loop showed improvement on both x86 and Arm.
 +
*** Read/write lock showed a slight degradation with the patch.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Inform MAP owner that Jieqiang will take care of MAP on VPP. - Lijian
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do benchmarking and profiling on mcbin/Bluefield.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPUs, as there is a mixed use of GCC vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 1/17/2018 '''
+
'''07/16/2019'''
* Tina to send calendar invite for meeting
+
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
**** 1. All tests are failing. 'show hardware' takes too much time. https://jira.fd.io/browse/VPP-1722
 +
**** 2. To figure out which test cases are executed
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Enlarging the duration fixes the failure, but will investigate in more detail.
 +
**** Issues have been fixed in latest master branch. Investigating the details.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Send email and current debug details to community calling for volunteer to fix it. - Lijian
 +
*** Changes are uploaded to community gerrit.
 +
*** VPP VMs seem to come up well. Will work on the init script and bring up VPP.
 +
**** VM tests passed. Patches are to be submitted for community review.
 +
**** Patch is split into three small pieces. Two patches (kernel image for VM test, and generic CSIT changes to support the ThunderX2 testbed) are merged. The third patch, with the Arm-specific code changes for the VM test that use the kernel image, is still to be merged.
 +
*** Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
 +
**** Docker images for both Arm and x86 are merged and available.
 +
**** Docker image is verified on an Arm server; it still needs to be verified on an x86 server and tried in Jenkins.
 +
*** Ed to help set up numad cluster with dual ThunderX and one ThunderX2
 
* FD.io lab
 
* FD.io lab
** Cavium shipping
+
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type and the same fiber SFP type, and that the NICs are plugged into the same PCI slots.
 +
** ThunderX1
 
* VPP
 
* VPP
** Kubeproxy tests failing
+
** VPP host-stack Hotspots
** Khem trying to find out the PCIe address for a given netdev interface
+
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue: optimize it with relaxed atomic intrinsics - Lijian
 +
** Vectorization
 +
**** The patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
*** Spinlock with inner loop showed improvement on both x86 and Arm.
 +
*** Read/write lock showed a slight degradation with the patch.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do benchmarking and profiling on mcbin.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPUs, as there is a mixed use of GCC vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''07/09/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
* General
 
* CSIT
 
* CSIT
** Gabriel setting up 3 node topo with VMs
+
** VPP Performance Test
** Gabriel working on PASS/FAIL status
+
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
* [https://docs.fd.io/csit/rls1710/report/index.html CSIT 17.10 report]
+
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** IPSEC test cases are failing and skipped on Arm server in CI/CD
 +
**** https://jira.fd.io/browse/VPP-1714
 +
**** Create a Jira ticket to track all the info related to this issue - Juraj
 +
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Enlarging the duration fixes the failure, but will investigate in more detail.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Send email and current debug details to community calling for volunteer to fix it. - Lijian
 +
*** Changes are uploaded to community gerrit.
 +
*** VPP VMs seem to come up well. Will work on the init script and bring up VPP.
 +
**** VM tests passed. Patches are to be submitted for community review.
 +
*** Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
 +
**** Docker images for both Arm and x86 are merged and available.
 +
*** Ed to help set up numad cluster with dual ThunderX and one ThunderX2
 +
* FD.io lab
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
 +
*** Update the current status to Pravin. - Lijian
 +
*** The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** Require more than 120G RAM; 256G preferred
 +
*** Three NICs and each has two ports.
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue: optimize it with relaxed atomic intrinsics - Lijian
 +
** Vectorization
 +
**** The patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
*** Spinlock with inner loop showed improvement on both x86 and Arm.
 +
*** Read/write lock showed a slight degradation with the patch.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do benchmarking and profiling on mcbin.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPUs, as there is a mixed use of GCC vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 1/10/2018 '''
+
 
* Meeting moved 2 hours earlier - 6AM PT / 3PM CET / 7:30PM IST / 10PM CST
+
'''07/02/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** IPSEC test cases are failing and skipped on Arm server in CI/CD
 +
**** https://jira.fd.io/browse/VPP-1714
 +
**** Create a Jira ticket to track all the info related to this issue - Juraj
 +
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Enlarging the duration fixes the failure, but will investigate in more detail.
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - Lijian
 +
**** The current default C compiler identification is GNU 8.3.0
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Send email and current debug details to community calling for volunteer to fix it. - Lijian
 +
*** VPP VMs seem to come up well. Will work on the init script and bring up VPP.
 +
*** Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
 +
*** Set up numad cluster with dual ThunderX and one ThunderX2
 
* FD.io lab
 
* FD.io lab
** Cavium ThunderX shipping soon
+
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
 +
*** Update the current status to Pravin. - Lijian
 +
*** The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** Require more than 120G RAM; 256G preferred
 +
*** Three NICs and each has two ports.
 +
** ThunderX1
 
* VPP
 
* VPP
** Kumar to look at VPP-1126
+
** VPP host-stack Hotspots
** Gabriel proposed https://gerrit.fd.io/r/#/c/10049/ as follow-up to Damjan's patch
+
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue, remove atomic intrinsics and use lock version only - Lijian
 +
*** Have verified removing atomic intrinsics in message_queue alloc/free APIs, and require confirmation from Florin.
 +
** Vectorization
 +
**** The patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
** Fix ip4_forward compiling - Jason
 +
*** Will check gerrit CI/CD related with that patch. Check why it's not warning in gerrit Jenkins.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Spread dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Will do benchmarking and profiling on mcbin.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''06/25/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
* General
 
* CSIT
 
* CSIT
** Gabriel's patch for aarch64 support in CSIT merged
+
** VPP Performance Test
** VirtualBox not supported on Arm / Vagrant unknown
+
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
*** This is OK for upstream since automation expects VMs to already exist
+
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
* Performance
+
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
** Need plan for 1T; use TaiShans that were sent to lab
+
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
* AIs
+
*** creating a job. - Everything is ready except the docker image
** Brian: Follow up with Vanessa and EdW regarding 'resource issue'
+
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
** Gabriel: Update CSIT wiki page; which tests are passing/failing?
+
** VPP Path
** Brian: Check with Vanessa how to split machines between CI jobs and CSIT jobs
+
*** IPSEC test cases are failing and skipped on Arm server in CI/CD
 +
**** https://jira.fd.io/browse/VPP-1714
 +
**** Create a Jira ticket to track all the info related to this issue - Juraj
 +
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Enlarging the duration fixes the failure, but will investigate in more detail.
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - Juraj
 +
**** The current default C compiler identification is GNU 8.3.0
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
*** Crypto test cases will use the DPDK driver if configured, then the native VPP implementation, and fall back to OpenSSL
 +
**** Will try Crypto test cases next week - Juraj
 +
*** Juraj to send Lijian the details of vpp VMs, Lijian will confirm internally
 +
* FD.io lab
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
 +
*** Firstly will sponsor the machine
 +
*** The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** Require more than 120G RAM; 256G preferred
 +
*** Three NICs and each has two ports.
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue, remove atomic intrinsics and use lock version only - Lijian
 +
*** Have verified removing atomic intrinsics in message_queue alloc/free APIs, and require confirmation from Florin.
 +
** Vectorization
 +
**** The patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
** Spinlock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Spread dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Will do benchmarking and profiling on mcbin.
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 1/3/2018 '''
+
'''06/18/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina Tsou
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - Juraj
 +
**** The current default C compiler identification is GNU 8.3.0
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 
* FD.io lab
 
* FD.io lab
** One OD1000 sent for RMA
+
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
** Huawei PO sent out
+
** ThunderX1
** Cavium PO sent out (?)
+
 
* VPP
 
* VPP
** Gabriel working on patch for "show cpu" to display MIDR as human readable
+
** VPP host-stack Hotspots
** Nitin sent preliminary patch for vhost-user NEON impl
+
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
*** Seeing perf differences on different cores; tradeoff is single-threaded perf vs. NEON
+
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
** Kumar built and unit test successfully on D03
+
*** Will confirm with Florin to remove atomic intrinsics in message_queue alloc/free APIs
** Nitin to resume patch for supporting different cache line sizes for the same arch
+
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin is showing some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - Upstreamed.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
** Spread dual/quad optimization - ethernet-input
 +
** Redo perf/MAP profiling/bench-marking
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** Apply dual/quad optimization on more data path nodes
 +
*** Investigate and optimize VPP hash and bihash library
 +
*** VPP translation overhead analysis between Mbuf and VLIB buffer ENTNET-1293
 +
*** VPP Memif performance analysis and optimization ENTNET-1292
 +
*** VPP l3fwd performance analysis and optimization ENTNET-751
 +
*** Using MAP with VPP ENTNET-1288
 +
 
 +
'''06/11/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina Tsou
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj
 +
* General
 
* CSIT
 
* CSIT
** Gabriel cleaned up WIP patch; ready for review
+
** VPP Performance Test
** Kumar starting CSIT func tests with Ubuntu VMs
+
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
*** Scripts for running on dedicated hardware need to be modified, e.g. PCIe resources
+
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
** Kumar to send doc on testing
+
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
* Performance
+
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
** Kumar to start thread on performance testing
+
*** creating a job. - Everything is ready except the docker image
* AIs
+
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
** Brian: Check with Tina on shipping and open LF RT ticket once they have arrived
+
** VPP Path
** Brian: Need a way to choose either SW or NEON impl based on chip
+
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - Juraj
** Gabriel: Create list of broken CSIT tests for 2-node topology
+
**** The current default C compiler identification is GNU 8.3.0
''' 12/20/2017 '''
+
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
'''No meeting next week - Dec 27'''
+
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 
* FD.io lab
 
* FD.io lab
** OD1000s - build only
+
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
*** 1 of 3 needs to be RMAd
+
** ThunderX1
*** Can these be up in time to show 'make test' passes on ARM for 18.01 release report?
+
** TaiShan
+
*** PO in progress
+
** ThunderX - build only
+
*** PO went out
+
 
* VPP
 
* VPP
** Patches / JIRAs
+
** VPP host-stack Hotspots
*** Patch for extended test failure, but still more (new) extended test failures - Gabriel
+
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
*** Nitin to post vhost-user.c changes for NEON
+
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
**** Nitin will finish Gabriel's original NEON patch to add CLIB_HAVE_VEC_128
+
*** Will confirm with Florin to remove atomic intrinsics in message_queue alloc/free APIs
** Can we share code on Github e.g. NEON perf tests?
+
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin showing some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - Upstreamed.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
** Spread dual/quad optimization - ethernet-input
 +
** Redo perf/MAP profiling/benchmarking
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''06/04/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina Tsou
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Stan
 +
* General
 
* CSIT
 
* CSIT
** Leading question: How many CSIT test cases are passing/failing?
+
** VPP Performance Test
** Environment issues preventing running through all CSIT test cases; Gabriel needs dedicated machines or more RAM
+
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
** Cavium & Huawei will join Gabriel in CSIT replication on ARM hardware next week
+
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
*** Cavium previously ran vhost test cases manually, now moving to CSIT
+
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin showing some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - Upstreamed.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 12/13/2017 '''
+
'''05/28/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina
 +
** Lijian Zhang
 +
** Vijay (vijayakumar.rajamanickam@nokia.com)
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 
* VPP
 
* VPP
** Quick overview of work items
+
** VPP host-stack Hotspots
** Waiting to hear back from LF about OD1000 connectivity
+
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
*** Changes needed to ci-mgmt
+
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin showing some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP with VPP - Tried internal patch; still failing. Continuing to work on it.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''05/21/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina
 +
** Lijian Zhang
 +
** Vijay (vijayakumar.rajamanickam@nokia.com)
 +
* General
 
* CSIT
 
* CSIT
** Starting to reproduce CSIT on x86 and ARM (with Gabriel's WIP patch)
+
** VPP Performance Test
*** Some issues with environment variables (perf tests on 2-node)
+
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
** Need Nexus to support aarch64 packages
+
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
*** Need a contact for Nexus
+
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
* Share known issues on wiki!
+
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
* Request CSIT 'deep dive'
+
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin showing some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP with VPP - Tried internal patch; still failing. Continuing to work on it.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 12/06/2017 '''
+
'''05/14/2019'''
* Can we access the OD1000 in csit lab ?
+
* Attendees
** currently mainly working with VMs
+
** Sirshak Das
* added dedicated wiki page for CSIT : https://wiki.fd.io/view/CSIT/AArch64
+
** Honnappa Nagarahalli
* WIP : https://gerrit.fd.io/r/#/c/9474/
+
** Tina
 +
** Lijian Zhang
 +
** Vijay (vijayakumar.rajamanickam@nokia.com)
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** VPP generic distro package building patch - Patch updated. Require Damjan's follow up review.
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin showing some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP with VPP - Tried internal patch; still failing. Continuing to work on it.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 11/29/2017 '''
+
'''05/07/2019'''
*VPP
+
* Attendees
** vhost-user.c - SSE4.2 only. Implement range search using NEON. (nitin)
+
** Sirshak Das
** OD1000 status ?
+
** Honnappa Nagarahalli
*** build only
+
** Tina
*** can we access them ?
+
** Lijian Zhang
*** what can we do to help in general?
+
** Vijay (vijayakumar.rajamanickam@nokia.com)
** x86 intrinsic review
+
* General
** build VPP on ARM VM on x86
+
* CSIT
*CSIT
+
** VPP Performance Test
** what platforms will be made available
+
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
*** Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input.
 +
** VPP generic distro package building patch - Patch updated; Damjan's follow-up review required.
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin showing some unstable performance.
 +
** Vectorization
 +
*** Vectorization in esp-encrypt, optimize memcpy_le. Upstreamed(https://gerrit.fd.io/r/#/c/18398/). - Lijian
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP (Arm proprietary performance analysis tool) with VPP - Tried internal patch; still failing. Continuing to work on it.
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 11/22/2017 '''
+
'''04/30/2019'''
* VPP CI
+
* Attendees
** 3 ThunderX for Christmas
+
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina
 +
* General
 
* CSIT
 
* CSIT
** func on VM vs perfs on HW
+
** VPP Performance Test
** func on x86 VMs OK with 2 nodes
+
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
** DPDK integration WIP : https://gerrit.fd.io/r/#/c/9474/
+
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
** issues
+
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
*** how to access the lab ?
+
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
* Next steps
+
*** creating a job. - Everything is ready except the docker image
** VPP
+
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
** CSIT
+
** VPP Path
*** structure work & send email (Gabriel)
+
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
*** is xxhash vs crc32 finished ? (Gabriel)
+
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
*** ask Maciek & setup a presentation meeting with someone from CSIT (Tina)
+
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
*** find a time to reschedule this meeting before the CSIT weekly call (Brian)
+
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
*** Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input.
 +
** VPP generic distro package building patch - Patch updated; Damjan's follow-up review required.
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin showing some unstable performance.
 +
** Vectorization
 +
*** Vectorization in esp-encrypt, optimize memcpy_le. Upstreamed(https://gerrit.fd.io/r/#/c/18398/). - Lijian
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP (Arm proprietary performance analysis tool) with VPP - Tried internal patch; still failing. Continuing to work on it.
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 11/15/2017 '''
+
'''04/23/2019'''
* VPP upstream status
+
* Attendees
** build && build-release OK
+
** Sirshak Das
** "make test" && "make test-debug" OK
+
** Lijian Zhang
** packaging:
+
** Juraj Linkeš
*** Ubuntu 16.04 OK
+
** Vijay
*** Ubuntu 17.10 ? (TBC)
+
** Nitin
*** fedora-26 OK
+
** Khemendra Kumar
* vpp continuous test
+
** Tina Tsou
** all tasks required for the Jenkins "verify" job are ready
+
** Andy Wang
** TODO: request Gerrit hook from Dave Barach / vpp-dev (NB & GG)
+
** Honnappa Nagarahalli
** set up ci in fdio lab
+
* General
 
* CSIT
 
* CSIT
** setting up env
+
** VPP Performance Test
** ThunderX platforms should arrive this week
+
** List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
** csit work sharing
+
** Package Installation error Status(Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** QSFP+ is available and working now.
 +
**** Juraj to resend email to Mahamad about the details, including Sirshak and Tina
 +
*** mcbin: Kernel Migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status: server is set up. Management connection works. Intel NICs are well connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
 +
* FD.io lab
 +
** Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Management connection through the QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
 +
*** Will use these four new ThunderX1 servers for CI/Jenkins to replace the previous three old ThunderX1 servers.
 +
*** These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
 +
*** Investigate why these three blades have only one numa node - Juraj
 +
* VPP
 +
** Investigate session_queue_node_fn/vlib_worker_loop.
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
*** Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input
 +
** Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
 +
** With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
 +
** Vectorization
 +
*** Vectorization in esp-encrypt, https://gerrit.fd.io/r/#/c/18398/ - Improvement on ThunderX/OcteonTX/Taishan, but degradation on ThunderX2 - Lijian
 +
** TAS patch will be ready soon (Sirshak)
 +
** MAP with VPP is ongoing - Sirshak
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
* Action Items - Last Week
 +
* Action Items - Next Week
  
''' 11/8/2017 '''
+
'''04/16/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Lijian Zhang
 +
** Juraj Linkeš
 +
** Vijay
 +
** Nitin
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
 +
** Package Installation error Status(Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** QSFP+ is available and working now.
 +
**** Juraj to resend email to Mahamad about the details, including Sirshak and Tina
 +
*** mcbin: Kernel Migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status: server is set up. Management connection works. Intel NICs are well connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
 +
* FD.io lab
 +
** Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Management connection through the QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
 +
*** Will use these four new ThunderX1 servers for CI/Jenkins to replace the previous three old ThunderX1 servers.
 +
*** These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
 +
*** Investigate why these three blades have only one numa node - Juraj
 +
* VPP
 +
** Investigate session_queue_node_fn/vlib_worker_loop. - https://jira.arm.com/browse/ENTNET-1179 - Done
 +
*** Will create two Jira tickets to track the findings. - Lijian
 +
** Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
 +
** Investigating message queue - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
 +
** With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
 +
** Vectorization
 +
*** Vectorization in esp-encrypt, https://gerrit.fd.io/r/#/c/18398/ - Improvement on ThunderX/OcteonTX/Taishan, but degradation on ThunderX2 - Lijian
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
* Action Items - Last Week
 +
* Action Items - Next Week
  
* Unit tests
+
'''04/09/2019'''
** Tests pass except for random initialization failures
+
* Attendees
** Need to hear back from upstream about Extended unit tests
+
** Sirshak Das
* Should we run plugins such as NSH SFC?
+
** Lijian Zhang
* Hardware to lab
+
** Juraj Linkeš
** Huawei h/w stalled
+
** Nitin
** 3x ThunderX shipping to FD.io lab
+
** Khemendra Kumar
* CSIT replication
+
** Tina Tsou
** Cavium replicating on ThunderX2; getting started
+
** Andy Wang
* Let's track our work in Jira; Brian to migrate tasks to Jira
+
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** QSFP+ is available and working now.
 +
**** Juraj to resend email to Mahamad about the details, including Sirshak and Tina
 +
*** mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status: server is set up. Management connection works. Intel NICs are connected. Next step is preparing the server for VPP device testing; currently working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
 +
* FD.io lab
 +
** Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
 +
*** Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
 +
*** These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node each.
 +
*** Investigate why these three blades have only one numa node - Juraj
 +
* VPP
 +
* VPP Hoststack
 +
** Investigate session_queue_node_fn/vlib_worker_loop. - https://jira.arm.com/browse/ENTNET-1179 - Done
 +
** Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
 +
** Investigating message queue - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
 +
*** Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
 +
** VPP running on Arm side, x86 iperf3 client observes unstable performance rate.(Reason unknown).
 +
** Vectorization
 +
*** Vectorization in esp-encrypt, https://gerrit.fd.io/r/#/c/18398/ - Improvement on ThunderX/OcteonTX/Taishan, but degradation on ThunderX2 - Lijian
 +
*** ethernet-input - will implement for aarch64 128bits only
 +
*** Create vectorization specific EPIC - Lijian
 +
* Action Items - Last Week
 +
* Action Items - Next Week
  
''' 10/25/2017 '''
+
'''04/02/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Nitin
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** After assigning dedicated CPUs to VPP main/VPP worker/iperf3 server, VPP hoststack gives better performance than the Linux stack on both ThunderX2 and Taishan servers.
 +
** Investigate session_queue_node_fn/message queue data structure. - Investigating the source code
 +
** Review https://gerrit.fd.io/r/#/c/18398/ - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed. Perf degradation is fixed.
 +
*** Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK on it now.
 +
** VPP running on Arm side, x86 iperf3 client observes unstable performance rate.(Reason unknown).
 +
* CSIT
 +
** VPP Performance Test
 +
** List all the blockers on aarch64 in CSIT wiki page - Stan or Juraj
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
** Build both binaries and packages with the generic option by default, and provide the Makefile variable NATIVE_OPTIMIZE=Y for end users to build native-optimized images.
 +
*** Prepare email and a draft patch asking comments from community - Lijian
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** QSFP+ is available and working now.
 +
**** Juraj to resend email to Mahamad about the details, including Sirshak and Tina
 +
*** mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
 +
*** thunderx2: Status: server is set up. Management connection works. Intel NICs are connected. Next step is preparing the server for VPP device testing; currently working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
 +
* FD.io lab
 +
** Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
 +
*** Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
 +
*** These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node each.
 +
*** Investigate why these three blades have only one numa node - Juraj
 +
* VPP
 +
** Write description/expectation about the two NEON related patch - Lijian
 +
** Investigating performance degradation on CortexA72 - Sirshak
 +
** Message queue - Sirshak
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - To close it.
 +
** Vectorization
 +
*** ethernet-input - no progress yet
 +
** 128B cache line size
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
 +
 
 +
'''03/26/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Nitin
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** After assigning dedicated CPUs to VPP main/VPP worker/iperf3 server, VPP hoststack gives better performance than the Linux stack on both ThunderX2 and Taishan servers.
 +
** Investigate session_queue_node_fn/message queue data structure. - Investigating the source code
 +
** Review https://gerrit.fd.io/r/#/c/18398/ - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed. Perf degradation is fixed.
 +
*** Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK on it now.
 +
** VPP running on Arm side, x86 iperf3 client observes unstable performance rate.(Reason unknown).
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
** Build both binaries and packages with the generic option by default, and provide the Makefile variable NATIVE_OPTIMIZE=Y for end users to build native-optimized images.
 +
*** Prepare email and a draft patch asking comments from community - Lijian
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** QSFP+ is available and working now.
 +
**** Juraj to resend email to Mahamad about the details, including Sirshak and Tina
 +
*** mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
 +
*** thunderx2: Status: server is set up. Management connection works. Intel NICs are connected. Next step is preparing the server for VPP device testing; currently working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
 +
* FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
 +
*** Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
 +
*** These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node each.
 +
*** Investigate why these three blades have only one numa node - Juraj
 +
* VPP
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - To close it.
 +
** Vectorization
 +
*** ethernet-input - no progress yet
 +
** 128B cache line size
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
 +
 
 +
'''03/19/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** After assigning dedicated CPUs to VPP main/VPP worker/iperf3 server, VPP hoststack gives better performance than the Linux stack on both ThunderX2 and Taishan servers.
 +
** vlib_worker_loop and session_queue_node_fn are two major hot-spots. - Just started
 +
** Enable NEON instruction in Buffer pool free function. Patch is committed.
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed, but still working on issues, e.g., performance degradation
 +
*** Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Done by Malvika.
 +
** VPP on both sides(iperf3 server and client) give a boost.(Reason unknown).
 +
** VPP running on Arm side, x86 iperf3 client observes unstable performance rate.(Reason unknown).
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
*** Prepare email and a draft patch asking comments from community - Lijian
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
**** Juraj to resend email to Mahamad about the details, including Sirshak and Tina
 +
*** mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
 +
*** thunderx2: Status: Talk to edk about deployment strategy with 1-node. Also blocked by QSFP+ issue.
 +
* FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Juraj setup call with LF people. Status: Done.
 +
** ThunderX2
 +
*** Cables: Sent. Juraj to open another tkt for wiring the ThunderX2.
 +
* VPP
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - In internal review.
 +
** Vectorization
 +
*** ethernet-input - no progress yet
 +
*** buffer pools - https://jira.fd.io/browse/VPP-1560. In internal review
 +
** 128B cache line size
 +
*** VPP image with 128B cache line size crashed on ThunderX2 - Cannot reproduce crash with my setup
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
 +
** Commit VPP distro making patch - Lijian
 +
** Plug the 25G NIC into the Taishan server, and connect the 25G ports to the x86 25G NIC - Lijian
 +
** Follow Jianlin's suggestion, update Uboot and Kernel, and then sync up with Juraj - Lijian
 +
 
 +
'''03/12/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
** Tina to update the meeting notice.
 +
* VPP Hoststack
 +
** After assigning dedicated CPUs to VPP main/VPP worker/iperf3 server, VPP hoststack gives better performance than the Linux stack on both ThunderX2 and Taishan servers.
 +
** vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
 +
** Enable NEON instruction in Buffer pool free function. Patch is committed.
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. All test failures are resolved.
 +
*** Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Switched to Malvika.
 +
** VPP on both sides(iperf3 server and client) give a boost.(Reason unknown).
 +
** VPP running on Arm side, x86 iperf3 client observes unstable performance rate.(Reason unknown).
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
*** Prepare email and a draft patch asking comments from community - Lijian
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
*** Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
 +
*** thunderx2: Status: Talk to edk about deployment strategy with 1-node.
 +
* FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Juraj setup call with LF people. Status: Done.
 +
** ThunderX2
 +
*** Cables: Sent. Juraj to open another tkt for wiring the ThunderX2.
 +
* VPP
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - In internal review.
 +
** Vectorization
 +
*** ethernet-input - no progress yet
 +
*** buffer pools - https://jira.fd.io/browse/VPP-1560. In internal review
 +
** 128B cache line size
 +
*** VPP image with 128B cache line size crashed on ThunderX2
 +
** thunderx2 crashing - No update
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
 +
** Commit VPP distro making patch - Lijian
 +
** Plug the 25G NIC into the Taishan server, and connect the 25G ports to the x86 25G NIC - Lijian
 +
** Follow Jianlin's suggestion, update Uboot and Kernel, and then sync up with Juraj - Lijian
 +
 
 +
'''03/05/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** After assigning dedicated CPUs to VPP main/VPP worker/iperf3 server, VPP hoststack gives better performance than the Linux stack on both ThunderX2 and Taishan servers.
 +
** vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. All test failures are resolved.
 +
*** Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Switched to Malvika.
 +
** VPP on both sides(iperf3 server and client) give a boost.(Reason unknown).
 +
** VPP running on Arm side, x86 iperf3 client observes unstable performance rate.(Reason unknown).
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
*** Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
 +
*** thunderx2: Status: Talk to edk about deployment strategy with 1-node.
 +
* FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Juraj setup call with LF people. Status: Done.
 +
** ThunderX2
 +
*** Cables: Sent. Juraj to open another tkt for wiring the ThunderX2.
 +
* VPP
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - No progress
 +
*** Investigate with latest VPP code on x86 server - Lijian - Send email to the vpp-dev mailing list if there's a problem. Will not put much effort into this.
 +
** Vectorization
 +
*** ethernet-input
 +
*** buffer pools
 +
** 128B cache line size
 +
*** Will try this on Taishan server - Slight performance degradation with 128-byte cache line
 +
** thunderx2 crashing - No update
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
 +
 
 +
'''02/26/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** After assigning dedicated CPUs to VPP main/VPP worker/iperf3 server, VPP hoststack gives better performance than the Linux stack on both ThunderX2 and Taishan servers.
 +
** el0_sys hot-spot on Taishan D05 only, no plan to fix it.
 +
** vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
 +
** memcpy optimization
 +
*** memcpy patch verification on taishan by khem l3 forwarding usecase- Lijian Status(khem): No updates.
 +
*** memcpy patch consumes more clocks in OcteonTX2 - updated by Nitin.
 +
*** Stopped working on this patch.
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Test failure on SCTP, not root-caused yet.
 +
*** Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Switched to Malvika.
 +
** VPP on both sides(iperf3 server and client) give a boost.(Reason unknown).
 +
** VPP running on Arm side, x86 iperf3 client observes unstable performance rate.(Reason unknown).
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
*** Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
 +
*** thunderx2: Status: Talk to edk about deployment strategy with 1-node.
 +
* FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Juraj setup call with LF people. Status: Done.
 +
** ThunderX2
 +
*** Cables: Sent. Juraj to open another tkt for wiring the ThunderX2.
 +
* VPP
 +
** Buffer Pools per NUMA
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
 +
*** Investigate with latest VPP code on x86 server - Lijian - Send email to the vpp-dev mailing list if there's a problem. Will not put much effort into this.
 +
** Vectorization
 +
*** ethernet-input
 +
*** buffer pools
 +
** 128B cache line size
 +
*** Will try this on Taishan server - Slight performance degradation with 128-byte cache line
 +
** Qualcomm no change iperf3
 +
** thunderx2 crashing - No update
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
 +
 
 +
'''02/19/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** After assigning dedicated CPUs to VPP main/VPP worker/iperf3 server, VPP hoststack gives better performance than the Linux stack on both ThunderX2 and Taishan servers.
 +
** memcpy optimization
 +
*** memcpy patch verification on taishan by khem l3 forwarding usecase- Lijian Status(khem): No updates.
 +
*** memcpy patch consumes more clocks in OcteonTX2 - updated by Nitin.
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch.
 +
*** Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation.
 +
** VPP on both sides(iperf3 server and client) give a boost.(Reason unknown).
 +
** VPP running on Arm side, x86 iperf3 client observes unstable performance rate.(Reason unknown).
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/
 +
*** b. merging CSIT patch.
 +
*** c. creating a job.
 +
** Target: master trending job
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
*** Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
 +
*** thunderx2: Status: Talk to edk about deployment strategy with 1-node.
 +
* FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Juraj setup call with LF people. Status: Done.
 +
** ThunderX2
 +
*** Cables: Sent. Juraj to open another tkt for wiring the ThunderX2.
 +
* VPP
 +
** Buffer Pools per NUMA
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
 +
** 1GB page taking long time Status: fixed.
 +
*** Investigate with latest VPP code on x86 server
 +
** Vectorization
 +
*** ethernet-input
 +
*** buffer pools
 +
*** memcpy
 +
** 128B cache line size
 +
*** Will try this on Taishan server - Lijian
 +
** Qualcomm no change iperf3
 +
** thunderx2 crashing - No update
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
 +
 
 +
'''02/11/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** memcpy optimization
 +
*** memcpy patch verification on Taishan by Khem for the L3 forwarding use case - Lijian. Status (Khem): No updates.
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch.
 +
*** Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible.
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config
 +
*** b. merging CSIT patch.
 +
*** c. creating a job.
 +
** Target: master trending job
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** mcbin: Kernel Migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj has yet to try them.
 +
*** thunderx2: Status: Talk to edk about deployment strategy with 1-node.
 +
* FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Juraj setup call with LF people. Status: Done.
 +
** ThunderX2
 +
*** Cables: Sent. Juraj to open another tkt for wiring the ThunderX2.
 +
* VPP
 +
** Buffer Pools per NUMA
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
 +
** 1GB page taking long time Status: fixed.
 +
** Vectorization
 +
*** ethernet-input
 +
*** buffer pools
 +
*** memcpy
 +
** 128B cache line size
 +
** Qualcomm no change iperf3
 +
** thunderx2 crashing
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
 +
**
 +
'''02/05/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** memcpy optimization
 +
*** Check that the optimized memory copy versions are deployed on Taishan and ThunderX2 at runtime - Lijian
 +
*** Send memcpy patch to Khem and Fede for further verification - Lijian. Status: Fede saw a small improvement on mcbin with iperf3; Khem to try it with L3 forwarding.
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation
 +
*** Working on svm_fifo alternate version with front and back pointers synchronized instead of cursize.
 +
** Verifying per NUMA node buffer pool https://gerrit.fd.io/r/#/c/16638/
 +
*** Sirshak to create a Jira ticket in fd.io Jira: https://jira.fd.io/browse/VPP-1560
 +
*** The apparent VPP hang is actually VPP taking a long time to allocate 400K chunks for 1GB - Damjan has this on his to-do list
 +
*** gcc-8 compilation still fails on ARM.
 +
**** Sirshak to create a Jira ticket in fd.io Jira. Status: https://jira.fd.io/browse/VPP-1559
 +
*** Octeon-Tx failure. Status: unknown
 +
** Gorka is trying some optimal configs for VCL. Status: no updates.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** OcteonTx boots to buildroot with no dhclient hence an impasse. Still not clear how to use USB stick.
 +
* CSIT
 +
** VPP Path
 +
*** Sirshak to keep track of gcc-8 compilation, once clean we can switch to gcc-8. https://jira.fd.io/browse/VPP-1559
 +
*** ARM CI results are overwritten by x86 machines. Should be a Jenkins issue. Monitor whether this corner case happens again. - Juraj
 +
*** Add cross compilation CI Juraj: https://jira.fd.io/browse/CTP-3
 +
** VPP Device
 +
*** thunderx: 1-node topology on cavium thunderx. Status: no updates.
 +
*** Kernel Migration on mcbin. Status:
 +
*** ThunderX2:
 +
** VPP Performance Test
 +
*** Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
 +
*** Juraj to come up with a solution for the NUMA node anomaly on Taishan.
 +
*** https://gerrit.fd.io/r/#/c/16850/ Status: Juraj has a version all ready to work. Package installation blocker.
 +
*** Package installation error Status: Juraj to investigate logs.
 +
* FD.io lab
 +
** ThunderX1 -
 +
*** New QSFP+ switch for ThunderX1 is available now: QSFP+ to be connected SFP+ switch.
 +
*** Juraj to setup a call with LF folks on.
 +
** ThunderX2 -
 +
*** Andy is still waiting for cables.
 +
*** Juraj to remind Andy of when the cable will be available.
 +
*** Juraj to follow up on ssh connectivity to thunderx2.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
 +
*** [Lijian] Check whether setting the default cache line size to 128 will degrade throughput on Taishan/Qualcomm/ThunderX2
 +
**** no perf diff in Qualcomm
 +
**** vpp crashes on thunderx2
 +
**** waiting for results on A72 (Taishan)
 +
*** [Sirshak] on ethernet-input node, investigate vectorized buffer index, Damjan's per numa node buffer pool patch. Status: No updates
 +
**** open fd.io jira tkt. https://jira.fd.io/browse/VPP-1560
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
 +
**
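The svm_fifo item under VPP Hoststack in this meeting (synchronizing front and back pointers instead of a shared cursize) can be illustrated with a generic single-producer/single-consumer ring. This is only a minimal sketch using C11 atomics, not the svm_fifo patch itself: the names <code>spsc_ring_t</code>, <code>ring_enqueue</code>, <code>ring_dequeue</code> and the <code>RING_SIZE</code> value are hypothetical and chosen for illustration. The idea is that each index has exactly one writer, so no atomic read-modify-write on a shared byte count is needed; release/acquire ordering on the indices is what makes the copied bytes visible on weakly ordered CPUs such as AArch64.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical capacity in bytes; must be a power of two so that the
 * free-running 32-bit indices wrap consistently. */
#define RING_SIZE 1024u

typedef struct {
    _Atomic uint32_t head;   /* read index, written only by the consumer */
    _Atomic uint32_t tail;   /* write index, written only by the producer */
    uint8_t data[RING_SIZE];
} spsc_ring_t;

static void ring_init(spsc_ring_t *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side: copy up to len bytes in; returns the bytes enqueued. */
static uint32_t ring_enqueue(spsc_ring_t *r, const uint8_t *src, uint32_t len)
{
    /* Our own index: relaxed is enough. The peer's index: acquire, so the
     * consumer's reads of the slots we are about to reuse have completed. */
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    uint32_t free_bytes = RING_SIZE - (tail - head);
    if (len > free_bytes)
        len = free_bytes;

    uint32_t off = tail & (RING_SIZE - 1);
    uint32_t first = RING_SIZE - off;      /* bytes until the wrap point */
    if (first > len)
        first = len;
    memcpy(r->data + off, src, first);
    memcpy(r->data, src + first, len - first);

    /* Release: publish the data before the new tail becomes visible. */
    atomic_store_explicit(&r->tail, tail + len, memory_order_release);
    return len;
}

/* Consumer side: copy up to len bytes out; returns the bytes dequeued. */
static uint32_t ring_dequeue(spsc_ring_t *r, uint8_t *dst, uint32_t len)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    uint32_t used = tail - head;           /* occupancy derived, not stored */
    if (len > used)
        len = used;

    uint32_t off = head & (RING_SIZE - 1);
    uint32_t first = RING_SIZE - off;
    if (first > len)
        first = len;
    memcpy(dst, r->data + off, first);
    memcpy(dst + first, r->data, len - first);

    /* Release: finish reading before handing the space back. */
    atomic_store_explicit(&r->head, head + len, memory_order_release);
    return len;
}
```

Occupancy is computed as <code>tail - head</code> on demand, which is the design point the item hints at: a separate cursize counter would be modified by both sides and would need an atomic add/sub on every operation.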
 +
 
 +
'''01/29/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Lijian Zhang
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Garcia
 +
** Manuel
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
** John Ddigilio
 +
* General
 +
** DPDK multi-core scheduler
 +
** https://gerrit.fd.io/r/#/c/15084/
 +
** Please join slack.
 +
** Merge TCP optimization meeting into VPP/Aarch64 community public meeting.
 +
* VPP Hoststack
 +
** TaiShan server with Debian distro crashed on the 'ip probe-neighbor' command when running VPP hoststack with iperf3
 +
** With 64 bytes packets, on ThunderX2, 10G NIC, VPP hoststack bandwidth is about 1/2 of Linux Kernel stack.
 +
** With 64 bytes packets, on Taishan, 10G NIC, VPP hoststack bandwidth is about 2x of Linux Kernel stack.
 +
** Memory copy patch gives 4% improvement on VPP hoststack on Taishan server.
 +
** Check that the optimized memory copy versions are deployed on Taishan and ThunderX2 at runtime - Lijian
 +
** Send memcopy patch to Khem and Fede for further verification - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** Investigating lock-less fifo, ip4_local_forward node and memory reordering for VPP hoststack - Sirshak
 +
** Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
 +
** Verifying https://gerrit.fd.io/r/#/c/16638/ - Supposed to give better performance, but VPP hangs with this patch on some Arm machines.
 +
** Gorka is trying some optimal configs for VCL. - No Updates.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** OcteonTX is received in ARM lab. Will boot it up firstly and then start doing profiling with it.
 +
* FD.io lab
 +
** ThunderX1 -
 +
*** New Arista switch for ThunderX1 is available now. Gathering details required by the LF lab before sending the switch to the CSIT lab. - Juraj - Andy will try to send the switch to CSIT this Thursday.
 +
** ThunderX2 -
 +
*** Cable type is confirmed. Procurement is in the process.
 +
*** Juraj to remind Andy of when the cable will be available.
 +
*** Require access to these servers in FD.io lab. Anton gives the IP to access them.(ADMIN/ADMIN)
 +
* CSIT
 +
** VPP Path
 +
*** So far so good.
 +
*** ARM CI results are overwritten by x86 machines. Should be a Jenkins issue. Monitor whether this corner case happens again. - Juraj
 +
** VPP Device
 +
*** thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts. Juraj is able to modify and execute the scripts in the container. Still to fix: scripts for the 1-link 1-node topology and binding interfaces to VPP. Was able to run a traffic test successfully.
 +
*** Kernel Migration on mcbin. Juraj is able to build all the images, but got kernel panic. Try with the more recent uBoot version. Tried latest uBoot image, but still has the same issue.
 +
*** Juraj to investigate further work once ThunderX2 is available.
 +
** VPP Performance Test
 +
*** perftest - https://jenkins.fd.io/job/vpp-csit-verify-perf-master-2n-skx - Triggered manually now if patch is perf sensitive.
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** Development of the L2 test script is underway.
 +
*** Khem will get L2 working in CI first, then IP4 and other test cases.
 +
*** Stan has started working on performance scripts with Khem and is able to connect to the Taishan machines in the CSIT lab.
 +
*** The performance topology in the wiki link is to be updated per the file below.
 +
*** https://github.com/FDio/csit/blob/master/docs/lab/Testbeds_Xeon_Skx_Arm_Atom.md
 +
*** Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
 +
**** Install Ubuntu-18.04 on Huawei Taishan servers firstly, and then investigate upstreaming performance test framework to enable Aarch64
 +
**** Taishan server works with Ubuntu 18.04, CSIT lab updated Ubuntu 18.04 in Taishan
 +
**** Install the packages on Taishan server from cloud repository, to check if VPP can get intel NICs on Taishan - Lijian
 +
**** https://packagecloud.io/app/fdio/master/search?q=19.01-rc0%7E642-g31fe7aa3&filter=debs&filter=debs&dist=ubuntu%2Fbionic
 +
*** Stan installed the latest CSIT scripts on the packet generator server(x86 NEON) and Taishan servers in the FD.io lab.
 +
*** https://gerrit.fd.io/r/#/c/16850/
 +
*** Some of L2 and L3 test cases passed.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
 +
*** [Lijian] Check whether setting the default cache line size to 128 will degrade throughput on Taishan/Qualcomm/ThunderX2
 +
*** [Sirshak] on ethernet-input node, investigate vectorized buffer index.
 +
** Memory Ordering
 +
*** [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
 +
** [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable both are merged. No test cases remain in the blacklist for the AArch64 machine.
 +
** [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
 +
* Action Items - Next Week
 +
** [Sirshak] -
 +
 
 +
'''01/22/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Lijian Zhang
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Garcia
 +
** Manuel
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
** John Ddigilio
 +
* General
 +
** DPDK multi-core scheduler
 +
** https://gerrit.fd.io/r/#/c/15084/
 +
** Please join slack.
 +
** Merge TCP optimization meeting into VPP/Aarch64 community public meeting.
 +
* VPP Hoststack
 +
** TaiShan server with Debian distro crashed on the 'ip probe-neighbor' command when running VPP hoststack with iperf3
 +
** With 64 bytes packets, on ThunderX2, 10G NIC, VPP hoststack bandwidth is about 1/4 of Linux Kernel stack.
 +
** iperf3 performance with Hoststack.
 +
*** Investigating lock-less fifo, ip4_local_forward node and memory reordering for VPP hoststack - Sirshak
 +
** Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
 +
** Gorka is trying some optimal configs for VCL. - No Updates.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** OcteonTX is received in ARM lab. Will boot it up firstly and then start doing profiling with it.
 +
* FD.io lab
 +
** ThunderX1 -
 +
*** New Arista switch for ThunderX1 is available now. Gathering details required by the LF lab before sending the switch to the CSIT lab. - Juraj - Andy will try to send the switch to CSIT this Thursday.
 +
** ThunderX2 -
 +
*** Cable type is confirmed. Procurement is in the process.
 +
*** Require access to these servers in FD.io lab.
 +
* CSIT
 +
** VPP Path
 +
*** So far so good.
 +
*** ARM CI results are overwritten by x86 machines. Should be a Jenkins issue. Monitor whether this corner case happens again. - Juraj
 +
** VPP Device
 +
*** thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts. Juraj is able to modify and execute the scripts in the container. Still to fix: scripts for the 1-link 1-node topology and binding interfaces to VPP.
 +
*** Kernel Migration on mcbin. Juraj is able to build all the images, but got kernel panic. Try with the more recent uBoot version.
 +
*** Juraj to investigate further work once ThunderX2 is available.
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** Development of the L2 test script is underway.
 +
*** Khem will get L2 working in CI first, then IP4 and other test cases.
 +
*** Stan has started working on performance scripts with Khem and is able to connect to the Taishan machines in the CSIT lab.
 +
*** The performance topology in the wiki link is to be updated per the file below.
 +
*** https://github.com/FDio/csit/blob/master/docs/lab/Testbeds_Xeon_Skx_Arm_Atom.md
 +
*** Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
 +
**** Install Ubuntu-18.04 on Huawei Taishan servers firstly, and then investigate upstreaming performance test framework to enable Aarch64
 +
**** Lijian to verify Ubuntu-18.04 on Taishan server.
 +
*** Stan installed the latest CSIT scripts on the packet generator server(x86 NEON) and Taishan servers in the FD.io lab.
 +
*** https://gerrit.fd.io/r/#/c/16850/
 +
*** Some of L2 and L3 test cases passed.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
 +
*** [Sirshak] on ethernet-input node, investigate vectorized buffer index.
 +
** Memory Ordering
 +
*** [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
 +
** [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable both are merged. No test cases remain in the blacklist for the AArch64 machine.
 +
** [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
 +
* Action Items - Next Week
 +
** [Sirshak] - To update patch list in VPP/Aarch64 wiki
 +
 
 +
'''01/15/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Lijian Zhang
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Garcia
 +
** Manuel
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
** John Ddigilio
 +
* General
 +
** DPDK multi-core scheduler
 +
** https://gerrit.fd.io/r/#/c/15084/
 +
** Please join slack.
 +
** Merge TCP optimization meeting into VPP/Aarch64 community public meeting.
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack.
 +
*** Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
 +
** Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
 +
** Gorka is trying some optimal configs for VCL. - No Updates.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** OcteonTX is received in ARM lab. Will boot it up firstly and then start doing profiling with it.
 +
* FD.io lab
 +
** ThunderX2 -
 +
*** New Arista switch is available now. Gathering details required by the LF lab before sending the switch to the CSIT lab. - Juraj
 +
*** Cable type is confirmed. Procurement is in the process.
 +
* CSIT
 +
** VPP Path
 +
*** IP4 reassembly and GBP failures are fixed. Patches to enable both are merged. No test cases remain in the blacklist for the AArch64 machine.
 +
*** We have voting verify on bionic. Upload nexus disabled but merge job working. - Juraj created LF ticket for nexus upload. Both master merge job and verifying job are working fine.
 +
*** ARM CI results are overwritten by x86 machines. Should be a Jenkins issue. Monitor whether this corner case happens again. - Juraj
 +
** VPP Device
 +
*** thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
 +
*** Kernel Migration on mcbin. Juraj is able to build all the images.
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** Development of the L2 test script is underway.
 +
*** Khem will get L2 working in CI first, then IP4 and other test cases.
 +
*** Stan has started working on performance scripts with Khem and is able to connect to the Taishan machines in the CSIT lab.
 +
*** The performance topology in the wiki link is to be updated per the file below.
 +
*** https://github.com/FDio/csit/blob/master/docs/lab/Testbeds_Xeon_Skx_Arm_Atom.md
 +
*** Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
 +
** Memory Ordering
 +
*** [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
 +
** [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable both are merged. No test cases remain in the blacklist for the AArch64 machine.
 +
** [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
 +
* Action Items - Next Week
 +
** [Sirshak] - To update patch list in VPP/Aarch64 wiki
 +
 
 +
'''01/08/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Lijian Zhang
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
* General
 +
** DPDK multi-core scheduler
 +
** https://gerrit.fd.io/r/#/c/15084/
 +
** Please join slack.
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack.
 +
*** Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
 +
** Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
 +
** Gorka is trying some optimal configs for VCL. - No Updates.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
 +
** [Lijian] Working on IP4 reassembly and GBP failures. - Fixed. Juraj has upstreamed patches to enable these two tests.
 +
** [Sirshak] Kernel Migration mcbin. Juraj is working on it based on Jianlin's suggestion.
 +
** [Andy] Getting a new Arista switch next year.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] working on vectorized memory copy - Macro benchmarking is done and data is updated to Jira.
 +
** Memory Ordering
 +
*** [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
 +
* CSIT
 +
** VPP Path
 +
* VPP Path Failures
 +
*** We have voting verify on bionic. Upload nexus disabled but merge job working. - Juraj created LF ticket for nexus upload. Both merge job and verifying job are working fine.
 +
** VPP Device
 +
*** thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
 +
*** thunderx2: Juraj working with LF to get this resolved.
 +
*** mcbin: Juraj can contact Jianlin if needed.
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** Development of the L2 test script is underway.
 +
*** Khem will get L2 working in CI first, then IP4 and other test cases.
 +
*** Stan is starting to work on VPP performance tests. Khem to send Stan an email about VPP performance testing.
 +
* FD.io lab
 +
** New Arista switch to be procured next year.
 +
** ThunderX2 - Racked. Andy is trying to buy cables compatible with the Intel XL710. Juraj to confirm the info required by the lab people before the cables are sent out.
 +
* Action Items - Next Week
 +
 
 +
'''12/18/2018'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Tina Tsou
 +
** Stanislav Chlebec
 +
** Avinash
 +
** Khemendra
 +
* General
 +
** DPDK multi-core scheduler
 +
** https://gerrit.fd.io/r/#/c/15084/
 +
** Cancelling calls on 25th of Dec and 1st of jan. Next meeting 8th Jan.
 +
** Please join slack.
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack.
 +
*** Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
 +
** Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
 +
** Gorka is trying some optimal configs for VCL. - No Updates.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working.
 +
** [Lijian] Working on IP4 reassembly and GBP failures. - Some preliminary on gbp waiting Neale. Juraj to give access to Lijian to investigate on ThunderX.
 +
** [Sirshak] Kernel Migration mcbin. Status: Jianlin to work with Juraj to get fd.io mcbins up and running. Sirshak to setup a meeting.
 +
** [Andy] Getting a new Arista switch next year.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] working on vectorized memory copy - Still benchmarking and setting it up for internal review.
 +
*** [Lijian] Patch for compiling issue with GCC-8.x is under community review. Status: No updates.
 +
*** [Lijian] Patch for fixing StringTest failure is under community review. Status: Abandoned.
 +
*** [Lijian] Patch for CDP failure is under community review. Status: No updates.
 +
** Memory Ordering
 +
*** [Sirshak] svm_fifo lockless alternate algorithm for SPSC.
 +
* CSIT
 +
** VPP Path
 +
* VPP Path Failures
 +
** https://jira.fd.io/browse/VPP-1475 - IP4 random reassembly failure in master, also seen on x86
 +
** https://jira.fd.io/browse/VPP-1491 - GBP L3/L2 Endpoint Learning failure
 +
*** We have voting verify on bionic. Upload nexus disabled but merge job working. Juraj to create LF ticket for nexus upload.
 +
** VPP Device
 +
*** thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
 +
*** thunderx2: Sirshak working with LF to get this resolved.
 +
*** mcbin: Sirshak to setup a meeting between Juraj and Jianlin.
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** Development of the L2 test script is underway.
 +
*** Khem will get L2 working in CI first, then IP4 and other test cases.
 +
* FD.io lab
 +
** New Arista switch to be procured next year.
 +
** ThunderX2 - Racked. IPMI Static IP configuration missing. Sirshak with LF.
 +
* Action Items - Next Week
 +
 
 +
'''12/11/2018'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Tina Tsou
 +
** Stanislav Chlebec
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking comparing kernel and VPP hoststack performance. Three cases: kernel to kernel; kernel to VPP hoststack; VPP hoststack to VPP hoststack
 +
** ongoing perf analysis. One patch(https://gerrit.fd.io/r/#/c/16184/) is merged, and the other one is under internal review.
 +
** Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
 +
** Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
 +
** Gorka is trying some optimal configs for VCL.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
**
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. Two L2 performance suite scripts for the CI management repository are done; investigating the CSIT repository side, with three more scripts to be developed.
 +
** [Lijian] Working on IP4 reassembly and GBP failures
 +
** [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from garcia and damjan. - no progress so far. - To confirm with Jianling and Joyce.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] working on vectorized memory copy - Second priority, no update so far.
 +
*** [Lijian] Patch for compiling issue with GCC-8.x is under community review.
 +
*** [Lijian] Patch for fixing StringTest failure is under community review.
 +
*** [Lijian] Patch for CDP failure is under community review.
 +
** Memory Ordering
 +
*** [Sirshak] To start work on Arithmetic and Logic relaxed functions.
 +
* VPP Path failures
 +
** https://jira.fd.io/browse/VPP-1475 - IP4 random reassembly failure in master, also seen on x86
 +
** https://jira.fd.io/browse/VPP-1491 - GBP L3/L2 Endpoint Learning failure
 +
* CSIT
 +
** VPP Path
 +
*** Actually, everything is ready. The only thing is to get CI patch merged.
 +
** VPP Device
 +
*** thunderx: 1-node topology on cavium thunderx is in place, but there are errors. Will continue investigation.
 +
*** thunderx2: Racked. Lacks a static IP. Sirshak gave Anton a workaround for the missing static IP.
 +
*** mcbin: Kernel issue yet to try suggestion from Garcia and Damjan. To confirm with Jianling and Joyce - Lijian
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** Development of the L2 test script is underway. Khem will get L2 working in CI first, then IP4 and other test cases.
 +
* FD.io lab
 +
** Arista switch is still not working. Andy and Anton are working out the exact requirements for the switch. Two possible options: Andy to replace the Arista or buy a new one.
 +
** ThunderX2 - Racked. Lack of IP.
 +
* Action Items - Next Week
 +
** [Lijian] to continue to investigate make test failures.
 +
** [Andy] to work with Anton to resolve Arista problem.
 +
 
 +
'''12/04/2018'''
 +
* Attendees
 +
** Sirshak Das
 +
** Andy Wang
 +
** Juraj Linkeš
 +
** Khemendra
 +
** Garcia
 +
** Manuel
 +
** Gorka
 +
** Fede
 +
** Tina Tsou
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking comparing kernel and VPP hoststack performance. Three cases: kernel to kernel; kernel to VPP hoststack; VPP hoststack to VPP hoststack
 +
** Ongoing perf analysis. Two patches in progress: one is upstreamed and the other is under internal review. Hotspots are in memory copy and possibly elsewhere.
 +
** Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
 +
** Gorka is trying some optimal configs for VCL.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
**
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. Two L2 performance suite scripts for the CI management repository are done; investigating the CSIT repository side, with three more scripts to be developed.
 +
** [Lijian] VPP dlmalloc crash issue root-caused and fixed by maintainer. Florin Coras fixed time-out issues.
 +
** [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from garcia and damjan. - no progress so far. - To confirm with Jianling and Joyce.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] working on vectorized memory copy - Second priority, no update so far.
 +
*** [Lijian] Patch for compiling issue with GCC-8.x is under internal review.
 +
*** [Lijian] Patch for fixing StringTest failure is under internal review.
 +
** Memory Ordering
 +
*** [Sirshak] To start work on Arithmetic and Logic relaxed functions.
 +
* CSIT
 +
** VPP Path
 +
*** https://jira.fd.io/browse/VPP-1475 - IP4 random reassembly failure in master, also seen on x86
 +
*** https://jira.fd.io/browse/VPP-1476 - L2FIB failures in master, also seen on x86 - fixed
 +
*** https://jira.fd.io/browse/VPP-1491 - GBP L3/L2 Endpoint Learning failure
 +
*** https://jira.fd.io/browse/VPP-1490 - Traffic doesn't work in make test, 1604 issue (pmalloc issue) - current status to be confirmed
 +
*** https://jira.fd.io/browse/VPP-1497 - Cannot run in parallel problem - fixed
 +
*** VPP-1476, VPP-1475, VPP-1478. These failures are seen on Debian x86 VM also.
 +
*** Get CSIT/Aarch64 pass with partial test cases - Juraj - https://gerrit.fd.io/r/#/c/16282/
 +
*** VPP dlmalloc crash issue root-caused and fixed by maintainer.
 +
*** Florin Coras fixed time-out issue.
 +
** VPP Device
 +
*** thunderx: 1-node topology on cavium thunderx is in place, but there are errors. Will continue investigation.
 +
*** thunderx2: Racked. Lack of IP. To confirm with Anton.
 +
*** mcbin: Kernel issue yet to try suggestion from Garcia and Damjan. To confirm with Jianling and Joyce - Lijian
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** Development of the L2 test script is underway. Khem will get L2 working in CI first, then IP4 and other test cases.
 +
* FD.io lab
 +
** Arista switch is still not working. Andy and Anton are working out the exact requirements for the switch. Two possible options: Andy to replace the Arista or buy a new one.
 +
** ThunderX2 - Racked. Lack of IP.
 +
* Action Items - Next Week
 +
** [Lijian] to continue to investigate make test failures.
 +
** [Andy] to work with Anton to resolve Arista problem.
 +
 
 +
 
 +
'''11/27/2018'''
 +
* Attendees
 +
** Juraj Linkeš
 +
** Khemendra
 +
** Garcia
 +
** Manuel
 +
** Gorka
 +
** Fede
 +
** Tina Tsou
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking comparing kernel and VPP hoststack performance.
 +
** Ongoing perf analysis, two patches in progress. Hotspots are in memory copy and possibly elsewhere. Will share patches with community. - Sirshak
 +
** Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
 +
** Gorka is trying some optimal configs for VCL.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** Alternate test cases.
 +
**
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs. - Need to prepare some scripts. First to understand how the script works and then add more options.
 +
** [Lijian] Status on VPP path failures. Status: Still debugging. Still at early study stage.
 +
** [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from garcia and damjan. - no progress so far.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] working on vectorized memory copy
 +
** Memory Ordering
 +
*** [Sirshak] To start work on Arithmetic and Logic relaxed functions.
 +
* CSIT
 +
** VPP Path
 +
*** 3 failures currently stalling deployment.
 +
*** VPP-1476, VPP-1475, VPP-1478
 +
*** These failures are seen on Debian x86 VM also.
 +
*** Parallelization (n=32) is resulting in failures. These also seem to be caused by the two patches below.
 +
*** VPP-1490, caused by https://gerrit.fd.io/r/#/c/15106/ and https://gerrit.fd.io/r/#/c/15534/.
 +
*** VPP-1491 and VPP-1497, about parallelization and GBP failures, are filed.
 +
*** Get CSIT/Aarch64 pass with partial test cases - Juraj
 +
** VPP Device
 +
*** thunderx: Juraj created a LF tkt for wiring the 1-node topology on cavium thunderx.
 +
*** thunderx2: to be racked by this Friday.
 +
*** mcbin: Kernel issue yet to try suggestion from Garcia and Damjan.
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** L2 test is now working manually. Khem is trying to get it working in CI, and then IP4 and other test cases.
 +
* FD.io lab
 +
** Arista switch is missing a cable. Andy will send the tracking number for the cables.
 +
** ThunderX2 - to be racked by this Friday.
 +
* Action Items - Next Week
 +
** [Lijian] to investigate VPP-1490 issue.
 +
** [Andy] to send the tracking number for the cables.
 +
 
 +
'''11/20/2018'''
 +
* Attendees
 +
** Sirshak Das
 +
** Andy Wang
 +
** Juraj Linkeš
 +
** Khemendra
 +
** Garcia
 +
** Manuel
 +
** Gorka
 +
** Fede
 +
** Tina Tsou
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking and compared kernel and VPP hoststack performance.
 +
** Ongoing perf analysis, two patches in progress. Hotspots in memory copy or possibly elsewhere. Will share patches with community. - Sirshak
 +
** Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
 +
** Gorka is trying some optimal configs for VCL.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** Alternate test cases.
 +
**
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs. - Need to prepare some scripts. First to understand how the script works and then add more options.
 +
** [Lijian] Status on VPP path failures. Status: Still debugging. Still at early study stage.
 +
** [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from garcia and damjan. - no progress so far.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] working on vectorized memory copy
 +
** Memory Ordering
 +
*** [Sirshak] To start work on Arithmetic and Logic relaxed functions.
 +
* CSIT
 +
** VPP Path
 +
*** 3 failures currently stalling deployment.
 +
*** VPP-1476, VPP-1475, VPP-1478
 +
*** These failures are seen on Debian x86 VM also.
 +
*** Parallelization (n=32) is resulting in failures. These also seem to be caused by the two patches below.
 +
*** VPP-1490, caused by https://gerrit.fd.io/r/#/c/15106/ and https://gerrit.fd.io/r/#/c/15534/.
 +
*** VPP-1491 and VPP-1497, about parallelization and GBP failures, are filed.
 +
*** Get CSIT/Aarch64 pass with partial test cases - Juraj
 +
** VPP Device
 +
*** thunderx: Juraj created a LF tkt for wiring the 1-node topology on cavium thunderx.
 +
*** thunderx2: to be racked by this Friday.
 +
*** mcbin: Kernel issue yet to try suggestion from Garcia and Damjan.
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** L2 test is now working manually. Khem is trying to get it working in CI, and then IP4 and other test cases.
 +
* FD.io lab
 +
** Arista switch is missing a cable. Andy will send the tracking number for the cables.
 +
** ThunderX2 - to be racked by this Friday.
 +
* Action Items - Next Week
 +
** [Lijian] to investigate VPP-1490 issue.
 +
** [Andy] to send the tracking number for the cables.
 +
 
 +
 
 +
'''11/12/2018'''
 +
* Attendees
 +
** Sirshak Das
 +
** Andy Wang
 +
** Juraj Linkeš
 +
** Khemendra
 +
** Garcia
 +
** Gorka
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking and compared kernel and VPP hoststack performance.
 +
** Ongoing perf analysis, two patches in progress. Hotspots in memory copy or possibly elsewhere. - Sirshak
 +
** Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
 +
** Gorka is trying some optimal configs for VCL.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** Alternate test cases.
 +
** khem to get more information on benchmarking DMM. Khem to send the information to
  
* Gabriel working on vpp init failure in linux_pci_init()
+
== Status Report Ligato/Contiv ==
* Kumar to check with GeorgeZ on Huawei boards shipped to CSIT; need to verify tests also on this environment (package versions from distro)
+
[[File:Capture LandC.PNG]]
* Brian to check whether anything else needs to be done besides 'make test' for upstream enablement
+

Latest revision as of 15:13, 21 November 2023

Get Involved

Meeting Details

IRC Channel

#fdio-arm on freenode.net

Slack

Request invitation at https://slack.fd.io/

Jira

Jira issues with ARM64 label

Presentations

Release Milestones

18.10

18.07

18.04

  • CI
    • Upstream patch verification on ARMv8 machines
    • .deb packages

Machines

The FD.io lab is hosted at the VEXXHOST colocation centre in Montreal, Québec, Canada.

| Platform | Role | Status | Hostname | IP | IPMI | Cores | RAM | Ethernet | Distro |
| Marvell ThunderX | VPP dev debug server | Running | vpp-marvell-dev | 10.30.51.38 | 10.30.50.38 | 96 | 128GB | 3x40GbE QSFP+ / 4x10GbE SFP+ | Ubuntu 18.04.4 |
|  | CI build server | Running in Nomad | s53-nomad | 10.30.51.39 | 10.30.50.39 | 96 | 128GB | 3x40GbE QSFP+ / 4x10GbE SFP+ | Ubuntu 18.04.4 |
|  | CI build server | Running in Nomad | s54-nomad | 10.30.51.40 | 10.30.50.40 | 96 | 128GB | 3x40GbE QSFP+ / 4x10GbE SFP+ | Ubuntu 18.04.4 |
|  | CI build server | Running in Nomad | s52-nomad | 10.30.51.65 | 10.30.50.65 | 96 | 256GB | 2xQSFP+ / USB Ethernet | Ubuntu 18.04.4 |
|  | CI build server | Running in Nomad | s51-nomad | 10.30.51.66 | 10.30.50.66 | 96 | 256GB | 2xQSFP+ / USB Ethernet | Ubuntu 18.04.4 |
|  | CI build server | Running in Nomad | s49-nomad | 10.30.51.67 | 10.30.50.67 | 96 | 256GB | 2xQSFP+ / USB Ethernet | Ubuntu 18.04.4 |
|  | CI build server | Running in Nomad | s50-nomad | 10.30.51.68 | 10.30.50.68 | 96 | 256GB | 2xQSFP+ / USB Ethernet | Ubuntu 18.04.4 |
| Marvell ThunderX2 | Perf DUT candidate | Running | s27-t13-sut1 | 10.30.51.69 | 10.30.50.69 | 224 | 128GB | 3x40GbE QSFP+ XL710-QDA2 | Ubuntu 18.04.2 |
|  | VPP device server | Running in Nomad | s55-t36-sut1 | 10.30.51.70 | 10.30.50.70 | 256 | 256GB | 2x40GbE QSFP+ XL710-QDA2 / 2x10/25GE SFP+ ConnectX5 | Ubuntu 18.04.4 |
|  | VPP device server | Running in Nomad | s56-t37-sut1 | 10.30.51.71 | 10.30.50.71 | 256 | 256GB | 2x40GbE QSFP+ XL710-QDA2 / 2x10/25GE SFP+ ConnectX5 | Ubuntu 18.04.4 |
| Huawei TaiShan 2280 | CSIT testbed | Running in CI | s17-t33-sut1 | 10.30.51.36 | 10.30.50.36 | 64 | 128GB | 2x10GbE SFP+ Intel X520-DA2 / 2x25GbE SFP28 Mellanox CX-4 | 18.04.1 |
|  | CSIT testbed | Running in CI | s18-t33-sut2 | 10.30.51.37 | 10.30.50.37 | 64 | 128GB | 2x10GbE SFP+ Intel X520-DA2 / 2x25GbE SFP28 Mellanox CX-4 | 18.04.1 |
| Marvell MACCHIATObin | N/A | Decommissioned | s20-t34-sut1 | 10.30.51.41 | 10.30.51.49, then connect to /dev/ttyUSB0 | 4 | 16GB | 2x10GbE SFP+ | Ubuntu 16.04.4 |
|  | N/A | Decommissioned | s21-t34-sut2 | 10.30.51.42 | 10.30.51.49, then connect to /dev/ttyUSB1 | 4 | 16GB | 2x10GbE SFP+ | Ubuntu 16.04.5 |
|  | N/A | Decommissioned | fdio-mcbin3 | 10.30.51.43 | 10.30.51.49, then connect to /dev/ttyUSB2 | 4 | 16GB | 2x10GbE SFP+ | Ubuntu 16.04.5 |
|  | Power Cycler | Operational |  | 10.30.50.80 |  |  |  |  |  |
| SoftIron OverDrive 1000 | N/A | Decommissioned | softiron-1 | 10.30.51.12 | N/A | 4 | 8GB |  | openSUSE |
|  | N/A | Decommissioned | softiron-2 | 10.30.51.13 | N/A | 4 | 8GB |  | openSUSE |
|  | N/A | Decommissioned | softiron-3 | 10.30.51.14 | N/A | 4 | 8GB |  | openSUSE |

Note: to get lab access, create a GPG key, upload it to a keyserver, and have it signed by a trusted anchor in a video call (the fingerprint will be needed). An ARM authority (Tina) then needs to send an e-mail to helpdesk@fd.io with your name, e-mail, keygrip, and key fingerprint.
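
The details listed in the note above could be collected into a request text along the following lines. This is only a sketch: the function names and the message layout are hypothetical, not an FD.io template, and the 40-hex-digit check assumes the fingerprint and keygrip format that GnuPG prints for version 4 keys.

```python
import re


def normalize_hex_id(value, label):
    """Remove spaces and require 40 hex digits (assumed GnuPG v4 key format)."""
    cleaned = value.replace(" ", "").upper()
    if not re.fullmatch(r"[0-9A-F]{40}", cleaned):
        raise ValueError(f"{label} should be 40 hex digits, got {value!r}")
    return cleaned


def access_request_body(name, email, keygrip, fingerprint):
    """Build the text an ARM authority would send to helpdesk@fd.io.

    The layout is hypothetical; only the four fields come from the note.
    """
    return "\n".join([
        f"Name: {name}",
        f"E-mail: {email}",
        f"Keygrip: {normalize_hex_id(keygrip, 'keygrip')}",
        f"Fingerprint: {normalize_hex_id(fingerprint, 'fingerprint')}",
    ])
```

Validating the two identifiers before sending catches copy-and-paste truncation, which is easy to miss when the fingerprint is read out during the video call.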

CI

Covers automated build, unit test, and packaging for various Linux distros on ARMv8 machines.

| Jenkins job | Status | Description |
| vpp-arm-verify-master-ubuntu1604 | Running | xxx |
| vpp-arm-merge-master-ubuntu1604 | Running | xxx |
| vpp-arm-verify-1804-ubuntu1604 | Running | xxx |
| vpp-arm-merge-1804-ubuntu1604 | Running | xxx |

Next steps:

  • make test added to verify jobs
  • Clang build
  • openSUSE Leap 15 | CentOS 7 | Ubuntu 18.04
  • vpp-csit-verify-virl-master or equivalent CSIT functional testing

CSIT

Covers automated functional and performance integration testing on ARMv8 3-node and 2-node testbeds.

https://wiki.fd.io/view/CSIT/AArch64

Contiv-VPP

This Kubernetes network plugin uses FD.io VPP to provide network connectivity between PODs.

https://github.com/contiv/vpp

The installation guide for Contiv-VPP on the Arm64 platform is:

https://github.com/contiv/vpp/blob/master/docs/arm64/MANUAL_INSTALL_ARM64.md

Porting and Tuning Roadmap

  • VPP Vectorization: Expanding the Neon Library for IPv4 forwarding code path - Sirshak/Lijian
  • Tuning the quad loop/dual loop for small cores - Lijian
  • General performance analysis and tuning of various graph nodes for IPv4 forwarding test case - Sirshak/Lijian
  • Memory Ordering - Sirshak
  • CSIT Performance Test - Khemendra
  • CSIT Device Test - Juraj
  • CSIT Path Test - Juraj

Known Issues

GCC 5.3 ICEs during FP register allocation. Please use GCC 5.4 or newer.
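
A build wrapper could guard against the affected compiler with a check like the one below. This is a minimal sketch, not part of the VPP build system: the function names are hypothetical, and the input is assumed to be the string printed by `gcc -dumpversion`.

```python
import re

# Minimum version taken from the note above: GCC 5.3 ICEs, 5.4 or newer is fine.
MIN_GCC = (5, 4)


def parse_gcc_version(text):
    """Extract (major, minor) from a version string such as '5.4.0' or '9'."""
    match = re.search(r"(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    # A missing minor component (for example '9') is treated as 0.
    return int(match.group(1)), int(match.group(2) or 0)


def gcc_is_supported(text):
    """Return True when the parsed version is at least MIN_GCC."""
    return parse_gcc_version(text) >= MIN_GCC
```

A caller would pass in the compiler's reported version, for example `gcc_is_supported("5.3.1")`, and stop the build with a clear message when the result is false.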

Activity

Recent Patches

misc: vppctl fix heap-buffer-overflow & memleaks Merged 12/14 Tianyu Li
crypto-native: fix build error on Arm using clang-13 Merged 12/14 Jieqiang Wang
snort: fix unused result warning for gcc-10 Merged 11/06 Tianyu Li
l2: fix array-bounds error for prefetch on Arm Merged 11/07 Tianyu Li
ip6: fix IPv6 address calculation error using "ip route add" CLI Merged 10/21 Jieqiang Wang
ipsec: Performance improvement of ipsec4_output_node using flow cache Merged 10/13 Govindarajan Mohandoss
build: fix centos rpm build Merged 10/08 Tianyu Li
vppinfra: fix potential memory access error in _pool_init_fixed Merged 10/05 Jieqiang Wang
svm: fix asan check failed @svm_map_region on arm Merged 06/24 Tianyu Li
l2: fix vrrp prefix mac comparison Merged 06/09 Tianyu Li
build: fix build error after make wipe Merged 06/04 Tianyu Li
memif: fix input node buffer prefetch Merged 05/21 Tianyu Li
memif: fix gcc-10 build error on arm platform Merged 05/21 Tianyu Li
papi: fix ubuntu 1804 make test socket.close error Merged 04/16 Tianyu Li
rdma: fix skip_ipv4_cksum behavior in scalar path Merged 04/15 Tianyu Li
vppinfra: correct intrinsic called by u16x16_from_u8x16 Merged 04/15 Lijian Zhang
vppinfra: fix compiling error due to incompatible udphdr field names Merged 03/05 Jieqiang Wang
avf: optimized with NEON SIMD instruction Merged 12/18 Lijian Zhang
ip: fix compiling error with gcc-10 Merged 09/01 Jieqiang Wang
build: Fix 'make install-deps' errors on aarch64 CentOS 7 Merged 07/29 Jieqiang Wang
acl: correct acl vat help message Merged 07/24 Lijian Zhang
build: add libssl-dev library for ubuntu 20.04 Merged 06/04 Jieqiang Wang
dpdk: fix compiling issue with clang Merged 05/08 Lijian Zhang
vppinfra: fix u32x4_byte_swap on Arm Merged 05/08 Lijian Zhang
build: support arch-specific compiling for Neoverse N1 Merged 04/30 Lijian Zhang
dpdk: false link down issue with ixgbe NIC Merged 03/23 Lijian Zhang
vlib: fix error when creating avf interface on SMP system Merged 03/21 Jieqiang Wang
vlib: leave SIGPROF signal with its default handler Merged 03/21 Jieqiang Wang
build: add libssl-dev for ubuntu 16.04 and 18.04 Merged 03/11 Jieqiang Wang
vlib: fix code of getting numa node with specific cpu_id Merged 02/17 Lijian Zhang
docs: add physmem section in configuration parameters Merged 12/19 Jieqiang Wang
vlib: add max-size configuration parameter for pmalloc Merged 12/18 Jieqiang Wang
crypto: not use vec api with opt_data[VNET_CRYPTO_N_OP_IDS] Merged 11/13 Lijian Zhang
acl: add missing square brackets to vat_help option in acl api Merged 10/31 Jieqiang Wang
dpdk: apply dual loop unrolling in DPDK TX Merged 09/12 Lijian Zhang
ip: apply dual loop unrolling in ip4_rewrite Merged 09/12 Lijian Zhang
ip: apply dual loop unrolling in ip4_input Merged 09/12 Lijian Zhang
build: fix running error with vmxnet3_test_plugin.so Merged 09/11 Jianlin Lv
build: fix unsupported CMake comparison operation Merged 09/05 Jianlin Lv
tap: fix tap interface not working on Arm issue Merged 09/04 Lijian Zhang
build: fix vpp compilation failure on ThunderX2 and Amp Merged 08/19 Jianlin Lv
vppinfra: Update "show cpu" output for AArch64 chips Merged 08/19 Nitin Saxena
vppinfra: refactor test_and_set spinlocks to use clib_spinlock_t Merged 08/02 Jason Zhang
vppinfra: added performance test for clib_rwlock_t (test_rwlock.c) Merged 08/02 Jason Zhang
vppinfra: refactor clib_rwlock_t to use single condition variable Merged 08/02 Jason Zhang
vppinfra: refactor clib_spinlock_t to use compare and swap Merged 08/02 Jason Zhang
vppinfra: added lock performance test for clib_spinlock_t (test_spinlock.c) Merged 08/02 Jason Zhang
vppinfra: refactor use of CLIB_MEMORY_BARRIER () Merged 08/02 Jason Zhang
vppinfra: conformed spinlocks to use CLIB_PAUSE Merged 08/02 Jason Zhang
vppinfra: add u64x2_scatter/u32x4_scatter Merged 06/21 Lijian Zhang
vppinfra: add u64x2_gather/u32x4_gather Merged 06/21 Lijian Zhang
fix compiling error with marvell pp2 plugin Merged 06/11 Jianlin Lv
Switch atomic release API from __sync to __atomic builtin Merged 06/05 Sirshak Das
Switch atomic test and set API from __sync to __atomic builtin Merged 06/05 Sirshak Das
Build packages for generic Arm architecture Merged 05/15 Lijian Zhang
Enable NEON instructions in memcpy_le Merged 05/01 Lijian Zhang
svm_fifo rework to avoid contention on cursize Merged 04/17 Sirshak Das
Re-enable aarch64 neon instruction in vlib_buffer_free_inline Merged 03/20 Lijian Zhang
sctp chunk_len fix Merged 03/06 Sirshak Das
Use acquire/release ordering when accessing svm_fifo shared variable cursize Merged 11/29 Sirshak Das
Optimize xxx_zero_byte_mask NEON function. Merged 11/07 Lijian Zhang
Enable atomic swap and store macro with acquire and release ordering. Merged 11/03 Sirshak Das
Add and enable msb mask vector intrinsic for aarch64. Merged 10/31 Lijian Zhang
vppinfra: add atomic macros for __sync builtins Merged 10/19 Sirshak Das
vppinfra: Fix extendto_high aarch64 NEON api. Merged 10/09 Sirshak Das
Support dynamic dual/quad loop selection on aarch64 Merged 10/01 Lijian Zhang
Enable verbose output during VPP cmake compiling Merged 9/25 Lijian Zhang
dpdk_plugin: fix mlx5 build and runtime issues Merged 9/27 Sirshak Das
Add and enable u32x4_extend_to_u64x2_high for aarch64 NEON intrinsics. Merged 9/12 Sirshak Das
Add horizontal add (hadd) vector intrinsic via NEON. Merged 9/11 Sirshak Das
Add u32x4_extend_to_u64x2 for aarch64 using NEON intrinsics Merged 9/11 Sirshak Das
Replacing vtbl NEON intrinsic with rev NEON intrinsic for byte_swap. Merged 9/11 Sirshak Das
Fix array bound failure in api_sr_localsid_add_del Merged 8/30 Lijian Zhang
cmake: fix marvell plugin build Merged 8/28 Brian Brooks
fix dpdk_plugin.so load failure with DPDK 18.08 Merged 8/23 Lijian Zhang
Fix a bug in function pipe_rx Merged 8/17 Lijian Zhang
fix compiling warnings with GCC Merged 8/17 Lijian Zhang
Update AArch64 CSIT machines into FD.io VPP docs Merged 8/17 Lijian Zhang
Add support for shuffle vector intrinsic via Neon in ARM Merged 8/1 Sirshak Das
Improve cpu { coremask-% } configure option Merged 8/1 Yi He
Fix undefined symbol: fformat_append_cr in vat plugins loading Merged 7/31 Yi He
pp2: increase recycle batch size Merged 7/10 Brian Brooks
pp2: change default queue size Merged 7/26 Brian Brooks
pp2: use configured RX queue size Merged 7/10 Brian Brooks
Fix load_unaligned undefined and other possible build failures Merged 6/26 Sirshak Das
Enable PMU cycle counter for graph node cycles Sirshak Das
Fix clang compilation on aarch64: extraneous parentheses Merged 6/13 Sirshak Das
Fix clang compilation on aarch64: value size does not match register size Merged 5/30 Sirshak Das
Fix clang compilation on aarch64: sizeof operator error Merged 5/30 Sirshak Das
Fix clang compilation on aarch64: replace -pie with -fPIE for dpdk compilation Merged 5/30 Sirshak Das
dpdk: set dmamap iova address value according to eal_iova_mode Merged 5/28 Sachin Saxena
Fixes make test errors with clang compiler on aarch64 Merged 5/27 Sirshak Das
Fix broken compilation for non-numa aware platforms Merged 5/16 Sachin Saxena
build-data: Common makefile for NXP DPAA1/DPAA2 platforms Merged 5/4 Sachin Saxena
arm64: Avoid setting march to corei7 when Cross Compiling for ARM Merged 5/4 Sachin Saxena
use restrict keyword VPP-1126 Khemendra Kumar
Autotools: Autodetection of cache line size VPP-1064 Nitin Saxena
add 'is_all_zero(x)' for NEON - fix build break Merged 2/20 Adrian Oanca
u8x16_compare_byte_mask optimization Merged 2/24 Adrian Oanca
Added u8x16,u32x4,u64x2 variants of _zero_byte_mask(x) for ARM/NEON platform Merged 2/26 VPP-1129 Adrian Oanca
add CLIB_HAVE_VEC128 with NEON intrinsics Merged 02/08 VPP-1127 Gabriel Ganne
Use neutral vector code for ethernet_frame_is_tagged Merged 2/19 Damjan Marion
vhost: Added ARMV8 NEON version of function map_guest_mem() Merged 2/7 VPP-1085 Nitin Saxena
vppinfra: use __atomic_fetch_add instead of __sync_fetch_and_add builtins VPP-1114 Kevin Wang
Arm system counter cleanup Merged 1/30 VPP-1125 Brian Brooks
svm: ... on autodetected VA space size (fixup again) Merged 01/10 Gabriel Ganne
svm: calc base address on AArch64 based on autodetected VA space size (fixup) Merged 01/10 Gabriel Ganne
svm: calc base address on AArch64 based on autodetected VA space size Merged 01/09 Damjan Marion
show cpu microarchitecture Merged 01/06 Gabriel Ganne
Fix Debian Packaging on AARCH64 Merged 01/06 Nitin Saxena
more extended tests fixes Merged 12/16 Gabriel Ganne
Use crc32 wrapper Merged 12/16 VPP-1086 Gabriel Ganne
implement clib_smp_pause() for arm and aarch64 platform Merged 12/15 VPP-1066 Kevin Wang
make "test-all" target pass again (for all platforms) Merged 12/13 Gabriel Ganne
fill "show cpu" Flag list on aarch64 platforms Merged 12/06 VPP-1065 Gabriel Ganne
remove smp dead code Merged 12/06 VPP-1066 Gabriel Ganne
net/virtio: support modern device id Merged 11/28 Gabriel Ganne
use REV on aarch64 for endianness swapping Merged 11/21 VPP-1067 Gabriel Ganne
armv8 crc32 - fix macro name Merged 11/15 Gabriel Ganne
bier - fix node table declaration Merged 11/14 Gabriel Ganne
Map SVM regions at a sane offset on arm64 Merged 11/10 Brian Brooks
bfd tests fix Merged 11/07 Gabriel Ganne
debian packaging fix Merged 11/06 Gabriel Ganne
lb test fix Merged 10/31 Gabriel Ganne
conditional x86intrin.h inclusion Merged 10/25 Gabriel Ganne
fix test_lb_ip4_gre6() cleanup Merged 10/24 Gabriel Ganne
null-terminate some formatted string Merged 10/20 Gabriel Ganne
lb plugin - fix format() type mismatches Merged 10/16 Gabriel Ganne
Use AESNI=y only on x86_64 machines Merged 10/14 Brian Brooks
Improved arm64 chip detection Merged 09/11 Brian Brooks
Native arm64 build: dpdk/Makefile change Merged 08/31 Brian Brooks

Meeting Minutes

11/21/2023

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Niyaz Murshed
    • Jieqiang Wang
  • CSIT
    • Status
      • Dave Wallace helps monitor the AArch64 CI/CD status, which looks fine
      • Replace old ThunderX2 with Ampere Altra; budget approved, still in progress
        • Sync with CSIT folks in the call when possible -- Juraj
      • Maciek asked about the availability of N2-based hardware
        • Plans to ship N2-based servers (Nvidia Grace (V2) / Ampere One (in-house design by Ampere)) to the FD.io lab next year
        • Timeline TBD
      • IPSec test cases
        • Patch already merged
        • QAT cards in Austin labs, plan to ship them to FD.io lab
      • RDMA test cases
        • MLX DPDK test cases are enabled; RDMA test cases are not enabled on AArch64
  • VPP
    • Detailed planning for VPP projects in the next call
    • Refactor OpenSSL usage in VPP IPsec -- Lijian
      • Move key generation and initialization steps out of data plane to control plane, see performance boost
    • Investigate make test framework in VPP -- Lijian
      • Patch broke wireguard test cases so need to figure out the work flow
    • VPP ramp-up -- Niyaz
      • Investigate VPP graph node mechanism and how to add nodes to the group
    • IPSec scalability tests -- Jieqiang
      • Try to figure out dpdk-rss-flows.py and how to generate balanced rss flows for IPSec tests

07/18/2023

  • Attendees
    • Jieqiang Wang
    • Tianyu Li
    • Juraj Linkes
  • CSIT
    • Timeout issue happens periodically on Taishan server, even in release testing.
    • IPSec & VxLAN performance drop issue on Ampere Altra
      • QAT cards are planned to be shipped
      • need to pay attention to the execution time for IPSec release testing
      • Need to investigate further on performance degradation issue
    • Verify job, merge job, device testing, and release testing are fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported or not?
      • Will have a debug meeting with RDMA maintainers on the issues.
      • ConnectX6 NIC info will be updated in doc first
      • Decide which test cases to run on the testbeds (time considerations/iterative tests/coverage tests)
    • MRR failed cases
      • Probably due to latest DPDK upgrade, not an arm-specific issue.
    • New test cases list on 3n-alt
      • NAT tests cannot be added because they are running on 2-node testbed only
      • enable IPSec flow cache(arm)/IPSec SPD fast path feature
    • Release testing
    • Plan to replace TX2 with Altra as VPP device testing testbed

06/20/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj Linkes
  • CSIT
    • Timeout issue happens periodically on Taishan server, even in release testing.
    • IPSec & VxLAN performance drop issue on Ampere Altra
      • QAT cards are planned to be shipped
      • need to pay attention to the execution time for IPSec release testing
      • Need to investigate further on performance degradation issue
    • Verify job, merge job, device testing, and release testing are fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported or not?
      • Will have a debug meeting with RDMA maintainers on the issues.
      • ConnectX6 NIC info will be updated in doc first
      • Decide which test cases to run on the testbeds (time considerations/iterative tests/coverage tests)
    • MRR failed cases
      • Probably due to latest DPDK upgrade, not an arm-specific issue.
    • New test cases list on 3n-alt
      • NAT tests cannot be added because they are running on 2-node testbed only
      • enable IPSec flow cache(arm)/IPSec SPD fast path feature

05/16/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • CSIT
    • Timeout issue happens periodically on Taishan server, even in release testing.
    • The link issue in DPDK testpmd test cases on Ampere Altra is still there.
      • Try swapping cables while upgrading NIC firmware and drivers
      • Try to reproduce the tests after the NIC firmware upgrade
      • Try different port pairs of the same two NICs
    • IPSec & VxLAN performance drop issue on Ampere Altra
      • QAT cards are planned to be shipped
      • need to pay attention to the execution time for IPSec release testing
      • Need to investigate further on performance degradation issue
    • Verify job, merge job, device testing, and release testing are fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported or not?
      • Will have a debug meeting with RDMA maintainers on the issues.
      • ConnectX6 NIC info will be updated in doc first
      • Decide which test cases to run on the testbeds (time considerations/iterative tests/coverage tests)
    • MRR failed cases
      • Probably due to latest DPDK upgrade, not an arm-specific issue.
  • VPP

04/18/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • CSIT
    • Timeout issue happens periodically on Taishan server, even in release testing.
    • The link issue in DPDK testpmd test cases on Ampere Altra is still there.
    • IPSec & VxLAN performance drop issue on Ampere Altra
      • QAT cards are planned to be shipped
      • need to pay attention to the execution time for IPSec release testing
      • Need to investigate further on performance degradation issue
    • Verify job, merge job, device testing, and release testing are fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported or not?
      • Will have a debug meeting with RDMA maintainers on the issues.
      • ConnectX6 NIC info will be updated in doc first
  • VPP

04/04/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • CSIT
    • Timeout issue happens periodically on Taishan server, even in release testing.
    • The link issue in DPDK testpmd test cases on Ampere Altra is still there.
    • IPSec & VxLAN performance drop issue on Ampere Altra
    • Verify job, merge job, device testing, and release testing are fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported or not?
      • Will have a debug meeting with RDMA maintainers on the issues.
  • VPP

03/07/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • CSIT
    • Timeout issue happens periodically on Taishan server, even in release testing.
    • The link issue in DPDK testpmd test cases on Ampere Altra is still there.
    • Verify job, merge job, device testing, and release testing are fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported or not?
  • VPP

2/21/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on a local setup - need to schedule a debug session to reproduce the issue in the FD.io lab.
            • Confirm with Vexxhost people whether replacing the Intel NICs is feasible
            • Tried waiting longer, interface still not up; a restart is not enough either - need to figure out a reliable workaround.
            • Tried an older version of DPDK - 21.08 does not work. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex, and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
              • DPDK port/link status broken - l3fwd has the same issue
              • Sent a detailed email to the i40e maintainer on the dpdk-dev mailing list; waiting for a response
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate 22.10 release testing result
            • Compiler version change seems to be one of the factors in the perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3n-tsh testbed
            • Timeout issue occurred when starting VPP inside the VM, but not when starting testpmd
            • Configuring isolcpus as a kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
              • isolcpus seems to be working fine
              • still need to root-cause the timeout issue - sometimes slower
              • run dpdk build, just use the non-isolated cores for build
              • both VM and VPP start slower than before
              • VPP loading plugins and timeout happens
              • Is VPP crashing? - no crash
              • Is the VM bound with isolated core? - need to check
              • Will set up a live debug session for Tianyu and Juraj
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
    • MLX NICs Planning
      • CX6 and CX7 - CX7 is hard to get on the market - MLX NICs will be used and reported
      • CX6 VPP native RDMA driver has issues; DPDK mlx driver is fine.
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP RDMA native driver only - VPP metadata corrupted - may be related to memory barriers
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate


2/7/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
            • Confirm with Vexxhost people if replacing Intel NICs is feasible
            • Tried waiting longer; interface still not up, and a restart is not enough either - need to find a reliable workaround.
            • Tried an older version of DPDK - 21.08 does not work. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
              • DPDK port/link status broken - l3fwd has the same issue
              • Sent a detailed email to the i40e maintainer on the dpdk-dev mailing list; waiting for response
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate 22.10 release testing result
            • Compiler version change seems to be one of the factors for perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3-tsh testbed
            • Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
            • The isolcpus kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj will try
              • isolcpus seems to be working fine
              • still need to root-cause the timeout issue - sometimes slower
              • run DPDK build, just use the non-isolated cores for build
              • both VM and VPP start slower than before
              • timeout happens while VPP is loading plugins
              • Is VPP crashing? - no crash
              • Is the VM bound to isolated cores? - need to check
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
    • MLX NICs Planning
      • CX6 and CX7 - CX7 is hard to get on the market - MLX NICs will be used and reported
      • CX6: VPP native rdma driver has issues, DPDK mlx driver is fine.
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate (NDR) throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
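
Note on the open question in the 2/7/2023 CSIT items (is the VM bound to isolated cores?): a minimal Python sketch of one way to check it, assuming the isolated set comes from the isolcpus= argument on the kernel command line. The helper names are hypothetical and are not part of CSIT or VPP.

```python
from typing import Set


def parse_cpu_list(spec: str) -> Set[int]:
    """Parse a kernel-style CPU list such as "2-5,8" into a set of ints.

    Non-numeric tokens (for example the "domain" or "managed_irq" flags
    that isolcpus= accepts) are skipped.
    """
    cpus: Set[int] = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        first, sep, last = token.partition("-")
        if not first.isdigit() or (sep and not last.isdigit()):
            continue  # a flag, not a CPU number or range
        cpus.update(range(int(first), int(last if sep else first) + 1))
    return cpus


def isolated_cpus(cmdline: str) -> Set[int]:
    """Return the CPUs named by isolcpus= in a kernel command line string."""
    for arg in cmdline.split():
        if arg.startswith("isolcpus="):
            return parse_cpu_list(arg[len("isolcpus="):])
    return set()


def bound_to_isolated(affinity: Set[int], cmdline: str) -> bool:
    """True if `affinity` is non-empty and every CPU in it is isolated."""
    return bool(affinity) and affinity <= isolated_cpus(cmdline)
```

On a Linux host the inputs could come from `os.sched_getaffinity(pid)` for the qemu process and from the contents of `/proc/cmdline`.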

1/17/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
            • Confirm with Vexxhost people if replacing Intel NICs is feasible
            • Tried waiting longer; interface still not up, and a restart is not enough either - need to find a reliable workaround.
            • Tried an older version of DPDK - 21.08 does not work. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate 22.10 release testing result
            • Compiler version change seems to be one of the factors for perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3-tsh testbed
            • Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
            • The isolcpus kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj will try
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate (NDR) throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
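
Note on the item about automating the RFC 2544 no-drop-rate throughput test: a minimal Python sketch of the rate search such automation might use, assuming plain binary search over the offered rate. The `measure_loss` callback is hypothetical and stands in for one traffic-generator trial; this is not the Ixia API and not necessarily the search algorithm CSIT uses.

```python
from typing import Callable


def find_rate(
    measure_loss: Callable[[float], float],
    line_rate: float,
    max_loss: float = 0.0,
    precision: float = 0.005,
) -> float:
    """Binary-search the highest offered rate whose loss ratio <= max_loss.

    measure_loss(rate) runs one trial at `rate` (same unit as line_rate)
    and returns the loss ratio, where 0.0 means no drops.
    max_loss=0.0 gives an NDR-style search; a non-zero value gives a
    PDR-style search. `precision` is the stop width as a fraction of
    line_rate.
    """
    if measure_loss(line_rate) <= max_loss:
        return line_rate  # acceptable loss even at line rate
    lo, hi = 0.0, line_rate  # lo: known good, hi: known bad
    while hi - lo > precision * line_rate:
        mid = (lo + hi) / 2.0
        if measure_loss(mid) <= max_loss:
            lo = mid
        else:
            hi = mid
    return lo
```

The result is a lower bound: the true rate lies between the returned value and that value plus `precision * line_rate`, provided loss grows monotonically with offered rate.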

12/20/2022

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
            • Confirm with Vexxhost people if replacing Intel NICs is feasible
            • Tried waiting longer; interface still not up, and a restart is not enough either - need to find a reliable workaround.
            • Tried an older version of DPDK - 21.08 does not work. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate 22.10 release testing result
            • Compiler version change seems to be one of the factors for perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3-tsh testbed
            • Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
            • The isolcpus kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj will try
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate (NDR) throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

12/06/2022

  • Attendees
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
            • Confirm with Vexxhost people if replacing Intel NICs is feasible
            • Tried waiting longer; interface still not up, and a restart is not enough either - need to find a reliable workaround.
            • Tried an older version of DPDK - 21.08 does not work. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate 22.10 release testing result
            • Compiler version change seems to be one of the factors for perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3-tsh testbed
            • Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
            • The isolcpus kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj will try
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate (NDR) throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

11/15/2022

  • Attendees
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
  • Miscellaneous
    • Reschedule the meeting to 9 am for Juraj and 3 pm for Shanghai folks
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
        • Good news: no more slowdown after 200 rounds of testing.
          • ThunderX2 servers need to upgrade from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
          • Distro upgrade to Ubuntu 22.04 is still ongoing - no ETA yet
            • Server configuration will remain the same, already integrated in ansible playbook
          • Re-enable voting if there are no more issues with 22.04 device testing
            • Submit a patch to enable voting right after meeting
      • Test meltdown/spectre vulnerabilities
        • CSIT maintainers ask whether tools exist to test vulnerabilities on Arm platform (not limited to Arm)
        • Will confirm this issue with support team - Lijian
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it will fix the issue
      • VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
        • VM cases failed only on 3n-alt performance testbed; the error log reports some missing files, likely a configuration issue
        • Another intermittent VM failure happens on tx2 and alt; need to figure out the above case first
        • Server reboot recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Server reboot recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx entries in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version, and VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting rights
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate (NDR) throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
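
Note on the /dev/vfio race condition item in the 11/15/2022 VPP Device notes (mount only the /dev/vfio entries in use): a minimal Python sketch of how the needed device nodes could be derived from PCI addresses via the sysfs iommu_group symlink. The function names are hypothetical, the use of `docker run --device` is an assumption, and this is not taken from the CSIT patch itself. VFIO no-IOMMU mode, where nodes are named differently, is not handled.

```python
import os
from typing import List


def vfio_device_paths(pci_addrs: List[str], sysfs_root: str = "/sys") -> List[str]:
    """Return the /dev/vfio nodes needed for the given PCI devices.

    Each device's IOMMU group is read from the symlink
    <sysfs_root>/bus/pci/devices/<addr>/iommu_group; the last path
    component of the link target is the group number.
    """
    paths = ["/dev/vfio/vfio"]  # VFIO container node, needed in every case
    for addr in pci_addrs:
        link = os.path.join(sysfs_root, "bus/pci/devices", addr, "iommu_group")
        group = os.path.basename(os.readlink(link))
        node = f"/dev/vfio/{group}"
        if node not in paths:  # devices may share one IOMMU group
            paths.append(node)
    return paths


def docker_device_args(paths: List[str]) -> List[str]:
    """Turn device paths into --device arguments for `docker run`."""
    return [f"--device={p}" for p in paths]
```

Passing individual nodes keeps each container's view limited to the groups it actually uses, which is the idea the note describes.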


10/18/2022

  • Attendees
    • Juraj Linkes
    • Tianyu Li
    • Lijian Zhang
    • Jieqiang Wang
  • Miscellaneous
    • Reschedule the meeting to 9 am for Juraj and 3 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • CSIT perf numbers VS local perf numbers
            • VPP cloud image in CSIT VS native built VPP in local env
            • One DPDK patch introduced perf degradation on Arm platform
            • Configuration difference between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
          • Juraj will write down the procedures for setting up the Ampere Altra setup in FD.io lab
            • And the procedures for developing test cases in CSIT (performance & device testing)
            • Juraj should have already sent these to Jieqiang.
          • 22.06 release testing will happen soon
          • NDR/PDR data difference - deep dive needed, waiting for Ampere folks' engagement
          • DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
            • Tried waiting longer; interface still not up, and a restart is not enough either - need to find a reliable workaround.
            • Replace XL710 NIC? - try asking tomorrow.
            • Tried an older version of DPDK - 21.08 does not work. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
        • New links for VPP perf trending/report pages
        • NUMA issue
          • Will run performance report on Arm testbed once the patch resolving the NUMA issue is merged
          • Dave will help merge the patch into the corresponding branches


    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
        • Good news: no more slowdown after 200 rounds of testing.
          • ThunderX2 servers need to upgrade from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
          • Suggest rerunning tests after upgrade to 22.04
          • Re-enable voting once there are no more issues with 22.04 device testing
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it will fix the issue
      • VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
        • Server reboot recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Server reboot recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx entries in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version, and VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting rights
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate (NDR) throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

9/20/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Lijian Zhang
    • Jieqiang Wang
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • CSIT perf numbers vs local perf numbers
            • VPP cloud image in CSIT vs natively built VPP in local env
            • One DPDK patch introduced perf degradation on Arm platform
            • Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
          • Juraj will write down the procedure for bringing up the Ampere Altra setup in FD.io lab
            • And the procedure for developing test cases in CSIT (performance & device testing)
            • Juraj should have already sent these to Jieqiang.
          • 22.06 release testing will happen soon
          • NDR/PDR data difference - deep dive needed, waiting for Ampere folks to engage
          • DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
            • Tried waiting longer, interface still not up; restart is not enough either - need to figure out a reliable workaround.
            • Replace XL710 NIC? - try asking tomorrow.
            • Tried an older version of DPDK - 21.08 does not work either. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
        • New links for VPP perf trending/report pages
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
        • Good news: no more slowdown after 200 rounds of testing.
          • ThunderX2 servers need to be upgraded from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
          • Suggest rerunning the tests after the upgrade to 22.04
          • Re-enable voting once there are no more issues with 22.04 device testing
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on VF, PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issues yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
        • QAT-enabled kernel patch release expected around October; kernel upgrade required.
      • QAT test cases are developed based on Python APIs / CLIs
    • VPP build servers - 2 new Ampere Altra servers; 2 old ThunderX1 servers to be decommissioned or not - need to confirm
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP versions
        • Performance numbers: NEON > SVE-VLA > SVE-VLS on Arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - a distro like Ubuntu is slower than buildroot - Lijian
      • Depends on some libraries (dpdk, ipsec_mb, rdma-core, nasm, etc.) - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPsec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
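
Note on the /dev/vfio item under VPP Device above (mount only the /dev/vfio/xxx in use rather than the whole directory): a minimal sketch of how the per-device paths could be derived, assuming the standard Linux sysfs layout where each PCI device has an iommu_group symlink. This is not the content of the merged CSIT patch. The function names are hypothetical, and passing the paths to a container with docker's --device option is an assumption about how they would be consumed.

```python
import os
from typing import Iterable, List


def vfio_device_paths(pci_addrs: Iterable[str],
                      sysfs_root: str = "/sys/bus/pci/devices") -> List[str]:
    """Return /dev/vfio nodes needed for the given PCI addresses.

    The VFIO container node /dev/vfio/vfio is always included; each
    device contributes /dev/vfio/<iommu group number>.
    """
    paths = ["/dev/vfio/vfio"]
    for addr in pci_addrs:
        link = os.path.join(sysfs_root, addr, "iommu_group")
        if not os.path.islink(link):
            raise FileNotFoundError(f"no iommu_group link for {addr}")
        # The symlink target's last path component is the group number.
        group = os.path.basename(os.path.realpath(link))
        path = f"/dev/vfio/{group}"
        if path not in paths:  # devices may share an IOMMU group
            paths.append(path)
    return paths


def docker_device_args(paths: Iterable[str]) -> List[str]:
    """Format paths as docker --device arguments (assumed consumer)."""
    return [f"--device={p}" for p in paths]
```

Exposing only the listed nodes means one container's setup or teardown does not touch VFIO groups that belong to another container, which is the intent described in the minutes.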


9/6/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Lijian Zhang
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • CSIT perf numbers vs local perf numbers
            • VPP cloud image in CSIT vs natively built VPP in local env
            • One DPDK patch introduced perf degradation on Arm platform
            • Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
          • Juraj will write down the procedure for bringing up the Ampere Altra setup in FD.io lab
            • And the procedure for developing test cases in CSIT (performance & device testing)
            • Juraj should have already sent these to Jieqiang.
          • 22.06 release testing will happen soon
          • NDR/PDR data difference - deep dive needed, waiting for Ampere folks to engage
          • DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
            • Tried waiting longer, interface still not up; restart is not enough either - need to figure out a reliable workaround.
            • Replace XL710 NIC? - try asking tomorrow.
            • Tried an older version of DPDK - 21.08 does not work either. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
        • New links for VPP perf trending/report pages
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
        • Good news: no more slowdown after 200 rounds of testing.
          • ThunderX2 servers need to be upgraded from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
          • Suggest rerunning the tests after the upgrade to 22.04
          • Re-enable voting once there are no more issues with 22.04 device testing
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on VF, PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issues yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
        • QAT-enabled kernel patch release expected around October; kernel upgrade required.
      • QAT test cases are developed based on Python APIs / CLIs
    • VPP build servers - 2 new Ampere Altra servers; 2 old ThunderX1 servers to be decommissioned or not - need to confirm
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP versions
        • Performance numbers: NEON > SVE-VLA > SVE-VLS on Arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - a distro like Ubuntu is slower than buildroot - Lijian
      • Depends on some libraries (dpdk, ipsec_mb, rdma-core, nasm, etc.) - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPsec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

8/16/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Masksym Vynnvk
    • Jieqiang Wang
    • Tianyu Li
    • Lijian Zhang
  • CSIT
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on VF, PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issues yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
    • VPP build servers - 2 new Ampere Altra servers; 2 old ThunderX1 servers to be decommissioned or not - need to confirm
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP versions
        • Performance numbers: NEON > SVE-VLA > SVE-VLS on Arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - a distro like Ubuntu is slower than buildroot - Lijian
      • Depends on some libraries (dpdk, ipsec_mb, rdma-core, nasm, etc.) - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPsec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

8/2/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Masksym Vynnvk
    • Jieqiang Wang
    • Tianyu Li
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on VF, PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issues yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no PCIe slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old ThunderX1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on x86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
    • VPP build servers - 2 new Ampere Altra servers; 2 old ThunderX1 servers to be decommissioned or not - need to confirm
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP versions
        • Performance numbers: NEON > SVE-VLA > SVE-VLS on Arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - a distro like Ubuntu is slower than buildroot - Lijian
      • Depends on some libraries (dpdk, ipsec_mb, rdma-core, nasm, etc.) - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve Ansible scripts to deploy VPP & Snort on K8s pods automatically
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPsec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

7/19/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
  • CSIT
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on VF, PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issues yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no PCIe slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old ThunderX1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on x86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP versions
        • Performance numbers: NEON > SVE-VLA > SVE-VLS on Arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - a distro like Ubuntu is slower than buildroot - Lijian
      • Depends on some libraries (dpdk, ipsec_mb, rdma-core, nasm, etc.) - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX NIC
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve Ansible scripts to deploy VPP & Snort on K8s pods automatically
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPsec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to memory barrier
            • QAT single core test done - investigate multiple core QAT case


7/5/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
  • CSIT
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on VF, PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issues yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm the RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on X86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP on N1 platforms
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Tested perfmon patch - Jieqiang
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPsec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corruption - may be related to memory barriers
            • QAT single core test done - investigate multiple core QAT case

6/21/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see whether it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx devices in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm the RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on X86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Tested perfmon patch - Jieqiang
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card

6/7/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see whether it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx devices in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm the RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on X86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Tested perfmon patch - Jieqiang
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card

5/17/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Tina Tsou
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see whether it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx devices in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm the RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on X86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage

4/5/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Tina Tsou
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see whether it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx devices in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm the RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on X86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when used with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)

3/15/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see whether it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx devices in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm the RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on X86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when used with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)

3/1/2022

  • Attendees
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see whether it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx devices in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no PCIe slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Paperwork for shipment is done
      • Build servers will arrive at end of Jan
      • Performance servers will arrive in Feb
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • Optimize reassembly node by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes the intermittent abnormally large statistic values for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

1/25/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it will fix the issue
      • VM cases failed on 1 min timeout when creating VM (qemu cli) - tx2 node
        • Server recovered after reboot; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx entries in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no PCIe slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Paperwork for shipment is done
      • Build servers will arrive at end of Jan
      • Performance servers will arrive in Feb
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes the intermittent abnormally large statistic values for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

1/18/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it will fix the issue
      • VM cases failed on 1 min timeout when creating VM (qemu cli) - tx2 node
        • Server recovered after reboot; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx entries in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no PCIe slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes the intermittent abnormally large statistic values for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

1/11/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it will fix the issue
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx entries in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes the intermittent abnormally large statistic values for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

12/14/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it will fix the issue
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx entries in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
    • VPP IPv6 Benchmarking and Profiling - Jieqiang
      • IPv6 profiling
        • No perf bump for lookup_x2 function in FD.io Gerrit
        • Try Mellanox NICs for IPv6 routing tests
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes the intermittent abnormally large statistic values for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

12/07/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it will fix the issue
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx entries in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
    • VPP IPv6 Benchmarking and Profiling - Jieqiang
      • IPv6 profiling
        • No perf bump for lookup_x2 function in FD.io Gerrit
        • Try Mellanox NICs for IPv6 routing tests
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes the intermittent abnormally large statistic values for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

11/30/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx entries in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
    • VPP IPv6 Benchmarking and Profiling - Jieqiang
      • IPv6 profiling
        • No perf bump for lookup_x2 function in FD.io Gerrit
        • Try Mellanox NICs for IPv6 routing tests
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach

11/23/2021

  • Attendees
    • Tianyu Li
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx entries in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
    • New Arm servers shipment to the FD.io lab
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation
    • VPP IPv6 Benchmarking and Profiling
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Performance degradation with quad loop unrolling applied on ip6_lookup_inline
        • Patch the current kernel to enable perfmon plugin on VPP
        • Need to check performance for IPv6 subnet routing
        • Try Mellanox NICs for IPv6 routing tests
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach


11/16/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx entries in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
          • Enable VPP device testing per patch
    • New Arm servers shipment to the FD.io lab
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation
    • VPP IPv6 Benchmarking and Profiling
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Performance degradation with quad loop unrolling applied on ip6_lookup_inline
        • Patch the current kernel to enable perfmon plugin on VPP
        • Need to check performance for IPv6 subnet routing
        • Try Mellanox NICs for IPv6 routing tests
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Improve Ansible scripts to deploy VPP & Snort on K8s pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
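
The /dev/vfio item above (mount only the /dev/vfio/xxx entries in use instead of the whole directory) could be sketched roughly as below. This is a minimal illustration, not the content of the CSIT patch: the helper name vfio_volume_args is hypothetical, it keeps the --volume form quoted in these minutes, and it assumes the container also needs the VFIO container node /dev/vfio/vfio alongside the per-group nodes.

```python
from typing import Iterable, List

VFIO_DIR = "/dev/vfio"


def vfio_volume_args(iommu_groups: Iterable[int]) -> List[str]:
    """Build per-group --volume arguments instead of mounting all of /dev/vfio.

    Assumption (not from the minutes): the container also needs the VFIO
    container node /dev/vfio/vfio in addition to the per-group nodes.
    """
    paths = [f"{VFIO_DIR}/vfio"]
    # Deduplicate and sort the group numbers so the output is stable.
    paths += [f"{VFIO_DIR}/{group}" for group in sorted(set(iommu_groups))]
    args: List[str] = []
    for path in paths:
        args += ["--volume", f"{path}:{path}"]
    return args
```

For example, vfio_volume_args([141]) yields arguments for /dev/vfio/vfio and /dev/vfio/141 only, so VFs belonging to other containers are never exposed through a shared directory mount.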

11/09/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tina Tsou
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance numbers: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
      • VPP IPv4 fragmentation
    • VPP IPv6 Benchmarking and Profiling
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Performance degradation with quad loop unrolling applied on ip6_lookup_inline
        • Patch the current kernel to enable perfmon plugin on VPP
        • Need to check performance for IPv6 subnet routing
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Improve Ansible scripts to deploy VPP & Snort on K8s pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach

11/02/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Tina Tsou
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance numbers: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
    • VPP IPv6 Benchmarking and Profiling
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Performance degradation with quad loop unrolling applied on ip6_lookup_inline
        • Patch the current kernel to enable perfmon plugin on VPP
        • Need to check performance for IPv6 subnet routing
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Improve Ansible scripts to deploy VPP & Snort on K8s pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach


10/26/2021

  • Attendees
    • Juraj Linkes
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week. - closed
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • Inbound IPsec: reproduced, needs investigation - Juraj
          • Flow cache is slower with 1 and 10 SPD entries; still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
        • IPsec SPD input/output case ongoing
          • Adding IPsec SPD outbound test cases 64B 1, 100 and 1k SPD entries, 1, 2, 4 cores, on tx2 testbed - clarified
            • Flow cache on and off cases need to be measured.
          • L2 BD 20k test cases take too long to execute; removed on Taishan.
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
        • 3n-tsh testbed unreachable, investigating right now - Juraj
          • TG firmware is being upgraded
          • Server unreachable due to firmware & driver update - resolved - update all done
        • Release testing for 21.10 starts
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Checked whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
            • Race condition occurs
            • Try mounting only part of /dev/vfio to see if the issue can be resolved
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Addressed comments, waiting for Peter's review.
              • Will enable voting right after the patch gets merged
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
        • Need Vexxhost to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
        • New servers are in the procurement process
        • Plan to replace old thunderx1 build servers with more advanced Arm servers
            • CPU not fully utilized on Arm, need further investigation
    • Intel NIC firmware upgrade on Arm - not supported
    • VPN access request to FD.io Arm servers
      • Lijian/Jieqiang have VPN access now
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see if there is any cycle count improvement
          • Able to access FPGA platform, investigating adding VPP to the buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC-620 images to run on FPGA
            • Enabling DMC-620 is closer to a real system, but performance will drop
            • Build a system using VPP memif and pktgen
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon and study the two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
        • Plan to try quad loop unrolling for ip6_lookup_inline function
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Try to use ansible to deploy VPP automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for that before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach
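
For the 'ip route add ipv6_addr/mask' item above (wrong IPv6 routes with mask 123-128), the prefixes VPP is expected to install can be cross-checked independently of VPP. A minimal sketch using Python's standard ipaddress module follows; the helper names are hypothetical and 2001:db8::ff is an arbitrary documentation address, not one taken from the minutes.

```python
import ipaddress


def expected_prefix(addr: str, masklen: int) -> str:
    """Return the network prefix that addr/masklen should resolve to."""
    # strict=False masks off the host bits instead of raising ValueError.
    return str(ipaddress.ip_network(f"{addr}/{masklen}", strict=False))


def expected_prefixes(addr: str, first: int = 123, last: int = 128) -> dict:
    """Map each mask length in [first, last] to its expected prefix."""
    return {m: expected_prefix(addr, m) for m in range(first, last + 1)}
```

With a /123 mask only the low five bits are host bits, so 2001:db8::ff/123 should come out as 2001:db8::e0/123; comparing such values against the route table shown by the VPP CLI is one way to pin down which mask lengths are affected.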

10/19/2021

  • Attendees
    • Juraj Linkes
    • Tianyu Li
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Checked whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
            • Race condition occurs
            • Try mounting only part of /dev/vfio to see if the issue can be resolved
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Will enable voting right after the patch gets merged
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
        • Need Vexxhost to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
        • New servers are in the procurement process
        • Plan to replace old thunderx1 build servers with more advanced Arm servers
            • CPU not fully utilized on Arm, need further investigation
    • Intel NIC firmware upgrade on Arm - not supported
    • VPN access request to FD.io Arm servers
      • Lijian/Jieqiang have VPN access now
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see if there is any cycle count improvement
          • Able to access FPGA platform, investigating adding VPP to the buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC-620 images to run on FPGA
            • Enabling DMC-620 is closer to a real system, but performance will drop
            • Build a system using VPP memif and pktgen
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon and study the two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
        • Plan to try quad loop unrolling for ip6_lookup_inline function
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Try to use ansible to deploy VPP automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for that before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach

10/12/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Govindarajan Mohandoss
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Checked whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
            • Race condition occurs
            • Try mounting only part of /dev/vfio to see if the issue can be resolved
            • Talked with Peter; Juraj is working on a prototype of mounting part of /dev/vfio
            • x86 VPP device job is fine because its firmware & driver are old
            • Arm VPP device servers have updated drivers; VLAN stripping is not allowed, and the VLAN configuration cannot be removed from lab view.
            • Only performance testbeds have NIC drivers updated
            • maintainer doesn't want to a option from vpp config
            • May need to check whether x86 has the same issue with the same driver version before reaching out to Intel
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
        • Need Vexxhost to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
        • New servers are in the procurement process
        • Plan to replace old thunderx1 build servers with more advanced Arm servers
            • CPU not fully utilized on Arm, need further investigation
    • Intel NIC firmware upgrade on Arm - not supported
    • VPN access request to FD.io Arm servers
      • Lijian/Jieqiang have VPN access now
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see if there is any cycle count improvement
          • Able to access FPGA platform, investigating adding VPP to the buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC-620 images to run on FPGA
            • Enabling DMC-620 is closer to a real system, but performance will drop
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon and study the two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for that before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach
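
The IPsec item above mentions testing the flow cache for hash collisions and stale entry overwrites. As a rough illustration of what those two cases mean, here is a toy direct-mapped cache in Python. It is a sketch under stated assumptions, not VPP's implementation: the class and method names are hypothetical, and the epoch counter is just one simple way to mark entries stale after a policy change.

```python
from typing import Dict, Hashable, Optional, Tuple


class FlowCache:
    """Toy direct-mapped flow cache in front of a slower policy lookup."""

    def __init__(self, n_buckets: int = 1024) -> None:
        self.n_buckets = n_buckets
        # bucket index -> (flow key, cached policy, epoch when cached)
        self.buckets: Dict[int, Tuple[Hashable, str, int]] = {}
        self.epoch = 0

    def invalidate(self) -> None:
        """Call when the policy database changes; old entries become stale."""
        self.epoch += 1

    def lookup(self, key: Hashable) -> Optional[str]:
        entry = self.buckets.get(hash(key) % self.n_buckets)
        if entry is None:
            return None
        cached_key, policy, epoch = entry
        # A colliding flow or a stale epoch counts as a miss.
        if cached_key != key or epoch != self.epoch:
            return None
        return policy

    def insert(self, key: Hashable, policy: str) -> None:
        # One entry per bucket: a colliding or stale entry is overwritten.
        self.buckets[hash(key) % self.n_buckets] = (key, policy, self.epoch)
```

Constructing the cache with n_buckets=1 forces every flow into the same bucket, which makes the collision and overwrite paths easy to exercise in a unit test.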

09/28/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Checked whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
            • Race condition occurs
            • Try mounting only part of /dev/vfio to see if the issue can be resolved
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
        • Need Vexxhost to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
        • New servers are in the procurement process
        • Plan to replace old thunderx1 build servers with more advanced Arm servers
    • VPN access request to FD.io Arm servers
      • Lijian/Jieqiang have VPN access now
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see if there is any cycle count improvement
          • Able to access FPGA platform, investigating adding VPP to the buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC-620 images to run on FPGA
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon and study the two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for that before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach

09/14/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced, needs investigation - Juraj
                • Need to learn more about the RFC and need time to understand it - on hold - waiting for Neale's response
          • Flow cache is slower with 1 and 10 SPD entries; still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
        • Release testing done.
          • A few more jobs run for release 21.06 and will be finished soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Checked whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
        • Need Vexxhost to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
    • VPN access request to FD.io Arm servers
      • Lijian/Jieqiang have VPN access now
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan hit and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see if there is any cycle count improvement
          • Able to access FPGA platform, investigating adding VPP to the buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC-620 images to run on FPGA
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon and study the two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied instead of referenced from src port to all dst ports
        • Direct/Indirect mbuf for VPP multicast testing
        • Try IPv4 multicasting & L2 flood testing which works fine
        • ip4-replicate node in IPv4 multicasting/l2-flood node in L2 flood testing
          • Both show that the mbuf is copied, so ref_cnt will always be one
            • DPDK 21.08 has the patches; need to verify on VPP
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for that before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach

09/07/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced, needs investigation - Juraj
                • Need to learn more about the RFC and need time to understand it - on hold - waiting for Neale's response
          • Flow cache is slower with 1 and 10 SPD entries; still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
        • Release testing done.
          • A few more jobs run for release 21.06 and will be finished soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: different error related moving VF from host to container(not involving dpdk/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Resulting in the same failure as before, only happen on AArch64 platform
          • Still see /dev/vfio resource busy error after linux kernel upgradation, but less frequently than before
          • Dig into the log for more details - juraj
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
        • Need the Vexxhost team to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan hit and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run for FPGA
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon & understand two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied rather than referenced from the src port to all dst ports
        • Direct/Indirect mbuf for VPP multicast testing
        • Tried IPv4 multicast & L2 flood testing, which works fine
        • ip4-replicate node in IPv4 multicast / l2-flood node in L2 flood testing
          • Both show the mbuf is copied, so ref_cnt is always one
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstreaming the patch depends on a kernel patch; waiting for that before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach

08/31/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 21.06 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate - Juraj
                • Need to learn more about the RFC and need time to understand it - on hold - waiting for Neale's response
          • Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results with 1-10 SPD policies.
        • Release testing done.
          • A few more jobs run for release 21.06 and will be finished soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before, only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • config option set to N: /dev/vfio device or resource busy error
          • config option set to Y & iommu_passthrough = 1: IP packets Rx timeout
            • The longer the server runs, the more test cases fail
          • Next steps
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Juraj modified script to reproduce the issue - Lijian will try it locally
              • Also seen by Zach with an Intel QAT card
            • Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
            • Lijian has slightly different firmware and driver versions
            • Tried Mellanox card (rdma driver) multiple times - did not see the issue seen on the XL710 NIC - Juraj
            • Lijian can use Juraj's script to reproduce the issue on local tx2 server
              • Reducing the numa buffer allocation size resolves this issue
              • Observed from the error log of numa buffer allocation
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
        • Need the Vexxhost team to confirm Ethernet / power cable type info.
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan hit and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run for FPGA
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon & understand two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied rather than referenced from the src port to all dst ports
        • Direct/Indirect mbuf for VPP multicast testing
        • Tried IPv4 multicast & L2 flood testing, which works fine
        • ip4-replicate node in IPv4 multicast / l2-flood node in L2 flood testing
          • Both show the mbuf is copied, so ref_cnt is always one
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
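
Note on the flow cache unit tests listed under VPP IPsec above: they cover hash collisions and stale entry overwrites. As a rough illustration only, the Python sketch below models a direct-mapped cache in front of a slow policy lookup. The class name, the epoch counter used to mark entries stale, and the bucket count are assumptions made for this example and are not taken from the VPP patches.

```python
class FlowCache:
    """Toy direct-mapped cache in front of a slow SPD policy lookup (hypothetical)."""

    def __init__(self, n_buckets, slow_lookup):
        self.n_buckets = n_buckets
        self.slow_lookup = slow_lookup      # callable: flow key -> policy
        self.epoch = 0                      # bumped whenever the policy set changes
        self.buckets = [None] * n_buckets   # each slot holds (key, epoch, policy)
        self.hits = 0
        self.misses = 0

    def policy_changed(self):
        # Lazy invalidation: existing entries become stale and are
        # overwritten the next time their slot is looked up.
        self.epoch += 1

    def lookup(self, key):
        idx = hash(key) % self.n_buckets
        entry = self.buckets[idx]
        if entry is not None and entry[0] == key and entry[1] == self.epoch:
            self.hits += 1
            return entry[2]
        # Miss: empty slot, colliding key, or stale epoch. Overwrite the slot.
        self.misses += 1
        policy = self.slow_lookup(key)
        self.buckets[idx] = (key, self.epoch, policy)
        return policy
```

With 4 buckets, integer keys 1 and 5 land in the same slot, so alternating between them exercises the collision path, and calling `policy_changed()` exercises the stale overwrite path.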

08/24/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 21.06 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate - Juraj
                • Need to learn more about the RFC and need time to understand it - on hold - waiting for Neale's response
          • Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results with 1-10 SPD policies.
        • Release testing done.
          • A few more jobs run for release 21.06 and will be finished soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before, only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • config option set to N: /dev/vfio device or resource busy error
          • config option set to Y & iommu_passthrough = 1: IP packets Rx timeout
            • The longer the server runs, the more test cases fail
          • Next steps
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Juraj modified script to reproduce the issue - Lijian will try it locally
              • Also seen by Zach with an Intel QAT card
            • Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
            • Lijian has slightly different firmware and driver versions
            • Tried Mellanox card (rdma driver) multiple times - did not see the issue seen on the XL710 NIC - Juraj
            • Lijian can use Juraj's script to reproduce the issue on local tx2 server
              • Reducing the numa buffer allocation size resolves this issue
              • Observed from the error log of numa buffer allocation
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
        • Need the Vexxhost team to confirm Ethernet / power cable type info.
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan hit and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon & understand two node functions
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied rather than referenced from the src port to all dst ports
        • Will try L2 flood test case & understand VPP/multicast code
        • Direct/Indirect mbuf for VPP multicast testing
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
        • For 64B cacheline size native build on Arm, may need to change code.
        • Current VPP does not support 64B cacheline size compilation for Arm images.
        • Issues about prefetch on current VPP code base
          • Issue 1 support 128B/64B cache-line size in Arm image
          • Issue 2 prefetch 'overflow' for native build
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
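
Note on the VPP Device item above: the behavior differs depending on /sys/module/vfio/parameters/enable_unsafe_noiommu_mode. A minimal Python sketch for recording that setting alongside test results follows. The function name and the None-when-missing convention are assumptions for illustration; the only thing taken from the minutes is the sysfs path.

```python
from pathlib import Path

# Path quoted in the minutes; readable only when the vfio module is loaded.
NOIOMMU_PARAM = "/sys/module/vfio/parameters/enable_unsafe_noiommu_mode"


def noiommu_enabled(param_path=NOIOMMU_PARAM):
    """Return True for 'Y', False for any other value, None if the file is absent."""
    path = Path(param_path)
    if not path.exists():
        return None
    return path.read_text().strip().upper() == "Y"


if __name__ == "__main__":
    print("enable_unsafe_noiommu_mode:", noiommu_enabled())
```

Logging this value at the start of each run would make it easier to tell the "resource busy" and "Rx timeout" cases apart when comparing results.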

08/17/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 21.06 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate - Juraj
                • Need to learn more about the RFC and need time to understand it
          • Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results with 1-10 SPD policies.
        • Release testing ongoing
          • A few more jobs run for release 21.06 and will be finished soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before, only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • config option set to N: /dev/vfio device or resource busy error
          • config option set to Y & iommu_passthrough = 1: IP packets Rx timeout
            • The longer the server runs, the more test cases fail
          • Next steps
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Juraj modified script to reproduce the issue - Lijian will try it locally
              • Also seen by Zach with an Intel QAT card
            • Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
            • Lijian has slightly different firmware and driver versions
            • Tried Mellanox card (rdma driver) multiple times - did not see the issue seen on the XL710 NIC - Juraj
            • Lijian can use Juraj's script to reproduce the issue on local tx2 server
              • Reducing the numa buffer allocation size resolves this issue
              • Observed from the error log of numa buffer allocation
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan hit and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon & understand two node functions
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied rather than referenced from the src port to all dst ports
        • Will try L2 flood test case & understand VPP/multicast code
        • Direct/Indirect mbuf for VPP multicast testing
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
        • For 64B cacheline size native build on Arm, may need to change code.
        • Current VPP does not support 64B cacheline size compilation for Arm images.
        • Issues about prefetch on current VPP code base
          • Issue 1 support 128B/64B cache-line size in Arm image
          • Issue 2 prefetch 'overflow' for native build
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
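
Note on the VPP Prefetch item above: the 128B/64B question comes down to how many cache lines a buffer region spans on a given CPU. The Python sketch below works through that arithmetic only. The function names and the example sizes are illustrative assumptions, and this is not VPP's prefetch code.

```python
def cache_lines_spanned(addr, size, line_size):
    """Return how many cache lines the byte range [addr, addr + size) touches."""
    if size <= 0:
        return 0
    first = addr // line_size
    last = (addr + size - 1) // line_size
    return last - first + 1


def prefetch_addresses(addr, size, line_size):
    """Return one line-aligned address per cache line touched by the range."""
    if size <= 0:
        return []
    first = (addr // line_size) * line_size
    last = ((addr + size - 1) // line_size) * line_size
    return list(range(first, last + line_size, line_size))


if __name__ == "__main__":
    for line_size in (64, 128):
        print(line_size, prefetch_addresses(0, 256, line_size))
```

Under these assumptions, a 256-byte region needs four prefetches on a 64B-line CPU and two on a 128B-line CPU, so a stride hard-coded for 128B would skip every other line on 64B hardware.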

08/10/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 21.06 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate
          • Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results with 1-10 SPD policies.
        • Release testing ongoing
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before, only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • config option set to N: /dev/vfio device or resource busy error
          • config option set to Y & iommu_passthrough = 1: IP packets Rx timeout
            • The longer the server runs, the more test cases fail
          • Next steps
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Juraj modified script to reproduce the issue - Lijian will try it locally
              • Also seen by Zach with an Intel QAT card
            • Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
            • Lijian has slightly different firmware and driver versions
            • Tried Mellanox card (rdma driver) multiple times - did not see the issue seen on the XL710 NIC - Juraj
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
    • VPN access request to FD.io Arm servers
      • Lijian now has VPN access
      • Juraj signed Jieqiang's key
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan hit and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128, CLI issue only, CSIT's python API works fine.
      • Internal patch to resolve this issue under review - upstreamed
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
        • For 64B cacheline size native build on Arm, may need to change code.
        • Current VPP does not support 64B cacheline size compilation for Arm images.
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare simd improvement, cache to look up
      • 4 loop unrolling decreasing performance
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
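
For the IPv6 route issue noted above (masks 123-128), a reference value for the expected prefix can be computed independently of VPP. The sketch below uses Python's standard ipaddress module; the function name and the sample address are illustrative assumptions, not taken from the VPP CLI or CSIT.

```python
import ipaddress


def expected_ipv6_prefix(addr: str, mask_len: int) -> str:
    """Return the network prefix that addr/mask_len should resolve to."""
    # strict=False clears the host bits instead of raising an error.
    net = ipaddress.ip_network(f"{addr}/{mask_len}", strict=False)
    return str(net)


if __name__ == "__main__":
    # Hypothetical sample address from the documentation range 2001:db8::/32.
    for mask_len in range(123, 129):
        print(mask_len, expected_ipv6_prefix("2001:db8::ff", mask_len))
```

Comparing these values against the routes shown by the CLI is one way to confirm whether a given mask length is affected.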

08/03/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate
          • Flow cache with 1, 10 SPD entries slower, still investigating. Manual local tests and CSIT runs give different results for 1-10 SPD policies.
        • Release testing ongoing
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863


    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: different error related to moving a VF from host to container (not involving dpdk/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Resulting in the same failure as before, only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • config option set to N, /dev/vfio device or resource busy error
          • config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
            • The longer the server runs, the more test cases fail
          • Next to do
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Also seen in Intel QAT card from Zach
            • Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
            • Will try Mellanox card to see if same issue happens - Juraj
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
    • VPN access request to FD.io Arm servers
      • Lijian has got VPN access now
      • Juraj signed Jieqiang's key
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan hit and is fixing some issues running SVE in a qemu VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • Internal patch to resolve this issue under review
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
        • For 64B cacheline size native build on Arm, may need to change code.
        • Current VPP does not support 64B cacheline size compilation for Arm images.
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare simd improvement, cache to look up
      • 4 loop unrolling decreasing performance
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
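
When comparing the two behaviors recorded above for enable_unsafe_noiommu_mode (N and Y), it helps to log the current value at the start of each test run. The following is a minimal sketch, assuming the kernel exposes the parameter as Y or N at the sysfs path quoted in the notes; the function name is hypothetical and is not part of CSIT or VPP.

```python
from pathlib import Path

# Path quoted in the meeting notes.
NOIOMMU_PARAM = Path("/sys/module/vfio/parameters/enable_unsafe_noiommu_mode")


def noiommu_enabled(param_path: Path = NOIOMMU_PARAM):
    """Return True for Y, False for N, or None if the value is unavailable."""
    try:
        value = param_path.read_text().strip()
    except OSError:
        # vfio module not loaded, or the parameter is not present.
        return None
    if value in ("Y", "1"):
        return True
    if value in ("N", "0"):
        return False
    return None


if __name__ == "__main__":
    print("enable_unsafe_noiommu_mode:", noiommu_enabled())
```

Returning None rather than raising keeps the check usable on hosts where the vfio module is not loaded.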

07/27/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: different error related to moving a VF from host to container (not involving dpdk/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Resulting in the same failure as before, only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Random issue, more frequently happening on Arm
              • Not seen in CI recently or manually.
        • scapy unexpected timeout issue: packet drop or slow issue?
            • vfio-pci driver may be the root cause - bind/unbind
      • Connection issue between Jenkins and the build executor in FD.io lab
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
    • VPN access request to FD.io Arm servers
      • Lijian has got VPN access now
      • Juraj signed Jieqiang's key
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan hit and is fixing some issues running SVE in a qemu VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
        • For 64B cacheline size native build on Arm, may need to change code.
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare simd improvement, cache to look up
      • 4 loop unrolling decreasing performance
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
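
The prefetch refactoring item above depends on the CPU's actual cache line size: the same data span needs a different number of prefetches with 64B and 128B lines. The sketch below only illustrates that arithmetic; the function name is hypothetical and it is not VPP's prefetch code.

```python
from typing import List


def cache_lines_to_prefetch(addr: int, size: int, line_size: int) -> List[int]:
    """Return the aligned start address of every cache line touched by
    the byte range [addr, addr + size)."""
    if size <= 0:
        return []
    if line_size <= 0 or line_size & (line_size - 1):
        raise ValueError("line_size must be a positive power of two")
    first = addr & ~(line_size - 1)               # line holding the first byte
    last = (addr + size - 1) & ~(line_size - 1)   # line holding the last byte
    return list(range(first, last + 1, line_size))


if __name__ == "__main__":
    for line_size in (64, 128):
        lines = cache_lines_to_prefetch(0x1000, 128, line_size)
        print(line_size, [hex(a) for a in lines])
```

A 128-byte object aligned at 0x1000 spans two lines with a 64B line size and one line with a 128B line size, so a prefetch helper written for one size issues too many or too few prefetches on the other.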

07/20/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: different error related to moving a VF from host to container (not involving dpdk/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Resulting in the same failure as before, only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Random issue, more frequently happening on Arm
            • vfio-pci driver may be the root cause - bind/unbind
      • Connection issue between Jenkins and the build executor in FD.io lab
    • Shipment of new advanced server to the FD.io lab
      • Two advanced servers are planned for shipment
    • VPN access request to FD.io Arm servers
      • Lijian has got VPN access now
      • Juraj signed Jieqiang's key
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan hit and is fixing some issues running SVE in a qemu VM
        • SVE validation on FPGA platform
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
    • VPP mbuf-fast-free tx offload
      • Performance improvement for IPv4 routing test cases using vector path
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach

07/13/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Expected to be merged soon
          • Flow cache with 1, 10 SPD entries slower, still investigating. Manual local tests and CSIT runs give different results for 1-10 SPD policies.
            • Hugepage size, numa-node, core isolation etc. may need to check.
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to ubuntu 20.04, everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms. - 4-core container case failed on x86 and Arm - Fixed and passing.
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
        • 21.06 vs 21.01 shows a performance drop, see https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
          • May need to check VM and IPsec cases
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version (still reproduced, no update yet).
          • Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; a script to reproduce has been provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers; the VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the dpdk code to use iavf, the issue is fixed. Need to find a proper solution in dpdk.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: different error related to moving a VF from host to container (not involving dpdk/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Resulting in the same failure as before, only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Random issue, more frequently happening on Arm
            • vfio-pci driver may be the root cause - bind/unbind
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
    • VPN access request to FD.io Arm servers
      • Will remind Machiek to sign Lijian's GPG public key.
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)


07/06/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Expected to be merged soon
          • Flow cache with 1, 10 SPD entries slower, still investigating. Manual local tests and CSIT runs give different results for 1-10 SPD policies.
            • Hugepage size, numa-node, core isolation etc. may need to check.
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to ubuntu 20.04, everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms. - 4-core container case failed on x86 and Arm - Fixed and passing.
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
        • 21.06 vs 21.01 shows a performance drop, see https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
          • May need to check VM and IPsec cases
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version (still reproduced, no update yet).
          • Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; a script to reproduce has been provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers; the VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the dpdk code to use iavf, the issue is fixed. Need to find a proper solution in dpdk.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: different error related to moving a VF from host to container (not involving dpdk/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Resulting in the same failure as before, only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Random issue, more frequently happening on Arm
            • vfio-pci driver may be the root cause - bind/unbind
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
    • VPN access request to FD.io Arm servers
      • Will remind Machiek to sign Lijian's GPG public key.
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan hit and is fixing some issues running SVE in a qemu VM
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
        • Repeat the same test on Ampere server - PMU cache-miss less for write always
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare simd improvement, cache to look up
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
          • Done some NEON changes, see some microbenchmark improvement
      • Investigate CSIT case
        • No classify test case in CSIT. - Jieqiang
        • There may be a CSIT case named iacldstbase
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface(slave) with VPP memif interface(master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
          • Add support for VPP aarch64 docker image build
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach
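
The IPsec items above describe putting a hash-based flow cache in front of the linear SPD search and testing hash collisions and stale entry overwrites. The sketch below illustrates that general idea only. It is not the code in the gerrit changes listed above, and the class name, slot count and flow tuples are hypothetical.

```python
class SpdFlowCache:
    """Direct-mapped flow cache in front of a linear policy search."""

    def __init__(self, policies, n_slots=8):
        # policies: ordered list of (match_function, action) pairs.
        self.policies = policies
        self.slots = [None] * n_slots
        self.hits = 0
        self.misses = 0

    def _linear_lookup(self, flow):
        # Slow path: first matching policy wins.
        for match, action in self.policies:
            if match(flow):
                return action
        return None

    def lookup(self, flow):
        idx = hash(flow) % len(self.slots)
        entry = self.slots[idx]
        if entry is not None and entry[0] == flow:
            self.hits += 1
            return entry[1]
        # Miss, or a different flow hashed to this slot (collision):
        # take the slow path and overwrite the stale entry.
        self.misses += 1
        action = self._linear_lookup(flow)
        self.slots[idx] = (flow, action)
        return action


if __name__ == "__main__":
    policies = [
        (lambda f: f[0] == "10.0.0.1", "protect"),
        (lambda f: True, "bypass"),
    ]
    cache = SpdFlowCache(policies, n_slots=1)  # one slot forces collisions
    for flow in [("10.0.0.1", "10.0.1.1"), ("10.0.0.1", "10.0.1.1"),
                 ("10.0.0.2", "10.0.1.1"), ("10.0.0.1", "10.0.1.1")]:
        print(flow, cache.lookup(flow), cache.hits, cache.misses)
```

With very few policies the slow path is already short, so the hashing and compare cost can outweigh the saving, which is consistent with the note that the 1 and 10 SPD entry cases needed further investigation.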

06/29/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Expected to be merged soon
          • Flow cache with 1, 10 SPD entries slower, still investigating. Manual local tests and CSIT runs give different results for 1-10 SPD policies.
            • Hugepage size, numa-node, core isolation etc. may need to check.
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to ubuntu 20.04, everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms. - 4-core container case failed on x86 and Arm - Fixed and passing.
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
        • 21.06 vs 21.01 shows a performance drop, see https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
          • May need to check VM and IPsec cases
    • VPP Path
      • Voting and working fine.
      • Community plans to drop the support for CentOS-8.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version (still reproduced, no update yet).
          • Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; a script to reproduce has been provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers; the VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the dpdk code to use iavf, the issue is fixed. Need to find a proper solution in dpdk.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: different error related to moving a VF from host to container (not involving dpdk/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Resulting in the same failure as before, only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Debugging
            • vfio-pci driver may be the root cause - bind/unbind
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
    • VPN access request to FD.io Arm servers
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan ran into, and is fixing, some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
        • Repeat the same test on Ampere server - PMU shows fewer cache misses with write always
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare simd improvement, cache to look up
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
          • Made some NEON changes, seeing some microbenchmark improvement
      • investigate CSIT case
        • No classify test case in CSIT. - Jieqiang
        • There may be a CSIT case named iacldstbase
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
          • Add support for VPP aarch64 docker image build
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics have been updated by maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach
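
The SPD flow cache idea noted under "VPP IPsec on Arm" above (a hash-indexed cache in front of the linear SPD search, where collisions and stale entries are simply overwritten) can be sketched roughly as follows. This is a hypothetical illustration and not the code in the Gerrit changes: the types `flow_key_t`, `spd_policy_t`, `spd_t`, the hash function, the cache size and the epoch counter used to detect stale entries are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 5-tuple key; field names are illustrative only. */
typedef struct {
  uint32_t src, dst;
  uint16_t sport, dport;
  uint8_t proto;
} flow_key_t;

/* Hypothetical policy: inclusive [lo, hi] range per field. */
typedef struct {
  flow_key_t lo, hi;
  int action;
} spd_policy_t;

typedef struct {
  flow_key_t key;
  int policy_index;
  uint32_t epoch; /* 0 means the slot was never filled */
} cache_entry_t;

#define CACHE_SIZE 1024u /* assumed size, must be a power of two */

typedef struct {
  const spd_policy_t *policies;
  size_t n_policies;
  uint32_t epoch;                /* start at 1, bumped on policy change */
  cache_entry_t cache[CACHE_SIZE];
  unsigned long linear_searches; /* counter for demonstration only */
} spd_t;

static int key_equal(const flow_key_t *a, const flow_key_t *b) {
  return a->src == b->src && a->dst == b->dst && a->sport == b->sport &&
         a->dport == b->dport && a->proto == b->proto;
}

/* Simple multiplicative mix; any reasonable hash would do here. */
static uint32_t key_hash(const flow_key_t *k) {
  uint32_t h = k->src * 2654435761u;
  h ^= k->dst * 2246822519u;
  h ^= (((uint32_t)k->sport << 16) | k->dport) * 3266489917u;
  h ^= k->proto;
  return h ^ (h >> 15);
}

static int policy_matches(const spd_policy_t *p, const flow_key_t *k) {
  return k->src >= p->lo.src && k->src <= p->hi.src &&
         k->dst >= p->lo.dst && k->dst <= p->hi.dst &&
         k->sport >= p->lo.sport && k->sport <= p->hi.sport &&
         k->dport >= p->lo.dport && k->dport <= p->hi.dport &&
         k->proto >= p->lo.proto && k->proto <= p->hi.proto;
}

/* Slow path: first matching policy wins, -1 if none matches. */
static int spd_linear_search(spd_t *spd, const flow_key_t *k) {
  spd->linear_searches++;
  for (size_t i = 0; i < spd->n_policies; i++)
    if (policy_matches(&spd->policies[i], k))
      return (int)i;
  return -1;
}

/* Fast path: one hash probe; on miss, collision or stale entry,
   fall back to the linear search and overwrite the slot. */
int spd_lookup_cached(spd_t *spd, const flow_key_t *k) {
  assert(spd->epoch != 0);
  cache_entry_t *e = &spd->cache[key_hash(k) & (CACHE_SIZE - 1)];
  if (e->epoch == spd->epoch && key_equal(&e->key, k))
    return e->policy_index;
  int idx = spd_linear_search(spd, k);
  e->key = *k;
  e->policy_index = idx;
  e->epoch = spd->epoch;
  return idx;
}

/* Call whenever policies are added or removed: every cached entry
   becomes stale without having to walk the cache. */
void spd_policies_changed(spd_t *spd) { spd->epoch++; }
```

In this sketch a repeated lookup for the same flow costs one hash probe instead of a walk over all policies, which is where the benefit would be expected to grow with the number of SPD entries.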

06/22/2021

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries
            • Expected to be merged soon
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms. - 4-core container cases fail on x86 and Arm
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • Community plans to drop the support for CentOS-8.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moving the NIC from the Arm server to an x86 server makes it easy to update the firmware. A script to reproduce the issue has been provided; Intel is busy with other work, debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK is still needed.
          • Tried to reproduce with another set of firmware, etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Resulting in the same failure as before, only happens on the AArch64 platform
            • vfio-pci driver may be the root cause
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
    • VPN access request to FD.io Arm servers
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan ran into, and is fixing, some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
        • Repeat the same test on Ampere server
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
          • Made some NEON changes, seeing some microbenchmark improvement
      • investigate CSIT case
        • No classify test case in CSIT. - Jieqiang
        • There may be a CSIT case named iacldstbase
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
          • Add support for VPP aarch64 docker image build
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics have been updated by maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach
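
For the memif item above about applying the C11 memory model without relying on relaxed ordering, the general acquire/release pattern on a shared ring can be sketched as below. This is a generic single-producer/single-consumer ring written for illustration, not the memif ring layout or the maintainer's patch; `ring_t`, `RING_SIZE` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u /* assumed size, must be a power of two */

typedef struct {
  _Atomic uint16_t head; /* written only by the producer */
  _Atomic uint16_t tail; /* written only by the consumer */
  uint32_t desc[RING_SIZE];
} ring_t;

void ring_init(ring_t *r) {
  atomic_init(&r->head, 0);
  atomic_init(&r->tail, 0);
}

/* Producer: fill the slot first, then publish the new head with
   release ordering, so a consumer that observes the new head is
   guaranteed to observe the slot contents as well. */
int ring_enqueue(ring_t *r, uint32_t v) {
  uint16_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
  uint16_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
  if ((uint16_t)(head - tail) == RING_SIZE)
    return -1; /* ring full */
  r->desc[head & (RING_SIZE - 1)] = v;
  atomic_store_explicit(&r->head, (uint16_t)(head + 1), memory_order_release);
  return 0;
}

/* Consumer: the acquire load of head pairs with the producer's
   release store; the release store of tail hands the slot back. */
int ring_dequeue(ring_t *r, uint32_t *v) {
  uint16_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
  uint16_t head = atomic_load_explicit(&r->head, memory_order_acquire);
  if (head == tail)
    return -1; /* ring empty */
  *v = r->desc[tail & (RING_SIZE - 1)];
  atomic_store_explicit(&r->tail, (uint16_t)(tail + 1), memory_order_release);
  return 0;
}
```

Each side reads its own index with relaxed ordering because no other thread writes it; only the index owned by the other side needs the acquire load. On AArch64 the acquire/release pair typically compiles to load-acquire and store-release instructions rather than full barriers, which is the kind of cycle difference that 'show runtime' or perfmon could be used to check.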

06/15/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
          • VPP exits with the IPsec startup config, try startup config from Zach's email - Juraj
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms. - 4-core container cases fail on x86 and Arm
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • Community plans to drop the support for CentOS-8.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moving the NIC from the Arm server to an x86 server makes it easy to update the firmware. A script to reproduce the issue has been provided; Intel is busy with other work, debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK is still needed.
          • Tried to reproduce with another set of firmware, etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms(Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly. - DaveW
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan ran into, and is fixing, some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
        • Repeat the same test on Ampere server
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
          • Made some NEON changes, seeing some microbenchmark improvement
      • investigate CSIT case
        • No classify test case in CSIT. - Jieqiang - there may be a CSIT case named iacldstbase
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics have been updated by maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Waiting for review comments on outbound side before upstream to VPP
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach
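
The two prefetch items above (read vs. write prefetch, and honoring the CPU's actual cache line size) can be illustrated with the GCC/Clang builtin `__builtin_prefetch(addr, rw, locality)`, where `rw` is 0 for a read hint and 1 for a write hint. The sketch below is illustrative only and is not VPP's prefetch implementation: `CACHE_LINE_BYTES`, `prefetch_range` and `sum_headers` are hypothetical names, and the 64-byte default is an assumption that should be overridden at build time for cores with a different line size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef CACHE_LINE_BYTES
#define CACHE_LINE_BYTES 64 /* assumption; override per target CPU */
#endif

/* Issue one prefetch hint per cache line covered by [p, p + len).
   The rw argument of the builtin must be a compile-time constant,
   hence the explicit branch instead of passing for_write through. */
static inline void prefetch_range(const void *p, size_t len, int for_write) {
  uintptr_t addr = (uintptr_t)p & ~(uintptr_t)(CACHE_LINE_BYTES - 1);
  uintptr_t end = (uintptr_t)p + len;
  for (; addr < end; addr += CACHE_LINE_BYTES) {
    if (for_write)
      __builtin_prefetch((const void *)addr, 1, 3); /* write hint */
    else
      __builtin_prefetch((const void *)addr, 0, 3); /* read hint */
  }
}

/* Toy "node": read hdr_len bytes of each packet while prefetching
   the packet two iterations ahead, as a packet loop typically does. */
uint64_t sum_headers(uint8_t *const *pkts, size_t n, size_t hdr_len) {
  uint64_t sum = 0;
  for (size_t i = 0; i < n; i++) {
    if (i + 2 < n)
      prefetch_range(pkts[i + 2], hdr_len, 0);
    for (size_t j = 0; j < hdr_len; j++)
      sum += pkts[i][j];
  }
  return sum;
}
```

Prefetch hints never change program results, so a read-always vs. write-always comparison like the one planned above can only be judged through measurements such as cycle counts or PMU cache-miss counters.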


06/08/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
          • VPP exits with the IPsec startup config, try startup config from Zach's email - Juraj
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Work in progress.
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms.
    • VPP Path
      • Voting and working fine.
      • Community plans to drop the support for CentOS-8.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moving the NIC from the Arm server to an x86 server makes it easy to update the firmware. A script to reproduce the issue has been provided; Intel is busy with other work, debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK is still needed.
          • Tried to reproduce with another set of firmware, etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results.
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms(Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
  • VPP
    • VPP default compiler on Arm platform
      • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
        • Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
          • No obvious performance improvement, keep the original default compiler
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always - Jieqiang
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
      • investigate CSIT case
        • No classify test case in CSIT. - Jieqiang
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics have been updated by maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Waiting for review comments on outbound side before upstream to VPP
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach


06/01/2021

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 firstly(daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (Policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
              • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
              • IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
            • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
            • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Work in progress.
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate cabling issue on Taishan performance test-bed - resolved.
          • Some container cases seem to fail on all platforms.
    • VPP Path
      • Voting and working fine.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moving the NIC from the Arm server to an x86 server makes it easy to update the firmware. A script to reproduce the issue has been provided; Intel is busy with other work, debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK is still needed.
          • Tried to reproduce with another set of firmware, etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms(Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
      • Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
    • Vector length specific patch is ready
    • Investigating VPP classify function, use case, benchmarking - Lijian
      • Start with simple use case
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • Remaining SVE work - variable type convention - need some workaround for 256-bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • SVE patch sent to Nitin, Nitin will review the patch when back to work.
      • Review memif patch
        • Memif C11 atomics have been updated by maintainer, not using atomic_relaxed - Tianyu
      • VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case - No classify test case in CSIT. - Jieqiang
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
    • Work on IPsec input/output nodes - VPP uses linear search on SPD lookups - Govind & Zach
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Waiting for review comments on outbound side before upstream to VPP
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • Perfmon plugin enablement on Arm - Zach

05/25/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Zachary Leaf
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 firstly(daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (Policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
              • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
              • IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
            • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
            • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - will look into it
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate cabling issue on Taishan performance test-bed - resolved.
          • Some container cases seem to fail on all platforms.
    • VPP Path
      • Voting and working fine.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moving the NIC from the Arm server to an x86 server makes it easy to update the firmware. A script to reproduce the issue has been provided; Intel is busy with other work, debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK is still needed.
          • Tried to reproduce with another set of firmware, etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • Tried the new version of DPDK, but it does not help
            • Contact Intel devs for possible advice
            • The workaround may have too much impact on all test cases
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms(Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
      • Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
    • Vector length specific patch is ready
    • Investigating VPP classify function, use case, benchmarking - Lijian
      • Start with simple use case
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • Remaining SVE work - variable type convention - needs a workaround for the 256-bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • SVE patch sent to Nitin, Nitin will review the patch when back to work.
      • Review memif patch
        • Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
      • VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case - No classify test case in CSIT. - Jieqiang
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • IPsec input node optimization work in progress - Zach & Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
          • https://gerrit.fd.io/r/c/vpp/+/31694
          • IPSec unit test - make test new cases implementation
          • Make test cases for IPSec policy mode - Done, included in Govind's patch, waiting for maintainer review - Zach
            • Add/Remove/Add+Remove+Readd/Hash collisions/Multiple interfaces & rules
          • Review the patch and grasp the basics about IPSec - Lijian
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache; discussed with Neal, and the maintainer agrees with the change
    • perfmon CMN-600 investigating - Zach
      • VPP perfmon CMN-600 patch abandoned: it measures at system level, not at VPP node level, and Linux perf can give the same result - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec flow cache outbound is done; working on the inbound side in a separate patch - Zach
      • IPSec decryption / input node - Zach
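
The SPD lookup items above (linear search today, with a hash-based flow cache as the proposed change) can be illustrated with a minimal sketch. This is not the VPP implementation and does not reflect the code in the Gerrit change: the names Policy and SpdWithFlowCache, the 5-tuple key layout, and the "clear the cache on any policy change" rule are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Dict, List, Optional, Tuple

# Hypothetical flow key: (src ip, dst ip, protocol, src port, dst port).
FlowKey = Tuple[str, str, int, int, int]


@dataclass(frozen=True)
class Policy:
    """One illustrative SPD rule; the fields are assumptions, not VPP's layout."""
    name: str
    src_net: str
    dst_net: str
    action: str  # e.g. "protect", "bypass", "discard"

    def matches(self, key: FlowKey) -> bool:
        src, dst = key[0], key[1]
        return (ip_address(src) in ip_network(self.src_net)
                and ip_address(dst) in ip_network(self.dst_net))


class SpdWithFlowCache:
    """Linear SPD search fronted by an exact-match hash table keyed on the flow."""

    def __init__(self, policies: List[Policy]) -> None:
        self.policies = list(policies)
        self.cache: Dict[FlowKey, Optional[Policy]] = {}
        self.linear_searches = 0  # counts how often the slow path runs

    def _linear_lookup(self, key: FlowKey) -> Optional[Policy]:
        # Slow path: walk the rules in order, first match wins.
        self.linear_searches += 1
        for policy in self.policies:
            if policy.matches(key):
                return policy
        return None

    def lookup(self, key: FlowKey) -> Optional[Policy]:
        # Fast path: a flow seen before is answered from the hash table.
        if key in self.cache:
            return self.cache[key]
        policy = self._linear_lookup(key)
        self.cache[key] = policy  # misses (None) are cached as well
        return policy

    def add_policy(self, policy: Policy) -> None:
        # Any policy change may invalidate cached results, so drop them all.
        self.policies.append(policy)
        self.cache.clear()
```

In this sketch, repeated packets of the same flow cost one hash lookup instead of a walk over every rule, which is the kind of saving the flow cache items refer to. Invalidating the whole cache on a policy change is the simplest correct choice here; a real implementation may well do something finer-grained.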

05/18/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Zachary Leaf
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing; they now run on both 2n-tx2 and 3n-tsh (and are also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
              • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
              • IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
            • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
            • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308
            • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
      • Voting and working fine.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • The VPP community is actively responding to this issue. - Juraj
        • The issue could be reproduced on Arm servers using the NIC with the latest firmware version. (still reproduced, no update yet)
          • Moving the NIC from the Arm server to an x86 server makes it easy to update the firmware. A script to reproduce the issue has been provided; Intel is busy with other work, and debugging is in progress.
          • Tried to reproduce with another set of firmware, etc., but the issue still exists.
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • Tried the new version of DPDK, but it does not help
            • Contact Intel devs for possible advice
            • The workaround may have too large an impact on all test cases
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms (Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
        • The lab move has started stage 2; part of the servers were moved first so that the CI service does not go down.
        • The lab move is done; there are some issues with the Taishan testbed.
        • Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
      • Plan to benchmark gcc-10 vs clang-12
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • Remaining SVE work - variable type convention - needs a workaround for the 256-bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • SVE patch sent to Nitin, Nitin will review the patch when back to work.
      • Review memif patch
        • Functional bug related to C11 atomics has been resolved by VPP maintainer.
      • VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case. - Jieqiang
      • Make test cases for IPSec policy mode - Zach
        • Add/Remove/Add+Remove+Readd/Hash collisions/Multiple interfaces & rules - Add more test cases
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • IPsec input node optimization work in progress - Zach & Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache; discussed with Neal, and the maintainer agrees with the change
    • perfmon CMN-600 investigating - Zach
      • VPP perfmon CMN-600 patch abandoned: it measures at system level, not at VPP node level, and Linux perf can give the same result - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec flow cache outbound is done; working on the inbound side in a separate patch - Zach
      • IPSec decryption / input node - Zach

05/11/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Zachary Leaf
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing; they now run on both 2n-tx2 and 3n-tsh (and are also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
              • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
              • IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
            • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
      • Voting and working fine.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • The VPP community is actively responding to this issue. - Juraj
        • The issue could be reproduced on Arm servers using the NIC with the latest firmware version. (still reproduced, no update yet)
          • Moving the NIC from the Arm server to an x86 server makes it easy to update the firmware. A script to reproduce the issue has been provided; Intel is busy with other work, and debugging is in progress.
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • Tried the new version of DPDK, but it does not help
            • Contact Intel devs for possible advice
            • The workaround may have too large an impact on all test cases
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms (Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
        • The lab move has started stage 2; part of the servers were moved first so that the CI service does not go down.
        • Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
        • Almost all servers have been moved except the performance testbed, which will be moved this week; everything has gone smoothly so far.
        • Ubuntu 18.04 -> 20.04
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • Remaining SVE work - variable type convention - needs a workaround for the 256-bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • SVE patch sent to Nitin, Nitin will review the patch when back to work.
      • Review memif patch
      • VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case.
      • Make test cases for IPSec policy mode - Jieqiang
    • Review memif test cases/memif cases
    • Finished coding of SVE string library, bihash key compare functions
    • SVE unit testing based on test_vec, fix test_vec issues
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • IPsec input node optimization work in progress - Zach & Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache; discussed with Neal, and the maintainer agrees with the change
    • Using startup parameter to enable the IPsec flow cache feature
    • Adding Python test case to test IPSec node behavior - Jieqiang
    • perfmon CMN-600 investigating - Zach
      • VPP perfmon CMN-600 patch abandoned: it measures at system level, not at VPP node level, and Linux perf can give the same result - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec flow cache outbound is done; working on the inbound side in a separate patch - Zach
      • IPSec decryption / input node - Zach

04/27/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing; they now run on both 2n-tx2 and 3n-tsh (and are also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • The VPP community is actively responding to this issue. - Juraj
        • The issue could be reproduced on Arm servers using the NIC with the latest firmware version. (still reproduced, no update yet)
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • Tried the new version of DPDK, but it does not help
            • Contact Intel devs for possible advice
            • The workaround may have too large an impact on all test cases
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms (Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • Remaining SVE work - variable type convention - needs a workaround for the 256-bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • Make test cases for IPSec policy mode - Jieqiang
    • Review memif test cases/memif cases
    • Finished coding of SVE string library, bihash key compare functions
    • SVE unit testing based on test_vec, fix test_vec issues
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • IPsec input node optimization work in progress - Zach & Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache; discussed with Neal, and the maintainer agrees with the change
    • Using startup parameter to enable the IPsec flow cache feature
    • Adding Python test case to test IPSec node behavior - Jieqiang
    • perfmon CMN-600 investigating - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec decryption / input node - Zach

04/13/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing; they now run on both 2n-tx2 and 3n-tsh (and are also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Some issues occurred during the upgrade.
          • Patch to resolve the building error of DPDK on 3n-tsh testbed.
          • Root cause is the change of build system of DPDK on 3n-tsh related to SOC id detection.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • The VPP community is actively responding to this issue. - Juraj
        • The issue could be reproduced on Arm servers using the NIC with the latest firmware version. (still reproduced, no update yet)
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms (Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • Remaining SVE work - variable type convention - needs a workaround for the 256-bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
      • Make test cases for IPSec policy mode - Jieqiang
    • Review memif test cases/memif cases
    • Finished coding of SVE string library, bihash key compare functions
    • SVE unit testing based on test_vec, fix test_vec issues
    • Test template update - Jieqiang
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache; discussed with Neal, and the maintainer agrees with the change
    • Using startup parameter to enable the IPsec flow cache feature
    • Adding Python test case to test IPSec node behavior - Jieqiang
    • perfmon CMN-600 investigating - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec decryption - Zach


03/30/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • 2-node IPsec SPD policy test case patch is ready, starting with 1 and 1k tunnels. (40, 400 tunnels in a separate patch)
            • https://gerrit.fd.io/r/c/csit/+/31605
            • Fixed the wrong CLI commands, but the configuration still has problems.
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Some issues occurred during the upgrade.
          • Patch to resolve the DPDK build error on the Arm testbed (Taishan DPDK cases still have issues; investigating).
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • The VPP community is actively responding to this issue. - Juraj
        • The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
        • Will try to reproduce the issue with x86 servers.
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Review memif test cases/memif cases
    • Finished coding of SVE string library, bihash key compare functions
    • SVE unit testing based on test_vec, fix test_vec issues
    • Test template update
    • SVE unit test in qemu-vm, met compiling issue, investigating
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Prepare the memif readout - Tianyu
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache; discussed with Neal, and the maintainer agrees with the change
    • Using startup parameter to enable the IPsec flow cache feature
    • Discuss with Jieqiang adding a Python test case to test IPsec node behavior
    • perfmon CMN-600 investigating - Zach

03/16/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
      • Verify SVE vector length specific wrappers - Jieqiang
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
        • Extend vector length agnostic opportunities
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses a linear search for SPD lookup.
        • SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with the community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
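
The idea behind the SPD flow cache noted above (a hash lookup in front of the linear SPD search) can be sketched as follows. This is only a minimal illustration in Python, not the VPP implementation: the `Policy` and `Spd` classes, the three-field flow key, and the policy names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical flow key: (source address, destination address, IP protocol).
FlowKey = Tuple[str, str, int]


@dataclass(frozen=True)
class Policy:
    """Illustrative SPD entry; a field set to None acts as a wildcard."""
    name: str
    src: Optional[str] = None
    dst: Optional[str] = None
    proto: Optional[int] = None

    def matches(self, key: FlowKey) -> bool:
        src, dst, proto = key
        return (self.src in (None, src)
                and self.dst in (None, dst)
                and self.proto in (None, proto))


class Spd:
    """Ordered policy list with a hash-based flow cache in front of it."""

    def __init__(self, policies: Iterable[Policy]) -> None:
        self.policies = list(policies)
        self.flow_cache: Dict[FlowKey, Optional[Policy]] = {}
        self.linear_searches = 0  # counts slow-path lookups, for illustration

    def linear_lookup(self, key: FlowKey) -> Optional[Policy]:
        # Slow path: walk the policies in priority order, first match wins.
        self.linear_searches += 1
        for policy in self.policies:
            if policy.matches(key):
                return policy
        return None

    def lookup(self, key: FlowKey) -> Optional[Policy]:
        # Fast path: one hash lookup per packet of an already-seen flow.
        if key in self.flow_cache:
            return self.flow_cache[key]
        policy = self.linear_lookup(key)
        self.flow_cache[key] = policy
        return policy

    def add_policy(self, policy: Policy) -> None:
        # Any policy change may invalidate cached results, so flush the cache.
        self.policies.append(policy)
        self.flow_cache.clear()
```

Only the first packet of a flow pays for the linear search; later packets with the same key hit the dictionary. The cache must be flushed whenever the policy list changes, which this sketch does in `add_policy`.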

03/09/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • CentOS-7 will be enabled with the master branch to support the LTS release
        • CentOS-7 Jenkins on Arm will not be supported.
        • CentOS-8 will be supported by the end of this year by Redhat.
      • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
        • Dependency on maintainers to fix this issue
        • Intel will ship a new NIC with latest firmware
          • Shipment takes a long time empirically
            • NIC has been shipped to vexxhost, wait for NIC arrival.
          • Try to reproduce the issue on this NIC on Arm platform
          • Updating firmware on the current NIC is risky
        • Voting rights will be enabled once this issue is fixed
            • Maintainer raised the ticket to get intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
        • Will show Arm roadmap in the next TSC meeting
        • Proposals have been sent to VPP maintainer on verifying SVE/SVE2
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
        • 128 and 256 fixed size vector wrappers are ready, needs verification
        • Verify SVE vector length specific wrappers - Jieqiang
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
        • Extend vector length agnostic opportunities
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
          • Removed interrupts on Altra, but no performance improvement seen
          • Instruction cache misses are higher on Altra than on N1
      • IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
    • VPP compiling error on CentOS 7 - Jieqiang
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses a linear search for SPD lookup.
        • SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with the community.
        • perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

02/23/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • CentOS-7 will be enabled with the master branch to support the LTS release
        • CentOS-7 Jenkins on Arm will be supported.
      • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
        • Dependency on maintainers to fix this issue
        • Intel will ship a new NIC with latest firmware
          • Shipment takes a long time empirically
          • Try to reproduce the issue on this NIC on Arm platform
          • Updating firmware on the current NIC is risky
        • Voting rights will be enabled once this issue is fixed
            • Maintainer raised the ticket to get intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker - Lijian
        • Latest VPP binary crashes in the QEMU docker
          • System call fails inside QEMU docker when running VPP
        • Verify SVE/SVE2 features inside ARM QEMU VM
        • VPP maintainers want real hardware to verify SVE code
          • This solution will be abandoned.
        • 'make test' execution is slow
        • Sync with DPDK team/VPP community to decide the solution
        • Proposals have been sent to VPP maintainer on verifying SVE/SVE2
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
        • 128 and 256 fixed size vector wrappers are ready, needs verification
        • Verify SVE vector length specific wrappers - Jieqiang
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
        • Extend vector length agnostic opportunities
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
          • Removed interrupts on Altra, but no performance improvement seen
          • Instruction cache misses are higher on Altra than on N1
      • IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP memif test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
        • 3 use cases are investigated.
        • Will explore the memif logic and share the progress.
        • Will share the link on details about how to run VPP in container.
      • Investigate VPP agent usage - Tianyu
        • Focus more on data-plane performance benchmarking and optimization - Tianyu
    • VPP compiling error on CentOS 7 - Jieqiang
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses a linear search for SPD lookup.
        • perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

02/09/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
        • Dependency on maintainers to fix this issue
        • Intel will ship a new NIC with latest firmware
          • Shipment takes a long time empirically
          • Try to reproduce the issue on this NIC on Arm platform
          • Updating firmware on the current NIC is risky
        • Voting rights will be enabled once this issue is fixed
            • Maintainer raised the ticket to get intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker - Lijian
        • Latest VPP binary crashes in the QEMU docker
          • System call fails inside QEMU docker when running VPP
        • Verify SVE/SVE2 features inside ARM QEMU VM
        • 'make test' execution is slow
        • Sync with DPDK team/VPP community to decide the solution
        • Proposals have been sent to VPP maintainer on verifying SVE/SVE2
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
          • Removed interrupts on Altra, but no performance improvement seen
          • Instruction cache misses are higher on Altra than on N1
      • IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP memif test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
        • 3 use cases are investigated.
        • Will explore the memif logic and share the progress.
        • Will share the link on details about how to run VPP in container.
      • Investigate VPP agent usage - Tianyu
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses a linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

02/02/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
        • Dependency on maintainers to fix this issue
        • Voting rights will be enabled once this issue is fixed
          • Implementation is ready and will be tested with actual patches.
          • Apply a file locking mechanism so that only one VPP instance runs at a time.
            • https://gerrit.fd.io/r/c/csit/+/30425
            • Patches are under review
            • Maintainer raised the ticket to get intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker - Lijian
        • Latest VPP binary crashes in the QEMU docker
          • System call fails inside QEMU docker when running VPP
        • Verify SVE/SVE2 features inside ARM QEMU VM
        • 'make test' execution is slow
        • Sync with DPDK team/VPP community to decide the solution
        • Proposals have been sent to VPP maintainer on verifying SVE/SVE2
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
          • Removed interrupts on Altra, but no performance improvement seen
          • Instruction cache misses are higher on Altra than on N1
      • IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP memif test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
        • 3 use cases are investigated.
        • Will explore the memif logic and share the progress.
        • Will share the link on details about how to run VPP in container.
      • Investigate VPP agent usage - Tianyu
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses a linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
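
The file locking workaround mentioned above (letting only one VPP instance start at a time) could look roughly like the sketch below. It is a generic illustration using an advisory `flock` lock and is not taken from the CSIT patch under review; the helper name `single_instance_lock` and the lock file path in the usage note are assumptions.

```python
import fcntl
import os
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def single_instance_lock(lock_path: str, blocking: bool = True) -> Iterator[int]:
    """Hold an exclusive advisory lock on lock_path for the duration of the block.

    With blocking=False, BlockingIOError is raised if another holder exists.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        flags = fcntl.LOCK_EX
        if not blocking:
            flags |= fcntl.LOCK_NB
        fcntl.flock(fd, flags)  # waits here while another job holds the lock
        yield fd
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

A test job would wrap the VPP startup step in `with single_instance_lock("/tmp/vpp-start.lock"):` (hypothetical path), so that concurrent jobs on the same host start their instances one after another. The lock is advisory, so it only helps if every job that starts VPP goes through the same helper.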

01/19/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
        • CSIT official release 20.09 is available
          • https://docs.fd.io/csit/rls2009/report/
          • Jieqiang will compare the performance data with release 20.09
            • Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
            • DPDK testpmd running inside VM, l2 cross connect running inside VPP.
            • Check the number for CSIT 2101 release
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Hardware configurations/wiring are done; Physical connection to the TG is done.
        • almost done, two steps need to be done
          • Start with basic L2/L3/ACL/classifier tests and use DPDK PMD on 2n-thx2 first (daily testing)
          • Take the execution time into consideration if we want to run release testing on 2n-thx2.
          • It takes 9 hours to finish the one round testing.
          • Tests are running fine
            • L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
            • Suitable time to run release testing on 2n-tx2 testbed.
            • Will investigate IPSec test cases on 2n-tx2 - Juraj
            • Add memif test case to 2n-tx2 once the release testing is done.
    • VPP Path
      • CentOS-7 will be enabled with the master branch to support the LTS release
        • CentOS-7 Jenkins on Arm will be supported.
      • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
          • Implementation is ready and will be tested with actual patches.
          • Apply a file locking mechanism so that only one VPP instance runs at a time.
            • https://gerrit.fd.io/r/c/csit/+/30425
            • Patches are under review
            • Machiek raised the ticket to get intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker
        • Latest VPP binary crashes in the QEMU docker - Lijian
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
      • IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP memif test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
        • 3 use cases are investigated.
        • Will explore the memif logic and share the progress.
        • Will share the link on details about how to run VPP in container.
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses a linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind


01/05/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
        • CSIT official release 20.09 is available
          • https://docs.fd.io/csit/rls2009/report/
          • Jieqiang will compare the performance data with release 20.09
            • Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
            • DPDK testpmd running inside VM, l2 cross connect running inside VPP.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Hardware configurations/wiring are done; Physical connection to the TG is done.
        • almost done, two steps need to be done
          • Start with basic L2/L3/IPSec/ACL/classifier tests and use DPDK PMD on 2n-thx2 first (daily testing)
          • Take the execution time into consideration if we want to run release testing on 2n-thx2.
          • Tests are running fine
            • L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
            • Suitable time to run release testing on 2n-tx2 testbed.
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
          • Implementation is ready and will be tested with actual patches.
          • Apply a file locking mechanism so that only one VPP instance runs at a time.
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker
        • Latest VPP binary crashes in the QEMU docker - Lijian
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
      • IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses a linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
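
The compiler priority recorded above (gcc-10 > gcc-9.2.0 > clang-10) amounts to picking the first compiler in an ordered list that is actually installed. A rough sketch of that selection follows; it is not the logic in the VPP build system, and the binary names `gcc-10`, `gcc-9`, and `clang-10` as well as the function `pick_compiler` are assumptions made for illustration.

```python
import shutil
from typing import Callable, Optional, Sequence

# Priority order from the minutes; the executable names are assumed.
ARM_COMPILER_PRIORITY = ("gcc-10", "gcc-9", "clang-10")


def pick_compiler(
    candidates: Sequence[str] = ARM_COMPILER_PRIORITY,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None
```

The `which` parameter exists only so the lookup can be replaced in tests; by default it searches `PATH` with `shutil.which`.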

12/22/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
    • Will cancel the meeting on Dec 29th.
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • LF will provide QSFP+ fiber switch for FD.io lab.
        • Basically done. LF just procured the existing fiber switch currently rented by Arm in the FD.io lab.
        • Send the progress to relevant people in Arm - Lijian
        • Confirm with Tina to ensure Arm is not charged - Lijian
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features on VPP CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • Two SVE/SVE2 proposals are upstreamed; will discuss the proposals with maintainers
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No positive result for 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
      • IO testing is doable with specific PCIe slots where PCIe bandwidth is not the bottleneck.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on the IPsec input node; VPP uses a linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
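
The compiler ordering noted above (gcc-10 > gcc-9.2.0 > clang-10) amounts to picking the first available entry from a priority list. A minimal sketch of that selection follows; the binary names are assumed Ubuntu-style names and `pick_compiler` is a hypothetical helper, not part of VPP's build system.

```python
import shutil
from typing import Callable, Optional, Sequence

# Priority order from the minutes; the binary names are assumptions.
COMPILER_PRIORITY = ("gcc-10", "gcc-9", "clang-10")


def pick_compiler(
    candidates: Sequence[str] = COMPILER_PRIORITY,
    is_available: Callable[[str], bool] = lambda name: shutil.which(name) is not None,
) -> Optional[str]:
    """Return the first candidate that is available, or None if none are."""
    for name in candidates:
        if is_available(name):
            return name
    return None
```

Passing `is_available` explicitly makes the ordering easy to check without depending on which compilers are installed on the host.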


12/15/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
    • Will cancel the meeting on Dec 29th.
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
          • Implementation is ready and will be tested with actual patches.
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable to CSIT maintainers
      • LF will provide QSFP+ fiber switch for FD.io lab.
        • Basically done. LF just procured the existing fiber switch currently rented by Arm in FD.io lab.
        • Send the progress to relevant people in Arm - Lijian
      • Arm is required to present its achievements and plans to the TSC.
        • Govind will prepare the slides
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • Two SVE/SVE2 proposals are upstreamed; will discuss the proposals with maintainers
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No positive result for 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
      • IO testing is doable with specific PCIe slots where PCIe bandwidth is not the bottleneck.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on the IPsec input node; VPP uses a linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
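
The multi-arch item above ties SOC ID availability to kernel version 5.9 and later. A minimal sketch of such a version gate follows, assuming release strings shaped like `4.15.0-55-generic`; both helper names are hypothetical, and the check says nothing about where the kernel exposes the ID.

```python
import re
from typing import Tuple


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '4.15.0-55-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_may_expose_soc_id(release: str) -> bool:
    """True when the kernel is 5.9 or newer, the version named in the minutes."""
    return parse_kernel_release(release) >= (5, 9)
```

On a live system the release string could come from `platform.release()`; comparing `(major, minor)` tuples avoids the string-comparison pitfall where "5.10" sorts before "5.9".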


12/08/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • Working with VPP/DPDK/Intel to root cause this issue. - Juraj
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable to CSIT maintainers
      • LF will provide QSFP+ fiber switch for FD.io lab.
        • Vexxhost just has a spare one, and LF will buy it for FD.io lab, which will probably happen this month.
      • N1SDP shipment to FD.io
        • Govind will track the status
      • CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
        • Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Maciek to provide 10G switch.
        • Arm is required to present its achievements and plans to the TSC.
          • Govind will prepare the slides
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • SOC id will be available on /proc entry starting from kernel version 5.9
        • Will investigate the details - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Benchmarked cross-connect and TX queue is dropping packets
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • To optimize ethernet-input and avf-input node with NEON intrinsics
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • Two SVE/SVE2 proposals are upstreamed
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • Have to repeat the testing in the future.
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere Altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on the IPsec input node; VPP uses a linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
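
The scalability item above (1T1C, 2T2C, 3T3C) is usually judged against ideal linear scaling from the single-core result. A minimal sketch of that comparison follows; `scaling_efficiency` is a hypothetical helper and any throughput values passed to it are the caller's own measurements, none are taken from these minutes.

```python
from typing import Dict


def scaling_efficiency(throughput_by_cores: Dict[int, float]) -> Dict[int, float]:
    """Ratio of measured throughput to ideal linear scaling from the 1-core result."""
    base = throughput_by_cores[1]
    if base <= 0:
        raise ValueError("1-core throughput must be positive")
    return {
        cores: rate / (base * cores)
        for cores, rate in sorted(throughput_by_cores.items())
    }
```

An efficiency of 1.0 means perfectly linear scaling; values below 1.0 show how much each added core falls short of the single-core rate.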

12/1/2020

  • Attendees
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
      • LF will provide QSFP+ fiber switch for FD.io lab.
      • CSIT will install normally used os distro and kernel.
      • 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
        • To enable voting right for the VPP device jobs. - Juraj
          • Failed tests due to sw_interface_dump api issue. - Juraj
        • VPP device job is unstable
          • Race condition occurs when multiple VPP instances are starting.
          • Will try to update the i40e driver & firmware.
      • N1SDP shipment to FD.io
        • Govind will update the shipment status to Juraj and Maciek.
        • Will still have to ship N1SDP to FD.io lab, and Maciek confirmed that there will be enough rack space.
      • CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
        • Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Maciek to provide 10G switch.
        • Arm is required to present its achievements and plans to the TSC.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • SOC id will be available on /proc entry starting from kernel version 5.9
        • Will investigate the details - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • To optimize ethernet-input and avf-input node with NEON intrinsics
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 proposal
      • Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
      • Patches are upstreamed for comments
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere Altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • IPsec on Arm platform. - Govind
      • Apply prefetches with ipsec-out node
      • Work on the IPsec input node; VPP uses a linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

11/24/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
      • LF will provide QSFP+ fiber switch for FD.io lab.
      • CSIT will install normally used os distro and kernel.
      • 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
        • To enable voting right for the VPP device jobs. - Juraj
          • Failed tests due to sw_interface_dump api issue. - Juraj
      • N1SDP shipment to FD.io
        • Govind will update the shipment status to Juraj and Maciek.
        • Will still have to ship N1SDP to FD.io lab, and Maciek confirmed that there will be enough rack space.
      • CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
        • Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Maciek to provide 10G switch.
        • Arm is required to present its achievements and plans to the TSC.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • SOC id will be available on /proc entry starting from kernel version 5.9
        • Will investigate the details - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • To optimize ethernet-input and avf-input node with NEON intrinsics
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 proposal
      • Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
      • Patches are upstreamed for comments
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere Altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • IPsec on Arm platform. - Govind
      • Apply prefetches with ipsec-out node
      • Work on the IPsec input node; VPP uses a linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

11/17/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Tina Tsou
  • General
  • CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • SOC id will be available on /proc entry starting from kernel version 5.9
        • Will investigate the details - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • To optimize ethernet-input and avf-input node with NEON intrinsics
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 proposal
      • Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
      • Patches are upstreamed for comments
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere Altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • IPsec on Arm platform. - Govind
      • Apply prefetches with ipsec-out node
      • Work on the IPsec input node; VPP uses a linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

11/10/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
      • L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
        • The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
        • Repeat tests on local N1SDP and cascade server. - Jieqiang
        • Repeat the test case with latest master branch. - Jieqiang
        • The patch that introduced this perf drop needs to be analyzed. - Jieqiang, Lijian
        • This patch needs to be analysed on VPP 2005 and 2001 releases. - Jieqiang, Lijian
        • The perf drop rate is ~5-8% on latest VPP code compared to the original data.
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
      • 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
        • Juraj to check with Peter about the feasibility.
        • Move the thx2 to the same rack for tg and install the same nic on tg.
        • 1g NIC for management installed on thx2, but cannot be net-booted.
          • Able to net-boot from the built-in 10G NIC.
          • The tx2 has been moved to the same rack where the tg is located.
          • Plan to set up the weekly perf tests on the new topo.
        • Port the Robot Framework configuration steps for tsh testbeds from thx1 to thx2 to speed up perf tests. - Juraj
    • VPP Path
      • 6x ThunderX1 servers in total in the Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
        • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • https://gerrit.fd.io/r/c/ci-management/+/28022 automate the generation of docker builder images.
        • Dave plans to drop support for CentOS 7.
        • Tried Dave's patch to generate docker image on Arm and saw some errors. - Juraj
        • Test arm centos7 jenkins builder image. - Juraj.
        • Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
    • VPP Device
      • CSIT will install normally used os distro and kernel.
      • 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot; the other loses network connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Revert to old kernel version 4.15.0-55 to avoid AVF issue.
        • AVF issue is common across the platform.
          • Differences between avf driver versions may be the root cause of behavior changes.
        • New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 45 minutes.
          • Python runs slower on new thx2 servers than 1-node skylake.
          • Try a newer version of Python (such as 3.8) or split the device tests into two parts.
          • Check how many CPUs get utilized for robot framework execution on thx2 server.
          • Two thunderx2 are running fine right now and the VPP device jobs are almost done.
          • Disabling hyperthreading on new thx2 will speed up the VPP device tests.
          • Enable the voting right for the VPP device jobs. - Juraj
            • Failed tests due to sw_interface_dump api issue. - Juraj
      • N1SDP shipment to FD.io
        • Get response from Maciek about the rack space and traffic generator availability.
      • CSIT budget plan for 10g switch purchase in FD.io lab. - Juraj, Tina
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
      • Summarize the meeting minutes and action items. - Lijian
      • SOC id will be available on /proc entry from kernel version 5.9 - Lijian, Honnappa
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
      • Start benchmarking AVF PMD driver in VPP on N1SDP.
      • Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
      • avf_input and avf_output nodes don't consume a lot more CPU cycles than dpdk-related nodes do.
    • SVE/SVE2 proposal
      • Will send email to Damjan asking him to review
      • SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
      • No further comments from VPP community.
      • Apply the SVE/SVE2 intrinsics on refactoring ethernet-input node. - Lijian
        • SVE/SVE2 functionality to be tested on the new development platform.
        • Verify SVE/SVE2 code changes on simulator.
        • Try to run standalone SVE codes on the new FPGA platform.
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere Altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Find out the tuned configuration for cross connect test cases using AVF PMD driver.
        • Figure out corresponding configurations in CSIT scripts.
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
      • Work on the IPsec input node; VPP uses a linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Plans
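
The performance items above quote drops as percentages (for example, about 20% and ~5-8%). A minimal sketch of how such a figure is derived from two measurements follows; `relative_drop_percent` is a hypothetical helper and is not taken from CSIT.

```python
def relative_drop_percent(baseline: float, current: float) -> float:
    """Percentage by which `current` falls below `baseline` (negative means a gain)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - current) / baseline * 100.0
```

The baseline is the older release's throughput and `current` is the newer one, so a positive result is a regression.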

11/03/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in the Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
        • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • https://gerrit.fd.io/r/c/ci-management/+/28022 automate the generation of docker builder images.
        • Test arm centos7 jenkins builder image. - Juraj.
        • Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
    • VPP Device
      • CSIT will install normally used os distro and kernel.
      • 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot; the other loses network connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Revert to old kernel version 4.15.0-55 to avoid AVF issue.
        • AVF issue is common across the platform.
          • Differences between avf driver versions may be the root cause of behavior changes.
        • New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 45 minutes.
          • Python runs slower on new thx2 servers than 1-node skylake.
          • Try a newer version of Python (such as 3.8) or split the device tests into two parts.
          • Check how many CPUs get utilized for robot framework execution on thx2 server.
          • Two thunderx2 are running fine right now and the VPP device jobs are almost done.
      • N1SDP shipment to FD.io
        • Get response from Maciek about the rack space and traffic generator availability.
      • CSIT budget plan for 10g switch purchase in FD.io lab. - Juraj, Tina
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
      • Summarize the meeting minutes and action items. - Lijian
      • SOC id will be available on /proc entry from kernel version 5.9 - Lijian, Honnappa
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
      • Start benchmarking AVF PMD driver in VPP on N1SDP.
      • Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
      • avf_input and avf_output nodes don't consume a lot more CPU cycles than dpdk-related nodes do.
    • SVE/SVE2 proposal
      • Will send email to Damjan asking him to review
      • SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
      • No further comments from VPP community.
      • Apply the SVE/SVE2 intrinsics on refactoring ethernet-input node. - Lijian
        • SVE/SVE2 functionality to be tested on the new development platform.
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Find out the tuned configuration for cross connect test cases using AVF PMD driver.
        • Figure out corresponding configurations in CSIT scripts
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
      • Work on the IPsec input node; VPP uses a linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind.
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
    • Plans
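
One option raised above for shortening the VPP device job is to split the device tests into two parts. A minimal sketch of a duration-balanced split follows, assuming per-test durations are known; `split_by_duration` and the test names are hypothetical, and this is not how CSIT partitions its suites.

```python
from typing import Dict, List, Tuple


def split_by_duration(durations: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Greedily split tests into two parts with roughly equal total duration."""
    parts: Tuple[List[str], List[str]] = ([], [])
    totals = [0.0, 0.0]
    # Longest tests first, each going to the currently lighter part.
    for name, minutes in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        target = 0 if totals[0] <= totals[1] else 1
        parts[target].append(name)
        totals[target] += minutes
    return parts
```

The greedy rule does not guarantee an optimal split, but it keeps the two parts close in total runtime, which is what matters if they run as separate jobs.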

10/27/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in the Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
        • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
    • VPP Device
      • CSIT will install normally used os distro and kernel.
      • 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot; the other loses network connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Revert to old kernel version 4.15.0-55 to avoid AVF issue.
          • Differences between avf driver versions may be the root cause of behavior changes.
        • New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 40 minutes.
          • Python runs slower on new thx2 servers than 1-node skylake.
          • Try a newer version of Python (such as 3.8) or split the device tests into two parts.
          • Check how many CPUs get utilized for robot framework execution on thx2 server.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
      • Summarize the meeting minutes and action items. - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
      • Start benchmarking AVF PMD driver in VPP on N1SDP.
      • Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
      • avf_input and avf_output nodes don't consume a lot more CPU cycles than dpdk-related nodes do.
    • SVE/SVE2 proposal
      • Will send email to Damjan asking him to review
      • SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
      • No further comments from VPP community.
      • Apply the SVE/SVE2 on ethernet-input node. - Lijian
    • Repeat the 4x and 2x loop unrolling tests on Ampere server. - Jieqiang
    • Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
        • Figure out corresponding configurations in CSIT scripts
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
      • Working on the IPsec input node; VPP uses a linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • The tuned version has higher SLC eviction than the default version; talk with the CPU team to figure out the reason.
      • Investigate prefetches in IPsec and learn about the cache behavior. - kamalakshitha
    • Plans
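
The IPsec item above mentions a linear SPD search and a plan to try loop unrolling on it. A minimal sketch of that transformation follows. The `Policy` type and the port-range match are hypothetical simplifications, not VPP's actual SPD structures, and in Python the unrolled form brings no speedup; it only shows the shape of the change that would be made in C.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Policy:
    """Hypothetical, simplified SPD entry: an inclusive port range and an action."""
    lo_port: int
    hi_port: int
    action: str


def matches(p: Policy, port: int) -> bool:
    """Return True when the port falls inside the policy's range."""
    return p.lo_port <= port <= p.hi_port


def spd_lookup_linear(spd: Sequence[Policy], port: int) -> Optional[Policy]:
    """First-match linear search: cost grows with the number of policies."""
    for p in spd:
        if matches(p, port):
            return p
    return None


def spd_lookup_unrolled4(spd: Sequence[Policy], port: int) -> Optional[Policy]:
    """Same first-match search, with the loop body unrolled four times."""
    n = len(spd)
    i = 0
    while i + 4 <= n:
        if matches(spd[i], port):
            return spd[i]
        if matches(spd[i + 1], port):
            return spd[i + 1]
        if matches(spd[i + 2], port):
            return spd[i + 2]
        if matches(spd[i + 3], port):
            return spd[i + 3]
        i += 4
    while i < n:  # remainder: fewer than four entries left
        if matches(spd[i], port):
            return spd[i]
        i += 1
    return None
```

Both functions return the same policy for any input, since unrolling changes only how many entries are checked per loop iteration, not the match order.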

10/20/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in the Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
        • First step: Dave is preparing scripts to generate Docker images automatically on both x86 and Arm.
        • The CentOS-8 on Arm Jenkins job is created and can be triggered manually with 'beta-verify' and 'beta-merge' (the Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • Check with CSIT maintainers about the concrete plan for enabling the CentOS-8 Gerrit Jenkins job on aarch64. - Juraj
        • Errors occur when running the latest VPP debug image; they were introduced by https://gerrit.fd.io/r/c/vpp/+/29490 - Lijian
    • VPP Device
      • CSIT will install the commonly used OS distro and kernel.
      • 2x ThunderX2 servers are set up in the FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Two failed test cases are related to the AVF plugin.
          • The root cause is the newer kernel version - 4.15.0-118-generic fails, 4.15.0-72-generic works.
          • Downgrade the kernel version to 4.15.0-72-generic and continue the VPP device testing.
          • Try the same experiment on x86 to see whether this issue is Arm-specific. - Juraj
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize the compilers as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • The key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires a generic VPP image that optimally supports both 64B and 128B cache-line CPUs at the same time; this cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers; need to confirm with customers. - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
      • Start benchmarking AVF PMD driver in VPP on N1SDP.
      • Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Finished the benchmarking and shared the data with the team. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
        • Figure out corresponding configurations in CSIT scripts
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
      • Investigate prefetches in IPsec and learn about the cache behavior. - kamalakshitha
    • Plans
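
The compiler priority noted above (gcc-10 > gcc-9.2.0 > clang-10) amounts to picking the first available entry from an ordered list. The sketch below illustrates that idea only; it is not VPP's build logic, and the executable names in `PREFERRED_COMPILERS` are assumptions about how those compilers are installed.

```python
import shutil
from typing import Callable, Optional, Sequence

# Preference order taken from the minutes; the executable names are assumptions.
PREFERRED_COMPILERS = ("gcc-10", "gcc-9", "clang-10")


def pick_compiler(
    candidates: Sequence[str] = PREFERRED_COMPILERS,
    is_available: Callable[[str], bool] = lambda name: shutil.which(name) is not None,
) -> Optional[str]:
    """Return the first candidate that is available, or None if none is."""
    for name in candidates:
        if is_available(name):
            return name
    return None
```

Passing `is_available` explicitly makes the selection testable without depending on which compilers the local machine has installed.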

10/13/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in the Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • First step: Dave is preparing scripts to generate Docker images automatically on both x86 and Arm.
        • The CentOS-8 on Arm Jenkins job is created and can be triggered manually with 'beta-verify' and 'beta-merge' (the Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • Check with CSIT maintainers about the concrete plan for enabling the CentOS-8 Gerrit Jenkins job on aarch64. - Juraj
    • VPP Device
      • CSIT will install the commonly used OS distro and kernel.
      • 2x ThunderX2 servers are set up in the FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Two failed test cases are related to the AVF plugin.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize the compilers as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • The key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires a generic VPP image that optimally supports both 64B and 128B cache-line CPUs at the same time; this cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers; need to confirm with customers. - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Finished the benchmarking and shared the data with the team.
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
        • Figure out corresponding configurations in CSIT scripts
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
    • Plans

10/06/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in the Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • First step: Dave is preparing scripts to generate Docker images automatically on both x86 and Arm.
        • The CentOS-8 on Arm Jenkins job is created and can be triggered manually with 'beta-verify' and 'beta-merge' (the Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
    • VPP Device
      • CSIT will install the commonly used OS distro and kernel.
      • 2x ThunderX2 servers are set up in the FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize the compilers as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • The key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires a generic VPP image that optimally supports both 64B and 128B cache-line CPUs at the same time; this cannot be satisfied so far.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Finished the benchmarking and shared the data with the team.
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Figure out corresponding configurations in CSIT scripts
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
    • Plans

09/29/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize the compilers as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • The key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires a generic VPP image that optimally supports both 64B and 128B cache-line CPUs at the same time; this cannot be satisfied so far.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Finished the benchmarking and shared the data with the team.
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Figure out corresponding configurations in CSIT scripts
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
    • Plans

09/22/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize the compilers as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • The key point is how to differentiate vendor CPUs from other Perseus CPUs
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Figure out corresponding configurations in CSIT scripts
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
    • Plans

09/15/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • LF will pay for the expense, and Vexxhost has placed or will place the order for the new RAM module.
      • Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to Vexxhost.
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
      • Check with Juraj for the latest news about the faulty RAM.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • First step: Dave is preparing scripts to generate Docker images automatically on both x86 and Arm.
        • Adding CentOS-7 on Arm will be the second step.
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
      • We can decommission the 3x SoftIron servers directly, but for the existing ThunderX2 servers the decommissioning could be temporary. We will probably reinstall them in the near future.
      • Mention the rack space request for the two ThunderX2 servers in the CSIT meeting. - Juraj
    • The two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
    • Budget plan for CSIT FD.io lab.
      • We have enough servers for VPP path & device tests.
      • We can ask the CSIT FD.io lab folks to reserve rack space for Arm servers.
      • We may plan to send new advanced servers for perf tests in the future, but we won't mention the specific server type.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize the compilers as gcc-10 > gcc-9.2.0 > clang-10
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • Vendor CPU server enablement in VPP - Lijian
      • Ready for internal review
      • Will discuss with VPP maintainer
    • Investigate VPP Intel AVF driver - Lijian
    • SVE
      • SVE intrinsics wrapper is done. Proposal patch is ready for review.
      • ACLE, architecture, sve-sve2-programming-example
      • SVE intrinsics are preferred.
      • Share SVE knowledge with the DPDK team.
    • Benchmarked VPP scalability on N1SDP, on 3x CPUs.
      • Will repeat scalability testing on N1SDP.
    • Benchmark AVF driver between Cascade Lake and N1SDP - Jieqiang
      • Will investigate AVF drivers on Arm. - Lijian
    • Jieqiang will figure out performance data for 1x, 10Kx flows on Cascade Lake in CSIT.
      • Confirm whether the system is the same for the local Dell server and the Cascade Lake server in CSIT. - Jieqiang
      • Check if there are any test cases with 1t1c/2t2c/4t4c configured for 2n-clx testbed in CSIT - Jieqiang
      • Performance data; Configurations;
    • VM2VM
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Started system tuning on PMD TX direction.
      • Investigate mempool configuration.
      • Change the descriptor size by modifying the DPDK source code.
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnel, IPsec.
    • Plans

09/08/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • LF will pay for the expense, and Vexxhost has placed or will place the order for the new RAM module.
      • Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to Vexxhost.
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • First step: Dave is preparing scripts to generate Docker images automatically on both x86 and Arm.
        • Adding CentOS-7 on Arm will be the second step.
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
      • We can decommission the 3x SoftIron servers directly, but for the existing ThunderX2 servers the decommissioning could be temporary. We will probably reinstall them in the near future.
      • Mention the rack space request for the two ThunderX2 servers in the CSIT meeting. - Juraj
    • The two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize the compilers as gcc-10 > gcc-9.2.0 > clang-10
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • SVE
      • SVE intrinsics wrapper is done. Proposal patch is ready for review.
      • ACLE, architecture, sve-sve2-programming-example
      • SVE intrinsics are preferred.
    • Benchmarked VPP scalability on N1SDP, on 3x CPUs.
      • Will repeat scalability testing on N1SDP.
    • Benchmark AVF driver between Cascade Lake and N1SDP - Jieqiang
      • Will investigate AVF drivers on Arm. - Lijian
    • Jieqiang will figure out performance data for 1x, 10Kx flows on Cascade Lake in CSIT.
      • Performance data; Configurations;
    • VM2VM
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Started system tuning on PMD TX direction.
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnel, IPsec.
    • Plans

09/01/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to Vexxhost.
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
        • It seems that plugging working RAM modules into empty slots will resolve the problem.
        • Juraj will send an email to Machiek about the ownership of the FD.io lab servers and who should pay the charge.
      • The second ThunderX1 has an IPMI problem, but SSH is working fine.
        • IPMI IP is configured via SSH Linux prompt. It's working fine now.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
        • 3x spare ThunderX servers are used for CI and included in Nomad cluster.
        • The faulty RAM on the TX server is not fixed and is yet to be debugged.
        • Pending on Vexxhost to proceed further.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
      • Mention the rack space request for the two ThunderX2 servers in the CSIT meeting. - Juraj
    • The two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
  • VPP
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 produces compile errors with the latest VPP source code.
        • Jieqiang has fixed this issue; the fix is available for internal review.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • The gcc-10 compile issue is resolved and the fix is merged.
    • SVE
      • ACLE, architecture, sve-sve2-programming-example
      • SVE intrinsics are preferred.
    • Benchmarked VPP scalability on N1SDP, on 3x CPUs.
    • VM2VM
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Started system tuning on PMD TX direction.
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnel, IPsec.
    • Plans

08/25/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
        • It seems that plugging working RAM modules into empty slots will resolve the problem.
        • Juraj will send an email to Machiek about the ownership of the FD.io lab servers and who should pay the charge.
      • The second ThunderX1 has an IPMI problem, but SSH is working fine.
        • IPMI IP is configured via SSH Linux prompt. It's working fine now.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
        • 3x spare ThunderX servers are used for CI and included in Nomad cluster.
        • The faulty RAM on the TX server is not fixed and is yet to be debugged.
        • Pending on Vexxhost to proceed further.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
      • Mention the rack space request for the two ThunderX2 servers in the CSIT meeting. - Juraj
    • The two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
  • VPP
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 produces compile errors with the latest VPP source code.
        • Jieqiang has fixed this issue; the fix is available for internal review.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • SVE
      • ACLE, architecture, sve-sve2-programming-example
      • SVE intrinsics are preferred.
    • Benchmarked VPP scalability on N1SDP, on 3x CPUs.
    • VM2VM
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnel, IPsec.
    • Plans


08/18/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • Jieqiang is investigating some cases of performance drop (between the 2005 and 2008 releases) on Taishan servers.
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
      • They have finished collecting data with the performance testing setup, and the MRR daily job is resumed
      • FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
      • Jieqiang will share the investigation report, but so far there are no apparent performance differences.
        • Some 4T4C test cases on Taishan show an obvious performance drop. Will compare the trend with x86 machines.
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
      • The second ThunderX1 has an IPMI problem, but SSH is working fine.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
        • 3x spare ThunderX servers are used for CI and included in Nomad cluster.
        • The faulty RAM on the TX server is not fixed and is yet to be debugged.
        • Pending on Vexxhost to proceed further.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
    • The two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 produces compile errors with the latest VPP source code.
        • Jieqiang has fixed this issue; the fix is available for internal review.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
    • Trying to enable IPsec on the Arm platform. - Govind
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnel, IPsec.
    • Plans


08/11/2020

  • Attendees
    • Honnappa Nagarahalli
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
    • Filip Varga
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
      • The second ThunderX1 has an IPMI problem, but SSH is working fine.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
        • 3x spare ThunderX servers are used for CI and included in Nomad cluster.
        • The faulty RAM on the TX server is not fixed and is yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 produces compile errors with the latest VPP source code.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
    • Trying to enable IPsec on the Arm platform. - Govind
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnel, IPsec.
    • Plans

08/04/2020

  • Attendees
    • Honnappa Nagarahalli
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
    • Filip Varga
  • General
  • CSIT
    • VPP Performance Test
      • They have finished collecting data with the performance testing setup, and the MRR daily job is resumed
      • FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
      • Jieqiang will share the investigation report, but so far there are no apparent performance differences.
        • Some 4T4C test cases on Taishan show an obvious performance drop. Will compare the trend with x86 machines.
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
      • The second ThunderX1 has an IPMI problem, but SSH is working fine.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
        • 3x spare ThunderX servers are used for CI and included in Nomad cluster.
        • The faulty RAM on the TX server is not fixed and is yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 has compile errors with the latest VPP source code.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
    • Trying to enable IPsec on the Arm platform. - Govind
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans


07/28/2020

  • Attendees
    • Honnappa Nagarahalli
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • They have finished collecting data with performance testing setup, and the mrr daily is resumed
      • FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
      • Jieqiang will share the investigation report, but so far there are no apparent performance differences.
      • VPP performance testing is running once a week.
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
      • The second ThunderX1 has IPMI problem, but SSH is working fine.
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • 3x spare ThunderX servers are used for CI and included in Nomad cluster.
        • Faulty RAM on TX server is not fixed and yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been upstreamed for review and merge.
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
    • Trying to enable IPsec on the Arm platform. - Govind
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans

07/21/2020

  • Attendees
    • Honnappa Nagarahalli
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • 3x spare ThunderX servers are used for CI and included in Nomad cluster.
        • 1 debugging server for VPP Dev and 3 servers (2 TX and 1 TX2) are unreachable through IPMI; one of them is reachable through SSH.
        • IPMI unreachability is still being investigated by vexxhost.
        • CI functionality is restored with spare TX servers.
        • TX2 server is unreachable through IPMI and VPP device jobs are not running.
        • Faulty RAM on TX server is not fixed and yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • Arm has
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
    • N1SDP enablement. - Lijian
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans


07/14/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • Will probably use 3xspare ThunderX1 servers as CI build server/nomad cluster.
        • Two of the three ThunderX1 servers cannot be accessed.
        • Spare ThunderX servers are used for CI and included in Nomad cluster.
        • 1 debugging server for VPP Dev and 3 servers (2 TX and 1 TX2) are unreachable through IPMI; one of them is reachable through SSH.
        • IPMI unreachability is still being investigated by vexxhost.
        • CI functionality is restored with spare TX servers.
        • TX2 server is unreachable through IPMI and VPP device jobs are not running.
        • Faulty RAM on TX server is not fixed and yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been verified by Jieqiang, will send to Dave Wallace to use it for VPP Jenkins job.
        • Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
      • By fixing software bug, VPP can boot up normally with 16K/64K page size.
        • There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark between gcc-9 and clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' will override default clang-9 in vpp.
    • N1SDP enablement. - Lijian
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
      • Investigating various numbers of rx_q_bufs & tx_q_bufs
      • Investigating various vector sizes and their effect on throughput
      • Benchmark and compare PMU counters between 4x and 2x loop unrolling on n1sdp
    • ACL optimization investigation on n1sdp - Govind
      • Investigating using SPE counters to profile ACL plugin bottleneck
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

07/07/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • Will probably use 3xspare ThunderX1 servers as CI build server/nomad cluster.
        • Two of the three ThunderX1 servers cannot be accessed.
        • Spare ThunderX servers are used for CI and included in Nomad cluster.
        • 1 debugging server for VPP Dev and 3 servers (2 TX and 1 TX2) are unreachable through IPMI; one of them is reachable through SSH.
        • IPMI unreachability is still being investigated by vexxhost.
        • CI functionality is restored with spare TX servers.
        • TX2 server is unreachable through IPMI and VPP device jobs are not running.
        • Faulty RAM on TX server is not fixed and yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been verified by Jieqiang, will send to Dave Wallace to use it for VPP Jenkins job.
        • Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
      • By fixing software bug, VPP can boot up normally with 16K/64K page size.
        • There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark between gcc-9 and clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' will override default clang-9 in vpp.
    • N1SDP enablement. - Lijian
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
      • Investigating various numbers of rx_q_bufs & tx_q_bufs
      • Investigating various vector sizes and their effect on throughput
      • Benchmark and compare PMU counters between 4x and 2x loop unrolling on n1sdp
    • ACL optimization investigation on n1sdp - Govind
      • Investigating using SPE counters to profile ACL plugin bottleneck
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang


06/30/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • Will probably use 3xspare ThunderX1 servers as CI build server/nomad cluster.
        • Two of the three ThunderX1 servers cannot be accessed.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been verified by Jieqiang, will send to Dave Wallace to use it for VPP Jenkins job.
        • Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
      • By fixing software bug, VPP can boot up normally with 16K/64K page size.
        • There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark between gcc-9 and clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' will override default clang-9 in vpp.
    • N1SDP enablement. - Lijian
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
      • Investigating various numbers of rx_q_bufs & tx_q_bufs
      • Investigating various vector sizes and their effect on throughput
      • Benchmark and compare PMU counters between 4x and 2x loop unrolling on n1sdp
    • ACL optimization investigation on n1sdp - Govind
      • Investigating using SPE counters to profile ACL plugin bottleneck
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

06/23/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • Two of the three ThunderX1 servers cannot be accessed.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been verified by Jieqiang, will send to Dave Wallace to use it for VPP Jenkins job.
        • Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
      • By fixing software bug, VPP can boot up normally with 16K/64K page size.
        • There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • L3FWD status
    • CSIT status
    • EPIC plan
      • SVE2 investigation in VPP
      • VPP hoststack TCP/CPS (Connections per Second) investigation
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark between gcc-9 and clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' will override default clang-9 in vpp.
    • N1SDP enablement. - Lijian
      • Profiling with NMU-600 counters.
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput performance drop as flow number increases in N1SDP.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

06/16/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload. The docker file needs to be tested with local VPP sandbox before uploading. The docker file needs to be labelled by Dave Wallace to use it for VPP Jenkins job.
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
        • gcc-9 is hard-coded and used, so compilation issue is gone.
      • By fixing software bug, VPP can boot up normally with 16K/64K page size.
        • There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark between gcc-9 and clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' will override default clang-9 in vpp.
    • N1SDP enablement. - Lijian
      • Profiling with NMU-600 counters.
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput performance drop as flow number increases in N1SDP.
    • ACL optimization investigation on N1SDP - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a Confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
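
The compiler notes above (clang-9 not accepting -mcpu=neoverse-n1/-mtune=neoverse-n1, and CC=gcc overriding the default) suggest probing a toolchain before choosing it. A minimal sketch, assuming the compiler binaries are on PATH. The helper name compiler_accepts_flag is hypothetical and is not part of the VPP build system:

```python
import subprocess


def compiler_accepts_flag(cc: str, flag: str) -> bool:
    """Return True if `cc` accepts `flag` when checking an empty C translation unit."""
    try:
        result = subprocess.run(
            [cc, flag, "-fsyntax-only", "-x", "c", "-"],
            input=b"",                  # empty source read from stdin
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Compiler not installed (or it hung): treat the flag as not usable.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Compiler names taken from the minutes; any of them may be missing locally.
    for cc in ("clang-9", "clang-10", "gcc-9"):
        print(cc, compiler_accepts_flag(cc, "-mcpu=neoverse-n1"))
```

The probe only checks that the flag is accepted; it says nothing about the performance of the generated code, which is what the gcc-9 versus clang-10 benchmark is meant to settle.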

06/09/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community will collect performance data with these CSIT machines.
      • IPSec tunnel configuration issue.
        • Issue is resolved.
        • Also failing on x86. CSIT maintainer is trying to root cause the problem.
          • Juraj to run the IPSec regression on Taishan server with the IPSec patch.
    • VPP Path
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload. The docker file needs to be tested with a local VPP sandbox before uploading. The docker file needs to be labelled by Dave Wallace before it can be used for the VPP Jenkins job.
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
        • gcc-9 is hard-coded and used, so compilation issue is gone.
      • After fixing a software bug, VPP can boot up normally with 16K/64K page size.
        • There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
    • VPP Device
      • Two ThunderX2 information will be confirmed with FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If Vexxhost can collect the hardware, will ship the servers ASAP.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' will override default clang-9 in vpp.
    • N1SDP enablement. - Lijian
      • Profiling with NMU-600 counters.
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput performance drop as flow number increases in N1SDP.
    • ACL optimization investigation on N1SDP - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a Confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
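
The 16K/64K page-size item above does not say why the 'make test' cases fail. One common source of such failures in general is code that assumes 4K pages when sizing or aligning memory. A minimal sketch of page-size-aware rounding; the function names are hypothetical and are not taken from VPP:

```python
import os


def round_up_to_page(length: int, page_size: int) -> int:
    """Round `length` up to the next multiple of `page_size` (a power of two)."""
    if page_size <= 0 or page_size & (page_size - 1):
        raise ValueError("page_size must be a positive power of two")
    return (length + page_size - 1) & ~(page_size - 1)


def system_page_size() -> int:
    """Page size of the running kernel in bytes, queried instead of hard-coding 4096."""
    return os.sysconf("SC_PAGE_SIZE")


if __name__ == "__main__":
    page = system_page_size()
    # The same 6000-byte request rounds to different sizes depending on the page size.
    print(page, round_up_to_page(6000, page))
```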

06/02/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • IPSec tunnel configuration issue.
        • Also failing on x86. CSIT maintainer is trying to root cause the problem.
          • Juraj to run the IPSec regression on Taishan server with the IPSec patch.
    • VPP Path
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload. The docker file needs to be tested with a local VPP sandbox before uploading. The docker file needs to be labelled by Dave Wallace before it can be used for the VPP Jenkins job.
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
        • gcc-9 is hard-coded and used, so compilation issue is gone.
      • After fixing a software bug, VPP can boot up normally with 16K/64K page size.
        • There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
    • VPP Device
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' will override default clang-9 in vpp.
    • N1SDP enablement. - Lijian
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput performance drop as flow number increases in N1SDP.
    • ACL optimization investigation on N1SDP - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a Confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

05/26/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • IPSec tunnel configuration issue.
        • Also failing on x86. CSIT maintainer is trying to root cause the problem.
          • Juraj to run the IPSec regression on Taishan server with the IPSec patch.
    • VPP Path
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload. The docker file needs to be tested with a local VPP sandbox before uploading. The docker file needs to be labelled by Dave Wallace before it can be used for the VPP Jenkins job.
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
        • gcc-9 is hard-coded and used, so compilation issue is gone.
      • After fixing a software bug, VPP can boot up normally with 16K/64K page size.
        • There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
    • VPP Device
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers. Jieqiang will set up a meeting with Juraj regarding this documentation.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
    • N1SDP enablement. - Lijian
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput performance drop as flow number increases in N1SDP.
    • ACL optimization investigation on N1SDP - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a Confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang


05/19/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • The other failure is related to the VPP image on Arm: an IPSec tunnel configuration issue.
        • Also failing on x86. CSIT maintainer is trying to root cause the problem.
    • VPP Path
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
      • After fixing a software bug, VPP can boot up normally with 16K/64K page size.
        • There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
    • VPP Device
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
    • N1SDP enablement. - Lijian
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput performance drop as flow number increases in N1SDP.
    • ACL optimization investigation on N1SDP - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a Confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

04/28/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • Resolve VPP compiling issue with clang-6.
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
    • N1SDP enablement. - Lijian
      • Multi-arch, arch-specific compiling and dynamic function selection patch is merged.
      • IOMMU limitation issue is gone after upgrading the kernel and firmware.
        • Share the kernel/firmware upgrade versions with Govind
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput performance drop as flow number increases in N1SDP.
    • ACL optimization investigation on N1SDP - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a Confluence page to record the ACL investigation.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

04/28/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
    • Arthur Marshall
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
    • Investigate bihash operations, which are hot spots in L2 throughput
      • Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
      • Confirm whether the CRC32 calculation is compiled with O2 or O3 options - Lijian
    • Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
    • N1SDP enablement. - Lijian
    • GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
      • Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
      • It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
    • The iova_mode == VA failure is not yet root-caused
      • DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
      • However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
        • This issue will not be seen in the latest N1 firmware. Upgrading to the latest firmware is pending.
    • Share L2/L3/ACL throughput with and without L3 cache - Govind
    • Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
      • The degradation is seen even when L3 cache is enabled.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Create Confluence page to record all the performance benchmarking data - Lijian
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
        • Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss with Juraj to run the jenkins job on CentOS.
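
The iova_mode == VA item above can be illustrated with a small range check: when the virtual address is used directly as the IOVA, every DMA buffer has to fall inside the address width the IOMMU can translate. A minimal sketch, assuming the 40-bit limit that the minutes mention only as a question. The function name and the sample addresses are hypothetical:

```python
def fits_iova_width(addr: int, length: int, iova_bits: int = 40) -> bool:
    """Return True if the buffer [addr, addr + length) lies below 2**iova_bits."""
    if addr < 0 or length <= 0:
        raise ValueError("addr must be non-negative and length positive")
    return addr + length <= (1 << iova_bits)


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    low_va = 0x10_0000_0000       # 2**36, inside a 40-bit space
    high_va = 0xFFFF_8000_0000    # high user-space address in a 48-bit VA layout
    hugepage = 2 << 20            # 2 MiB buffer
    print(fits_iova_width(low_va, hugepage), fits_iova_width(high_va, hugepage))
```

A buffer mapped high in a 48-bit virtual address space fails this check, which is consistent with VA-as-IOVA breaking while a PA-based mapping may still work.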

04/21/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Investigate bihash operations, which are hot spots in L2 throughput
      • Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
      • Confirm whether the CRC32 calculation is compiled with O2 or O3 options - Lijian
    • Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
    • N1SDP enablement. - Lijian
    • gcc-10 is not working so far.
      • GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
      • Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
      • It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
    • The iova_mode == VA failure is not yet root-caused
      • DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
      • However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
        • This issue will not be seen in the latest N1 firmware. Upgrading to the latest firmware is pending.
    • Share L2/L3/ACL throughput with and without L3 cache - Govind
    • Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
      • The degradation is seen even when L3 cache is enabled.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Create Confluence page to record all the performance benchmarking data - Lijian
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
        • Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss with Juraj to run the jenkins job on CentOS.
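
The dual versus quad loop-unrolling observation above refers to how many packets a node handles per loop iteration. A minimal sketch of the control-flow shape only, with hypothetical names. Python shows the structure but not the performance effect, which in C comes from writing the per-packet body out explicitly so the compiler can interleave the work:

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def process_unrolled(items: Sequence[T], handle: Callable[[T], R], unroll: int = 2) -> List[R]:
    """Process `items` in groups of `unroll`, then finish the remainder one at a time."""
    if unroll < 1:
        raise ValueError("unroll must be at least 1")
    out: List[R] = []
    i = 0
    n = len(items)
    # Main loop: `unroll` items per iteration (dual loop for 2, quad loop for 4).
    while n - i >= unroll:
        for j in range(unroll):
            out.append(handle(items[i + j]))
        i += unroll
    # Tail loop: fewer than `unroll` items remain.
    while i < n:
        out.append(handle(items[i]))
        i += 1
    return out
```

Both unroll factors produce identical results; only the grouping differs, so any throughput difference has to come from how the hardware and compiler handle the larger loop body.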

04/14/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Investigate bihash operations, which are hot spots in L2 throughput
      • Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
      • Confirm whether the CRC32 calculation is compiled with O2 or O3 options - Lijian
    • Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
    • N1SDP enablement. - Lijian
      • GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
      • Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
      • It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
    • The iova_mode == VA failure is not yet root-caused
      • DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
      • However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
        • This issue will not be seen in the latest N1 firmware. Upgrading to the latest firmware is pending.
    • Share L2/L3/ACL throughput with and without L3 cache - Govind
    • Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
      • The degradation is seen even when L3 cache is enabled.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Create Confluence page to record all the performance benchmarking data - Lijian
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
        • Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss with Juraj to run the jenkins job on CentOS.

04/07/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Vectorization
    • Investigate bihash operations, which are hot spots in L2 throughput
      • Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
      • Confirm whether the CRC32 calculation is compiled with O2 or O3 options - Lijian
    • Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
    • N1SDP enablement. - Lijian
      • GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
      • Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
      • It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
    • The iova_mode == VA failure is not yet root-caused
      • DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
      • However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
        • This issue will not be seen in the latest N1 firmware. Upgrading to the latest firmware is pending.
    • Share L2/L3/ACL throughput with and without L3 cache - Govind
    • Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
      • The degradation is seen even when L3 cache is enabled.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Create Confluence page to record all the performance benchmarking data - Lijian
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
        • Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss with Juraj to run the jenkins job on CentOS.

03/31/2020

03/24/2020

03/17/2020

03/10/2020

03/03/2020

02/25/2020


02/18/2020


02/11/2020


02/04/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Juraj Linkes
    • Tina Tsou
  • General
  • CSIT
  • FD.io lab
    • Two ThunderX2 servers are installed in Arm lab, but the Intel NIC cannot be enumerated on one of them (ThunderX2-01).
      • Cables for Intel NICs have been ordered.
      • Universal rails will be tried with ThunderX2 servers. If they work, will send the rails to FD.io lab.
    • Script/commands to verify the NICs are ready. Will try with the Mellanox NIC on ThunderX2-02 first - Lijian
    • Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab - Juraj
    • Current Configurations:
      • RAM: 256G
      • Disk: 480G SSD
      • The boxes are coming with Qlogic cards which are not supported in VPP.
    • Changes required to the servers:
      • The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
      • Need 2 Intel NICs XL710-QDA2 for each server.
      • If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
      • Disk size to 480G
      • Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
      • Cables: N1, P1 to N2, P1 and so on
      • Cables for IPMI and Management port: 2
    • Will call a meeting once the ThunderX2 servers reach Dean, inviting Juraj and the Vexxhost people in FD.io lab to make sure the hardware is ready.
    • Require and install a 1G NIC for the management port because of the 1G management switch. - Lijian
    • Require two ThunderX2 servers (currently only one ThunderX2 in the lab) in FD.io lab
    • Require one more ThunderX2 to form a normal cluster (1 x ThunderX + 2 x ThunderX2), to enable voting rights for Arm servers in CSIT
    • ThunderX1
  • VPP
    • Align Arm patches with VPP release plan.
      • F0 2020-01-08 APIs frozen. Only low-risk changes accepted on main branch.
      • RC1 2020-01-15 (F0+7) Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
      • RC2 2020-01-22 (RC1+7) Second artifacts posted.
      • Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
    • Vectorization
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Benchmarking AVF drivers on Arm servers - Jieqiang
      • VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
      • Check whether the performance tests include the AVF driver.
    • AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
      • Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
      • Will try one patch to enable N1SDP board.
      • Please try AVF with Mcbin if possible.
    • Investigate bihash operations, which are hot-spots in L2 throughput
      • Applied prefetches and dual loop to the l2_fwd node; applying them to l2_learn failed
      • Lock-free allocation/free gives 7%-11% improvement on ThunderX2, but no improvement on x86 and Cortex-A72.
      • Cache misses and CRC32 calculation are possible opportunities.
        • To check cycles by applying CRC32 calculation unrolling
    • Benchmark VPP on Dawn N1SDP board
      • Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
    • Investigating bi-hash lockless implementation - Jason
      • First apply make_working_copy to all bihash update operations, then apply RCU to make look-ups lock-less
    • Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
    • EPIC for next quarter:
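
The 20.01 milestones in the release plan above are each annotated as seven days after the previous one (F0+7, RC1+7, RC2+7). A minimal sketch of that date arithmetic follows; the function name `release_milestones` is made up for illustration and is not part of any VPP tooling.

```python
from datetime import date, timedelta

# F0 date copied from the minutes; the 7-day spacing follows the
# "(F0+7)", "(RC1+7)" and "(RC2+7)" annotations in the release plan.
F0 = date(2020, 1, 8)
STEP = timedelta(days=7)


def release_milestones(f0, step=STEP):
    """Return (name, date) pairs for one release cycle, in order."""
    names = ["F0", "RC1", "RC2", "Formal Release"]
    return [(name, f0 + i * step) for i, name in enumerate(names)]


if __name__ == "__main__":
    for name, day in release_milestones(F0):
        print(f"{name}: {day.isoformat()}")
```

Changing `F0` is enough to re-plan Arm patch deadlines for a later release, assuming the same weekly cadence holds.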

01/28/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
  • General
  • CSIT
  • FD.io lab
    • Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one of them (ThunderX2-01).
      • Cables for Intel NICs have been ordered.
      • Universal rails will be tried with ThunderX2 servers. If they work, will send the rails to FD.io lab.
    • Script/commands to verify the NICs are ready. Will try with the Mellanox NIC on ThunderX2-02 first - Lijian
    • Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
    • Current Configurations:
      • RAM: 256G
      • Disk: 480G SSD
      • The boxes are coming with Qlogic cards which are not supported in VPP.
    • Changes required to the servers:
      • The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
      • Need 2 Intel NICs XL710-QDA2 for each server.
      • If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
      • Disk size to 480G
      • Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
      • Cables: N1, P1 to N2, P1 and so on
      • Cables for IPMI and Management port: 2
    • Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab, to make sure the hardware is ready.
    • Require and install a 1G NIC for the management port because of the 1G management switch. - Lijian
    • Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
    • ThunderX1
  • VPP
    • Align Arm patches with VPP release plan.
      • F0 2020-01-08 APIs frozen. Only low-risk changes accepted on main branch.
      • RC1 2020-01-15 (F0+7) Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
      • RC2 2020-01-22 (RC1+7) Second artifacts posted.
      • Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
    • Vectorization
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Benchmarking AVF drivers on Arm servers - Jieqiang
      • VPP+DPDK (5.18Mpps/5.21Mpps) vs. VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
      • Check whether the performance tests include the AVF driver.
    • AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
      • Current N1SDP does not support SR-IOV, so AVF cannot run on N1SDP.
      • Will try one patch to enable N1SDP board.
      • Please try AVF with Mcbin if possible.
    • Investigate bihash operations, which are hot-spots in L2 throughput
      • Applied prefetches and dual loop to the l2_fwd node; applying them to l2_learn failed
      • Lock-free allocation/free gives 7%-11% improvement on ThunderX2, but no improvement on x86 and Cortex-A72.
      • Cache misses and CRC32 calculation are possible opportunities.
        • To check cycles by applying CRC32 calculation unrolling
    • Benchmark VPP on Dawn N1SDP board
      • Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
    • Investigating bi-hash lockless implementation - Jason
      • First apply make_working_copy to all bihash update operations, then apply RCU to make look-ups lock-less
    • Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
    • EPIC for next quarter:
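
For the AVF benchmark noted above, the relative gain of VPP+AVF over VPP+DPDK on ThunderX2 can be worked out directly from the recorded figures. The sketch below assumes the first figure of each pair belongs with the first of the other pair (and likewise the second); the minutes do not say what the two figures in each pair represent.

```python
# Throughput figures (Mpps) copied from the ThunderX2 benchmark line in the minutes.
DPDK_MPPS = (5.18, 5.21)
AVF_MPPS = (8.39, 8.38)


def relative_gain_percent(baseline, candidate):
    """Percentage by which candidate exceeds baseline."""
    return (candidate / baseline - 1.0) * 100.0


# Assumption: figures are paired by position (first with first, second with second).
gains = [relative_gain_percent(d, a) for d, a in zip(DPDK_MPPS, AVF_MPPS)]

if __name__ == "__main__":
    for d, a, g in zip(DPDK_MPPS, AVF_MPPS, gains):
        print(f"DPDK {d} Mpps -> AVF {a} Mpps: +{g:.1f}%")
```

Under that pairing, AVF comes out roughly 60% ahead of DPDK in both cases.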

01/21/2020


01/14/2020

01/07/2020


12/17/2019

12/10/2019


12/03/2019

11/26/2019

11/19/2019

11/12/2019

10/29/2019

10/22/2019

10/15/2019

10/08/2019

10/01/2019

09/24/2019

09/17/2019

09/10/2019

09/03/2019

08/27/2019

08/20/2019

08/13/2019

08/06/2019

07/30/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
    • Christian Hopps
  • General
  • CSIT
  • FD.io lab
  • VPP
    • https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
    • Align Arm patches with VPP release plan.
      • Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
      • Will check the VPP release schedule and map it to the Arm quarterly plan.
      • Note down patches in community review and align them to VPP release plan.
      • It has been challenging to do that in VPP.
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue: optimize it with relaxed atomic intrinsics - Lijian
    • Vectorization
      • The patch optimizing eth_input_adv_and_flags_x4 is upstreamed and under community review.
      • The patch is also enabled for x86. Will ask maintainer to review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
      • Spinlock with inner loop shows improvement on both x86 and Arm.
      • Read/write lock shows slight degradation with the patch.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
      • Jieqiang checked the video by Sirshak
    • Apply dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
      • Will do bench-marking profiling on mcbin/Bluefield.
    • Think of memory usage and optimization for smaller device/memory
    • Think about running VPP on big-endian CPUs, as there is mixed use of GCC vector extensions and vector intrinsics
    • Confirm what is needed on the firmware and VPP side to enable mcbin cache stashing - Honnappa
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
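
The dual/quad optimization mentioned above processes several packets per loop iteration and then finishes any leftovers one at a time. The sketch below shows only that control flow; `process_one` and its "decrement a counter" work are hypothetical stand-ins, and it is not VPP code. In a real C graph node, the benefit would come from the compiler and CPU overlapping the independent per-packet work, which Python does not demonstrate.

```python
def process_one(pkt):
    """Hypothetical per-packet work: here, just decrement a counter value."""
    return pkt - 1


def process_vector_quad(packets):
    """Process packets four at a time, then finish the remainder one by one."""
    out = []
    i = 0
    n = len(packets)
    # Quad loop: handle four packets per iteration.
    while n - i >= 4:
        p0, p1, p2, p3 = packets[i:i + 4]
        out.extend((process_one(p0), process_one(p1),
                    process_one(p2), process_one(p3)))
        i += 4
    # Single loop: handle the 0-3 leftover packets.
    while i < n:
        out.append(process_one(packets[i]))
        i += 1
    return out
```

The result is identical to a plain one-at-a-time loop; only the iteration structure differs.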

07/23/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
    • Christian Hopps
  • General
  • CSIT
    • VPP Performance Test
    • Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
    • Trying to fix all the failures in the daily performance test. Almost all the tests passed locally.
    • Only 1 out of 199 test cases failed; 8 test cases show random 'show interface' failures.
    • Some failures are related to time-outs of 'show hardware'/'show interface'/'show vhost dump'.
    • Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
      • creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • VPP Path
      • Working on MAC learning test failures on Cortex-A72 server - Jieqiang
        • Increasing the duration fixes the failure, but will investigate in more detail.
        • Issues have been fixed in latest master branch. Investigating the details.
      • cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
        • Send email and current debug details to community calling for volunteer to fix it. - Lijian
        • pmalloc module test cases failed on Arm server.
      • Changes are uploaded to community gerrit.
      • VPP VMs seem to come up well. Will work on the init script and bring up VPP.
        • VM tests passed. Patches are to be submitted for community review.
        • All the patches are merged and all images are built.
      • Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
      • Ed to help set up Nomad cluster with dual ThunderX and one ThunderX2
  • FD.io lab
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue: optimize it with relaxed atomic intrinsics - Lijian
    • Vectorization
      • The patch optimizing eth_input_adv_and_flags_x4 is upstreamed and under community review.
      • The patch is also enabled for x86. Will ask maintainer to review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
      • Spinlock with inner loop shows improvement on both x86 and Arm.
      • Read/write lock shows slight degradation with the patch.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
      • Inform MAP owner that Jieqiang will take care of MAP on VPP. - Lijian
    • Apply dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
      • Will do bench-marking profiling on mcbin/Bluefield.
    • Think of memory usage and optimization for smaller device/memory
    • Think about running VPP on big-endian CPUs, as there is mixed use of GCC vector extensions and vector intrinsics
    • Confirm what is needed on the firmware and VPP side to enable mcbin cache stashing - Honnappa
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

07/16/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
    • Christian Hopps
  • General
  • CSIT
    • VPP Performance Test
    • Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
    • Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
      • creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • VPP Path
      • Working on MAC learning test failures on Cortex-A72 server - Jieqiang
        • Increasing the duration fixes the failure, but will investigate in more detail.
        • Issues have been fixed in latest master branch. Investigating the details.
      • cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
        • Send email and current debug details to community calling for volunteer to fix it. - Lijian
      • Changes are uploaded to community gerrit.
      • VPP VMs seem to come up well. Will work on the init script and bring up VPP.
        • VM tests passed. Patches are to be submitted for community review.
        • The patch is split into three smaller pieces. Two patches (kernel image for the VM test, and generic CSIT changes to support the ThunderX2 testbed) are merged. The third patch, containing the Arm-specific code changes for the VM test that use the kernel image, is still to be merged.
      • Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
        • Docker images for both Arm and x86 are merged and available.
        • The Docker image is verified on an Arm server; it still needs to be verified on an x86 server and tried in Jenkins.
      • Ed to help set up Nomad cluster with dual ThunderX and one ThunderX2
  • FD.io lab
    • Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
      • It’s a 1RU blade ThunderX2.
      • The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
      • The machine should have a large amount of RAM: more than 120G, with 256G preferred.
      • The machine should have three NICs (XL710-QDA2, 2x40G).
      • The script assumes the two ThunderX2 have the same NIC type and the same fiber SFP type, and that the NICs are plugged into the same PCI slots.
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue: optimize it with relaxed atomic intrinsics - Lijian
    • Vectorization
      • The patch optimizing eth_input_adv_and_flags_x4 is upstreamed and under community review.
      • The patch is also enabled for x86. Will ask maintainer to review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
      • Spinlock with inner loop shows improvement on both x86 and Arm.
      • Read/write lock shows slight degradation with the patch.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Apply dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
      • Will do bench-marking profiling on mcbin.
    • Think of memory usage and optimization for smaller device/memory
    • Think about running VPP on big-endian CPUs, as there is mixed use of GCC vector extensions and vector intrinsics
    • Confirm what is needed on the firmware and VPP side to enable mcbin cache stashing - Honnappa
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
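
The lab notes above say the testbed script assumes both ThunderX2 servers have the same NIC type, the same fiber SFP type, and NICs in the same PCI slots. A minimal sketch of how that assumption could be checked before a server is shipped follows. The inventory layout, slot names and SFP labels are hypothetical and are not taken from the CSIT scripts.

```python
def nic_mismatches(server_a, server_b):
    """Return human-readable differences between two NIC inventories.

    Each inventory maps a PCI slot name to a (nic_model, sfp_type) tuple.
    """
    problems = []
    for slot in sorted(set(server_a) | set(server_b)):
        a = server_a.get(slot)
        b = server_b.get(slot)
        if a is None or b is None:
            problems.append(f"slot {slot}: populated on only one server")
        elif a != b:
            problems.append(f"slot {slot}: {a} != {b}")
    return problems


# Hypothetical inventories; slot names and SFP labels are illustrative only.
tx2_01 = {"slot1": ("XL710-QDA2", "QSFP+"), "slot2": ("XL710-QDA2", "QSFP+")}
tx2_02 = {"slot1": ("XL710-QDA2", "QSFP+"), "slot3": ("XL710-QDA2", "QSFP+")}

if __name__ == "__main__":
    for problem in nic_mismatches(tx2_01, tx2_02):
        print(problem)
```

An empty result means the two inventories match slot for slot, which is the condition the script relies on.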

07/09/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
    • Christian Hopps
  • General
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
    • Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
      • creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • VPP Path
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
        • Send email and current debug details to community calling for volunteer to fix it. - Lijian
      • Changes are uploaded to community gerrit.
      • VPP VMs seem to come up well. Will work on the init script and bring up VPP.
        • VM tests passed. Patches are to be submitted for community review.
      • Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
        • Docker images for both Arm and x86 are merged and available.
      • Ed to help set up Nomad cluster with dual ThunderX and one ThunderX2
  • FD.io lab
    • Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
      • Update the current status to Pravin. - Lijian
      • The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
      • Require more than 120G of RAM; 256G preferred
      • Three NICs and each has two ports.
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue: optimize it with relaxed atomic intrinsics - Lijian
    • Vectorization
      • The patch optimizing eth_input_adv_and_flags_x4 is upstreamed and under community review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
      • Spinlock with inner loop shows improvement on both x86 and Arm.
      • Read/write lock shows slight degradation with the patch.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Apply dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
      • Will do bench-marking profiling on mcbin.
    • Think of memory usage and optimization for smaller device/memory
    • Think about running VPP on big-endian CPUs, as there is mixed use of GCC vector extensions and vector intrinsics
    • Confirm what is needed on the firmware and VPP side to enable mcbin cache stashing - Honnappa
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective


07/02/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
  • General
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
    • Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
      • creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • VPP Path
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
        • Send email and current debug details to community calling for volunteer to fix it. - Lijian
      • VPP VMs seem to come up well. Will work on the init script and bring up VPP.
      • Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
      • Set up Nomad cluster with dual ThunderX and one ThunderX2
  • FD.io lab
    • Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
      • Update the current status to Pravin. - Lijian
      • The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
      • Require more than 120G of RAM; 256G preferred
      • Three NICs and each has two ports.
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue, remove atomic intrinsics and use lock version only - Lijian
      • Have verified removing atomic intrinsics in message_queue alloc/free APIs, and require confirmation from Florin.
    • Vectorization
      • The patch optimizing eth_input_adv_and_flags_x4 is upstreamed and under community review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
    • Fix ip4_forward compiling - Jason
      • Will check gerrit CI/CD related with that patch. Check why it's not warning in gerrit Jenkins.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Spread dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Will do bench-marking profiling on mcbin.
    • Think of memory usage and optimization for smaller device/memory
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

06/25/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
  • General
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
    • Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
      • creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • VPP Path
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
      • Crypto test cases will use the DPDK driver if configured, otherwise the native VPP implementation, with OpenSSL as the fallback
        • Will try Crypto test cases next week - Juraj
      • Juraj to send Lijian the details of vpp VMs, Lijian will confirm internally
  • FD.io lab
    • Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
      • Firstly will sponsor the machine
      • The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
      • Require more than 120G of RAM; 256G preferred
      • Three NICs and each has two ports.
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue, remove atomic intrinsics and use lock version only - Lijian
      • Have verified removing atomic intrinsics in message_queue alloc/free APIs, and require confirmation from Florin.
    • Vectorization
      • The patch optimizing eth_input_adv_and_flags_x4 is upstreamed and under community review.
    • Spinlock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Spread dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Will do bench-marking profiling on mcbin.
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

06/18/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina Tsou
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
  • General
  • CSIT
  • FD.io lab
    • Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue, understand use case with svm queue, talk the ideas with Florin - ongoing - Lijian
      • Will confirm with Florin to remove atomic intrinsics in message_queue alloc/free APIs
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - Upstreamed.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Investigate hyperscan plugin in VPP - Sirshak
    • Spread dual/quad optimization - ethernet-input
    • Redo perf/MAP profiling/bench-marking
      • DPI plugin?
    • EPIC for next quarter:
      • Apply dual/quad optimization on more data path nodes
      • Investigate and optimize VPP hash and bihash library
      • VPP translation overhead analysis between Mbuf and VLIB buffer ENTNET-1293
      • VPP Memif performance analysis and optimization ENTNET-1292
      • VPP l3fwd performance analysis and optimization ENTNET-751
      • Using MAP with VPP ENTNET-1288

06/11/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina Tsou
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj
  • General
  • CSIT
  • FD.io lab
    • Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue, understand use case with svm queue, talk the ideas with Florin - ongoing - Lijian
      • Will confirm with Florin to remove atomic intrinsics in message_queue alloc/free APIs
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - Upstreamed.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Investigate hyperscan plugin in VPP - Sirshak
    • Spread dual/quad optimization - ethernet-input
    • Redo perf/MAP profiling/bench-marking
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

06/04/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina Tsou
    • Lijian Zhang
    • Jieqiang Wang
    • Stan
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue, understand use case with svm queue, talk the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - Upstreamed.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Investigate hyperscan plugin in VPP - Sirshak
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

05/28/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
    • Lijian Zhang
    • Vijay (vijayakumar.rajamanickam@nokia.com)
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue, understand use case with svm queue, talk the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin is showing unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, all traffic from a NIC with promiscuous mode enabled falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - internal Review.
    • MAP with VPP - Tried internal patch; still failing. Continuing to work on it.
    • Investigate hyperscan plugin in VPP - Sirshak
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
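
The host-stack item above about reducing ATOMIC_ACQUIRE atomics in foreach_device_and_queue can be illustrated with a minimal sketch. It assumes a hypothetical per-queue flag (queue_t, poll_acquire and poll_relaxed are illustrative names, not VPP code) and only contrasts an acquire load with a relaxed load in a polling loop. A relaxed load is safe only when the flag does not publish other data that the reader then consumes.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-queue state, for illustration only (not a VPP structure). */
typedef struct {
    atomic_bool enabled;  /* written by a control thread */
    unsigned polled;      /* touched only by the polling worker */
} queue_t;

/* Acquire load per queue: on AArch64 this typically becomes a
 * load-acquire instruction, which orders later memory accesses. */
unsigned poll_acquire(queue_t *q, size_t n)
{
    unsigned count = 0;
    for (size_t i = 0; i < n; i++) {
        if (atomic_load_explicit(&q[i].enabled, memory_order_acquire)) {
            q[i].polled++;
            count++;
        }
    }
    return count;
}

/* Relaxed load per queue: a plain load with no ordering guarantee.
 * Valid only if 'enabled' does not guard data written by another thread. */
unsigned poll_relaxed(queue_t *q, size_t n)
{
    unsigned count = 0;
    for (size_t i = 0; i < n; i++) {
        if (atomic_load_explicit(&q[i].enabled, memory_order_relaxed)) {
            q[i].polled++;
            count++;
        }
    }
    return count;
}
```

Both functions return the same count; the difference is only in the ordering the loads impose, which is where the cost on a weakly ordered core comes from.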

05/21/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
    • Lijian Zhang
    • Vijay (vijayakumar.rajamanickam@nokia.com)
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin is showing unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, all traffic from a NIC with promiscuous mode enabled falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - internal Review.
    • MAP with VPP - Tried internal patch; still failing. Continuing to work on it.
    • Investigate hyperscan plugin in VPP - Sirshak
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

05/14/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
    • Lijian Zhang
    • Vijay (vijayakumar.rajamanickam@nokia.com)
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • VPP generic distro package building patch - Patch updated. Require Damjan's follow up review.
    • Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin is showing unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, all traffic from a NIC with promiscuous mode enabled falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - internal Review.
    • MAP with VPP - Tried internal patch; still failing. Continuing to work on it.
    • Investigate hyperscan plugin in VPP - Sirshak
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

05/07/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
    • Lijian Zhang
    • Vijay (vijayakumar.rajamanickam@nokia.com)
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
      • Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input.
    • VPP generic distro package building patch - Patch updated; Damjan's follow-up review required.
    • Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin is showing unstable performance.
    • Vectorization
      • Vectorization in esp-encrypt, optimize memcpy_le. Upstreamed (https://gerrit.fd.io/r/#/c/18398/). - Lijian
      • ethernet-input causes performance drop on AArch64.
        • Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, all traffic from a NIC with promiscuous mode enabled falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - internal Review.
    • MAP (Arm proprietary performance analysis tool) with VPP - Tried internal patch; still failing. Continuing to work on it.
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

04/30/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
      • Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input.
    • VPP generic distro package building patch - Patch updated; Damjan's follow-up review required.
    • Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin is showing unstable performance.
    • Vectorization
      • Vectorization in esp-encrypt, optimize memcpy_le. Upstreamed (https://gerrit.fd.io/r/#/c/18398/). - Lijian
      • ethernet-input causes performance drop on AArch64.
        • Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, all traffic from a NIC with promiscuous mode enabled falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - internal Review.
    • MAP (Arm proprietary performance analysis tool) with VPP - Tried internal patch; still failing. Continuing to work on it.
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

04/23/2019

  • Attendees
    • Sirshak Das
    • Lijian Zhang
    • Juraj Linkeš
    • Vijay
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but could not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are well connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
      • Investigate why these three blades have only one numa node - Juraj
  • VPP
    • Investigate session_queue_node_fn/vlib_worker_loop.
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
      • Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input
    • Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
    • Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
    • Vectorization
    • TAS patch will be ready soon (Sirshak)
    • MAP with VPP is ongoing - Sirshak
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
  • Action Items - Last Week
  • Action Items - Next Week

04/16/2019

  • Attendees
    • Sirshak Das
    • Lijian Zhang
    • Juraj Linkeš
    • Vijay
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but could not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are well connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
      • Investigate why these three blades have only one numa node - Juraj
  • VPP
    • Investigate session_queue_node_fn/vlib_worker_loop. - https://jira.arm.com/browse/ENTNET-1179 - Done
      • Will create two Jira tickets to track the findings. - Lijian
    • Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
    • Investigating message queue - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
    • Vectorization
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
  • Action Items - Last Week
  • Action Items - Next Week

04/09/2019

  • Attendees
    • Sirshak Das
    • Lijian Zhang
    • Juraj Linkeš
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but could not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are well connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
      • Investigate why these three blades have only one numa node - Juraj
  • VPP
  • VPP Hoststack
    • Investigate session_queue_node_fn/vlib_worker_loop. - https://jira.arm.com/browse/ENTNET-1179 - Done
    • Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
    • Investigating message queue - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
    • Vectorization
      • Vectorization in esp-encrypt, https://gerrit.fd.io/r/#/c/18398/ - Improvement on ThunderX/OcteonTX/Taishan, but degradation on ThunderX2 - Lijian
      • ethernet-input - will implement for aarch64 128bits only
      • Create vectorization specific EPIC - Lijian
  • Action Items - Last Week
  • Action Items - Next Week

04/02/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After dedicated CPUs were assigned to VPP main, VPP worker, and the iperf3 server, the VPP hoststack performs better than the Linux stack on both ThunderX2 and Taishan servers.
    • Investigate session_queue_node_fn/message queue data structure. - Investigating the source code
    • Review https://gerrit.fd.io/r/#/c/18398/ - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed. Perf degradation is fixed.
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • List all the blockers on aarch64 in CSIT wiki page - Stan or Juraj
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Build both binaries and packages with the generic option by default, and provide the Makefile variable NATIVE_OPTIMIZE=Y for end users to build native-optimized images.
      • Prepare email and a draft patch asking comments from community - Lijian
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but could not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are well connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
      • Investigate why these three blades have only one numa node - Juraj
  • VPP
    • Write description/expectation for the two NEON-related patches - Lijian
    • Investigating performance degradation on CortexA72 - Sirshak
    • Message queue - Sirshak
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - To close it.
    • Vectorization
      • ethernet-input - no progress yet
    • 128B cache line size
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

03/26/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After dedicated CPUs were assigned to VPP main, VPP worker, and the iperf3 server, the VPP hoststack performs better than the Linux stack on both ThunderX2 and Taishan servers.
    • Investigate session_queue_node_fn/message queue data structure. - Investigating the source code
    • Review https://gerrit.fd.io/r/#/c/18398/ - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed. Perf degradation is fixed.
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Build both binaries and packages with the generic option by default, and provide the Makefile variable NATIVE_OPTIMIZE=Y for end users to build native-optimized images.
      • Prepare email and a draft patch asking comments from community - Lijian
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but could not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are well connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
      • Investigate why these three blades have only one numa node - Juraj
  • VPP
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - To close it.
    • Vectorization
      • ethernet-input - no progress yet
    • 128B cache line size
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

03/19/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After dedicated CPUs were assigned to VPP main, VPP worker, and the iperf3 server, the VPP hoststack performs better than the Linux stack on both ThunderX2 and Taishan servers.
    • vlib_worker_loop and session_queue_node_fn are two major hot-spots. - Just started
    • Enable NEON instruction in Buffer pool free function. Patch is committed.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed, but still working on issues, e.g., performance degradation
      • Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Done by Malvika.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
      • Prepare email and a draft patch asking comments from community - Lijian
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but could not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node. Also blocked by QSFP+ issue.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj setup call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - In internal review.
    • Vectorization
    • 128B cache line size
      • VPP image with 128B cache line size crashed on ThunderX2 - Cannot reproduce crash with my setup
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week
    • Commit VPP distro making patch - Lijian
    • Plug in a 25G NIC on the Taishan server, and connect the 25G ports to the x86 25G NIC - Lijian
    • Follow Jianlin's suggestion, update U-Boot and kernel, and then sync up with Juraj - Lijian

03/12/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
    • Tina to update the meeting notice.
  • VPP Hoststack
    • After assigning dedicated CPUs to VPP main, VPP worker, and the iperf3 server, the VPP hoststack on both ThunderX2 and Taishan servers performs better than the Linux stack.
    • vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
    • Enable NEON instruction in Buffer pool free function. Patch is committed.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. All test failures are resolved.
      • Octeon-Tx Status (Sirshak): yet to try steps from Gorka for USB Ubuntu rootfs installation. - Switched to Malvika.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate. (Reason unknown.)
  • CSIT
    • VPP Performance Test
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
      • Prepare an email and a draft patch asking for comments from the community - Lijian
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj yet to try.
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj setup call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - In internal review.
    • Vectorization
    • 128B cache line size
      • VPP image with 128B cache line size crashed on ThunderX2
    • thunderx2 crashing - No update
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week
    • Commit VPP distro making patch - Lijian
    • Plug in a 25G NIC on the Taishan server, and connect the 25G ports to the x86 25G NIC - Lijian
    • Follow Jianlin's suggestion, update U-Boot and kernel, and then sync up with Juraj - Lijian

03/05/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPUs to VPP main, VPP worker, and the iperf3 server, the VPP hoststack on both ThunderX2 and Taishan servers performs better than the Linux stack.
    • vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. All test failures are resolved.
      • Octeon-Tx Status (Sirshak): yet to try steps from Gorka for USB Ubuntu rootfs installation. - Switched to Malvika.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate. (Reason unknown.)
  • CSIT
    • VPP Performance Test
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj yet to try.
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj setup call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - No progress
      • Investigate with latest VPP code on x86 server - Lijian - Send email to the vpp-dev mailing list if there is a problem. Will not put much effort into this.
    • Vectorization
      • ethernet-input
      • buffer pools
    • 128B cache line size
      • Will try this on Taishan server - Slight performance degradation with 128-byte cache line
    • thunderx2 crashing - No update
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

02/26/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPUs to VPP main, VPP worker, and the iperf3 server, the VPP hoststack on both ThunderX2 and Taishan servers performs better than the Linux stack.
    • el0_sys hot-spot on Taishan D05 only, no plan to fix it.
    • vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
    • memcpy optimization
      • memcpy patch verification on Taishan by Khem with the L3 forwarding use case - Lijian. Status (Khem): No updates.
      • memcpy patch consumes more clocks in OcteonTX2 - updated by Nitin.
      • Stopped working on this patch.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Test failure on SCTP, not root-caused yet.
      • Octeon-Tx Status (Sirshak): yet to try steps from Gorka for USB Ubuntu rootfs installation. - Switched to Malvika.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate. (Reason unknown.)
  • CSIT
    • VPP Performance Test
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj yet to try.
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj setup call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Buffer Pools per NUMA
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
      • Investigate with latest VPP code on x86 server - Lijian - Send email to the vpp-dev mailing list if there is a problem. Will not put much effort into this.
    • Vectorization
      • ethernet-input
      • buffer pools
    • 128B cache line size
      • Will try this on Taishan server - Slight performance degradation with 128-byte cache line
    • Qualcomm no change iperf3
    • thunderx2 crashing - No update
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

02/19/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPUs to VPP main, VPP worker, and the iperf3 server, the VPP hoststack on both ThunderX2 and Taishan servers performs better than the Linux stack.
    • memcpy optimization
      • memcpy patch verification on Taishan by Khem with the L3 forwarding use case - Lijian. Status (Khem): No updates.
      • memcpy patch consumes more clocks in OcteonTX2 - updated by Nitin.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch.
      • Octeon-Tx Status (Sirshak): yet to try steps from Gorka for USB Ubuntu rootfs installation.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate. (Reason unknown.)
  • CSIT
    • VPP Performance Test
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
    • Target: master trending job
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj yet to try.
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj setup call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Buffer Pools per NUMA
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
    • 1GB page taking long time Status: fixed.
      • Investigate with latest VPP code on x86 server
    • Vectorization
      • ethernet-input
      • buffer pools
      • memcpy
    • 128B cache line size
      • Will try this on Taishan server - Lijian
    • Qualcomm no change iperf3
    • thunderx2 crashing - No update
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

02/11/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • memcpy optimization
      • memcpy patch verification on Taishan by Khem with the L3 forwarding use case - Lijian. Status (Khem): No updates.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch.
      • Octeon-Tx Status (Sirshak): yet to try steps from Gorka for USB Ubuntu rootfs installation.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
  • CSIT
    • VPP Performance Test
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible.
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config
      • b. merging CSIT patch.
      • c. creating a job.
    • Target: master trending job
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj yet to try.
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj setup call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Buffer Pools per NUMA
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
    • 1GB page taking long time Status: fixed.
    • Vectorization
      • ethernet-input
      • buffer pools
      • memcpy
    • 128B cache line size
    • Qualcomm no change iperf3
    • thunderx2 crashing
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

02/05/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • memcpy optimization
      • Check that the optimized memory copy versions are deployed on Taishan and ThunderX2 at runtime - Lijian
      • Send memcpy patch to Khem and Fede for further verification - Lijian. Status: Fede: small improvement on mcbin with iperf3; Khem to try it with L3 forwarding
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation
      • Working on svm_fifo alternate version with front and back pointers synchronized instead of cursize.
    • Verifying per NUMA node buffer pool https://gerrit.fd.io/r/#/c/16638/
      • sirshak create jira id in fd.io jira. https://jira.fd.io/browse/VPP-1560
      • Hanging of VPP is actually VPP taking a lot of time to allocate 400K chunks for 1GB - Damjan has this in his todo list
      • gcc-8 compilation still fails on ARM.
      • Octeon-Tx failure. Status: unknown
    • Gorka is trying some optimal configs for VCL. Status: no updates.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
    • OcteonTx boots to buildroot with no dhclient, hence an impasse. Still not clear how to use the USB stick.
  • CSIT
    • VPP Path
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx. Status: no updates.
      • Kernel Migration on mcbin. Status:
      • ThunderX2:
    • VPP Performance Test
      • Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
      • Juraj to come up with a solution for the NUMA node anomaly on Taishan.
      • https://gerrit.fd.io/r/#/c/16850/ Status: Juraj has a version all ready to work. Package installation blocker.
      • Package installation error Status: Juraj to investigate logs.
  • FD.io lab
    • ThunderX1 -
      • New QSFP+ switch for ThunderX1 is available now: QSFP+ to be connected SFP+ switch.
      • Juraj to setup a call with LF folks on.
    • ThunderX2 -
      • Andy is still waiting for cables.
      • Juraj to remind Andy of when the cable will be available.
      • Juraj to follow up on ssh connectivity to thunderx2.
  • VPP
    • Vectorization
      • [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
      • [Lijian] Check whether setting the default cache line size to 128 will degrade throughput on Taishan/Qualcomm/ThunderX2
        • no perf diff in Qualcomm
        • vpp crashes on thunderx2
        • waiting for results on A72 (Taishan)
      • [Sirshak] on ethernet-input node, investigate vectorized buffer index, Damjan's per numa node buffer pool patch. Status: No updates
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

01/29/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Lijian Zhang
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Honnappa Nagarahalli
    • John Ddigilio
  • General
  • VPP Hoststack
    • TaiShan server with Debian distro crashed on the 'ip probe-neighbor' command when running VPP hoststack with iperf3
    • With 64-byte packets on ThunderX2 with a 10G NIC, VPP hoststack bandwidth is about 1/2 that of the Linux kernel stack.
    • With 64-byte packets on Taishan with a 10G NIC, VPP hoststack bandwidth is about 2x that of the Linux kernel stack.
    • Memory copy patch gives a 4% improvement on VPP hoststack on the Taishan server.
    • Check that the optimized memory copy versions are deployed on Taishan and ThunderX2 at runtime - Lijian
    • Send memcpy patch to Khem and Fede for further verification - Lijian
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo, ip4_local_forward node and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Verifying https://gerrit.fd.io/r/#/c/16638/ - Supposed to give better performance, but VPP hangs with this patch on some Arm machines.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
    • OcteonTX has been received in the ARM lab. Will boot it up first and then start profiling with it.
  • FD.io lab
    • ThunderX1 -
      • New Arista switch for ThunderX1 is available now. Gathering the details required by the LF lab before sending the switch to the CSIT lab. - Juraj - Andy will try to send the switch to CSIT this Thursday.
    • ThunderX2 -
      • Cable type is confirmed. Procurement is in the process.
      • Juraj to remind Andy of when the cable will be available.
      • Require access to these servers in FD.io lab. Anton gives the IP to access them.(ADMIN/ADMIN)
  • CSIT
    • VPP Path
      • So far so good.
      • ARM CI results are overwritten by x86 machines. Should be a Jenkins issue. Monitor whether this corner case happens again. - Juraj
    • VPP Device
      • thunderx: 1-node topology on Cavium ThunderX. Basic skeleton of Docker topology done. Moving on to deploying the scripts. Juraj is able to modify and execute the scripts in the container. Remaining fixes: scripts for 1-link 1-node topology and binding interfaces to VPP. Was able to run a traffic test successfully.
      • Kernel migration on mcbin. Juraj is able to build all the images, but got a kernel panic. Suggested trying a more recent U-Boot version; tried the latest U-Boot image, but the issue remains.
      • Juraj to investigate further work once ThunderX2 is available.
    • VPP Performance Test
  • VPP
    • Vectorization
      • [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
      • [Lijian] Check whether setting the default cache line size to 128 will degrade throughput on Taishan/Qualcomm/ThunderX2
      • [Sirshak] on ethernet-input node, investigate vectorized buffer index.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
    • [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable these two tests are merged. No test cases remain in the blacklist for AArch64 machines.
    • [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
  • Action Items - Next Week
    • [Sirshak] -

01/22/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Lijian Zhang
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Honnappa Nagarahalli
    • John Ddigilio
  • General
  • VPP Hoststack
    • TaiShan server with Debian distro crashed on the 'ip probe-neighbor' command when running VPP hoststack with iperf3
    • With 64-byte packets on ThunderX2 with a 10G NIC, VPP hoststack bandwidth is about 1/4 that of the Linux kernel stack.
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo, ip4_local_forward node and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
    • OcteonTX has been received in the ARM lab. Will boot it up first and then start profiling with it.
  • FD.io lab
    • ThunderX1 -
      • New Arista switch for ThunderX1 is available now. Gathering the details required by the LF lab before sending the switch to the CSIT lab. - Juraj - Andy will try to send the switch to CSIT this Thursday.
    • ThunderX2 -
      • Cable type is confirmed. Procurement is in the process.
      • Require access to these servers in FD.io lab.
  • CSIT
    • VPP Path
      • So far so good.
      • ARM CI results are overwritten by x86 machines. Should be a Jenkins issue. Monitor whether this corner case happens again. - Juraj
    • VPP Device
      • thunderx: 1-node topology on Cavium ThunderX. Basic skeleton of Docker topology done. Moving on to deploying the scripts. Juraj is able to modify and execute the scripts in the container. Remaining fixes: scripts for 1-link 1-node topology and binding interfaces to VPP.
      • Kernel migration on mcbin. Juraj is able to build all the images, but got a kernel panic. To try a more recent U-Boot version.
      • Juraj to investigate further work once ThunderX2 is available.
    • VPP Performance Test
      • Work is ongoing on writing scripts for performance jobs.
      • Development of the L2 test script is underway.
      • Khem will get L2 working in CI first, then IP4 and other test cases.
      • Stan has started working on performance scripts with Khem. He is able to connect to the Taishan machines in the CSIT lab.
      • The performance topology in the wiki link is to be updated per the file below.
      • https://github.com/FDio/csit/blob/master/docs/lab/Testbeds_Xeon_Skx_Arm_Atom.md
      • Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
        • Install Ubuntu 18.04 on Huawei Taishan servers first, and then investigate upstreaming the performance test framework to enable AArch64
        • Lijian to verify Ubuntu 18.04 on Taishan server.
      • Stan installed the latest CSIT scripts on the packet generator server (x86 NEON) and the Taishan servers in the FD.io lab.
      • https://gerrit.fd.io/r/#/c/16850/
      • Some of L2 and L3 test cases passed.
  • VPP
    • Vectorization
      • [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
      • [Sirshak] on ethernet-input node, investigate vectorized buffer index.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
    • [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable these two tests are merged. No test cases remain in the blacklist for AArch64 machines.
    • [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
  • Action Items - Next Week
    • [Sirshak] - To update patch list in VPP/Aarch64 wiki

01/15/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Lijian Zhang
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Honnappa Nagarahalli
    • John Ddigilio
  • General
  • VPP Hoststack
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
    • OcteonTX has been received in the ARM lab. Will boot it up first and then start profiling with it.
  • FD.io lab
    • ThunderX2 -
      • New Arista switch is available now. Gathering the details required by the LF lab before sending the switch to the CSIT lab. - Juraj
      • Cable type is confirmed. Procurement is in the process.
  • CSIT
    • VPP Path
      • IP4 reassembly and GBP failures are fixed. Patches to enable these two tests are merged. No test cases remain in the blacklist for AArch64 machines.
      • We have voting verify on bionic. Upload nexus disabled but merge job working. - Juraj created LF ticket for nexus upload. Both master merge job and verifying job are working fine.
      • ARM CI results are overwritten by x86 machines. Should be a Jenkins issue. Monitor whether this corner case happens again. - Juraj
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
      • Kernel Migration on mcbin. Juraj is able to build all the images.
    • VPP Performance Test
      • Work is ongoing on writing scripts for performance jobs.
      • Development of the L2 test script is underway.
      • Khem will get L2 working in CI first, then IP4 and other test cases.
      • Stan has started working on performance scripts with Khem. He is able to connect to the Taishan machines in the CSIT lab.
      • The performance topology in the wiki link is to be updated per the file below.
      • https://github.com/FDio/csit/blob/master/docs/lab/Testbeds_Xeon_Skx_Arm_Atom.md
      • Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
  • VPP
    • Vectorization
      • [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
    • [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable these two tests are merged. No test cases remain in the blacklist for AArch64 machines.
    • [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
  • Action Items - Next Week
    • [Sirshak] - To update patch list in VPP/Aarch64 wiki

01/08/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Lijian Zhang
    • Stanislav Chlebec
    • Khemendra Kumar
  • General
  • VPP Hoststack
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
    • [Lijian] Working on IP4 reassembly and GBP failures. - Fixed. Juraj has upstreamed patches to enable these two tests.
    • [Sirshak] Kernel migration on mcbin. Juraj is working on it based on Jianlin's suggestion.
    • [Andy] Getting a new Arista switch next year.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy - Macro benchmarking is done and data is updated to Jira.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
  • CSIT
    • VPP Path
  • VPP Path Failures
      • We have voting verify on bionic. Upload nexus disabled but merge job working. - Juraj created LF ticket for nexus upload. Both merge job and verifying job are working fine.
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
      • thunderx2: Juraj working with LF to get this resolved.
      • mcbin: Juraj can contact Jianlin if needed.
    • VPP Performance Test
      • Work is ongoing on writing scripts for performance jobs.
      • Development of the L2 test script is underway.
      • Khem will get L2 working in CI first, then IP4 and other test cases.
      • Stan is starting to work on VPP performance testing. Khem to send email to Stan about VPP performance testing.
  • FD.io lab
    • New Arista switch to be procured next year.
    • ThunderX2 - Racked. Andy is trying to buy cables compatible with Intel XL710. Juraj to confirm info required by lab people before sending out the cables.
  • Action Items - Next Week

12/18/2018

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Tina Tsou
    • Stanislav Chlebec
    • Avinash
    • Khemendra
  • General
  • VPP Hoststack
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working.
    • [Lijian] Working on IP4 reassembly and GBP failures. - Some preliminary work on GBP, waiting on Neale. Juraj to give Lijian access to investigate on ThunderX.
    • [Sirshak] Kernel Migration mcbin. Status: Jianlin to work with Juraj to get fd.io mcbins up and running. Sirshak to set up a meeting.
    • [Andy] Getting a new Arista switch next year.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy - Still benchmarking and setting it up for internal review.
      • [Lijian] Patch for compiling issue with GCC-8.x is under community review. Status: No updates.
      • [Lijian] Patch for fixing StringTest failure is under community review. Status: Abandoned.
      • [Lijian] Patch for CDP failure is under community review. Status: No updates.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC.
  • CSIT
    • VPP Path
  • VPP Path Failures
    • https://jira.fd.io/browse/VPP-1475 - IP4 random reassembly failure in master, also seen on x86
    • https://jira.fd.io/browse/VPP-1491 - GBP L3/L2 Endpoint Learning failure
      • We have voting verify on bionic. Upload nexus disabled but merge job working. Juraj to create LF ticket for nexus upload.
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
      • thunderx2: Sirshak working with LF to get this resolved.
      • mcbin: Sirshak to set up a meeting between Juraj and Jianlin.
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • Development of the L2 test script is underway.
      • Khem will get L2 working in CI first, then IP4 and other test cases.
  • FD.io lab
    • New Arista switch to be procured next year.
    • ThunderX2 - Racked. IPMI Static IP configuration missing. Sirshak with LF.
  • Action Items - Next Week

12/11/2018

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Tina Tsou
    • Stanislav Chlebec
  • VPP Hoststack
    • iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking and compared kernel and VPP hoststack performance. Three cases: kernel to kernel; kernel to VPP hoststack; VPP hoststack to VPP hoststack.
    • Ongoing perf analysis. One patch (https://gerrit.fd.io/r/#/c/16184/) is merged, and the other one is under internal review.
    • Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
    • Gorka is trying some optimal configs for VCL.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. Two scripts of the L2 performance suites for the CI management repository are done; still investigating for the CSIT repository; three more scripts to be developed.
    • [Lijian] Working on IP4 reassembly and GBP failures
    • [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from Garcia and Damjan. - No progress so far. - To confirm with Jianling and Joyce.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy - Second priority, no update so far.
      • [Lijian] Patch for compiling issue with GCC-8.x is under community review.
      • [Lijian] Patch for fixing StringTest failure is under community review.
      • [Lijian] Patch for CDP failure is under community review.
    • Memory Ordering
      • [Sirshak] To start work on Arithmetic and Logic relaxed functions.
  • VPP Path failures
  • CSIT
    • VPP Path
      • Actually, everything is ready. The only thing is to get CI patch merged.
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx is in place, but there are errors. Will continue investigation.
      • thunderx2: Racked. Lack of static IP. Sirshak gave Anton a workaround for the missing static IP.
      • mcbin: Kernel issue; yet to try the suggestion from Garcia and Damjan. To confirm with Jianling and Joyce - Lijian
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • Development of the L2 test script is underway. Khem will get L2 working in CI first, then IP4 and other test cases.
  • FD.io lab
    • Arista switch is still not working. Andy and Anton are working out the exact requirements for the switch. Two possible options: Andy to replace the Arista or buy a new one.
    • ThunderX2 - Racked. Lack of IP.
  • Action Items - Next Week
    • [Lijian] to continue to investigate make test failures.
    • [Andy] to work with Anton to resolve Arista problem.

12/04/2018

  • Attendees
    • Sirshak Das
    • Andy Wang
    • Juraj Linkeš
    • Khemendra
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Tina Tsou
  • VPP Hoststack
    • iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking and compared kernel and VPP hoststack performance. Three cases: kernel to kernel; kernel to VPP hoststack; VPP hoststack to VPP hoststack.
    • Ongoing perf analysis. Two patches ongoing. One is upstreamed and the other is under internal review. Hotspots in memory copy or possibly elsewhere.
    • Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
    • Gorka is trying some optimal configs for VCL.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. Two scripts of the L2 performance suites for the CI management repository are done; still investigating for the CSIT repository; three more scripts to be developed.
    • [Lijian] VPP dlmalloc crash issue root-caused and fixed by maintainer. Florin Coras fixed time-out issues.
    • [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from Garcia and Damjan. - No progress so far. - To confirm with Jianling and Joyce.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy - Second priority, no update so far.
      • [Lijian] Patch for compiling issue with GCC-8.x is under internal review.
      • [Lijian] Patch for fixing StringTest failure is under internal review.
    • Memory Ordering
      • [Sirshak] To start work on Arithmetic and Logic relaxed functions.
  • CSIT
    • VPP Path
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx is in place, but there are errors. Will continue investigation.
      • thunderx2: Racked. Lack of IP. To confirm with Anton.
      • mcbin: Kernel issue; yet to try the suggestion from Garcia and Damjan. To confirm with Jianling and Joyce - Lijian
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • Development of the L2 test script is underway. Khem will get L2 working in CI first, then IP4 and other test cases.
  • FD.io lab
    • Arista switch is still not working. Andy and Anton are working out the exact requirements for the switch. Two possible options: Andy to replace the Arista or buy a new one.
    • ThunderX2 - Racked. Lack of IP.
  • Action Items - Next Week
    • [Lijian] to continue to investigate make test failures.
    • [Andy] to work with Anton to resolve Arista problem.


11/27/2018

  • Attendees
    • Juraj Linkeš
    • Khemendra
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Tina Tsou
  • VPP Hoststack
    • iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking and compared kernel and VPP hoststack performance.
    • Ongoing perf analysis, two patches ongoing. Hotspots in memory copy or possibly elsewhere. Will share patches with community. - Sirshak
    • Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
    • Gorka is trying some optimal configs for VCL.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • Alternate test cases.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs. - Need to prepare some scripts. First to understand how the script works and then add more options.
    • [Lijian] Status on VPP path failures. Status: Still debugging. Still at early study stage.
    • [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from Garcia and Damjan. - No progress so far.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy
    • Memory Ordering
      • [Sirshak] To start work on Arithmetic and Logic relaxed functions.
  • CSIT
    • VPP Path
      • 3 failures currently stalling deployment.
      • VPP-1476, VPP-1475, VPP-1478
      • These failures are seen on Debian x86 VM also.
      • Parallelization (n=32) is resulting in failures. This also seems to be caused by the two patches below.
      • VPP-1490, caused by https://gerrit.fd.io/r/#/c/15106/ and https://gerrit.fd.io/r/#/c/15534/.
      • VPP-1491 and VPP-1497, about parallelization and GBP failures, have been filed.
      • Get CSIT/Aarch64 pass with partial test cases - Juraj
    • VPP Device
      • thunderx: Juraj created an LF ticket for wiring the 1-node topology on cavium thunderx.
      • thunderx2: to be racked by this Friday.
      • mcbin: Kernel issue; yet to try the suggestion from Garcia and Damjan.
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • The L2 test now works manually. Khem is trying to get it working in CI, then IP4 and other test cases.
  • FD.io lab
    • The Arista switch is missing a cable. Andy will send the tracking number for the cables.
    • ThunderX2 - to be racked by this Friday.
  • Action Items - Next Week
    • [Lijian] to investigate VPP-1490 issue.
    • [Andy] to send the tracking number for the cables.

11/20/2018

  • Attendees
    • Sirshak Das
    • Andy Wang
    • Juraj Linkeš
    • Khemendra
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Tina Tsou
  • VPP Hoststack
    • iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking and compared kernel and VPP hoststack performance.
    • Ongoing perf analysis, two patches ongoing. Hotspots in memory copy or possibly elsewhere. Will share patches with community. - Sirshak
    • Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
    • Gorka is trying some optimal configs for VCL.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • Alternate test cases.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs. - Need to prepare some scripts. First to understand how the script works and then add more options.
    • [Lijian] Status on VPP path failures. Status: Still debugging. Still at early study stage.
    • [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from Garcia and Damjan. - No progress so far.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy
    • Memory Ordering
      • [Sirshak] To start work on Arithmetic and Logic relaxed functions.
  • CSIT
    • VPP Path
      • 3 failures currently stalling deployment.
      • VPP-1476, VPP-1475, VPP-1478
      • These failures are seen on Debian x86 VM also.
      • Parallelization (n=32) is resulting in failures. This also seems to be caused by the two patches below.
      • VPP-1490, caused by https://gerrit.fd.io/r/#/c/15106/ and https://gerrit.fd.io/r/#/c/15534/.
      • VPP-1491 and VPP-1497, about parallelization and GBP failures, have been filed.
      • Get CSIT/Aarch64 pass with partial test cases - Juraj
    • VPP Device
      • thunderx: Juraj created an LF ticket for wiring the 1-node topology on cavium thunderx.
      • thunderx2: to be racked by this Friday.
      • mcbin: Kernel issue; yet to try the suggestion from Garcia and Damjan.
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • The L2 test now works manually. Khem is trying to get it working in CI, then IP4 and other test cases.
  • FD.io lab
    • The Arista switch is missing a cable. Andy will send the tracking number for the cables.
    • ThunderX2 - to be racked by this Friday.
  • Action Items - Next Week
    • [Lijian] to investigate VPP-1490 issue.
    • [Andy] to send the tracking number for the cables.


11/12/2018

  • Attendees
    • Sirshak Das
    • Andy Wang
    • Juraj Linkeš
    • Khemendra
    • Garcia
    • Gorka
  • VPP Hoststack
    • iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking and compared kernel and VPP hoststack performance.
    • Ongoing perf analysis, two patches ongoing. Hotspots in memory copy or possibly elsewhere. - Sirshak
    • Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
    • Gorka is trying some optimal configs for VCL.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • Alternate test cases.
    • Khem to get more information on benchmarking DMM. Khem to send the information to

Status Report Ligato/Contiv

Capture LandC.PNG