Difference between revisions of "VPP/AArch64"

(Activity)
(Meeting Minutes)
 
(329 intermediate revisions by 8 users not shown)
Line 4: Line 4:
 
=== Meeting Details ===
 
=== Meeting Details ===
  
* Regular AArch64 meeting: [https://zoom.us/my/fastdata Tuesdays at 06:00 PT (Pacific Time)] (weekly). [http://www.thetimezoneconverter.com/?t=06:00&tz=PT%20%28Pacific%20Time%29 Convert to your timezone.]
+
* Regular AArch64 meeting: 1st and 3rd Tuesdays of every month at 06:00 PT (Pacific Time) (biweekly). [http://www.thetimezoneconverter.com/?t=06:00&tz=PT%20%28Pacific%20Time%29 Convert to your timezone.]
** [https://zoom.us/my/fastdata FD.io Zoom Meeting room ]
+
** [https://zoom.us/my/fastdata?pwd=Z3Z0UnJyUmRIMlU3eTJLcGF6VEptQT09 FD.io Zoom Meeting room ]
  
 
=== IRC Channel ===
 
=== IRC Channel ===
Line 46: Line 46:
 
* [https://jenkins.fd.io/computer/ '''CI build servers'''] integrated into Jenkins
 
* [https://jenkins.fd.io/computer/ '''CI build servers'''] integrated into Jenkins
  
* [https://wiki.fd.io/view/CSIT/fdio_csit_lab_ext_lld_draft '''CSIT test beds'''] (''under construction'')
+
* [https://github.com/FDio/csit/blob/master/docs/lab/testbed_specifications.md '''CSIT testbed specifications''']
  
 
{| class="wikitable"
 
{| class="wikitable"
Line 61: Line 61:
 
! Distro
 
! Distro
 
|-
 
|-
| [https://softiron.com/development-tools/overdrive-1000/ SoftIron OverDrive 1000] || CI build server || Not Used || softiron-1 || 10.30.51.12 || N/A || 4 || 8GB || || openSUSE
+
| [https://www.marvell.com/server-processors/thunderx-arm-processors/ Marvell ThunderX] || VPP dev debug server|| Running || vpp-marvell-dev || 10.30.51.38 || 10.30.50.38 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 18.04.4
 
|-
 
|-
| || CI build server || Not Used || softiron-2 || 10.30.51.13 || N/A || 4 || 8GB || || openSUSE
+
| || CI build server|| Running in Nomad || s53-nomad || 10.30.51.39 || 10.30.50.39 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 18.04.4
 
|-
 
|-
| || CI build server || Not Used|| softiron-3 || 10.30.51.14 || N/A || 4 || 8GB || || openSUSE
+
| || CI build server|| Running in Nomad || s54-nomad || 10.30.51.40 || 10.30.50.40 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 18.04.4
 
|-
 
|-
| [https://www.marvell.com/server-processors/thunderx-arm-processors/ Marvell ThunderX] || CI build server || Running in CI || nomad3arm || 10.30.51.38 || 10.30.50.38 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 16.04
+
| || CI build server || Running in Nomad || s52-nomad || 10.30.51.65 || 10.30.50.65 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
 
|-
 
|-
| || CI build server || Running in CI || nomad4arm || 10.30.51.39 || 10.30.50.39 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 16.04
+
| || CI build server || Running in Nomad || s51-nomad || 10.30.51.66 || 10.30.50.66 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
 
|-
 
|-
| || CI build server || Running in CI || nomad5arm || 10.30.51.40 || 10.30.50.40 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 16.04
+
| || CI build server || Running in Nomad || s49-nomad || 10.30.51.67 || 10.30.50.67 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
 
|-
 
|-
| || VPP dev debug server || Running || fdio-marvell4 || 10.30.51.65 || 10.30.50.65 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.2
+
| || CI build server || Running in Nomad || s50-nomad || 10.30.51.68 || 10.30.50.68 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
 
|-
 
|-
| || CI build server || Running || fdio-marvell5 || 10.30.51.66 || 10.30.50.66 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.2
+
| [https://www.marvell.com/server-processors/thunderx2-arm-processors/ Marvell ThunderX2] || Perf DUT candidate || Running || s27-t13-sut1 || 10.30.51.69 || 10.30.50.69 || 224 || 128GB || 3x40GbE QSFP+ XL710-QDA2 || Ubuntu 18.04.2
 
|-
 
|-
| || CI build server || Running || fdio-marvell6 || 10.30.51.67 || 10.30.50.67 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.2
+
| || VPP device server || Running in Nomad || s55-t36-sut1 || 10.30.51.70 || 10.30.50.70 || 256 || 256GB || 2x40GbE QSFP+ XL710-QDA2 / 2x10/25GE SFP+ ConnectX5 || Ubuntu 18.04.4
 
|-
 
|-
| || CI build server || Running || fdio-marvell7 || 10.30.51.68 || 10.30.50.68 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.2
+
| || VPP device server || Running in Nomad || s56-t37-sut1 || 10.30.51.71 || 10.30.50.71 || 256 || 256GB || 2x40GbE QSFP+ XL710-QDA2 / 2x10/25GE SFP+ ConnectX5 || Ubuntu 18.04.4
 
|-
 
|-
| [https://www.marvell.com/server-processors/thunderx2-arm-processors/ Marvell ThunderX2] || VPP device server || Running || s27-t13-sut1 || 10.30.51.69 || 10.30.50.69 || 112 || 128GB || 3x40GbE QSFP+ XL710-QDA2 || Ubuntu 18.04.2
+
| Huawei TaiShan 2280 || CSIT testbed || Running in CI || s17-t33-sut1 || 10.30.51.36 || 10.30.50.36 || 64 || 128GB || 2x10GbE SFP+ Intel X520-DA2 / 2x25GbE SFP28 Mellanox CX-4 || Ubuntu 18.04.1
 
|-
 
|-
| Huawei TaiShan 2280 || CSIT testbed || Running || s17-t33-sut1 || 10.30.51.36 || 10.30.50.36 || 64 || 128GB || 2x10GbE SFP+ Intel X520-DA2 / 2x25GbE SFP28 Mellanox CX-4 || Ubuntu 17.10
+
| || CSIT testbed || Running in CI || s18-t33-sut2 || 10.30.51.37 || 10.30.50.37 || 64 || 128GB || 2x10GbE SFP+ Intel X520-DA2 / 2x25GbE SFP28 Mellanox CX-4 || Ubuntu 18.04.1
 
|-
 
|-
| || CSIT testbed || Running || s18-t33-sut2 || 10.30.51.37 || 10.30.50.37 || 64 || 128GB || 2x10GbE SFP+ Intel X520-DA2 / 2x25GbE SFP28 Mellanox CX-4 || Ubuntu 17.10
+
| [http://macchiatobin.net/ Marvell MACCHIATObin] || N/A || Decommissioned || s20-t34-sut1 || 10.30.51.41 || 10.30.51.49, then connect to /dev/ttyUSB0 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.4
 
|-
 
|-
| [http://macchiatobin.net/ Marvell MACCHIATObin] || CSIT testbed || Running || s20-t34-sut1 || 10.30.51.41 || 10.30.51.49, then connect to /dev/ttyUSB0 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.4
+
| || N/A || Decommissioned || s21-t34-sut2 || 10.30.51.42 || 10.30.51.49, then connect to /dev/ttyUSB1 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.5
 
|-
 
|-
| || CSIT testbed || Running || s21-t34-sut2 || 10.30.51.42 || 10.30.51.49, then connect to /dev/ttyUSB1 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.5
+
| || N/A || Decommissioned || fdio-mcbin3 || 10.30.51.43 || 10.30.51.49, then connect to /dev/ttyUSB2 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.5
 
|-
 
|-
| || VPP dev debug server || Running || fdio-mcbin3 || 10.30.51.43 || 10.30.51.49, then connect to /dev/ttyUSB2 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.5
+
| || Power Cycler || Operational || || 10.30.50.80 || || || || ||
 +
|-
 +
| [https://softiron.com/development-tools/overdrive-1000/ SoftIron OverDrive 1000] || N/A || Decommissioned || softiron-1 || 10.30.51.12 || N/A || 4 || 8GB || || openSUSE
 +
|-
 +
| || N/A || Decommissioned || softiron-2 || 10.30.51.13 || N/A || 4 || 8GB || || openSUSE
 +
|-
 +
| || N/A || Decommissioned || softiron-3 || 10.30.51.14 || N/A || 4 || 8GB || || openSUSE
 
|-
 
|-
 
|}
 
|}
Line 157: Line 163:
 
=== Recent Patches ===
 
=== Recent Patches ===
 
{| class="wikitable"
 
{| class="wikitable"
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/34716 misc: vppctl fix heap-buffer-overflow & memleaks] || Merged 12/14 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/34634 crypto-native: fix build error on Arm using clang-13] || Merged 12/14 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33306 snort: fix unused result warning for gcc-10] || Merged 11/06 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33307 l2: fix array-bounds error for prefetch on Arm] || Merged 11/07 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33422 ip6: fix IPv6 address calculation error using "ip route add" CLI] || Merged 10/21 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31694 ipsec: Performance improvement of ipsec4_output_node using flow cache] || Merged 10/13 || || Govindarajan Mohandoss
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33999 build: fix centos rpm build] || Merged 10/08 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33324 vppinfra: fix potential memory access error in _pool_init_fixed] || Merged 10/05 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32885 svm: fix asan check failed @svm_map_region on arm ] || Merged 06/24 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32638 l2: fix vrrp prefix mac comparison ] || Merged 06/09 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32565 build: fix build error after make wipe ] || Merged 06/04 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32367 memif: fix input node buffer prefetch ] || Merged 05/21 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32366 memif: fix gcc-10 build error on arm platform ] || Merged 05/21 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31972 papi: fix ubuntu 1804 make test socket.close error] || Merged 04/16 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31960 rdma: fix skip_ipv4_cksum behavior in scalar path] || Merged 04/15 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31985 vppinfra: correct intrinsic called by u16x16_from_u8x16] || Merged 04/15 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31421 vppinfra: fix compiling error due to incompatible udphdr field names] || Merged 03/05 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/30458 avf: optimized with NEON SIMD instruction] || Merged 12/18 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/28252 ip: fix compiling error with gcc-10] || Merged 09/01 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/28044 build: Fix 'make install-deps' errors on aarch64 CentOS 7] || Merged 07/29 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/28034 acl: correct acl vat help message] || Merged 07/24 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/27417 build: add libssl-dev library for ubuntu 20.04] || Merged 06/04 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/26949 dpdk: fix compiling issue with clang] || Merged 05/08 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/26950 vppinfra: fix u32x4_byte_swap on Arm] || Merged 05/08 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/26804 build: support arch-specific compiling for Neoverse N1] || Merged 04/30 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/26023 dpdk: false link down issue with ixgbe NIC] || Merged 03/23 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/25896 vlib: fix error when creating avf interface on SMP system] || Merged 03/21 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/25906 vlib: leave SIGPROF signal with its default handler] || Merged 03/21 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/25259 build: add libssl-dev for ubuntu 16.04 and 18.04] || Merged 03/11 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/25195 vlib: fix code of getting numa node with specific cpu_id] || Merged 02/17 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/23083 docs: add physmem section in configuration parameters] || Merged 12/19 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/23082 vlib: add max-size configuration parameter for pmalloc] || Merged 12/18 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/23075 crypto: not use vec api with opt_data[VNET_CRYPTO_N_OP_IDS]] || Merged 11/13 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/23084 acl: add missing square brackets to vat_help option in acl api] || Merged 10/31 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21968 dpdk: apply dual loop unrolling in DPDK TX] || Merged 09/12 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21969 ip: apply dual loop unrolling in ip4_rewrite] || Merged 09/12 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21970 ip: apply dual loop unrolling in ip4_input] || Merged 09/12 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21940 build: fix running error with vmxnet3_test_plugin.so] || Merged 09/11 || || Jianlin Lv
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21741 build: fix unsupported CMake comparison operation] || Merged 09/05 || || Jianlin Lv
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21469 tap: fix tap interface not working on Arm issue] || Merged 09/04 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20379 build: fix vpp compilation failure on ThunderX2 and Amp] || Merged 08/19 || || Jianlin Lv
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/18564 vppinfra: Update "show cpu" output for AArch64 chips] || Merged 08/19 || || Nitin Saxena
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20861 vppinfra: refactor test_and_set spinlocks to use clib_spinlock_t] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20862 vppinfra: added performance test for clib_rwlock_t (test_rwlock.c)] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20863 vppinfra: refactor clib_rwlock_t to use single condition variable] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20860 vppinfra: refactor clib_spinlock_t to use compare and swap] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20859 vppinfra: added lock performance test for clib_spinlock_t (test_spinlock.c)] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20857 vppinfra: refactor use of CLIB_MEMORY_BARRIER ()] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20856 vppinfra: conformed spinlocks to use CLIB_PAUSE] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/20272/ vppinfra: add u64x2_scatter/u32x4_scatter] || Merged 06/21 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/20271/ vppinfra: add u64x2_gather/u32x4_gather] || Merged 06/21 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/20064/ fix compiling error with marvell pp2 plugin] || Merged 06/11 || || Jianlin Lv
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/19930/ Switch atomic release API from __sync to __atomic builtin] || Merged 06/05 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/19929/ Switch atomic test and set API from __sync to __atomic builtin] || Merged 06/05 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/18278/ Build packages for generic Arm architecture] || Merged 05/15 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/19135/ Enable NEON instructions in memcpy_le] || Merged 05/01 || || Lijian Zhang
 +
|-
 
| [https://gerrit.fd.io/r/#/c/18223/ svm_fifo rework to avoid contention on cursize] || Merged 04/17 || || Sirshak Das
 
| [https://gerrit.fd.io/r/#/c/18223/ svm_fifo rework to avoid contention on cursize] || Merged 04/17 || || Sirshak Das
 
|-
 
|-
Line 163: Line 282:
 
| [https://gerrit.fd.io/r/#/c/18077/ sctp chunk_len fix] || Merged 03/06 || || Sirshak Das
 
| [https://gerrit.fd.io/r/#/c/18077/ sctp chunk_len fix] || Merged 03/06 || || Sirshak Das
 
|-
 
|-
| [https://gerrit.fd.io/r/#/c/15756/ Use acquire/release ordering when accessing svm_fifo shared variable cursize] || Merged 11/29 || || Sirshak Das
+
| [https://gerrit.fd.io/r/#/c/16184/ Use acquire/release ordering when accessing svm_fifo shared variable cursize] || Merged 11/29 || || Sirshak Das
 
|-
 
|-
 
| [https://gerrit.fd.io/r/#/c/15756/ Optimize xxx_zero_byte_mask NEON function.] || Merged 11/07 || || Lijian Zhang
 
| [https://gerrit.fd.io/r/#/c/15756/ Optimize xxx_zero_byte_mask NEON function.] || Merged 11/07 || || Lijian Zhang
Line 308: Line 427:
 
|}
 
|}
  
=== Meeting Minutes ===
+
== Meeting Minutes ==
 +
'''11/21/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Niyaz Murshed
 +
** Jieqiang Wang
 +
 
 +
* CSIT
 +
** Status
 +
*** Dave Wallace helps monitor the AArch64 CI/CD status, which looks fine
 +
*** Replace the old ThunderX2 with Ampere Altra; the budget has been approved, still in progress
 +
**** Sync with CSIT folks in the call when possible -- Juraj
 +
*** Maciek asked about the availability of N2-based hardware
 +
**** Plan to ship N2-based servers (Nvidia Grace (V2) / Ampere One (in-house design by Ampere)) to the FD.io lab next year
 +
**** Timeline TBD
 +
*** IPSec test cases
 +
**** Patch already merged
 +
**** QAT cards in Austin labs, plan to ship them to FD.io lab
 +
*** RDMA test cases
 +
**** MLX DPDK test cases are enabled on AArch64; RDMA test cases are not
 +
 
 +
* VPP
 +
** Detailed planning for VPP projects in the next call
 +
** Refactor OpenSSL usage in VPP IPsec -- Lijian
 +
*** Move key generation and initialization steps out of data plane to control plane, see performance boost
 +
** Investigate make test framework in VPP -- Lijian
 +
*** The patch broke the WireGuard test cases, so the workflow needs to be figured out
 +
** VPP ramp-up -- Niyaz
 +
*** Investigate VPP graph node mechanism and how to add nodes to the group
 +
** IPSec scalability tests -- Jieqiang
 +
*** Try to figure out dpdk-rss-flows.py and how to generate balanced rss flows for IPSec tests
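As background for the RSS flow balancing item above, here is a minimal Python sketch of the Toeplitz hash that NICs commonly use for RSS, plus a brute-force search for source ports that spread flows evenly across queues. This is not the logic of dpdk-rss-flows.py. The key, the round-robin redirection table, the hashed fields (IPv4 addresses plus L4 ports), and the helper names <code>rss_queue</code> and <code>balanced_flows</code> are all assumptions made for illustration. Real IPSec traffic may be hashed on other fields, such as the addresses only or the ESP SPI.

```python
"""Sketch: Toeplitz RSS hash and a brute-force search for queue-balanced flows."""
import ipaddress
import struct

# Widely published 40-byte RSS key, used here only as an assumed default.
RSS_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """XOR together the 32-bit key window at every set bit of the input."""
    key_bits = len(key) * 8
    if len(data) * 8 > key_bits - 32:
        raise ValueError("input too long for this key")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                # Window of key bits [n, n + 31] for input bit n, counted from the MSB.
                shift = key_bits - 32 - (i * 8 + bit)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def rss_queue(src, dst, sport, dport, n_queues, reta_size=128, key=RSS_KEY):
    """Map an IPv4 flow to a queue, assuming a round-robin redirection table."""
    data = struct.pack(
        "!4s4sHH",
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
        sport,
        dport,
    )
    return (toeplitz_hash(key, data) % reta_size) % n_queues


def balanced_flows(src, dst, dport, n_queues, per_queue=1, reta_size=128, key=RSS_KEY):
    """Scan source ports until every queue has per_queue flows assigned."""
    buckets = {q: [] for q in range(n_queues)}
    for sport in range(1024, 65536):
        q = rss_queue(src, dst, sport, dport, n_queues, reta_size, key)
        if len(buckets[q]) < per_queue:
            buckets[q].append(sport)
        if all(len(ports) == per_queue for ports in buckets.values()):
            return buckets
    raise RuntimeError("could not fill every queue with the scanned source ports")
```

For example, <code>balanced_flows("10.0.0.1", "10.0.0.2", 4500, n_queues=4, per_queue=2)</code> returns a dictionary that maps each queue index to two source ports. The addresses and port here are placeholders.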
 +
 
 +
'''07/18/2023'''
 +
* Attendees
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj Linkes
 +
* CSIT
 +
** A timeout issue occurs periodically on the TaiShan server, even in release testing.
 +
*** Setting CPU affinity only after VMs boot up fully.
 +
**** https://gerrit.fd.io/r/c/csit/+/38550
 +
*** Another issue may be related to the TaiShan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
*** Increasing the timeout bypasses the issue and has no effect on VPP VM performance
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
*** QAT cards are planned to be shipped
 +
*** Need to pay attention to the execution time for IPSec release testing
 +
*** Need to investigate further on performance degradation issue
 +
** Verify jobs, merge jobs, device testing, and release testing are all fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
*** ConnectX6 NIC info will be updated in doc first
 +
*** Decide which test cases to run on the testbeds (time considerations / iterative tests / coverage tests)
 +
** MRR failed cases
 +
*** Probably due to the latest DPDK upgrade, not an Arm-specific issue.
 +
** New test cases list on 3n-alt
 +
*** NAT tests cannot be added because they run on 2-node testbeds only
 +
*** Enable IPSec flow cache (Arm) / IPSec SPD fast path feature
 +
** Release testing
 +
*** 23.06 release testing is done
 +
*** New CSIT page https://csit.fd.io/
 +
** Plan to replace TX2 with Altra as VPP device testing testbed
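The "set CPU affinity only after the VMs have fully booted" idea from the timeout item above can be sketched in Python as follows. This is a Linux-only illustration, not the change in the linked CSIT patches. The helper names <code>wait_until</code> and <code>pin_after_boot</code>, the readiness callback, and the default timeout are assumptions.

```python
"""Sketch: pin a process to CPUs only once a readiness check passes (Linux only)."""
import os
import time
from typing import Callable, Iterable, Set


def wait_until(ready: Callable[[], bool], timeout_s: float, poll_s: float = 0.5) -> bool:
    """Poll ready() until it returns True or timeout_s seconds have elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


def pin_after_boot(
    pid: int,
    cpus: Iterable[int],
    ready: Callable[[], bool],
    timeout_s: float = 300.0,
    poll_s: float = 0.5,
) -> Set[int]:
    """Leave the process unpinned while it boots, then restrict it to cpus."""
    if not wait_until(ready, timeout_s, poll_s):
        raise TimeoutError(f"process {pid} not ready after {timeout_s} s")
    os.sched_setaffinity(pid, cpus)
    # Report the affinity mask that is actually in effect after pinning.
    return os.sched_getaffinity(pid)
```

In a real setup, <code>ready</code> might check that the guest answers over SSH or a guest agent, and <code>pid</code> would be the hypervisor process or one of its vCPU threads. Both depend on how the VMs are launched.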
 +
 
 +
'''06/20/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj Linkes
 +
* CSIT
 +
** A timeout issue occurs periodically on the TaiShan server, even in release testing.
 +
*** Setting CPU affinity only after VMs boot up fully.
 +
**** https://gerrit.fd.io/r/c/csit/+/38550
 +
*** Another issue may be related to the TaiShan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
*** Increasing the timeout bypasses the issue and has no effect on VPP VM performance
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
*** QAT cards are planned to be shipped
 +
*** Need to pay attention to the execution time for IPSec release testing
 +
*** Need to investigate further on performance degradation issue
 +
** Verify jobs, merge jobs, device testing, and release testing are all fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
*** ConnectX6 NIC info will be updated in doc first
 +
*** Decide which test cases to run on the testbeds (time considerations / iterative tests / coverage tests)
 +
** MRR failed cases
 +
*** Probably due to the latest DPDK upgrade, not an Arm-specific issue.
 +
** New test cases list on 3n-alt
 +
*** NAT tests cannot be added because they run on 2-node testbeds only
 +
*** Enable IPSec flow cache (Arm) / IPSec SPD fast path feature
 +
 
 +
'''05/16/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
* CSIT
 +
** A timeout issue occurs periodically on the TaiShan server, even in release testing.
 +
*** Setting CPU affinity only after VMs boot up fully.
 +
**** https://gerrit.fd.io/r/c/csit/+/38550
 +
*** Another issue may be related to the TaiShan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
*** Increasing the timeout bypasses the issue and has no effect on VPP VM performance
 +
** The link issue in DPDK testpmd test cases on Ampere Altra is still there.
 +
*** Try switching cables while upgrading the NIC firmware and drivers
 +
*** Try to reproduce the tests after the NIC firmware upgrade
 +
*** Try different port pairs of the same two NICs
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
*** QAT cards are planned to be shipped
 +
*** Need to pay attention to the execution time for IPSec release testing
 +
*** Need to investigate further on performance degradation issue
 +
** Verify jobs, merge jobs, device testing, and release testing are all fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
*** ConnectX6 NIC info will be updated in doc first
 +
*** Decide which test cases to run on the testbeds (time considerations / iterative tests / coverage tests)
 +
** MRR failed cases
 +
*** Probably due to the latest DPDK upgrade, not an Arm-specific issue.
 +
* VPP
 +
'''04/18/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
* CSIT
 +
** A timeout issue occurs periodically on the TaiShan server, even in release testing.
 +
*** Setting CPU affinity only after VMs boot up fully.
 +
**** https://gerrit.fd.io/r/c/csit/+/38550
 +
*** Another issue may be related to the TaiShan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
** The link issue in DPDK testpmd test cases on Ampere Altra is still there.
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
*** QAT cards are planned to be shipped
 +
*** Need to pay attention to the execution time for IPSec release testing
 +
*** Need to investigate further on performance degradation issue
 +
** Verify jobs, merge jobs, device testing, and release testing are all fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
*** ConnectX6 NIC info will be updated in doc first
 +
* VPP
 +
 
 +
'''04/04/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
* CSIT
 +
** A timeout issue occurs periodically on the TaiShan server, even in release testing.
 +
*** Setting CPU affinity only after VMs boot up fully.
 +
**** https://gerrit.fd.io/r/c/csit/+/38550
 +
*** Another issue may be related to the TaiShan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
** The link issue in DPDK testpmd test cases on Ampere Altra is still there.
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
***
 +
** Verify jobs, merge jobs, device testing, and release testing are all fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
* VPP
 +
 
 +
'''03/07/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
* CSIT
 +
** A timeout issue occurs periodically on the TaiShan server, even in release testing.
 +
** The link issue in DPDK testpmd test cases on Ampere Altra is still there.
 +
** Verify jobs, merge jobs, device testing, and release testing are all fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
 +
* VPP
 +
 
 +
'''02/21/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** DPDK testpmd failure: XL710 interface does not come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
 +
****** Not reproduced on the local setup - need to schedule a debug session to reproduce the issue in the FD.io lab.
 +
****** Confirm with Vexxhost people whether replacing the Intel NICs is feasible
 +
****** Tried waiting longer, but the interface still does not come up; a restart is not enough either - need to figure out a reliable workaround.
 +
****** Tried an older version of DPDK - 21.08 does not work. May need to try an even older version.
 +
****** Could it be related to the NIC's speed, duplex, and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware - check the local XL710 firmware version
 +
****** Will talk to the DPDK i40e maintainer to seek their help
 +
******* DPDK port/link status is broken - l3fwd has the same issue
 +
******* Sent a detailed email to the i40e maintainer on the dpdk-dev mailing list; waiting for a response
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
****** The compiler version change seems to be one of the factors in the perf degradation
 +
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on the 3n-tsh testbed
 +
****** The timeout occurred when starting VPP inside the VM, but not when starting testpmd
 +
****** Configuring isolcpus as a kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj will try
 +
******* isolcpus seems to be working fine
 +
******* Still need to root-cause the timeout issue - startup is sometimes slower
 +
******* When running the DPDK build, use only the non-isolated cores
 +
******* Both the VM and VPP start slower than before
 +
******* The timeout happens while VPP is loading plugins
 +
******* Is VPP crashing? - No crash
 +
******* Is the VM bound to isolated cores? - Need to check
 +
******* Will set up a live debug session for Tianyu and Juraj
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Enable VPP device testing per patch
 +
**** Voting right for VPP device testing on Arm is enabled
 +
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
** MLX NICs Planning
 +
*** CX6 and CX7 - CX7 is hard to get on the market - MLX NICs will be used and reported
 +
*** The VPP native RDMA driver has issues with CX6; the DPDK mlx driver is fine.
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with Buildroot - for running VPP on FVP - a distro like Ubuntu is slower than Buildroot - Lijian
 +
*** Depends on some libraries (DPDK, ipsec_mb, rdma-core, nasm, etc.) - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automating the RFC 2544 no-drop-rate (NDR) throughput test with Ixia on the N1 platform - Tianyu
 +
*** Kernel cmdline may impact NDR/PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using a 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* The patch made by Yoan has been sent upstream and is waiting for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating a crash with IPSec traffic at 90% line rate with the QAT card
 +
******* VPP native RDMA driver only - VPP metadata is corrupted - may be related to memory barriers
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
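The RFC 2544 no-drop-rate throughput search mentioned in the VPP items above is commonly automated as a binary search over the offered load. The Python sketch below illustrates that idea only. It is not the Ixia or CSIT implementation. The <code>measure_loss</code> callback, which would drive the traffic generator for one trial and return the number of lost frames, and the precision parameter are assumptions.

```python
"""Sketch: binary search for the no-drop rate (NDR) in the style of RFC 2544."""
from typing import Callable


def find_ndr(
    measure_loss: Callable[[float], int],
    line_rate: float,
    precision: float = 0.005,
) -> float:
    """Return the highest tested rate with zero frame loss.

    measure_loss(rate) is a hypothetical callback that runs one trial at the
    given offered load and returns the number of frames lost.
    precision is the stop criterion, as a fraction of line_rate.
    """
    if measure_loss(line_rate) == 0:
        return line_rate  # no loss even at line rate
    lossless, lossy = 0.0, line_rate  # search interval bounds
    while lossy - lossless > precision * line_rate:
        rate = (lossless + lossy) / 2
        if measure_loss(rate) == 0:
            lossless = rate  # no drops: move the lower bound up
        else:
            lossy = rate  # drops seen: move the upper bound down
    return lossless
```

Each trial halves the search interval, so a precision of 0.5% of line rate needs about eight trials after the initial line-rate trial. Trial duration and repeat counts are left to the callback.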
 +
 
 +
 
 +
'''02/07/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** DPDK testpmd failure: XL710 interface does not come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
 +
****** Not reproduced on the local setup - need to schedule a debug session to reproduce the issue in the FD.io lab.
 +
****** Confirm with Vexxhost people whether replacing the Intel NICs is feasible
 +
****** Tried waiting longer, but the interface still does not come up; a restart is not enough either - need to figure out a reliable workaround.
 +
****** Tried an older version of DPDK - 21.08 does not work. May need to try an even older version.
 +
****** Could it be related to the NIC's speed, duplex, and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware - check the local XL710 firmware version
 +
****** Will talk to the DPDK i40e maintainer to seek their help
 +
******* DPDK port/link status is broken - l3fwd has the same issue
 +
******* Sent a detailed email to the i40e maintainer on the dpdk-dev mailing list; waiting for a response
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
****** The compiler version change seems to be one of the factors in the perf degradation
 +
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on the 3n-tsh testbed
 +
****** The timeout occurred when starting VPP inside the VM, but not when starting testpmd
 +
****** Configuring isolcpus as a kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj will try
 +
******* isolcpus seems to be working fine
 +
******* Still need to root-cause the timeout issue - startup is sometimes slower
 +
******* When running the DPDK build, use only the non-isolated cores
 +
******* Both the VM and VPP start slower than before
 +
******* The timeout happens while VPP is loading plugins
 +
******* Is VPP crashing? - No crash
 +
******* Is the VM bound to isolated cores? - Need to check
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Enable VPP device testing per patch
 +
**** Voting right for VPP device testing on Arm is enabled
 +
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
** MLX NICs Planning
 +
*** CX6 and CX7 - CX7 is hard to get on market - MLX Nics will be used and reported
 +
*** CX6 vpp native rdma driver has issues, dpdk mlx driver is fine.
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''1/17/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Confirm with Vexxhost people if replacing intel NICs is feasible
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
****** Will talk to dpdk i40e maintainer to seek their help
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
****** Compiler version change seems to be one of the factors in the perf degradation
 +
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on 3-tsh testbed
 +
****** Timeout issue occurred when starting VPP inside the VM, but not when starting testpmd
 +
****** Configuring isolcpus via the kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Enable VPP device testing per patch
 +
**** Voting right for VPP device testing on Arm is enabled
 +
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''12/20/2022'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Confirm with Vexxhost people if replacing intel NICs is feasible
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
****** Will talk to dpdk i40e maintainer to seek their help
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
****** Compiler version change seems to be one of the factors in the perf degradation
 +
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on 3-tsh testbed
 +
****** Timeout issue occurred when starting VPP inside the VM, but not when starting testpmd
 +
****** Configuring isolcpus via the kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Enable VPP device testing per patch
 +
**** Voting right for VPP device testing on Arm is enabled
 +
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''12/06/2022'''
 +
* Attendees
 +
** Juraj Linkes
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Confirm with Vexxhost people if replacing intel NICs is feasible
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
****** Will talk to dpdk i40e maintainer to seek their help
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
****** Compiler version change seems to be one of the factors in the perf degradation
 +
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on 3-tsh testbed
 +
****** Timeout issue occurred when starting VPP inside the VM, but not when starting testpmd
 +
****** Configuring isolcpus via the kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Enable VPP device testing per patch
 +
**** Voting right for VPP device testing on Arm is enabled
 +
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''11/15/2022'''
 +
* Attendees
 +
** Juraj Linkes
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 9 am for Juraj and 3 pm for Shanghai folks
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Confirm with Vexxhost people if replacing intel NICs is feasible
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
**** Good news, No more slow down after 200 rounds of testing.
 +
***** Thunderx2 servers need to upgrade from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
 +
***** Distro upgrade to ubuntu 22.04 is still ongoing - no ETA yet
 +
****** Server configuration will remain the same, already integrated in ansible playbook
 +
***** Re-enable voting IF no more issue with 22.04 device testing
 +
****** Submit a patch to enable voting right after meeting
 +
*** Test meltdown/spectre vulnerabilities
 +
**** CSIT maintainers asked whether tools exist to test these vulnerabilities on Arm platforms (not limited to Arm)
 +
**** Will confirm this issue with support team - Lijian
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** VM cases failed only on 3n-alt performance testbed, error log report some file missing, likely configuration issue
 +
**** Another intermittent VM failure happens on tx2 and alt; need to figure out the above case first
 +
**** Server recovered after reboot; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovers the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
 
 +
'''10/18/2022'''
 +
* Attendees
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
* Miscellaneous
 +
** Reschedule the meeting to 9 am for Juraj and 3 pm for Shanghai folks
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR/PDR data difference - deep dive needed, waiting ampere folks engagement
 +
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Replace XL710 NIC? - try asking tomorrow.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** NUMA issue
 +
***** Will run the performance report on the Arm testbed once the patch to resolve the NUMA issue is merged
 +
***** Dave will help merge the patch into the corresponding branches
 +
 
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
**** Good news, No more slow down after 200 rounds of testing.
 +
***** Thunderx2 servers need to upgrade from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
 +
***** Suggest to rerun test after upgrade to 22.04
 +
***** Re-enable voting once there are no more issues with 22.04 device testing
 +
 
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Server recovered after reboot; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovers the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''9/20/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures of developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR/PDR data difference - deep dive needed, waiting ampere folks engagement
 +
***** DPDK testpmd failure: XL710 interface does not come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Replace XL710 NIC? - try asking tomorrow.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
**** Good news, No more slow down after 200 rounds of testing.
 +
***** ThunderX2 servers need to be upgraded from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
 +
***** Suggest rerunning the tests after the upgrade to 22.04
 +
***** Re-enable voting once there are no more issues with 22.04 device testing
 +
 
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns a not-allowed error.
 +
***** Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
**** QAT enabled Kernel patch release about October, upgrade kernel required.
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
** VPP build servers - 2 new ampere altra server, 2 old thunder x1 servers decommission or not - need to confirm
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern help to benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
 
 +
'''9/6/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Lijian Zhang
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures of developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR/PDR data difference - deep dive needed, waiting ampere folks engagement
 +
***** DPDK testpmd failure: XL710 interface does not come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Replace XL710 NIC? - try asking tomorrow.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
**** Good news, No more slow down after 200 rounds of testing.
 +
***** ThunderX2 servers need to be upgraded from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
 +
***** Suggest rerunning the tests after the upgrade to 22.04
 +
***** Re-enable voting once there are no more issues with 22.04 device testing
 +
 
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns a not-allowed error.
 +
***** Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
**** QAT enabled Kernel patch release about October, upgrade kernel required.
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
** VPP build servers - 2 new ampere altra server, 2 old thunder x1 servers decommission or not - need to confirm
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern help to benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''8/16/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Masksym Vynnvk
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Lijian Zhang
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures of developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR/PDR data difference - deep dive needed
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns a not-allowed error.
 +
***** Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
** VPP build servers - 2 new ampere altra server, 2 old thunder x1 servers decommission or not - need to confirm
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern help to benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''8/2/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Masksym Vynnvk
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures of developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR PDR data difference - deep dive needed, MRR is
 +
******
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns a not-allowed error.
 +
***** Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
** VPP build servers - 2 new ampere altra server, 2 old thunder x1 servers decommission or not - need to confirm
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern help to benchmark VPP using 100G MLX5 NIC - scalability test
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''7/19/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures of developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
 
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns a not-allowed error.
 +
***** Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern help to benchmark VPP using 100G MLX NIC
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
 
 +
 
 +
'''7/5/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration difference between CSIT env and local env(Hugepage size, startup.conf parameters and etc)
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
 
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting for release testing to finish - ETA 1 or 2 weeks
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to the PF i40e driver: when VLAN stripping is configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs when mounting /dev/vfio
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern help to benchmark VPP on N1 platforms
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Tested perfmon patch - Jieqiang
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
 
 +
'''6/21/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration difference between CSIT env and local env(Hugepage size, startup.conf parameters and etc)
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
 
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting for release testing to finish - ETA 1 or 2 weeks
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to the PF i40e driver: when VLAN stripping is configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs when mounting /dev/vfio
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Tested perfmon patch - Jieqiang
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
 
 +
'''6/7/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration difference between CSIT env and local env(Hugepage size, startup.conf parameters and etc)
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
 
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting for release testing to finish - ETA 1 or 2 weeks
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to the PF i40e driver: when VLAN stripping is configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs when mounting /dev/vfio
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Tested perfmon patch - Jieqiang
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
 
 +
'''5/17/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Lijian Zhang
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration difference between CSIT env and local env(Hugepage size, startup.conf parameters and etc)
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
 
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting for release testing to finish - ETA 1 or 2 weeks
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to the PF i40e driver: when VLAN stripping is configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
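
The build-server sizing noted above (256G RAM, about 16G per job) implies a ceiling on how many jobs can run at once. A minimal sketch of that arithmetic follows; the function name and the `reserved_gb` headroom parameter are illustrative assumptions, not part of any CI configuration mentioned in these minutes.

```python
def max_concurrent_jobs(total_ram_gb: int, per_job_gb: int, reserved_gb: int = 0) -> int:
    """Return how many jobs fit in RAM after setting aside reserved_gb.

    reserved_gb is a hypothetical allowance for the OS and other services.
    """
    if per_job_gb <= 0:
        raise ValueError("per_job_gb must be positive")
    usable_gb = max(total_ram_gb - reserved_gb, 0)
    return usable_gb // per_job_gb


# Figures from the minutes: 256G RAM, roughly 16G per job.
print(max_concurrent_jobs(256, 16))                  # 16
# With an assumed 32G kept free for the host.
print(max_concurrent_jobs(256, 16, reserved_gb=32))  # 14
```

Memory is only one bound; CPU cores and the 500G disk may limit concurrency sooner.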
 +
 
 +
'''4/5/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Lijian Zhang
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures for setting up Ampere Altra in the FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent this to Jieqiang previously.
 +
 
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
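
The /dev/vfio item above describes exposing only the VFIO group nodes in use instead of the whole directory. The sketch below illustrates that idea under stated assumptions: it is not the code from the CSIT patch, and both helper names are hypothetical. It relies on two standard Linux facts: a PCI device's IOMMU group is the target of its `iommu_group` symlink in sysfs, and VFIO users need the `/dev/vfio/vfio` container node alongside the per-group nodes.

```python
import os


def iommu_group(pci_addr: str, sysfs_root: str = "/sys") -> str:
    """Return the IOMMU group number of a PCI device as a string.

    Resolves the 'iommu_group' symlink under the device's sysfs directory.
    """
    link = os.path.join(sysfs_root, "bus/pci/devices", pci_addr, "iommu_group")
    return os.path.basename(os.path.realpath(link))


def vfio_device_paths(groups):
    """List the device nodes to expose for the given IOMMU groups.

    Always includes the shared container node, then one node per unique group.
    """
    paths = ["/dev/vfio/vfio"]
    for group in sorted(set(groups), key=int):
        paths.append(f"/dev/vfio/{group}")
    return paths


# Example with made-up group numbers; duplicates collapse to one entry.
print(vfio_device_paths(["42", "7", "42"]))
# ['/dev/vfio/vfio', '/dev/vfio/7', '/dev/vfio/42']
```

On a real host the group list would come from calling `iommu_group()` on each VF's PCI address before the container starts.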
 +
 
 +
'''3/15/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
'''3/1/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Paperwork for shipment is done
 +
*** Build servers will arrive at end of Jan
 +
*** Performance servers will arrive in Feb
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
 
 +
'''1/25/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Paperwork for shipment is done
 +
*** Build servers will arrive at end of Jan
 +
*** Performance servers will arrive in Feb
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
 
 +
'''1/18/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
 
 +
'''1/11/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns "not allowed".
 +
***** Not related to iavf driver; AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
 
 +
'''12/14/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns "not allowed".
 +
***** Not related to iavf driver; AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
** VPP IPv6 Benchmarking and Profiling - Jieqiang
 +
*** IPv6 profiling
 +
**** No perf bump for lookup_x2 function in FD.io Gerrit
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
 
 +
'''12/07/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns "not allowed".
 +
***** Not related to iavf driver; AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
** VPP IPv6 Benchmarking and Profiling - Jieqiang
 +
*** IPv6 profiling
 +
**** No perf bump for lookup_x2 function in FD.io Gerrit
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
 
 +
'''11/30/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns "not allowed".
 +
***** Not related to iavf driver; AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
** New Arm servers shipment to the FD.io lab
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
** VPP IPv6 Benchmarking and Profiling - Jieqiang
 +
*** IPv6 profiling
 +
**** No perf bump for lookup_x2 function in FD.io Gerrit
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Picchi
 +
'''11/23/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch merged and running, expected report shows next week.
 +
****** Inbound patch pending on merge, need maintainer's review
 +
****** https://gerrit.fd.io/r/c/csit/+/34256
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns "not allowed".
 +
***** Not related to iavf driver; AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
** New Arm servers shipment to the FD.io lab
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Performance degradation with quad loop unrolling applied on ip6_lookup_inline
 +
**** Patch the current kernel to enable perfmon plugin on VPP
 +
**** Need to check performance for IPv6 subnet routing
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Picchi
 +
 
 +
 
 +
'''11/16/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Inbound patch pending on merge, need maintainer's review
 +
****** https://gerrit.fd.io/r/c/csit/+/34256
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns "not allowed".
 +
***** Not related to iavf driver; AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
***** Enable VPP device testing per patch
 +
** New Arm servers shipment to the FD.io lab
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Performance degradation with quad loop unrolling applied on ip6_lookup_inline
 +
**** Patch the current kernel to enable perfmon plugin on VPP
 +
**** Need to check performance for IPv6 subnet routing
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Picchi
 +
 
 +
'''11/09/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting for release testing to finish - ETA 1 to 2 weeks
 +
****** Inbound patch pending merge, needs maintainer's review
 +
****** https://gerrit.fd.io/r/c/csit/+/34256
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to the PF i40e driver: when VLAN stripping is configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
 +
******* Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx entries in use
 +
******* Addressed comments, waiting for Peter's review - Peter approved, patch merged, now monitoring.
 +
******* Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Will enable voting right soon after the patch gets merged
 +
** New Arm servers shipment to the FD.io lab
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance numbers: NEON > SVE-VLA > SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
** VPP IPv4 fragmentation
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Performance degradation with quad loop unrolling applied on ip6_lookup_inline
 +
**** Patch the current kernel to enable perfmon plugin on VPP
 +
**** Need to check performance for IPv6 subnet routing
 +
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Picchi
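
The /dev/vfio mounting item in the VPP Device notes above (mount only the /dev/vfio/xxx entries in use rather than the whole directory) can be sketched as follows. This is a minimal illustration, not the content of the CSIT patch: the function name <code>vfio_volume_args</code>, the <code>include_container_node</code> parameter and the choice to also mount the <code>/dev/vfio/vfio</code> node are assumptions, and the group number 141 is only an example value.

```python
def vfio_volume_args(group_ids, include_container_node=True):
    """Build docker-style --volume arguments for specific VFIO groups.

    Hypothetical helper: mounts individual /dev/vfio/<group> nodes
    instead of the whole /dev/vfio directory.
    """
    paths = []
    if include_container_node:
        # Assumption: the shared VFIO container node is still needed.
        paths.append("/dev/vfio/vfio")
    # De-duplicate and sort so the generated command line is stable.
    for gid in sorted(set(int(g) for g in group_ids)):
        paths.append(f"/dev/vfio/{gid}")

    args = []
    for path in paths:
        args += ["--volume", f"{path}:{path}"]
    return args


if __name__ == "__main__":
    # Example: one VF whose IOMMU group is 141 (example value only).
    print(" ".join(vfio_volume_args([141])))
```

Each container then sees only the IOMMU groups of the VFs assigned to it, so containers no longer share the full /dev/vfio directory.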
 +
 
 +
'''11/02/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** https://gerrit.fd.io/r/c/csit/+/34256
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Race condition occurs
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
 +
******* Addressed comments, waiting for Peter's review.
 +
******* Will enable voting right soon after the patch gets merged
 +
** New Arm servers shipment to the FD.io lab
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance numbers: NEON > SVE-VLA > SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Performance degradation with quad loop unrolling applied on ip6_lookup_inline
 +
**** Patch the current kernel to enable perfmon plugin on VPP
 +
**** Need to check performance for IPv6 subnet routing
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
 +
 
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Picchi
 +
 
 +
 
 +
'''10/26/2021'''
 +
* Attendees
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week. - closed
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** Inbound IPsec: reproduced and need to investigate - Juraj
 +
***** Flow cache is slower with 1 and 10 SPD entries, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** IPsec SPD input/output case ongoing
 +
***** Adding IPsec SPD outbound test cases 64B 1, 100 and 1k SPD entries, 1, 2, 4 cores, on tx2 testbed - clarified
 +
****** Flow cache on and off cases need to be measured.
 +
***** L2 BD 20k test cases take too long to execute; removed on taishan.
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 3n-tsh testbed unreachable, investigating right now - Juraj
 +
***** TG firmware upgrade is in progress
 +
***** Server unreachable due to firmware & driver update - resolved - update all done
 +
**** Release testing for 21.10 starts
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 version by making iavf PMD as default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before, only happens on the AArch64 platform
 +
***** Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Checked whether any of those arguments is the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
****** Race condition occurs
 +
****** try mounting a part of /dev/vfio to see if issue can be resolved
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
 +
******* Addressed comments, waiting for Peter's review.
 +
******* Will enable voting right soon after the patch gets merged
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
**** Need vexxhost to confirm the Ethernet / power cable types.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
 +
**** New servers are in the procurement process
 +
**** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
****** CPU not fully utilized on Arm, need further investigation
 +
** Intel NIC firmware upgrade on Arm - not supported
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian and Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
****** Enabling DMC 620 is closer to a real system, but performance will drop
 +
****** Build a system using VPP memif and pktgen
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
**** Plan to try quad loop unrolling for ip6_lookup_inline function
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
 +
 
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Try to use ansible to deploy VPP automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from maintainer and Intel engineer
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Zach
 +
 
 +
'''10/19/2021'''
 +
* Attendees
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week. - closed
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** Inbound IPsec: reproduced and need to investigate - Juraj
 +
***** Flow cache is slower with 1 and 10 SPD entries, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 3n-tsh testbed unreachable, investigating right now - Juraj
 +
***** TG firmware upgrade is in progress
 +
***** Server unreachable due to firmware & driver update - resolved - update all done
 +
**** Release testing for 21.10 starts
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 version by making iavf PMD as default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before, only happens on the AArch64 platform
 +
***** Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Checked whether any of those arguments is the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
****** Race condition occurs
 +
****** try mounting a part of /dev/vfio to see if issue can be resolved
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
 +
******* Will enable voting right soon after the patch gets merged
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
**** Need vexxhost to confirm the Ethernet / power cable types.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
 +
**** New servers are in the procurement process
 +
**** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
****** CPU not fully utilized on Arm, need further investigation
 +
** Intel NIC firmware upgrade on Arm - not supported
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian and Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
****** Enabling DMC 620 is closer to a real system, but performance will drop
 +
****** Build a system using VPP memif and pktgen
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
**** Plan to try quad loop unrolling for ip6_lookup_inline function
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
 +
 
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Try to use ansible to deploy VPP automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from maintainer and Intel engineer
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Zach
 +
 
 +
'''10/12/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week. - closed
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** Inbound IPsec: reproduced and need to investigate - Juraj
 +
***** Flow cache is slower with 1 and 10 SPD entries, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 3n-tsh testbed unreachable, investigating right now - Juraj
 +
***** TG firmware upgrade is in progress
 +
***** Server unreachable due to firmware & driver update - resolved - update all done
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 version by making iavf PMD as default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before, only happens on the AArch64 platform
 +
***** Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Checked whether any of those arguments is the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
****** Race condition occurs
 +
****** try mounting a part of /dev/vfio to see if issue can be resolved
 +
****** Talked with Peter, Juraj is working on prototype of mounting part of /dev/vfio
 +
****** x86 VPP device job is fine, because its firmware & driver are old
 +
****** Arm VPP device servers have updated drivers; VLAN stripping is not allowed, and the VLAN configuration cannot be removed from lab view.
 +
******  only performance testbeds have NIC drivers updated
 +
****** maintainer doesn't want to add an option to the VPP config
 +
****** may need to check whether x86 has the same issue with the same driver version before reaching out to Intel
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
**** Need vexxhost to confirm the Ethernet / power cable types.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
 +
**** New servers are in the procurement process
 +
**** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
****** CPU not fully utilized on Arm, need further investigation
 +
** Intel NIC firmware upgrade on Arm - not supported
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian and Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
****** Enabling DMC 620 is closer to a real system, but performance will drop
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
 +
 
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from maintainer and Intel engineer
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Zach
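
The IPsec items above mention testing the flow cache for hash collisions and stale entry overwrites. The sketch below illustrates those two behaviours with a small direct-mapped cache. It is a toy model under stated assumptions, not the VPP implementation: the class name <code>FlowCache</code>, the epoch counter used to mark entries stale, and the integer flow keys are all hypothetical.

```python
class FlowCache:
    """Toy direct-mapped flow cache (illustrative only, not VPP code)."""

    def __init__(self, size=8):
        self.size = size
        self.slots = [None] * size  # each slot holds (key, policy, epoch)
        self.epoch = 0              # bumped whenever policies change

    def _index(self, key):
        return hash(key) % self.size

    def invalidate_all(self):
        # Assumed invalidation scheme: entries from older epochs are stale.
        self.epoch += 1

    def insert(self, key, policy):
        # A colliding or stale entry in the slot is simply overwritten.
        self.slots[self._index(key)] = (key, policy, self.epoch)

    def lookup(self, key):
        slot = self.slots[self._index(key)]
        if slot is None:
            return None
        cached_key, policy, epoch = slot
        if cached_key != key or epoch != self.epoch:
            return None  # miss: collision with another flow, or stale entry
        return policy


if __name__ == "__main__":
    cache = FlowCache(size=8)
    cache.insert(1, "protect")
    cache.insert(9, "bypass")   # 9 % 8 == 1 % 8, so this evicts flow 1
    print(cache.lookup(1), cache.lookup(9))
    cache.invalidate_all()      # e.g. after an SPD policy update
    print(cache.lookup(9))
```

A miss in this model would fall back to the full policy lookup, so a collision or stale entry costs performance but should not change the matched policy.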
 +
 
 +
'''09/28/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** Inbound IPsec: reproduced and need to investigate - Juraj
 +
***** Flow cache is slower with 1 and 10 SPD entries, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 3n-tsh testbed unreachable, investigating right now - Juraj
 +
***** TG firmware upgrade is in progress
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 version by making iavf PMD as default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before, only happens on the AArch64 platform
 +
***** Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Checked whether any of those arguments is the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
****** Race condition occurs
 +
****** try mounting a part of /dev/vfio to see if issue can be resolved
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
**** Need vexxhost to confirm the Ethernet / power cable types.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
 +
**** New servers are in the procurement process
 +
**** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian and Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
 +
 
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from maintainer and Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''09/14/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate - Juraj
 +
******** Learn more about RFC and need time to understand more - on hold - waiting for Neale's response
 +
***** Flow cache with 1 and 10 SPD entries is slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
***** A few more jobs run for release 21.06 and will be finished soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before, only happens on the AArch64 platform
 +
***** Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Checked whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are in plan to ship
 +
**** Need vexxhost guys to confirm ethernet / power cable type info.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang have got VPN access now
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan hit and is fixing some issues running SVE in a qemu VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied instead of referenced from src port to all dst ports
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
**** Try IPv4 multicasting & L2 flood testing which works fine
 +
**** ip4-replicate node in IPv4 multicasting/l2-flood node in L2 flood testing
 +
***** show mbuf is copied so that ref_cnt will always be one
 +
****** DPDK 21.08 has the patches, need to verify on VPP
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Patch split into 3 components
 +
***** acl: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33597 (Merged)
 +
***** dpdk: fix prefetch assert on Arm https://gerrit.fd.io/r/c/vpp/+/33598 (Merged)
 +
***** session: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33599 (Merged)
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from maintainer and Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''09/07/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate - Juraj
 +
******** Learn more about RFC and need time to understand more - on hold - waiting for Neale's response
 +
***** Flow cache with 1 and 10 SPD entries is slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
***** A few more jobs run for release 21.06 and will be finished soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before, only happens on the AArch64 platform
 +
***** Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are in plan to ship
 +
**** Need vexxhost guys to confirm ethernet / power cable type info.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang have got VPN access now
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan hit and is fixing some issues running SVE in a qemu VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied instead of referenced from src port to all dst ports
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
**** Try IPv4 multicasting & L2 flood testing which works fine
 +
**** ip4-replicate node in IPv4 multicasting/l2-flood node in L2 flood testing
 +
***** show mbuf is copied so that ref_cnt will always be one
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Patch split into 3 components
 +
***** acl: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33597 (Merged)
 +
***** dpdk: fix prefetch assert on Arm https://gerrit.fd.io/r/c/vpp/+/33598 (Under review)
 +
***** session: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33599 (Merged)
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from one Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''08/31/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate - Juraj
 +
******** Learn more about RFC and need time to understand more - on hold - waiting for Neale's response
 +
***** Flow cache with 1 and 10 SPD entries is slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
***** A few more jobs run for release 21.06 and will be finished soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before, only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Juraj modified script to reproduce the issue - Lijian will try it locally
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Lijian has slightly different firmware and driver versions
 +
****** Tried Mellanox card (rdma driver) multiple times - did not see the same issue that happens on the XL710 NIC - Juraj
 +
****** Lijian can use Juraj's script to reproduce the issue on local tx2 server
 +
******* Reducing the numa buffer allocation size resolves this issue
 +
******* Observed from the error log of numa buffer allocation
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are in plan to ship
 +
**** Need vexxhost guys to confirm ethernet / power cable type info.
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang have got VPN access now
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan hit and is fixing some issues running SVE in a qemu VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied instead of referenced from src port to all dst ports
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
**** Try IPv4 multicasting & L2 flood testing which works fine
 +
**** ip4-replicate node in IPv4 multicasting/l2-flood node in L2 flood testing
 +
***** show mbuf is copied so that ref_cnt will always be one
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Patch split into 3 components
 +
***** acl: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33597 (Merged)
 +
***** dpdk: fix prefetch assert on Arm https://gerrit.fd.io/r/c/vpp/+/33598 (Under review)
 +
***** session: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33599 (Merged)
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''08/24/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate - Juraj
 +
******** Learn more about RFC and need time to understand more - on hold - waiting for Neale's response
 +
***** Flow cache with 1 and 10 SPD entries is slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
***** A few more jobs run for release 21.06 and will be finished soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before, only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Juraj modified script to reproduce the issue - Lijian will try it locally
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Lijian has slightly different firmware and driver versions
 +
****** Tried Mellanox card (rdma driver) multiple times - did not see the same issue that happens on the XL710 NIC - Juraj
 +
****** Lijian can use Juraj's script to reproduce the issue on local tx2 server
 +
******* Reducing the numa buffer allocation size resolves this issue
 +
******* Observed from the error log of numa buffer allocation
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are in plan to ship
 +
**** Need vexxhost guys to confirm ethernet / power cable type info.
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang have got VPN access now
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan hit and is fixing some issues running SVE in a qemu VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied instead of referenced from src port to all dst ports
 +
**** Will try L2 flood test case & understand VPP/multicast code
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Issues about prefetch on current VPP code base
 +
***** Issue 1 support 128B/64B cache-line size in Arm image
 +
***** Issue 2 prefetch 'overflow' for native build
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** The patch depends on a kernel patch; waiting for it before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Zach
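The VPP Prefetch item above turns on whether the CPU's cache line is 64B or 128B. As a minimal sketch only (plain Python arithmetic, not VPP code; the function name <code>cache_lines_touched</code> is hypothetical), the number of lines that must be prefetched for the same buffer changes with the line size:

```python
from typing import List


def cache_lines_touched(addr: int, size: int, line_size: int) -> List[int]:
    """Return the start address of every cache line overlapping [addr, addr + size)."""
    if size <= 0:
        return []
    if line_size <= 0 or line_size & (line_size - 1):
        raise ValueError("line_size must be a positive power of two")
    # Round the first and last byte addresses down to their line boundaries.
    first = addr & ~(line_size - 1)
    last = (addr + size - 1) & ~(line_size - 1)
    return list(range(first, last + line_size, line_size))


if __name__ == "__main__":
    # Same 256-byte buffer, two assumed line sizes.
    for line_size in (64, 128):
        lines = cache_lines_touched(addr=0x1000, size=256, line_size=line_size)
        print(line_size, len(lines))
```

A 256-byte buffer aligned at 0x1000 spans four 64B lines but only two 128B lines, so code that hard-codes one line size issues the wrong number of prefetches on a CPU with the other.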
 +
 
 +
'''08/17/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate - Juraj
 +
******** Learn more about RFC and need time to understand more
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Manual test local vs CSIT have different result on 1-10 SPD policies.
 +
**** Release testing ongoing
 +
***** A few more jobs run for release 21.06 and will be finished soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 version by making iavf PMD as default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Juraj modified script to reproduce the issue - Lijian will try it locally
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Lijian has slightly different firmware and driver versions
 +
****** Tried Mellanox card (rdma driver) multiple times - did not see the issue observed on the XL710 NIC - Juraj
 +
****** Lijian can use Juraj's script to reproduce the issue on local tx2 server
 +
******* Reducing the numa buffer allocation size resolves this issue
 +
******* Observed from the error log of numa buffer allocation
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian and Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan ran into, and is fixing, some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied, rather than referenced, from the src port to all dst ports
 +
**** Will try L2 flood test case & understand VPP/multicast code
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Issues about prefetch on current VPP code base
 +
***** Issue 1: support 128B/64B cache-line size in Arm image
 +
***** Issue 2: prefetch 'overflow' for native build
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** The patch depends on a kernel patch; waiting for it before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Zach
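The IPsec items above mention unit tests for flow cache hash collisions and stale entry overwrites. The sketch below illustrates those two behaviours with a generic direct-mapped cache. It is not the VPP implementation: the class <code>FlowCache</code>, the 5-tuple key layout, the CRC32 index and the epoch counter are all assumptions made for illustration.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical 5-tuple key: (src, dst, sport, dport, proto).
FlowKey = Tuple[str, str, int, int, int]


@dataclass
class Entry:
    key: FlowKey
    policy_id: int
    epoch: int  # value of the cache epoch when the entry was written


class FlowCache:
    """Direct-mapped cache: one entry per bucket, the newest insert wins."""

    def __init__(self, n_buckets: int = 8) -> None:
        self.n_buckets = n_buckets
        self.buckets: List[Optional[Entry]] = [None] * n_buckets
        self.epoch = 0

    def _index(self, key: FlowKey) -> int:
        # Deterministic hash so that runs are repeatable.
        return zlib.crc32(repr(key).encode()) % self.n_buckets

    def policy_changed(self) -> None:
        # Bumping the epoch marks every existing entry as stale at once.
        self.epoch += 1

    def lookup(self, key: FlowKey) -> Optional[int]:
        entry = self.buckets[self._index(key)]
        if entry is None or entry.key != key or entry.epoch != self.epoch:
            return None  # miss: empty bucket, colliding key, or stale entry
        return entry.policy_id

    def insert(self, key: FlowKey, policy_id: int) -> None:
        # Overwrites whatever is in the bucket, colliding or stale.
        self.buckets[self._index(key)] = Entry(key, policy_id, self.epoch)


if __name__ == "__main__":
    cache = FlowCache(n_buckets=1)  # a single bucket forces every key to collide
    a = ("10.0.0.1", "10.0.0.2", 1000, 2000, 17)
    b = ("10.0.0.3", "10.0.0.4", 1000, 2000, 17)
    cache.insert(a, 1)
    cache.insert(b, 2)  # collision: b replaces a
    print(cache.lookup(a), cache.lookup(b))
    cache.policy_changed()  # b is now stale
    print(cache.lookup(b))
```

In this sketch a miss simply falls back to the full policy search, so a collision or a stale entry costs performance but never returns the wrong policy.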
 +
 
 +
'''08/10/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Manual test local vs CSIT have different result on 1-10 SPD policies.
 +
**** Release testing ongoing
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 version by making iavf PMD as default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Juraj modified script to reproduce the issue - Lijian will try it locally
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Lijian has slightly different firmware and driver versions
 +
****** Tried Mellanox card (rdma driver) multiple times - did not see the issue observed on the XL710 NIC - Juraj
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian now has VPN access
 +
*** Juraj signed Jieqiang's key
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan ran into, and is fixing, some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128, CLI issue only, CSIT's python API works fine.
 +
*** Internal patch to resolve this issue under review - upstreamed
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** 4 loop unrolling decreasing performance
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** The patch depends on a kernel patch; waiting for it before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Zach
 +
 
 +
'''08/03/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Manual test local vs CSIT have different result on 1-10 SPD policies.
 +
**** Release testing ongoing
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 version by making iavf PMD as default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Will try Mellanox card to see if same issue happens - Juraj
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian now has VPN access
 +
*** Juraj signed Jieqiang's key
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan ran into, and is fixing, some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
*** Internal patch to resolve this issue under review
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** 4 loop unrolling decreasing performance
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** The patch depends on a kernel patch; waiting for it before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Zach
 +
 
 +
'''07/27/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Manual test local vs CSIT have different result on 1-10 SPD policies.
 +
**** Release testing ongoing
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 21.06 vs 21.01 see performance drop on https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
 +
***** May need to check VM and IPsec cases
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 version by making iavf PMD as default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Random issue, more frequently happening on Arm
 +
******* Not seen recently, in CI or manually.
 +
**** scapy unexpected timeout issue: packet drop or slow issue?
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** Connection issue between Jenkins and the build executor in FD.io lab
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned to ship
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian now has VPN access
 +
*** Juraj signed Jieqiang's key
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan ran into, and is fixing, some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** 4 loop unrolling decreasing performance
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code: having some questions/comments, would like a review meeting - Lijian
 +
*** Implemented statistics from PMUv3 - done
 +
**** The patch depends on a kernel patch; waiting for it before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Zach
 +
 
 +
'''07/20/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Manual test local vs CSIT have different result on 1-10 SPD policies.
 +
**** Release testing ongoing
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 21.06 vs 21.01 see performance drop on https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
 +
***** May need to check VM and IPsec cases
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 version by making iavf PMD as default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Random issue, more frequently happening on Arm
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** Connection issue between Jenkins and the build executor in FD.io lab
 +
** Shipment of new advanced server to the FD.io lab
 +
*** Two advanced servers are planned to ship
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian now has VPN access
 +
*** Juraj signed Jieqiang's key
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** Juan ran into, and is fixing, some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
** VPP mbuf-fast-free tx offload
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
**** https://gerrit.fd.io/r/c/vpp/+/33062
 +
**** https://gerrit.fd.io/r/c/vpp/+/33063
 +
**** https://gerrit.fd.io/r/c/vpp/+/33061
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
***** Patches have been upstreamed and waiting for review
 +
****** https://gerrit.fd.io/r/c/vpp/+/32420
 +
***** Prepare the memif readout - Tianyu
 +
***** Add support for VPP aarch64 docker image build
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''07/13/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Expected to be merged soon
 +
***** Flow cache with 1 and 10 SPD entries is slower; still investigating. Manual local tests and CSIT give different results with 1-10 SPD policies.
 +
****** Hugepage size, numa-node, core isolation etc. may need to check.
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
 +
**** Release testing ongoing
 +
**** IPsec SPD input/output case ongoing
 +
**** Juraj may share the steps for how CSIT handles new configuration changes
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms. - 4-core container case failed on x86 and Arm - Fixed and passing.
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** Performance drop seen in 21.06 vs 21.01: https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
 +
***** May need to check VM and IPsec cases
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily. A script to reproduce was provided; Intel is busy with other work, debugging in progress.
 +
****** Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
 +
***** Tried to reproduce with another set of firmware etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled for deprecation. With the iavf PMD, the issue is not seen. DPDK 21.05 fixes this by making the iavf PMD the default. The fix is to be ported to the older VPP LTS release branches; currently it is planned only for the VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Random issue, more frequently happening on Arm
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
** VPN access request to FD.io Arm servers
 +
*** Will remind Machiek to sign Lijian's GPG public key.
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
 
 +
'''07/06/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Expected to be merged soon
 +
***** Flow cache with 1 and 10 SPD entries is slower; still investigating. Manual local tests and CSIT give different results with 1-10 SPD policies.
 +
****** Hugepage size, numa-node, core isolation etc. may need to check.
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
 +
**** Release testing ongoing
 +
**** IPsec SPD input/output case ongoing
 +
**** Juraj may share the steps for how CSIT handles new configuration changes
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms. - 4-core container case failed on x86 and Arm - Fixed and passing.
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** Performance drop seen in 21.06 vs 21.01: https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
 +
***** May need to check VM and IPsec cases
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily. A script to reproduce was provided; Intel is busy with other work, debugging in progress.
 +
****** Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
 +
***** Tried to reproduce with another set of firmware etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled for deprecation. With the iavf PMD, the issue is not seen. DPDK 21.05 fixes this by making the iavf PMD the default. The fix is to be ported to the older VPP LTS release branches; currently it is planned only for the VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Random issue, more frequently happening on Arm
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
** VPN access request to FD.io Arm servers
 +
*** Will remind Machiek to sign Lijian's GPG public key.
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** Juan ran into some issues running SVE in a QEMU VM and is fixing them
 +
**** SVE validation on FPGA platform
 +
** VPP Prefetch
 +
*** Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
 +
**** Repeat the same test on Ampere server - PMU cache-miss less for write always
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
***** Made some NEON changes, seeing some microbenchmark improvement
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang
 +
**** There may be a CSIT case named iacldstbase
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
***** Add support for VPP aarch64 docker image build
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''06/29/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Expected to be merged soon
 +
***** Flow cache with 1 and 10 SPD entries is slower; still investigating. Manual local tests and CSIT give different results with 1-10 SPD policies.
 +
****** Hugepage size, numa-node, core isolation etc. may need to check.
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
 +
**** Release testing ongoing
 +
**** IPsec SPD input/output case ongoing
 +
**** Juraj may share the steps for how CSIT handles new configuration changes
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms. - 4-core container case failed on x86 and Arm - Fixed and passing.
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** Performance drop seen in 21.06 vs 21.01: https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
 +
***** May need to check VM and IPsec cases
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** Community plans to drop the support for CentOS-8.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily. A script to reproduce was provided; Intel is busy with other work, debugging in progress.
 +
****** Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
 +
***** Tried to reproduce with another set of firmware etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled for deprecation. With the iavf PMD, the issue is not seen. DPDK 21.05 fixes this by making the iavf PMD the default. The fix is to be ported to the older VPP LTS release branches; currently it is planned only for the VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Debugging
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
** VPN access request to FD.io Arm servers
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** Juan ran into some issues running SVE in a QEMU VM and is fixing them
 +
**** SVE validation on FPGA platform
 +
** VPP Prefetch
 +
*** Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
 +
**** Repeat the same test on Ampere server - PMU cache-miss less for write always
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
***** Made some NEON changes, seeing some microbenchmark improvement
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang
 +
**** There may be a CSIT case named iacldstbase
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
***** Add support for VPP aarch64 docker image build
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''06/22/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries
 +
****** Expected to be merged soon
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
 +
**** Release testing ongoing
 +
**** IPsec SPD input/output case ongoing
 +
**** Juraj may share the steps for how CSIT handles new configuration changes
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms. - 4-core container case fails on x86 and Arm
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** Community plans to drop the support for CentOS-8.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily. A script to reproduce was provided; Intel is busy with other work, debugging in progress.
 +
****** Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
 +
***** Tried to reproduce with another set of firmware etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled for deprecation. With the iavf PMD, the issue is not seen. DPDK 21.05 fixes this by making the iavf PMD the default. The fix is to be ported to the older VPP LTS release branches; currently it is planned only for the VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** vfio-pci driver may be the root cause
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
** VPN access request to FD.io Arm servers
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** Juan ran into some issues running SVE in a QEMU VM and is fixing them
 +
**** SVE validation on FPGA platform
 +
** VPP Prefetch
 +
*** Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
 +
**** Repeat the same test on Ampere server
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
***** Made some NEON changes, seeing some microbenchmark improvement
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang
 +
**** There may be a CSIT case named iacldstbase
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
***** Add support for VPP aarch64 docker image build
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''06/15/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
***** VPP exits with the IPsec startup config, try startup config from Zach's email - Juraj
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
 +
**** Release testing ongoing
 +
**** IPsec SPD input/output case ongoing
 +
**** Juraj may share the steps for how CSIT handles new configuration changes
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms. - 4-core container case fails on x86 and Arm
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** Community plans to drop the support for CentOS-8.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily. A script to reproduce was provided; Intel is busy with other work, debugging in progress.
 +
****** Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
 +
***** Tried to reproduce with another set of firmware etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled for deprecation. With the iavf PMD, the issue is not seen. DPDK 21.05 fixes this by making the iavf PMD the default. The fix is to be ported to the older VPP LTS release branches; currently it is planned only for the VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly. - DaveW
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** Juan ran into some issues running SVE in a QEMU VM and is fixing them
 +
**** SVE validation on FPGA platform
 +
** VPP Prefetch
 +
*** Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
 +
**** Repeat the same test on Ampere server
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
***** Made some NEON changes; seeing some microbenchmark improvement
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang - there may be a CSIT case named iacldstbase
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics have been updated by the maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Waiting for review comments on outbound side before upstream to VPP
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Zach
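The "VPP Classify basic inbound L3 src ip / prot case" item above can be pictured with a small mask-and-match model. This is only an illustrative Python sketch under stated assumptions, not VPP's classifier code or API: `ClassifyTable`, `ipv4_header`, `MASK` and the action strings are hypothetical names invented for the example. The mask selects the IPv4 protocol byte (offset 9) and the source address (offsets 12 to 15); a packet matches a session when its masked header bytes equal the session's masked bytes.

```python
import socket
import struct
from typing import Dict, Optional


def ipv4_header(src: str, dst: str, proto: int) -> bytes:
    """Build a minimal 20-byte IPv4 header (no options, checksum left at 0)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20, 0, 0,      # version/IHL, TOS, total length, ID, flags/fragment
        64, proto, 0,           # TTL, protocol, checksum
        socket.inet_aton(src), socket.inet_aton(dst),
    )


# Hypothetical mask: protocol byte (offset 9) and source address (offsets 12..15).
MASK = bytes(9) + b"\xff" + bytes(2) + b"\xff" * 4 + bytes(4)


def masked(header: bytes, mask: bytes) -> bytes:
    """AND the header with the mask so only the selected fields remain."""
    return bytes(h & m for h, m in zip(header, mask))


class ClassifyTable:
    """Toy table: one mask per table, sessions keyed by the masked header bytes."""

    def __init__(self, mask: bytes) -> None:
        self.mask = mask
        self.sessions: Dict[bytes, str] = {}

    def add_session(self, match_header: bytes, action: str) -> None:
        self.sessions[masked(match_header, self.mask)] = action

    def classify(self, header: bytes) -> Optional[str]:
        # A single dictionary lookup replaces a per-rule comparison loop.
        return self.sessions.get(masked(header, self.mask))
```

A session added for source 10.0.0.1 and protocol 17 matches any destination address, because the destination bytes are masked out before the lookup.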
 +
 
 +
 
 +
'''06/08/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
***** VPP exits with the IPsec startup config, try startup config from Zach's email - Juraj
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Work in progress.
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms.
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** Community plans to drop the support for CentOS-8.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moving the NIC from the Arm server to an x86 server makes it easy to update the firmware; a reproduction script has been provided; Intel is busy with other work; debugging in progress.
 +
****** Intel engineers debugged the issue and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
 +
***** Tried to reproduce with another set of firmware etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results.
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
* VPP
 +
** VPP default compiler on Arm platform
 +
*** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
**** Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
 +
***** No obvious performance improvement, keep the original default compiler
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** SVE validation on FPGA platform
 +
** VPP Prefetch
 +
*** Benchmark VPP using prefetch read always vs prefetch write always - Jieqiang
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics have been updated by the maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Waiting for review comments on outbound side before upstream to VPP
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
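The IPsec items above describe replacing the linear SPD search with a hash-based flow cache, and testing hash collisions and stale entry overwrites. The sketch below models that idea in Python so the behaviour is easy to follow. It is an illustration under stated assumptions, not the C code in the Gerrit patch: `Policy`, `FlowCache`, the 5-tuple key, the one-entry-per-bucket layout and the epoch counter used to invalidate entries after a policy change are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

# Assumed flow key: (src_ip, dst_ip, proto, src_port, dst_port) as integers.
FlowKey = Tuple[int, int, int, int, int]


@dataclass(frozen=True)
class Policy:
    name: str
    src_range: Tuple[int, int]   # inclusive source address range
    dst_range: Tuple[int, int]   # inclusive destination address range
    action: str


def linear_lookup(policies: Sequence[Policy], key: FlowKey) -> Optional[Policy]:
    """Slow path: walk the policy list in order and return the first match."""
    src, dst = key[0], key[1]
    for p in policies:
        if p.src_range[0] <= src <= p.src_range[1] and \
           p.dst_range[0] <= dst <= p.dst_range[1]:
            return p
    return None


class FlowCache:
    """Direct-mapped cache in front of the linear search (one entry per bucket)."""

    def __init__(self, policies: Sequence[Policy], n_buckets: int = 1024) -> None:
        self.policies: List[Policy] = list(policies)
        self.n_buckets = n_buckets
        self.buckets: list = [None] * n_buckets   # each slot: (key, epoch, policy)
        self.epoch = 0
        self.hits = 0
        self.misses = 0

    def update_policies(self, policies: Sequence[Policy]) -> None:
        # Bumping the epoch makes every cached entry stale without clearing the table.
        self.policies = list(policies)
        self.epoch += 1

    def lookup(self, key: FlowKey) -> Optional[Policy]:
        slot = hash(key) % self.n_buckets
        entry = self.buckets[slot]
        if entry is not None and entry[0] == key and entry[1] == self.epoch:
            self.hits += 1
            return entry[2]
        # Miss, collision, or stale entry: fall back to the linear search
        # and overwrite whatever was in the slot.
        self.misses += 1
        policy = linear_lookup(self.policies, key)
        self.buckets[slot] = (key, self.epoch, policy)
        return policy
```

In this model a colliding flow simply evicts the previous entry, so correctness never depends on the cache; only the number of slow-path lookups changes. Setting `n_buckets=1` forces every flow into the same slot, which is a convenient way to exercise the collision and overwrite paths.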
 +
 
 +
 
 +
'''06/01/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifier tests and use the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases (Policy tests) have been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
******* New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
******* IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
 +
****** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
****** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Work in progress.
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate cabling issue on Taishan performance test-bed - resolved.
 +
***** Some container test cases seem to fail on all platforms.
 +
** VPP Path
 +
*** Voting and working fine.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moving the NIC from the Arm server to an x86 server makes it easy to update the firmware; a reproduction script has been provided; Intel is busy with other work; debugging in progress.
 +
****** Intel engineers debugged the issue and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
 +
***** Tried to reproduce with another set of firmware etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
*** Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
 +
** Vector length specific patch is ready
 +
** Investigating VPP classify function, use case, benchmarking - Lijian
 +
*** Start with simple use case
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE remaining work - variable type convention - needs some workaround for the 256-bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
*** Review memif patch
 +
**** Memif C11 atomics have been updated by the maintainer, not using atomic_relaxed - Tianyu
 +
*** VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case - No classify test case in CSIT. - Jieqiang
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
** Work on IPsec input/output nodes - VPP uses linear search on SPD lookups - Govind & Zach
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Waiting for review comments on outbound side before upstream to VPP
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
*** Investigated CMN-600 stats in perfmon plugin
 +
**** Abandoned, CMN-600 only gives system level view, no useful stats at node level - linux perf tool can give the same result
 +
 
 +
'''05/25/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tianyu Li
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifier tests and use the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases (Policy tests) have been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
******* New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
******* IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
 +
****** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
****** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - will look into it
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate cabling issue on Taishan performance test-bed - resolved.
 +
***** Some container test cases seem to fail on all platforms.
 +
** VPP Path
 +
*** Voting and working fine.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moving the NIC from the Arm server to an x86 server makes it easy to update the firmware; a reproduction script has been provided; Intel is busy with other work; debugging in progress.
 +
****** Intel engineers debugged the issue and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
 +
***** Tried to reproduce with another set of firmware etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Tried the new version of DPDK, but it did not help
 +
****** Contact Intel devs for possible advice
 +
****** Workaround may have too much impact on all test cases
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
*** Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
 +
** Vector length specific patch is ready
 +
** Investigating VPP classify function, use case, benchmarking - Lijian
 +
*** Start with simple use case
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE remaining work - variable type convention - needs some workaround for the 256-bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
*** Review memif patch
 +
**** Memif C11 atomics have been updated by the maintainer, not using atomic_relaxed - Tianyu
 +
*** VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case - No classify test case in CSIT. - Jieqiang
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** IPsec input node optimization work in progress - Zach & Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** SPD prototype change, introducing a flow cache with hash, has a performance improvement; discussing with the community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** IPSec unit test - make test new cases implementation
 +
***** Make test cases for IPSec policy mode - Done, included in Govind's patch, waiting for maintainer review - Zach
 +
****** Add/Remove/Add+Remove+Readd/Hash collisions/Multiple interfaces & rules
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache; discussed with Neal; the maintainer agrees with the change
 +
** perfmon CMN-600 investigating - Zach
 +
*** VPP perfmon CMN-600 patch abandoned: it gives a system-level view, not a VPP node-level view, and Linux perf can give the same result - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec flow cache outbound done, working on inbound side in a separate patch - Zach
 +
*** IPSec decryption / input node - Zach
 +
 
 +
'''05/18/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Zachary Leaf
 +
** Tianyu Li
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifier tests and use the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases (Policy tests) have been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
******* New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
******* IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
 +
****** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
****** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308
 +
****** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** Voting and working fine.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moving the NIC from the Arm server to an x86 server makes it easy to update the firmware; a reproduction script has been provided; Intel is busy with other work; debugging in progress.
 +
***** Tried to reproduce with another set of firmware etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Tried the new version of DPDK, but it did not help
 +
****** Contact Intel devs for possible advice
 +
****** Workaround may have too much impact on all test cases
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Lab move has started stage 2; part of the servers were moved first to make sure the CI service stays up.
 +
**** Lab move is done, some issues with taishan testbed
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
*** Plan to benchmark gcc-10 vs clang-12
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE remaining work - variable type convention - needs some workaround for the 256-bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
*** Review memif patch
 +
**** Functional bug related to C11 atomics has been resolved by VPP maintainer.
 +
*** VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case. - Jieqiang
 +
*** Make test cases for IPSec policy mode - Zach
 +
**** Add/Remove/Add+Remove+Readd/Hash collisions/Multiple interfaces & rules - Add more test cases
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Performance improvement using loop unrolling for memif nodes
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** IPsec input node optimization work in progress - Zach & Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** SPD prototype change, introducing a flow cache with hash, has a performance improvement; discussing with the community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** IPSec unit test - make test new cases implementation
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache; discussed with Neal; the maintainer agrees with the change
 +
** perfmon CMN-600 investigating - Zach
 +
*** VPP perfmon CMN-600 patch abandoned: it gives a system-level view, not a VPP node-level view, and Linux perf can give the same result - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec flow cache outbound done, working on inbound side in a separate patch - Zach
 +
*** IPSec decryption / input node - Zach
 +
 
 +
'''05/11/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Zachary Leaf
 +
** Tianyu Li
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifier tests and use the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases (Policy tests) have been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
******* New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
******* IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
 +
****** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** Voting and working fine.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moving the NIC from the Arm server to an x86 server makes it easy to update the firmware; a reproduction script has been provided; Intel is busy with other work; debugging in progress.
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Tried the new version of DPDK, but it did not help
 +
****** Contact Intel devs for possible advice
 +
****** Workaround may have too much impact on all test cases
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Lab move has started stage 2; part of the servers were moved first to make sure the CI service stays up.
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
**** Almost all servers have been moved except the performance testbed, which will be moved this week; everything has gone smoothly so far.
 +
**** Ubuntu 18.04 -> 20.04
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE Remaining works - variable type convention - need some workaround for 256bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
*** Review memif patch
 +
*** VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case.
 +
*** Make test cases for IPSec policy mode - Jieqiang
 +
** Review memif test cases/memif cases
 +
** Finished coding of SVE string library, bihash key compare functions
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** SVE unit testing based on test_vec, fix test_vec issues
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do readout presentation with extended people - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Performance improvement using loop unrolling for memif nodes
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** IPsec input node optimization work in progress - Zach & Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** IPSec unit test - make test new cases implementation
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
 +
** Using startup parameter to enable the IPsec flow cache feature
 +
** Adding Python test case to test IPSec node behavior - Jieqiang
 +
** perfmon CMN-600 investigating - Zach
 +
*** VPP perfmon CMN-600 patch abandoned: the counters are system level, not VPP node level, and Linux perf can give the same result - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec flow cache outbound done, working on inbound side in separate patch - Zach
 +
*** IPSec decryption / input node - Zach
 +
 
 +
'''04/27/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases (policy tests) have been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** CentOS-8 Jenkins job on Arm has been created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
***** https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Its voting right is enabled on Arm.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Tried the new version of DPDK, but it did not help
 +
****** Contact Intel devs for possible advice
 +
****** Workaround may have too much impact on all test cases
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms (Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Lab move has started stage 2; part of the servers were moved first to make sure the CI service does not go down.
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE Remaining works - variable type convention - need some workaround for 256bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** Make test cases for IPSec policy mode - Jieqiang
 +
** Review memif test cases/memif cases
 +
** Finished coding of SVE string library, bihash key compare functions
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** SVE unit testing based on test_vec, fix test_vec issues
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do readout presentation with extended people - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Performance improvement using loop unrolling for memif nodes
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** IPsec input node optimization work in progress - Zach & Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** IPSec unit test - make test new cases implementation
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
 +
** Using startup parameter to enable the IPsec flow cache feature
 +
** Adding Python test case to test IPSec node behavior - Jieqiang
 +
** perfmon CMN-600 investigating - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec decryption / input node - Zach
 +
 
 +
'''04/13/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases (policy tests) have been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Some issues occurred during the upgrade.
 +
***** Patch to resolve the building error of DPDK on 3n-tsh testbed.
 +
***** Root cause is the change of build system of DPDK on 3n-tsh related to SOC id detection.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** CentOS-8 Jenkins job on Arm has been created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
***** https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Its voting right is enabled on Arm.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Tried the new version of DPDK, but it did not help
 +
****** Contact Intel devs for possible advice
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms (Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE Remaining works - variable type convention - need some workaround for 256bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
*** Make test cases for IPSec policy mode - Jieqiang
 +
** Review memif test cases/memif cases
 +
** Finished coding of SVE string library, bihash key compare functions
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** SVE unit testing based on test_vec, fix test_vec issues
 +
** Test template update - Jieqiang
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do readout presentation with extended people - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
 +
** Using startup parameter to enable the IPsec flow cache feature
 +
** Adding Python test case to test IPSec node behavior - Jieqiang
 +
** perfmon CMN-600 investigating - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec decryption - Zach
 +
 
 +
 
 +
'''03/30/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** 2-node IPsec SPD policy test case patch is ready, starting with 1 and 1k tunnels. (40, 400 tunnels in separate patch)
 +
****** https://gerrit.fd.io/r/c/csit/+/31605
 +
****** Fixed the wrong CLI commands, but the configuration still has problems.
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Some issues occurred during the upgrade.
 +
***** Patch to resolve the DPDK build error on the Arm testbed. (Taishan DPDK cases still have issues, investigating)
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** CentOS-8 Jenkins job on Arm has been created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
***** https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Its voting right is enabled on Arm.
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Tried the new version of DPDK, but it did not help
 +
****** Contact Intel devs for possible advice
 +
**** Will try to reproduce the issue with x86 servers.
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Review memif test cases/memif cases
 +
** Finished coding of SVE string library, bihash key compare functions
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** SVE unit testing based on test_vec, fix test_vec issues
 +
** Test template update
 +
** SVE unit test in QEMU VM hit a compile issue; investigating
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do readout presentation with extended people - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Prepare the memif readout - Tianyu
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
 +
** Using startup parameter to enable the IPsec flow cache feature
 +
** Discuss with Jieqiang adding a Python test case to test IPsec node behavior
 +
** perfmon CMN-600 investigating - Zach
 +
 
 +
'''03/16/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
*** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** CentOS-8 Jenkins job on Arm has been created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Its voting right is enabled on Arm.
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version.
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
**** Will try to reproduce the issue with x86 servers.
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do readout presentation with extended people - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with community.
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''03/09/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
**** CSIT official release 21.01 is available
 +
***** https://docs.fd.io/csit/rls2101/report/
 +
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
 +
******* 20.09 vs 21.01: 'show run' vectors per call dropped from 256 to 200 - need to check DPDK version changes
 +
******* Perf drop only observed for VM cases
 +
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
 +
****** Check the number for CSIT 2101 release 
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers need investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
***** Will investigate IPSec test cases on 2n-tx2 - Juraj
 +
****** Maintainer confirmed that it is feasible
 +
******* Patch merged, https://gerrit.fd.io/r/c/csit/+/31309
 +
******* Patch created for daily running https://gerrit.fd.io/r/c/csit/+/31478
 +
******* crypto tests will be enabled on daily and report Jenkins job
 +
******* IPv6 / policy mode crypto test cases to be investigated and added
 +
****** Add memif test case to 2n-tx2 once the release testing is done.
 +
******* Takes ~1 to 1.5 hours for one round of memif testing.
 +
**** release testing for 2n-tx2
 +
***** Performance data added to daily trending page
 +
****** https://docs.fd.io/csit/master/trending/introduction/dashboard.html#n-tx2
 +
****** Test cases include L2/IPv4/IPv6/Classifier/ACL
 +
** VPP Path
 +
*** CentOS-7 will be enabled with master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will not be supported.
 +
**** CentOS-8 will be supported by Red Hat until the end of this year.
 +
*** CentOS-8 Jenkins job on Arm has been created and can be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Job is enabled https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Running per patch and voting right is enabled
 +
***** Maintainer asked for more servers for the sake of redundancy
 +
****** Sync with Dave for ARM server requirement
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
 +
**** Dependency on maintainers to fix this issue
 +
**** Intel will ship a new NIC with latest firmware
 +
***** Shipment takes a long time, based on past experience
 +
****** NIC has been shipped to Vexxhost; waiting for its arrival.
 +
***** Try to reproduce the issue on this NIC on Arm platform
 +
***** Updating firmware on the current NIC is risky
 +
**** Voting rights will be enabled once this issue is fixed
 +
****** Maintainer raised a ticket to get Intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable to CSIT maintainers
 +
*** Arm is required to present its achievements and plans to the TSC.
 +
**** Govind will prepare the slides
 +
**** Will show Arm roadmap in the next TSC meeting
 +
**** Proposals have been sent to VPP maintainer on verifying SVE/SVE2
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
 
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
**** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
**** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with Ampere or N1 team on how to enable CMN-600 counters for Ampere Altra
 +
***** Removed interrupts on Altra, but no performance improvement was seen
 +
***** Instruction cache misses are higher on Altra than on N1
 +
*** IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
** VPP compiling error on CentOS 7 - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/31421
 +
**** CentOS 7 build issue has been fixed
 +
*** Developing NEON wrapper to SVE 128/256bit on qemu
 +
 
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with community.
 +
**** perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''02/23/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
**** CSIT official release 21.01 is available
 +
***** https://docs.fd.io/csit/rls2101/report/
 +
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
 +
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
 +
****** Check the number for CSIT 2101 release 
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers need investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
****** Will investigate IPSec test cases on 2n-tx2 - Juraj
 +
******* Maintainer confirmed that it is feasible
 +
******* Patch created, https://gerrit.fd.io/r/c/csit/+/31309
 +
******* crypto tests will be enabled on daily and report Jenkins job
 +
****** Add memif test case to 2n-tx2 once the release testing is done.
 +
******* Takes ~1 to 1.5 hours for one round of memif testing.
 +
**** release testing for 2n-tx2
 +
***** Performance data added to daily trending page
 +
****** https://docs.fd.io/csit/master/trending/introduction/dashboard.html#n-tx2
 +
****** Test cases include L2/IPv4/IPv6/Classifier/ACL
 +
****** Release testing done for 2n-tx2, ongoing for 3n-tsh (due next week)
 +
****** Release report is planned to be published on 10th Feb
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Job is enabled https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Running per patch and voting right is enabled
 +
***** Maintainer asked for more servers for the sake of redundancy
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
 +
**** Dependency on maintainers to fix this issue
 +
**** Intel will ship a new NIC with latest firmware
 +
***** Shipment takes a long time empirically
 +
***** Try to reproduce the issue on this NIC on Arm platform
 +
***** Updating firmware on the current NIC is risky
 +
**** Voting rights will be enabled once this issue is fixed
 +
****** Maintainer raised the ticket to get intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable by CSIT maintainers
 +
*** Arm is required to present Arm achievement and plan to TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker - Lijian
 +
**** Latest VPP binary crashes in the QEMU docker
 +
***** System call fails inside QEMU docker when running VPP
 +
**** Verify SVE/SVE2 features inside ARM QEMU VM
 +
**** VPP maintainers want real hardware to verify SVE code
 +
***** This solution will be abandoned.
 +
**** 'make test' execution is slow
 +
**** Sync with DPDK team/VPP community to decide the solution
 +
**** Proposals have been sent to VPP maintainer on verifying SVE/SVE2
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
 
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
**** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
**** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No improvement from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
 +
***** Remove interrupts on altra but no performance improvement seen
 +
***** instruction cache misses are higher on altra than N1
 +
*** IO testing is doable with specific PCIe slots where PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP memif test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
**** 3 use cases are investigated.
 +
**** Will explore the memif logic and share the progress.
 +
**** Will share the link on details about how to run VPP in container.
 +
*** Investigate VPP agent usage - Tianyu
 +
**** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
** VPP compiling error on CentOS 7 - Jieqiang
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
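The IPsec item above mentions a linear SPD search and a plan to try loop unrolling on it. A minimal sketch of that idea, assuming a hypothetical policy shape `(lo, hi, action)` matched against a single integer key (not VPP's actual SPD structures or code), compares a plain linear scan with a 2x-unrolled scan that evaluates two policies per iteration while keeping first-match order:

```python
from typing import Optional, Sequence, Tuple

# Hypothetical policy: inclusive key range plus an action name.
Policy = Tuple[int, int, str]


def matches(policy: Policy, key: int) -> bool:
    lo, hi, _ = policy
    return lo <= key <= hi


def lookup_linear(policies: Sequence[Policy], key: int) -> Optional[str]:
    """Return the action of the first matching policy, or None."""
    for policy in policies:
        if matches(policy, key):
            return policy[2]
    return None


def lookup_unrolled2(policies: Sequence[Policy], key: int) -> Optional[str]:
    """Same result as lookup_linear, checking two policies per iteration."""
    n = len(policies)
    i = 0
    while i + 1 < n:
        p0, p1 = policies[i], policies[i + 1]
        # Both comparisons are issued before either result is used.
        m0 = matches(p0, key)
        m1 = matches(p1, key)
        if m0:
            return p0[2]
        if m1:
            return p1[2]
        i += 2
    # Handle the leftover policy when the count is odd.
    if i < n and matches(policies[i], key):
        return policies[i][2]
    return None
```

In Python this only illustrates the control flow; any real benefit from unrolling would have to be measured in the C data path on the target CPU.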
 +
 
 +
'''02/09/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
**** CSIT official release 20.09 is available
 +
***** https://docs.fd.io/csit/rls2009/report/
 +
**** CSIT official release 21.01 is ongoing
 +
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
 +
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
 +
****** Check the number for CSIT 2101 release 
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
****** Will investigate IPSec test cases on 2n-tx2 - Juraj
 +
******* Maintainer confirmed that it is feasible
 +
****** Add memif test case to 2n-tx2 once the release testing is done.
 +
**** release testing for 2n-tx2
 +
***** Performance data added to daily trending page
 +
****** https://docs.fd.io/csit/master/trending/introduction/dashboard.html#n-tx2
 +
****** Test cases include L2/IPv4/IPv6/Classifier/ACL
 +
****** Release testing is done for 2n-tx2 and ongoing for 3n-tsh (due next week)
 +
****** Release report is planned to be published on 10th Feb
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Job is enabled https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Running per patch and voting right is enabled
 +
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
****** Will verify the image uploaded by Dave if it is ready.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
**** Jenkins job to verify runs fine but slow
 +
***** https://gerrit.fd.io/r/c/ci-management/+/31083
 +
***** Maintainer asked for more servers for the sake of redundancy
 +
**** 'make test' failure on ubuntu 20.04 AARCH64
 +
***** Dave has sent email for the details
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
 +
**** Dependency on maintainers to fix this issue
 +
**** Intel will ship a new NIC with latest firmware
 +
***** Shipment takes a long time empirically
 +
***** Try to reproduce the issue on this NIC on Arm platform
 +
***** Updating firmware on the current NIC is risky
 +
**** Voting rights will be enabled once this issue is fixed
 +
****** Maintainer raised the ticket to get intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable by CSIT maintainers
 +
*** Arm is required to present Arm achievement and plan to TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker - Lijian
 +
**** Latest VPP binary crashes in the QEMU docker
 +
***** System call fails inside QEMU docker when running VPP
 +
**** Verify SVE/SVE2 features inside ARM QEMU VM
 +
**** 'make test' execution is slow
 +
**** Sync with DPDK team/VPP community to decide the solution
 +
**** Proposals have been sent to VPP maintainer on verifying SVE/SVE2
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No improvement from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
 +
***** Remove interrupts on altra but no performance improvement seen
 +
***** instruction cache misses are higher on altra than N1
 +
*** IO testing is doable with specific PCIe slots where PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP memif test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
**** 3 use cases are investigated.
 +
**** Will explore the memif logic and share the progress.
 +
**** Will share the link on details about how to run VPP in container.
 +
*** Investigate VPP agent usage - Tianyu
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
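The compiler item above gives a preference order of gcc-10 > gcc-9.2.0 > clang-10 on Ubuntu 20.04. A minimal sketch of such a priority pick, assuming the binaries are named `gcc-10`, `gcc-9` and `clang-10` on `PATH` (the binary names and the helper `pick_compiler` are illustrative assumptions, not VPP's build logic):

```python
import shutil
from typing import Callable, Optional, Sequence

# Assumed binary names for the preference order noted in the minutes.
PREFERRED = ("gcc-10", "gcc-9", "clang-10")


def pick_compiler(
    candidates: Sequence[str] = PREFERRED,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first candidate found on PATH, or None if none exist."""
    for name in candidates:
        if which(name) is not None:
            return name
    return None
```

The `which` parameter is injectable so the order can be checked without depending on what is installed on the machine.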
 +
 
 +
'''02/02/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
**** CSIT official release 20.09 is available
 +
***** https://docs.fd.io/csit/rls2009/report/
 +
**** CSIT official release 21.01 is ongoing
 +
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
 +
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
 +
****** Check the number for CSIT 2101 release 
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
****** Will investigate IPSec test cases on 2n-tx2 - Juraj
 +
******* Maintainer confirmed that it is feasible
 +
****** Add memif test case to 2n-tx2 once the release testing is done.
 +
**** release testing for 2n-tx2
 +
***** Performance data added to daily trending page
 +
****** https://docs.fd.io/csit/master/trending/introduction/dashboard.html#n-tx2
 +
****** Test cases include L2/IPv4/IPv6/Classifier/ACL
 +
****** Release report is planned to be published on 10th Feb
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
****** Will verify the image uploaded by Dave if it is ready.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
**** Jenkins job to verify runs fine but slow
 +
***** https://gerrit.fd.io/r/c/ci-management/+/31083
 +
***** Maintainer asked for more servers for the sake of redundancy
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
 +
**** Dependency on maintainers to fix this issue
 +
**** Voting rights will be enabled once this issue is fixed
 +
***** Implementation is ready and will be tested with actual patches.
 +
***** Apply a file-locking mechanism so that only one VPP instance runs at a time.
 +
****** https://gerrit.fd.io/r/c/csit/+/30425
 +
****** Patches are under review
 +
****** Maintainer raised the ticket to get intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable by CSIT maintainers
 +
*** Arm is required to present Arm achievement and plan to TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker - Lijian
 +
**** Latest VPP binary crashes in the QEMU docker
 +
***** System call fails inside QEMU docker when running VPP
 +
**** Verify SVE/SVE2 features inside ARM QEMU VM
 +
**** 'make test' execution is slow
 +
**** Sync with DPDK team/VPP community to decide the solution
 +
**** Proposals have been sent to VPP maintainer on verifying SVE/SVE2
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No improvement from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
 +
***** Remove interrupts on altra but no performance improvement seen
 +
***** instruction cache misses are higher on altra than N1
 +
*** IO testing is doable with specific PCIe slots where PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP memif test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
**** 3 use cases are investigated.
 +
**** Will explore the memif logic and share the progress.
 +
**** Will share the link on details about how to run VPP in container.
 +
*** Investigate VPP agent usage - Tianyu
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
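The VPP Device item above describes a file-locking workaround that keeps more than one VPP instance from starting at the same time. A minimal sketch of that general technique, using an exclusive non-blocking `flock` on Linux (the helper name `single_instance` and the lock path are hypothetical; this is not the code from the referenced Gerrit change):

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path: str):
    """Hold an exclusive lock on lock_path for the duration of the block.

    Raises BlockingIOError if another holder already has the lock.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            yield fd
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

A caller would wrap the start-up step in `with single_instance("/tmp/example-vpp.lock"):` (an example path) and either retry or wait when `BlockingIOError` is raised. Dropping `LOCK_NB` would instead make the second caller block until the first one finishes.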
 +
'''01/19/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
**** CSIT official release 20.09 is available
 +
***** https://docs.fd.io/csit/rls2009/report/
 +
***** Jieqiang will compare the performance data with release 20.09
 +
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
 +
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
 +
****** Check the number for CSIT 2101 release 
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done; Physical connection to the TG is done.
 +
**** almost done, two steps need to be done
 +
***** Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Take the execution time into consideration if we want to run release testing on 2n-thx2.
 +
***** It takes 9 hours to finish one round of testing.
 +
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
****** Will investigate IPSec test cases on 2n-tx2 - Juraj
 +
****** Add memif test case to 2n-tx2 once the release testing is done.
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
****** Will verify the image uploaded by Dave if it is ready.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
 +
***** Implementation is ready and will be tested with actual patches.
 +
***** Apply a file-locking mechanism so that only one VPP instance runs at a time.
 +
****** https://gerrit.fd.io/r/c/csit/+/30425
 +
****** Patches are under review
 +
****** Machiek raised the ticket to get intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable by CSIT maintainers
 +
*** Arm is required to present Arm achievement and plan to TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker
 +
**** Latest VPP binary crashes in the QEMU docker - Lijian
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No improvement from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
 +
*** IO testing is doable with specific PCIe slots where PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP memif test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
**** 3 use cases are investigated.
 +
**** Will explore the memif logic and share the progress.
 +
**** Will share the link on details about how to run VPP in container.
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
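The Altra analysis above notes a high number of context switches compared with N1SDP. As a complement to perf, the per-process counters that Linux exposes in `/proc/<pid>/status` can be sampled directly. A minimal sketch (the helper names are hypothetical; the two field names are the ones the kernel prints in that file):

```python
from typing import Dict

FIELDS = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")


def parse_ctxt_switches(status_text: str) -> Dict[str, int]:
    """Extract the context-switch counters from /proc/<pid>/status text."""
    counters: Dict[str, int] = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in FIELDS:
            counters[key.strip()] = int(value.strip())
    return counters


def read_ctxt_switches(pid: int) -> Dict[str, int]:
    """Read the counters for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_ctxt_switches(f.read())
```

Sampling `read_ctxt_switches` twice around a fixed traffic run and subtracting the values gives the number of switches during the run, which can then be compared across platforms.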
 +
 
 +
 
 +
 
 +
'''01/05/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
**** CSIT official release 20.09 is available
 +
***** https://docs.fd.io/csit/rls2009/report/
 +
***** Jieqiang will compare the performance data with release 20.09
 +
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
 +
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP. 
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done; Physical connection to the TG is done.
 +
**** almost done, two steps need to be done
 +
***** Start with basic L2/L3/IPSec/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Take the execution time into consideration if we want to run release testing on 2n-thx2.
 +
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
 +
***** Implementation is ready and will be tested with actual patches.
 +
***** Apply a file-locking mechanism so that only one VPP instance runs at a time.
 +
****** https://gerrit.fd.io/r/c/csit/+/30425
 +
****** Patches are under review
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable by CSIT maintainers
 +
*** Arm is required to present Arm achievement and plan to TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker
 +
**** Latest VPP binary crashes in the QEMU docker - Lijian
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No improvement from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
*** IO testing is doable with specific PCIe slots where PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
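The multi-arch item above is about telling vendor CPUs apart at run time. Separately from the SOC ID interface mentioned there, arm64 Linux already reports per-core `CPU implementer` and `CPU part` values in `/proc/cpuinfo`. A minimal sketch that collects the distinct pairs (the helper name `parse_cpu_ids` is hypothetical, and this is an illustration rather than how VPP selects its multi-arch variants):

```python
from typing import Optional, Set, Tuple


def parse_cpu_ids(cpuinfo_text: str) -> Set[Tuple[int, int]]:
    """Return the distinct (implementer, part) pairs found in cpuinfo text."""
    ids: Set[Tuple[int, int]] = set()
    implementer: Optional[int] = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "CPU implementer":
            implementer = int(value, 16)
        elif key == "CPU part" and implementer is not None:
            # "CPU part" follows "CPU implementer" within each core's block.
            ids.add((implementer, int(value, 16)))
    return ids
```

On a real system the input would come from `open("/proc/cpuinfo").read()`; a result with more than one pair indicates a heterogeneous core mix.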
 +
 
 +
'''12/22/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
** Will cancel the meeting on Dec 29th;
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
**** CSIT official release 20.09 is available
 +
***** https://docs.fd.io/csit/rls2009/report/
 +
***** Jieqiang will compare the performance data with release 20.05
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done; Physical connection to the TG is done.
 +
**** almost done, two steps need to be done
 +
***** Code to update the Jenkins job needs to be merged
 +
***** Start with basic L2/L3/IPSec/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Take the execution time into consideration if we want to run release testing on 2n-thx2.
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
 +
***** Implementation is ready, and will test it with actual patches.
 +
***** Apply a file locking mechanism so that only one VPP instance is running at a time.
 +
****** https://gerrit.fd.io/r/c/csit/+/30425
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable to CSIT maintainers
 +
*** LF will provide QSFP+ fiber switch for FD.io lab.
 +
**** Basically done. LF just procured the existing fiber switch currently rented by Arm in FD.io lab.
 +
**** Send the progress to relevant people in Arm - Lijian
 +
**** Confirm with Tina to ensure Arm is not charged - Lijian
 +
*** Arm is required to present Arm achievement and plan to TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features on VPP CSIT
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No positive result for 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
*** IO testing is doable with specific PCIe slots, for which PCIe bandwidth is not the bottleneck.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
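The file locking workaround noted under VPP Device above can be sketched roughly as follows. This is a minimal illustration using an advisory lock from Python's standard `fcntl` module, not the code from the referenced Gerrit change; the function names and the lock path are hypothetical.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def startup_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path while the block runs."""
    # The lock belongs to this open file, so each caller opens its own handle.
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # waits until no one else holds it
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def try_startup_lock(lock_path):
    """Non-blocking probe: True if the lock could be taken right now."""
    with open(lock_path, "w") as lock_file:
        try:
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False  # another holder is inside its startup section
        fcntl.flock(lock_file, fcntl.LOCK_UN)
        return True
```

A test harness would start its VPP instance inside `with startup_lock(path):` and leave the block once startup has finished, so concurrent jobs on the same host queue up instead of starting together.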
 +
 
 +
 
 +
'''12/15/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
** Will cancel the meeting on Dec 29th;
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
**** CSIT official release 20.09 is available
 +
***** https://docs.fd.io/csit/rls2009/report/
 +
***** Jieqiang will compare the performance data with release 20.05
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done; Physical connection to the TG is done.
 +
** VPP Path
 +
*** CentOS-7 will be enabled with master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
 +
***** Implementation is ready, and will test it with actual patches.
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable to CSIT maintainers
 +
*** LF will provide QSFP+ fiber switch for FD.io lab.
 +
**** Basically done. LF just procured the existing fiber switch currently rented by Arm in FD.io lab.
 +
**** Send the progress to relevant people in Arm - Lijian
 +
*** Arm is required to present Arm achievement and plan to TSC.
 +
**** Govind will prepare the slides
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No positive result for 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
*** IO testing is doable with specific PCIe slots, for which PCIe bandwidth is not the bottleneck.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
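The compiler priority recorded above (gcc-10 > gcc-9.2.0 > clang-10) amounts to picking the first compiler in an ordered list that is actually installed. A small sketch of that selection, assuming the binaries are named `gcc-10`, `gcc-9` and `clang-10` on the build host; these names and the helper are illustrative and are not taken from the VPP build system.

```python
import shutil

# Preference order from the minutes; the binary names are assumptions.
COMPILER_PRIORITY = ["gcc-10", "gcc-9", "clang-10"]


def pick_compiler(candidates=COMPILER_PRIORITY, which=shutil.which):
    """Return the first candidate found on PATH, or None if none is installed."""
    for name in candidates:
        if which(name) is not None:
            return name
    return None
```

The `which` parameter is only there so the lookup can be replaced in tests; by default the function queries the real `PATH`.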
 +
 
 +
 
 +
'''12/08/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
*** Use the spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done.
 +
**** Physical connection to the TG is done.
 +
**** Software installation for the perf tests is pending.
 +
**** Execution time is much slower due to thunderx
 +
***** Code changes related to SSH calls speed up 4x.
 +
** VPP Path
 +
*** Dave will add CentOS-8 Jenkins on Arm job
 +
*** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** Working with VPP/DPDK/Intel to root cause this issue. - Juraj
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable to CSIT maintainers
 +
*** LF will provide QSFP+ fiber switch for FD.io lab.
 +
**** Vexxhost just has a spare one, and LF will buy it for FD.io lab, which will probably happen this month.
 +
*** N1SDP shipment to FD.io
 +
**** Govind will track the status
 +
*** CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
 +
**** Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Machiek to provide 10G switch.
 +
**** Arm is required to present Arm achievement and plan to TSC.
 +
***** Govind will prepare the slides
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** SOC id will be available on /proc entry starting from kernel version 5.9
 +
**** Will investigate the details - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Benchmarked cross-connect and TX queue is dropping packets
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** To optimize ethernet-input and avf-input node with NEON intrinsics
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals upstreamed
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** Have to repeat the testing in the future.
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Ampere altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''12/1/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/ - Done
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/ - Done
 +
*** 20% perf-drop with L2 learning 1Mx flows, 4T4C, in release-2005
 +
**** Issue caused by - https://gerrit.fd.io/r/c/vpp/+/26549 - Sync up with Lijian
 +
*** Perf data capture for CSIT official release is done, so MRR testing with Taishan server is resolved.
 +
**** Huge-pages are not configured on Taishan, or previous 4K huge-pages are not enough.
 +
***** The issues are gone with 32k huge pages configured on the Taishan servers.
 +
**** Some random failed test cases due to SSH connection failures.
 +
*** Use the spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done.
 +
**** Physical connection to the TG is done.
 +
**** Software installation for the perf tests is pending.
 +
**** Execution time is much slower due to thunderx
 +
***** Code changes related to SSH calls speed up 4x.
 +
** VPP Path
 +
*** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022 - auto-generate docker image
 +
**** Will keep the CentOS 7 with master branch.
 +
** VPP Device
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
*** LF will provide QSFP+ fiber switch for FD.io lab.
 +
*** CSIT will install normally used os distro and kernel.
 +
*** 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
 +
**** To enable voting right for the VPP device jobs. - Juraj
 +
***** Failed tests due to sw_interface_dump api issue. - Juraj
 +
**** VPP device job is unstable
 +
***** Race condition occurs when multiple VPP instances are starting.
 +
***** Will try to update the i40e driver & firmware.
 +
*** N1SDP shipment to FD.io
 +
**** Govind will update Juraj and Machiek on the shipment status.
 +
**** Will still have to ship N1SDP to FD.io lab, and Machiek confirmed that there will be enough rack space.
 +
*** CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
 +
**** Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Machiek to provide 10G switch.
 +
**** Arm is required to present Arm achievement and plan to TSC.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** SOC id will be available on /proc entry starting from kernel version 5.9
 +
**** Will investigate the details - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** To optimize ethernet-input and avf-input node with NEON intrinsics
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 proposal
 +
*** Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
 +
*** Patches are upstreamed for comments
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Ampere altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches with ipsec-out node
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
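The huge page problem on the Taishan servers noted above can be checked before a test run by reading the counters the Linux kernel exposes in `/proc/meminfo`. The sketch below parses that text and compares the configured page count with a required count; the helper names are hypothetical, and reading the "4K" and "32k" figures in the minutes as page counts is an assumption.

```python
def parse_hugepages(meminfo_text):
    """Extract huge page counters from /proc/meminfo-style text."""
    wanted = ("HugePages_Total", "HugePages_Free", "Hugepagesize")
    result = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            # The first token is the number; Hugepagesize also carries a "kB" unit.
            result[key] = int(rest.split()[0])
    return result


def hugepages_sufficient(meminfo_text, required_pages):
    """True if at least required_pages huge pages are configured."""
    info = parse_hugepages(meminfo_text)
    return info.get("HugePages_Total", 0) >= required_pages
```

On a testbed this would be called with the contents of `/proc/meminfo`, for example `hugepages_sufficient(open("/proc/meminfo").read(), 32768)`.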
 +
 
 +
'''11/24/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
*** 20% perf-drop with L2 learning 1Mx flows, 4T4C, in release-2005
 +
**** Issue caused by - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** Perf data capture for CSIT official release is done, so MRR testing with Taishan server is resolved.
 +
**** Huge-pages are not configured on Taishan, or previous 4K huge-pages are not enough.
 +
*** Use the spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done.
 +
** VPP Path
 +
*** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022 - auto-generate docker image
 +
** VPP Device
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
*** LF will provide QSFP+ fiber switch for FD.io lab.
 +
*** CSIT will install normally used os distro and kernel.
 +
*** 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
 +
**** To enable voting right for the VPP device jobs. - Juraj
 +
***** Failed tests due to sw_interface_dump api issue. - Juraj
 +
*** N1SDP shipment to FD.io
 +
**** Govind will update Juraj and Machiek on the shipment status.
 +
**** Will still have to ship N1SDP to FD.io lab, and Machiek confirmed that there will be enough rack space.
 +
*** CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
 +
**** Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Machiek to provide 10G switch.
 +
**** Arm is required to present Arm achievement and plan to TSC.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** SOC id will be available on /proc entry starting from kernel version 5.9
 +
**** Will investigate the details - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** To optimize ethernet-input and avf-input node with NEON intrinsics
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 proposal
 +
*** Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
 +
*** Patches are upstreamed for comments
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Ampere altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches with ipsec-out node
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''11/17/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
*** 20% perf-drop with L2 learning 1Mx flows, 4T4C, in release-2005
 +
**** Issue caused by - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** Use the spare TX2 server as 2-node topology performance test-bed.
 +
**** Juraj to check with Peter about the feasibility.
 +
** VPP Path
 +
*** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022 - auto-generate docker image
 +
** VPP Device
 +
*** CSIT will install normally used os distro and kernel.
 +
*** 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
 +
**** To enable voting right for the VPP device jobs. - Juraj
 +
***** Failed tests due to sw_interface_dump api issue. - Juraj
 +
*** N1SDP shipment to FD.io
 +
**** Will still have to ship N1SDP to FD.io lab, and Machiek confirmed that there will be enough rack space.
 +
*** CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
 +
**** Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Machiek to provide 10G switch.
 +
**** Arm is required to present Arm achievement and plan to TSC.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** SOC id will be available on /proc entry starting from kernel version 5.9
 +
**** Will investigate the details - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** To optimize ethernet-input and avf-input node with NEON intrinsics
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 proposal
 +
*** Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
 +
*** Patches are upstreamed for comments
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Ampere altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches with ipsec-out node
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''11/10/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
***** Already done by juraj, the data is published on CSIT 2009 report.
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
**** Repeat tests on local N1SDP and cascade server. - Jieqiang
 +
**** Repeat the test case with latest master branch. - Jieqiang
 +
**** The patch that introduced this perf drop needs to be analyzed. - Jieqiang, Lijian
 +
**** This patch needs to be analysed on VPP 2005 and 2001 releases. - Jieqiang, Lijian
 +
**** The perf drop rate is ~5-8% on latest VPP code compared to the original data.
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
**** The iterative jobs for VPP 2009 are still running.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-report-iterative-2009-3n-tsh/
 +
**** Still running for one more week.
 +
**** Still running for more time due to Jenkins issues like Jenkins restart.
 +
**** Daily performance jobs only run MRR tests, while iterative jobs run MRR tests, NDR/PDR tests, etc., which take longer.
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
**** Move the thx2 to the same rack for tg and install the same nic on tg.
 +
**** 1g NIC for management installed on thx2, but cannot be net-booted.
 +
***** Able to net-boot from the built-in 10G NIC.
 +
***** The tx2 has been moved to the same rack where the tg is located.
 +
***** Plan to set up the weekly perf tests on the new topo.
 +
**** Port the Robot Framework configuration steps for tsh testbeds from thx1 to thx2 to speed up perf tests. - Juraj
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022 automate the generation of docker builder images.
 +
**** Plan to drop the support for CentOS 7 from Dave.
 +
**** Tried Dave's patch to generate docker image on Arm and saw some errors. - Juraj
 +
**** Test arm centos7 jenkins builder image. - Juraj.
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
 +
** VPP Device
 +
*** CSIT will install normally used os distro and kernel.
 +
*** 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
 +
**** Figure out which of the two hosts should run the Jenkins job.
 +
**** Revert to old kernel version 4.15.0-55 to avoid AVF issue.
 +
**** AVF issue is common across the platform.
 +
***** Differences between avf driver versions may be the root cause of behavior changes.
 +
**** New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 45 minutes.
 +
***** Python runs slower on new thx2 servers than 1-node skylake.
 +
***** Try a newer version of Python (such as 3.8) or split the device tests into two parts.
 +
***** Check how many CPUs get utilized for robot framework execution on thx2 server.
 +
***** Two thunderx2 are running fine right now and the VPP device jobs are almost done.
 +
***** Disabling hyperthreading on new thx2 will speed up the VPP device tests.
 +
***** Enable the voting right for the VPP device jobs. - Juraj
 +
****** Failed tests due to sw_interface_dump api issue. - Juraj
 +
*** N1SDP shipment to FD.io
 +
**** Get response from Maciek about the rack space and traffic generator availability.
 +
*** CSIT budget plan for 10g switch purchase in FD.io lab. - Juraj, Tina
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
 +
*** Summarize the meeting minutes and action items. - Lijian
 +
*** SOC id will be available on /proc entry from kernel version 5.9 - Lijian, Honnappa
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
*** Start benchmarking AVF PMD driver in VPP on N1SDP.
 +
*** Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
 +
*** avf_input and avf_output nodes don't consume more CPU cycles than dpdk-related nodes do.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
*** Apply the SVE/SVE2 intrinsics on refactoring ethernet-input node. - Lijian
 +
**** SVE/SVE2 functionality to be tested on the new development platform.
 +
**** Verify SVE/SVE2 code changes on simulator.
 +
**** Try to run standalone SVE codes on the new FPGA platform.
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Ampere altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Find out the tuned configuration for cross connect test cases using AVF PMD driver.
 +
**** Figure out corresponding configurations in CSIT scripts.
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Plans
 +
 
 +
'''11/03/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
**** Repeat tests on local N1SDP and cascade server. - Jieqiang
 +
**** Repeat the test case with latest master branch. - Jieqiang
 +
**** The patch that introduced this perf drop needs to be analyzed. - Jieqiang, Lijian
 +
**** Look into the patch to get some ideas about the code changes. - Jieqiang, Lijian
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
**** The iterative jobs for VPP 2009 are still running.
 +
**** Will keep running for one more week.
 +
***** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-report-iterative-2009-3n-tsh/
 +
***** Daily performance jobs only run MRR tests, while iterative jobs run MRR tests, NDR/PDR tests, etc., which takes longer.
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
**** Move the thx2 to the same rack for tg and install the same nic on tg.
 +
**** 1G NIC for management installed on thx2, but cannot be net-booted.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022 automate the generation of docker builder images.
 +
**** Test arm centos7 jenkins builder image. - Juraj.
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
 +
** VPP Device
 +
*** CSIT will install normally used os distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
 +
**** Figure out which of the two hosts should run the Jenkins job.
 +
**** Revert to old kernel version 4.15.0-55 to avoid AVF issue.
 +
**** AVF issue is common across the platform.
 +
***** Differences between avf driver versions may be the root cause of behavior changes.
 +
**** New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 45 minutes.
 +
***** Python runs slower on new thx2 servers than 1-node skylake.
 +
***** Try a newer version of Python (such as 3.8) or split the device tests into two parts.
 +
***** Check how many CPUs get utilized for robot framework execution on thx2 server.
 +
***** Two thunderx2 are running fine right now and the VPP device jobs are almost done.
 +
*** N1SDP shipment to FD.io
 +
**** Get response from Machiek about the rack space and traffic generator availability.
 +
*** CSIT budget plan for 10g switch purchase in FD.io lab. - Juraj, Tina
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
 +
*** Summarize the meeting minutes and action items. - Lijian
 +
*** SoC ID will be available in a /proc entry from kernel version 5.9 - Lijian, Honnappa
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
*** Start benchmarking AVF PMD driver in VPP on N1SDP.
 +
*** Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
 +
*** avf_input and avf_output nodes don't consume more CPU cycles than dpdk-related nodes do.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
*** Apply the SVE/SVE2 intrinsics on refactoring ethernet-input node. - Lijian
 +
**** SVE/SVE2 functionality to be tested on the new development platform.
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Find out the tuned configuration for cross connect test cases using AVF PMD driver.
 +
**** Figure out corresponding configurations in CSIT scripts
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
*** Working on the IPsec input node; VPP uses linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind.
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
** Plans
 +
 
 +
'''10/27/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
**** Repeat tests on local N1SDP and cascade server. - Jieqiang
 +
**** Look into the patch to get some ideas about the code changes. - Jieqiang, Lijian
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
**** The iterative jobs for VPP 2009 are still running.
 +
**** Still running for one or two weeks.
 +
***** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-report-iterative-2009-3n-tsh/
 +
***** Daily performance jobs only run MRR tests, while iterative jobs run MRR tests, NDR/PDR tests, etc., which takes longer.
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
**** Move the thx2 to the same rack for tg and install the same nic on tg.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
 +
** VPP Device
 +
*** CSIT will install normally used os distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
 +
**** Figure out which of the two hosts should run the Jenkins job.
 +
**** Revert to old kernel version 4.15.0-55 to avoid AVF issue.
 +
***** Differences between avf driver versions may be the root cause of behavior changes.
 +
**** New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 40 minutes.
 +
***** Python runs slower on new thx2 servers than 1-node skylake.
 +
***** Try a newer version of Python (such as 3.8) or split the device tests into two parts.
 +
***** Check how many CPUs get utilized for robot framework execution on thx2 server.
 +
 
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
 +
*** Summarize the meeting minutes and action items. - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
*** Start benchmarking AVF PMD driver in VPP on N1SDP.
 +
*** Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
 +
*** avf_input and avf_output nodes don't consume more CPU cycles than dpdk-related nodes do.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
*** Apply the SVE/SVE2 on ethernet-input node. - Lijian
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server. - Jieqiang
 +
** Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
**** Figure out corresponding configurations in CSIT scripts
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
*** Working on the IPsec input node; VPP uses linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
** Plans
 +
 
 +
'''10/20/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
**** The iterative jobs for VPP 2009 are still running.
 +
***** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-report-iterative-2009-3n-tsh/
 +
***** Daily performance jobs only run MRR tests, while iterative jobs run MRR tests, NDR/PDR tests, etc., which takes longer.
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
 +
**** Errors occur when running the latest VPP debug image; they were introduced by https://gerrit.fd.io/r/c/vpp/+/29490 - Lijian
 +
** VPP Device
 +
*** CSIT will install normally used os distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
 +
**** Figure out which of the two hosts should run the Jenkins job.
 +
**** Two failed test cases related to AVF plugin.
 +
***** The root cause is the newer kernel version - 4.15.0-118-generic fails, 4.15.0-72-generic works.
 +
***** Downgrade the kernel version to 4.15.0-72-generic and continue the VPP device testing.
 +
***** Try the same experiment on x86 to see whether this issue is Arm-specific. - Juraj
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
*** Start benchmarking AVF PMD driver in VPP on N1SDP.
 +
*** Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Finished the benchmarking and shared the data with the team. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
**** Figure out corresponding configurations in CSIT scripts
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
** Plans
 +
 
 +
'''10/13/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
 +
** VPP Device
 +
*** CSIT will install normally used os distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
 +
**** Figure out which of the two hosts should run the Jenkins job.
 +
**** Two failed test cases related to AVF plugin.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Finished the benchmarking and shared the data with the team.
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
**** Figure out corresponding configurations in CSIT scripts
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
** Plans
 +
 
 +
'''10/06/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
** VPP Device
 +
*** CSIT will install normally used os distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Finished the benchmarking and shared the data with the team.
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Figure out corresponding configurations in CSIT scripts
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
** Plans
 +
 
 +
'''09/29/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
** VPP Device
 +
*** CSIT will install normally used os distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Finished the benchmarking and shared the data with the team.
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Figure out corresponding configurations in CSIT scripts
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
** Plans
 +
 
 +
'''09/22/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
**
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
** VPP Path
 +
*** VexxHost will replace the faulty RAM with a new one, and get the expense reimbursed by LF.
 +
**** Issue is resolved by re-plugging the previous RAM, and the server is alive now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** Add CentOS-7 on Arm - Second step;
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
** VPP Device
 +
*** 3x SoftIron servers will be decommissioned directly to free rack space for 2x ThunderX2 servers.
 +
*** Two ThunderX2 servers have been received by VexxHost and are currently in the storage warehouse.
 +
*** VexxHost people will set up the servers and provide IP connectivity.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
** SVE/SVE2 proposal
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Figure out corresponding configurations in CSIT scripts
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
** Plans
 +
 
 +
'''09/15/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included next CSIT release report.
 +
*** On L2 learning 1Mx flows, 4T4C, with release-2005, there is about 20% performance drop.
 +
**** The patch that caused this issue has been identified.
 +
***** https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
** VPP Path
 +
*** LF will pay for the expense, and VexxHost has placed or will place the order for a new RAM module.
 +
*** Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to VexxHost.
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
 +
*** Juraj will follow up on the existing VexxHost ticket or create a new one to replace the faulty RAM.
 +
*** Check with Juraj for the latest news about the faulty RAM.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - first step;
 +
**** Adding CentOS-7 on Arm will be the second step.
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
*** We can decommission 3x SoftIron servers directly, but for the existing ThunderX2 servers, the decommissioning could be temporary. We will probably reinstall them in the near future.
 +
*** Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
 +
** Two ThunderX2 servers have been received by VexxHost and are currently in the storage warehouse.
 +
** VexxHost people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
** Budget plan for CSIT FD.io lab.
 +
*** We have enough servers for VPP path & device tests.
 +
*** We can ask the CSIT FD.io lab folks to reserve rack space for Arm servers.
 +
*** We may plan to send new advanced servers for perf tests in future but we won't mention the specific server type.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed; CSIT testing is being used to verify the patch.
 +
** Vendor CPU server enablement in VPP - Lijian
 +
*** Ready for internal review
 +
*** Will discuss with VPP maintainer
 +
** Investigate VPP Intel AVF driver - Lijian
 +
** SVE
 +
*** SVE intrinsics wrapper is done. Proposal patch is ready for review.
 +
*** ACLE, architecture, sve-sve2-programming-example
 +
*** SVE intrinsics are preferred.
 +
*** Share SVE knowledge with the DPDK team.
 +
** Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
 +
*** Will repeat scalability testing on N1SDP.
 +
** Benchmark AVF driver between Cascade Lake and N1SDP - Jieqiang
 +
*** Will investigate AVF drivers on Arm. - Lijian
 +
** Jieqiang will figure out performance data for 1x, 10Kx flows on Cascade Lake in CSIT.
 +
*** Confirm whether the system is the same for the local Dell server and the Cascade Lake server in CSIT. - Jieqiang
 +
*** Check if there are any test cases with 1t1c/2t2c/4t4c configured for 2n-clx testbed in CSIT - Jieqiang
 +
*** Performance data; Configurations;
 +
** VM2VM
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** Trying to enable IPsec on Arm platform. - Govind
 +
*** Started system tuning on PMD TX direction.
 +
*** Investigate mempool configuration.
 +
*** Change the descriptor size by modifying the DPDK source code.
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
'''09/08/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included next CSIT release report.
 +
*** On L2 learning 1Mx flows, 4T4C, with release-2005, there is about 20% performance drop.
 +
**** The patch that caused this issue has been identified.
 +
***** https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
** VPP Path
 +
*** LF will pay for the expense, and vexxhost has placed or will place the order for the new RAM module.
 +
*** Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to vexxhost.
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - first step;
 +
**** Adding CentOS-7 on Arm will be the second step.
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
*** We can decommission the 3x SoftIron servers directly, but for the existing ThunderX2 servers the decommissioning could be temporary. We will probably reinstall them in the near future.
 +
*** Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
 +
** Two ThunderX2 servers are received by Vexx host and currently in the storage warehouse.
 +
** Vexx host people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** SVE
 +
*** SVE intrinsics wrapper is done. Proposal patch is ready for review.
 +
*** ACLE, architecture, sve-sve2-programming-example
 +
*** SVE intrinsics is preferred.
 +
** Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
 +
*** Will repeat scalability testing on N1SDP.
 +
** Benchmark AVF driver btw Cascade Lake and N1SDP - Jieqiang
 +
*** Will investigate AVF drivers on Arm. - Lijian
 +
** Jieqiang will figure out performance data for 1x, 10Kx flows on Cascade Lake in CSIT.
 +
*** Performance data; Configurations;
 +
** VM2VM
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Started system tuning on PMD TX direction.
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
'''09/01/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** On L2 learning 1Mx flows, 4T4C, with release-2005, there is about 20% performance drop.
 +
**** The patch that caused this issue has been identified.
 +
***** https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
** VPP Path
 +
*** Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to vexhost.
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
**** It seems that plugging working RAM modules into empty slots will resolve the problem.
 +
**** Juraj will send email to Machiek about the ownership of any FD.io lab servers, and who should pay for the charge.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
**** IPMI IP is configured via SSH Linux prompt. It's working fine now.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
**** Pending with Vexx host to proceed further.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
*** Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
 +
** Two ThunderX2 servers are received by Vexx host and currently in the storage warehouse.
 +
** Vexx host people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
* VPP
 +
** Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 has compiling errors with latest VPP source code.
 +
**** This issue is fixed by Jieqiang and available for internal review.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** gcc-10 compiling issue is resolved and merged.
 +
** SVE
 +
*** ACLE, architecture, sve-sve2-programming-example
 +
*** SVE intrinsics is preferred.
 +
** Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
 +
** VM2VM
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Started system tuning on PMD TX direction.
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
'''08/25/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** On L2 learning 1Mx flows, 4T4C, with release-2005, there is about 20% performance drop.
 +
**** Jieqiang is trying to narrow down the patch that causes the issue.
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
**** It seems that plugging working RAM modules into empty slots will resolve the problem.
 +
**** Juraj will send email to Machiek about the ownership of any FD.io lab servers, and who should pay for the charge.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
**** IPMI IP is configured via SSH Linux prompt. It's working fine now.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
**** Pending with Vexx host to proceed further.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
*** Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
 +
** Two ThunderX2 servers are received by Vexx host and currently in the storage warehouse.
 +
** Vexx host people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
* VPP
 +
** Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 has compiling errors with latest VPP source code.
 +
**** This issue is fixed by Jieqiang and available for internal review.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** SVE
 +
*** ACLE, architecture, sve-sve2-programming-example
 +
*** SVE intrinsics is preferred.
 +
** Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
 +
** VM2VM
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
 
 +
'''08/18/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Jieqiang is investigating some performance drop (between 2005 and 2008 releases) cases on Taishan servers.
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** Jieqiang will share the investigation report, but so far there are no apparent performance differences.
 +
**** Some 4T4C test cases on Taishan have obvious performance drop. Will compare the trending with x86 machines.
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
**** Pending with Vexx host to proceed further.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
** Two ThunderX2 servers are received by Vexx host and currently in the storage warehouse.
 +
** Vexx host people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
*** Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 has compiling errors with latest VPP source code.
 +
**** This issue is fixed by Jieqiang and available for internal review.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
 
 +
'''08/11/2020'''
 +
* Attendees
 +
** Honnappa Nagarahalli
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
** Filip Varga
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Jieqiang is investigating some performance drop cases on Taishan servers.
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** Jieqiang will share the investigation report, but so far there are no apparent performance differences.
 +
**** Some 4T4C test cases on Taishan have obvious performance drop. Will compare the trending with x86 machines.
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
** VPP Device
 +
*** Two ThunderX2 information will be confirmed with FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
*** Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 has compiling errors with latest VPP source code.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
'''08/04/2020'''
 +
* Attendees
 +
** Honnappa Nagarahalli
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
** Filip Varga
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** Jieqiang will share the investigation report, but so far there are no apparent performance differences.
 +
**** Some 4T4C test cases on Taishan have obvious performance drop. Will compare the trending with x86 machines.
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
** VPP Device
 +
*** Two ThunderX2 information will be confirmed with FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
*** Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 has compiling errors with latest VPP source code.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
 
 +
'''07/28/2020'''
 +
* Attendees
 +
** Honnappa Nagarahalli
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** Jieqiang will share the investigation report, but so far there are no apparent performance differences.
 +
*** VPP performance testing is running once a week.
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
** VPP Device
 +
*** Two ThunderX2 information will be confirmed with FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
*** Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
'''07/21/2020'''
 +
* Attendees
 +
** Honnappa Nagarahalli
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** VPP performance testing is running once a week.
 +
** VPP Path
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster. 1 Debugging server for VPP Dev and 3 servers (2 TX and 1 TX2) are unreachable through IPMI and one of them is reachable through SSH. IPMI unreachability is still being investigated by Vexx host. CI functionality is restored with spare TX servers. TX2 server is unreachable through IPMI and VPP device jobs are not running. Faulty RAM on TX server is not fixed and is yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been verified by Jieqiang, will send to Dave Wallace to use it for VPP Jenkins job.
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
** VPP Device
 +
*** Two ThunderX2 information will be confirmed with FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Arm has
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install nomad service in those two servers - Juraj & Jieqiang
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
*** Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
** N1SDP enablement. - Lijian
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
 
 +
'''07/14/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community has started collecting performance data with these CSIT machines.
 +
** VPP Path
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** Will probably use 3x spare ThunderX1 servers as CI build server/nomad cluster.
 +
**** Two of the three ThunderX1 servers cannot be accessed.
 +
**** Spare ThunderX servers are used for CI and included in Nomad cluster. 1 Debugging server for VPP Dev and 3 servers (2 TX and 1 TX2) are unreachable through IPMI and one of them is reachable through SSH. IPMI unreachability is still being investigated by Vexx host. CI functionality is restored with spare TX servers. TX2 server is unreachable through IPMI and VPP device jobs are not running. Faulty RAM on TX server is not fixed and is yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been verified by Jieqiang, will send to Dave Wallace to use it for VPP Jenkins job.
 +
**** Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
*** After fixing a software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Two ThunderX2 information will be confirmed with FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install nomad service in those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark btw gcc-9 and clang-10 to decide which should be the default compiler, will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
*** Investigating various numbers of rx_q_bufs & tx_q_bufs
 +
*** Investigating various vector sizes and checking their effect on throughput
 +
*** Benchmark and compare PMU counters btw 4x and 2x loop unrolling on n1sdp
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Investigating using SPE counters to profile ACL plugin bottle-neck
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''07/07/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community has started collecting performance data with these CSIT machines.
 +
** VPP Path
 +
*** Juraj will follow up on or create a new vexxhost ticket to replace the faulty RAM.
 +
**** Will probably use 3 spare ThunderX1 servers as CI build server/nomad cluster.
 +
**** Two of the three ThunderX1 servers cannot be accessed.
 +
**** Spare ThunderX servers are used for CI and included in the Nomad cluster, so CI functionality is restored. One debugging server for VPP dev and three servers (2 TX and 1 TX2) are unreachable through IPMI; one of them is reachable through SSH. Vexxhost is still investigating the IPMI unreachability. The TX2 server is unreachable through IPMI and VPP device jobs are not running. The faulty RAM on the TX server is not fixed and is yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been verified by Jieqiang and will be sent to Dave Wallace for use in the VPP Jenkins job.
 +
**** Jieqiang will send an email to Dave Wallace about the CentOS-7 on Arm Jenkins job. - Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler, and sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
*** Investigating different numbers of rx_q_bufs and tx_q_bufs
 +
*** Investigating different vector sizes and their effect on throughput
 +
*** Benchmark and compare PMU counters between 4x and 2x loop unrolling on n1sdp
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Investigating using SPE counters to profile the ACL plugin bottleneck
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
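
The Nomad sizing noted in the FD.io lab item above (16 cores per job, 6 jobs in total) works out to 96 cores. A minimal sketch of that arithmetic follows; the function names are hypothetical and this is not Nomad's API or its scheduler logic, only a way to sanity-check a core budget.

```python
def cores_needed(jobs: int, cores_per_job: int) -> int:
    """Total cores required to run `jobs` jobs concurrently."""
    if jobs < 0 or cores_per_job <= 0:
        raise ValueError("jobs must be >= 0 and cores_per_job must be > 0")
    return jobs * cores_per_job


def max_parallel_jobs(total_cores: int, cores_per_job: int) -> int:
    """How many jobs fit on a host when each job reserves a fixed core count."""
    if total_cores < 0 or cores_per_job <= 0:
        raise ValueError("total_cores must be >= 0 and cores_per_job must be > 0")
    # Integer division: leftover cores cannot host a partial job.
    return total_cores // cores_per_job
```

For example, `cores_needed(6, 16)` gives the core budget for the figures in the minutes, and `max_parallel_jobs` answers the reverse question for a host of a given size.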
 +
 
 +
 
 +
'''06/30/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community has started collecting performance data with these CSIT machines.
 +
** VPP Path
 +
*** Juraj will follow up on or create a new vexxhost ticket to replace the faulty RAM.
 +
**** Will probably use 3 spare ThunderX1 servers as CI build server/nomad cluster.
 +
**** Two of the three ThunderX1 servers cannot be accessed.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been verified by Jieqiang and will be sent to Dave Wallace for use in the VPP Jenkins job.
 +
**** Jieqiang will send an email to Dave Wallace about the CentOS-7 on Arm Jenkins job. - Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler, and sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
*** Investigating different numbers of rx_q_bufs and tx_q_bufs
 +
*** Investigating different vector sizes and their effect on throughput
 +
*** Benchmark and compare PMU counters between 4x and 2x loop unrolling on n1sdp
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Investigating using SPE counters to profile the ACL plugin bottleneck
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
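
The 16K/64K page size item above depends on the page size the running kernel was built with. A minimal sketch, assuming a Linux host, of how a test script could read that value and round a buffer size up to a whole number of pages. The helper names are hypothetical and are not taken from VPP or its 'make test' framework.

```python
import os


def page_size() -> int:
    """Return the page size reported by the running kernel, in bytes."""
    return os.sysconf("SC_PAGE_SIZE")


def round_up_to_page(nbytes: int, page: int) -> int:
    """Round nbytes up to the next multiple of page (page must be a power of two)."""
    if page <= 0 or page & (page - 1):
        raise ValueError("page must be a positive power of two")
    # Adding page - 1 and masking off the low bits rounds up to a page boundary.
    return (nbytes + page - 1) & ~(page - 1)
```

Code that hard-codes 4096 instead of querying the page size is one common reason a program behaves differently on 16K or 64K kernels; whether that is the cause of the test failures mentioned in the minutes is not stated there.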
 +
 
 +
'''06/23/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community has started collecting performance data with these CSIT machines.
 +
** VPP Path
 +
*** Juraj will follow up on or create a new vexxhost ticket to replace the faulty RAM.
 +
**** Two of the three ThunderX1 servers cannot be accessed.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been verified by Jieqiang and will be sent to Dave Wallace for use in the VPP Jenkins job.
 +
**** Jieqiang will send an email to Dave Wallace about the CentOS-7 on Arm Jenkins job. - Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** L3FWD status
 +
** CSIT status
 +
** EPIC plan
 +
*** SVE2 investigation in VPP
 +
*** VPP hoststack TCP/CPS (Connections per Second) investigation
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler, and sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** Profiling with NMU-600 counters.
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput drops as the number of flows increases on N1SDP.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
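
The minutes above mention a vectorized endianness conversion error in the Mellanox RDMA driver but do not describe the bug itself. The sketch below only illustrates the invariant any batched conversion has to satisfy: swapping a whole batch must give the same result as swapping each 32-bit element on its own. The function names are hypothetical, and this is plain Python, not the driver's SIMD code.

```python
import struct


def bswap32(value: int) -> int:
    """Byte-swap a single 32-bit unsigned integer."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")


def bswap32_batch(values):
    """Byte-swap a batch of 32-bit unsigned integers in one pack/unpack round trip."""
    n = len(values)
    # Serialize as little-endian words, then reinterpret the same bytes as big-endian.
    packed = struct.pack(f"<{n}I", *values)
    return list(struct.unpack(f">{n}I", packed))


def batch_matches_scalar(values) -> bool:
    """Check the batched path against the per-element reference."""
    return bswap32_batch(values) == [bswap32(v) for v in values]
```

A check such as `batch_matches_scalar` is the kind of unit test that catches a batched path swapping across element boundaries instead of within each element.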
 +
 
 +
'''06/16/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community has started collecting performance data with these CSIT machines.
 +
** VPP Path
 +
*** Juraj will follow up on or create a new vexxhost ticket to replace the faulty RAM.
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** Patch is merged.
 +
**** libssl.so is missing in dependencies in vpp Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the Dockerfile upload. The Dockerfile needs to be tested in a local VPP sandbox before uploading, and it needs to be labelled by Dave Wallace for use in the VPP Jenkins job.
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler, and sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** Profiling with NMU-600 counters.
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput drops as the number of flows increases on N1SDP.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''06/09/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community will collect performance data with these CSIT machines.
 +
*** IPSec tunnel configuration issue.
 +
**** Issue is resolved.
 +
**** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
***** Juraj to run the IPSec regression on Taishan server with the IPSec patch.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** Patch is merged.
 +
**** libssl.so is missing in dependencies in vpp Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the Dockerfile upload. The Dockerfile needs to be tested in a local VPP sandbox before uploading, and it needs to be labelled by Dave Wallace for use in the VPP Jenkins job.
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler, and sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** Profiling with NMU-600 counters.
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput drops as the number of flows increases on N1SDP.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''06/02/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** IPSec tunnel configuration issue.
 +
**** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
***** Juraj to run the IPSec regression on Taishan server with the IPSec patch.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** libssl.so is missing in dependencies in vpp Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the Dockerfile upload. The Dockerfile needs to be tested in a local VPP sandbox before uploading, and it needs to be labelled by Dave Wallace for use in the VPP Jenkins job.
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave wallace - Install nomad service in those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
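The Nomad sizing noted under FD.io lab in this meeting (16 cores per job, 6 jobs in total) can be sanity-checked with a short calculation. This is only an illustrative sketch: the function name is hypothetical, and the 96-core host size is an assumption taken from the core count listed for the existing Nomad build server in the hardware table on this page. The minutes do not state the ThunderX2 core count, nor whether the six jobs are per server or spread across both servers.

```python
def nomad_job_slots(total_cores: int, cores_per_job: int) -> int:
    """Return how many fixed-size jobs fit on one host (hypothetical helper)."""
    if cores_per_job <= 0:
        raise ValueError("cores_per_job must be positive")
    return total_cores // cores_per_job


CORES_PER_JOB = 16  # from the minutes
JOBS_TOTAL = 6      # from the minutes
ASSUMED_HOST_CORES = 96  # assumption: core count of the existing Nomad build server

# Cores needed to run all jobs at the same time.
cores_needed = CORES_PER_JOB * JOBS_TOTAL

print(f"cores needed for {JOBS_TOTAL} jobs: {cores_needed}")
print(f"job slots on an assumed {ASSUMED_HOST_CORES}-core host: "
      f"{nomad_job_slots(ASSUMED_HOST_CORES, CORES_PER_JOB)}")
```

Under that assumption, six 16-core jobs would occupy every core of a single host, so any cores reserved for the operating system would reduce the number of slots.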
 +
 
 +
'''05/26/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** IPSec tunnel configuration issue.
 +
**** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
***** Juraj to run the IPSec regression on Taishan server with the IPSec patch.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** libssl.so is missing from the dependencies in the VPP Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** Using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload. The docker file needs to be tested in a local VPP sandbox before uploading, and it needs to be labelled by Dave Wallace before it can be used for the VPP Jenkins job.
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** After fixing a software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers. Jieqiang will set up a meeting with Juraj regarding this documentation.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
** N1SDP enablement. - Lijian
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
 
 +
'''05/19/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** The other failure is related to the VPP image on Arm: an IPSec tunnel configuration issue.
 +
**** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** libssl.so is missing from the dependencies in the VPP Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** Using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** After fixing a software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
** Ed Kern - Install Nomad service on those two servers - Juraj & Jieqiang
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
** N1SDP enablement. - Lijian
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''04/28/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Two failures in performance testing
 +
**** One failure is related to the CSIT script: NAT44 is a common issue that also fails on x86.
 +
***** Has been fixed already.
 +
**** The other failure is related to the VPP image on Arm: an IPSec tunnel configuration issue.
 +
***** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** After fixing a software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
** Ed Kern - Install Nomad service on those two servers - Juraj & Jieqiang
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** Resolve VPP compiling issue with clang-6.
 +
*** Patch (https://gerrit.fd.io/r/c/vpp/+/26949) is merged.
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
** N1SDP enablement. - Lijian
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is merged.
 +
**** https://gerrit.fd.io/r/c/vpp/+/26804
 +
*** IOMMU limitation issue is gone after upgrading the kernel and firmware.
 +
**** Share the kernel/firmware upgrade versions with Govind
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''04/28/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
** Arthur Marshall
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Two failures in performance testing
 +
**** One failure is related to the CSIT script: NAT44 is a common issue that also fails on x86.
 +
**** The other failure is related to the VPP image on Arm: an IPSec tunnel configuration issue.
 +
*** iommu_passthrough=1 does not make any difference on the Taishan server - Lijian
 +
*** We cannot do kernel upgrade with Ubuntu-18.04.1/Ubuntu-18.04.2/Ubuntu-18.04.3/Ubuntu-18.04.4 on Taishan.
 +
**** For now, the kernel of the Taishan server can be left as it is, linux-4.15.0.54. - Juraj
 +
**** One possible option/improvement is to port FD.io CSIT performance testing to some more advanced Arm servers, e.g., Ampere
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will send email to community about two options to resolve gcc-7 issue with CentOS-7
 +
***** 1. update gcc-7 requirement to gcc-8 in Makefile
 +
***** 2. remove the gcc-7 limitation in the Makefile, and have the user install gcc-8 manually
 +
*** After fixing a software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** https://gerrit.oss.arm.com/#/c/160812/
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** To confirm with Damjan whether he plans to rewrite l2-nodes - Lijian
 +
*** To confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** iova_mode == VA not working issue is not root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
 +
**** This issue will not be seen in the latest N1 Firmware. Upgrading to latest Firmware is pending.
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
 +
*** The degradation is seen even when L3 cache is enabled.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
**** Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss running the Jenkins job on CentOS with Juraj.
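The iova_mode == VA item in this meeting can be illustrated with a small address-width check. This is a sketch under stated assumptions, not a diagnosis: the minutes say the issue is not root-caused and mark the 40-bit limit with a question mark. The helper name and both example addresses are hypothetical, chosen only to show why a user-space virtual address used directly as an IOVA might fall outside a 40-bit window while a low physical address would not.

```python
IOVA_BITS = 40  # assumed limit, taken from the "less than 40 bits?" note in the minutes


def fits_in_iova(addr: int, bits: int = IOVA_BITS) -> bool:
    """Return True if addr can be represented in an IOVA window of the given width."""
    return 0 <= addr < (1 << bits)


# Hypothetical example addresses, not measured on N1SDP.
example_va = 0x0000_7FFF_F000_0000  # a high user-space virtual address
example_pa = 0x0000_0000_8000_0000  # a physical address in low DRAM

for name, addr in (("VA as IOVA", example_va), ("PA as IOVA", example_pa)):
    verdict = "fits" if fits_in_iova(addr) else "does not fit"
    print(f"{name}: {addr:#018x} {verdict} in {IOVA_BITS} bits")
```

With these example values the virtual address needs 47 bits and is rejected, while the physical address passes.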
 +
 
 +
'''04/21/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** iommu_passthrough=1 does not make any difference on the Taishan server - Lijian
 +
*** We cannot do kernel upgrade with Ubuntu-18.04.1/Ubuntu-18.04.2/Ubuntu-18.04.3/Ubuntu-18.04.4 on Taishan.
 +
**** For now, can the kernel of the Taishan server be left as it is? Please confirm with Peter. - Juraj
 +
**** One possible option/improvement is to port FD.io CSIT performance testing to some more advanced Arm servers, e.g., Ampere
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** CentOS-8 is working fine. Will try CentOS-7 later.
 +
**** Is there any gcc version requirement in VPP official release?
 +
**** AES instructions in the VPP source code require a gcc version newer than gcc-8.
 +
**** 'make install-deps' failure with CentOS-7 on Arm.
 +
*** After fixing a software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** To confirm with Damjan whether he plans to rewrite l2-nodes - Lijian
 +
*** To confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
** gcc-10 is not working so far.
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** iova_mode == VA not working issue is not root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
 +
**** This issue will not be seen in the latest N1 Firmware. Upgrading to latest Firmware is pending.
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
 +
*** The degradation is seen even when L3 cache is enabled.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
**** Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss running the Jenkins job on CentOS with Juraj.
 +
 
 +
'''04/14/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** After upgrading to kernel version 72, the server cannot boot up normally, so we have reverted to the previous version.
 +
**** Ubuntu-18.04 LTS version is supposed to be kernel 4.15.72?
 +
**** Will try fresh install with local Taishan servers.
 +
***** Will try with Ubuntu-18.04.1/Ubuntu-18.04.2/Ubuntu-18.04.3/Ubuntu-18.04.4
 +
***** Will do fresh installation with Ubuntu-18.04.2 and then install kernel 4.15.72
 +
** VPP Path
 +
*** Try iommu_passthrough=1 on Taishan servers and see if it makes any difference - Lijian
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** CentOS-8 is working fine. Will try CentOS-7 later.
 +
*** After fixing a software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** To confirm with Damjan whether he plans to rewrite l2-nodes - Lijian
 +
*** To confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** iova_mode == VA not working issue is not root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
 +
**** This issue will not be seen in the latest N1 Firmware. Upgrading to latest Firmware is pending.
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
 +
*** The degradation is seen even when L3 cache is enabled.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
**** Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian will discuss running the Jenkins job on CentOS with Juraj.
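The iova_mode == VA item above can be illustrated with a small range check. This is only a sketch under stated assumptions: the 40-bit limit is the unconfirmed figure from the notes, the helper name `fits_iommu_range` is hypothetical, and the two addresses are made-up examples rather than values captured on an N1SDP.

```python
# Hypothetical helper: does a virtual address, used directly as an IOVA,
# fit inside an IOMMU input range of `iommu_bits` bits?
# The default of 40 bits is the unconfirmed guess recorded in the notes.
def fits_iommu_range(virtual_address: int, iommu_bits: int = 40) -> bool:
    if virtual_address < 0:
        raise ValueError("addresses must be non-negative")
    return virtual_address < (1 << iommu_bits)


# Illustrative addresses only (assumptions, not measured values).
low_va = 0x0000_0001_0000_0000   # 4 GiB
high_va = 0x0000_FFFF_8000_0000  # near the top of a 48-bit user address space

for va in (low_va, high_va):
    print(hex(va), "fits" if fits_iommu_range(va) else "does not fit")
```

If the limit is real, a buffer mapped high in a 48-bit address space would fall outside the range when its VA is reused as the IOVA, which would be consistent with the symptom described, although the notes do not confirm this as the root cause.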
 +
 
 +
'''04/07/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** The system cannot boot normally after upgrading the kernel to version 72, so the kernel has been reverted to the previous version.
 +
**** Is the Ubuntu-18.04 LTS version supposed to be kernel 4.15.72?
 +
**** Will try Cobbler with local Taishan servers to attempt a fresh install.
 +
***** Jieqiang will try a fresh installation of kernel 4.15.72 on a local Taishan server through Cobbler.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
**** Jieqiang updated the Dockerfile locally to add CentOS to the CI and is facing some issues.
 +
*** After fixing a software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
**** Need 2 ThunderX2 servers to run the jobs for every VPP/CSIT patch submission instead of every half hour with a new VPP build. The current ThunderX2 server doesn't respond when the jobs are requested to run for every patch submission. No voting rights (+1 from CI) for VPP device suite.
 +
* FD.io lab
 +
** Two more ThunderX2 servers have just been ordered and are expected to arrive in the Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers on the market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
***** These patches are kept in backlog for now.
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** Dual loop-unrolling seems to give better performance than quad loop-unrolling.
 +
** The root cause of iova_mode == VA not working has not been found
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP may require a limited address space (less than 40 bits?).
 +
**** This issue will not be seen with the latest N1 firmware. The upgrade to the latest firmware is pending.
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop seen as the number of flows increases is fixed. - Govind
 +
*** The degradation is seen even when L3 cache is enabled.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
**** Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian will discuss running the Jenkins job on CentOS with Juraj.
 +
 
 +
'''03/31/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** The system cannot boot normally after upgrading the kernel to version 72, so the kernel has been reverted to the previous version.
 +
**** Is the Ubuntu-18.04 LTS version supposed to be kernel 4.15.72?
 +
**** Will try Cobbler with local Taishan servers to attempt a fresh install.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** https://docs.fd.io/csit/master/trending/introduction/failures.html#n-tsh
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-tsh/161/archives/log.html.gz
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** After fixing a software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two more ThunderX2 servers have just been ordered and are expected to arrive in the Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers on the market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** Dual loop-unrolling seems to give better performance than quad loop-unrolling.
 +
** The root cause of iova_mode == VA not working has not been found
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP may require a limited address space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop seen as the number of flows increases is fixed. - Govind
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''03/24/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** The system cannot boot normally after upgrading the kernel to version 72, so the kernel has been reverted to the previous version.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** After fixing a software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
***** make build/build-release TARGET_PLATFORM=n1sdp  // for n1sdp cross compiling
 +
***** make build/build-release  // for generic vpp image
 +
***** make build/build-release TARGET_PLATFORM=native  // for native vpp image
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two more ThunderX2 servers have just been ordered and are expected to arrive in the Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers on the market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** Dual loop-unrolling seems to give better performance than quad loop-unrolling.
 +
** The root cause of iova_mode == VA not working has not been found
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP may require a limited address space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop seen as the number of flows increases is fixed. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''03/17/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** The system cannot boot normally after upgrading the kernel to version 72, so the kernel has been reverted to the previous version.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** After fixing a software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two more ThunderX2 servers have just been ordered and are expected to arrive in the Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers on the market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - Usage of MAP is recorded in confluence
 +
*** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Patch is upstreamed for community review
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Confirm if community agrees with patch - Lijian
 +
*** Check how DPDK is detecting numa-id for a specific NIC device - Lijian
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
**** Sending Govind the steps on installing GCC-9.2.0
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** Dual loop-unrolling seems to give better performance than quad loop-unrolling.
 +
**** Will sync up with James Yang about cache line fill buffers
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% macro-benchmarking by adding prefetches to adj table on ThunderX2
 +
** The root cause of iova_mode == VA not working has not been found
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP may require a limited address space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
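The VPP+DPDK and VPP+AVF throughput figures recorded for ThunderX2 above can be turned into a relative speedup with simple arithmetic. The sketch below only reuses the four numbers from the notes; the notes do not say what the two values in each pair represent, so pairing them by position is an assumption, and the function name `speedup` is illustrative.

```python
# Throughput pairs in Mpps, copied from the notes for ThunderX2.
dpdk_mpps = (5.18, 5.21)  # VPP+DPDK
avf_mpps = (8.39, 8.38)   # VPP+AVF


def speedup(baseline: float, candidate: float) -> float:
    """Return how many times faster `candidate` is than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return candidate / baseline


# Assumption: the first DPDK value corresponds to the first AVF value, and so on.
for base, cand in zip(dpdk_mpps, avf_mpps):
    ratio = speedup(base, cand)
    print(f"{cand:.2f} Mpps / {base:.2f} Mpps = {ratio:.2f}x")
```

Under that pairing, the AVF driver comes out at roughly 1.6 times the DPDK throughput in both cases.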
 +
 
 +
'''03/10/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** The system cannot boot normally after upgrading the kernel to version 72, so the kernel has been reverted to the previous version.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** After fixing a software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two more ThunderX2 servers have just been ordered and are expected to arrive in the Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers on the market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - Usage of MAP is recorded in confluence
 +
*** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Check if detecting the source of SIGPROF is possible - Govind
 +
*** Confirm with Community about the possible solutions to this issue - Lijian
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Confirm if community agrees with patch - Lijian
 +
*** Check how DPDK is detecting numa-id for a specific NIC device - Lijian
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
*** Patch is upstreamed for code review - https://gerrit.fd.io/r/c/vpp/+/25259
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
**** Sending Govind the steps on installing GCC-9.2.0
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** Dual loop-unrolling seems to give better performance than quad loop-unrolling.
 +
**** Will sync up with James Yang about cache line fill buffers
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% macro-benchmarking by adding prefetches to adj table on ThunderX2
 +
** The root cause of iova_mode == VA not working has not been found
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP may require a limited address space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
 
 +
'''03/03/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** After fixing a software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** The current ThunderX2 servers in the Arm lab are pre-production servers.
 +
*** We are about to purchase two official ThunderX2 servers on the market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - Usage of MAP is recorded in confluence
 +
*** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
*** Patch is upstreamed for code review - https://gerrit.fd.io/r/c/vpp/+/25259
 +
** Investigating memory copy in ip4-rewrite on ThunderX2 - Govind
 +
*** Also check the assembly code on other Arm CPUs.
 +
*** Send Govind the fixed-length memory copy.
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
**** Sending Govind the steps on installing GCC-9.2.0
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** Dual loop-unrolling seems to give better performance than quad loop-unrolling.
 +
**** Will sync up with James Yang about cache line fill buffers
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% macro-benchmarking by adding prefetches to adj table on ThunderX2
 +
** The root cause of iova_mode == VA not working has not been found
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP may require a limited address space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
 
 +
'''02/25/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Govind will talk with George Zhao about which Taishan firmware version addresses the Meltdown issue.
 +
*** Huawei is investigating which firmware version of the Taishan server addresses the Meltdown issue and will update us soon.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** The current ThunderX2 servers in the Arm lab are pre-production servers.
 +
*** We are about to purchase two official ThunderX2 servers on the market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on one of them, ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - Usage of MAP is recorded in confluence
 +
*** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate bihash operations in L2 throughput are hot-spots
 +
*** Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
*** Patch is upstreamed for code review - https://gerrit.fd.io/r/c/vpp/+/25259
 +
** Investigating memory copy in ip4-rewrite on ThunderX2 - Govind
 +
*** Check the assembly code with other Arm CPU also.
 +
*** Send Govind the memory copy with fixed length.
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
**** Sending Govind the steps on installing GCC-9.2.0
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** Dual loop unrolling seems to give better performance than quad loop unrolling.
 +
**** Will sync up with James Yang about cache line fill buffers
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% improvement in macro-benchmarking from adding prefetches to the adj table on ThunderX2
 +
 
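The AVF versus DPDK figures recorded above can be turned into a relative gain with a few lines of arithmetic. This is a minimal sketch: the throughput values are the ones in the minutes, but pairing the first DPDK figure with the first AVF figure (and the second with the second) is an assumption, since the notes do not say what distinguishes the two numbers in each pair.

```python
# Throughput figures (Mpps) as recorded in the minutes for ThunderX2.
dpdk_mpps = (5.18, 5.21)
avf_mpps = (8.39, 8.38)


def relative_gain(baseline: float, candidate: float) -> float:
    """Fractional improvement of candidate over baseline."""
    return (candidate - baseline) / baseline


# Assumed pairing: first with first, second with second.
gains = [relative_gain(d, a) for d, a in zip(dpdk_mpps, avf_mpps)]
for d, a, g in zip(dpdk_mpps, avf_mpps, gains):
    print(f"DPDK {d:.2f} Mpps -> AVF {a:.2f} Mpps: {g:+.1%}")
```

Under that pairing, VPP+AVF comes out roughly 60% faster than VPP+DPDK in both measurements.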
 +
 
 +
'''02/18/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VM-VHost test failing on 3n-tsh server.
 +
**** Issue with testpmd failure in VM has been resolved and merged.
 +
*** Govind will talk with George about which Taishan firmware version addresses the Meltdown issue.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
*** Will discuss the cross-compilation with QEMU emulation solution in the monthly VPP call tomorrow - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and with NXP.
 +
*** VPP crash issue on Taishan server is resolved and patch is resolved.
 +
**** ThunderX2 has the same issue and has been resolved also.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on one of them, ThunderX2-01.
 +
*** Customer engineers claim that ThunderX2 does not support the Intel i40e NIC, which seems incorrect.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
**** Patch is updated by adding more comments. - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate bihash operations in L2 throughput are hot-spots
 +
*** Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
*** Patch is ready for code review.
 +
** Investigating memory in ip4-rewrite on ThunderX2 - Govind
 +
*** Check the assembly with other Arm CPU also.
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** Dual loop unrolling seems to give better performance than quad loop unrolling.
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% improvement in macro-benchmarking from adding prefetches to the adj table on ThunderX2
 +
 
 +
 
 +
'''02/11/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VM-VHost test failing on 3n-tsh server.
 +
*** Tina to confirm which BIOS version on the Taishan server addresses the Meltdown issue.
 +
**** NICs cannot be bound to VFIO_PCI driver in VM which caused the failure.
 +
**** Will try iommu-passthrough=0/1 - Juraj
 +
*** Will confirm with Joyce about this issue - Lijian
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
*** Will discuss the cross-compilation with QEMU emulation solution in the monthly VPP call tomorrow - Juraj
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on one of them, ThunderX2-01.
 +
*** Customer engineers claim that ThunderX2 does not support the Intel i40e NIC, which seems incorrect.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate bihash operations in L2 throughput are hot-spots
 +
*** Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
** Investigating memory in ip4-rewrite on ThunderX2 - Govind
 +
*** Check the assembly with other Arm CPU also.
 +
 
 +
 
 +
'''02/04/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VM-VHost test failing on 3n-tsh server.
 +
*** Govind to send background details about Taishan kernel upgrade to Tina to confirm with George Zhao.
 +
**** The VM-VHost test cases have never passed before as per the previous logs in Taishan server.
 +
**** Issue is not reproducible locally - VHost/Virtual Ethernet interface creation passes in Taishan server in local setup.
 +
**** Next Steps: Follow up with Peter Mikus to debug the issue in Taishan server in CSIT lab.
 +
***** Build a local test setup to run the testpmd application in a VM.
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on the Taishan server only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from running weekly to running daily. - Juraj
 +
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
***** Will send email to CSIT-dev on how to avoid the similar case/issue.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
**** After fixing a software bug, VPP boots up normally with 16K/64K page sizes.
 +
**** Will investigate 4-5 test failures in 'make test' - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues with building VPP distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; this seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one of them, ThunderX2-01.
 +
*** Cables for intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with Mellanox NIC on ThunderX2-02 firstly - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently there is only one ThunderX2 in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08 APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7) Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
 +
*** Will try one patch to enable N1SDP board.
 +
*** Please try AVF with Mcbin if possible.
 +
** Investigate bihash operations in L2 throughput are hot-spots
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Lock-free allocation/free give 7%-11% improvement on ThunderX2, but no improvement on x86 and CortexA72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
 +
** EPIC for next quarter:
 +
 
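The 20.01 release milestones listed above follow a fixed seven-day cadence (RC1 = F0+7, RC2 = RC1+7, release = RC2+7). As a quick sanity check, the dates can be recomputed from the F0 date. This is only an illustrative sketch; the variable names are hypothetical and the dates are the ones recorded in the minutes.

```python
from datetime import date, timedelta

# F0 (API freeze) date as recorded in the minutes.
f0 = date(2020, 1, 8)

# Each later milestone is seven days after the previous one.
milestones = {"F0": f0}
previous = f0
for name in ("RC1", "RC2", "Release"):
    previous = previous + timedelta(days=7)
    milestones[name] = previous

for name, day in milestones.items():
    print(name, day.isoformat())
```

The recomputed dates match the schedule above: RC1 on 2020-01-15, RC2 on 2020-01-22 and the formal release on 2020-01-29.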
 +
'''01/28/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VM-VHost test failing on 3n-tsh server.
 +
**** The VM-VHost test cases have never passed before as per the previous logs in Taishan server.
 +
**** Issue is not reproducible locally - VHost/Virtual Ethernet interface creation passes in Taishan server in local setup.
 +
**** Next Steps: Follow up with Peter Mikus to debug the issue in Taishan server in CSIT lab.
 +
***** Build a local test setup to run the testpmd application in a VM.
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on the Taishan server only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from running weekly to running daily. - Juraj
 +
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
***** Will send email to CSIT-dev on how to avoid the similar case/issue.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
**** After fixing a software bug, VPP boots up normally with 16K/64K page sizes.
 +
**** Will investigate 4-5 test failures in 'make test' - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues with building VPP distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; this seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one of them, ThunderX2-01.
 +
*** Cables for intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with Mellanox NIC on ThunderX2-02 firstly - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently there is only one ThunderX2 in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08 APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7) Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
 +
*** Will try one patch to enable N1SDP board.
 +
*** Please try AVF with Mcbin if possible.
 +
** Investigate bihash operations in L2 throughput are hot-spots
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Lock-free allocation/free give 7%-11% improvement on ThunderX2, but no improvement on x86 and CortexA72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
 +
** EPIC for next quarter:
 +
 
 +
'''01/21/2020'''
 +
* Attendees
 +
** Tina Tsou
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VM-VHost test failing on 3n-tsh server.
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on the Taishan server only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from running weekly to running daily. - Juraj
 +
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
***** Will send email to CSIT-dev on how to avoid the similar case/issue.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
**** After fixing a software bug, VPP boots up normally with 16K/64K page sizes.
 +
**** Will investigate 4-5 test failures in 'make test' - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues with building VPP distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; this seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one of them, ThunderX2-01.
 +
*** Cables for Intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with the Mellanox NIC on ThunderX2-02 first. - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab. - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
 +
*** Will try one patch to enable N1SDP board.
 +
*** Please try AVF with Mcbin if possible.
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Applied prefetches and dual loop to the l2_fwd node; the same attempt failed for l2_learn
 +
*** Lock-free allocation/free gives 7%-11% improvement on ThunderX2, but no improvement on x86 and Cortex-A72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy for all bihash update operations, then apply RCU to make lookups lock-free
 +
** Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
 +
** EPIC for next quarter:
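Note on the release plan dates listed under VPP above: they follow a fixed seven-day cadence (RC1 = F0+7, RC2 = RC1+7, formal release = RC2+7). A minimal Python sketch that recomputes them from the F0 date as a sanity check. The function name `release_milestones` is hypothetical and not part of any FD.io tooling.

```python
from datetime import date, timedelta


def release_milestones(f0, step_days=7):
    """Return the milestone dates implied by the F0 date and a fixed step."""
    step = timedelta(days=step_days)
    rc1 = f0 + step          # RC1 = F0 + 7
    rc2 = rc1 + step         # RC2 = RC1 + 7
    release = rc2 + step     # Formal release = RC2 + 7
    return {"F0": f0, "RC1": rc1, "RC2": rc2, "Release": release}


# F0 date taken from the minutes (2020-01-08).
milestones = release_milestones(date(2020, 1, 8))
for name, day in milestones.items():
    print(f"{name}: {day.isoformat()}")
```

The computed dates match the ones recorded in the minutes (2020-01-15, 2020-01-22, 2020-01-29).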
 +
 
 +
 
 +
'''01/14/2020'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on Taishan servers only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
***** Will send email to CSIT-dev on how to avoid the similar case/issue.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
**** After fixing a software bug, VPP boots normally with 16K/64K page sizes.
 +
**** Will investigate 4-5 test failures in 'make test' - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues with building VPP for distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one of them, ThunderX2-01.
 +
*** Cables for Intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with the Mellanox NIC on ThunderX2-02 first. - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab. - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
 +
*** Will try one patch to enable N1SDP board.
 +
*** Please try AVF with Mcbin if possible.
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Applied prefetches and dual loop to the l2_fwd node; the same attempt failed for l2_learn
 +
*** Lock-free allocation/free gives 7%-11% improvement on ThunderX2, but no improvement on x86 and Cortex-A72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy for all bihash update operations, then apply RCU to make lookups lock-free
 +
** Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
 +
** EPIC for next quarter:
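Note on the AVF benchmarking figures above (VPP+DPDK 5.18/5.21 Mpps versus VPP+AVF 8.39/8.38 Mpps on ThunderX2): the relative gain can be worked out with a few lines of Python. This is only a sketch. It assumes the two numbers in each pair come from matching runs, which the minutes do not state, and the helper name `relative_gain` is made up for illustration.

```python
# Throughput figures from the minutes, in Mpps.
dpdk_mpps = (5.18, 5.21)
avf_mpps = (8.39, 8.38)


def relative_gain(baseline, candidate):
    """Percentage improvement of candidate over baseline."""
    return (candidate - baseline) / baseline * 100.0


# Pair the runs positionally (assumption: first with first, second with second).
gains = [relative_gain(d, a) for d, a in zip(dpdk_mpps, avf_mpps)]
for d, a, g in zip(dpdk_mpps, avf_mpps, gains):
    print(f"DPDK {d} Mpps -> AVF {a} Mpps: {g:.1f}% higher")
```

Both pairs come out at a little over 60% higher throughput for AVF.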
 +
 
 +
'''01/07/2020'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on Taishan servers only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
***** Will send email to CSIT-dev on how to avoid the similar case/issue.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
*** Trying to update kernel to 64K page size on CentOS - No update - Lijian
 +
**** VPP can boot up normally with 16K/64K page size. Will investigate 4-5 test failures in 'make test' - Lijian
 +
**** Will try with CentOS 8 which seems to be working fine with 64K page size.
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues with building VPP for distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one of them, ThunderX2-01.
 +
*** Cables for Intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with the Mellanox NIC on ThunderX2-02 first. - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab. - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Applied prefetches and dual loop to the l2_fwd node; the same attempt failed for l2_learn
 +
*** Lock-free allocation/free gives 7%-11% improvement on ThunderX2, but no improvement on x86 and Cortex-A72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy for all bihash update operations, then apply RCU to make lookups lock-free
 +
** EPIC for next quarter:
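Note on the bihash lockless item above: the plan is to have every update go through a working copy and then use RCU so that lookups take no lock. The following Python toy illustrates only the general copy-then-publish pattern. It is not VPP's bihash code; the class `CopyOnWriteTable` and its methods are hypothetical names. In C, an RCU grace period would be needed before freeing the old copy, whereas here Python's garbage collector plays that role.

```python
import threading


class CopyOnWriteTable:
    """Toy key/value table: writers copy, modify, publish; readers never lock."""

    def __init__(self):
        self._snapshot = {}                    # published version, never mutated in place
        self._writer_lock = threading.Lock()   # serialises writers only

    def lookup(self, key, default=None):
        # Reader path: grab one reference to the published snapshot and use it.
        snapshot = self._snapshot
        return snapshot.get(key, default)

    def add(self, key, value):
        with self._writer_lock:
            working_copy = dict(self._snapshot)  # make a working copy
            working_copy[key] = value            # modify the copy only
            self._snapshot = working_copy        # publish with a single reference swap

    def delete(self, key):
        with self._writer_lock:
            working_copy = dict(self._snapshot)
            working_copy.pop(key, None)
            self._snapshot = working_copy


table = CopyOnWriteTable()
before = table._snapshot            # what a concurrent reader might still hold
table.add("00:11:22:33:44:55", 7)
print(table.lookup("00:11:22:33:44:55"), before)
```

A reader that picked up the old snapshot before the update keeps seeing a consistent (if stale) view, which is the property the RCU step is meant to provide.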
 +
 
 +
 
 +
'''12/17/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
** Xiaoming Jiang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on Taishan servers only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
*** Trying to update kernel to 64K page size on CentOS - No update - Lijian
 +
**** Will try with CentOS 8 which seems to be working fine with 64K page size.
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues with building VPP for distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one of them, ThunderX2-01.
 +
*** Cables for Intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with the Mellanox NIC on ThunderX2-02 first. - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform about the 2 servers coming to the lab. - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
*** Patches are upstreamed, but not reviewed yet.
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Applied prefetches and dual loop to the l2_fwd node; the same attempt failed for l2_learn
 +
*** Lock-free allocation/free gives 7%-11% improvement on ThunderX2, but no improvement on x86 and Cortex-A72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check to check the metadata.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy for all bihash update operations, then apply RCU to make lookups lock-free
 +
** EPIC for next quarter:
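Note on the first vectorization patch listed above (buffer pointer to index with 128-bit SIMD): the idea is to convert several buffer pointers to indices per loop iteration instead of one at a time. A rough Python illustration follows, under the assumption that an index is the byte offset from a buffer-memory base address shifted right by a fixed amount. `BUFFER_MEM_START`, `LOG2_UNIT_BYTES`, and both function names are made-up values and names for illustration, and plain Python does not actually use SIMD; the fixed-size groups only mimic the loop shape.

```python
BUFFER_MEM_START = 0x1000_0000  # hypothetical base address of buffer memory
LOG2_UNIT_BYTES = 6             # hypothetical: one index step per 64 bytes


def pointer_to_index(ptr):
    """Scalar conversion: offset from the base, shifted down to an index."""
    return (ptr - BUFFER_MEM_START) >> LOG2_UNIT_BYTES


def pointers_to_indices(ptrs, lanes=2):
    """Convert pointers in fixed-size groups, with a scalar loop for the tail."""
    out = []
    n_full = len(ptrs) - len(ptrs) % lanes
    for i in range(0, n_full, lanes):
        group = ptrs[i:i + lanes]  # stands in for one vector register load
        out.extend((p - BUFFER_MEM_START) >> LOG2_UNIT_BYTES for p in group)
    for p in ptrs[n_full:]:        # leftover pointers handled one at a time
        out.append(pointer_to_index(p))
    return out


ptrs = [BUFFER_MEM_START + 64 * k for k in (0, 1, 5, 9, 10)]
indices = pointers_to_indices(ptrs)
print(indices)
```

The grouped version must return exactly what the scalar conversion returns for every pointer; only the number of loop iterations changes.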
 +
 
 +
'''12/10/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
** Xiaoming Jiang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on Taishan servers only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Have upgraded Python2 to Python3 successfully.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
*** Trying to update kernel to 64K page size on CentOS - Lijian
 +
**** Will try with CentOS 8 which seems to be working fine with 64K page size.
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues with building VPP for distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one ThunderX2.
 +
*** What's the preferred working method with the Mellanox NIC: the DPDK PMD or RDMA? - Juraj
 +
*** Check BIOS version - Lijian
 +
*** Make sure all NICs are plugged into the same PCI slot number - Lijian
 +
*** Verify Intel i40e driver/firmware version - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install a 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08: APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7): Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7): Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7): 20.01 release artifacts available.
 +
** Vectorization
 +
*** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
*** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
*** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in Confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** MAP can give profiling data at particular points on the timeline
 +
*** MAP cannot profile specific CPU cores and cannot give assembly views
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Get ACL entries cache-line aligned, and benchmark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check to check metadata.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy to all bihash update operations, then apply RCU to make look-ups lock-less
 +
** EPIC for next quarter:
 +
 
 +
 
 +
'''12/03/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
** Xiaoming Jiang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** There's a Python API issue which affects all performance tests on Taishan server only.
 +
**** The failure turns out to be caused by PCI show with Mellanox NICs on Taishan servers.
 +
**** Talk to Peter to temporarily remove 'PCI dump' for Taishan servers - Juraj
 +
**** Could you try a debug version of VPP with this setup and capture the traceback log? - Juraj
 +
**** Will try to root cause the problem with Taishan + Mellanox NIC - Lijian
 +
** VPP Path
 +
*** Verifying VPP on CentOS/Arm - Juraj
 +
*** Trying to update kernel to 64K page size on CentOS - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** Cross-compilation in VPP Path. Jira (Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** Packaging is not finished. Currently we support Ubuntu DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** Set up proper platform-specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
*** VPP device failed after Python3 upgrade
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install a 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08: APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7): Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7): Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7): 20.01 release artifacts available.
 +
** Vectorization
 +
*** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
*** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
*** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in Confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** MAP can give profiling data at particular points on the timeline
 +
*** MAP cannot profile specific CPU cores and cannot give assembly views
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Get ACL entries cache-line aligned, and benchmark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check to check metadata.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy to all bihash update operations, then apply RCU to make look-ups lock-less
 +
** EPIC for next quarter:
 +
 
 +
'''11/26/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
** Xiaoming Jiang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** There's a Python API issue which affects all performance tests on Taishan server only.
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** Cross-compilation in VPP Path. Jira (Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** Packaging is not finished. Currently we support Ubuntu DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** Set up proper platform-specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install a 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08: APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7): Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7): Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7): 20.01 release artifacts available.
 +
** Vectorization
 +
*** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
*** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
*** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Get ACL entries cache-line aligned, and benchmark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check to check metadata.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy to all bihash update operations, then apply RCU to make look-ups lock-less
 +
** EPIC for next quarter:
 +
 
 +
'''11/19/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** There's a Python API issue which affects all performance tests on Taishan server only.
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** Cross-compilation in VPP Path. Jira (Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** Packaging is not finished. Currently we support Ubuntu DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** Set up proper platform-specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install a 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08: APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7): Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7): Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7): 20.01 release artifacts available.
 +
** Vectorization
 +
*** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
*** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
*** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Get ACL entries cache-line aligned, and benchmark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check to check metadata.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy to all bihash update operations, then apply RCU to make look-ups lock-less
 +
** EPIC for next quarter:
 +
 
 +
'''11/12/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** Cross-compilation in VPP Path. Jira (Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** Packaging is not finished. Currently we support Ubuntu DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** Set up proper platform-specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj/Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install a 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08: APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7): Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7): Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7): 20.01 release artifacts available.
 +
** Vectorization
 +
*** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
*** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
*** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Get ACL entries cache-line aligned, and benchmark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy to all bihash update operations, then apply RCU to make look-ups lock-less
 +
** EPIC for next quarter:
 +
 
 +
'''10/29/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** Cross-compilation in VPP Path. Jira (Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** Packaging is not finished. Currently we support Ubuntu DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into the 'for developers' section of the vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** Set up proper platform-specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Applied prefetches and a dual loop to the l2_fwd node; the same approach failed on l2_learn
 +
*** Make ACL entries cache-line aligned, and benchmark the result.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** Check cycle counts after unrolling the CRC32 calculation
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy for all bihash update operations, then apply RCU to make lookups lockless
 +
** EPIC for next quarter:
 +
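The release-plan milestones recorded above are each seven days apart (F0+7, RC1+7, RC2+7). A minimal Python sketch, assuming only the F0 date from the minutes, reproduces the other three dates; the variable names are illustrative and not part of any VPP tooling.

```python
from datetime import date, timedelta

# F0 date as recorded in the minutes; each later milestone
# follows the previous one by exactly 7 days.
f0 = date(2020, 1, 8)
step = timedelta(days=7)

milestones = {
    "F0": f0,
    "RC1": f0 + step,
    "RC2": f0 + 2 * step,
    "Formal Release": f0 + 3 * step,
}

for name, day in milestones.items():
    print(f"{name}: {day.isoformat()}")
```

Changing `f0` shifts the whole train, which may help when mapping Arm patches onto a later release cycle.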
 
 +
'''10/22/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** Packaging is not finished. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; this seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** set up proper platform-specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Lijian
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Applied prefetches and a dual loop to the l2_fwd node; the same approach failed on l2_learn
 +
*** Make ACL entries cache-line aligned, and benchmark the result.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** Check cycle counts after unrolling the CRC32 calculation
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** First apply make_working_copy for all bihash update operations, then apply RCU to make lookups lockless
 +
** EPIC for next quarter:
 +
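The two-step plan for lockless bihash lookups noted above (modify a working copy, then publish it so that readers never take a lock) can be illustrated with a toy table. This is a conceptual Python sketch only, not VPP's bihash code: the class `SnapshotTable` and its methods are hypothetical names, and it does not implement RCU grace periods or memory reclamation, which Python's garbage collector handles here.

```python
import threading


class SnapshotTable:
    """Toy key/value table: writers copy and publish, readers never lock."""

    def __init__(self):
        self._snapshot = {}                   # published version, treated as read-only
        self._writer_lock = threading.Lock()  # serializes writers only

    def lookup(self, key, default=None):
        # Readers take the current snapshot reference once and search it without a lock.
        snapshot = self._snapshot
        return snapshot.get(key, default)

    def update(self, key, value):
        with self._writer_lock:
            working_copy = dict(self._snapshot)  # step 1: change a private working copy
            working_copy[key] = value
            self._snapshot = working_copy        # step 2: publish with one reference swap


table = SnapshotTable()
table.update("aa:bb:cc:dd:ee:ff", 3)
old_view = table._snapshot   # stands in for a reader that started before the next update
table.update("11:22:33:44:55:66", 7)

print(table.lookup("11:22:33:44:55:66"))
print(len(old_view), len(table._snapshot))
```

A reader that picked up the old snapshot keeps a consistent view while newer readers see the update; the cost is a full copy per write, which a real implementation would need to bound.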
 
 +
'''10/15/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** Packaging is not finished. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; this seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** set up proper platform-specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 240G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** Check cycle counts after unrolling the CRC32 calculation
 +
** Investigating bi-hash lockless implementation - Jason
 +
** EPIC for next quarter:
 +
 
 +
'''10/08/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** Packaging is not finished. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; this seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** set up proper platform-specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 240G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** Check cycle counts after unrolling the CRC32 calculation
 +
** Investigating bi-hash lockless implementation - Jason
 +
** EPIC for next quarter:
 +
 
 +
'''10/01/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** Packaging is not finished. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; this seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** set up proper platform-specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 240G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan. - Lijian
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
** Investigating bi-hash lockless implementation - Jason
 +
** EPIC for next quarter:
 +
 
 +
'''09/24/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info with Greeshma and Khem.
 +
** Finished PPT and demo to Pravin - Will share with Juraj and Honnappa.
 +
* CSIT
 +
** VPP Performance Test
 +
*** Investigate DPDK performance job - Juraj
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's a performance drop in CSIT performance testing, what action will be taken and who will take care of it?
 +
**** Currently trending data can only be monitored manually.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/ takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures in the daily performance test. Almost all the tests passed locally.
 +
**** Some common failures are due to Python bindings or something inside the VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** Packaging is not finished. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; this seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** set up proper platform-specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Inform Prashant in the FD.io lab about the incoming ThunderX2 - Lijian
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** It’s a 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** The eth_input_adv_and_flags_x4 optimization is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
**** Vectorize the data buffer index to data buffer pointer function.
 +
**** Jieqiang has finished code reviewing. Honnappa to review the patches.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
*** Finished reviewing the patches. Honnappa to review the patches.
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPUs, as there is mixed usage of GCC vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''09/17/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info to Greeshma and Khem.
 +
** Show CSIT CI/CD, Jenkins status, log and the voting right if there's any failure - Juraj & Lijian
 +
*** Will sync up with Juraj/Stan on Thursday on CSIT demo to Arm product manager.
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
 +
**** Currently trending data could be monitored manually only.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/ takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building vpp distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Inform Prashant in FD.io lab for the incoming ThunderX2 - Lijian
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
**** Vectorize the data buffer index to data buffer pointer function.
 +
**** Jieqiang has finished code reviewing. Honnappa to review the patches.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Crash issue is reproduced - Jieqiang
 +
*** Crash is gone after applying the patch.
 +
**** There's a crash when executing 'show hardwares'
 +
**** https://gerrit.oss.arm.com/#/c/131831/
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
*** Finished reviewing the patches. Honnappa to review the patches.
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPUs, as there's mixed usage of gcc vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''09/10/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info to Greeshma and Khem.
 +
** Show CSIT CI/CD, Jenkins status, log and the voting right if there's any failure - Juraj & Lijian
 +
*** Talk to Song about it.
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
 +
**** Currently trending data could be monitored manually only.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/ takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building vpp distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Inform Prashant in FD.io lab for the incoming ThunderX2 - Lijian
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Crash issue is reproduced - Jieqiang
 +
*** Crash is gone after applying the patch
 +
**** There's a crash when executing 'show hardwares'
 +
**** https://gerrit.oss.arm.com/#/c/131831/
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches applying the optimization to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPUs, as there's mixed usage of gcc vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''09/03/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info to Greeshma and Khem.
 +
** Show CSIT CI/CD, Jenkins status, log and the voting right if there's any failure - Juraj & Lijian
 +
*** Talk to Song about it.
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
 +
**** Currently trending data could be monitored manually only.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/ takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building vpp distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Issue is root-caused. Patch is in community review - https://gerrit.fd.io/r/c/vpp/+/21469
 +
** In total, 29 VPP device test cases were executed: 26 passed, and only 3 tap-related tests failed.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Inform Prashant in FD.io lab for the incoming ThunderX2 - Lijian
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Crash issue is reproduced - Jieqiang
 +
**** https://gerrit.oss.arm.com/#/c/131831/
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches applying the optimization to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPUs, as there's mixed usage of gcc vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''08/27/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info to Greeshma and Khem.
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
 +
**** Currently trending data could be monitored manually only.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/ takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building vpp distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Issue is root-caused. Patch is in community review - https://gerrit.fd.io/r/c/vpp/+/21469
 +
** In total, 29 VPP device test cases were executed: 26 passed, and only 3 tap-related tests failed.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Inform Prashant in FD.io lab for the incoming ThunderX2 - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Run VPP with MAP and reproduce the previous crash/failures - Jieqiang
 +
*** Got latest license to install MAP on Shanghai server.
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches applying the optimization to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPUs, as there's mixed usage of gcc vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''08/20/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info to Greeshma and Khem.
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
 +
**** Currently trending data could be monitored manually only.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/ takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** on Arm, different default memory map regions for normal page and huge page;
 +
**** vring with huge-page mapped to normal page region addresses is not working.
 +
**** 1. Reserve 16G VA space for future usage, automatic, private, anonymous and without HUGETLB option.
 +
***** base = mmap ((void *) 0x410000000, 16ULL << 30, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 +
**** 2. From the 16G VA space, pick up a 40M unused space, redo mmap() with the HUGETLB option, address fixed
 +
***** vaWithinBase = mmap (base, 40 << 20, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED | MAP_HUGETLB | MAP_LOCKED, fd, 0);
 +
**** 3. Use vaWithinBase to initialize vring and vring_desc
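The three steps above can be sketched in C as follows. This is only an illustration under stated assumptions, not the code from the patch under review: the helper names (va_reserve, va_map_within, reserve_then_map_demo) are hypothetical, MAP_NORESERVE is added here for the reservation, and the demo uses anonymous normal pages and small sizes so that it runs on a host without huge pages configured. On a real target, extra_flags would carry MAP_HUGETLB | MAP_LOCKED, fd would refer to the huge-page backed memory, and offset and length would need to be huge-page aligned.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Step 1: reserve address space only (PROT_NONE, no backing pages).
 * hint may be NULL to let the kernel choose the base address. */
static void *va_reserve(void *hint, size_t len)
{
    void *base = mmap(hint, len, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return base == MAP_FAILED ? NULL : base;
}

/* Step 2: replace part of the reservation with a usable shared mapping
 * at a fixed address. fd < 0 selects anonymous memory; extra_flags is
 * where MAP_HUGETLB | MAP_LOCKED would go (assumption, unused in the demo). */
static void *va_map_within(void *base, size_t reserve_len, size_t offset,
                           size_t len, int fd, int extra_flags)
{
    assert(offset + len <= reserve_len);
    int flags = MAP_SHARED | MAP_FIXED | extra_flags;
    if (fd < 0)
        flags |= MAP_ANONYMOUS;
    void *va = mmap((uint8_t *)base + offset, len,
                    PROT_READ | PROT_WRITE, flags, fd, 0);
    return va == MAP_FAILED ? NULL : va;
}

/* Returns 0 when the mapping lands at the requested address and is writable. */
static int reserve_then_map_demo(size_t reserve_len, size_t map_len)
{
    void *base = va_reserve(NULL, reserve_len);
    if (base == NULL)
        return -1;

    uint8_t *va = va_map_within(base, reserve_len, 0, map_len, -1, 0);
    int rc = -1;
    if (va == (uint8_t *)base) {
        /* Step 3 stand-in: the region is usable, e.g. for vring setup. */
        memset(va, 0xab, map_len);
        rc = (va[map_len - 1] == 0xab) ? 0 : -1;
    }
    munmap(base, reserve_len);
    return rc;
}
```

Because the second mmap() uses MAP_FIXED inside a range the process already owns, it cannot collide with an unrelated mapping, which is the point of reserving the range first.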
 +
** In total, 29 VPP device test cases were executed: 26 passed, and only 3 tap-related tests failed.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Run VPP with MAP and reproduce the previous crash/failures - Jieqiang
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixture usage of gcc vector extension and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''08/13/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info with Greeshma and Khem.
 +
* CSIT
 +
** VPP Performance Test
 +
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
** The daily job runs twice a day on x86; on Arm it takes 16 hours, so it will run once a day.
 +
** Trying to fix all the failures in the daily performance test. Almost all the tests passed locally.
 +
** Only 1 out of 199 test cases failed, 8 test cases show random 'show hardware-interfaces' failure.
 +
** Some failures are related to 'show hardware-interfaces'/'show vhost dump' timing out.
 +
*** Juraj to send Lijian the commands/APIs in random dump failure.
 +
*** https://jira.fd.io/browse/CSIT-1453
 +
*** SFP eeprom dump is enabled with 'show hardware-interfaces detail' only. Patch is merged.
 +
*** Juraj will change CSIT script with 'show hardware-interfaces verbose', https://gerrit.fd.io/r/#/c/csit/+/21085/
 +
**** CSIT patch is merged.
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
*** Patch to generate daily data and trending graph is committed.
 +
**** https://gerrit.fd.io/r/#/c/csit/+/20962/
 +
**** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine. Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; it seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Have gone through the whole patch, the pmalloc module and the tap interface code, but cannot identify the root cause - Lijian
 +
**** pmalloc-based buffer allocate/free seems to be causing the problem.
 +
**** mmap() regions backed by normal pages and by huge pages have separate VA spaces.
 +
** In total, 29 VPP device test cases were executed: 26 passed and only 3 tap-related tests failed.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Require two ThunderX2 servers in the FD.io lab (currently there is only one)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), which would enable voting rights for Arm servers in CSIT
 +
*** It is a 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. The Cambridge folks will set up the machine before sending it to the FD.io lab.
 +
*** The machine should have plenty of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** All 7 patches are merged.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Jieqiang checked the video by Sirshak
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixture usage of gcc vector extension and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''08/06/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
** The daily job runs twice a day on x86; on Arm it takes 16 hours, so it will run once a day.
 +
** Trying to fix all the failures in the daily performance test. Almost all the tests passed locally.
 +
** Only 1 out of 199 test cases failed, 8 test cases show random 'show hardware-interfaces' failure.
 +
** Some failures are related to 'show hardware-interfaces'/'show vhost dump' timing out.
 +
*** Juraj to send Lijian the commands/APIs in random dump failure.
 +
*** https://jira.fd.io/browse/CSIT-1453
 +
*** SFP eeprom dump is enabled with 'show hardware-interfaces detail' only. Patch is merged.
 +
*** Juraj will change CSIT script with 'show hardware-interfaces verbose', https://gerrit.fd.io/r/#/c/csit/+/21085/
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
*** Patch to generate daily data and trending graph is committed.
 +
**** https://gerrit.fd.io/r/#/c/csit/+/20962/
 +
**** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; it seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Have gone through the whole patch, the pmalloc module and the tap interface code, but cannot identify the root cause - Lijian
 +
**** pmalloc-based buffer allocate/free seems to be causing the problem.
 +
** In total, 29 VPP device test cases were executed: 26 passed and only 3 tap-related tests failed.
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Require two ThunderX2 servers in the FD.io lab (currently there is only one)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), which would enable voting rights for Arm servers in CSIT
 +
*** It is a 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. The Cambridge folks will set up the machine before sending it to the FD.io lab.
 +
*** The machine should have plenty of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** All 7 patches are merged.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Jieqiang checked the video by Sirshak
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixture usage of gcc vector extension and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''07/30/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
** Trying to fix all the failures in the daily performance test. Almost all the tests passed locally.
 +
** Only 1 out of 199 test cases failed, 8 test cases show random 'show hardware-interfaces' failure.
 +
** Some failures are related to 'show hardware-interfaces'/'show vhost dump' timing out.
 +
*** Juraj to send Lijian the commands/APIs in random dump failure.
 +
*** https://jira.fd.io/browse/CSIT-1453
 +
*** Will also check the details on an x86 server. It is slow on x86 too, but takes only 5 sec there versus 40 sec on Taishan - Lijian
 +
*** ‘show hardware-interfaces’ is quite time-consuming because it reads the SFP EEPROM via a software-emulated I2C bus.
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; it seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Have gone through the whole patch, the pmalloc module and the tap interface code, but cannot identify the root cause - Lijian
 +
**** pmalloc module test cases failed on Arm server due to sudo privilege.
 +
** In total, 35 VPP device test cases passed, and only 3 tap-related tests failed.
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Require two ThunderX2 servers in the FD.io lab (currently there is only one)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), which would enable voting rights for Arm servers in CSIT
 +
*** It is a 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. The Cambridge folks will set up the machine before sending it to the FD.io lab.
 +
*** The machine should have plenty of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan.
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue: optimize it with relaxed atomic intrinsics - Lijian
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
*** Spinlock with inner loop got improvement on both x86 and Arm.
 +
*** Read/write lock got a little degradation with the patch.
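One common reading of "spinlock with inner loop" is a test-and-test-and-set lock, where a waiter spins on a plain load and only retries the atomic exchange once the lock looks free. The sketch below illustrates that idea with C11 atomics. It is an assumption about what the patches do, not the VPP implementation, and the ttas_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct
{
  atomic_bool locked;
} ttas_lock_t;

void
ttas_init (ttas_lock_t *l)
{
  atomic_init (&l->locked, false);
}

/* One attempt: the exchange returns the previous value, so "false" means
   this caller took the lock. Acquire ordering protects the critical section. */
bool
ttas_trylock (ttas_lock_t *l)
{
  return !atomic_exchange_explicit (&l->locked, true, memory_order_acquire);
}

void
ttas_lock (ttas_lock_t *l)
{
  while (!ttas_trylock (l))
    {
      /* Inner loop: wait with relaxed loads only, so waiters read a shared
         cache line instead of repeatedly writing to it. */
      while (atomic_load_explicit (&l->locked, memory_order_relaxed))
        ;
    }
}

void
ttas_unlock (ttas_lock_t *l)
{
  /* Unlocking a lock that is not held would be a caller bug. */
  assert (atomic_load_explicit (&l->locked, memory_order_relaxed));
  atomic_store_explicit (&l->locked, false, memory_order_release);
}
```

The relaxed load in the inner loop carries no ordering guarantee on its own; ordering still comes from the acquire exchange that follows it and from the release store in ttas_unlock.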
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Jieqiang checked the video by Sirshak
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixture usage of gcc vector extension and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''07/23/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
** Trying to fix all the failures in the daily performance test. Almost all the tests passed locally.
 +
** Only 1 out of 199 test cases failed, 8 test cases show random 'show interface' failure.
 +
** Some failures are related to 'show hardware'/'show interface'/'show vhost dump' timing out.
 +
*** https://jira.fd.io/browse/CSIT-1453
 +
*** Will also check the details on an x86 server. It is slow on x86 too, but takes only 5 sec there versus 40 sec on Taishan - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
**** 1. All tests are failing. 'show hardware' takes too much time. https://jira.fd.io/browse/VPP-1722
 +
**** 2. To figure out which test cases are executed
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Enlarging the duration fixes the failure, but will investigate in more detail.
 +
**** Issues have been fixed in latest master branch. Investigating the details.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; it seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Send an email with the current debug details to the community, calling for a volunteer to fix it. - Lijian
 +
**** pmalloc module test cases failed on Arm server.
 +
*** Changes are uploaded to community gerrit.
 +
*** VPP VMs seem to come up well. Will work on the init script and bring up VPP.
 +
**** VM tests passed. Patches are to be submitted for community review.
 +
**** All the patches are merged and all images are built.
 +
*** Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
 +
**** Docker images for both Arm and x86 are merged and available.
 +
**** https://jenkins.fd.io/sandbox/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/1/console
 +
**** Docker image is verified on the Arm server; still need to verify it on an x86 server and try it in Jenkins.
 +
**** Arm and x86 have separate Docker images. The Arm Docker image is still to be built.
 +
**** 35 test cases in total, and only 3 tap-related tests failed.
 +
*** Ed to help set up a Nomad cluster with dual ThunderX and one ThunderX2
 +
* FD.io lab
 +
** Require two ThunderX2 servers in the FD.io lab (currently there is only one)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), which would enable voting rights for Arm servers in CSIT
 +
*** It is a 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. The Cambridge folks will set up the machine before sending it to the FD.io lab.
 +
*** The machine should have plenty of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue: optimize it with relaxed atomic intrinsics - Lijian
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
*** Spinlock with inner loop got improvement on both x86 and Arm.
 +
*** Read/write lock got a little degradation with the patch.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Inform MAP owner that Jieqiang will take care of MAP on VPP. - Lijian
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixture usage of gcc vector extension and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''07/16/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
**** 1. All tests are failing. 'show hardware' takes too much time. https://jira.fd.io/browse/VPP-1722
 +
**** 2. To figure out which test cases are executed
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Enlarging the duration fixes the failure, but will investigate in more detail.
 +
**** Issues have been fixed in latest master branch. Investigating the details.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; it seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Send an email with the current debug details to the community, calling for a volunteer to fix it. - Lijian
 +
*** Changes are uploaded to community gerrit.
 +
*** VPP VMs seem to come up well. Will work on the init script and bring up VPP.
 +
**** VM tests passed. Patches are to be submitted for community review.
 +
**** The patch is split into three small pieces. Two patches (kernel image for VM test, and generic CSIT changes to support the ThunderX2 testbed) are merged. The third patch, with the code changes for the VM test (Arm-specific code and use of the kernel image), is still to be merged.
 +
*** Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
 +
**** Docker images for both Arm and x86 are merged and available.
 +
**** Docker image is verified on the Arm server; still need to verify it on an x86 server and try it in Jenkins.
 +
*** Ed to help set up a Nomad cluster with dual ThunderX and one ThunderX2
 +
* FD.io lab
 +
** Require two ThunderX2 servers in the FD.io lab (currently there is only one)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), which would enable voting rights for Arm servers in CSIT
 +
*** It is a 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. The Cambridge folks will set up the machine before sending it to the FD.io lab.
 +
*** The machine should have plenty of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue: optimize it with relaxed atomic intrinsics - Lijian
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
*** Spinlock with inner loop got improvement on both x86 and Arm.
 +
*** Read/write lock got a little degradation with the patch.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixture usage of gcc vector extension and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''07/09/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** IPSEC test cases are failing and skipped on Arm server in CI/CD
 +
**** https://jira.fd.io/browse/VPP-1714
 +
**** Create a Jira ticket to track all the info related to this issue - Juraj
 +
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Enlarging the duration fixes the failure, but will investigate in more detail.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-build tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. It works on stable/1810 but not on stable/1901.
 +
**** Send an email with the current debug details to the community, calling for a volunteer to fix it. - Lijian
 +
*** Changes are uploaded to community gerrit.
 +
*** VPP VMs seem to come up well. Will work on the init script and bring up VPP.
 +
**** VM tests passed. Patches are to be submitted for community review.
 +
*** Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
 +
**** Docker images for both Arm and x86 are merged and available.
 +
*** Ed to help set up a Nomad cluster with dual ThunderX and one ThunderX2
 +
* FD.io lab
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
 +
*** Update the current status to Pravin. - Lijian
 +
*** The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** Requires more than 120 GB of RAM; 256 GB preferred.
 +
*** Three NICs, each with two ports.
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue: optimize it with relaxed atomic intrinsics - Lijian
 +
** Vectorization
 +
*** Patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
*** Spinlock with an inner loop shows improvement on both x86 and Arm.
 +
*** Read/write lock shows a slight degradation with the patch.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches applying it to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do benchmarking and profiling on mcbin.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPUs, as there is mixed usage of GCC vector extensions and vector intrinsics
 +
** Confirm the firmware and VPP-side changes needed to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
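The 07/09/2019 spinlock item mentions a "spinlock with inner loop" without showing it. One common reading is a test-and-test-and-set lock: an atomic exchange tries to take the lock, and on failure an inner loop spins on plain relaxed loads until the lock looks free. The sketch below uses C11 atomics and hypothetical <code>demo_spinlock_*</code> names; it is an illustration of the general technique, not the VPP patch under review.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; this is not VPP's spinlock API. */
typedef struct {
    atomic_bool locked;
} demo_spinlock_t;

static inline void demo_spinlock_init(demo_spinlock_t *l)
{
    atomic_init(&l->locked, false);
}

/* Single attempt: returns true if the lock was taken. */
static inline bool demo_spinlock_trylock(demo_spinlock_t *l)
{
    return !atomic_exchange_explicit(&l->locked, true, memory_order_acquire);
}

static inline void demo_spinlock_lock(demo_spinlock_t *l)
{
    for (;;) {
        /* Outer step: atomic read-modify-write with acquire ordering. */
        if (demo_spinlock_trylock(l))
            return;
        /* Inner loop: wait on relaxed loads only, so waiters do not keep
         * issuing atomic exchanges while the lock is held. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;
    }
}

static inline void demo_spinlock_unlock(demo_spinlock_t *l)
{
    /* Release ordering publishes the critical section to the next owner. */
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

Whether this form helps depends on the core and interconnect; the minutes only record that an improvement was measured on both x86 and Arm.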
 +
 
 +
 
 +
'''07/02/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** IPSEC test cases are failing and skipped on Arm server in CI/CD
 +
**** https://jira.fd.io/browse/VPP-1714
 +
**** Create a Jira ticket to track all the info related to this issue - Juraj
 +
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Increasing the test duration fixes the failure, but the cause will be investigated in more detail.
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - Lijian
 +
**** The current default C compiler identification is GNU 8.3.0
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-build tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. It works on stable/1810 but not on stable/1901.
 +
**** Send an email with the current debug details to the community, calling for a volunteer to fix it. - Lijian
 +
*** VPP VMs seem to come up well. Will work on the init script and bring up VPP.
 +
*** Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
 +
*** Set up a Nomad cluster with dual ThunderX and one ThunderX2
 +
* FD.io lab
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
 +
*** Update the current status to Pravin. - Lijian
 +
*** The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** Requires more than 120 GB of RAM; 256 GB preferred.
 +
*** Three NICs, each with two ports.
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue, remove atomic intrinsics and use lock version only - Lijian
 +
*** Have verified removing atomic intrinsics in message_queue alloc/free APIs, and require confirmation from Florin.
 +
** Vectorization
 +
*** Patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
** Fix ip4_forward compiling - Jason
 +
*** Will check the Gerrit CI/CD results related to that patch, and check why Gerrit Jenkins does not raise a warning.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Spread dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Will do benchmarking and profiling on mcbin.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
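The "dual/quad optimization" items in these minutes commonly refer to processing two or four packets per loop iteration, with a single-packet loop for the remainder. The sketch below shows only the loop shape, on a plain byte array; <code>demo_process_one</code> and <code>demo_process_quad</code> are hypothetical names, and the per-packet work (decrementing a TTL-like byte) is a stand-in, not code from any VPP node.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-packet work: decrement a TTL-like byte. */
static inline void demo_process_one(uint8_t *ttl)
{
    *ttl -= 1;
}

/* Processes n entries and returns how many were handled by the quad loop. */
static size_t demo_process_quad(uint8_t *ttl, size_t n)
{
    size_t i = 0;

    /* Quad loop: four independent packets per iteration, which gives the
     * compiler and the core more independent work to overlap. */
    while (n - i >= 4) {
        demo_process_one(&ttl[i + 0]);
        demo_process_one(&ttl[i + 1]);
        demo_process_one(&ttl[i + 2]);
        demo_process_one(&ttl[i + 3]);
        i += 4;
    }

    size_t quad_count = i;

    /* Single loop: handle the remaining zero to three packets. */
    while (i < n) {
        demo_process_one(&ttl[i]);
        i += 1;
    }

    return quad_count;
}
```

Real graph nodes usually also prefetch the packets for the next iteration inside the quad loop; that part is omitted here.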
 +
 
 +
'''06/25/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** IPSEC test cases are failing and skipped on Arm server in CI/CD
 +
**** https://jira.fd.io/browse/VPP-1714
 +
**** Create a Jira ticket to track all the info related to this issue - Juraj
 +
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Increasing the test duration fixes the failure, but the cause will be investigated in more detail.
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - Juraj
 +
**** The current default C compiler identification is GNU 8.3.0
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-build tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. It works on stable/1810 but not on stable/1901.
 +
*** Crypto test cases will use the DPDK driver if configured, then the native VPP implementation, and fall back to OpenSSL
 +
**** Will try Crypto test cases next week - Juraj
 +
*** Juraj to send Lijian the details of vpp VMs, Lijian will confirm internally
 +
* FD.io lab
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
 +
*** First, will sponsor the machine
 +
*** The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** Requires more than 120 GB of RAM; 256 GB preferred.
 +
*** Three NICs, each with two ports.
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue, remove atomic intrinsics and use lock version only - Lijian
 +
*** Have verified removing atomic intrinsics in message_queue alloc/free APIs, and require confirmation from Florin.
 +
** Vectorization
 +
*** Patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
** Spinlock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Spread dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Will do benchmarking and profiling on mcbin.
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''06/18/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina Tsou
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - Juraj
 +
**** The current default C compiler identification is GNU 8.3.0
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-build tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
*** Will confirm with Florin to remove atomic intrinsics in message_queue alloc/free APIs
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin shows some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - Upstreamed.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
** Spread dual/quad optimization - ethernet-input
 +
** Redo perf/MAP profiling/benchmarking
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** Apply dual/quad optimization on more data path nodes
 +
*** Investigate and optimize VPP hash and bihash library
 +
*** VPP translation overhead analysis between Mbuf and VLIB buffer ENTNET-1293
 +
*** VPP Memif performance analysis and optimization ENTNET-1292
 +
*** VPP l3fwd performance analysis and optimization ENTNET-751
 +
*** Using MAP with VPP ENTNET-1288
 +
 
 +
'''06/11/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina Tsou
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - Juraj
 +
**** The current default C compiler identification is GNU 8.3.0
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-build tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
*** Will confirm with Florin to remove atomic intrinsics in message_queue alloc/free APIs
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin shows some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - Upstreamed.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
** Spread dual/quad optimization - ethernet-input
 +
** Redo perf/MAP profiling/benchmarking
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''06/04/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina Tsou
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Stan
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-build tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2 in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin shows some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - Upstreamed.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''05/28/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina
 +
** Lijian Zhang
 +
** Vijay (vijayakumar.rajamanickam@nokia.com)
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-build tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2 in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin shows some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP with VPP - Tried an internal patch; still failing. Continuing to work on it.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''05/21/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina
 +
** Lijian Zhang
 +
** Vijay (vijayakumar.rajamanickam@nokia.com)
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-build tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2 in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin shows some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP with VPP - Tried an internal patch; still failing. Continuing to work on it.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''05/14/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina
 +
** Lijian Zhang
 +
** Vijay (vijayakumar.rajamanickam@nokia.com)
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-build tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2 in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** VPP generic distro package building patch - Patch updated. Requires Damjan's follow-up review.
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin shows some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP with VPP - Tried an internal patch; still failing. Continuing to work on it.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''05/07/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina
 +
** Lijian Zhang
 +
** Vijay (vijayakumar.rajamanickam@nokia.com)
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-build tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2 in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
*** Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input.
 +
** VPP generic distro package building patch - Patch updated; Damjan's follow-up review required.
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin shows some unstable performance.
 +
** Vectorization
 +
*** Vectorization in esp-encrypt, optimize memcpy_le. Upstreamed(https://gerrit.fd.io/r/#/c/18398/). - Lijian
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP(Arm Proprietary Performance Analysis Tool) with VPP - Tried internal patch; still failing. Continuing to work on it.
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''04/30/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
*** Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input.
 +
** VPP generic distro package building patch - Patch updated; Damjan's follow-up review required.
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin showing some unstable performance.
 +
** Vectorization
 +
*** Vectorization in esp-encrypt, optimize memcpy_le. Upstreamed(https://gerrit.fd.io/r/#/c/18398/). - Lijian
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** There is a performance drop after the ethernet-input optimization. The major reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP(Arm Proprietary Performance Analysis Tool) with VPP - Tried internal patch; still failing. Continuing to work on it.
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''04/23/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Lijian Zhang
 +
** Juraj Linkeš
 +
** Vijay
 +
** Nitin
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** QSFP+ is available and working now.
 +
**** Juraj to resend email to Mahamad about the details, including Sirshak and Tina
 +
*** mcbin: Kernel Migration on mcbin. Status: was able to update uboot but not boot the new kernel. Jialin suggested different boot parameters; Juraj yet to try.
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status: server is set up. Management connection works. Intel NICs are well connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
 +
* FD.io lab
 +
** Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
 +
*** Will use these four new ThunderX1 servers for CI, Jenkins to replace the previous three old ThunderX1 servers.
 +
*** These four ThunderX1 blades are not identical. The first one has two numa nodes, and the other three blades have one numa node.
 +
*** Investigate why these three blades have only one numa node - Juraj
 +
* VPP
 +
** Investigate session_queue_node_fn/vlib_worker_loop.
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
*** Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input
 +
** Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
 +
** With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
 +
** Vectorization
 +
*** Vectorization in esp-encrypt, https://gerrit.fd.io/r/#/c/18398/ - Improvement seen on ThunderX/OcteonTX/Taishan, but degradation on ThunderX2 - Lijian
 +
** TAS patch will be ready soon (Sirshak)
 +
** MAP with VPP is ongoing - Sirshak
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
* Action Items - Last Week
 +
* Action Items - Next Week
 +
 
 
'''04/16/2019'''
 
'''04/16/2019'''
 
* Attendees
 
* Attendees
Line 1,547: Line 14,650:
 
** VPP on both sides(iperf3 server and client) give a boost.(Reason unknown).  
 
** VPP on both sides(iperf3 server and client) give a boost.(Reason unknown).  
 
** Alternate test cases.
 
** Alternate test cases.
** khem to get more information on benchmarking DMM. Khem to send the information to community if there's more.
+
** khem to get more information on benchmarking DMM. Khem to send the information to
* Action Items - Last Week
+
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
** [Khem] Deployment of only L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs. - Need to prepare some scripts. First to understand how the script works and then add more options.
+
** [Lijian] Status on VPP path failures. Status: Still debugging. Still at early study stage.
+
** [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from Garcia and Damjan. - no progress so far.
+
** [Andy] has sent out on Nov 12th. Juraj has sent the info to LF.
+
* VPP
+
** Vectorization
+
*** [Lijian] Zero byte mask NEON implementation. - Merged.
+
*** [Lijian] ip4 lookup buffer index to buffer pointer optimizations. - Both micro and macro prove it not worth implementing vectorization here.
+
*** [Lijian] working on vectorized memory copy - Khem to send the vectorized memory copy done previously.
+
** Memory Ordering
+
*** [Sirshak] To start work on Arithmetic and Logic relaxed functions.
+
* CSIT
+
** VPP Path
+
*** 3 failures currently stalling deployment.
+
*** VPP-1476, VPP-1475, VPP-1478
+
*** These failures are seen on Debian x86 VM also.
+
*** Parallelization(n=32) is resulting in failures. Seems to also be caused by the two patches below.
+
*** VPP-1490, caused by https://gerrit.fd.io/r/#/c/15106/ and https://gerrit.fd.io/r/#/c/15534/.
+
** VPP Device
+
*** thunderx: Juraj created a LF tkt for wiring the 1-node topology on cavium thunderx.
+
*** thunderx2: no updates. Anton is figuring out where/how to put these servers.
+
*** mcbin: Kernel issue yet to try suggestion from Garcia and Damjan.
+
** VPP Performance Test
+
*** Work ongoing on writing scripts for Performance Jobs.
+
*** L2 test is working now manually. Khem is trying to get it to work in CI, and then IP4, and other test cases.
+
* FD.io lab
+
** Arista switch and power supply. Setup was sent out and tracking no. info was sent to LF.
+
** ThunderX2 - No updates.
+
* Action Items - Next Week
+
** [Lijian] to investigate VPP-1490 issue.
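The zero byte mask item above can be pictured with a scalar reference. This is only a sketch of the semantics as assumed here (bit i of the result is set when byte i is zero, for up to 16 bytes, i.e. one 128-bit vector); it is not the merged NEON code, and the name <code>zero_byte_mask_ref</code> is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference, not the VPP implementation.
   Assumed semantics: bit i of the result is set when bytes[i] == 0.
   Handles at most 16 bytes, matching one 128-bit vector register. */
static uint16_t
zero_byte_mask_ref (const uint8_t *bytes, size_t n)
{
  uint16_t mask = 0;

  assert (n <= 16);
  for (size_t i = 0; i < n; i++)
    if (bytes[i] == 0)
      mask |= (uint16_t) (1u << i);
  return mask;
}
```

A reference like this is mainly useful for checking that a vectorized variant returns the same mask on every architecture.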
+
 
+
 
+
'''11/06/2018'''
+
* Attendees
+
** Sirshak Das
+
** Honnappa
+
** Tina Tsou
+
** Andy Wang
+
** Khemendra
+
** Garcia
+
** Manuel
+
** Fede
+
* VPP Hoststack
+
** iperf3 performance with Hoststack.
+
** VPP on both sides(iperf3 server and client) give a boost.(Reason unknown).
+
** Alternate test cases.
+
** khem to get more information on benchmarking DMM.
+
* Action Items - Last Week
+
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
** [Khem] Deployment of only L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs.
+
** [Lijian] Status on VPP path failures. Status: Still debugging.
+
** [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from Garcia and Damjan.
+
** [Andy] to send tracking no for Arista power supply. Status: PO is still being worked on internally.
+
* VPP
+
** Vectorization
+
*** [Lijian] msb patch. - merged
+
*** [Lijian] Zero byte mask NEON implementation. - internal review completed; to be upstreamed soon.
+
*** [Lijian] ip4 lookup buffer index to buffer pointer optimizations. - internal investigations ongoing.
+
** Memory Ordering
+
*** [Sirshak] atomic exchange acquire and release macro patch. - Merged.
+
*** [Sirshak] To start work on Arithmetic and Logic relaxed functions.
+
* CSIT
+
** VPP Path
+
*** 3 failures currently stalling deployment.
+
*** VPP-1476, VPP-1475, VPP-1478
+
*** These failures are seen on Debian x86 VM also.
+
*** Parallelization(n=32) is resulting in failures.
+
** VPP Device
+
*** thunderx: Juraj created a LF tkt for wiring the 1-node topology on cavium thunderx.
+
*** thunderx2: no updates.
+
*** mcbin: Kernel issue yet to try suggestion from Garcia and Damjan.
+
** VPP Performance Test
+
*** Work ongoing on writing scripts for Performance Jobs.
+
* FD.io lab
+
** Arista switch power supply and rack rails. Status: being worked internally
+
** ThunderX2 - No updates.
+
* Action Items - Next Week
+
**
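For readers unfamiliar with the acquire and release exchange mentioned under Memory Ordering, a minimal sketch using plain C11 <code>stdatomic.h</code> follows. It does not reproduce the VPP macros from the merged patch; the <code>demo_</code> names are hypothetical and only illustrate why the two orderings are paired.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical test-and-set lock built on C11 atomics. */
typedef struct
{
  atomic_uint locked; /* 0 = free, 1 = held */
} demo_lock_t;

/* Exchange with acquire ordering: if the lock is taken, later loads and
   stores in the critical section cannot be reordered before this point. */
static bool
demo_try_lock (demo_lock_t *l)
{
  return atomic_exchange_explicit (&l->locked, 1u, memory_order_acquire) == 0u;
}

/* Store with release ordering: writes made inside the critical section
   become visible before the lock is observed as free. */
static void
demo_unlock (demo_lock_t *l)
{
  atomic_store_explicit (&l->locked, 0u, memory_order_release);
}
```

On a weakly ordered architecture such as AArch64 these orderings typically map to cheaper instructions than a full barrier, which is the usual motivation for relaxing them where correctness allows.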
+
'''10/30/2018'''
+
* Attendees
+
** Sirshak Das
+
** Honnappa
+
** Lijian Zhang
+
** Tina Tsou
+
** Andy Wang
+
** Khemendra
+
* Action Items - Last Week
+
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
** [Khem] To start with deployment of only L2 CSIT performance suite. - Discussed in CSIT meeting. 
+
** [Khem] to send the ip4 failure logs csit-dev, vpp-dev. - Sent csit-dev and vpp-dev. 
+
** [Juraj] Anton to install NICs. Juraj has sent the instructions. - Sent
+
* VPP
+
** Vectorization
+
*** [Lijian] Internal review going on msb patch.
+
*** [Lijian] Zero byte mask neon function not consistent with x86. Working on optimizing it.
+
*** [Lijian] ip4 lookup buffer index to buffer pointer optimization. - Still investigating.
+
** Memory Ordering:
+
*** [Sirshak] Internal review going on atomic exchange acquire and release macro patch.
+
* CSIT
+
** VPP Path
+
*** 2 failures currently stalling deployment.
+
*** https://jira.fd.io/browse/VPP-1476
+
*** https://jira.fd.io/browse/VPP-1475
+
*** VPP-1478 - L2FIB failures in Taishan.
+
*** 18.04 does not solve the problem
+
*** ThunderX2 - 4-5 failures in L2BD cases.
+
*** ThunderX1 - l2fib and juraj
+
*** [Lijian] to take a look.
+
** VPP Device
+
*** Waiting for NICs to be installed on ThunderX2.
+
*** mcbin : Linux Kernel not able to use the data ports. Needed for scapy. [Sirshak] to take a look
+
*** mcbin: New Kernel is not working.
+
** Performance Test
+
*** Discussed in CSIT meeting
+
*** Researching on how to create jobs
+
*** More things to be discussed with CSIT meeting.
+
* FD.io lab
+
** Arista switch needs additional hardware. - Andy to send tracking no.
+
** ThunderX2 - Waiting on LF for NICs installing and rack installing in general.
+
* Action Items - Next Week
+
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
** [Khem] Deployment of only L2 CSIT performance suite.
+
** [Lijian] Status on VPP path failures.
+
** [Sirshak] Kernel Migration mcbin.
+
** [Andy] to send tracking no for Arista power supply.
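The buffer index to buffer pointer item discussed in this meeting can be sketched in scalar form. The sketch assumes buffers sit in one contiguous region and that an index counts 64-byte cache lines from its start; the shift value and all <code>demo_</code> names are assumptions for illustration, not taken from the minutes.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: one buffer index unit equals one 64-byte cache line. */
#define DEMO_LOG2_CACHE_LINE 6

/* Hypothetical scalar translation of n buffer indices into pointers.
   A vectorized variant would convert several indices per instruction. */
static void
demo_indices_to_pointers (uint8_t *base, const uint32_t *idx, void **out,
			  size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = base + ((uintptr_t) idx[i] << DEMO_LOG2_CACHE_LINE);
}
```

Each conversion is a widen, a shift and an add, so there is little work to amortize, which may be why vectorizing it does not always pay off.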
+
 
+
'''10/23/2018'''
+
* Attendees
+
** Sirshak Das
+
** Juraj Linkeš
+
** Lijian Zhang
+
** Tina Tsou
+
** Andy Wang
+
** Khemendra
+
* Action Items - Last Week
+
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
** [Khem] VPP performance suite. L2XC, L2BD working. IPv4 failing.
+
** [Juraj and Sirshak] We need to check RC2 after 17th of October. [Khem] To send the test failure logs to vpp-dev.
+
* VPP
+
** Vectorization
+
*** [Lijian] Looking at shuffle and msb patches.
+
*** [Lijian] Sent the shuffle analysis to Nitin for feedback.
+
*** [Lijian] Preliminary analysis says we take more instructions hence the slow down. To reevaluate the algorithm itself.
+
** Memory Ordering:
+
*** [Sirshak] Part 1 of the patch merged which now introduces macros. Working on part 2.
+
* CSIT
+
** VPP Path
+
*** To be deployed as a part of CI after 1810 release.
+
** VPP Device
+
*** Waiting for NICs to be installed on ThunderX2.
+
*** mcbins working. Juraj to start work on getting one instance of the VPP device test running.
+
*** Docker working, musdk working. Currently facing issues with the 1-node topology.
+
** Performance Test
+
*** L2XC, L2BD working. IPv4 failing. [Khem] to send the failure logs csit-dev, vpp-dev.
+
*** [Khem] To start with deployment of only L2 performance suite. 
+
* FD.io lab
+
** Arista switch needs additional hardware. To be shipped in 2 weeks.
+
** NICs and wires for ThunderX2 - Received. Anton to install NICs. Juraj has sent the instructions. 
+
** Power Cycler to be ordered by LF - It's available and working.
+
* Documentation
+
** Lijian's patch merged.
+
* Action Items - Next Week
+
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
** [Khem] To start with deployment of only L2 performance suite. 
+
** [Khem] to send the failure logs csit-dev, vpp-dev.
+
** [Juraj] Anton to install NICs. Juraj has sent the instructions.
+
'''10/16/2018'''
+
* Attendees
+
** Sirshak Das
+
** Juraj Linkeš
+
** Lijian Zhang
+
** Tina Tsou
+
** Andy Wang
+
* Action Items - Last Week
+
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
** [Khem] to try VPP performance suite to see change after Vectorization and Loop unrolling patch.
+
** Remaining issues in L2 and IPv4 - Sirshak to try debug. Status: With inputs from Neale, issue fixed.
+
* VPP
+
** Vectorization
+
*** [Lijian] Studying about Vectorization and Memory ordering.
+
** Memory Ordering:
+
*** [Sirshak] One patch up-streamed. Relaxed memory ordering patch being reviewed internally.
+
* CSIT
+
** VPP Path
+
*** Should be deployed on 25th Oct after 1810 release.
+
*** We need to check RC2 after 17th of October [Juraj and Sirshak].
+
** VPP Device
+
*** Waiting for ThunderX2 NICs. Mcbin has issues with 2 VPP instances and traffic being sent.
+
** Performance Test
+
*** No updates
+
* FD.io lab
+
** QSFP+ switch for Cavium blades. - Received
+
** NICs and wires for ThunderX2 - Andy to confirm that the NICs have been sent. Juraj LF tkt to be opened.
+
** Power Cycler to be ordered by LF - It's available and should be operational this week.
+
* Documentation
+
** Scott has reviewed the changes from Lijian; he will merge them this week after a few modifications of his own.
+
* Action Items - Next Week
+
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
** [Khem] to try VPP performance suite to see change after Vectorization and Loop unrolling patch.
+
** [Juraj and Sirshak] We need to check RC2 after 17th of October.
+
 
+
'''10/09/2018'''
+
* Attendees
+
** Sirshak Das
+
** Maciek Konstantinowicz
+
** Juraj Linkeš
+
** Lijian Zhang
+
** Tina Tsou
+
** Andy Wang
+
* Action Items - Last Week
+
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
** [Juraj] to try musdk enabled kernel. - Juraj retried; it worked.
+
** [Khem] to try VPP performance suite - Diff in nos based on previous patches.
+
* VPP
+
** Vectorization
+
*** Do perf analysis of the compiled code, i.e. compare the quad-loop code with and without the buffer-index-to-buffer-pointer change.
+
*** Lijian to rebase the patch and try a few experiments
+
*** Khem Updates: No updates.
+
*** Memory Ordering Patch reintroduced. Broken down into smaller patches and the first patch has been upstreamed, others will be phased in gradually.
+
* CSIT
+
** VPP Path
+
*** Set up publicly accessible Cavium machine for debugging purposes
+
*** 2 tkts resolved by Neale.
+
*** Remaining issues in L2 and IPv4 - Sirshak to try debug; Neale will look into it in spare cycles
+
** VPP Device
+
*** SRIOV reservation system code exists and is being tested
+
** Performance Test
+
*** NDR issue Status:
+
*** Issues in ip4. Status:
+
* FD.io lab
+
** QSFP+ switch for Cavium blades. - Has been shipped
+
** NICs and wires for ThunderX2 - Wires have been shipped; NICs will be delayed
+
** Power Cycler to be ordered by LF.
+
* Documentation
+
** Started upstreaming, no response yet
+
* Action Items - Next Week
+
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
** [Khem] to try VPP performance suite to see change after Vectorization and Loop unrolling patch.
+
** Remaining issues in L2 and IPv4 - Sirshak to try debug. Status: With inputs from Neale, issue fixed.
+
 
+
'''10/02/2018'''
+
* Attendees
+
** Sirshak Das
+
** Honnappa
+
** Juraj Linkeš
+
** Lijian Zhang
+
** Khemendra
+
** Tina Tsou
+
** Andy Wang
+
* Action Items - Last Week
+
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
** [Khem] VPP-1391 - VPP 'make verify' failed on Huawei Taishan server: Similar failures to cavium setup.
+
** [Sirshak] - mlnx patch merged.
+
** [Sirshak] to setup meeting with andy, tina and honnappa regarding fd.io lab purchase planning. - Not needed.
+
** [Juraj] to try musdk enabled kernel. - Failing; Sirshak to debug this.
+
** [Lijian] to resolve upstream comments. - Patch merged upstream.
+
** [Khem] to try VPP performance suite with Lijian's Patch. - No need; can try with master now.
+
** [Khem] To try to debug traffic flow from TG to DUT with current master. - No updates.
+
** [Khem] To send current status. - Sent.
+
* VPP
+
** Vectorization
+
*** Understand Performance degradation.
+
**** msb correct version.
+
**** ip4_forward buffer index to buffer pointers.
+
** Tuning dual/quad loop
+
*** Patch merged.
+
*** Khem Updates: No updates.
+
* CSIT
+
** VPP Path
+
*** 2 categories of failures primarily. 8 tkts opened.
+
** VPP Device
+
*** shim layer to be leaner and most of the functionality will reside in jenkins-slave.
+
** Performance Test
+
*** L2-basic(L2XC, L2BD) PDR, MDR passing. NDR has issues but debugged.
+
*** Issues in ip4.
+
* FD.io lab
+
** mcbin - Sirshak to help debug connectivity issues
+
** QSFP+ switch for Cavium blades. - Andy working on getting a refurbished one.
+
** Power Cycler to be ordered by LF.
+
* Documentation
+
**
+
* Action Items - Next Week
+
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
** [Juraj] to try musdk enabled kernel. - Juraj retried; it worked.
+
** [Khem] to try VPP performance suite - Diff in nos based on previous patches.
+
 
+
'''9/25/2018'''
+
* Attendees
+
** Sirshak Das
+
** Honnappa
+
** Juraj Linkeš
+
** Lijian Zhang
+
** Khemendra
+
** Tina Tsou
+
** Andy Wang
+
* Action Items - Last Week
+
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. Khem to measure verify and test timings.
+
** Sirshak to verify if Damjan's suggestion on VPP compiling with Mellanox PMD driver works. - Does not. Lijian's suggested fix was merged by Damjan; an additional patch is needed and is under internal review.
+
** [VPP performance Suite] shm issues seen sporadically. Not seeing currently.
+
** Sirshak to set up meeting with Andy, Tina and Honnappa regarding fd.io lab purchase planning. - Not done; will do it this week.
+
** Sirshak to send musdk instructions. - Done.
+
** Juraj to try musdk enabled kernel and see if musdk and docker can coexist in the dirty kernel. - Trying them out.
+
* VPP
+
** Mellanox NIC not working with VPP
+
*** [Sirshak] glue library not detected by VPP. Compiling static has cmake issues. - Fix done pending internal review.
+
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. Khem to try things suggested by Juraj.
+
** Vectorization
+
*** Understand Performance degradation.
+
**** msb correct version.
+
**** ip4_forward buffer index to buffer pointers.
+
** Tuning dual/quad loop
+
*** Patch upstreamed pending review.
+
*** Lijian to resolve upstream comments.
+
*** Khem to try VPP performance suite with Lijian's Patch. (affects 1 route configuration more than 10k routes.)
+
* CSIT
+
** VPP Path
+
*** Current Test Cases Failure: 8
+
*** [Juraj] Mail sent regarding failures. Need community support regarding failures.
+
*** [Khem] To see if he can take up one test case failure and resolve it.
+
** VPP Device
+
*** In case of 2 loopbacks we can use 2 physical devices.
+
*** Facing issues with console connection to mcbin.
+
** Performance Test
+
*** [Khem] working on IPv4 testing failures with CSIT script. Will be starting on v4 suite. To debug at DPDK level and talk to VPP community.
+
*** [Khem] To try to debug traffic flow from TG to DUT with current master.
+
*** [Khem] To send current status.
+
* FD.io lab
+
** mcbin - trying new kernel images to resolve connectivity issues.
+
** QSFP+ switch for Cavium blades. - Waiting for reply from Anton.
+
* Documentation
+
** Trevor to help with contiv documentation. - Trevor has made changes, Lijian has sent for comments.
+
* Action Items - Next Week
+
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server.
+
** Sirshak - mlnx patch merged.
+
** Sirshak to setup meeting with andy, tina and honnappa regarding fd.io lab purchase planning. - Not needed.
+
** Juraj to try musdk enabled kernel. - Failing; Sirshak to debug this.
+
** Lijian to resolve upstream comments. - Patch merged upstream.
+
** Khem to try VPP performance suite with Lijian's Patch.
+
** [Khem] To try to debug traffic flow from TG to DUT with current master.
+
** [Khem] To send current status.
+
'''9/18/2018'''
+
* Attendees
+
** Sirshak Das
+
** Honnappa
+
** Juraj Linkeš
+
** Lijian Zhang
+
** Tina Tsou
+
* Action Items - Last Week
+
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server, and requires steps on reproducing the issue from Juraj.
+
** Lijian to update dual/quad loop code review per comments. - Resolved.
+
** Sirshak to verify if Damjan's suggestion on VPP compiling with Mellanox PMD driver works. - It doesn't. Send him a reminder.
+
** Juraj to investigate the compiler versions on fd.io lab machines. Upgrade to latest GCC-7.3.
+
** [VPP performance Suite] shm issues seen sporadically. Khem to send an email to vpp-dev.
+
** Honnappa to confirm: if we order NICs for ThunderX1, specify whether they are external NICs.
+
** Sirshak to setup meeting with andy, tina and honnappa regarding fd.io lab purchase planning.
+
** Sirshak to send musdk instructions.
+
* VPP
+
** Mellanox NIC not working with VPP
+
*** [Sirshak] glue library not detected by VPP. Compiling static has cmake issues.
+
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. - make build-release and make test work. Khem to see if make verify is still failing.
+
** Vectorization
+
*** [Sirshak] Patches submitted. Patch widening the usage has performance issues.
+
** Tuning dual/quad loop
+
*** [Lijian] 3%- throughput improvement (mcbin).
+
*** Include mcbin nos in commit message.
+
** [Khem] working on IPv4 testing failures with CSIT script
+
* CSIT
+
** CSIT-1139 - parallelize 'make verify'
+
*** make verify - working. Test Failure still there
+
*** Juraj investigating FAILED test cases in make verify.
+
** CSIT VPP Device updates
+
*** Juraj to try musdk enabled kernel and see if musdk and docker can coexist in the dirty kernel.
+
** CSIT Performance Test Suite Updates
+
* FD.io lab
+
** mcbin issue
+
** QSFP+ ports for Cavium blades.
+
* Documentation
+
** Trevor to help with contiv documentation.
+
* Action Items - Next Week
+
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server, and requires steps on reproducing the issue from Juraj.
+
** Sirshak to verify if Damjan's suggestion on VPP compiling with Mellanox PMD driver works. -
+
** [VPP performance Suite] shm issues seen sporadically. Khem to send an email to vpp-dev.
+
** Sirshak to setup meeting with andy, tina and honnappa regarding fd.io lab purchase planning.
+
** Sirshak to send musdk instructions.
+
** Juraj to try musdk enabled kernel and see if musdk and docker can coexist in the dirty kernel.
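The dual/quad loop tuning tracked above refers to a loop shape that handles several packets per iteration and then finishes the remainder one at a time. A minimal sketch of that shape follows; the per-packet work (decrementing a TTL byte) and the <code>demo_</code> names are hypothetical stand-ins, and real graph nodes usually add prefetching, which is omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-packet work: decrement a TTL byte. */
static inline void
demo_process_one (uint8_t *ttl)
{
  *ttl -= 1;
}

/* Quad loop: four elements per iteration while at least four remain,
   then a single loop for the remaining zero to three elements. */
static void
demo_process_all (uint8_t *ttl, size_t n)
{
  size_t i = 0;

  while (n - i >= 4)
    {
      demo_process_one (&ttl[i + 0]);
      demo_process_one (&ttl[i + 1]);
      demo_process_one (&ttl[i + 2]);
      demo_process_one (&ttl[i + 3]);
      i += 4;
    }
  while (i < n)
    {
      demo_process_one (&ttl[i]);
      i++;
    }
}
```

Whether four-wide or two-wide unrolling is faster depends on the core, which is presumably why the item is tracked as tuning and measured per platform.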
+
'''9/11/2018'''
+
* Attendees
+
** Honnappa Nagarahalli
+
** Juraj Linkeš
+
** Sirshak Das
+
** Lijian Zhang
+
** Tina Tsou
+
** Khemendra Kumar
+
* Action Items - Last Week
+
** Khem and Sachin to verify Sirshak's vectorization patches. - Ongoing(Khem)
+
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. - try make build-release and then make test. - Continue investigation on this issue.
+
*** [Juraj] sends his steps to Khem to reproduce this issue.
+
** Lijian to talk to Damjan regarding adding Architecture specific TAG in make.- Code review internally
+
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
** Juraj to investigate the compiler versions on fd.io lab machines. - No update
+
** [VPP performance Suite] shm issues seen sporadically. Khem to send an email to vpp-dev. - Not reproducible so far, will keep observing.
+
* VPP
+
** Mellanox NIC not working with VPP caused by libmnl.so missing in cmake
+
*** [Sirshak]
+
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. - make build-release and make test work. Khem to see if make verify is still failing.
+
** Vectorization
+
*** [Sirshak] Patches ready and sent for review. Need arm community feedback on performance implications.
+
*** [Khem/Nitin] to verify how those patches affect performance.
+
** Tuning dual/quad loop
+
*** [Lijian] Dual/Quad loop code change is under internal review. Sirshak gives some comments, and Lijian will update diff accordingly.
+
** [Khem] working on IPv4 testing failures with CSIT script
+
* CSIT
+
** CSIT-1139 - parallelize 'make verify'
+
*** Parallelization is stable and working fine. We might see feature improvement requirements or new bugs.
+
*** No new requirement so far.
+
** CSIT VPP Device updates
+
*** Two tickets on VPP FD.io done. Have to verify to make sure they are working.
+
*** x86 has improvement and some experiments are done. Juraj will try those experiments on ARM platforms.
+
** CSIT Performance Test Suite Updates
+
***  Facing issues with ip4 forwarding test case.
+
*** shm issues seen sporadically. Khem to send an email to vpp-dev.
+
* FD.io lab
+
** 3 LF tkts - 1 Node Topology wiring mcbin, 3-node topology wiring Mcbin, 1-node Topology wiring Cavium. - Finished
+
* Documentation
+
** Sirshak to take a look.
+
* Action Items - Next Week
+
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server, and requires steps on reproducing the issue from Juraj.
+
** Lijian to update dual/quad loop code review per comments
+
** Sirshak to verify if Damjan's suggestion on VPP compiling with Mellanox PMD driver works
+
** Juraj to investigate the compiler versions on fd.io lab machines. Upgrade to latest GCC-8.2.0
+
** [VPP performance Suite] shm issues seen sporadically. Khem to send an email to vpp-dev.
+
** Honnappa to confirm: if we order NICs for ThunderX1, specify whether they are external NICs
+
 
+
'''9/4/2018'''
+
* Attendees
+
** Sirshak Das
+
** Lijian Zhang
+
** Tina Tsou
+
** Khemendra Kumar
+
* Action Items - Last Week
+
** Lijian to try merging it upstream - Mellanox Changes. - Resolved.
+
** Sachin to introduce RFC for IPsec offload support in DPDK plugin. - No updates.
+
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. - try make build-release and then make test.
+
** Lijian to talk to Damjan regarding adding Architecture specific TAG in make.- Discuss Internally.
+
** Juraj to investigate the compiler versions on fd.io lab machines. - No updates.
+
* VPP
+
** VPP-1339 - Mellanox NIC not working with VPP. - Resolved
+
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. - make build-release and make test work. Khem to see if make verify is still failing.
+
** Vectorization
+
*** [Sirshak] Patches ready and sent for review. Need arm community feedback on performance implications.
+
** Tuning dual/quad loop
+
*** [Lijian/Brian] Investigate dynamic function selection. Brian to look at this.
+
* CSIT
+
** CSIT-1139 - parallelize 'make verify'
+
*** No updates
+
** CSIT VPP Device updates
+
***  No updates
+
** CSIT Performance Test Suite Updates
+
***  Facing issues with ip4 forwarding test case.
+
*** shm issues seen sporadically. Khem to send an email to vpp-dev.
+
* FD.io lab
+
** 3 LF tkts - 1 Node Topology wiring mcbin, 3-node topology wiring Mcbin, 1-node Topology wiring Cavium.
+
* Documentation
+
** Sirshak to take a look.
+
* Action Items - Next Week
+
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server.
+
** Lijian to talk to Damjan regarding adding Architecture specific TAG in make.
+
** Juraj to investigate the compiler versions on fd.io lab machines.
+
** [VPP performance Suite] shm issues seen sporadically. Khem to send an email to vpp-dev.
+
 
+
'''8/28/2018'''
+
* Attendees
+
** Sirshak Das
+
** Sachin Saxena
+
** Juraj Linkes
+
** Lijian Zhang
+
** Tina Tsou
+
** Khemendra Kumar
+
** Honnappa Nagarahalli
+
** Brian Brooks
+
* Action Items - Last Week
+
** Khem to create Jira Tkt - startup-config issues (NUMA node and memory issues). Jira ID [VPP-1405]
+
** Sirshak to ask Juraj to create a LF tkt for Power cycling. - Done
+
** Lijian to follow up Mellanox issue. - Done. Patch Verified. To try merging it upstream.
+
** Andy following up on cavium. - Done. Unavailability of resource from cavium. Box not priority right now, will take up later.
+
** Khem to create Jira IDs for Jumbo frames. [CSIT-1259] In CSIT performance suite, Jumbo frames TCs failing on ARM servers
+
** Sachin to introduce RFC for IPsec offload support in DPDK plugin. - No updates, trying to resolve cmake issues.
+
** Sirshak to add porting and tuning section to wiki. - Done
+
** Sirshak/Juraj talk about Mellanox issue in CSIT call. - Done
+
* VPP
+
** VPP-1339 - Mellanox NIC not working with VPP - Mellanox provided DPDK Patch ready, Lijian to try upstream it to VPP.
+
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. - try make build-release and then make test.
+
** Vectorization
+
*** [Sirshak] Rough patch ready. Currently facing a crash due to it.
+
** Tuning dual/quad loop
+
*** Discussion ongoing with Damjan.
+
** Lijian to talk to Damjan regarding adding Architecture specific TAG in make.
+
** Also to proceed with generic compilation to build for all micro architecture and do dynamic selection.
+
* CSIT
+
** Juraj to investigate the compiler versions on fd.io lab machines.
+
** CSIT-1139 - parallelize 'make verify'
+
*** Discussing with EdK, how to use that in jenkins job.
+
** CSIT VPP Device updates
+
*** No problems. Trying basic package installation of container topology.
+
*** aarch64 vpp packages not built for 18.04 LTS, potential problem when we switch from 16.04->18.04.
+
** CSIT Performance Test Suite Updates
+
***  No updates.
+
* FD.io lab
+
** 3 LF tkts - 1 Node Topology wiring mcbin, 3-node topology wiring Mcbin, 1-node Topology wiring Cavium.
+
* Documentation
+
* Action Items - Next Week
+
** Lijian to try merging it upstream - Mellanox Changes.
+
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. - try make build-release and then make test.
+
** Lijian to talk to Damjan regarding adding Architecture specific TAG in make.
+
'''8/21/2018'''
+
* Attendees
+
** Sirshak Das
+
** Sachin Saxena
+
** Juraj Linkes
+
** Lijian Zhang
+
** Andy Wang
+
** Tina Tsou
+
** Khemendra Kumar
+
** Honnappa Nagarahalli
+
** Brian Brooks
+
* Action Items - Last Week
+
** [Sirshak] Create LF RT ticket for power cycling mcbins - Not Done yet
+
** [Honnappa] Add module owners list and performance analysis items to wiki page - Discussion Still going on.
+
** [Lijian] Check if DPDK 18.08 helps Mellanox NIC issues. - Waiting for patch from Mellanox
+
** [Sirshak] Create Jira ticket to see impact of Florin's patch : VPP-1401
+
** [Sirshak] Create Jira ticket for msb : VPP-1402
+
** [Khem] Try dual loop ip4_lookup_inline patch to see if it helps on A72-based D05. : Problems with IPv4 forwarding (startup-config issues - NUMA node and memory issues).
+
** [Brian] [https://projects.linaro.org/browse/LTN-10 LTN-10] - Help resolve VPP build failure on mcbins in FD.io lab
+
** [Juraj] Enable VPP Device on 1-node SoC now that SFP+ cables have arrived. : No Response from LF.
+
** [Sirshak] Follow up with Cavium regarding Ubuntu installation on cavium-4. Status: Andy Following up.
+
** [Khem] Create Jira ticket for CSIT failures with jumbo frames
+
** [Khem] Create Jira ticket for running a subset of tests via a tag : [CSIT-1250] In ARM Perf verify CI, running a subset of tests via a tag
+
* VPP
+
** VPP-1339 - Mellanox NIC not working with VPP
+
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan server
+
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
** Vectorization 
+
*** Stalled due to Mellanox NIC issues, as benchmarking patches is not possible.
+
*** hadd and msb - Done.
+
*** extendto and shuffle going on.
+
*** Shuffle using __built_in gives the same performance as the vector intrinsic, as at -O2 neither compiles to a tbl instruction.
+
** Tuning dual/quad loop
+
** Sirshak to add Porting and Tuning Section to Wiki.
+
* CSIT
+
** CSIT-1139 - parallelize 'make test'
+
*** Dave Barach to take a final look and merge.
+
** Sirshak/Juraj to talk about having Mellanox in CSIT seeing current compatibility issues post-release.
+
** CSIT VPP Device updates
+
*** Trying to get the 1-node topology: mcbin and cavium thunderx.
+
** CSIT Performance Test Suite Updates
+
*** Current issues: NDR, PDR Jumbo frames failure but MRR passing. Memory and Numa Nodes issues in Taishan.
+
* FD.io lab
+
** 3 LF tkts - Ubuntu Installation cavium-4, 1 Node Topology, Power Cycling mcbins(to be opened).
+
* Documentation
+
** Documentation changes by Lijian Merged.
+
* Action Items - Next Week
+
** Khem to create Jira Tkt - startup-config issues (NUMA node and memory issues).
+
** Sirshak to ask Juraj to create a LF tkt for Power cycling.
+
** Lijian to follow up Mellanox issue.
+
** Andy following up on cavium.
+
** Khem to create Jira IDs for Jumbo frames.
+
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
** Sirshak to add porting and tuning section to wiki.
+
** Sirshak/Juraj talk about Mellanox issue in CSIT call.
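The shuffle item under Vectorization in this meeting refers to the AArch64 tbl instruction. As a reference for what such a shuffle computes, the following is a minimal Python model of a single-register byte table lookup. It is only an illustration of the instruction's semantics (out-of-range indices yield zero), not VPP code; the name tbl_lookup is hypothetical.

```python
def tbl_lookup(table: bytes, indices: bytes) -> bytes:
    """Model a 16-byte, single-register TBL lookup.

    Each output byte is table[index] when index < 16, otherwise 0.
    """
    if len(table) != 16 or len(indices) != 16:
        raise ValueError("table and indices must both be 16 bytes long")
    return bytes(table[i] if i < 16 else 0 for i in indices)


# Example: reverse the byte order of a 16-byte vector.
table = bytes(range(0x10, 0x20))
reverse = bytes(range(15, -1, -1))
print(tbl_lookup(table, reverse).hex())
```

A model like this can serve as a correctness oracle when comparing a builtin-based shuffle against an intrinsic-based one on the same inputs.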
+
+
 
+
'''8/14/2018'''
+
* Attendees
+
** Juraj Linkes
+
** Lijian Zhang
+
** Andy Wang
+
** Tina Tsou
+
** Khemendra Kumar
+
** Honnappa Nagarahalli
+
** Brian Brooks
+
* FD.io lab
+
** SFP+ cables shipment showing as delivered
+
* VPP
+
** VPP-1339 - Mellanox NIC not working with VPP
+
*** Lijian noticed DPDK version updated to 18.08 and might help - https://gerrit.fd.io/r/#/c/14154/
+
*** Tina helping find someone from Mellanox to help
+
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan servers
+
*** Khem looking into this
+
** No updates on crypto
+
** No updates on vectorization
+
** Tuning dual/quad loop
+
*** DaveB suggests looking at MULTIARCH macros
+
* CSIT
+
** CSIT-1139 - parallelize 'make test'
+
*** Juraj updated patch with comments from Klement
+
** Khem seeing failures with jumbo frames
+
** Khem noticed new CSIT machines using tag to run a subset of tests
+
* Documentation
+
** Lijian working on patch to add Arm to Architecture section and Arm-based CSIT testbeds to CSIT section
+
* Action Items - Next Week
+
** [Sirshak] Create LF RT ticket for power cycling mcbins
+
** [Honnappa] Add module owners list and performance analysis items to wiki page
+
** [Lijian] Check if DPDK 18.08 helps Mellanox NIC issues
+
** [Sirshak] Create Jira ticket to see impact of Florin's patch
+
** [Sirshak] Create Jira ticket for msb
+
** [Khem] Try dual loop ip4_lookup_inline patch to see if it helps on A72-based D05
+
** [Brian] Help resolve VPP build failure on mcbins in FD.io lab
+
** [Juraj] Enable VPP Device on 1-node SoC now that SFP+ cables have arrived
+
** [Sirshak] Follow up with Cavium regarding Ubuntu installation on cavium-4
+
** [Khem] Create Jira ticket for CSIT failures with jumbo frames
+
** [Khem] Create Jira ticket for running a subset of tests via a tag
+
 
+
'''8/7/2018'''
+
* Attendees
+
** Sirshak Das
+
** Juraj Linkes
+
** Lijian
+
** Andrew Pinski
+
** Andy Wang
+
** Tina Tsou
+
** Nitin Saxena
+
** Khemendra
+
** Sachin Saxena
+
 
+
* General Topic
+
* Action Items - Last Week
+
** [Khem] make verify on Taishan failure Status: No Status. Khem to create a Jira Tkt.
+
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status: [Andy] Cables to be sent today.
+
** [Sirshak] Open Jira tkt look at Florin's patch. Status: To be done next week
+
** [Sirshak] DPDK 18.05 mlnx bug (VPP-1339). Status: Failing in a different place, like rx-error; reported to Mellanox people by Lijian. [Lijian] To send the mail to vpp-dev. [Honnappa] To talk to the Mellanox DPDK community.
+
** [Sirshak] Share Mellanox settings with Nitin.
+
** [Sirshak] to send email to Yi and Lijian for documentation. Status: Lijian has done the documentation; it is under internal review.
+
** [Honnappa/Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status: [Sachin] To include Nitin suggestions and upstream.
+
** [Sirshak] Send mail to LF for power cycler access for mcbin due to lack of IPMI interface Status: Sirshak to open LF Tkt
+
** [Nitin] Create a Jira Tkt - ip cksum 128 bit vector support. Status: Nitin using ip incremental cksum.
+
** [Sirshak] To create a LF tkt for Ubuntu 16.04 installation on cavium-4. Status: Anton tried 16.04 but it didn't work, sent mail to Cavium contact for help.
+
* VPP
+
** [Sirshak] Vectorization
+
*** msb is already implemented; verifying correctness and performance.
+
*** [Sirshak] To raise a Jira Tkt for msb changes.
+
*** Have communicated to ARM compiler team related to vtbl performance.
+
*** planning to add cvt (extend_to) and hadd(horizontal) equivalents.
+
** [Brian/Sirshak] Tuning Dual or Quad loop.
+
*** [Khem/Sachin] Updates on the changes seen after applying Brian's Patch. Status: No Updates.
+
** [Sachin] To create Jira Card, DPDK IOVA issue. (Created VPP-1377)
+
** [Honnappa] Module Ownership Discussion. Status: To come back to discussion next time. Community feedback to move to more use-case based approach.
+
* CSIT
+
** [Juraj] Parallelizing the make test (CSIT-1139) Status: Scheduling done. Waiting for community review; got some internal comments that Juraj is working on. To try this patch on the Jenkins sandbox.
+
** [Sirshak] replying to cavium regarding Ubuntu 18.04/16.04 installation problem cavium-4. Done. Status: Following up with Cavium
+
** [Khem] Performance Suite: 64B, 9000 Jumbo. Jumbo frames are failing. (Khem to create Jira tkt: startup.conf, frame size, NIC card, hugepages configuration.)
+
** [Khem] Have a subset of tests running with tag.
+
** [Juraj/Sirshak] VPP Device SoC one node topology constraints Status: Orchestration still under discussion.
+
* fd.io lab
+
** [Juraj] mcbin access Status: Accessible; mcbin build failing, waiting for Brian's help.
+
** [Sirshak] cavium blades. Status: [Sirshak] Following up with cavium
+
* Documentation
+
** Need to update the working ARM boards in the documentation section.
+
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
** Subscribe to: docs@lists.fd.io
+
* Action Items - Next Week
+
** [Khem] make verify on Taishan failure, Khem to create a Jira Tkt. Status:
+
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status:
+
** [Sirshak] Open Jira tkt look at Florin's patch. Status: Not done; to be done next week
+
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status:
+
** [Lijian] To send the mail to vpp-dev (VPP-1339) Status:
+
** [Honnappa] To talk to the Mellanox DPDK community. Status:
+
** [Sachin] To include Nitin suggestions and upstream.(ARMv8 Crypto changes) Status:
+
** [Sirshak] To Open a LF Tkt regarding power cycler remote access for mcbin. Status:
+
** [Sirshak] To raise a Jira Tkt for msb changes. Status:
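The action item above mentions using an incremental IP checksum. As background, here is a minimal Python sketch of the incremental update from RFC 1624 (HC' = ~(~HC + ~m + m')), checked against a full recomputation. It is not taken from the VPP source; the function names and the sample header are illustrative assumptions.

```python
def fold16(total: int) -> int:
    """Fold carries back into the low 16 bits (one's-complement addition)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(words):
    """Full Internet checksum over 16-bit words (checksum field set to 0)."""
    return ~fold16(sum(words)) & 0xFFFF


def incremental_update(old_csum: int, old_word: int, new_word: int) -> int:
    """RFC 1624 eqn. 3: update a checksum after one 16-bit word changes."""
    total = (~old_csum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    return ~fold16(total) & 0xFFFF


# Sample IPv4 header as 16-bit words, checksum field (index 5) zeroed.
header = [0x4500, 0x0073, 0x0000, 0x4000, 0x4011,
          0x0000, 0xC0A8, 0x0001, 0xC0A8, 0x00C7]
old = checksum(header)

# Decrement the TTL (high byte of word 4): 0x4011 -> 0x3F11.
updated = list(header)
updated[4] = 0x3F11
print(hex(old), hex(incremental_update(old, header[4], updated[4])))
```

The incremental form touches only the changed word, which is why it is attractive on a forwarding path where only the TTL changes.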
+
'''7/30/2018'''
+
* Attendees
+
** Sirshak Das
+
** Juraj Linkes
+
** Lijian
+
** Andrew Pinski
+
** Andy Wang
+
** Tina Tsou
+
** Nitin Saxena
+
** Khemendra
+
** Sachin Saxena
+
 
+
* General Topic
+
* Action Items - Last Week
+
** [Khem] make verify on Taishan failure Status: No Status
+
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status: [Andy] Still working internally, expecting to be done this week.
+
** [Sirshak] mlnx tx non vector version used for no-multiseg. Status: Jira Tkt VPP-1355
+
** [Sirshak] DPDK 18.05 mlnx bug (VPP-1339). Status: Yet to be verified whether it is fixed.
+
** [Sirshak] look at Florin's patch. Status: No status, [Sirshak] Open Jira tkt.
+
** [Tina] to get back on New ARMv8 Crypto. Status: Bob to schedule meeting with Cavium. To be tracked by Nitin, Bob, Tina.
+
** [Sirshak] Why Quad to Dual loop improves performance. Status: VPP-1356
+
** [Sirshak] To update VPP documentation with fd.io lab devices. Status: Not yet done. [Sirshak] to send email to Yi and Lijian.
+
** [Sirshak] VPP Vectorization Jira Tkt. Status: VPP-1357
+
** [Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status: Waiting for Nitin to help on changes for Internal DPDK. External DPDK support patch done. [Sachin : created VPP-1378] To create a Jira Tkt for Internal Tkt. [Honnappa] To comment on current gerrit item to get it moving.
+
** [Sirshak] replying to cavium regarding Ubuntu 18.04 installation problem cavium-4. Status: Sent
+
** [Sirshak] Get credentials from Brian for mcbin Status: Done
+
** [Sirshak] Send mail to LF for power cycler access for mcbin due to lack of IPMI interface Status: Done
+
* VPP
+
** [Sirshak] Vectorization
+
*** Almost done with shuffle.
+
*** Will get to working with msb.
+
*** AARCH32 compilation to be discussed.(Shuffle Vector Intrinsic AARCH64 ARMv8 specific)
+
*** There are no specific requirements on aarch32 at this time.
+
** [Lijian && Yi] To continue effort on analyzing IPv4 nos on available platforms with Intel and Mellanox NICs
+
*** [Sirshak] Why is Mellanox NIC not used in CSIT ? Performance Suite Designed for Intel and Cisco NICs.
+
** [Brian/Sirshak] Tuning Dual or Quad loop.
+
*** [Khem/Sachin] Updates on the changes seen after applying Brian's Patch. Status: No Updates.
+
** [Khem] Updates on Benchmarking on taishan. Status: Held up hardware.
+
** [Nitin] Any new findings from IPv4 VPP test case. Status: Working HW offloading.
+
** [Sachin] To create Jira Card, DPDK IOVA issue. (Created VPP-1377)
+
** [Lijian] ipcksum - No Degradation on Qualcomm.
+
** [Nitin] Create a Jira Tkt - ip cksum 128 bit vector support.
+
* CSIT
+
** [Juraj] Parallelizing the make test (CSIT-1139) Status: All VPP instances were running on the same core. Tried scheduling cores. Dynamically finding available cores. Sweet spot currently: 8 containers with 96 cores.
+
** [Juraj] Test features listed by talking to Dave.
+
** [Sirshak] replying to cavium regarding Ubuntu 18.04 installation problem cavium-4. Done. Status: To open a new LF tkt to ask for 16.04 installation.
+
** [Juraj/Sirshak] VPP Device SoC one node topology constraints Status: Orchestration still under discussion.
+
** [Sirshak] to ask Brian about mcbin credentials. Status: Done.
+
* fd.io lab
+
** [Juraj] mcbin access Status: Created LF tkt.
+
** [Sirshak] cavium blades. Status: [Sirshak] To create a LF tkt for Ubuntu 16.04 installation on cavium-4.
+
* Documentation
+
** Need to update the working ARM boards in the documentation section.
+
*** [Lijian/Yi] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).
+
*** [Lijian/Yi] Add only fd.io lab devices.
+
*** [Sirshak] To send email with details.
+
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
** Subscribe to: docs@lists.fd.io
+
* Action Items - Next Week
+
** [Khem] make verify on Taishan failure Status:
+
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status: [Andy]
+
** [Sirshak] Open Jira tkt look at Florin's patch. Status:
+
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status:
+
** [Sirshak] to send email to Yi and Lijian for documentation.
+
** [Honnappa/Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status:
+
** [Sirshak] Send mail to LF for power cycler access for mcbin due to lack of IPMI interface Status:
+
** [Nitin] Create a Jira Tkt - ip cksum 128 bit vector support.
+
** [Sirshak] To create a LF tkt for Ubuntu 16.04 installation on cavium-4.
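The CSIT-1139 status in this meeting notes that all VPP instances ended up on the same core and that available cores are now found dynamically, with 8 containers on 96 cores as the current sweet spot. The sketch below shows one way such a split could be computed. It is an assumption-laden illustration, not the actual patch; partition_cores and available_cores are hypothetical names.

```python
import os


def available_cores():
    """Cores this process may run on (falls back to a plain count)."""
    if hasattr(os, "sched_getaffinity"):
        return sorted(os.sched_getaffinity(0))
    return list(range(os.cpu_count() or 1))


def partition_cores(cores, workers):
    """Split cores into `workers` contiguous, non-overlapping groups."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    if workers > len(cores):
        raise ValueError("more workers than cores")
    base, extra = divmod(len(cores), workers)
    groups, start = [], 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)  # spread any remainder
        groups.append(cores[start:start + size])
        start += size
    return groups


# The configuration mentioned in the minutes: 96 cores, 8 containers.
groups = partition_cores(list(range(96)), 8)
print([f"{g[0]}-{g[-1]}" for g in groups])
```

On a real machine, available_cores() would supply the first argument, and each group could then be handed to one test container as its CPU set.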
+
 
+
'''7/24/2018'''
+
* Attendees
+
** Sirshak Das
+
** Juraj Linkes
+
** Lijian
+
** Andrew Pinski
+
** Andy Wang
+
** Tina Tsou
+
** Nitin Saxena
+
** Khemendra
+
 
+
* General Topic
+
** .
+
* Action Items - Last Week
+
** [Honnappa/Nitin] Aarch64 rpms not building - DPDK Neon Build Break. Status: Nitin, compiler does not support arm neon intrinsics. Honnappa working with compiler team: neon intrinsics is supported #defines not present. Tmp solution available. Honnappa to follow up in DPDK.
+
** [Nitin/Sachin] Follow up: Add Virtual addressing support in IOVA dmap Status: patch committed by sachin merged.
+
** [Khem] make verify on Taishan failure Status: No updates.
+
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status: PO Approved. Should get going in few days.
+
** [Sirshak] mlnx tx non vector version used for no-multiseg. Status: Waiting for x86 hotspots for confirmation and will then open a ticket.
+
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status: Seems to be fixed but have not tried all the test cases to confirm.
+
** [Sirshak] look at Florin's patch. Status: Not done yet
+
** [Tina] to get back on New ARMv8 Crypto. Status: No updates. Close to complete but not upstreamed yet.
+
** [Sirshak] Why Quad to Dual loop improves performance. Status: Not saturating no of outstanding prefetches. AI to raise a Jira Bug.
+
** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000 (Check with Sachin). Status: Not done yet.
+
* VPP
+
** [Sirshak] vectorization patch effects
+
*** Made a few changes; no visible changes.
+
*** Plan to read mlnx drivers DPDK to understand how neon intrinsics accelerate the vectors.
+
*** Add Jira Tkt.
+
** [Sirshak] Anomalies with mlx5 and VPP.
+
** [Honnappa <-> Nitin] Nitin okay with ARM contacting Customer Support for help on TX2 optimal settings.
+
** [Brian/Sirshak] Tuning Dual or Quad loop.
+
*** Visible change in A72.
+
*** Sirshak sent patch to Sachin and Khem to analyze if they see any improvement.
+
*** [Khem->Sirshak] Why moving from Quad to Dual improves performance.
+
** [Lijian] x86 nos reported: 9.5 Mpps is not same as reported by Nitin Status: Could be because of the Broadwell and Skylake difference.
+
** [Khem] Updates on IPv4 Benchmarking on taishan. Status: CSIT performance bringup on fd.io lab. 18.04 gcc 7.3 trex. Workaround done. DUT VPP crashing. Plan for running L2 test cases.
+
** [Nitin] Any new findings from IPv4 VPP test case. Status: not available to discuss
+
**
+
* CSIT
+
** [Juraj] Parallelizing the make test(CSIT-1139) Status: Sent for review, Figuring out optimal no of threads.
+
** [Sirshak] replying to cavium regarding Ubuntu 18.04 installation problem cavium-4.
+
** [Juraj/Sirshak] VPP Device SoC one node topology constraints Status: [Sirshak] Access to one of the three consoles.
+
** [Sirshak] to ask Brian about mcbin credentials.
+
** [Adarsh] VPP Path/Device Efforts: Nested Container, trying VM inside a container facing some issues. Status: Work on hold as Adarsh moved out of the project.
+
* fd.io lab
+
** [Sirshak] Installation of TG pending. Status: Done
+
** [Juraj] mcbin access Status: Two of them can be accessed; the other one can't.
+
** [Sirshak] cavium blades connected need SFP and DACs. Status: Up and running, still need SFP and DACs
+
* Documentation
+
** Need to update the working ARM boards in the documentation section.
+
*** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).
+
*** [Sirshak] Add only fd.io lab devices.
+
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
** Subscribe to: docs@lists.fd.io
+
* Action Items - Next Week
+
** [Khem] make verify on Taishan failure Status:
+
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status:
+
** [Sirshak] mlnx tx non vector version used for no-multiseg. Status: Jira Tkt VPP-1355
+
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status:
+
** [Sirshak] look at Florin's patch. Status:
+
** [Tina] to get back on New ARMv8 Crypto. Status:
+
** [Sirshak] Why Quad to Dual loop improves performance. Status: VPP-1356
+
** [Sirshak] To update VPP documentation with fd.io lab devices. Status:
+
** [Sirshak] VPP Vectorization Jira Tkt. Status: VPP-1357
+
** [Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Last Status: Waiting for Nitin to help on changes for Internal DPDK. Current Status:
+
** [Sirshak] replying to cavium regarding Ubuntu 18.04 installation problem cavium-4. Status: Sent
+
** [Sirshak] Get credentials from Brian for mcbin Status: Done
+
** [Sirshak] Send mail to LF for power cycler access for mcbin due to lack of IPMI interface Status: Done
+
'''7/17/2018'''
+
* Attendees
+
** Sirshak Das
+
** Sachin Saxena
+
** Khemendra Kumar
+
** Juraj Linkes
+
** Lijian
+
 
+
* General Topic
+
** Austin folks leaving the meeting early. If need be, somebody can take over after 1 hour (9 am CT).
+
* Action Items - Last Week
+
** [Honnappa/Nitin] Aarch64 rpms not building - DPDK Neon Build Break
+
** [Nitin/Sachin]Follow up: Add Virtual addressing support in IOVA dmap Status: No updates.
+
** [Khem] make test on Taishan timings: Status: Done. To look at why make verify.
+
** [Sirshak] cavium USB-Ethernet adapters to Quantta Switch. Status: Andy waiting for cables to reach him.
+
** [Sirshak] mlnx tx non vector version used for no-multiseg. Jira Tkt Status: Not yet done. Will do this week.
+
** [Sirshak] DPDK 18.05 mlnx bug. Status: Sirshak to open Jira Tkt - VPP-1339
+
** [Sirshak] look at Florin's patch. Status: Not yet done.
+
** [Tina] to get back on New ARMv8 Crypto.
+
* VPP
+
** [Sirshak] vectorization patch effects
+
*** Made a few changes; no visible change.
+
*** Plan to read mlnx drivers DPDK to understand how neon intrinsics accelerate the vectors.
+
** [Brian/Sirshak] Tuning Dual or Quad loop.
+
*** Visible change in A72.
+
*** None in Qualcomm because of pfrm not being hotspot.
+
*** [Khem->Sirshak] Why moving from Quad to Dual improves performance.
+
*** Community wide investigation needed.
+
** [Lijian] x86 nos reported: 9.5 Mpps is not same as reported by Nitin Status: Investigation.
+
** [Khem] Updates on IPv4 Benchmarking on taishan. Status: Stuck with pktgen.
+
** [Nitin] Any new findings from IPv4 VPP test case. Status:
+
** [Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status: Waiting for Nitin to help on changes for Internal DPDK.
+
**
+
* CSIT
+
** [Juraj] Parallelizing the make test(CSIT-1139) Status: Almost done, need to work on polishing.
+
** [Juraj/Sirshak] SoC devices as non voting VPP device targets. Status: [Sirshak] pending on TG credentials.
+
** [Adarsh] VPP Path/Device Efforts: Nested Container, trying VM inside a container facing some issues. Status: No update, Adarsh replaced on the project; postponed
+
* fd.io lab
+
** [Sirshak] Installation of TG pending. Status: No update from LF - Anton
+
** [Sirshak] cavium blades connected need SFP and DACs. Status: Up and running, still need SFP and DACs
+
* Documentation
+
** Need to update the working ARM boards in the documentation section.
+
*** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).
+
*** [ARM community] Waiting for feedback from Khem and other companies
+
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
** Subscribe to: docs@lists.fd.io
+
* Action Items - Next Week
+
** [Honnappa/Nitin] Aarch64 rpms not building - DPDK Neon Build Break Status:
+
** [Nitin/Sachin] Follow up: Add Virtual addressing support in IOVA dmap Status:
+
** [Khem] make test on Taishan failure Status:
+
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status:
+
** [Sirshak] mlnx tx non vector version used for no-multiseg. Status:
+
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status: 
+
** [Sirshak] look at Florin's patch. Status:
+
** [Tina] to get back on New ARMv8 Crypto.
+
** [Sirshak] Why Quad to Dual loop improves performance.
+
** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).
+
 
+
'''7/10/2018'''
+
* Attendees
+
** Sirshak Das
+
** Sachin Saxena
+
** Khemendra Kumar
+
** Tina Tsou
+
** Nitin Saxena
+
** Juraj Linkes
+
** Brian Brooks
+
** Lijian
+
** Tom Herbert
+
* General Topic
+
** Austin folks leaving the meeting early. If need be, somebody can take over after 1 hour (9 am CT).
+
** [Tom] Aarch64 rpms not building - anyone can help?
+
* Action Items - Last Week
+
** [Nitin/Sachin]Follow up: Add Virtual addressing support in IOVA dmap Status: No updates.
+
** [Nitin] make test on Thunderx2 timings Status: Send error report of make test.
+
** [Khem] make test on Taishan timings: Status: 22 mins. Try make verify.
+
** [Sirshak] cavium USB-Ethernet adapters to Quantta Switch. Status: Done for cavium 1,2,3. Need cables for 4,5,6,7. Cables ordered
+
** [Khem] to update on nested VMs on performance test cases. Status: No updates. Could be a naming problem.
+
** [Sirshak] Q to Maciek: buildroot image with VPP device(within container)? Status: No updates. Check with Brian to see if buildroot works on arm.
+
** [Sirshak] mlnx tx non vector version used for no-multiseg. Reason ? Status: No updates. Sirshak to open Jira Tkt.
+
** [Sirshak] DPDK 18.05 mlnx bug. Status: Asked in the community need to look at backtrace as pointed by damjan. Sirshak to open Jira Tkt.
+
* VPP
+
** [Sirshak] vectorization patch effects. https://gerrit.fd.io/r/#/c/13229/
+
*** I see around 15% improvement on Qualcomm with Mellanox based on some patch which is not the vectorization patch; need to find it.
+
*** Do others see similar improvement in the past 2 weeks?
+
*** [Sirshak] look at Florin's patch.
+
** [Lijian] x86 nos, checking with Nitin for sync on configuration. Skylake Single Core Single Thread: IPv4 forwarding 64B 15 Mpps.
+
** [Khem] Updates on IPv4 Benchmarking on taishan. Status: No Updates
+
** [Nitin] Any known comparison between AVF nos on aarch64 and DPDK nos? On Intel it's ~25% and ARM ~20%.
+
** [Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status: Internal DPDK changes effort. Wait for status on New ARMv8 Crypto.
+
** [Sirshak->Nitin] Thunderx2(high core count)coremask for DPDK config in VPP startup conf.
+
** [Tina] to get back on New ARMv8 Crypto.
+
* CSIT
+
** [Juraj] Parallelizing the make test(CSIT-1139) Discussion: On Plan and if anybody wants to join hands. 
+
** [Juraj/Sirshak] SoC devices as non voting VPP device targets. Discussion: mcbin console access will be available once TG credentials are available.
+
** [Adarsh] VPP Path/Device Efforts: Nested Container, trying VM inside a container facing some issues.
+
* fd.io lab
+
** [Sirshak] Taishan connected need to verify once we get TG credentials. [Khem] Checked from Taishan side ports connected to TG are up.
+
** [Sirshak] mcbin connected need to verify once we get TG credentials.
+
** [Sirshak] cavium blades connected need to switch the network adapters before using it for CI.
+
* Documentation
+
** Need to update the working ARM boards in the documentation section.
+
*** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).
+
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
** Subscribe to: docs@lists.fd.io
+
* Action Items - Next Week
+
** [Honnappa/Nitin] Aarch64 rpms not building - DPDK Neon Build Break
+
** [Nitin/Sachin]Follow up: Add Virtual addressing support in IOVA dmap Status: No updates.
+
** [Khem] make test on Taishan timings: Status:
+
** [Sirshak] cavium USB-Ethernet adapters to Quantta Switch.
+
** [Sirshak] mlnx tx non vector version used for no-multiseg. Jira Tkt Status:
+
** [Sirshak] DPDK 18.05 mlnx bug. Status: Sirshak to open Jira Tkt.
+
** [Sirshak] look at Florin's patch.
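The VPP item in this meeting about a coremask for a high core count Thunderx2 involves masks wider than 64 bits, which are easy to get wrong by hand. The following is a small Python sketch for converting between a core list and a hexadecimal mask. It assumes the conventional layout where bit N stands for core N; the function names are hypothetical, and how the mask is passed to the startup configuration is not covered here.

```python
def cores_to_coremask(cores) -> str:
    """Build a hex mask with bit N set for every core N in `cores`."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core ids must be non-negative")
        mask |= 1 << core
    return hex(mask)


def coremask_to_cores(mask_hex: str):
    """Inverse of cores_to_coremask: list the set bit positions."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Python integers are arbitrary precision, so cores above 63 work as well.
print(cores_to_coremask([0, 1, 2, 3]))
print(cores_to_coremask(range(64, 72)))
```

Round-tripping a mask through both helpers is a quick sanity check before it goes into a configuration file.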
+
 
+
'''7/3/2018'''
+
* Attendees
+
** Sirshak Das
+
** Sachin Saxena
+
** Khemendra Kumar
+
** Tina Tsou
+
** Nitin Saxena
+
** Juraj Linkes
+
** Brian Brooks
+
** Ed Kern
+
** Song
+
** Lijian
+
* General Topic
+
** Architecture Section in Documentation.
+
* Action Items - Last Week
+
** Khem: Ipv4 layer investigation. To Share some findings next week on parameters for CSIT Status: Done. If yes cover in VPP section.
+
** Nitin Follow up: Sachin: Upstreaming ARMv8 Crypto Changes with external DPDK. Status: Nitin to provide help on using Internal DPDK
+
** Nitin Follow up: Add Virtual addressing support in IOVA dmap Status: Waiting for response from Damjan
+
** Nitin make test on Thunderx2 timings :
+
** Khem: status on make test failures: CSIT-1148 Status: Fixed.
+
** Khem: make test on Taishan timings: Status: No status
+
** Sirshak: cavium USB-Ethernet adapters to Quantta Switch. Status: Still working with LF guys
+
** Khem to update on nested VMs on performance test cases. Status: No Updates
+
** Sirshak & Khem: Documentation review. Status: Done. continuous effort.
+
** Sirshak: Q to Maciek: buildroot image with VPP device(within container) ? Status: No updates.
+
* VPP
+
** Sirshak: Investigate mlnx_burst_rx_vec used in case of no multi-seg but plain mlnx_tx_burst used. Movement of hotspot seen for rx. Probable reason SRIOV(VFs) used. Root cause yet to be found.
+
** Sirshak: VPP DPDK 18.05 change done by Damjan. mlnx drivers on Qualcomm are a problem. Urge everyone to test respective sanity in their setup. set interface state <InterfaceName> up - stuck
+
** Khem: Discuss various parameters in CSIT for IPv4 Testing.
+
** Sirshak: TCP termination performance nos ?
+
** Sirshak: vectorization patch effects. https://gerrit.fd.io/r/#/c/13229/
+
* CSIT
+
** Juraj Make test bottlenecks: Updates: One plausible solution available. Parallelizing the make test(CSIT-1139)
+
** Juraj to start looking at SoC devices as non voting VPP device targets.
+
** Adarsh: openssl issues ? Issue still persists.
+
** Adarsh: VPP Path Tasks.
+
** Tkt updates:
+
*** CSIT-1021: Handle Scapy pcap limit Khem (brief on patch, updates): Status: To check with CSIT team for Jenkins build failure. Status: No updates. Not a priority.
+
* fd.io lab
+
** Sirshak: Update from LF guys
+
* Documentation
+
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
** Subscribe to: docs@lists.fd.io
+
* Action Items - Next Week
+
** [Nitin/Sachin]Follow up: Add Virtual addressing support in IOVA dmap Status:
+
** [Nitin] make test on Thunderx2 timings :
+
** [Khem] make test on Taishan timings: Status:
+
** [Sirshak] Cavium USB-Ethernet adapters to Quanta Switch. Status:
+
** [Khem] to update on nested VMs on performance test cases. Status:
+
** [Sirshak] Q to Maciek: buildroot image with VPP device(within container)? Status:
+
** [Sirshak] mlnx tx non vector version used for no-multiseg. Reason ? Status:
+
** [Sirshak] DPDK 18.05 mlnx bug. Status:
+
 
+
'''6/26/2018'''
+
* Attendees
+
** Sirshak Das
+
** Sachin Saxena
+
** Khemendra Kumar
+
** Tina Tsou
+
** Nitin Saxena
+
** Juraj Linkes
+
** Brian Brooks
+
** Ed Kern
+
** Song
+
* General Topic
+
** Introduce Song, Yi and Lijian
+
* Action Items - Last Week
+
** Adarsh: Updates on Jira tkt for openssl issues. Updates: none
+
** Adarsh: Update on topology for Kubernetes Functional Tests. Updates: Kubernetes, Docker
+
** Sirshak Tuning Section - Not Done
+
** Khem: Ipv4 layer investigation. CSIT: IPv4. To Share some findings next week on parameters for CSIT
+
** Nitin: Send old dpdk input node patch - Done
+
** Sachin: Upstreaming ARMv8 Crypto Changes with external DPDK. - Nitin to send mail
+
** Add Virtual addressing support in IOVA dmamap: Updates - nitin to send mail
+
** Nitin: Measure make and make test on ThunderX2
+
** Khem: measure make and make test on Taishan (Juraj tested it; it failed: https://jira.fd.io/browse/CSIT-1148)
+
** Sirshak: try to switch eth-usb for regular eth ports on ThunderXs - Created an LF tkt; follow-up meeting today.
+
* VPP
+
** Discuss vec_en_rx/tx=1 parameters.
+
** Discuss Vectorized rx and tx functions in mlx5 (in case of no multi-seg)
+
** rxd, txd numbers in VPP config.
+
** mbcache any configuring done from VPP side ?
+
* CSIT
+
** make test failures Taishan Khem/adarsh (https://jira.fd.io/browse/CSIT-1148)
+
** Juraj Make test bottlenecks: Updates: Ran 4 containers (85 mins) (CSIT-1139)
+
** mcbin, OD(1000/3000), cavium thunderX as one of the targets for VPP Device Test.
+
** Future role of devices. Status: Existing Taishan Servers to be used for performance suite only.
+
** Khem to update on nested VMs on performance test cases.
+
** buildroot image with VPP device (within container)? Sirshak to ask Maciek
+
** Tkt updates:
+
*** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Juraj Updates: Merged and Closed
+
*** CSIT-990 (buildroot package) Juraj Updates: Postponed
+
*** CSIT-1021: Handle Scapy pcap limit Khem(brief on patch, updates): Status: To check with CSIT team for jenkins build failure.
+
* fd.io lab
+
** Sirshak to follow up with LF guys.
+
* Documentation
+
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
** Subscribe to: docs@lists.fd.io
+
** Sirshak and Khem to try doing some reviews this week.
+
* Action Items - Next Week
+
** Khem: Ipv4 layer investigation. To Share some findings next week on parameters for CSIT
+
** Nitin Follow up: Sachin: Upstreaming ARMv8 Crypto Changes with external DPDK.
+
** Nitin Follow up: Add Virtual addressing support in IOVA dmap
+
** Nitin make test on Thunderx2 timings :
+
** Khem: status on make test failures: CSIT-1148
+
** Khem: make test on Taishan timings:
+
** Sirshak: Cavium USB-Ethernet adapters to Quanta Switch.
+
 
+
** Sirshak: try to switch eth-usb for regular eth ports on ThunderXs - Created an LF tkt; follow-up meeting today.
+
 
+
'''6/19/2018'''
+
* Attendees
+
** Sirshak Das
+
** Sachin Saxena
+
** Khemendra Kumar
+
** Tina Tsou
+
** Nitin Saxena
+
** Juraj Linkes
+
** Brian Brooks
+
** Ed Kern
+
** Song
+
* General Topic
+
** Introduce Yi, Lijian and Song
+
* Action Items - Last Week
+
** Brian: mcbin Status:
+
** Sirshak: Follow up clang changes. Status: Merged updated wiki.
+
** Khem: Update Kernel Image based on Sanil's input to move past known VM issues. Try this internally.
+
** Khem: LF tkt for Taishan BIOS updates.
+
*** No update for the ticket
+
** Adarsh: openssl updates. Status:
+
*** Raised Jira ticket, needs to be discussed with VPP folks
+
** Adarsh: Kubernetes
+
*** Working with K8s folks, planning on creating topology from containers for functional tests
+
** Khem: VM(s) in container, VFs for containers
+
** Sirshak: Summarize tkts in the Tuning Section. Status: Not Done
+
** Khem: Investigation on ipv4 layer. Status: Not Done
+
** Nitin: Send old patch on dpdk_input node tuning
+
* VPP
+
** Sachin: Upstreaming armv8 crypto changes. Status: Sachin will try to upstream a patch related to external DPDK
+
** Sirshak: Vectorization - Presentation.
+
** Any new findings on hotspots or optimizations. Brian: adjusting queue sizes seems to have an effect
+
** https://gerrit.fd.io/r/#/c/12932/ discussion: Need to understand the use case(s) for iommu inside VPP
+
* CSIT
+
** Discuss current make test time bottleneck.
+
** AI Nitin: measure make and make test on ThunderX
+
** AI Khem: measure make and make test on Taishan
+
** AI Sirshak: try to switch eth-usb for regular eth ports on ThunderXs
+
** Future role of devices. Status: will be decided when we have more info (performance on different devices etc.)
+
** Question to Nitin/Anyone of how to individually run one test case of the performance suite. Status: no performance testcase can run on 2-node topologies
+
** Tkt updates:
+
*** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Juraj Updates: Sent a patch. Status: Patch is waiting to be merged
+
*** CSIT-990 (buildroot package) Juraj Updates: No updates
+
*** CSIT-1021: Handle Scapy pcap limit Khem(brief on patch, updates): Submitted. Jobs still failing, Khem to investigate. Patch related to Jumbo pkts.
+
* fd.io lab
+
** mcbin get them up, discuss with LF. Status: Brian - No Updates
+
** Cavium Blades LF ticket #56713 Status: Tina - Need to have a meeting
+
* Documentation
+
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
** Subscribe to: docs@lists.fd.io
+
 
+
* Action Items - Next Week
+
 
+
'''6/12/2018'''
+
* Attendees
+
** Sirshak Das
+
** Brian Brooks
+
** John Bromhead
+
** Sachin Saxena
+
** Khemendra Kumar
+
** Adarsh
+
** Andy Wang
+
** Tina Tsou
+
** Andrew Pinski
+
** Nitin Saxena
+
** Natalie Samsonov
+
 
+
* Action Items - Last Week
+
** Brian: mcbin status: Updates from Trishan LF tkt #54490. - No updates
+
** Sirshak: Follow up clang changes. Sent: Follow up patch.
+
** Khem: Update Kernel Image based on Sanil's input to move past known VM issues. Try this internally and then do it in the fd.io lab.
+
** Khem: LF tkt for Taishan BIOS updates. LF #56898 Status: Not done. Will follow up.
+
** Adarsh: openssl updates. Status: IPSEC SA add entry error. To open a Jira tkt tracking this.
+
** Sirshak: Summarize tkts in the Tuning Section. Didn't get a chance to do this this week; will try to complete it by next week.
+
** Sirshak: Schedule a Meeting between Juraj and Khem. Done
+
* VPP
+
** Brian: Talk on mcbin perf analysis. Nitin to send an old patch on tuning prefetch on dpdk_input node.
+
** Sirshak: VPP Multi-arch optimizations Guidelines
+
** Sirshak: Vectorization - Plan to present something next week. Any thoughts ?
+
** Nitin: anybody willing to take up ipv4 layer ? Khem to take a look.
+
** Sachin: Upstreaming armv8 crypto changes.
+
** Nitin: memcpy updates ?
+
** Sirshak: clang patch status
+
* CSIT
+
** Sirshak: Explain VPP Path and VPP Device
+
** Open Questions and Answers surrounding VPP Device
+
*** Q. Do the Intel onboard NICs support VFs via SRIOV on MACCHIATObin boards?
+
*** A.[Natalie] We support it but it’s not formally released yet. Will be formally delivered in 18.09.
+
*** BB - Kernel bypass uses UIO, so it is possible to do. [Natalie] To check VF support for onboard NICs
+
*** Q. If yes, is it hardware-level support, or is it supported in musdk also?
+
*** A.[Natalie] MUSDK is not relevant here. Intel NICs are using DPDK and ARM infrastructure directly. We support PCIE SR-IOV with both v4.4 and v4.14 kernels
+
*** Q. Has anybody tested containers (docker) and any container orchestration system on mcbin (e.g Docker Swarm or Kubernetes) ?
+
*** A.[Natalie] Yes.
+
*** Q. K8s or Docker Swarm?
+
*** A. [Bin Arm Internal] K8s (version 1.9.4) is a good choice. Use kubeadm to install the k8s cluster.
+
*** Q. VM inside a container works on ARM ?
+
*** A. [Bin ARM Internal] Use Kata and Runv. Kata/Runv is the solution of hardware-virtualized containers.
+
*** Q. Container within a Container(nested) works on ARM ?
+
*** A. [Bin ARM Internal] ‘Docker in docker’ or ‘Docker of Docker’ can work well on the Arm platform.
+
** Sirshak: Explain the proposed role of Cavium Blades for functional tests.
+
** Tkt updates:
+
*** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Juraj Updates: Sent a patch.
+
*** CSIT-990 (buildroot package) Juraj Updates:
+
*** CSIT-1021: Handle Scapy pcap limit Khem(brief on patch, updates): Submitted. Jobs failing Khem to investigate. Patch related to Jumbo pkts.
+
*** Sachin: To open tkt to track ARMv8 crypto.
+
* fd.io lab
+
** mcbin Status: Brian - No Updates
+
** Cavium Blades #56713 Status: Tina
+
* Documentation
+
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
** Subscribe to: docs@lists.fd.io
+
 
+
* Action Items - Next Week
+
 
+
** Brian: mcbin Status:
+
** Sirshak: Follow up clang changes. Status: Merged updated wiki.
+
** Khem: Update Kernel Image based on Sanil's input to move past known VM issues. Try this internally.
+
** Khem: LF tkt for Taishan BIOS updates.
+
** Adarsh: openssl updates. Status:
+
** Sirshak: Summarize tkts in the Tuning Section. Status: Not Done
+
** Khem: Investigation on ipv4 layer. Status:
+
 
+
'''6/4/2018'''
+
* Attendees
+
** Sirshak Das
+
** Brian Brooks
+
** John Bromhead
+
** Sachin Saxena
+
** Khemendra Kumar
+
** Adarsh
+
** Andy Wang
+
** Tina Tsou
+
** Andrew Pinski
+
** Juraj Linkes
+
** Nitin Saxena
+
** Natalie Samsonov
+
 
+
* Action Items - Last Week
+
** Sirshak: To create an LF tkt for mcbin - Didn't create one as Brian is handling it offline. If things remain unresolved this week, will create one. - LF Tkt created #54490. [BB] Trishan to follow up over email.
+
** Sirshak: Follow up on cavium-3: It's integrated into the arm CI job.
+
** Sirshak: Upstream clang changes: Failing on Cavium TX1 host. Upstreamed a related patch; working on review comments.
+
** Sirshak: Discuss with Maciek and get a signoff for moving the x86 Hosts to arm rack: Done
+
** Honnappa: Provide inputs on how to proceed with comments on Marvell dpdk patch.
+
** Honnappa: VPP-1284: To look at this patch to provide comments on performance implications of the fix
+
** Juraj: estimate moving CSIT functional tests to make test. - 1-2 months for 1 person. Others in CSIT are looking into this. Better estimate soon.
+
** Khem: Create LF tkt for Performance Suite Topology Creation. : Created LF #56736
+
** Adarsh: Create a Jira to document Automation Task. Created Jira Tkt.
+
** Khem: Follow up Sanil : Known taishan vm issues. Update Kernel Image
+
** Khem: LF tkt for Taishan BIOS updates. LF #56898
+
** Adarsh: openssl updates. Updated openssl dpdk. VPP is now stable. Will test soon. Adarsh to close the tkt.
+
** Nitin: VPP-1064 multiple cache line size patch. Nitin to raise an LF tkt to remove DPDK package from Nexus server.
+
 
+
* fd.io lab
+
** mcbin onboarding issue. - Comments in Action Items - Last Week.
+
** new cavium boxes status - JohnB : Blade 1-4 racked. CSIT Functional.
+
** Sirshak : Summarize tkts.
+
 
+
* VPP
+
 
+
** memcpy patch updates/closure: Abandon. Jira to be updated with more data.
+
** clang compilation Sirshak: Working on getting the patch upstreamed.
+
** mcbin performance analysis Brian: To talk about this next week.
+
** Vectorization, Sirshak (Problem, Plausible Solution, Volunteers): SSE2NEON
+
** Sachin: upstreaming armv8 crypto changes.
+
** Sirshak: Add Tuning section in Wiki
+
** Sirshak: Summarize Jira Tkts
+
 
+
* CSIT
+
** Performance Suite Roadmap(topology, work distribution(khem, juraj)):
+
** Sirshak to Schedule a Meeting between Juraj and Khem.
+
** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Juraj Updates: Seen by Juraj. Seeing the issue in ipv6 suite. Happens during PCIe rescan.
+
** CSIT-990 (buildroot package) Juraj Updates: Peter from Pantheon replied; Juraj still looking into it.
+
** CSIT-1021: Handle Scapy pcap limit Khem(brief on patch, updates):
+
** Sirshak : Summarize CSIT tkts
+
** Sachin: To open tkt to track ARMv8 crypto.
+
 
+
* Documentation
+
** Special VPP installations (e.g. dpaa).
+
** ARMv8 crypto needs to be documented.
+
 
+
* Action Items - Next Week
+
** Brian: mcbin status: Updates from Trishan LF tkt #54490.
+
** Sirshak: Follow up clang changes.
+
** Khem: Update Kernel Image based on Sanil's input to move past known VM issues.
+
** Khem: LF tkt for Taishan BIOS updates. LF #56898 Status:
+
** Adarsh: openssl updates.
+
** Sirshak: Summarize tkts in the Tuning Section.
+
** Sirshak: Schedule a Meeting between Juraj and Khem.
+
 
+
 
+
'''5/29/2018'''
+
* Attendees
+
** Sirshak Das
+
** Brian Brooks
+
** John Bromhead
+
** Sachin Saxena
+
** Khemendra Kumar
+
** Adarsh
+
** Andy Wang
+
** Honnappa Nagarahalli
+
** Tina Tsou
+
** Andrew Pinski
+
** Juraj Linkes
+
** Nitin Saxena
+
 
+
* Action Items - Last Week
+
** Sirshak: To update LF RT #54919 to follow up on cavium-2 status. - Not Needed as cavium-2 is present.
+
** Sirshak: Release Machine to EdK as soon as ThunderX is up. - Done
+
** Sirshak: Status on new ThunderXs: Will be decided after talks with Maciek. - Yet to decide.
+
** Sirshak: vm unresponsive issue. Tried again; still got 27 errors for ipv4. Handed over to Juraj for further investigation.
+
** Sirshak: To ask about CSIT performance topology connection status. Didn't get time; mostly discussed the VIRL job.
+
** Sirshak: to add OS version to fd.io lab machines. - Done by somebody else.
+
** Sirshak: to add Porting and Tuning section. Check with Honnappa
+
** Sirshak: to track arm master build failure. - Damjan has sent a fix.
+
** Juraj: Access to fd.io lab. - Done.
+
** Khem: to create a Jira tkt to document automation task of CSIT. - Still Working on it.
+
** Khem: to reach out to Sanil (Huawei) regarding known Taishan problems with KVM. - No response from Sanil yet.
+
** Khem: BIOS patch for NUMA node numbering issue. - Khem to create LF RT tkt to do this in fd.io lab.
+
** Nitin: VPP-1064 Support multiple cache line sizes per architecture. - Still in discussion with Dave.
+
** Adarsh: openssl updates. VPP crashing.
+
 
+
 
+
* fd.io lab
+
** mcbin powering on? Sirshak to create LF tkt. Reach out to Brian offline.
+
** Cavium-3 role. Make decision based on feedback from EdK. Sirshak to check availability.
+
** Sirshak to ask Brian to forward old LF tkt to JohnB.
+
 
+
* VPP
+
 
+
** ARMv8 crypto patch from Sachin related to dpdk_plugin only.
+
** memcpy issue: going with memcpy and not hand crafted memcpy.
+
** clang compilation: Sirshak to upstream clang-related changes and add all other aarch64 leads.
+
** Brian to use cache stashing result. Updates: No effect for VPP, but there is an improvement on the musdk sample application.
+
** VPP-1267(Marvell dpdk patch mcbin): How to move forward based on Damjan's comments. Still discussing. Honnappa to provide some inputs next week.
+
** VPP-1276 (rpm issues aarch64): Not a priority. Status: No updates.
+
** VPP-1284: TLS corruption on aarch64: Status (after Sachin's suggestion): Resolved. Might have performance implications but currently the only possible solution. HN to look at this Jira card in order to talk to the compiler team if need be.
+
 
+
* CSIT
+
** TG status in fd.io lab and internal Huawei Lab. - Sirshak to discuss with Maciek. Khem to create LF tkt.
+
** CSIT-1019 (timeout of PacketVerifier.RxQueue is not working): Done.(Upstreamed Merged ?). Status: Merged.
+
** CSIT-1023 (Crypto Func Tests): VPP still crashing - Adarsh
+
** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Sirshak tried pinning the VMs to phy CPUs but tests still failing. Juraj to take over.
+
** CSIT-990 (buildroot package) Brian Status: build issue with grub.
+
** Juraj: Estimate on moving CSIT Functional tests to make test. Maciek's proposal does consider all the implications of letting go of VIRL, especially the parallelization VIRL offers.
+
 
+
* Action Items - Next Week
+
** Sirshak: To create an LF tkt for mcbin
+
** Sirshak: Follow up on cavium-3.
+
** Sirshak: Upstream clang changes.
+
** Honnappa: Provide inputs on how to proceed with comments on Marvell dpdk patch.
+
** Honnappa: VPP-1284: To look at this patch to provide comments on performance implications of the fix
+
** Juraj estimate moving CSIT functional tests to make test.
+
** Sirshak: Discuss with Maciek and get a signoff for moving the x86 Hosts to arm rack.
+
** Khem: Create LF tkt for Performance Suite Topology Creation.
+
** Adarsh: Create a Jira to document Automation Task
+
** Khem: Follow up Sanil : Known taishan vm issues.
+
** Khem: LF tkt for Taishan BIOS updates.
+
** Nitin: VPP-1064 multiple cache line size patch.:
+
** Adarsh: openssl updates.
+
 
+
'''5/22/2018'''
+
* Attendees
+
** Sirshak Das
+
** Stanislav Chlebec
+
** John Bromhead
+
** Sachin Saxena
+
** Khemendra Kumar
+
** Andy Wang
+
** Honnappa Nagarahalli
+
** Tina Tsou
+
** Andrew Pinski
+
** Juraj Linkes
+
** rkinsell
+
** Nitin Saxena
+
 
+
* Action Items - Last Week
+
** Sirshak: To update LF RT #54919 to follow up on cavium-2 status. - Having trouble logging in; will sort it out today.
+
** Sirshak: Release Machine to EdK as soon as ThunderX is up: cavium-1 done; cavium-2 still has issues with network connectivity.
+
** Sirshak: Status on new ThunderXs: Will be decided after talks with Maciek.
+
** Sirshak: vm unresponsive issue: No updates; didn't get time to try, will try this week.
+
** Sirshak: To ask about CSIT performance topology connection status. - TBD after call with Maciek.
+
** Nitin: VPP-1064 (Patch rejected by Dave Barach) Discuss cross compilation with Sachin. (Separate or one unified Makefile). - No Updates.
+
** HN: memcpy benchmarking updates (Honnappa) - 2 more tests to be done based on Ola's suggestion.
+
** Adarsh openssl issues: Will communicate with Sachin to get this resolved. Made changes based on Sachin's suggestions; issues still to be resolved.
+
** Adarsh preparing a sheet updated with his progress on CSIT. - Added to the google sheets.
+
 
+
 
+
* fd.io lab
+
** cavium-2 follow up via LF #54919.
+
** Talk to Maciek regarding TG physical placement on rack.
+
** Juraj: Needs access to fd.io lab. Tina to help Juraj with this.
+
** Juraj to send email to EdW to get access to fd.io lab.
+
** Sirshak to add OS version to fd.io lab machines.
+
 
+
* VPP
+
** HN->Nitin: Stick with memcpy. Nitin's concern: SIMD unit being idle with new GCC. Feedback from the Arm compiler team is that vector instructions don't perform as expected on many platforms. 1ns better (dpdk_input node) if using SIMD memcpy on ThunderX. Nitin to try using restrict on non-SIMD memcpy.
+
** 1019: CSIT. Py-lint issues. Patch submitted. Khem to merge with Lucian's Patch.
+
** 1023: Khem, Adarsh to talk to Sachin to resolve openssl issue. - Sachin suggested some config changes resulted in VPP being unstable. Still working it out.
+
** 1043: No updates. Sirshak to investigate this and Khem to reach out to Sanil regarding known Taishan problems with KVM.
+
** 990: Brian Updates - Sirshak to get status offline.
+
** 1267: l3fwd performance tuning: Status on Marvell patch: - No Updates. Nitin to submit his modified patch with -2.
+
** VPP-1276: Sachin facing issues with building rpm. - Any change in status? No Updates. Low priority for Sachin. Needs Help.
+
** VPP-1284: TLS corruption: Dynamic linking related to Thread local storage. Logs recorded with this tkt.
+
** Sirshak to add Porting and Tuning section.
+
** Sirshak to track arm master build failure.
+
* CSIT
+
** Adarsh openssl issues:
+
** Performance Testing Khem : NUMA node numbering issue. Last Update: Still working internally. Status: Internal patch for BIOS.
+
** Khem: to create a Jira tkt to document automation task of CSIT.
+
** Khem: trex installation - Having x86 TG internally. Any luck?
+
** Brian to use cache stashing result. Updates:
+
 
+
* Action Items - Next Week
+
** Sirshak: To update LF RT #54919 to follow up on cavium-2 status.
+
** Sirshak: Release Machine to EdK as soon as ThunderX is up.
+
** Sirshak: Status on new ThunderXs: Will be decided after talks with Maciek.
+
** Sirshak: vm unresponsive issue: No updates; didn't get time to try, will try this week.
+
** Sirshak: To ask about CSIT performance topology connection status.
+
** Sirshak: to add OS version to fd.io lab machines.
+
** Sirshak: to add Porting and Tuning section.
+
** Sirshak: to track arm master build failure.
+
** Juraj: Access to fd.io lab.
+
** Nitin: VPP-1064 Support multiple cache line sizes per architecture.
+
** HN: memcpy benchmarking updates (Honnappa) - 2 more tests to be done based on Ola's suggestion.
+
** Adarsh openssl updates
+
** Khem: to create a Jira tkt to document automation task of CSIT.
+
 
+
 
+
'''5/15/2018'''
+
* Attendees
+
** Sirshak Das
+
** Stanislav Chlebec
+
** Sachin Saxena
+
** Khemendra Kumar
+
** Andy Wang
+
** Honnappa Nagarahalli
+
** Tina Tsou
+
** Andrew Pinski
+
** John Bromhead
+
** Juraj Linkes
+
** rkinsell
+
** Nitin Saxena
+
 
+
* Action Items - Last Week
+
** Nitin: Run a VPP performance test to understand if the memcpy neon version provides any benefits. - Able to run with l3fwd test case. Gives better numbers.
+
** Sirshak: Create a higher LF ticket so that it is easier for Trishan/Acton/Venessa/Mohammed to follow up on bringing up ThunderX/mcbin - Not created yet, as I think we are close to solving the issue. If it's not solved after today's call, will create the tkt.
+
** Nitin: start email discussion with Dave to address the creation of single makefile for all ARMv8 devices. Still understanding how cross compilation works. Communicating with Sachin.
+
 
+
* New Joinees
+
** Stanislav Chlebec - Pantheon
+
 
+
* fd.io lab
+
** Follow up on ThunderX getting mgmt IP - IP addresses are assigned, but are not up yet. - Have a call today to discuss this with Mohammed
+
** USB to Ethernet Question: Andrew: shows up as Ethernet interface.
+
** Release Machine to EdK as soon as ThunderX is up. - Sirshak to set mgmt IP and handover the machine.
+
** Cavium has shipped more machines as well - Delivered a week back. Tina to follow up with Trishan: 2 Delivered. Sirshak to ask in today's meeting for status on new ThunderX.
+
** See the Taishan setup for any VM issue. - Sirshak is trying to reproduce the issue. - Reproduced; still debugging.
+
** Khemendra: Topology is correct. Sirshak to ask about CSIT performance topology connection status.
+
** Khemendra: Intel NIC to be used or Mellanox. HN: Initially use Intel, later move to Mellanox.
+
* VPP
+
** VPP-1064: Dave Barach rejected the patch based on the solution Damjan and Nitin had decided upon, on the grounds that the current approach breaks cross compilation. - NXP has upstreamed the DPAA2 patch, which uses a separate segment makefile (dpaa.mk) for DPAA2. NXP does cross compilation most of the time. The approach could be that all platforms create a segment makefile and combine all of them into a single ARMv8 segment makefile. - Nitin still discussing with Sachin regarding cross compilation.
+
** One solution suggested was creating a platform specific Makefile for ThunderX - Any Decisions - Same as above.
+
** memcpy benchmarking updates (Honnappa) - 2 more tests to be done based on Ola's suggestion. Nitin tested with restrict.
+
** 1019: No update. Few rough edges to clean up.
+
** 1021: Is it Closed ? Closed.
+
** 1023: migrated to openssl using DPDK manual but facing failed TCs - openSSL is integrated in his local environment - VPP not stable in his environment - Updated in the ticket. Status: Adarsh still trying to get help from community. Khem, Adarsh to talk to Sachin regarding openssl issues.
+
** 1043: No updates. Sirshak to investigate this.
+
** 990: Brian Updates:
+
** 1267: l3fwd performance tuning: Marvell to upstream a patch to enable dpdk on mcbin by making changes to dpdk plugin in vpp. Updates: Natalie sent an email. Working on upstreaming changes to VPP for dpdk_plugin. Working on comparing musdk vs dpdk.
+
** Auto-detection of memory channels: Startup conf solution decided. Updates: No updates; not a priority now. Bug raised by Nitin.
+
** Sachin facing issues with building rpm, currently on 1801; will open a Jira tkt if issues persist with 1804. Updates: Jira VPP-1276 to track this issue.
+
* CSIT
+
** Adarsh openssl issues: Will communicate with Sachin to get this resolved
+
** Adarsh preparing a sheet updated with his progress on CSIT.
+
** Performance Testing, Khem: NUMA node numbering issue. Updates: No updates. Still working internally.
+
** Khem facing issues with trex installation on ARM, hence he will try getting an x86 machine as TG. Updates: Still working on getting an x86 in internal lab.
+
** Brian to use cache stashing result. Updates:
+
 
+
* Action Items - Next Week
+
** Sirshak: To update LF RT #54919 to follow up on cavium-2 status. - Having trouble logging in; will sort it out today.
+
** Sirshak: Release Machine to EdK as soon as ThunderX is up: cavium-1 done; cavium-2 still has issues with network connectivity.
+
** Sirshak: Status on new ThunderXs: Will be decided after talks with Maciek.
+
** Sirshak: vm unresponsive issue: No updates; didn't get time to try, will try this week.
+
** Sirshak: To ask about CSIT performance topology connection status. - TBD after call with Maciek.
+
** Nitin: VPP-1064 (Patch rejected by Dave Barach) Discuss cross compilation with Sachin. (Separate or one unified Makefile).
+
** HN: memcpy benchmarking updates (Honnappa) - 2 more tests to be done based on Ola's suggestion.
+
** Adarsh openssl issues: Will communicate with Sachin to get this resolved
+
** Adarsh preparing a sheet updated with his progress on CSIT.
+
 
+
+
 
+
'''5/8/2018'''
+
* Attendees
+
** Honnappa Nagarahalli
+
** Tina Tsou
+
** Andrew Pinski
+
** Natalie Samsonov
+
** John Bromhead
+
** Sachin Saxena
+
** Khemendra Kumar
+
** Andy Wang
+
** Juraj Linkes
+
** rkinsell
+
** Nitin Saxena
+
** Ed Kern
+
 
+
* Action Items - Last Week
+
** Sirshak: Follow up with Mohammed regarding ThunderX mgmt connectivity and mcbin - IP addresses allocated. cavium-2 has IPMI connectivity but console still hanging. cavium-1,3 - Not able to connect to IPMI. - Create a higher LF ticket so that it is easier for Trishan/Acton/Venessa/Mohammed to follow up.
+
** Sirshak to hold a call with Khem and Adarsh to understand the vm_vhost issue because of nested VMs - Contact established; still working on analyzing the setup.
+
** Sirshak to create consolidated ARM ecosystem xls to reflect CSIT effort. (Need to add the link to the excel sheet to AArch64 page) - Not done; will do it next week.
+
** Honnappa: memcpy benchmarking - Micro benchmarks run on mcbin, qualcomm - vector Load/Store usually go to the LSU unit
+
** Brian : CSIT-990(buildroot) - Nitin ran on mcbin, it is failing at a different place - Brian to continue next week
+
** Brian to publish a pictorial representation of rx queues and tx queues in multicore case for mcbin. - Moved to next week
+
** Khem to analyze make test failure in Taishan - 1804 - Tested with the latest code (make test), all test cases passing
+
** ARM - For TG for deciding connectivity - MCBin and Taishan - Sirshak/Brian working on it.
+
** Sirshak/Brian to recheck validity of ASLR issue. - Not Done. Next Week.
+
 
+
* New Joinees
+
** Yuval Caduri - from Marvell, responsible for MUSDK driver - packet processor 8K chips
+
** Natalie - responsible for network PMD DPDK driver
+
** Dmitri Epshtein - crypto driver expert
+
 
+
* fd.io lab
+
** Follow up on ThunderX getting mgmt IP - IP addresses are assigned, but are not up yet.
+
** Release Machine to EdK as soon as ThunderX is up.
+
** Cavium has shipped more machines as well - Delivered a week back. Tina to follow up with Trishan.
+
** See the Taishan setup for any VM issue. - Sirshak is trying to reproduce the issue.
+
* VPP
+
** VPP-1064: Dave Barach rejected the patch based on the solution Damjan and Nitin had decided upon, on the grounds that the current approach breaks cross compilation. - NXP has upstreamed the DPAA2 patch, which uses a separate segment makefile (dpaa.mk) for DPAA2. NXP does cross compilation most of the time. The approach could be that all platforms create a segment makefile and combine all of them into a single ARMv8 segment makefile.
+
** One solution suggested was creating a platform-specific Makefile for ThunderX
+
** Honnappa suggested that, as this is not just a ThunderX issue but also a Qualcomm issue, an ARM-specific Makefile would be better. (Issue: 128-byte cache line size)
+
** Honnappa: no update on memcpy benchmarking; will do that next week
+
** 1019: fixed locally, will upstream soon - Patch has issues and some of the issues are fixed
+
** 1021: Patch submitted; CentOS env issue, CSIT follow up. - This can be closed
+
** 1023: migrated to openssl using DPDK manual but facing failed TCs - openSSL is integrated in his local environment - VPP not stable in his environment - Updated in the ticket.
+
** 1043: No updates
+
** 990: Brian to Retry on mcbin
+
** 1267: l3fwd performance tuning: Marvell to upstream a patch to enable dpdk on mcbin by making changes to dpdk plugin in vpp.
+
** Auto-detection of memory channels: Andrew's comment: there is no real way to do that, hence go with making it a runtime argument via startup conf instead of hard-coding it.
+
** Sachin facing issues with building rpm, currently on 1801; will open a Jira tkt if issues persist with 1804.
+
* CSIT
+
** Adarsh stalled with failure of test cases after using openssl.
+
** Performance Testing, Khem: NUMA node numbering issue.
+
** NUMA node numbering issue not seen on ThunderX. Khem to post the details of the issue and the workaround on Taishan.
+
** Khem facing issues with trex installation on ARM, hence he will try getting an x86 machine as TG.
+
** Nitin: known issue with trex with arm and Mellanox card.
+
** Khem to try L2BD and L2XC.
+
** Brian to use cache stashing and see the results.
+
 
+
* Action Items - Next Week
+
** Nitin: Run a VPP performance test to understand if the memcpy neon version provides any benefits.
+
** Sirshak: Create a higher LF ticket so that it is easier for Trishan/Acton/Venessa/Mohammed to follow up on bringing up ThunderX/mcbin
+
** Nitin: start email discussion with Dave to address the creation of single makefile for all ARMv8 devices
+
 
+
''' 5/1/2018 '''
+
* New Joinees
+
** Natalie and Yuval from Marvell for engineering input.
+
* fd.io lab
+
** Follow up on ThunderX getting mgmt IP
+
** Release Machine to EdK as soon as ThunderX is up.
+
** Cavium has shipped more machines as well.
+
** See the Taishan setup for any VM issue.
+
* VPP
+
** VPP-1064 Dave Barach rejected the patch based on the solution Damjan and Nitin had decided upon following the reason that current approach breaks cross compilation.
+
** One solution suggested was creating a platform specific Makefile for ThunderX
+
** Honnappa suggested that, since this is not just a ThunderX issue but also a Qualcomm issue, an ARM-specific Makefile would be better. (Issue: 128-byte cache line size)
+
** Honnappa: no update on memcpy benchmarking; will do that next week
+
** 1019: fixed locally; will upstream soon
+
** 1021: Patch submitted; CentOS env issue; CSIT to follow up.
+
** 1023: migrated to OpenSSL using the DPDK manual but facing failed TCs
+
** 1043: No updates
+
** 990: Brian to Retry on mcbin
+
** 1267: l3fwd performance tuning: Marvell to upstream a patch to enable dpdk on mcbin by making changes to dpdk plugin in vpp.
+
** Auto-detection of memory channels: per Andrew's comment there is no real way to do that, so go with making it a runtime argument via startup conf instead of hard-coding it.
+
** Sachin is facing issues building the RPM, currently on 1801; will open a Jira ticket if the issues persist with 1804.
+
* CSIT
+
** Adarsh stalled with failure of test cases after using openssl.
+
** Performance Testing Khem : NUMA node numbering issue.
+
** NUMA node numbering issue not seen on ThunderX. Khem to post the details of the issue and the workaround on Taishan.
+
** Khem is facing issues installing TRex on ARM, so he will try to get an x86 machine as TG.
+
** Nitin: known issue with TRex on ARM with Mellanox cards.
+
** Khem to try L2BD and L2XC.
+
** Brian to use cache stashing and see the results.
+
 
+
* Action Items - Next Week
+
** Sirshak: Follow up with Mohammed regarding ThunderX mgmt connectivity and mcbin.
+
** Sirshak to hold a call with Khem and Adarsh to understand the Vm_vhost issue caused by nested VMs - Not done yet; will do it next week.
+
** Sirshak to create consolidated ARM ecosystem xls to reflect CSIT effort. - Not Done will do it next week.
+
** Honnappa: memcpy benchmarking
+
** Brian : CSIT-990(buildroot)
+
** Brian to publish a pictorial representation of rx queues and tx queues in multicore case for mcbin. - Moved to next week
+
** Khem to analyze make test failure in Taishan - 1804 - Next Week
+
** ARM - For TG for deciding connectivity - MCBin and Taishan - Working on it.
+
** CSIT 990 brian to try - Next Week
+
** Sirshak/Brian to recheck validity of ASLR issue. - Not Done. Next Week.
+
 
+
* Action Items - Last Week
+
** Khem to ask mohammed, anton for power clearance for 2 new taishan. - Ok for Power Clearance
+
** Sirshak to hold a call with Khem and Adarsh to understand the Vm_vhost issue caused by nested VMs - Not done yet; will do it next week.
+
** Sirshak and Brian to discuss on TG connectivity. - Done
+
** Sirshak to create consolidated ARM ecosystem xls to reflect CSIT effort. - Not Done will do it next week.
+
** Nitin: To post vlib_main 1804_rc2 issue to community. - Done
+
** Sirshak: to check if vlib_main is an issue on Centriq. - Done
+
** Nitin: AI for creating Jira for number of memory channel identification. - Done
+
** Brian to publish a pictorial representation of rx queues and tx queues in multicore case for mcbin. - Moved to next week
+
** John B - 1G to USB adapters Ship to lab. - Done
+
** Khem to analyze make test failure in Taishan - 1802 rc2 - Next Week
+
** ARM - For TG for deciding connectivity - MCBin and Taishan - Working on it.
+
** CSIT 990 brian to try - Next Week
+
** Sirshak to take 1103 and 1114 - Done
+
** Nitin to Create l3fwd tkt - Done
+
** Brian to create a mcbin crash tkt. Next Week
+
** Maen to provide contact for IO Stashing on mcbin. - Contacted Brian. Brian to provide further input.
+
** Sirshak/Brian to recheck validity of ASLR issue. - Not Done. Next Week.
+
 
+
''' 4/25/2018 '''
+
* Meeting Time
+
** Proposed time 6-8am Tuesday PST.
+
** Tina to update wiki with new meeting time.
+
* FD.io lab
+
** ThunderX
+
*** OS installed on ThunderX. Switch being sent.
+
*** 1 ThunderX booted.
+
*** Plan to use 1G to USB adapters.
+
*** Varun POC for Cavium.
+
** Taishan
+
*** It's up and connected to the Internet.
+
*** Build and make test 2 TCs failing (VCL TCs failing) - 1802 rc2 used.
+
*** Brian no update for TG - Meeting on it next week.
+
*** Khem to ask mohammed, anton for power clearance for 2 new taishan.
+
** MCBin
+
*** Maen POC - To Contact Mohammed.
+
*** Maen to provide engineering contact for help to Nitin.
+
* VPP
+
** Round Table status on Porting tkts.
+
** Nitin: vlib_main taking a lot of time on both mcbin and thunderx2
+
** Sirshak to take on ARM tkts.
+
* CSIT
+
** Adarsh looking at IPv4 failed test cases with priority.
+
** Sirshak to hold a call with Khem and Adarsh to understand the Vm_vhost issue caused by nested VMs
+
** Cavium to publish mcbin CSIT performance numbers, but at low priority. Nitin faced a build-root issue with this.
+
** Maciek to host a kick off call.
+
** Sirshak and Brian to discuss on TG connectivity.
+
** Sirshak to create consolidated ARM ecosystem xls to reflect CSIT effort.
+
* Performance Benchmarking
+
** Nitin: To post vlib_main 1804_rc2 issue to community.
+
** Nitin: vlib_main issue in mcbin and thunderx2 at different points within the function. Not a hotspot in x86.
+
** Sirshak : to check if vlib_main is a issue in centriq.
+
** Nitin: AI for creating Jira for number of memory channel identification.
+
** AI for creating Jira for the crash on Mcbin – Brian
+
** Khem to get started on CSIT performance suite this week and publish on shared xls.
+
** Brian to publish a pictorial representation of rx queues and tx queues in multicore case for mcbin.
+
* Action Items - Last Week
+
** Sirshak to add link to xls to wiki page. - Done by somebody else.
+
** Brian to raise LF RT ticket about MACCHIATObins - Done. Pinged Mohammed; yet to hear back from him.
+
** Nitin to check 'make test' on MACCHIATObin (16GB DRAM) - Failed. Error related to Python scripts.
+
** Honnappa, Khem to check Clang build on arm64. - Tried clang build on Centriq; made some changes, but it still fails. Clang build on x86 has errors but still passes. 'make test' fails on x86. Jira card to be created - '''AI(Sirshak)'''. Khem to try.
+
* Action Items
+
** John B- 1G to USB adapters Ship to lab.
+
** Khem to analyze make test failure in Taishan - 1802 rc2
+
** ARM - For TG for deciding connectivity - MCBin and Taishan
+
** CSIT 990 brian to try
+
** Sirshak to take 1103 and 1114
+
** Nitin to Create l3fwd tkt
+
** Brian to create a mcbin crash tkt.
+
** Maen to provide contact for IO Stashing on mcbin.
+
** Sirshak/Brian to recheck validity of ASLR issue.
+
** Sirshak to track down issues.
+
 
+
''' 4/18/2018 '''
+
* FD.io lab
+
** Temporarily borrow 1x ThunderX to be used for ONAP demo at OpenStack Summit (end of May)? Yes.
+
** OS exists on ThunderXs; Varun will keysign with EdW; need to resolve OS netdev connectivity over 10/40GbE
+
** OS exists on TaiShan2280; no connectivity to the Internet
+
* VPP
+
** RC2
+
*** 'make' passes, 'make test' fail, 'make test-all' ???  - MACCHIATObin (4GB DRAM)
+
*** 'make' passes, 'make test' pass, 'make test-all' fails - Centriq
+
*** 'make' passes, 'make test' pass, 'make test-all' fails - x86
+
** Build
+
*** Testing Verify and Merge jobs for 18.04 master on arm64 today
+
*** Clang build fails on arm? 'CC=clang CXX=clang make'
+
* CSIT
+
** Adarsh updated CSIT status in xls
+
** CSIT-1023: decided to go with OpenSSL instead of ARMv8 crypto library, in DPDK, due to number of algorithms supported
+
*** e.g. AES-GCM not supported by ARMv8 crypto library
+
** Nitin updated CSIT-990 (buildroot) with more information
+
* Action Items
+
** Sirshak to add link to xls to wiki page.
+
** Brian to raise LF RT ticket about MACCHIATObins
+
** Nitin to check 'make test' on MACCHIATObin (16GB DRAM)
+
** Honnappa, Khem to check Clang build on arm64
+
 
+
''' 4/11/2018 '''
+
* Proposal to keep meeting at current time with additional overflow meeting at 8AM PST
+
* FD.io lab
+
** MACCHIATObins just arrived at VEXXHOST
+
** Nitin working on getting IPMI login credentials to provision OS on ThunderX
+
** Need to connect Skylake TG machines to Arm machines
+
*** ETA: 1wk
+
** Khem working with Aton (LF) to provision OS on TaiShan2280
+
*** ETA: 1wk, Ubuntu 17.10
+
* VPP
+
** Brian to do more benchmarking on MACCHIATObin
+
** Khem working on benchmarking clib_memcpy64_x4()
+
* CSIT
+
** Lucian submitted patches for CSIT-1019, CSIT-1021
+
** Lucian looking for contact for ARMv8 crypto driver in DPDK for CSIT-1023
+
*** See CSIT-1023 for details; looks like DPDK issue?
+
** Nitin to add more details to CSIT-990
+
* Action Items
+
** Sirshak to move JIRA tickets to xls
+
** Lucian to work with Nitin/Jerin on CSIT-1023
+
 
+
''' 4/4/2018 '''
+
* Propose to move the meeting +2 hours?
+
* RC1 cut today
+
* FD.io lab
+
** Allocate 3 ThunderX for EdK to integrate into CI
+
*** JohnB from Cavium agreed to supply 3 more ThunderX for CSIT (will pre-install FW & OS)
+
** Brian working on provisioning SSDs for MACCHIATObins
+
** Khem can ping IPMI interfaces on TaiShan2280s; also needs an OS to be installed
+
* VPP
+
** Discussed [https://schd.ws/hosted_files/onsna18/6c/ons_fdio_brooks.pdf ONS slides]
+
** Khem has patch for clib_memcpy64_x4() and needs help benchmarking
+
* CSIT
+
** Lucian found and created JIRA tickets for 3 issues while running CSIT
+
** Nitin created JIRA ticket for buildroot issue
+
** Khem seeing issues with VM
+
* Action Items
+
** Nitin/Varun to help provision Ubuntu 16.04 and firmware update on ThunderX machines
+
 
+
''' 3/28/2018 '''
+
* Sachin Saxena from NXP joined the call, welcome
+
* FD.io lab
+
** Khemendra is having issues with Rudy's emails and so has not been able to access the Taishan servers
+
** Nitin will try to access the servers this week
+
** MACCHIATObin setup under progress
+
** OD1000 is added to Jenkins slave. The build is failing currently. The build can be triggered manually.
+
* VPP
+
** Discuss Single core, L3Fwd sample perf numbers and analysis next week
+
** Sachin is working on compiling 18.01. Native compilation works fine, cross compilation is failing
+
** Nitin still working on patch for cache line size
+
** VPP-1126 is being used in DPDK input node. Khemendra will take a look at it this week.
+
** VPP-1129 Brian/Sirshak will take a look. Looks like it can be closed.
+
** VPP-1114 Patch under internal review
+
* CSIT
+
** Khemendra having issues with interface bring up failing intermittently. Nitin suggested to add delay.
+
** Nicolas/Lucian debugging TC-07
+
** Khemendra having issues with TG VM crashing randomly with Ubuntu 16.04, QEMU 2.10. Solved by moving to Ubuntu 17.10, QEMU 2.10
+
** Nitin using Ubuntu 16.04 with 4.13 kernel
+
* Action Items
+
** Discuss Single core, L3Fwd sample perf numbers and analysis next week - Brian
+
** VPP-1126 Take a look this week as it affects DPDK input node - Khemendra
+
** Need more attention on solution for buildroot issue, need more information on failure [https://jira.fd.io/browse/CSIT-990 CSIT-990] - Nitin
+
** Create an excel sheet with the test case status - Nicolas/Lucian
+
 
+
''' 3/21/2018 '''
+
* Key signing party! Thank you Ed!
+
* FD.io lab
+
** VEXXHOST currently working on getting another PDU because there are not enough power ports
+
** Received SSDs for MACCHIATObins
+
* VPP
+
** Discuss high level plan for VPP on Arm
+
** Nitin still working on patch for cache line size
+
* CSIT
+
** Need more attention on solution for buildroot issue [https://jira.fd.io/browse/CSIT-990 CSIT-990]
+
** Nitin moving towards L2 & L3 perf test cases
+
** VM crash due to buffer overflow when multiple VMs share NVRAM; resolved in Fedora27
+
''' 3/14/2018 '''
+
* Key signing party! Thank you Ed!
+
* FD.io lab
+
** ToR switch issue resolved; confirm mgmt IP address assignment to racked Huawei/Cavium machines
+
** Started provisioning MACCHIATObins; Andy ordered SSDs to go with them
+
* VPP
+
** No updates
+
* CSIT
+
** Adarsh started running CSIT on virtual topology; moved past a paramiko issue, seeing other test failures
+
** Ongoing discussions on getting Adrian access to machines
+
''' 3/7/2018 '''
+
* FD.io lab
+
** Trishan (LF) to help follow up on progress in FD.io lab
+
* VPP
+
** More discussion on patch for cache line size; use MIDR register exported by proc fs
+
** Decision has been made to use wrappers for atomics
+
** Damjan reworked PCI handling code and added native driver for Intel AVF (XL710 i.e. Fortville)
+
*** Measuring 132 clocks per packet on Skylake (ip4 routing) with VLIB_FRAME_SIZE 256 (default); +1Mpps over DPDK avf/i40e PMD
+
** Damjan reworked memcpy() in MEMIF; achieve 2x25GbE line rate with these changes
+
** Sirshak working on getting VPP running on Qualcomm Centriq with Mellanox NIC
+
*** Seeing issues with external DPDK; static works but not shared; is VPP build system missing -libverbs -lmlx5 in LDFLAGS?
+
*** Nitin noticed DPDK 17.11 Mellanox PMD does not compile
+
*** Mellanox recently submitted a patch to VPP to support dynamic loading of Mellanox libraries
+
* CSIT
+
** Adrian does not have machines to work with in Bucharest; machine in Paris that Gabriel was using no longer available
+
*** AndyW to help resolve
+
** Adarsh moved past VM issues; able to launch VPP in VM with virtio interface; starting to run CSIT scripts
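
The MIDR-based approach mentioned under VPP above relies on splitting the raw register value into its fields. The following is a minimal sketch of that decoding only, assuming the standard MIDR_EL1 bit layout; the names <code>midr_fields_t</code> and <code>midr_decode</code> are hypothetical and are not taken from the VPP patch, and how the raw value is read from the kernel is left out.

```c
#include <stdint.h>

/* Fields of the AArch64 MIDR_EL1 register (hypothetical helper type). */
typedef struct
{
  uint8_t implementer;	/* bits [31:24], e.g. 0x41 = Arm */
  uint8_t variant;	/* bits [23:20] */
  uint8_t architecture;	/* bits [19:16] */
  uint16_t part_num;	/* bits [15:4] */
  uint8_t revision;	/* bits [3:0] */
} midr_fields_t;

/* Split a raw MIDR_EL1 value into its fields. */
midr_fields_t
midr_decode (uint64_t midr)
{
  midr_fields_t f;
  f.implementer = (midr >> 24) & 0xff;
  f.variant = (midr >> 20) & 0xf;
  f.architecture = (midr >> 16) & 0xf;
  f.part_num = (midr >> 4) & 0xfff;
  f.revision = midr & 0xf;
  return f;
}
```

A caller could compare <code>implementer</code> and <code>part_num</code> against a table of known chips to pick a cache line size or a tuned code path at runtime.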
+
''' 2/28/2018 '''
+
* FD.io lab
+
** Ed Kern to try containerized CI on one OD1000 in parallel with Vanessa
+
** Received MACCHIATObins in Austin
+
* VPP
+
** Adarsh trying to run VPP in VM but getting PCI mapping issue; trying to connect to Linux bridge on host
+
** Patches for build breakage were committed; arm64 build stable now
+
** Brian able to reproduce low PPS numbers seen on MACCHIATObin
+
* CSIT
+
** Adarsh can reproduce a crash in qemu 2.10 Ubuntu 16.04; going to try Ubuntu 17.10
+
** Need to partition func test cases across people
+
''' 2/21/2018 '''
+
* FD.io lab
+
* CSIT
+
** Gabriel updated CSIT/AArch64 wiki with PASS/FAIL/OTHER list
+
*** OTHER - failure due to expect-like parsing of output(?)
+
*** FAIL - ssh timeout during PCIe rescan(?)
+
** Moved past first UEFI crash; still seeing crashes on startup (Gabriel)
+
*** Setup new Ubuntu environment
+
*** Continue debugging UEFI issue on Fedora with JeremyL
+
** Ubuntu is used pretty much everywhere except for additional CentOS CSIT perf
+
** Nitin working on upstreaming changes to CSIT
+
** Adarsh working on getting VM interfaces working
+
* VPP
+
** More discussion on how to handle cache line size
+
** Sync'd on patches for build breakage
+
 
+
''' 2/14/2018 '''
+
* FD.io lab
+
** Working on getting access to LF lab in order to setup OD1000 environment
+
** Check with tykeal & zxiiro on trust policy for getting others access (Brian)
+
** VEXXHOST
+
*** Mohammed says they do not have extra rack shelf - we need to send one for 3x MACCHIATObin
+
*** LF RT tickets: #52434 (ThunderX), #52435 (TaiShan2280), #52436 (MACCHIATObin)
+
* VPP
+
** Build, unit test, deb/rpm
+
*** 64B/128B cache line size - working on passing this configuration to rest of build system i.e. DPDK (Nitin)
+
*** RPi3 32-bit
+
**** Some parts of patch are 32-bit related, some RPi3 related
+
**** If there is justification, look into maintaining a 32-bit build on ARM
+
** Porting & Tuning
+
*** If patches need to be tested on multiple Arm chips, please use DO_NOT_MERGE and Code Review -2
+
*** Two NEON related patches merged, work in progress on others, Nitin testing CLASSIFY_USE_SSE
+
* CSIT
+
** Please open JIRA ticket with details on VM crashing on startup. DONE: [https://jira.fd.io/browse/CSIT-922 CSIT-922]
+
** Khem working on running VPP func tests on internal setup
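
The 64B/128B cache line item above is about passing one build-time value to every component. As a rough illustration, assuming a GCC or Clang compiler, the sketch below shows how code might consume such a value; the macro <code>EXAMPLE_CACHE_LINE_BYTES</code> and the other names are hypothetical and do not come from the VPP or DPDK build systems.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build-time knob; a build could pass
   -DEXAMPLE_CACHE_LINE_BYTES=128 for 128-byte cache line machines. */
#ifndef EXAMPLE_CACHE_LINE_BYTES
#define EXAMPLE_CACHE_LINE_BYTES 64
#endif

/* Round n up to the next multiple of the cache line size
   (the size must be a power of two). */
size_t
example_round_to_cache_line (size_t n)
{
  return (n + EXAMPLE_CACHE_LINE_BYTES - 1)
    & ~((size_t) EXAMPLE_CACHE_LINE_BYTES - 1);
}

/* Per-thread counters aligned and padded to a full cache line, so two
   instances in an array never share a line (GCC/Clang attribute). */
typedef struct
{
  uint64_t packets;
  uint64_t bytes;
} __attribute__ ((aligned (EXAMPLE_CACHE_LINE_BYTES))) example_counter_t;
```

Because the struct size follows the macro, a binary built for 64 bytes lays data out differently from one built for 128 bytes, which is why the value has to be consistent across the whole build.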
+
 
+
''' 2/7/2018 '''
+
* LF lab
+
** OD1000 - last machine was racked; Vanessa needs credentials
+
** Taishan2280 - machines arrived at Vexxhost; confirm with Rudy/Mohammed
+
** ThunderX - machines arrived at Vexxhost; send board details to Mohammed
+
** MACCHIATObin - boards arrived in Arm SJC waiting for enclosures (Andy)
+
* Build, unit test, packaging
+
** 64B/128B cache line size - working on it (Nitin)
+
** Interest in ILP32 from Cavium; customer coming from MIPS32
+
*** [https://www.slideshare.net/linaroorg/bkk16305b-ilp32-performance-on-aarch64 BKK16-305B ILP32 Performance on AArch64]
+
* VPP
+
** NEON usage in vhost - sent first patch for review (Nitin)
+
*** Need to verify how it performs on other Arm-based machines (Brian)
+
*** VPP maintainers prefer to use SIMD wrappers, but it might not always be possible
+
**** Cavium/Arm had to rewrite algorithm for AArch64 instead of using SIMD wrappers in DPDK
+
** CLIB_HAVE_VEC128 - working on it (Gabriel)
+
** Discussed compiler builtins for atomics in VPP call; need to spin another patch with wrappers based on architecture (Kevin)
+
** Seeing prefetch hotspots on TX2+MlnxCX4en (similar to Armada8040) (Nitin)
+
* CSIT
+
** libvirt crashing on VM startup (Hierofalcon) (Gabriel)
+
*** Need someone who can reproduce this issue (Arm TBD)
+
** Huawei also seeing VM issues (Khem)
+
** buildroot doesn't work on Arm (Nitin)
+
*** Root issue: no support in GRUB for AArch64 in buildroot (?)
+
**** Need someone who can reproduce this issue (Arm TBD)
+
*** Peter Mikus replied to Nitin on csit-dev mail list
+
*** Using a temporary workaround: use a different VM image (Ubuntu Cloud) instead of one produced by buildroot
+
**** Working on patching DPDK in VM image (Ubuntu Cloud) just like done in buildroot
+
* Misc
+
** OpenFlow (Nitin, Damjan)
+
*** Is there an OpenFlow agent for VPP, and can VPP implement OpenFlow rules/tables?
+
*** VPP is not flow-based like OVS is; they are different
+
*** Can ODL/Honeycomb be used?
+
 
+
''' 1/31/2018 '''
+
* LF lab
+
** OD1000 - 1 replacement being installed this week
+
** Huawei & Cavium boards should arrive at colo this week; confirm with Rudy
+
* Build, unit test, packaging
+
** Kubeproxy/NAT failures
+
*** Not arch related
+
*** Part of extended unit tests, so does not block CI
+
** `make test` passes on D03 & D05 (Ubuntu)
+
* MACCHIATObin
+
** Seeing hotspots in VPP graph nodes
+
*** L3 forwarding - ip4 rewrite node
+
*** L2 cross-connect
+
*** Try reducing quad loop to a dual loop
+
*** dpdk-input node highly optimized for x86 (could contribute to low perf) but hotspots still in rte_mbuf_t conversion(?)
+
** Some examples of runtime code selection based on uarch exist in the codebase
+
* CSIT
+
** Adrian Oanca joined from Enea
+
** Gabriel seeing VM crashing during boot; related to # interfaces assigned (6)
+
** Nitin ran into issue with buildroot on arm64; see thread on csit-dev
+
 
+
''' 1/24/2018 '''
+
* VPP
+
** DPDK issue with non-pci network cards
+
** build & test status updated
+
** VPP-1127 (VEC_128 enable) under discussion. Should we enable this by default ?
+
** add Nitin to review Neon commits
+
** VPP-1114 currently internal review
+
** VPP-1064 under rework after review by Damjan
+
* CSIT
+
** first 3-nodes functional tests status list
+
** TODO Gabriel: share CSIT VM setup env
+
** nested VM: build-root package support for ARM. Create Jira ticket for Brian.
+
 
+
''' 1/17/2018 '''
+
* Tina to send calendar invite for meeting
+
* FD.io lab
+
** Cavium shipping
+
* VPP
+
** Kubeproxy tests failing
+
** Khem trying to find out the PCIe address for a given netdev interface
+
* CSIT
+
** Gabriel setting up 3 node topo with VMs
+
** Gabriel working on PASS/FAIL status
+
* [https://docs.fd.io/csit/rls1710/report/index.html CSIT 17.10 report]
+
 
+
''' 1/10/2018 '''
+
* Meeting moved 2 hours earlier - 6AM PT / 3PM CET / 7:30PM IST / 10PM CST
+
* FD.io lab
+
** Cavium ThunderX shipping soon
+
* VPP
+
** Kumar to look at VPP-1126
+
** Gabriel proposed https://gerrit.fd.io/r/#/c/10049/ as follow-up to Damjan's patch
+
* CSIT
+
** Gabriel's patch for aarch64 support in CSIT merged
+
** VirtualBox not supported on Arm / Vagrant unknown
+
*** This is OK for upstream since automation expects VMs to already exist
+
* Performance
+
** Need plan for 1T; use TaiShans that were sent to lab
+
* AIs
+
** Brian: Follow up with Vanessa and EdW regarding 'resource issue'
+
** Gabriel: Update CSIT wiki page; which tests are passing/failing?
+
** Brian: Check with Vanessa how to split machines between CI jobs and CSIT jobs
+
 
+
''' 1/3/2018 '''
+
* FD.io lab
+
** One OD1000 sent for RMA
+
** Huawei PO sent out
+
** Cavium PO sent out (?)
+
* VPP
+
** Gabriel working on patch for "show cpu" to display MIDR as human readable
+
** Nitin sent preliminary patch for vhost-user NEON impl
+
*** Seeing perf differences on different cores; tradeoff is single-threaded perf vs. NEON
+
** Kumar built and ran unit tests successfully on D03
+
** Nitin to resume patch for supporting different cache line sizes for the same arch
+
* CSIT
+
** Gabriel cleaned up WIP patch; ready for review
+
** Kumar starting CSIT func tests with Ubuntu VMs
+
*** Scripts for running on dedicated hardware need to be modified, e.g. PCIe resources
+
** Kumar to send doc on testing
+
* Performance
+
** Kumar to start thread on performance testing
+
* AIs
+
** Brian: Check with Tina on shipping and open LF RT ticket once they have arrived
+
** Brian: Need a way to choose either SW or NEON impl based on chip
+
** Gabriel: Create list of broken CSIT tests for 2-node topology
+
''' 12/20/2017 '''
+
'''No meeting next week - Dec 27'''
+
* FD.io lab
+
** OD1000s - build only
+
*** 1 of 3 needs to be RMAd
+
*** Can these be up in time to show 'make test' passes on ARM for 18.01 release report?
+
** TaiShan
+
*** PO in progress
+
** ThunderX - build only
+
*** PO went out
+
* VPP
+
** Patches / JIRAs
+
*** Patch for extended test failure, but still more (new) extended test failures - Gabriel
+
*** Nitin to post vhost-user.c changes for NEON
+
**** Nitin will finish Gabriel's original NEON patch to add CLIB_HAVE_VEC_128
+
** Can we share code on Github e.g. NEON perf tests?
+
* CSIT
+
** Leading question: How many CSIT test cases are passing/failing?
+
** Environment issues preventing running through all CSIT test cases; Gabriel needs dedicated machines or more RAM
+
** Cavium & Huawei will join Gabriel in CSIT replication on ARM hardware next week
+
*** Cavium previously ran vhost test cases manually, now moving to CSIT
+
 
+
''' 12/13/2017 '''
+
* VPP
+
** Quick overview of work items
+
** Waiting to hear back from LF about OD1000 connectivity
+
*** Changes needed to ci-mgmt
+
* CSIT
+
** Starting to reproduce CSIT on x86 and ARM (with Gabriel's WIP patch)
+
*** Some issues with environment variables (perf tests on 2-node)
+
** Need Nexus to support aarch64 packages
+
*** Need a contact for Nexus
+
* Share known issues on wiki!
+
* Request CSIT 'deep dive'
+
 
+
''' 12/06/2017 '''
+
* Can we access the OD1000 in csit lab ?
+
** currently mainly working with VMs
+
* added dedicated wiki page for CSIT : https://wiki.fd.io/view/CSIT/AArch64
+
* WIP : https://gerrit.fd.io/r/#/c/9474/
+
 
+
''' 11/29/2017 '''
+
*VPP
+
** vhost-user.c - SSE4.2 only. Implement range search using NEON. (nitin)
+
** OD1000 status ?
+
*** build only
+
*** can we access them ?
+
*** what can we do to help in general ?
+
** x86 intrinsic review
+
** build VPP on ARM VM on x86
+
*CSIT
+
** what platforms will be made available
+
 
+
''' 11/22/2017 '''
+
* VPP CI
+
** 3 ThunderX for Christmas
+
* CSIT
+
** func on VM vs perfs on HW
+
** func on x86 VMs OK with 2 nodes
+
** DPDK integration WIP : https://gerrit.fd.io/r/#/c/9474/
+
** issues
+
*** how to access the lab ?
+
* Next steps
+
** VPP
+
** CSIT
+
*** structure work & send email (Gabriel)
+
*** is xxhash vs crc32 finished ? (Gabriel)
+
*** ask Maciek & setup a presentation meeting with someone from CSIT (Tina)
+
*** find a time to reschedule this meeting before the CSIT weekly call (Brian)
+
 
+
''' 11/15/2017 '''
+
* VPP upstream status
+
** build && build-release OK
+
** "make test" && "make test-debug" OK
+
** packaging:
+
*** Ubuntu 16.04 OK
+
*** Ubuntu 17.10 ? (TBC)
+
*** fedora-26 OK
+
* vpp continuous test
+
** all tasks required for the Jenkins "verify" job are ready
+
** TODO: request gerrit hook from Dave Barach / vpp-dev (NB & GG)
+
** set up ci in fdio lab
+
* CSIT
+
** setting up env
+
** ThunderX platforms should arrive this week
+
** csit work sharing
+
 
+
''' 11/8/2017 '''
+
 
+
* Unit tests
+
** Tests pass except for random initialization failures
+
** Need to hear back from upstream about Extended unit tests
+
* Should we run plugins such as NSH SFC?
+
* Hardware to lab
+
** Huawei h/w stalled
+
** 3x ThunderX shipping to FD.io lab
+
* CSIT replication
+
** Cavium replicating on ThunderX2; getting started
+
* Let's track our work in Jira; Brian to migrate tasks to Jira
+
 
+
''' 10/25/2017 '''
+
 
+
* Gabriel working on vpp init failure in linux_pci_init()
+
* Kumar to check with GeorgeZ on Huawei boards shipped to CSIT; need to verify tests also on this environment (package versions from distro)
+
* Brian to check whether anything else needs to be done besides 'make test' for upstream enablement
+
  
 
== Status Report Ligato/Contiv ==
 
 
[[File:Capture LandC.PNG]]
 

Latest revision as of 15:13, 21 November 2023

Get Involved

Meeting Details

IRC Channel

#fdio-arm on freenode.net

Slack

Request invitation at https://slack.fd.io/

Jira

Jira issues with ARM64 label

Presentations

Release Milestones

18.10

18.07

18.04

  • CI
    • Upstream patch verification on ARMv8 machines
    • .deb packages

Machines

The FD.io lab is hosted at VEXXHOST colocation centre in Montreal Québec, Canada.

{| class="wikitable"
! Platform !! Role !! Status !! Hostname !! IP !! IPMI !! Cores !! RAM !! Ethernet !! Distro
|-
| Marvell ThunderX || VPP dev debug server || Running || vpp-marvell-dev || 10.30.51.38 || 10.30.50.38 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 18.04.4
|-
| || CI build server || Running in Nomad || s53-nomad || 10.30.51.39 || 10.30.50.39 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 18.04.4
|-
| || CI build server || Running in Nomad || s54-nomad || 10.30.51.40 || 10.30.50.40 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 18.04.4
|-
| || CI build server || Running in Nomad || s52-nomad || 10.30.51.65 || 10.30.50.65 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
|-
| || CI build server || Running in Nomad || s51-nomad || 10.30.51.66 || 10.30.50.66 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
|-
| || CI build server || Running in Nomad || s49-nomad || 10.30.51.67 || 10.30.50.67 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
|-
| || CI build server || Running in Nomad || s50-nomad || 10.30.51.68 || 10.30.50.68 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
|-
| Marvell ThunderX2 || Perf DUT candidate || Running || s27-t13-sut1 || 10.30.51.69 || 10.30.50.69 || 224 || 128GB || 3x40GbE QSFP+ XL710-QDA2 || Ubuntu 18.04.2
|-
| || VPP device server || Running in Nomad || s55-t36-sut1 || 10.30.51.70 || 10.30.50.70 || 256 || 256GB || 2x40GbE QSFP+ XL710-QDA2 / 2x10/25GE SFP+ ConnectX5 || Ubuntu 18.04.4
|-
| || VPP device server || Running in Nomad || s56-t37-sut1 || 10.30.51.71 || 10.30.50.71 || 256 || 256GB || 2x40GbE QSFP+ XL710-QDA2 / 2x10/25GE SFP+ ConnectX5 || Ubuntu 18.04.4
|-
| Huawei TaiShan 2280 || CSIT testbed || Running in CI || s17-t33-sut1 || 10.30.51.36 || 10.30.50.36 || 64 || 128GB || 2x10GbE SFP+ Intel X520-DA2 / 2x25GbE SFP28 Mellanox CX-4 || 18.04.1
|-
| || CSIT testbed || Running in CI || s18-t33-sut2 || 10.30.51.37 || 10.30.50.37 || 64 || 128GB || 2x10GbE SFP+ Intel X520-DA2 / 2x25GbE SFP28 Mellanox CX-4 || 18.04.1
|-
| Marvell MACCHIATObin || N/A || Decommissioned || s20-t34-sut1 || 10.30.51.41 || 10.30.51.49, then connect to /dev/ttyUSB0 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.4
|-
| || N/A || Decommissioned || s21-t34-sut2 || 10.30.51.42 || 10.30.51.49, then connect to /dev/ttyUSB1 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.5
|-
| || N/A || Decommissioned || fdio-mcbin3 || 10.30.51.43 || 10.30.51.49, then connect to /dev/ttyUSB2 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.5
|-
| || Power Cycler || Operational || colspan="7" | 10.30.50.80
|-
| SoftIron OverDrive 1000 || N/A || Decommissioned || softiron-1 || 10.30.51.12 || N/A || 4 || 8GB || || openSUSE
|-
| || N/A || Decommissioned || softiron-2 || 10.30.51.13 || N/A || 4 || 8GB || || openSUSE
|-
| || N/A || Decommissioned || softiron-3 || 10.30.51.14 || N/A || 4 || 8GB || || openSUSE
|}

Note: to get lab access, create a GPG key, upload it to a keyserver, and have it signed by a trusted anchor in a video call (the fingerprint will be needed). An ARM authority (Tina) then needs to send an e-mail to helpdesk@fd.io with your name, e-mail, keygrip and key fingerprint.

CI

Covers automated build, unit test, and packaging for various Linux distros on ARMv8 machines.

{| class="wikitable"
! Jenkins job !! Status !! Description
|-
| vpp-arm-verify-master-ubuntu1604 || Running || xxx
|-
| vpp-arm-merge-master-ubuntu1604 || Running || xxx
|-
| vpp-arm-verify-1804-ubuntu1604 || Running || xxx
|-
| vpp-arm-merge-1804-ubuntu1604 || Running || xxx
|}

Next steps:

  • make test added to verify jobs
  • Clang build
  • openSUSE Leap 15 | CentOS 7 | Ubuntu 18.04
  • vpp-csit-verify-virl-master or equivalent CSIT functional testing

CSIT

Covers automated functional and performance integration testing on ARMv8 3-node and 2-node testbeds.

https://wiki.fd.io/view/CSIT/AArch64

Contiv-VPP

This Kubernetes network plugin uses FD.io VPP to provide network connectivity between PODs.

https://github.com/contiv/vpp

The installation guide for Contiv-VPP on the Arm64 platform is at

https://github.com/contiv/vpp/blob/master/docs/arm64/MANUAL_INSTALL_ARM64.md

Porting and Tuning Roadmap

  • VPP Vectorization: Expanding the Neon Library for IPv4 forwarding code path - Sirshak/Lijian
  • Tuning the quad loop/dual loop for small cores - Lijian
  • General performance analysis and tuning of various graph nodes for IPv4 forwarding test case - Sirshak/Lijian
  • Memory Ordering - Sirshak
  • CSIT Performance Test - Khemendra
  • CSIT Device Test - Juraj
  • CSIT Path Test - Juraj
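
The quad loop/dual loop item in the roadmap above refers to how many packets a graph node handles per loop iteration. The sketch below illustrates the general dual-loop shape only, assuming a trivial per-element operation; <code>example_process_one</code> and <code>example_process_dual</code> are hypothetical names and this is not code from any VPP node.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-packet work: decrement a TTL-like byte. */
static inline void
example_process_one (uint8_t *ttl)
{
  *ttl -= 1;
}

/* Dual loop: handle two elements per iteration, then finish any
   leftover element in a single loop. */
void
example_process_dual (uint8_t *ttls, size_t n)
{
  size_t i = 0;

  while (n - i >= 2)
    {
      example_process_one (&ttls[i]);
      example_process_one (&ttls[i + 1]);
      i += 2;
    }

  while (i < n)
    {
      example_process_one (&ttls[i]);
      i++;
    }
}
```

A quad loop has the same structure with four elements per iteration. Whether two or four performs better on a given core is something to measure rather than assume, which is the point of the tuning item.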

Known Issues

GCC 5.3 ICEs during FP register allocation. Please use GCC 5.4 or newer.

Activity

Recent Patches

misc: vppctl fix heap-buffer-overflow & memleaks Merged 12/14 Tianyu Li
crypto-native: fix build error on Arm using clang-13 Merged 12/14 Jieqiang Wang
snort: fix unused result warning for gcc-10 Merged 11/06 Tianyu Li
l2: fix array-bounds error for prefetch on Arm Merged 11/07 Tianyu Li
ip6: fix IPv6 address calculation error using "ip route add" CLI Merged 10/21 Jieqiang Wang
ipsec: Performance improvement of ipsec4_output_node using flow cache Merged 10/13 Govindarajan Mohandoss
build: fix centos rpm build Merged 10/08 Tianyu Li
vppinfra: fix potential memory access error in _pool_init_fixed Merged 10/05 Jieqiang Wang
svm: fix asan check failed @svm_map_region on arm Merged 06/24 Tianyu Li
l2: fix vrrp prefix mac comparison Merged 06/09 Tianyu Li
build: fix build error after make wipe Merged 06/04 Tianyu Li
memif: fix input node buffer prefetch Merged 05/21 Tianyu Li
memif: fix gcc-10 build error on arm platform Merged 05/21 Tianyu Li
papi: fix ubuntu 1804 make test socket.close error Merged 04/16 Tianyu Li
rdma: fix skip_ipv4_cksum behavior in scalar path Merged 04/15 Tianyu Li
vppinfra: correct intrinsic called by u16x16_from_u8x16 Merged 04/15 Lijian Zhang
vppinfra: fix compiling error due to incompatible udphdr field names Merged 03/05 Jieqiang Wang
avf: optimized with NEON SIMD instruction Merged 12/18 Lijian Zhang
ip: fix compiling error with gcc-10 Merged 09/01 Jieqiang Wang
build: Fix 'make install-deps' errors on aarch64 CentOS 7 Merged 07/29 Jieqiang Wang
acl: correct acl vat help message Merged 07/24 Lijian Zhang
build: add libssl-dev library for ubuntu 20.04 Merged 06/04 Jieqiang Wang
dpdk: fix compiling issue with clang Merged 05/08 Lijian Zhang
vppinfra: fix u32x4_byte_swap on Arm Merged 05/08 Lijian Zhang
build: support arch-specific compiling for Neoverse N1 Merged 04/30 Lijian Zhang
dpdk: false link down issue with ixgbe NIC Merged 03/23 Lijian Zhang
vlib: fix error when creating avf interface on SMP system Merged 03/21 Jieqiang Wang
vlib: leave SIGPROF signal with its default handler Merged 03/21 Jieqiang Wang
build: add libssl-dev for ubuntu 16.04 and 18.04 Merged 03/11 Jieqiang Wang
vlib: fix code of getting numa node with specific cpu_id Merged 02/17 Lijian Zhang
docs: add physmem section in configuration parameters Merged 12/19 Jieqiang Wang
vlib: add max-size configuration parameter for pmalloc Merged 12/18 Jieqiang Wang
crypto: not use vec api with opt_data[VNET_CRYPTO_N_OP_IDS] Merged 11/13 Lijian Zhang
acl: add missing square brackets to vat_help option in acl api Merged 10/31 Jieqiang Wang
dpdk: apply dual loop unrolling in DPDK TX Merged 09/12 Lijian Zhang
ip: apply dual loop unrolling in ip4_rewrite Merged 09/12 Lijian Zhang
ip: apply dual loop unrolling in ip4_input Merged 09/12 Lijian Zhang
build: fix running error with vmxnet3_test_plugin.so Merged 09/11 Jianlin Lv
build: fix unsupported CMake comparison operation Merged 09/05 Jianlin Lv
tap: fix tap interface not working on Arm issue Merged 09/04 Lijian Zhang
build: fix vpp compilation failure on ThunderX2 and Amp Merged 08/19 Jianlin Lv
vppinfra: Update "show cpu" output for AArch64 chips Merged 08/19 Nitin Saxena
vppinfra: refactor test_and_set spinlocks to use clib_spinlock_t Merged 08/02 Jason Zhang
vppinfra: added performance test for clib_rwlock_t (test_rwlock.c) Merged 08/02 Jason Zhang
vppinfra: refactor clib_rwlock_t to use single condition variable Merged 08/02 Jason Zhang
vppinfra: refactor clib_spinlock_t to use compare and swap Merged 08/02 Jason Zhang
vppinfra: added lock performance test for clib_spinlock_t (test_spinlock.c) Merged 08/02 Jason Zhang
vppinfra: refactor use of CLIB_MEMORY_BARRIER () Merged 08/02 Jason Zhang
vppinfra: conformed spinlocks to use CLIB_PAUSE Merged 08/02 Jason Zhang
vppinfra: add u64x2_scatter/u32x4_scatter Merged 06/21 Lijian Zhang
vppinfra: add u64x2_gather/u32x4_gather Merged 06/21 Lijian Zhang
fix compiling error with marvell pp2 plugin Merged 06/11 Jianlin Lv
Switch atomic release API from __sync to __atomic builtin Merged 06/05 Sirshak Das
Switch atomic test and set API from __sync to __atomic builtin Merged 06/05 Sirshak Das
Build packages for generic Arm architecture Merged 05/15 Lijian Zhang
Enable NEON instructions in memcpy_le Merged 05/01 Lijian Zhang
svm_fifo rework to avoid contention on cursize Merged 04/17 Sirshak Das
Re-enable aarch64 neon instruction in vlib_buffer_free_inline Merged 03/20 Lijian Zhang
sctp chunk_len fix Merged 03/06 Sirshak Das
Use acquire/release ordering when accessing svm_fifo shared variable cursize Merged 11/29 Sirshak Das
Optimize xxx_zero_byte_mask NEON function. Merged 11/07 Lijian Zhang
Enable atomic swap and store macro with acquire and release ordering. Merged 11/03 Sirshak Das
Add and enable msb mask vector intrinsic for aarch64. Merged 10/31 Lijian Zhang
vppinfra: add atomic macros for __sync builtins Merged 10/19 Sirshak Das
vppinfra: Fix extendto_high aarch64 NEON api. Merged 10/09 Sirshak Das
Support dynamic dual/quad loop selection on aarch64 Merged 10/01 Lijian Zhang
Enable verbose output during VPP cmake compiling Merged 9/25 Lijian Zhang
dpdk_plugin: fix mlx5 build and runtime issues Merged 9/27 Sirshak Das
Add and enable u32x4_extend_to_u64x2_high for aarch64 NEON intrinsics. Merged 9/12 Sirshak Das
Add horizontal add (hadd) vector intrinsic via NEON. Merged 9/11 Sirshak Das
Add u32x4_extend_to_u64x2 for aarch64 using NEON intrinsics Merged 9/11 Sirshak Das
Replacing vtbl NEON intrinsic with rev NEON intrinsic for byte_swap. Merged 9/11 Sirshak Das
Fix array bound failure in api_sr_localsid_add_del Merged 8/30 Lijian Zhang
cmake: fix marvell plugin build Merged 8/28 Brian Brooks
fix dpdk_plugin.so load failure with DPDK 18.08 Merged 8/23 Lijian Zhang
Fix a bug in function pipe_rx Merged 8/17 Lijian Zhang
fix compiling warnings with GCC Merged 8/17 Lijian Zhang
Update AArch64 CSIT machines into FD.io VPP docs Merged 8/17 Lijian Zhang
Add support for shuffle vector intrinsic via Neon in ARM Merged 8/1 Sirshak Das
Improve cpu { coremask-% } configure option Merged 8/1 Yi He
Fix undefined symbol: fformat_append_cr in vat plugins loading Merged 7/31 Yi He
pp2: increase recycle batch size Merged 7/10 Brian Brooks
pp2: change default queue size Merged 7/26 Brian Brooks
pp2: use configured RX queue size Merged 7/10 Brian Brooks
Fix load_unaligned undefined and other possible build failures Merged 6/26 Sirshak Das
Enable PMU cycle counter for graph node cycles Sirshak Das
Fix clang compilation on aarch64: extraneous parentheses Merged 6/13 Sirshak Das
Fix clang compilation on aarch64: value size does not match register size Merged 5/30 Sirshak Das
Fix clang compilation on aarch64: sizeof operator error Merged 5/30 Sirshak Das
Fix clang compilation on aarch64: replace -pie with -fPIE for dpdk compilation Merged 5/30 Sirshak Das
dpdk: set dmamap iova address value according to eal_iova_mode Merged 5/28 Sachin Saxena
Fixes make test errors with clang compiler on aarch64 Merged 5/27 Sirshak Das
Fix broken compilation for non-numa aware platforms Merged 5/16 Sachin Saxena
build-data: Common makefile for NXP DPAA1/DPAA2 platforms Merged 5/4 Sachin Saxena
arm64: Avoid setting march to corei7 when Cross Compiling for ARM Merged 5/4 Sachin Saxena
use restrict keyword VPP-1126 Khemendra Kumar
Autotools: Autodetection of cache line size VPP-1064 Nitin Saxena
add 'is_all_zero(x)' for NEON - fix build break Merged 2/20 Adrian Oanca
u8x16_compare_byte_mask optimization Merged 2/24 Adrian Oanca
Added u8x16,u32x4,u64x2 variants of _zero_byte_mask(x) for ARM/NEON platform Merged 2/26 VPP-1129 Adrian Oanca
add CLIB_HAVE_VEC128 with NEON intrinsics Merged 02/08 VPP-1127 Gabriel Ganne
Use neutral vector code for ethernet_frame_is_tagged Merged 2/19 Damjan Marion
vhost: Added ARMV8 NEON version of function map_guest_mem() Merged 2/7 VPP-1085 Nitin Saxena
vppinfra: use __atomic_fetch_add instead of __sync_fetch_and_add builtins VPP-1114 Kevin Wang
Arm system counter cleanup Merged 1/30 VPP-1125 Brian Brooks
svm: ... on autodetected VA space size (fixup again) Merged 01/10 Gabriel Ganne
svm: calc base address on AArch64 based on autodetected VA space size (fixup) Merged 01/10 Gabriel Ganne
svm: calc base address on AArch64 based on autodetected VA space size Merged 01/09 Damjan Marion
show cpu microarchitecture Merged 01/06 Gabriel Ganne
Fix Debian Packaging on AARCH64 Merged 01/06 Nitin Saxena
more extended tests fixes Merged 12/16 Gabriel Ganne
Use crc32 wrapper Merged 12/16 VPP-1086 Gabriel Ganne
implement clib_smp_pause() for arm and aarch64 platform Merged 12/15 VPP-1066 Kevin Wang
make "test-all" target pass again (for all platforms) Merged 12/13 Gabriel Ganne
fill "show cpu" Flag list on aarch64 platforms Merged 12/06 VPP-1065 Gabriel Ganne
remove smp dead code Merged 12/06 VPP-1066 Gabriel Ganne
net/virtio: support modern device id Merged 11/28 Gabriel Ganne
use REV on aarch64 for endianness swapping Merged 11/21 VPP-1067 Gabriel Ganne
armv8 crc32 - fix macro name Merged 11/15 Gabriel Ganne
bier - fix node table declaration Merged 11/14 Gabriel Ganne
Map SVM regions at a sane offset on arm64 Merged 11/10 Brian Brooks
bfd tests fix Merged 11/07 Gabriel Ganne
debian packaging fix Merged 11/06 Gabriel Ganne
lb test fix Merged 10/31 Gabriel Ganne
conditional x86intrin.h inclusion Merged 10/25 Gabriel Ganne
fix test_lb_ip4_gre6() cleanup Merged 10/24 Gabriel Ganne
null-terminate some formatted string Merged 10/20 Gabriel Ganne
lb plugin - fix format() type mismatches Merged 10/16 Gabriel Ganne
Use AESNI=y only on x86_64 machines Merged 10/14 Brian Brooks
Improved arm64 chip detection Merged 09/11 Brian Brooks
Native arm64 build: dpdk/Makefile change Merged 08/31 Brian Brooks

Meeting Minutes

11/21/2023

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Niyaz Murshed
    • Jieqiang Wang
  • CSIT
    • Status
      • Dave Wallace helps monitor the AArch64 CI/CD status, which looks fine
      • Replace old ThunderX2 with Ampere Altra; budget got approved, still in progress
        • Sync with CSIT folks in the call when possible -- Juraj
      • Maciek asked about the availability of N2-based hardware
        • Plan to ship N2-based servers (Nvidia Grace (V2) / Ampere One (in-house design by Ampere)) to the FD.io lab next year
        • Timeline TBD
      • IPSec test cases
        • Patch already merged
        • QAT cards in Austin labs, plan to ship them to FD.io lab
      • RDMA test cases
        • MLX DPDK test cases are enabled; RDMA test cases are not enabled on AArch64
  • VPP
    • Detailed planning for VPP projects in the next call
    • Refactor OpenSSL usage in VPP IPsec -- Lijian
      • Move key generation and initialization steps out of data plane to control plane, see performance boost
    • Investigate make test framework in VPP -- Lijian
      • The patch broke WireGuard test cases, so the workflow needs to be figured out
    • VPP ramp-up -- Niyaz
      • Investigate VPP graph node mechanism and how to add nodes to the group
    • IPSec scalability tests -- Jieqiang
      • Try to figure out dpdk-rss-flows.py and how to generate balanced RSS flows for IPSec tests
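
As a rough illustration of the RSS balancing item above, the sketch below computes a Toeplitz hash over an IPv4 4-tuple and picks source ports until every queue holds the same number of flows. It assumes Toeplitz-based RSS, the widely published 40-byte sample key, and a plain modulo mapping from hash to queue. The actual NIC configuration, the fields hashed for IPSec traffic, and the behaviour of dpdk-rss-flows.py are not described in these minutes, and the names `toeplitz_hash`, `queue_for_flow`, and `balanced_src_ports` are hypothetical.

```python
import ipaddress
import struct

# Widely published 40-byte sample RSS key.
# Assumption: the testbed NICs may be configured with a different key.
RSS_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of data under key."""
    key_bits = len(key) * 8
    if len(data) * 8 + 32 > key_bits:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for pos in range(len(data) * 8):
        # For every set input bit (MSB first), XOR in the 32-bit key
        # window that starts at the same bit position.
        if data[pos // 8] & (0x80 >> (pos % 8)):
            result ^= (key_int >> (key_bits - 32 - pos)) & 0xFFFFFFFF
    return result


def flow_tuple(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Pack the 4-tuple as src IP, dst IP, src port, dst port (network order)."""
    return (
        ipaddress.IPv4Address(src_ip).packed
        + ipaddress.IPv4Address(dst_ip).packed
        + struct.pack("!HH", src_port, dst_port)
    )


def queue_for_flow(src_ip, dst_ip, src_port, dst_port, n_queues, key=RSS_KEY) -> int:
    """Hypothetical mapping: hash modulo queue count (real NICs use a RETA table)."""
    return toeplitz_hash(key, flow_tuple(src_ip, dst_ip, src_port, dst_port)) % n_queues


def balanced_src_ports(src_ip, dst_ip, dst_port, n_queues, flows_per_queue,
                       first_port=1024):
    """Pick source ports so that each queue receives exactly flows_per_queue flows."""
    buckets = {q: [] for q in range(n_queues)}
    for port in range(first_port, 65536):
        q = queue_for_flow(src_ip, dst_ip, port, dst_port, n_queues)
        if len(buckets[q]) < flows_per_queue:
            buckets[q].append(port)
        if all(len(ports) == flows_per_queue for ports in buckets.values()):
            return buckets
    raise ValueError("not enough source ports to balance the queues")
```

Calling `balanced_src_ports("10.0.0.1", "10.0.0.2", 4500, 4, 8)` returns a dictionary from queue index to eight source ports each. The addresses and port numbers here are placeholders, not values from the testbed.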

07/18/2023

  • Attendees
    • Jieqiang Wang
    • Tianyu Li
    • Juraj Linkes
  • CSIT
    • Timeout issue happens periodically on Taishan server, even in release testing.
    • IPSec & VxLAN performance drop issue on Ampere Altra
      • QAT cards are planned to be shipped
      • need to pay attention to the execution time for IPSec release testing
      • Need to investigate further on performance degradation issue
    • Verify jobs, merge jobs, device testing, and release testing are fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported?
      • Will have a debug meeting with RDMA maintainers on the issues.
      • ConnectX6 NIC info will be updated in doc first
      • Decide which test cases to run on the testbeds (time consideration/iterative test/coverage test)
    • MRR failed cases
      • Probably due to latest DPDK upgrade, not an arm-specific issue.
    • New test cases list on 3n-alt
      • NAT tests cannot be added because they are running on 2-node testbed only
      • enable IPSec flow cache(arm)/IPSec SPD fast path feature
    • Release testing
    • Plan to replace TX2 with Altra as VPP device testing testbed

06/20/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj Linkes
  • CSIT
    • Timeout issue happens periodically on Taishan server, even in release testing.
    • IPSec & VxLAN performance drop issue on Ampere Altra
      • QAT cards are planned to be shipped
      • need to pay attention to the execution time for IPSec release testing
      • Need to investigate further on performance degradation issue
    • Verify jobs, merge jobs, device testing, and release testing are fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported?
      • Will have a debug meeting with RDMA maintainers on the issues.
      • ConnectX6 NIC info will be updated in doc first
      • Decide which test cases to run on the testbeds (time consideration/iterative test/coverage test)
    • MRR failed cases
      • Probably due to latest DPDK upgrade, not an arm-specific issue.
    • New test cases list on 3n-alt
      • NAT tests cannot be added because they are running on 2-node testbed only
      • enable IPSec flow cache(arm)/IPSec SPD fast path feature

05/16/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • CSIT
    • Timeout issue happens periodically on Taishan server, even in release testing.
    • The link issue in DPDK testpmd test cases on Ampere Altra is still there.
      • Try cable switch while upgrading NIC firmware and drivers
      • Try to reproduce the tests after the NIC firmware upgrade
      • Try different port pairs of the same two NICs
    • IPSec & VxLAN performance drop issue on Ampere Altra
      • QAT cards are planned to be shipped
      • need to pay attention to the execution time for IPSec release testing
      • Need to investigate further on performance degradation issue
    • Verify jobs, merge jobs, device testing, and release testing are fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported?
      • Will have a debug meeting with RDMA maintainers on the issues.
      • ConnectX6 NIC info will be updated in doc first
      • Decide which test cases to run on the testbeds (time consideration/iterative test/coverage test)
    • MRR failed cases
      • Probably due to latest DPDK upgrade, not an arm-specific issue.
  • VPP

04/18/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • CSIT
    • Timeout issue happens periodically on Taishan server, even in release testing.
    • The link issue in DPDK testpmd test cases on Ampere Altra is still there.
    • IPSec & VxLAN performance drop issue on Ampere Altra
      • QAT cards are planned to be shipped
      • need to pay attention to the execution time for IPSec release testing
      • Need to investigate further on performance degradation issue
    • Verify jobs, merge jobs, device testing, and release testing are fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported?
      • Will have a debug meeting with RDMA maintainers on the issues.
      • ConnectX6 NIC info will be updated in doc first
  • VPP

04/04/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • CSIT
    • Timeout issue happens periodically on Taishan server, even in release testing.
    • The link issue in DPDK testpmd test cases on Ampere Altra is still there.
    • IPSec & VxLAN performance drop issue on Ampere Altra
    • Verify jobs, merge jobs, device testing, and release testing are fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported?
      • Will have a debug meeting with RDMA maintainers on the issues.
  • VPP

03/07/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • CSIT
    • Timeout issue happens periodically on Taishan server, even in release testing.
    • The link issue in DPDK testpmd test cases on Ampere Altra is still there.
    • Verify jobs, merge jobs, device testing, and release testing are fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported?
  • VPP

2/21/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd XL710 interface-not-up failure (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in the FD.io lab.
            • Confirm with Vexxhost people if replacing Intel NICs is feasible
            • Tried waiting longer, interface still not up; restart is not enough either - need to figure out a reliable workaround.
            • Tried an old version of DPDK - 21.08 does not work. May need to try an older version.
            • Is it related to the NIC's speed, duplex, and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
              • DPDK port/link status broken - l3fwd has the same issue
              • Sent a detailed email to the i40e maintainer on the dpdk-dev mailing list; waiting for response
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate 22.10 release testing result
            • Compiler version change seems to be one of the factors for perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3-tsh testbed
            • Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
            • Configuring isolcpus via kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
              • isolcpus seems to be working fine
              • still need to root-cause the timeout issue - sometimes slower
              • run DPDK build, just use the non-isolated cores for build
              • both VM and VPP start slower than before
              • timeout happens while VPP is loading plugins
              • Is VPP crashing? - no crash
              • Is the VM bound to isolated cores? - need to check
              • Will set up a live debug session for Tianyu and Juraj
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
    • MLX NICs Planning
      • CX6 and CX7 - CX7 is hard to get on the market - MLX NICs will be used and reported
      • CX6: VPP native RDMA driver has issues; DPDK MLX driver is fine.
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance numbers: NEON > SVE-VLA > SVE-VLS on Arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with Buildroot - for running VPP on FVP - a distro like Ubuntu is slower than Buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern helps benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for MLX NIC when used with QAT card
            • Benchmark IPSec test case with QAT card/MLX NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPSec traffic with QAT card
              • VPP native RDMA driver only - VPP metadata corruption - may be related to memory barriers
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
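
The RFC 2544 no-drop-rate item above amounts to a search over offered load. A minimal sketch, assuming a hypothetical `measure_loss(rate)` callback that runs one trial at the given rate and returns the number of lost packets. Driving an Ixia or any other traffic generator is outside this sketch, and the name `find_ndr` and its parameters are illustrative only.

```python
from typing import Callable


def find_ndr(measure_loss: Callable[[float], int],
             line_rate: float,
             precision: float = 0.005) -> float:
    """Binary-search the highest offered rate with zero packet loss.

    The result uses the same unit as line_rate; precision is the stop
    width expressed as a fraction of line_rate.
    """
    if measure_loss(line_rate) == 0:
        return line_rate  # no loss even at line rate
    # Invariant: low is loss-free (0 means no traffic), high is lossy.
    low, high = 0.0, line_rate
    while high - low > precision * line_rate:
        mid = (low + high) / 2
        if measure_loss(mid) == 0:
            low = mid
        else:
            high = mid
    return low


if __name__ == "__main__":
    # Simulated device that starts dropping above 37.3 Mpps (made-up number).
    capacity = 37.3
    ndr = find_ndr(lambda rate: 0 if rate <= capacity else 1, line_rate=100.0)
    print(f"NDR is approximately {ndr:.2f} Mpps")
```

A PDR search would follow the same shape with a loss-ratio threshold instead of the zero-loss check. Real trials are noisy, so repeated measurements per rate may be needed before trusting a result.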


2/7/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd XL710 interface-not-up failure (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in the FD.io lab.
            • Confirm with Vexxhost people if replacing Intel NICs is feasible
            • Tried waiting longer, interface still not up; restart is not enough either - need to figure out a reliable workaround.
            • Tried an old version of DPDK - 21.08 does not work. May need to try an older version.
            • Is it related to the NIC's speed, duplex, and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
              • DPDK port/link status broken - l3fwd has the same issue
              • Sent a detailed email to the i40e maintainer on the dpdk-dev mailing list; waiting for response
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate 22.10 release testing result
            • Compiler version change seems to be one of the factors for perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3-tsh testbed
            • Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
            • Configuring isolcpus via kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
              • isolcpus seems to be working fine
              • still need to root-cause the timeout issue - sometimes slower
              • run DPDK build, just use the non-isolated cores for build
              • both VM and VPP start slower than before
              • timeout happens while VPP is loading plugins
              • Is VPP crashing? - no crash
              • Is the VM bound to isolated cores? - need to check
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
    • MLX NICs Planning
      • CX6 and CX7 - CX7 is hard to get on the market - MLX NICs will be used and reported
      • CX6: VPP native RDMA driver has issues; DPDK MLX driver is fine.
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance numbers: NEON > SVE-VLA > SVE-VLS on Arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with Buildroot - for running VPP on FVP - a distro like Ubuntu is slower than Buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern helps benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for MLX NIC when used with QAT card
            • Benchmark IPSec test case with QAT card/MLX NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPSec traffic with QAT card
              • VPP native RDMA driver only - VPP metadata corruption - may be related to memory barriers
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

1/17/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd XL710 interface-not-up failure (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in the FD.io lab.
            • Confirm with Vexxhost people if replacing Intel NICs is feasible
            • Tried waiting longer, interface still not up; restart is not enough either - need to figure out a reliable workaround.
            • Tried an old version of DPDK - 21.08 does not work. May need to try an older version.
            • Is it related to the NIC's speed, duplex, and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate 22.10 release testing result
            • Compiler version change seems to be one of the factors for perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3-tsh testbed
            • Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
            • Configuring isolcpus via kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance numbers: NEON > SVE-VLA > SVE-VLS on Arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with Buildroot - for running VPP on FVP - a distro like Ubuntu is slower than Buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern helps benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for MLX NIC when used with QAT card
            • Benchmark IPSec test case with QAT card/MLX NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPSec traffic with QAT card
              • VPP native RDMA driver only - VPP metadata corruption - may be related to memory barriers
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

12/20/2022

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd XL710 interface-not-up failure (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in the FD.io lab.
            • Confirm with Vexxhost people if replacing Intel NICs is feasible
            • Tried waiting longer, interface still not up; restart is not enough either - need to figure out a reliable workaround.
            • Tried an old version of DPDK - 21.08 does not work. May need to try an older version.
            • Is it related to the NIC's speed, duplex, and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate 22.10 release testing result
            • Compiler version change seems to be one of the factors for perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3-tsh testbed
            • Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
            • Configuring isolcpus via kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve the IOMMU issue for the mlx NIC when used with a QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve the crash issue with the QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating the crash at 90% line-rate IPsec traffic with the QAT card
              • VPP rdma native driver only - VPP metadata corruption - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
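
For the "SVE vs NEON packet checksum comparison" item, a scalar reference can help decide which build is wrong when the two disagree. The following is only a minimal sketch of the standard 16-bit one's-complement Internet checksum (RFC 1071 style); the function name is hypothetical and it is not taken from the VPP code base.

```python
def internet_checksum(data: bytes) -> int:
    """Scalar 16-bit one's-complement checksum (RFC 1071 style).

    Hypothetical reference helper: compare its result with the value
    produced by each vectorised build for the same packet bytes.
    """
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


if __name__ == "__main__":
    # Sample IPv4 header with the checksum field zeroed.
    header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
    print(hex(internet_checksum(header)))
```

Running the same function over a header that already contains a correct checksum should return 0, which gives a quick sanity check for captured packets.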

12/06/2022

  • Attendees
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd fails to bring up the XL710 interface (2 Ampere servers back to back) - the same interface works fine with VPP and other apps.
            • Not reproduced on the local setup - need to schedule a debug session to reproduce the issue in the FD.io lab.
            • Confirm with the Vexxhost people whether replacing the Intel NICs is feasible
            • Tried waiting longer, but the interface still does not come up, and a restart is not enough either - need to figure out a reliable workaround.
            • Tried an older version of DPDK - 21.08 does not work. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check the local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate the 22.10 release testing results
            • The compiler version change seems to be one of the factors behind the perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3-tsh testbed
            • Timeout issue occurred when starting VPP inside the VM, but not when starting testpmd
            • The isolcpus kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj will try
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve the IOMMU issue for the mlx NIC when used with a QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve the crash issue with the QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating the crash at 90% line-rate IPsec traffic with the QAT card
              • VPP rdma native driver only - VPP metadata corruption - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
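
For the item on automating the RFC 2544 no-drop-rate (NDR) throughput test, the core loop can be sketched as a plain binary search over the offered rate. This is a minimal illustration only: `measure_loss` is a hypothetical callback standing in for one traffic-generator trial (for example an Ixia run), and the sketch is not the search algorithm that CSIT or Ixia actually use.

```python
from typing import Callable


def find_ndr(measure_loss: Callable[[float], int],
             max_rate: float,
             precision: float = 0.005) -> float:
    """Binary-search the highest offered rate with zero packet loss.

    measure_loss(rate) is a hypothetical hook: it runs one trial at the
    given rate and returns the number of lost packets.
    precision is the final search interval as a fraction of max_rate.
    """
    if measure_loss(max_rate) == 0:
        return max_rate  # the device keeps up with the maximum rate
    lo, hi = 0.0, max_rate  # lo: assumed lossless, hi: known lossy
    while hi - lo > precision * max_rate:
        mid = (lo + hi) / 2
        if measure_loss(mid) == 0:
            lo = mid  # no drops, so search higher
        else:
            hi = mid  # drops seen, so search lower
    return lo


if __name__ == "__main__":
    # Stand-in device that starts dropping above 7.3 (arbitrary units).
    print(find_ndr(lambda rate: 0 if rate <= 7.3 else 1, max_rate=10.0))
```

Real trials are noisy, so an automated run would likely repeat each trial or use a small loss tolerance for PDR; those choices are left out of this sketch.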

11/15/2022

  • Attendees
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
  • Miscellaneous
    • Reschedule the meeting to 9 am for Juraj and 3 pm for Shanghai folks
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
        • Good news: no more slowdown after 200 rounds of testing.
          • ThunderX2 servers need to be upgraded from Ubuntu 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
          • Distro upgrade to Ubuntu 22.04 is still ongoing - no ETA yet
            • Server configuration will remain the same, already integrated in the Ansible playbook
          • Re-enable voting if there are no more issues with 22.04 device testing
            • Submit a patch to enable voting right after the meeting
      • Test Meltdown/Spectre vulnerabilities
        • CSIT maintainers asked whether tools exist to test for these vulnerabilities on the Arm platform (not limited to Arm)
        • Will confirm this issue with support team - Lijian
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • VM cases failed only on the 3n-alt performance testbed; the error log reports a missing file, likely a configuration issue
        • Another intermittent VM failure happens on tx2 and alt; need to figure out the above case first
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged, monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable the voting right
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve the IOMMU issue for the mlx NIC when used with a QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve the crash issue with the QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating the crash at 90% line-rate IPsec traffic with the QAT card
              • VPP rdma native driver only - VPP metadata corruption - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate


10/18/2022

  • Attendees
    • Juraj Linkes
    • Tianyu Li
    • Lijian Zhang
    • Jieqiang Wang
  • Miscellaneous
    • Reschedule the meeting to 9 am for Juraj and 3 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • CSIT perf numbers VS local perf numbers
            • VPP cloud image in CSIT VS native built VPP in local env
            • One DPDK patch introduced perf degradation on Arm platform
            • Configuration differences between the CSIT env and the local env (hugepage size, startup.conf parameters, etc.)
          • Juraj will write down the procedures for setting up the Ampere Altra setup in the FD.io lab
            • And the procedures of developing/developing test cases in CSIT (performance & device testing)
            • Juraj should have already sent these to Jieqiang previously.
          • 22.06 release testing will happen soon
          • NDR/PDR data difference - deep dive needed, waiting for the Ampere folks to engage
          • DPDK testpmd fails to bring up the XL710 interface (2 Ampere servers back to back) - the same interface works fine with VPP and other apps.
            • Not reproduced on the local setup - need to schedule a debug session to reproduce the issue in the FD.io lab.
            • Tried waiting longer, but the interface still does not come up, and a restart is not enough either - need to figure out a reliable workaround.
            • Replace XL710 NIC? - try asking tomorrow.
            • Tried an older version of DPDK - 21.08 does not work. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check the local XL710 firmware version
        • New links for VPP perf trending/report pages
        • NUMA issue
          • Will run the performance report on the Arm testbed once the patch to resolve the NUMA issue is merged
          • Dave will help merge the patch into the corresponding branches


    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
        • Good news: no more slowdown after 200 rounds of testing.
          • ThunderX2 servers need to be upgraded from Ubuntu 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
          • Suggest rerunning the tests after the upgrade to 22.04
          • Re-enable voting once there are no more issues with 22.04 device testing
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged, monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable the voting right
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve the IOMMU issue for the mlx NIC when used with a QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve the crash issue with the QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating the crash at 90% line-rate IPsec traffic with the QAT card
              • VPP rdma native driver only - VPP metadata corruption - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

9/20/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Lijian Zhang
    • Jieqiang Wang
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • CSIT perf numbers VS local perf numbers
            • VPP cloud image in CSIT VS native built VPP in local env
            • One DPDK patch introduced perf degradation on Arm platform
            • Configuration differences between the CSIT env and the local env (hugepage size, startup.conf parameters, etc.)
          • Juraj will write down the procedures for setting up the Ampere Altra setup in the FD.io lab
            • And the procedures of developing/developing test cases in CSIT (performance & device testing)
            • Juraj should have already sent these to Jieqiang previously.
          • 22.06 release testing will happen soon
          • NDR/PDR data difference - deep dive needed, waiting for the Ampere folks to engage
          • DPDK testpmd fails to bring up the XL710 interface (2 Ampere servers back to back) - the same interface works fine with VPP and other apps.
            • Not reproduced on the local setup - need to schedule a debug session to reproduce the issue in the FD.io lab.
            • Tried waiting longer, but the interface still does not come up, and a restart is not enough either - need to figure out a reliable workaround.
            • Replace XL710 NIC? - try asking tomorrow.
            • Tried an older version of DPDK - 21.08 does not work. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check the local XL710 firmware version
        • New links for VPP perf trending/report pages
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
        • Good news: no more slowdown after 200 rounds of testing.
          • ThunderX2 servers need to be upgraded from Ubuntu 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
          • Suggest rerunning the tests after the upgrade to 22.04
          • Re-enable voting once there are no more issues with 22.04 device testing
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged, monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable the voting right
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
        • QAT-enabling kernel patch is expected to be released around October; a kernel upgrade is required.
      • QAT test cases are developed based on Python APIs / CLIs
    • VPP build servers - 2 new Ampere Altra servers; whether to decommission the 2 old ThunderX1 servers - need to confirm
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve the IOMMU issue for the mlx NIC when used with a QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve the crash issue with the QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating the crash at 90% line-rate IPsec traffic with the QAT card
              • VPP rdma native driver only - VPP metadata corruption - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate


9/6/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Lijian Zhang
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • CSIT perf numbers VS local perf numbers
            • VPP cloud image in CSIT VS native built VPP in local env
            • One DPDK patch introduced perf degradation on Arm platform
            • Configuration differences between the CSIT env and the local env (hugepage size, startup.conf parameters, etc.)
          • Juraj will write down the procedures for setting up the Ampere Altra setup in the FD.io lab
            • And the procedures of developing/developing test cases in CSIT (performance & device testing)
            • Juraj should have already sent these to Jieqiang previously.
          • 22.06 release testing will happen soon
          • NDR/PDR data difference - deep dive needed, waiting for the Ampere folks to engage
          • DPDK testpmd fails to bring up the XL710 interface (2 Ampere servers back to back) - the same interface works fine with VPP and other apps.
            • Not reproduced on the local setup - need to schedule a debug session to reproduce the issue in the FD.io lab.
            • Tried waiting longer, but the interface still does not come up, and a restart is not enough either - need to figure out a reliable workaround.
            • Replace XL710 NIC? - try asking tomorrow.
            • Tried an older version of DPDK - 21.08 does not work. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check the local XL710 firmware version
        • New links for VPP perf trending/report pages
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
        • Good news: no more slowdown after 200 rounds of testing.
          • ThunderX2 servers need to be upgraded from Ubuntu 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
          • Suggest rerunning the tests after the upgrade to 22.04
          • Re-enable voting once there are no more issues with 22.04 device testing
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged, monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable the voting right
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
        • QAT-enabling kernel patch is expected to be released around October; a kernel upgrade is required.
      • QAT test cases are developed based on Python APIs / CLIs
    • VPP build servers - 2 new Ampere Altra servers; whether to decommission the 2 old ThunderX1 servers - need to confirm
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan has been posted upstream and is awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPsec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
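
For the RFC 2544 no-drop-rate automation item above, a minimal sketch of a plain binary search for NDR/PDR. This is an illustration only, not necessarily what CSIT or the Ixia tooling does; find_rate, measure_loss and the fake device below are hypothetical names, and the caller is assumed to supply a function that runs one trial at a given offered rate and returns the observed loss ratio.

```python
from typing import Callable


def find_rate(measure_loss: Callable[[float], float],
              line_rate: float,
              loss_tolerance: float = 0.0,
              precision: float = 0.005) -> float:
    """Binary-search the highest offered rate whose loss ratio is within tolerance.

    loss_tolerance=0.0 gives a no-drop rate (NDR); a small non-zero value
    gives a partial-drop rate (PDR). precision is a fraction of line_rate.
    """
    low, high = 0.0, line_rate
    # If the device sustains full line rate there is nothing to search.
    if measure_loss(high) <= loss_tolerance:
        return high
    # Invariant: 'low' is known to pass (or is zero), 'high' is known to fail.
    while high - low > precision * line_rate:
        mid = (low + high) / 2
        if measure_loss(mid) <= loss_tolerance:
            low = mid
        else:
            high = mid
    return low


if __name__ == "__main__":
    # Hypothetical stand-in for a real traffic generator trial:
    # the device forwards at most 37.0 Gbps and drops the excess.
    def fake_trial(rate_gbps: float) -> float:
        capacity = 37.0
        return max(0.0, (rate_gbps - capacity) / rate_gbps)

    print("NDR estimate:", find_rate(fake_trial, line_rate=100.0))
    print("PDR estimate:", find_rate(fake_trial, line_rate=100.0, loss_tolerance=0.005))
```

Each trial in a real run would take seconds to minutes, so the number of iterations (about log2(1/precision)) is what dominates total test time.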

8/16/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Masksym Vynnvk
    • Jieqiang Wang
    • Tianyu Li
    • Lijian Zhang
  • CSIT
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it will fix the issue
      • VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • Mount only the /dev/vfio/xxx entries in use rather than the whole /dev/vfio
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting rights
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
    • VPP build servers - 2 new Ampere Altra servers; need to confirm whether to decommission the 2 old ThunderX1 servers
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan has been posted upstream and is awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPsec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
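
For the /dev/vfio mounting item above (expose only the /dev/vfio/xxx entries in use), a minimal sketch of how the needed entries could be derived from PCI addresses. It relies on the standard Linux sysfs layout, where each PCI device has an iommu_group symlink whose target name is the group number. The function name vfio_nodes is hypothetical, and this is not taken from the merged CSIT patch.

```python
import os
from typing import Iterable, List


def vfio_nodes(pci_addrs: Iterable[str],
               sysfs_root: str = "/sys/bus/pci/devices") -> List[str]:
    """Return the /dev/vfio entries needed for the given PCI devices.

    Always includes the /dev/vfio/vfio container node, followed by one
    /dev/vfio/<group> node per distinct IOMMU group, sorted numerically.
    """
    groups = set()
    for addr in pci_addrs:
        # .../<addr>/iommu_group is a symlink ending in the group number.
        link = os.readlink(os.path.join(sysfs_root, addr, "iommu_group"))
        groups.add(os.path.basename(link))
    nodes = ["/dev/vfio/vfio"]
    nodes += [f"/dev/vfio/{g}" for g in sorted(groups, key=int)]
    return nodes
```

The returned paths could then be handed to whatever container runtime option exposes individual device nodes, so that concurrent jobs no longer share one mount of the whole directory.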

8/2/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Masksym Vynnvk
    • Jieqiang Wang
    • Tianyu Li
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it will fix the issue
      • VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • Mount only the /dev/vfio/xxx entries in use rather than the whole /dev/vfio
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting rights
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old ThunderX1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on x86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
    • VPP build servers - 2 new Ampere Altra servers; need to confirm whether to decommission the 2 old ThunderX1 servers
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP & Snort on K8s pods automatically
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan has been posted upstream and is awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPsec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
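
For the build-server sizing item above (256G RAM, about 16G of memory per job), a small worked calculation of how many jobs could run concurrently on memory alone. The function name and the optional reserve for the host OS are assumptions for illustration; CPU and disk limits are not considered.

```python
def max_concurrent_jobs(ram_gb: int, per_job_gb: int, reserved_gb: int = 0) -> int:
    """Number of jobs that fit in RAM after setting aside reserved_gb for the host."""
    if per_job_gb <= 0:
        raise ValueError("per_job_gb must be positive")
    return max(0, (ram_gb - reserved_gb) // per_job_gb)


if __name__ == "__main__":
    # Figures from the minutes: 256G RAM, about 16G per job.
    print(max_concurrent_jobs(256, 16))
    # Same server with a hypothetical 16G set aside for the host.
    print(max_concurrent_jobs(256, 16, reserved_gb=16))
```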

7/19/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
  • CSIT
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it will fix the issue
      • VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • Mount only the /dev/vfio/xxx entries in use rather than the whole /dev/vfio
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting rights
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old ThunderX1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on x86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX NIC
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP & Snort on K8s pods automatically
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan has been posted upstream and is awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPsec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case


7/5/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
  • CSIT
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it will fix the issue
      • VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • Mount only the /dev/vfio/xxx entries in use rather than the whole /dev/vfio
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting rights
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old ThunderX1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on x86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP on N1 platforms
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • Reassembly node optimization by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP & Snort on K8s pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Relies on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittently large statistic numbers for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Tested perfmon patch - Jieqiang
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan has been posted upstream and is awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPsec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case

6/21/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it will fix the issue
      • VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • Mount only the /dev/vfio/xxx entries in use rather than the whole /dev/vfio
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting rights
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old ThunderX1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on x86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • Reassembly node optimization by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP & Snort on K8s pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Relies on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittently large statistic numbers for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Tested perfmon patch - Jieqiang
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan has been posted upstream and is awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPsec traffic with QAT card

6/7/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it will fix the issue
      • VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • Mount only the /dev/vfio/xxx entries in use rather than the whole /dev/vfio
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting rights
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on X86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready; it fixes intermittently large statistic values for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Tested perfmon patch - Jieqiang
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to be released soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with IPsec traffic at 90% line rate with QAT card

5/17/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Tina Tsou
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it fixes the issue
      • VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045; results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on X86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready; it fixes intermittently large statistic values for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to be released soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage

4/5/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Tina Tsou
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it fixes the issue
      • VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045; results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on X86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready; it fixes intermittently large statistic values for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to be released soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)

3/15/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it fixes the issue
      • VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045; results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on X86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready; it fixes intermittently large statistic values for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to be released soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve IOMMU issue for mlx NIC when used with QAT card
            • Benchmark IPsec test case with QAT card/mlx NIC (single-core/multi-core)

3/1/2022

  • Attendees
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it fixes the issue
      • VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045; results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Paperwork for shipment is done
      • Build servers will arrive at the end of Jan
      • Performance servers will arrive in Feb
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready; it fixes intermittently large statistic values for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

1/25/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it fixes the issue
      • VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045; results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Paperwork for shipment is done
      • Build servers will arrive at the end of Jan
      • Performance servers will arrive in Feb
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready; it fixes intermittently large statistic values for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

1/18/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it fixes the issue
      • VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045; results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready; it fixes intermittently large statistic values for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

1/11/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it fixes the issue
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045; results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
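
The build-server sizing figures noted above (256G RAM, about 16G of memory per job) imply a rough ceiling on concurrent jobs. A minimal sketch of that arithmetic follows; the function name and the reserved_gb parameter are hypothetical and not part of any FD.io tooling.

```python
def max_concurrent_jobs(total_ram_gb, per_job_gb, reserved_gb=0):
    """Rough upper bound on how many build jobs fit in RAM at the same time.

    reserved_gb is an assumed allowance for the OS and other services;
    the minutes do not state one, so it defaults to 0.
    """
    if per_job_gb <= 0:
        raise ValueError("per_job_gb must be positive")
    usable_gb = max(total_ram_gb - reserved_gb, 0)
    return usable_gb // per_job_gb


if __name__ == "__main__":
    # Figures from the minutes: 256G RAM, about 16G per job.
    print(max_concurrent_jobs(256, 16))
```

Memory is only one constraint; CPU count and disk space would lower the practical limit.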

12/14/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it will fix the issue
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
    • VPP IPv6 Benchmarking and Profiling - Jieqiang
      • IPv6 profiling
        • No perf bump for lookup_x2 function in FD.io Gerrit
        • Try Mellanox NICs for IPv6 routing tests
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

12/07/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried to increase the timeout to see if it will fix the issue
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
    • VPP IPv6 Benchmarking and Profiling - Jieqiang
      • IPv6 profiling
        • No perf bump for lookup_x2 function in FD.io Gerrit
        • Try Mellanox NICs for IPv6 routing tests
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

11/30/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
    • VPP IPv6 Benchmarking and Profiling - Jieqiang
      • IPv6 profiling
        • No perf bump for lookup_x2 function in FD.io Gerrit
        • Try Mellanox NICs for IPv6 routing tests
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach

11/23/2021

  • Attendees
    • Tianyu Li
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
    • New Arm servers shipment to the FD.io lab
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation
    • VPP IPv6 Benchmarking and Profiling
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Performance degradation with quad loop unrolling applied on ip6_lookup_inline
        • Patch the current kernel to enable perfmon plugin on VPP
        • Need to check performance for IPv6 subnet routing
        • Try Mellanox NICs for IPv6 routing tests
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach


11/16/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
          • Enable VPP device testing per patch
    • New Arm servers shipment to the FD.io lab
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation
    • VPP IPv6 Benchmarking and Profiling
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Performance degradation with quad loop unrolling applied on ip6_lookup_inline
        • Patch the current kernel to enable perfmon plugin on VPP
        • Need to check performance for IPv6 subnet routing
        • Try Mellanox NICs for IPv6 routing tests
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach

11/09/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tina Tsou
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
      • VPP IPv4 fragmentation
    • VPP IPv6 Benchmarking and Profiling
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Performance degradation with quad loop unrolling applied on ip6_lookup_inline
        • Patch the current kernel to enable perfmon plugin on VPP
        • Need to check performance for IPv6 subnet routing
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach

11/02/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Tina Tsou
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
    • VPP IPv6 Benchmarking and Profiling
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Performance degradation with quad loop unrolling applied on ip6_lookup_inline
        • Patch the current kernel to enable perfmon plugin on VPP
        • Need to check performance for IPv6 subnet routing
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach


10/26/2021

  • Attendees
    • Juraj Linkes
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 2106 testing partial finished. 21.01.1 ongoing, should be done sometime next week. - closed
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • Inbound IPsec: reproduced and need to investigate - Juraj
          • Flow cache with 1, 10 SPD entries slower, still investigating. Manual test local vs CSIT have different result on 1-10 SPD policies.
        • IPsec SPD input/output case ongoing
          • Adding IPsec SPD outbound test cases 64B 1, 100 and 1k SPD entries, 1, 2, 4 cores, on tx2 testbed - clarified
            • Flow cache on and off cases need to be measured.
          • L2 BD 20k test cases execute time too long, removed on taishan.
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
        • 3n-tsh testbed unreachable, investigating right now - Juraj
          • TG firmware is being upgraded
          • Server unreachable due to firmware & driver update - resolved - update all done
        • Release testing for 21.10 starts
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 version by making iavf PMD as default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: different error related to moving VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Resulting in the same failure as before, only happens on AArch64 platform
          • Still see /dev/vfio resource busy error after Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Narrow down whether some of those arguments are the reason behind this and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
            • race condition occurs
            • try mounting a part of /dev/vfio to see if issue can be resolved
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and looks good right now
              • Addressed comments, waiting for Peter's review.
              • Will enable voting right soon after the patch gets merged
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple link between DUT for testing LACP.
      • Two advanced servers are in plan to ship
        • Need the vexxhost team to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket to the DevOps team
        • New servers are in the procurement process
        • Plan to replace old thunderx1 build servers with more advanced Arm servers
            • CPU not fully utilized on Arm, need further investigation
    • Intel NIC firmware upgrade on Arm - not supported
    • VPN access request to FD.io Arm servers
      • Lijian/Jieqiang has got VPN access now
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run for FPGA
            • Enable DMC 620 more close to real system, but performance will drop
            • Build a system using VPP memif and pktgen
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon & understand two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
        • Plan to try quad loop unrolling for ip6_lookup_inline function
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Try to use ansible to deploy VPP automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach
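
The /dev/vfio item above narrows the "Device or resource busy" error down to the container argument --volume /dev/vfio:/dev/vfio and proposes mounting only part of /dev/vfio. The following is a minimal sketch of that idea, not the content of the CSIT patch: the function name is hypothetical, and including the /dev/vfio/vfio control node alongside the per-group nodes is an assumption about what a container needs.

```python
def vfio_volume_args(iommu_groups):
    """Build --volume arguments for individual VFIO nodes.

    Instead of bind-mounting the whole /dev/vfio directory, mount only the
    group nodes (/dev/vfio/<N>) that the container actually uses.
    Assumption: the shared control node /dev/vfio/vfio is mounted as well.
    """
    nodes = ["/dev/vfio/vfio"]
    # De-duplicate and sort so the resulting argument list is stable.
    nodes += [f"/dev/vfio/{group}" for group in sorted(set(iommu_groups))]

    args = []
    for node in nodes:
        args += ["--volume", f"{node}:{node}"]
    return args


if __name__ == "__main__":
    # Group 141 is the one named in the error message in the minutes.
    print(" ".join(vfio_volume_args([141])))
```

On Linux, the IOMMU group of a PCI device can be read from the /sys/bus/pci/devices/<address>/iommu_group symlink, which is one way to obtain the group numbers passed in here.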

10/19/2021

  • Attendees
    • Juraj Linkes
    • Tianyu Li
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 version by making iavf PMD as default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: different error related to moving VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Resulting in the same failure as before, only happens on AArch64 platform
          • Still see /dev/vfio resource busy error after Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Narrow down whether some of those arguments are the reason behind this and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
            • race condition occurs
            • try mounting a part of /dev/vfio to see if issue can be resolved
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and looks good right now
              • Will enable voting right soon after the patch gets merged
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple link between DUT for testing LACP.
      • Two advanced servers are in plan to ship
        • Need the vexxhost team to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket to the DevOps team
        • New servers are in the procurement process
        • Plan to replace old thunderx1 build servers with more advanced Arm servers
            • CPU not fully utilized on Arm, need further investigation
    • Intel NIC firmware upgrade on Arm - not supported
    • VPN access request to FD.io Arm servers
      • Lijian/Jieqiang has got VPN access now
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run for FPGA
            • Enable DMC 620 more close to real system, but performance will drop
            • Build a system using VPP memif and pktgen
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon & understand two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
        • Plan to try quad loop unrolling for ip6_lookup_inline function
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Try to use ansible to deploy VPP automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for that before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach
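
For the IPv6 route item above (wrong routes reported with mask 123-128), the expected prefix for each mask length can be computed independently of VPP and compared against the CLI output. This is a minimal sketch using Python's standard ipaddress module; the sample address 2001:db8::ff and the helper name expected_prefix are illustrative assumptions, not taken from the minutes.

```python
import ipaddress


def expected_prefix(addr: str, prefix_len: int) -> str:
    """Return the network that a route for addr/prefix_len should cover.

    strict=False lets us pass an address with host bits set; the
    module clears them, which is what a correct route entry shows.
    """
    net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
    return str(net)


if __name__ == "__main__":
    # Hypothetical sample address, chosen so the low byte has all bits set.
    for plen in range(123, 129):
        print(plen, expected_prefix("2001:db8::ff", plen))
```

Any mismatch between these reference values and the routes shown by VPP for the same input would narrow the problem to the mask handling for those prefix lengths.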

10/12/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Govindarajan Mohandoss
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware, etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Resulting in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Checked whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
            • race condition occurs
            • try mounting a part of /dev/vfio to see if issue can be resolved
            • Talked with Peter, Juraj is working on prototype of mounting part of /dev/vfio
            • x86 VPP device job is fine because its firmware & driver are old
            • Arm VPP device servers have updated drivers; VLAN stripping is not allowed, and VLAN configuration cannot be removed from lab view.
            • only performance testbeds have NIC drivers updated
            • maintainer doesn't want to a option from vpp config
            • may need to check whether x86 has the same issue with the same driver version before reaching out to the Intel folks
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
        • Need the Vexxhost team to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
        • New servers are in the procurement process
        • Plan to replace old thunderx1 build servers with more advanced Arm servers
            • CPU not fully utilized on Arm, need further investigation
    • Intel NIC firmware upgrade on Arm - not supported
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run on FPGA
            • Enabling DMC 620 is closer to a real system, but performance will drop
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon & understand two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for that before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach
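
The IPsec unit-test item above mentions flow cache testing that covers hash collisions and stale entry overwrites. The sketch below illustrates those two cases on a generic direct-mapped cache. It is not the VPP implementation: the class, its fields, and the epoch-based invalidation scheme are assumptions made for illustration only.

```python
class FlowCache:
    """Toy direct-mapped flow cache (hypothetical, not VPP code)."""

    def __init__(self, n_buckets=8):
        self.buckets = [None] * n_buckets
        # Assumed scheme: the epoch is bumped whenever policies change,
        # so entries cached under an older epoch are treated as stale.
        self.epoch = 0

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def invalidate_all(self):
        self.epoch += 1

    def insert(self, key, policy):
        # A colliding or stale entry in the bucket is simply overwritten.
        self.buckets[self._index(key)] = (key, policy, self.epoch)

    def lookup(self, key):
        entry = self.buckets[self._index(key)]
        if entry is None:
            return None
        cached_key, policy, epoch = entry
        # Miss on a hash collision (different key) or on a stale epoch.
        if cached_key != key or epoch != self.epoch:
            return None
        return policy
```

With eight buckets, the integer keys 1 and 9 map to the same bucket, which makes the collision path easy to exercise; a miss would fall back to the full policy lookup in a real data path.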

09/28/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware, etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Resulting in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Checked whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
            • race condition occurs
            • try mounting a part of /dev/vfio to see if issue can be resolved
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
        • Need the Vexxhost team to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
        • New servers are in the procurement process
        • Plan to replace old thunderx1 build servers with more advanced Arm servers
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run on FPGA
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon & understand two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for that before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach
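
The VPP Device notes above identify --volume /dev/vfio:/dev/vfio as the trigger for the resource busy error and propose mounting only part of /dev/vfio. One way that idea could look is sketched below, assuming the container only needs the VFIO control node plus its own IOMMU group nodes. The helper name and the choice of docker's --device flag are assumptions for illustration, not the prototype discussed in the meeting.

```python
def vfio_docker_args(iommu_groups, whole_dir=False):
    """Build docker run arguments exposing VFIO to a container.

    whole_dir=True reproduces the bind mount of the full directory
    that the minutes report as problematic. Otherwise only the
    control node and the listed IOMMU group nodes are passed through.
    """
    if whole_dir:
        return ["--volume", "/dev/vfio:/dev/vfio"]
    args = ["--device", "/dev/vfio/vfio"]
    for group in iommu_groups:
        args += ["--device", f"/dev/vfio/{group}"]
    return args


if __name__ == "__main__":
    # Group 141 is the one named in the error message quoted above.
    print(" ".join(vfio_docker_args([141])))
```

Whether per-group exposure avoids the race would still need to be confirmed on the tx2 testbed.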

09/14/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 21.06 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate - Juraj
                • Learn more about RFC and need time to understand more - on hold - waiting for Neale's response
          • Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results with 1-10 SPD policies.
        • Release testing done.
          • A few more jobs run for release 21.06 and will be finished soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware, etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Resulting in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Checked whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
        • Need the Vexxhost team to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run on FPGA
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon & understand two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied instead of referenced from src port to all dst ports
        • Direct/Indirect mbuf for VPP multicast testing
        • Try IPv4 multicasting & L2 flood testing which works fine
        • ip4-replicate node in IPv4 multicasting/l2-flood node in L2 flood testing
          • show mbuf is copied so that ref_cnt will always be one
            • DPDK 21.08 has the patches; need to verify on VPP
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for that before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach
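
The mbuf-fast-free item above tests the refcnt > 1 scenario because DPDK documents the MBUF_FAST_FREE TX offload as requiring that, per queue, all mbufs come from the same mempool and have refcnt = 1. The sketch below models that precondition in plain Python. The Mbuf class and fast_free_allowed helper are hypothetical stand-ins for illustration and do not call DPDK or VPP.

```python
from dataclasses import dataclass


@dataclass
class Mbuf:
    """Minimal stand-in for an mbuf: owning pool name and reference count."""
    pool: str
    refcnt: int = 1


def fast_free_allowed(queue_mbufs):
    """Return True if every mbuf on the queue meets the fast-free precondition.

    The precondition modelled here: a single mempool for the whole queue
    and a reference count of exactly one on every mbuf.
    """
    if not queue_mbufs:
        return True
    first_pool = queue_mbufs[0].pool
    return all(m.refcnt == 1 and m.pool == first_pool for m in queue_mbufs)
```

This matches the observation recorded above: because the ip4-replicate and l2-flood nodes copy buffers, the reference count stays at one and the multicast and flood tests pass.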

09/07/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 21.06 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate - Juraj
                • Learn more about RFC and need time to understand more - on hold - waiting for Neale's response
          • Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results with 1-10 SPD policies.
        • Release testing done.
          • A few more jobs run for release 21.06 and will be finished soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware, etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Resulting in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
        • Need the Vexxhost team to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run on FPGA
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon & understand two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied instead of referenced from src port to all dst ports
        • Direct/Indirect mbuf for VPP multicast testing
        • Try IPv4 multicasting & L2 flood testing which works fine
        • ip4-replicate node in IPv4 multicasting/l2-flood node in L2 flood testing
          • show mbuf is copied so that ref_cnt will always be one
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for that before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach
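
The VPP Prefetch item above proposes refactoring prefetch according to the CPU's actual cache line size. The effect of the line size on how many prefetches a region needs is sketched below, assuming the region starts on a cache line boundary. The helper name and the 64-byte and 128-byte sizes are illustrative assumptions, not values taken from the minutes.

```python
def prefetch_offsets(n_bytes, cache_line_size):
    """Return the byte offsets at which one prefetch per cache line is issued.

    Assumes the region starts cache-line aligned, so the number of
    offsets equals ceil(n_bytes / cache_line_size).
    """
    if cache_line_size <= 0 or n_bytes < 0:
        raise ValueError("sizes must be positive")
    return list(range(0, n_bytes, cache_line_size))


if __name__ == "__main__":
    # The same 128-byte region needs two prefetches with 64-byte lines
    # but only one with 128-byte lines.
    print(prefetch_offsets(128, 64), prefetch_offsets(128, 128))
```

Code that hard-codes one line size would issue redundant prefetches on a CPU with larger lines, or leave part of the region unprefetched on a CPU with smaller ones.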

08/31/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 21.06 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate - Juraj
                • Learn more about RFC and need time to understand more - on hold - waiting for Neale's response
          • Flow cache with 1, 10 SPD entries slower, still investigating. Local manual tests and CSIT give different results with 1-10 SPD policies.
        • Release testing done.
          • A few more jobs run for release 21.06 and will be finished soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware, etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Resulting in the same failure as before; only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • config option set to N, /dev/vfio device or resource busy error
          • config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
            • The longer the server runs, the more test cases fail
          • Next to do
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Juraj modified script to reproduce the issue - Lijian will try it locally
              • Also seen on an Intel QAT card by Zach
            • Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
            • Lijian has slightly different firmware and driver versions
            • Tried Mellanox card (rdma driver) multiple times - did not see the issue that happens on the XL710 NIC - Juraj
            • Lijian can use Juraj's script to reproduce the issue on local tx2 server
              • Reducing the numa buffer allocation size resolves this issue
              • Observed from the error log of numa buffer allocation
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
        • Need the Vexxhost team to confirm Ethernet / power cable type info.
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run on FPGA
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon & understand two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied instead of referenced from src port to all dst ports
        • Direct/Indirect mbuf for VPP multicast testing
        • Try IPv4 multicasting & L2 flood testing which works fine
        • ip4-replicate node in IPv4 multicasting/l2-flood node in L2 flood testing
          • show mbuf is copied so that ref_cnt will always be one
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done

08/24/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 21.06 testing partially finished. 21.01.1 testing ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate - Juraj
                • Need more time to study the RFC - on hold - waiting for Neale's response
          • Flow cache is slower with 1 and 10 SPD entries, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
        • Release testing done.
          • A few more jobs are running for release 21.06 and will finish soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by the XL710 i40e firmware or kernel module.
          • Tried to reproduce with a different set of firmware etc., but the issues persist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled for deprecation.
            • This is fixed in DPDK 21.05, which makes the iavf PMD the default.
          • Use the iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before, only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • Option set to N: /dev/vfio 'Device or resource busy' error
          • Option set to Y & iommu_passthrough = 1: IP packet Rx timeout
            • The longer the server runs, the more test cases fail
          • Next steps
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Juraj modified the script to reproduce the issue - Lijian will try it locally
              • Also seen with an Intel QAT card by Zach
            • Will try to reproduce this issue on a local TX2 with the 20.04 distro - Lijian
            • Lijian has slightly different firmware and driver versions
            • Tried a Mellanox card (rdma driver) multiple times - the issue seen on the XL710 NIC did not occur - Juraj
            • Lijian can use Juraj's script to reproduce the issue on a local TX2 server
              • Reducing the NUMA buffer allocation size resolves this issue
              • Observed in the NUMA buffer allocation error log
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
        • Need the Vexxhost team to confirm the Ethernet / power cable types.
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang have VPN access now
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see whether cycle counts improve
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' produces wrong IPv6 routes for prefix lengths 123-128
      • IPv6 profiling
        • Hotspot functions - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon and study the two node functions
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied instead of referenced from the src port to all dst ports
        • Will try L2 flood test case & understand VPP/multicast code
        • Direct/Indirect mbuf for VPP multicast testing
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cache line size.
        • A native build with 64B cache line size on Arm may need code changes.
        • Current VPP does not support 64B cache line size compilation for Arm images.
        • Prefetch issues in the current VPP code base
          • Issue 1: support 128B/64B cache line sizes in the Arm image
          • Issue 2: prefetch 'overflow' for native builds
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
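
The mbuf-fast-free item above depends on a DPDK precondition: with DEV_TX_OFFLOAD_MBUF_FAST_FREE enabled, the application must guarantee that, per Tx queue, all mbufs come from the same mempool and have a reference count of 1. The Python model below is a minimal sketch of that check, not VPP or DPDK code. The `Mbuf` class, the `fast_free_eligible` function, and the pool name `numa0` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Mbuf:
    """Hypothetical stand-in for a DPDK mbuf; only the fields the check needs."""
    pool: str    # name of the mempool the buffer came from
    refcnt: int  # reference count (1 means a single owner)


def fast_free_eligible(tx_burst: Iterable[Mbuf]) -> bool:
    """Return True if every mbuf has refcnt 1 and all share one mempool."""
    pools = set()
    for m in tx_burst:
        if m.refcnt != 1:
            return False
        pools.add(m.pool)
    return len(pools) <= 1


# Copied buffers (the behavior GDB showed for multicast): each copy has refcnt 1.
copied = [Mbuf("numa0", 1), Mbuf("numa0", 1)]
# A referenced buffer shared by two ports would carry refcnt 2 instead.
shared = [Mbuf("numa0", 2)]
```

Under this model the copied burst qualifies for fast free, while the shared buffer does not, which is why the refcnt > 1 scenario is worth testing separately.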

08/17/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 21.06 testing partially finished. 21.01.1 testing ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate - Juraj
                • Need more time to study the RFC
          • Flow cache is slower with 1 and 10 SPD entries, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
        • Release testing ongoing
          • A few more jobs are running for release 21.06 and will finish soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by the XL710 i40e firmware or kernel module.
          • Tried to reproduce with a different set of firmware etc., but the issues persist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled for deprecation.
            • This is fixed in DPDK 21.05, which makes the iavf PMD the default.
          • Use the iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before, only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • Option set to N: /dev/vfio 'Device or resource busy' error
          • Option set to Y & iommu_passthrough = 1: IP packet Rx timeout
            • The longer the server runs, the more test cases fail
          • Next steps
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Juraj modified the script to reproduce the issue - Lijian will try it locally
              • Also seen with an Intel QAT card by Zach
            • Will try to reproduce this issue on a local TX2 with the 20.04 distro - Lijian
            • Lijian has slightly different firmware and driver versions
            • Tried a Mellanox card (rdma driver) multiple times - the issue seen on the XL710 NIC did not occur - Juraj
            • Lijian can use Juraj's script to reproduce the issue on a local TX2 server
              • Reducing the NUMA buffer allocation size resolves this issue
              • Observed in the NUMA buffer allocation error log
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang have VPN access now
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see whether cycle counts improve
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' produces wrong IPv6 routes for prefix lengths 123-128
      • IPv6 profiling
        • Hotspot functions - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon and study the two node functions
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied instead of referenced from the src port to all dst ports
        • Will try L2 flood test case & understand VPP/multicast code
        • Direct/Indirect mbuf for VPP multicast testing
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cache line size.
        • A native build with 64B cache line size on Arm may need code changes.
        • Current VPP does not support 64B cache line size compilation for Arm images.
        • Prefetch issues in the current VPP code base
          • Issue 1: support 128B/64B cache line sizes in the Arm image
          • Issue 2: prefetch 'overflow' for native builds
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
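
The cache line size discussion under VPP Prefetch above can be made concrete with a little arithmetic: the number of cache lines a byte range touches depends on the line size, so a fixed number of prefetches covers half as many bytes on a 64B-line CPU as on a 128B-line one. The sketch below is Python for illustration only. `lines_touched` is a hypothetical helper and the 256-byte range is an arbitrary example, not a value taken from VPP.

```python
def lines_touched(addr: int, size: int, line_size: int) -> int:
    """Number of cache lines overlapped by the byte range [addr, addr + size)."""
    if size <= 0:
        return 0
    first = addr // line_size
    last = (addr + size - 1) // line_size
    return last - first + 1


# Compare how many prefetches a 256-byte region needs for each line size.
for line_size in (64, 128):
    print(line_size, lines_touched(0, 256, line_size))
```

An unaligned range can also straddle a line boundary on one line size but not the other, which is one reason a single hard-coded assumption does not suit both builds.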

08/10/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 21.06 testing partially finished. 21.01.1 testing ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT maintainers approved adding new IPsec policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate
          • Flow cache is slower with 1 and 10 SPD entries, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
        • Release testing ongoing
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863

    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by the XL710 i40e firmware or kernel module.
          • Tried to reproduce with a different set of firmware etc., but the issues persist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled for deprecation.
            • This is fixed in DPDK 21.05, which makes the iavf PMD the default.
          • Use the iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before, only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • Option set to N: /dev/vfio 'Device or resource busy' error
          • Option set to Y & iommu_passthrough = 1: IP packet Rx timeout
            • The longer the server runs, the more test cases fail
          • Next steps
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Juraj modified the script to reproduce the issue - Lijian will try it locally
              • Also seen with an Intel QAT card by Zach
            • Will try to reproduce this issue on a local TX2 with the 20.04 distro - Lijian
            • Lijian has slightly different firmware and driver versions
            • Tried a Mellanox card (rdma driver) multiple times - the issue seen on the XL710 NIC did not occur - Juraj
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
    • VPN access request to FD.io Arm servers
      • Lijian has VPN access now
      • Juraj signed Jieqiang's key
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see whether cycle counts improve
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' produces wrong IPv6 routes for prefix lengths 123-128. CLI issue only, CSIT's Python API works fine.
      • Internal patch to resolve this issue under review - upstreamed
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cache line size.
        • A native build with 64B cache line size on Arm may need code changes.
        • Current VPP does not support 64B cacheline size compilation for Arm images.
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare SIMD improvement, cache for lookup
      • 4x loop unrolling decreases performance
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
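
As a reference point for the IPv6 route issue above, the expected network address for prefix lengths 123-128 can be computed independently with Python's standard `ipaddress` module and compared against the CLI output. This is only a cross-check sketch. The `expected_route` helper is a hypothetical name, and `2001:db8::ff` is a documentation-range address chosen here, not one taken from the meeting.

```python
import ipaddress


def expected_route(addr: str, prefix_len: int) -> str:
    """Clear the host bits of addr and return the canonical network/prefix string."""
    net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
    return str(net)


# Walk the prefix lengths mentioned in the minutes.
for plen in range(123, 129):
    print(plen, expected_route("2001:db8::ff", plen))
```

With `strict=False` the module masks the host bits rather than raising an error, so the result is the route a correct implementation would be expected to install.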

08/03/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 21.06 testing partially finished. 21.01.1 testing ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT maintainers approved adding new IPsec policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate
          • Flow cache is slower with 1 and 10 SPD entries, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
        • Release testing ongoing
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863

    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by the XL710 i40e firmware or kernel module.
          • Tried to reproduce with a different set of firmware etc., but the issues persist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled for deprecation.
            • This is fixed in DPDK 21.05, which makes the iavf PMD the default.
          • Use the iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before, only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • Option set to N: /dev/vfio 'Device or resource busy' error
          • Option set to Y & iommu_passthrough = 1: IP packet Rx timeout
            • The longer the server runs, the more test cases fail
          • Next steps
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Also seen with an Intel QAT card by Zach
            • Will try to reproduce this issue on a local TX2 with the 20.04 distro - Lijian
            • Will try a Mellanox card to see if the same issue occurs - Juraj
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
    • VPN access request to FD.io Arm servers
      • Lijian has VPN access now
      • Juraj signed Jieqiang's key
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see whether cycle counts improve
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' produces wrong IPv6 routes for prefix lengths 123-128
      • Internal patch to resolve this issue under review
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cache line size.
        • A native build with 64B cache line size on Arm may need code changes.
        • Current VPP does not support 64B cacheline size compilation for Arm images.
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare SIMD improvement, cache for lookup
      • 4x loop unrolling decreases performance
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
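
The flow cache unit tests mentioned above cover hash collisions and stale entry overwrites. The toy Python model below sketches what those two cases look like in a direct-mapped cache. It is an illustration under stated assumptions, not the VPP implementation: the `FlowCache` class, its epoch-based invalidation, and the policy strings are all hypothetical.

```python
from typing import Dict, Hashable, Optional, Tuple


class FlowCache:
    """Toy direct-mapped flow cache: one entry per bucket, overwritten on collision."""

    def __init__(self, n_buckets: int) -> None:
        self.n_buckets = n_buckets
        self.epoch = 0  # assumed mechanism: bumped whenever the policy set changes
        self.buckets: Dict[int, Tuple[Hashable, str, int]] = {}

    def _index(self, key: Hashable) -> int:
        return hash(key) % self.n_buckets

    def add(self, key: Hashable, policy: str) -> None:
        # A colliding key simply replaces whatever occupied the bucket.
        self.buckets[self._index(key)] = (key, policy, self.epoch)

    def lookup(self, key: Hashable) -> Optional[str]:
        entry = self.buckets.get(self._index(key))
        if entry is None:
            return None
        cached_key, policy, epoch = entry
        # Miss on a collision (different key) or a stale entry (old epoch).
        if cached_key != key or epoch != self.epoch:
            return None
        return policy

    def policy_changed(self) -> None:
        """Invalidate every cached entry without touching the buckets."""
        self.epoch += 1
```

With 4 buckets, the integer keys 1 and 5 land in the same bucket, so adding one evicts the other, and calling `policy_changed()` turns every remaining entry into a miss until it is re-added.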

07/27/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by the XL710 i40e firmware or kernel module.
          • Tried to reproduce with a different set of firmware etc., but the issues persist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled for deprecation.
            • This is fixed in DPDK 21.05, which makes the iavf PMD the default.
          • Use the iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Results in the same failure as before, only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Random issue, happening more frequently on Arm
              • Not seen recently in CI or in manual runs.
        • Unexpected scapy timeout issue: packet drops or slowness?
            • vfio-pci driver may be the root cause - bind/unbind
      • Connection issue between Jenkins and the build executor in FD.io lab
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
    • VPN access request to FD.io Arm servers
      • Lijian has VPN access now
      • Juraj signed Jieqiang's key
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see whether cycle counts improve
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cache line size.
        • A native build with 64B cache line size on Arm may need code changes.
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare SIMD improvement, cache for lookup
      • 4x loop unrolling decreases performance
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach

07/20/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Results in the same failure as before, only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Random issue, happening more frequently on Arm
            • vfio-pci driver may be the root cause - bind/unbind
      • Connection issue between Jenkins and the build executor in FD.io lab
    • Shipment of new advanced server to the FD.io lab
      • Two advanced servers are in plan to ship
    • VPN access request to FD.io Arm servers
      • Lijian has got VPN access now
      • Juraj signed Jieqiang's key
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
    • VPP mbuf-fast-free tx offload
      • Performance improvement for IPv4 routing test cases using vector path
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach

07/13/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Expected to be merged soon
          • Flow cache with 1, 10 SPD entries is slower, still investigating. Manual local tests and CSIT give different results with 1-10 SPD policies.
            • Hugepage size, numa-node, core isolation etc. may need to check.
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms - the 4-core container case failed on x86 and Arm - fixed and passing.
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
        • Performance drop seen in 21.06 vs 21.01: https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
          • May need to check VM and IPsec cases
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • The VPP community is actively responding to this issue. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a script to reproduce the issue was provided. Intel is busy with other work, debugging in progress.
            • Intel folks debugged the issue and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. After switching to the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Results in the same failure as before, only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Random issue, happening more frequently on Arm
            • vfio-pci driver may be the root cause - bind/unbind
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
    • VPN access request to FD.io Arm servers
      • Will remind Machiek to sign Lijian's GPG public key.
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)


07/06/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Expected to be merged soon
          • Flow cache with 1, 10 SPD entries is slower, still investigating. Manual local tests and CSIT give different results with 1-10 SPD policies.
            • Hugepage size, numa-node, core isolation etc. may need to check.
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms - the 4-core container case failed on x86 and Arm - fixed and passing.
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
        • Performance drop seen in 21.06 vs 21.01: https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
          • May need to check VM and IPsec cases
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • The VPP community is actively responding to this issue. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a script to reproduce the issue was provided. Intel is busy with other work, debugging in progress.
            • Intel folks debugged the issue and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. After switching to the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Results in the same failure as before, only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Random issue, happening more frequently on Arm
            • vfio-pci driver may be the root cause - bind/unbind
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
    • VPN access request to FD.io Arm servers
      • Will remind Machiek to sign Lijian's GPG public key.
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
        • Repeat the same test on Ampere server - PMU cache-miss less for write always
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare simd improvement, cache to look up
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
          • Made some NEON changes, seeing some microbenchmark improvement
      • Investigate CSIT case
        • No classify test case in CSIT. - Jieqiang
        • There may be a CSIT case named iacldstbase
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
          • Add support for VPP aarch64 docker image build
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics have been updated by the maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach
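
The IPsec items above describe replacing the linear SPD search with a hash-based flow cache, tested for hash collisions and stale entry overwrites. The following is a minimal sketch of that idea, not the code in the Gerrit patches: every demo_* name, the 5-tuple key layout, the slot count, and the epoch-based staleness check are assumptions made for illustration. A direct-mapped cache is consulted first; on a miss, a collision, or a stale slot, it falls back to the linear search and overwrites the slot.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CACHE_SLOTS 64 /* hypothetical size, must be a power of two */

typedef struct { uint32_t src, dst; uint16_t sport, dport; uint8_t proto; } demo_key_t;
typedef struct { uint32_t lo, hi; int action; } demo_policy_t; /* matches dst in [lo, hi] */
typedef struct { demo_key_t key; uint32_t epoch; int policy_index; int valid; } demo_slot_t;

typedef struct {
  const demo_policy_t *policies;
  int n_policies;
  uint32_t epoch; /* bumped whenever the policy list changes */
  demo_slot_t slots[DEMO_CACHE_SLOTS];
  unsigned hits, misses;
} demo_spd_t;

/* Field-by-field compare, so struct padding never matters. */
static int demo_key_eq (const demo_key_t *a, const demo_key_t *b)
{
  return a->src == b->src && a->dst == b->dst && a->sport == b->sport
         && a->dport == b->dport && a->proto == b->proto;
}

/* FNV-1a-style mixing, applied per 32-bit word rather than per byte. */
static uint32_t demo_hash (const demo_key_t *k)
{
  uint32_t w[4] = { k->src, k->dst, ((uint32_t) k->sport << 16) | k->dport, k->proto };
  uint32_t h = 2166136261u;
  for (int i = 0; i < 4; i++) { h ^= w[i]; h *= 16777619u; }
  return h;
}

/* The slow path: first matching policy wins, -1 if none matches. */
static int demo_linear_lookup (const demo_spd_t *s, const demo_key_t *k)
{
  for (int i = 0; i < s->n_policies; i++)
    if (k->dst >= s->policies[i].lo && k->dst <= s->policies[i].hi)
      return i;
  return -1;
}

/* Fast path first; a colliding or stale slot is treated as a miss and overwritten. */
static int demo_lookup (demo_spd_t *s, const demo_key_t *k)
{
  demo_slot_t *slot = &s->slots[demo_hash (k) & (DEMO_CACHE_SLOTS - 1)];
  if (slot->valid && slot->epoch == s->epoch && demo_key_eq (&slot->key, k))
    { s->hits++; return slot->policy_index; }
  s->misses++;
  int idx = demo_linear_lookup (s, k);
  slot->key = *k; slot->epoch = s->epoch; slot->policy_index = idx; slot->valid = 1;
  return idx;
}
```

In this sketch the cost of a hit is independent of the number of policies, while a miss pays for the hash plus the full linear search, which may be relevant to the observation that the cache can be slower with only 1 or 10 SPD entries.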

06/29/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Expected to be merged soon
          • Flow cache with 1, 10 SPD entries is slower, still investigating. Manual local tests and CSIT give different results with 1-10 SPD policies.
            • Hugepage size, numa-node, core isolation etc. may need to check.
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms - the 4-core container case failed on x86 and Arm - fixed and passing.
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
        • Performance drop seen in 21.06 vs 21.01: https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
          • May need to check VM and IPsec cases
    • VPP Path
      • Voting and working fine.
      • Community plans to drop the support for CentOS-8.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • The VPP community is actively responding to this issue. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a script to reproduce the issue was provided. Intel is busy with other work, debugging in progress.
            • Intel folks debugged the issue and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. After switching to the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Results in the same failure as before, only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Debugging
            • vfio-pci driver may be the root cause - bind/unbind
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
    • VPN access request to FD.io Arm servers
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
        • Repeat the same test on Ampere server - PMU cache-miss less for write always
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare simd improvement, cache to look up
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
          • Made some NEON changes, seeing some microbenchmark improvement
      • Investigate CSIT case
        • No classify test case in CSIT. - Jieqiang
        • There may be a CSIT case named iacldstbase
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
          • Add support for VPP aarch64 docker image build
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics have been updated by the maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach
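
The memif items above mention applying the C11 weak memory model to a shared-memory ring. A minimal sketch of the general technique follows, assuming a single-producer, single-consumer ring; it is not the memif ring layout, and demo_ring_t and the demo_ring_* functions are hypothetical names. The producer publishes a slot with a release store to head, and the consumer observes it with an acquire load, so the slot contents are visible before the index update on weakly ordered CPUs such as AArch64. Only acquire and release orderings are used here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8 /* hypothetical size, must be a power of two */

typedef struct {
  _Atomic uint16_t head; /* next slot to fill, written only by the producer */
  _Atomic uint16_t tail; /* next slot to drain, written only by the consumer */
  uint32_t slots[DEMO_RING_SIZE];
} demo_ring_t;

/* Producer side: returns 1 on success, 0 if the ring is full. */
static int demo_ring_put (demo_ring_t *r, uint32_t v)
{
  uint16_t head = atomic_load_explicit (&r->head, memory_order_acquire);
  /* Acquire pairs with the consumer's release store to tail, so the
     slot we are about to overwrite has really been read. */
  uint16_t tail = atomic_load_explicit (&r->tail, memory_order_acquire);
  if ((uint16_t) (head - tail) == DEMO_RING_SIZE)
    return 0;
  r->slots[head & (DEMO_RING_SIZE - 1)] = v;
  /* Release: the slot write above becomes visible before the new head. */
  atomic_store_explicit (&r->head, (uint16_t) (head + 1), memory_order_release);
  return 1;
}

/* Consumer side: returns 1 and fills *v on success, 0 if the ring is empty. */
static int demo_ring_get (demo_ring_t *r, uint32_t *v)
{
  uint16_t tail = atomic_load_explicit (&r->tail, memory_order_acquire);
  /* Acquire pairs with the producer's release store to head. */
  uint16_t head = atomic_load_explicit (&r->head, memory_order_acquire);
  if (head == tail)
    return 0;
  *v = r->slots[tail & (DEMO_RING_SIZE - 1)];
  /* Release: the slot read above completes before the slot is handed back. */
  atomic_store_explicit (&r->tail, (uint16_t) (tail + 1), memory_order_release);
  return 1;
}
```

Compared with sequentially consistent atomics, acquire/release pairs let the compiler avoid full barriers on Arm, which is the kind of cycle saving that 'show runtime' or perfmon could confirm.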

06/22/2021

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries
            • Expected to be merged soon
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms - the 4-core container case fails on x86 and Arm
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • Community plans to drop the support for CentOS-8.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • The VPP community is actively responding to this issue. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a script to reproduce the issue was provided. Intel is busy with other work, debugging in progress.
            • Intel folks debugged the issue and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. After switching to the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Results in the same failure as before, only happens on the AArch64 platform
            • vfio-pci driver may be the root cause
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
    • VPN access request to FD.io Arm servers
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
        • Repeat the same test on Ampere server
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
          • Made some NEON changes, seeing some microbenchmark improvement
      • Investigate CSIT case
        • No classify test case in CSIT. - Jieqiang
        • There may be a CSIT case named iacldstbase
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
          • Add support for VPP aarch64 docker image build
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics have been updated by the maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach

06/15/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
          • VPP exits with the IPsec startup config, try startup config from Zach's email - Juraj
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms - the 4-core container case fails on x86 and Arm
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • Community plans to drop the support for CentOS-8.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a script to reproduce the issue was provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. The fix will be ported to all the older VPP LTS release branches; currently, it is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms(Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly. - DaveW
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
        • Repeat the same test on Ampere server
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
          • Made some NEON changes; saw some microbenchmark improvement
      • investigate CSIT case
        • No classify test case in CSIT. - Jieqiang - maybe there is a CSIT case named iacldstbase
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics have been updated by the maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Waiting for review comments on outbound side before upstream to VPP
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach
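
The SPD flow cache item above (a hash-indexed cache in front of the linear SPD search, with unit tests for hash collisions and stale entry overwrites) can be illustrated with a minimal sketch. This is not the code in the Gerrit change: every type and function name below (flow_key_t, spd_rule_t, spd_t, spd_lookup, spd_set_rules), the hash function, the cache size, and the epoch-based invalidation scheme are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 8u /* power of two; kept tiny so collisions are easy to provoke */

/* Hypothetical, simplified flow key and policy rule. */
typedef struct { uint32_t src, dst; uint8_t proto; } flow_key_t;
typedef struct { uint32_t dst_lo, dst_hi; int action; } spd_rule_t;

typedef struct {
  flow_key_t key;
  int action;
  uint32_t epoch; /* policy epoch at the time the entry was written */
  int valid;
} cache_slot_t;

typedef struct {
  const spd_rule_t *rules;
  size_t n_rules;
  uint32_t epoch; /* bumped on every policy change */
  cache_slot_t slots[CACHE_SLOTS];
  unsigned hits, misses;
} spd_t;

/* Replace the rule set. Cached entries are not flushed; they become stale
   because their stored epoch no longer matches, and are overwritten lazily. */
static void spd_set_rules(spd_t *spd, const spd_rule_t *rules, size_t n)
{
  spd->rules = rules;
  spd->n_rules = n;
  spd->epoch++;
}

/* Slow path: first rule whose destination range contains the key wins. */
static int spd_linear_lookup(const spd_t *spd, const flow_key_t *k)
{
  for (size_t i = 0; i < spd->n_rules; i++)
    if (k->dst >= spd->rules[i].dst_lo && k->dst <= spd->rules[i].dst_hi)
      return spd->rules[i].action;
  return -1; /* no matching policy */
}

/* Illustrative hash only; a real data plane would use a stronger one. */
static uint32_t flow_hash(const flow_key_t *k)
{
  return ((k->src * 2654435761u) ^ (k->dst * 40503u) ^ k->proto) & (CACHE_SLOTS - 1);
}

static int key_equal(const flow_key_t *a, const flow_key_t *b)
{
  return a->src == b->src && a->dst == b->dst && a->proto == b->proto;
}

/* Fast path first: a hit needs a valid slot, a matching key and the current
   epoch. Anything else (empty slot, collision, stale entry) falls back to
   the linear search and overwrites the slot. */
static int spd_lookup(spd_t *spd, const flow_key_t *k)
{
  assert(spd != NULL && k != NULL);
  cache_slot_t *s = &spd->slots[flow_hash(k)];
  if (s->valid && s->epoch == spd->epoch && key_equal(&s->key, k)) {
    spd->hits++;
    return s->action;
  }
  spd->misses++;
  s->key = *k;
  s->action = spd_linear_lookup(spd, k);
  s->epoch = spd->epoch;
  s->valid = 1;
  return s->action;
}
```

In this sketch a policy change costs one counter increment rather than a cache flush, at the price of one extra comparison per lookup; whether the actual patch makes the same trade-off is not stated in these minutes.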


06/08/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
          • VPP exits with the IPsec startup config, try startup config from Zach's email - Juraj
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Work in progress.
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms.
    • VPP Path
      • Voting and working fine.
      • Community plans to drop the support for CentOS-8.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a script to reproduce the issue was provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. The fix will be ported to all the older VPP LTS release branches; currently, it is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results.
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms(Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
  • VPP
    • VPP default compiler on Arm platform
      • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
        • Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
          • No obvious performance improvement, keep the original default compiler
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always - Jieqiang
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
      • investigate CSIT case
        • No classify test case in CSIT. - Jieqiang
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics have been updated by the maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Waiting for review comments on outbound side before upstream to VPP
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach
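
The prefetch items above (read vs write prefetch, and refactoring prefetch per the CPU's actual cache line size) can be sketched as follows. This is an illustration under stated assumptions, not VPP's prefetch code: the function names lines_spanned, prefetch_range_read and prefetch_range_write are hypothetical, and the sketch relies on the GCC/Clang builtin __builtin_prefetch, whose second argument selects read (0) or write (1). The point is that the number of prefetches needed to cover a buffer depends on the line size passed in, so a hard-coded line size can issue too many or too few.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of cache lines touched by [addr, addr + len) for a given line size.
   The line size must be a non-zero power of two. */
static inline size_t lines_spanned(uintptr_t addr, size_t len, size_t line)
{
  assert(line != 0 && (line & (line - 1)) == 0);
  if (len == 0)
    return 0;
  uintptr_t first = addr & ~(uintptr_t)(line - 1);
  uintptr_t last = (addr + len - 1) & ~(uintptr_t)(line - 1);
  return (size_t)((last - first) / line) + 1;
}

/* Issue one read prefetch per cache line covering the buffer.
   Returns the number of prefetches issued. */
static inline size_t prefetch_range_read(const void *p, size_t len, size_t line)
{
  size_t n = lines_spanned((uintptr_t)p, len, line);
  const char *c = (const char *)((uintptr_t)p & ~(uintptr_t)(line - 1));
  for (size_t i = 0; i < n; i++)
    __builtin_prefetch(c + i * line, 0 /* read */, 3 /* keep in all cache levels */);
  return n;
}

/* Same as above, but hints that the lines will be written. */
static inline size_t prefetch_range_write(const void *p, size_t len, size_t line)
{
  size_t n = lines_spanned((uintptr_t)p, len, line);
  const char *c = (const char *)((uintptr_t)p & ~(uintptr_t)(line - 1));
  for (size_t i = 0; i < n; i++)
    __builtin_prefetch(c + i * line, 1 /* write */, 3 /* keep in all cache levels */);
  return n;
}
```

For example, a 256-byte region starting on a line boundary needs four prefetches with 64-byte lines but only two with 128-byte lines, which is the kind of difference the refactoring item is concerned with.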


06/01/2021

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (Policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
              • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
              • IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
            • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
            • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Work in progress.
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate cabling issue on Taishan performance test-bed - resolved.
          • Some container cases seem to fail on all platforms.
    • VPP Path
      • Voting and working fine.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a script to reproduce the issue was provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. The fix will be ported to all the older VPP LTS release branches; currently, it is planned only for the VPP 20.09 release.
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms(Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
      • Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
    • Vector length specific patch is ready
    • Investigating VPP classify function, use case, benchmarking - Lijian
      • Start with simple use case
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • SVE remaining work - variable type convention - need some workaround for 256-bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • SVE patch sent to Nitin, Nitin will review the patch when back to work.
      • Review memif patch
        • Memif C11 atomics have been updated by the maintainer, not using atomic_relaxed - Tianyu
      • VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case - No classify test case in CSIT. - Jieqiang
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
    • Work on IPsec input/output nodes - VPP uses linear search on SPD lookups - Govind & Zach
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Waiting for review comments on outbound side before upstream to VPP
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • Perfmon plugin enablement on Arm - Zach

05/25/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Zachary Leaf
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (Policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
              • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
              • IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
            • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
            • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - will look into it
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate cabling issue on Taishan performance test-bed - resolved.
          • Some container cases seem to fail on all platforms.
    • VPP Path
      • Voting and working fine.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a script to reproduce the issue was provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • Tried the new version of DPDK, but it does not help
            • Contact Intel devs for possible advice
            • Workaround may have too much impact on all test cases
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms(Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
      • Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
    • Vector length specific patch is ready
    • Investigating VPP classify function, use case, benchmarking - Lijian
      • Start with simple use case
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • SVE remaining work - variable type convention - need some workaround for 256-bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • SVE patch sent to Nitin, Nitin will review the patch when back to work.
      • Review memif patch
        • Memif C11 atomics have been updated by the maintainer, not using atomic_relaxed - Tianyu
      • VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case - No classify test case in CSIT. - Jieqiang
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • IPsec input node optimization work in progress - Zach & Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, shows performance improvement; discussing with community.
          • https://gerrit.fd.io/r/c/vpp/+/31694
          • IPSec unit test - make test new cases implementation
          • Make test cases for IPSec policy mode - Done, included in Govind's patch, waiting for maintainer review - Zach
            • Add/Remove/Add+Remove+Readd/Hash collisions/Multiple interfaces & rules
          • Review the patch and grasp the basics about IPSec - Lijian
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache, discussed with Neal; maintainer agrees with the change
    • perfmon CMN-600 investigating - Zach
      • VPP perfmon CMN-600 patch abandoned: it works at system level, not VPP node level, and Linux perf can give the same result - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec flow cache outbound done, working on inbound side in separate patch - Zach
      • IPSec decryption / input node - Zach

05/18/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Zachary Leaf
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (Policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
              • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
              • IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
            • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
            • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308
            • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
      • Voting and working fine.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a script to reproduce the issue was provided; Intel is busy with other work; debugging in progress.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • Tried the new version of DPDK, but it does not help
            • Contact Intel devs for possible advice
            • Workaround may have too much impact on all test cases
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms(Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
        • Lab move has started stage 2; part of the servers were moved first to make sure the CI service does not go down.
        • Lab move is done, some issues with Taishan testbed
        • Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
      • Plan to benchmark gcc-10 vs clang-12
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • SVE remaining work - variable type convention - need some workaround for 256-bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • SVE patch sent to Nitin, Nitin will review the patch when back to work.
      • Review memif patch
        • Functional bug related to C11 atomics has been resolved by VPP maintainer.
      • VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case. - Jieqiang
      • Make test cases for IPSec policy mode - Zach
        • Add/Remove/Add+Remove+Readd/Hash collisions/Multiple interfaces & rules - Add more test cases
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • IPsec input node optimization work in progress - Zach & Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, shows performance improvement; discussing with community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache, discussed with Neal; maintainer agrees with the change
    • perfmon CMN-600 investigating - Zach
      • VPP perfmon CMN-600 patch abandoned: it works at system level, not VPP node level, and Linux perf can give the same result - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec flow cache outbound done, working on inbound side in separate patch - Zach
      • IPSec decryption / input node - Zach

05/11/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Zachary Leaf
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests, using the DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (Policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
              • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
              • IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
            • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
      • Voting and working fine.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a script to reproduce the issue was provided; Intel is busy with other work; debugging in progress.
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • Tried the new version of DPDK, but it does not help
            • Contact Intel devs for possible advice
            • Workaround may have too much impact on all test cases
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms(Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
        • Stage 2 of the lab move has started; part of the servers were moved first to make sure the CI service does not go down.
        • Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
        • Almost everything has been moved except the performance testbed, which will be moved this week; everything has gone smoothly so far.
        • Ubuntu 18.04 -> 20.04
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • SVE Remaining works - variable type convention - need some workaround for 256bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • SVE patch sent to Nitin, Nitin will review the patch when back to work.
      • Review memif patch
      • VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case.
      • Make test cases for IPSec policy mode - Jieqiang
    • Review memif test cases/memif cases
    • Finished coding of SVE string library, bihash key compare functions
    • SVE unit testing based on test_vec, fix test_vec issues
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • IPsec input node optimization work in progress - Zach & Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Working on the IPsec input node; VPP uses a linear search for SPD lookup.
        • SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with the community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
    • Using startup parameter to enable the IPsec flow cache feature
    • Adding Python test case to test IPSec node behavior - Jieqiang
    • perfmon CMN-600 investigating - Zach
      • VPP perfmon CMN-600 patch abandoned: the counters are system level, not VPP node level, and Linux perf can give the same result - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec flow cache outbound done, working on inbound side in separate patch - Zach
      • IPSec decryption / input node - Zach
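
The flow cache idea discussed above (a hash lookup placed in front of the linear SPD search) can be sketched as follows. This is only an illustration in Python under stated assumptions, not the VPP implementation: the `Policy` and `Spd` names, the 3-tuple cache key, and the "discard" default are hypothetical, and cache invalidation on policy changes is omitted.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Policy:
    """Hypothetical SPD rule: source/destination ranges, protocol, action."""
    src: str
    dst: str
    proto: int
    action: str


class Spd:
    """Linear SPD search fronted by a hash-based flow cache (illustrative)."""

    def __init__(self, policies):
        # Rules are kept in priority order; the first match wins.
        self.policies = [
            (ip_network(p.src), ip_network(p.dst), p.proto, p.action)
            for p in policies
        ]
        self.cache = {}          # flow key -> action
        self.linear_lookups = 0  # counts how often the slow path runs

    def _linear(self, src, dst, proto):
        # Slow path: walk every rule until one matches.
        self.linear_lookups += 1
        s, d = ip_address(src), ip_address(dst)
        for net_s, net_d, p, action in self.policies:
            if s in net_s and d in net_d and p == proto:
                return action
        return "discard"  # assumed default when no rule matches

    def lookup(self, src, dst, proto):
        # Fast path: one hash lookup per packet of an already-seen flow.
        key = (src, dst, proto)
        action = self.cache.get(key)
        if action is None:
            action = self._linear(src, dst, proto)
            self.cache[key] = action
        return action
```

With this shape, only the first packet of a flow pays for the linear walk; later packets of the same flow hit the dictionary. A real implementation would also need to flush or version the cache whenever the SPD changes.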

04/27/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests and use DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing; they now run on both 2n-tx2 and 3n-tsh (and are also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • Try the new version of DPDK but it does not help
            • Contact Intel devs for the possible advice
            • Workaround may impact too much to all test cases
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms (Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • SVE Remaining works - variable type convention - need some workaround for 256bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • Make test cases for IPSec policy mode - Jieqiang
    • Review memif test cases/memif cases
    • Finished coding of SVE string library, bihash key compare functions
    • SVE unit testing based on test_vec, fix test_vec issues
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • IPsec input node optimization work in progress - Zach & Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Working on the IPsec input node; VPP uses a linear search for SPD lookup.
        • SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with the community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
    • Using startup parameter to enable the IPsec flow cache feature
    • Adding Python test case to test IPSec node behavior - Jieqiang
    • perfmon CMN-600 investigating - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec decryption / input node - Zach
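
The compiler preference noted above (gcc-10 > gcc-9.2.0 > clang-10) amounts to a first-available selection over an ordered list. The sketch below only illustrates that ordering in Python; it is not how the VPP build system is written, and the executable names in `PRIORITY` as well as the `pick_compiler` helper are assumptions.

```python
import shutil

# Assumed executable names, highest priority first.
PRIORITY = ("gcc-10", "gcc-9", "clang-10")


def pick_compiler(priority=PRIORITY, is_available=shutil.which):
    """Return the first compiler in priority order that is available.

    is_available is any callable returning a truthy value when the named
    executable exists; it defaults to a PATH lookup via shutil.which.
    """
    for name in priority:
        if is_available(name):
            return name
    return None  # no listed compiler was found
```

Passing a custom `is_available` callable makes it possible to check the ordering without having any of the compilers installed.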

04/13/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests and use DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing; they now run on both 2n-tx2 and 3n-tsh (and are also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
          • Some issues occurred during the upgrade.
          • Patch to resolve the building error of DPDK on 3n-tsh testbed.
          • Root cause is the change of build system of DPDK on 3n-tsh related to SOC id detection.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms (Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • SVE Remaining works - variable type convention - need some workaround for 256bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
      • Make test cases for IPSec policy mode - Jieqiang
    • Review memif test cases/memif cases
    • Finished coding of SVE string library, bihash key compare functions
    • SVE unit testing based on test_vec, fix test_vec issues
    • Test template update - Jieqiang
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Working on the IPsec input node; VPP uses a linear search for SPD lookup.
        • SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with the community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
    • Using startup parameter to enable the IPsec flow cache feature
    • Adding Python test case to test IPSec node behavior - Jieqiang
    • perfmon CMN-600 investigating - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec decryption - Zach


03/30/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests and use DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • 2-node IPsec SPD policy test case patch is ready, starting with 1 and 1k tunnels. (40, 400 tunnels in separate patch)
            • https://gerrit.fd.io/r/c/csit/+/31605
            • Fix the wrong CLI commands but configuration still has problems.
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
          • Some issues occurred during the upgrade.
          • Patch to resolve the DPDK build error on the Arm testbed. (Taishan DPDK cases still have issues, investigating)
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
        • Will try to reproduce the issue with x86 servers.
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Review memif test cases/memif cases
    • Finished coding of SVE string library, bihash key compare functions
    • SVE unit testing based on test_vec, fix test_vec issues
    • Test template update
    • SVE unit test in qemu-vm, met compiling issue, investigating
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Prepare the memif readout - Tianyu
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Working on the IPsec input node; VPP uses a linear search for SPD lookup.
        • SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with the community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
    • Using startup parameter to enable the IPsec flow cache feature
    • Discuss with Jieqiang adding a Python test case to test IPSec node behavior
    • perfmon CMN-600 investigating - Zach

03/16/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
      • Verify SVE vector length specific wrappers - Jieqiang
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
        • Extend vector length agnostic opportunities
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Working on the IPsec input node; VPP uses a linear search for SPD lookup.
        • SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with the community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

03/09/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • CentOS-7 will be enabled with master branch to support the LTS release
        • CentOS-7 Jenkins on Arm will not be supported.
        • CentOS-8 will be supported by Red Hat until the end of this year.
      • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
        • Dependency on maintainers to fix this issue
        • Intel will ship a new NIC with latest firmware
          • Shipment takes a long time empirically
            • NIC has been shipped to vexxhost, wait for NIC arrival.
          • Try to reproduce the issue on this NIC on Arm platform
          • Updating firmware on the current NIC is risky
        • Voting rights will be enabled once this issue is fixed
            • Maintainer raised the ticket to get intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable to CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
        • Will show Arm roadmap in the next TSC meeting
        • Proposals have been sent to VPP maintainer on verifying SVE/SVE2
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
        • 128 and 256 fixed size vector wrappers are ready, needs verification
        • Verify SVE vector length specific wrappers - Jieqiang
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
        • Extend vector length agnostic opportunities
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
          • Remove interrupts on altra but no performance improvement seen
          • instruction cache misses are higher on altra than N1
      • IO testing is doable with specific PCIe slots, where PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
    • VPP compiling error on CentOS 7 - Jieqiang
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Working on the IPsec input node; VPP uses a linear search for SPD lookup.
        • SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with the community.
        • perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

02/23/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • CentOS-7 will be enabled with master branch to support the LTS release
        • CentOS-7 Jenkins on Arm will be supported.
      • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
        • Dependency on maintainers to fix this issue
        • Intel will ship a new NIC with latest firmware
          • Shipment takes a long time empirically
          • Try to reproduce the issue on this NIC on Arm platform
          • Updating firmware on the current NIC is risky
        • Voting rights will be enabled once this issue is fixed
            • Maintainer raised the ticket to get intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable to CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker - Lijian
        • Latest VPP binary crashes in the QEMU docker
          • System call fails inside QEMU docker when running VPP
        • Verify SVE/SVE2 features inside ARM QEMU VM
        • VPP maintainers want real hardware to verify SVE code
          • This solution will be abandoned.
        • 'make test' execution is slow
        • Sync with DPDK team/VPP community to decide the solution
        • Proposals have been sent to VPP maintainer on verifying SVE/SVE2
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
        • 128 and 256 fixed size vector wrappers are ready, needs verification
        • Verify SVE vector length specific wrappers - Jieqiang
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
        • Extend vector length agnostic opportunities
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No positive result for 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
          • Remove interrupts on altra but no performance improvement seen
          • instruction cache misses are higher on altra than N1
      • IO testing is doable with specific PCIe slots, where PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP memif test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
        • 3 use cases are investigated.
        • Will explore the memif logic and share the progress.
        • Will share the link on details about how to run VPP in container.
      • Investigate VPP agent usage - Tianyu
        • Focus more on data-plane performance benchmarking and optimization - Tianyu
    • VPP compiling error on CentOS 7 - Jieqiang
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Working on the IPsec input node; VPP uses a linear search for SPD lookup.
        • perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

02/09/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
        • Dependency on maintainers to fix this issue
        • Intel will ship a new NIC with latest firmware
          • Shipment takes a long time empirically
          • Try to reproduce the issue on this NIC on Arm platform
          • Updating firmware on the current NIC is risky
        • Voting rights will be enabled once this issue is fixed
            • Maintainer raised the ticket to get intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable to CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker - Lijian
        • Latest VPP binary crashes in the QEMU docker
          • System call fails inside QEMU docker when running VPP
        • Verify SVE/SVE2 features inside ARM QEMU VM
        • 'make test' execution is slow
        • Sync with DPDK team/VPP community to decide the solution
        • Proposals have been sent to VPP maintainer on verifying SVE/SVE2
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for Ampere Altra
          • Removing interrupts on Altra showed no performance improvement
          • Instruction cache misses are higher on Altra than N1
      • IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP memif test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
        • 3 use cases are investigated.
        • Will explore the memif logic and share the progress.
        • Will share the link on details about how to run VPP in container.
      • Investigate VPP agent usage - Tianyu
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
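
For reference, the 2x loop unrolling and prefetching being tried on the linear SPD lookup can be sketched as below. This is a minimal illustration only, not VPP code: the spd_rule_t type, its fields, and the function spd_lookup_unroll2 are hypothetical names, and matching on a destination-address range is a simplifying assumption. __builtin_prefetch is the GCC/Clang builtin and acts only as a hint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified policy entry: matches a destination address range. */
typedef struct {
    uint32_t dst_lo;
    uint32_t dst_hi;
    int action;
} spd_rule_t;

static inline int rule_matches(const spd_rule_t *r, uint32_t dst)
{
    return dst >= r->dst_lo && dst <= r->dst_hi;
}

/* Linear first-match search, manually unrolled 2x.
 * Returns the index of the first matching rule, or -1 if none matches. */
int spd_lookup_unroll2(const spd_rule_t *rules, size_t n, uint32_t dst)
{
    size_t i = 0;

    assert(rules != NULL || n == 0);

    for (; i + 2 <= n; i += 2) {
        /* Hint the next pair of rules into cache while testing this pair. */
        if (i + 2 < n)
            __builtin_prefetch(&rules[i + 2]);

        /* Evaluate both rules before branching so the two comparisons
         * are independent of each other. */
        int m0 = rule_matches(&rules[i], dst);
        int m1 = rule_matches(&rules[i + 1], dst);

        /* Preserve first-match order: rule i wins over rule i + 1. */
        if (m0)
            return (int) i;
        if (m1)
            return (int) (i + 1);
    }

    /* Tail: at most one rule left when n is odd. */
    if (i < n && rule_matches(&rules[i], dst))
        return (int) i;

    return -1;
}
```

The unrolled loop keeps the first-match semantics of a plain linear search; whether it helps depends on the core and on the rule layout, which is what the benchmarking above is meant to establish.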

02/02/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to ensure that multiple VPP instances are not started at the same time - Juraj
        • Dependency on maintainers to fix this issue
        • Voting rights will be enabled once this issue is fixed
          • Implementation is ready and will be tested with actual patches.
          • Apply a file locking mechanism so that only one VPP instance runs at a time.
            • https://gerrit.fd.io/r/c/csit/+/30425
            • Patches are under review
            • Maintainer raised the ticket to get Intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker - Lijian
        • Latest VPP binary crashes in the QEMU docker
          • System call fails inside QEMU docker when running VPP
        • Verify SVE/SVE2 features inside ARM QEMU VM
        • 'make test' execution is slow
        • Sync with DPDK team/VPP community to decide the solution
        • Proposals have been sent to VPP maintainer on verifying SVE/SVE2
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for Ampere Altra
          • Removing interrupts on Altra showed no performance improvement
          • Instruction cache misses are higher on Altra than N1
      • IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP memif test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
        • 3 use cases are investigated.
        • Will explore the memif logic and share the progress.
        • Will share the link on details about how to run VPP in container.
      • Investigate VPP agent usage - Tianyu
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
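
The file-locking workaround noted under VPP Device above (only one VPP instance allowed to start at a time) can be illustrated with an advisory lock. The sketch below is not the code from the Gerrit change; it only shows the general technique using the POSIX/Linux open() and flock() calls. The function names vpp_start_lock_try and vpp_start_lock_release and the lock file path are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <unistd.h>

/* Try to take an exclusive, non-blocking advisory lock on lock_path.
 * Returns an open file descriptor that holds the lock, or -1 if the
 * lock is already held elsewhere or the file cannot be opened. */
int vpp_start_lock_try(const char *lock_path)
{
    assert(lock_path != NULL);

    int fd = open(lock_path, O_CREAT | O_RDWR, 0644);
    if (fd < 0)
        return -1;

    /* LOCK_NB makes flock() fail immediately instead of waiting. */
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Release the lock and close the descriptor returned by vpp_start_lock_try(). */
void vpp_start_lock_release(int fd)
{
    if (fd >= 0) {
        flock(fd, LOCK_UN);
        close(fd);
    }
}
```

A caller would take the lock before launching a VPP instance and release it once startup has completed; a second caller that gets -1 can retry later. Dropping LOCK_NB would make the second caller wait instead of failing.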

01/19/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
        • CSIT official release 20.09 is available
          • https://docs.fd.io/csit/rls2009/report/
          • Jieqiang will compare the performance data with release 20.09
            • Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
            • DPDK testpmd running inside VM, l2 cross connect running inside VPP.
            • Check the number for CSIT 2101 release
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Hardware configurations/wiring are done; Physical connection to the TG is done.
        • Almost done; two steps remain
          • Start with basic L2/L3/ACL/classifiers tests using DPDK PMD on 2n-thx2 first (daily testing)
          • Take the execution time into consideration if we want to run release testing on 2n-thx2.
          • It takes 9 hours to finish one round of testing.
          • Tests are running fine
            • L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
            • Suitable time to run release testing on 2n-tx2 testbed.
            • Will investigate IPSec test cases on 2n-tx2 - Juraj
            • Add memif test case to 2n-tx2 once the release testing is done.
    • VPP Path
      • CentOS-7 will be enabled on the master branch to support the LTS release
        • CentOS-7 Jenkins on Arm will be supported.
      • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to ensure that multiple VPP instances are not started at the same time - Juraj
          • Implementation is ready and will be tested with actual patches.
          • Apply a file locking mechanism so that only one VPP instance runs at a time.
            • https://gerrit.fd.io/r/c/csit/+/30425
            • Patches are under review
            • Machiek raised the ticket to get Intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker
        • Latest VPP binary crashes in the QEMU docker - Lijian
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for Ampere Altra
      • IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP memif test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
        • 3 use cases are investigated.
        • Will explore the memif logic and share the progress.
        • Will share the link on details about how to run VPP in container.
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind


01/05/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
        • CSIT official release 20.09 is available
          • https://docs.fd.io/csit/rls2009/report/
          • Jieqiang will compare the performance data with release 20.09
            • Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
            • DPDK testpmd running inside VM, l2 cross connect running inside VPP.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Hardware configurations/wiring are done; Physical connection to the TG is done.
        • Almost done; two steps remain
          • Start with basic L2/L3/IPSec/ACL/classifiers tests using DPDK PMD on 2n-thx2 first (daily testing)
          • Take the execution time into consideration if we want to run release testing on 2n-thx2.
          • Tests are running fine
            • L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
            • Suitable time to run release testing on 2n-tx2 testbed.
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to ensure that multiple VPP instances are not started at the same time - Juraj
          • Implementation is ready and will be tested with actual patches.
          • Apply a file locking mechanism so that only one VPP instance runs at a time.
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker
        • Latest VPP binary crashes in the QEMU docker - Lijian
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
      • IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

12/22/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
    • Will cancel the meeting on Dec 29th;
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to ensure that multiple VPP instances are not started at the same time - Juraj
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • LF will provide QSFP+ fiber switch for FD.io lab.
        • Basically done. LF just procured the existing fiber switch currently rented by Arm in FD.io lab.
        • Send the progress to relevant people in Arm - Lijian
        • Confirm with Tina to ensure Arm is not charged - Lijian
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features on VPP CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
      • IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind


12/15/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
    • Will cancel the meeting on Dec 29th;
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to ensure that multiple VPP instances are not started at the same time - Juraj
          • Implementation is ready and will be tested with actual patches.
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • LF will provide QSFP+ fiber switch for FD.io lab.
        • Basically done. LF just procured the existing fiber switch currently rented by Arm in FD.io lab.
        • Send the progress to relevant people in Arm - Lijian
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
      • IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind


12/08/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • Working with VPP/DPDK/Intel to root cause this issue. - Juraj
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • LF will provide QSFP+ fiber switch for FD.io lab.
        • Vexxhost just has a spare one, and LF will buy it for FD.io lab, which will probably happen this month.
      • N1SDP shipment to FD.io
        • Govind will track the status
      • CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
        • Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Machiek to provide 10G switch.
        • Arm is required to present Arm achievement and plan to TSC.
          • Govind will prepare the slides
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • SOC id will be available on /proc entry starting from kernel version 5.9
        • Will investigate the details - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Benchmarked cross-connect and TX queue is dropping packets
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • To optimize ethernet-input and avf-input node with NEON intrinsics
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals upstreamed
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • Have to repeat the testing in the future.
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere Altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

12/01/2020

  • Attendees
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
      • LF will provide QSFP+ fiber switch for FD.io lab.
      • CSIT will install the commonly used OS distro and kernel.
      • 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
        • To enable voting right for the VPP device jobs. - Juraj
          • Failed tests due to sw_interface_dump api issue. - Juraj
        • VPP device job is unstable
          • Race condition occurs when multiple VPP instances are starting.
          • Will try to update the i40e driver & firmware.
      • N1SDP shipment to FD.io
        • Govind will update the shipment status to Juraj and Machiek.
        • Will still have to ship N1SDP to FD.io lab, and Machiek confirmed that there will be enough rack space.
      • CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
        • Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Machiek to provide 10G switch.
        • Arm is required to present Arm achievement and plan to TSC.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • SOC id will be available on /proc entry starting from kernel version 5.9
        • Will investigate the details - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • To optimize ethernet-input and avf-input node with NEON intrinsics
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 proposal
      • Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
      • Patches are upstreamed for comments
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere Altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • IPsec on Arm platform. - Govind
      • Apply prefetches with ipsec-out node
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

11/24/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
      • LF will provide QSFP+ fiber switch for FD.io lab.
      • CSIT will install the commonly used OS distro and kernel.
      • 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
        • To enable voting right for the VPP device jobs. - Juraj
          • Failed tests due to sw_interface_dump api issue. - Juraj
      • N1SDP shipment to FD.io
        • Govind will update the shipment status to Juraj and Machiek.
        • Will still have to ship N1SDP to FD.io lab, and Machiek confirmed that there will be enough rack space.
      • CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
        • Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Machiek to provide 10G switch.
        • Arm is required to present Arm achievement and plan to TSC.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • SOC id will be available on /proc entry starting from kernel version 5.9
        • Will investigate the details - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • To optimize ethernet-input and avf-input node with NEON intrinsics
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 proposal
      • Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
      • Patches are upstreamed for comments
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere Altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • IPsec on Arm platform. - Govind
      • Apply prefetches with ipsec-out node
      • Work on IPsec input node; VPP uses linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

11/17/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Tina Tsou
  • General
  • CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • SOC id will be available on /proc entry starting from kernel version 5.9
        • Will investigate the details - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • To optimize ethernet-input and avf-input node with NEON intrinsics
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 proposal
      • Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
      • Patches are upstreamed for comments
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere Altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • IPsec on Arm platform. - Govind
      • Apply prefetches with ipsec-out node
      • Work on IPsec input node; VPP uses linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

11/10/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
      • L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
        • The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
        • Repeat tests on local N1SDP and cascade server. - Jieqiang
        • Repeat the test case with latest master branch. - Jieqiang
        • The patch that introduced this perf drop needs to be analyzed. - Jieqiang, Lijian
        • This patch needs to be analysed on VPP 2005 and 2001 releases. - Jieqiang, Lijian
        • The perf drop rate is ~5-8% on latest VPP code compared to the original data.
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
      • 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
        • Juraj to check with Peter about the feasibility.
        • Move the thx2 to the same rack for tg and install the same nic on tg.
        • 1g NIC for management installed on thx2, but cannot be net-booted.
          • Able to net-boot from the built-in 10G NIC.
          • The tx2 has been moved to the same rack where the tg is located.
          • Plan to set up the weekly perf tests on the new topo.
        • Port the Robot Framework configuration steps for tsh testbeds from thx1 to thx2 to speed up perf tests. - Juraj
    • VPP Path
      • 6x ThunderX1 servers in total in Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
        • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • https://gerrit.fd.io/r/c/ci-management/+/28022 automate the generation of docker builder images.
        • Dave plans to drop the support for CentOS 7.
        • Tried Dave's patch to generate docker image on Arm and saw some errors. - Juraj
        • Test arm centos7 jenkins builder image. - Juraj.
        • Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
    • VPP Device
      • CSIT will install the commonly used OS distro and kernel.
      • 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Revert to old kernel version 4.15.0-55 to avoid AVF issue.
        • AVF issue is common across the platform.
          • Differences between avf driver versions may be the root cause of behavior changes.
        • New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 45 minutes.
          • Python runs slower on new thx2 servers than 1-node skylake.
          • Try new version of Python (such as 3.8) or split the device tests into two parts.
          • Check how many CPUs get utilized for robot framework execution on thx2 server.
          • Two thunderx2 are running fine right now and the VPP device jobs are almost done.
          • Disabling hyperthreading on new thx2 will speed up the VPP device tests.
          • Enable the voting right for the VPP device jobs. - Juraj
            • Failed tests due to sw_interface_dump api issue. - Juraj
      • N1SDP shipment to FD.io
        • Get response from Maciek about the rack space and traffic generator availability.
      • CSIT budget plan for 10g switch purchase in FD.io lab. - Juraj, Tina
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
      • Summarize the meeting minutes and action items. - Lijian
      • SOC id will be available on /proc entry from kernel version 5.9 - Lijian, Honnappa
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
      • Start benchmarking AVF PMD driver in VPP on N1SDP.
      • Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
      • avf_input and avf_output nodes don't consume more CPU cycles than dpdk-related nodes do.
    • SVE/SVE2 proposal
      • Will send email to Damjan asking him to review
      • SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
      • No further comments from VPP community.
      • Apply the SVE/SVE2 intrinsics on refactoring ethernet-input node. - Lijian
        • SVE/SVE2 functionality to be tested on the new development platform.
        • Verify SVE/SVE2 code changes on simulator.
        • Try to run standalone SVE codes on the new FPGA platform.
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere Altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Find out the tuned configuration for cross connect test cases using AVF PMD driver.
        • Figure out corresponding configurations in CSIT scripts.
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
      • Work on IPsec input node; VPP uses linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Plans

11/03/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
        • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • https://gerrit.fd.io/r/c/ci-management/+/28022 automate the generation of docker builder images.
        • Test arm centos7 jenkins builder image. - Juraj.
        • Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
    • VPP Device
      • CSIT will install the commonly used OS distro and kernel.
      • 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Revert to old kernel version 4.15.0-55 to avoid AVF issue.
        • AVF issue is common across the platform.
          • Differences between avf driver versions may be the root cause of behavior changes.
        • New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 45 minutes.
          • Python runs slower on new thx2 servers than 1-node skylake.
          • Try new version of Python (such as 3.8) or split the device tests into two parts.
          • Check how many CPUs get utilized for robot framework execution on thx2 server.
          • Two thunderx2 are running fine right now and the VPP device jobs are almost done.
      • N1SDP shipment to FD.io
        • Get response from Maciek about the rack space and traffic generator availability.
      • CSIT budget plan for 10g switch purchase in FD.io lab. - Juraj, Tina
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
      • Summarize the meeting minutes and action items. - Lijian
      • SOC id will be available on /proc entry from kernel version 5.9 - Lijian, Honnappa
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
      • Start benchmarking AVF PMD driver in VPP on N1SDP.
      • Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
      • avf_input and avf_output nodes don't consume more CPU cycles than dpdk-related nodes do.
    • SVE/SVE2 proposal
      • Will send email to Damjan asking him to review
      • SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
      • No further comments from VPP community.
      • Apply the SVE/SVE2 intrinsics on refactoring ethernet-input node. - Lijian
        • SVE/SVE2 functionality to be tested on the new development platform.
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Find out the tuned configuration for cross connect test cases using AVF PMD driver.
        • Figure out corresponding configurations in CSIT scripts
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
      • Work on IPsec input node; VPP uses linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind.
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
    • Plans

10/27/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
        • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
    • VPP Device
      • CSIT will install the commonly used OS distro and kernel.
      • 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Revert to old kernel version 4.15.0-55 to avoid AVF issue.
          • Differences between avf driver versions may be the root cause of behavior changes.
        • New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 40 minutes.
          • Python runs slower on new thx2 servers than 1-node skylake.
          • Try new version of Python (such as 3.8) or split the device tests into two parts.
          • Check how many CPUs get utilized for robot framework execution on thx2 server.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
      • Summarize the meeting minutes and action items. - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
      • Start benchmarking AVF PMD driver in VPP on N1SDP.
      • Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
      • avf_input and avf_output nodes don't consume more CPU cycles than dpdk-related nodes do.
    • SVE/SVE2 proposal
      • Will send email to Damjan asking him to review
      • SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
      • No further comments from VPP community.
      • Apply the SVE/SVE2 on ethernet-input node. - Lijian
    • Repeat the 4x and 2x loop unrolling tests on Ampere server. - Jieqiang
    • Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
        • Figure out corresponding configurations in CSIT scripts
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
      • Work on IPsec input node; VPP uses linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
    • Plans

10/20/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
        • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
        • Errors happen when running latest VPP debug image, which was introduced by https://gerrit.fd.io/r/c/vpp/+/29490 - Lijian
    • VPP Device
      • CSIT will install the commonly used OS distro and kernel.
      • 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Two failed test cases related to AVF plugin.
          • The root cause is the newer kernel version - 4.15.0-118-generic fails, 4.15.0-72-generic works.
          • Downgrade the kernel version to 4.15.0-72-generic and continue the VPP device testing.
          • Try the same experiment on X86 to see if this issue is arm-specific or not. - Juraj
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
      • Start benchmarking AVF PMD driver in VPP on N1SDP.
      • Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Finished the benchmarking and shared the data with the team. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
        • Figure out corresponding configurations in CSIT scripts
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
    • Plans

10/13/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
        • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
    • VPP Device
      • CSIT will install the commonly used OS distro and kernel.
      • 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Two failed test cases related to AVF plugin.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Finished the benchmarking and shared the data with the team.
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
        • Figure out corresponding configurations in CSIT scripts
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
    • Plans

10/06/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
        • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
    • VPP Device
      • CSIT will install the commonly used OS distro and kernel.
      • 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs and other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Finished the benchmarking and shared the data with the team.
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Figure out corresponding configurations in CSIT scripts
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
    • Plans

09/29/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Finished the benchmarking and shared the data with the team.
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Figure out corresponding configurations in CSIT scripts
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
    • Plans

09/22/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Figure out corresponding configurations in CSIT scripts
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
    • Plans

09/15/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • LF will pay for the expense, and Vexxhost has placed or will place the order for the new RAM module.
      • Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to Vexxhost.
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
      • Juraj will follow up on the existing Vexxhost ticket or create a new one to replace the faulty RAM.
      • Check with Juraj for the latest news about the faulty RAM.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - first step;
        • Adding CentOS-7 on Arm will be the second step.
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
      • We can decommission 3x SoftIron servers directly, but for the existing ThunderX2 servers, the decommissioning could be temporary. We will probably reinstall them in the near future.
      • Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
    • Two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
    • Budget plan for CSIT FD.io lab.
      • We have enough servers for VPP path & device tests.
      • We can ask the CSIT FD.io lab folks for saving rack space for arm servers.
      • We may plan to send new advanced servers for perf tests in future but we won't mention the specific server type.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed and are using csit testing to verify the patch.
    • Vendor CPU server enablement in VPP - Lijian
      • Ready for internal review
      • Will discuss with VPP maintainer
    • Investigate VPP Intel AVF driver - Lijian
    • SVE
      • SVE intrinsics wrapper is done. Proposal patch is ready for review.
      • ACLE, architecture, sve-sve2-programming-example
      • SVE intrinsics is preferred.
      • Share SVE knowledge with the DPDK team.
    • Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
      • Will repeat scalability testing on N1SDP.
    • Benchmark AVF driver btw Cascade Lake and N1SDP - Jieqiang
      • Will investigate AVF drivers on Arm. - Lijian
    • Jieqiang will figure out performance data for 1x, 10Kx flows on Cascade Lake in CSIT.
      • Confirm whether the system is the same for the local Dell server and the Cascade Lake server in CSIT. - Jieqiang
      • Check if there are any test cases with 1t1c/2t2c/4t4c configured for 2n-clx testbed in CSIT - Jieqiang
      • Performance data; Configurations;
    • VM2VM
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • Trying to make IPsec enabled with Arm platform. - Govind
      • Started system tuning on PMD TX direction.
      • Investigate mempool configuration.
      • Change the descriptor size by modifying the DPDK source code.
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans

09/08/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • LF will pay for the expense, and Vexxhost has placed or will place the order for the new RAM module.
      • Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to Vexxhost.
      • One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - first step;
        • Add CentOS-7 on Arm will be second step.
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
      • We can decommission 3x SoftIron servers directly, but for the existing ThunderX2 servers, the decommissioning could be temporary. We will probably reinstall them in the near future.
      • Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
    • Two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed and are using csit testing to verify the patch.
    • SVE
      • SVE intrinsics wrapper is done. Proposal patch is ready for review.
      • ACLE, architecture, sve-sve2-programming-example
      • SVE intrinsics is preferred.
    • Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
      • Will repeat scalability testing on N1SDP.
    • Benchmark AVF driver btw Cascade Lake and N1SDP - Jieqiang
      • Will investigate AVF drivers on Arm. - Lijian
    • Jieqiang will figure out performance data for 1x, 10Kx flows on Cascade Lake in CSIT.
      • Performance data; Configurations;
    • VM2VM
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • Trying to make IPsec enabled with Arm platform. - Govind
      • Started system tuning on PMD TX direction.
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans

09/01/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to Vexxhost.
      • One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
        • It seems that plugging working RAM modules into the empty slots will resolve the problem.
        • Juraj will send email to Machiek about the ownership of any FD.io lab servers, and who should pay for the charge.
      • The second ThunderX1 has IPMI problem, but SSH is working fine.
        • IPMI IP is configured via SSH Linux prompt. It's working fine now.
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • 3x spare ThunderX servers are used for CI and included in Nomad cluster.
        • Faulty RAM on TX server is not fixed and yet to be debugged.
        • Pending with Vexx host to proceed further.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
      • Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
    • Two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
  • VPP
    • Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 has compiling errors with latest VPP source code.
        • This issue is fixed by Jieqiang and available for internal review.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed and are using csit testing to verify the patch.
    • gcc-10 compiling issue is resolved and merged.
    • SVE
      • ACLE, architecture, sve-sve2-programming-example
      • SVE intrinsics is preferred.
    • Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
    • VM2VM
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • Trying to make IPsec enabled with Arm platform. - Govind
      • Started system tuning on PMD TX direction.
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans

08/25/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
        • It seems that plugging working RAM modules into the empty slots will resolve the problem.
        • Juraj will send email to Machiek about the ownership of any FD.io lab servers, and who should pay for the charge.
      • The second ThunderX1 has IPMI problem, but SSH is working fine.
        • IPMI IP is configured via SSH Linux prompt. It's working fine now.
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • 3x spare ThunderX servers are used for CI and included in Nomad cluster.
        • Faulty RAM on TX server is not fixed and yet to be debugged.
        • Pending with Vexx host to proceed further.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
      • Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
    • Two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
  • VPP
    • Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 has compiling errors with latest VPP source code.
        • This issue is fixed by Jieqiang and available for internal review.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed and are using csit testing to verify the patch.
    • SVE
      • ACLE, architecture, sve-sve2-programming-example
      • SVE intrinsics is preferred.
    • Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
    • VM2VM
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • Trying to make IPsec enabled with Arm platform. - Govind
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans


08/18/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • Jieqiang is investigating some performance drop (between 2005 and 2008 releases) cases on Taishan servers.
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
      • They have finished collecting data with performance testing setup, and the mrr daily is resumed
      • FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
      • Jieqiang will share investigation report, but so far there are no apparent performance differences.
        • Some 4T4C test cases on Taishan have obvious performance drop. Will compare the trending with x86 machines.
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
      • The second ThunderX1 has IPMI problem, but SSH is working fine.
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • 3x spare ThunderX servers are used for CI and included in Nomad cluster.
        • Faulty RAM on TX server is not fixed and yet to be debugged.
        • Pending with Vexx host to proceed further.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
    • Two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation;
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 has compiling errors with latest VPP source code.
        • This issue is fixed by Jieqiang and available for internal review.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed and are using csit testing to verify the patch.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
    • Trying to make IPsec enabled with Arm platform. - Govind
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans


08/11/2020

  • Attendees
    • Honnappa Nagarahalli
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
    • Filip Varga
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
      • The second ThunderX1 has IPMI problem, but SSH is working fine.
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • 3x spare ThunderX servers are used for CI and included in Nomad cluster.
        • Faulty RAM on TX server is not fixed and yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been upstreamed for review and merge.
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation;
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 has compiling errors with latest VPP source code.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed and are using csit testing to verify the patch.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
    • Trying to make IPsec enabled with Arm platform. - Govind
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans

08/04/2020

  • Attendees
    • Honnappa Nagarahalli
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
    • Filip Varga
  • General
  • CSIT
    • VPP Performance Test
      • They have finished collecting data with performance testing setup, and the mrr daily is resumed
      • FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
      • Jieqiang will share investigation report, but so far there are no apparent performance differences.
        • Some 4T4C test cases on Taishan have obvious performance drop. Will compare the trending with x86 machines.
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
      • The second ThunderX1 has IPMI problem, but SSH is working fine.
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • 3x spare ThunderX servers are used for CI and included in Nomad cluster.
        • Faulty RAM on TX server is not fixed and yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been upstreamed for review and merge.
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation;
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 has compiling errors with latest VPP source code.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed and are using csit testing to verify the patch.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
    • Trying to make IPsec enabled with Arm platform. - Govind
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans


07/28/2020

  • Attendees
    • Honnappa Nagarahalli
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • They have finished collecting data with performance testing setup, and the mrr daily is resumed
      • FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
      • Jieqiang will share investigation report, but so far there are no apparent performance differences.
      • VPP performance testing is running once a week.
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
      • The second ThunderX1 has IPMI problem, but SSH is working fine.
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • 3x spare ThunderX servers are used for CI and included in Nomad cluster.
        • Faulty RAM on TX server is not fixed and yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been upstreamed for review and merge.
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation;
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed and are using csit testing to verify the patch.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
    • Trying to make IPsec enabled with Arm platform. - Govind
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans

07/21/2020

  • Attendees
    • Honnappa Nagarahalli
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • 3x spare ThunderX servers are used for CI and included in the Nomad cluster. One debugging server for VPP Dev and 3 servers (2 TX and 1 TX2) are unreachable through IPMI; one of them is reachable through SSH. The IPMI unreachability is still being investigated by Vexxhost. CI functionality is restored with the spare TX servers. The TX2 server is unreachable through IPMI and VPP device jobs are not running. The faulty RAM on the TX server is not fixed and is yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • Arm has
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install nomad service in those two servers - Juraj & Jieqiang
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation;
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking btw gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
    • N1SDP enablement. - Lijian
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • Upstreamed and are using csit testing to verify the patch.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
    • Trying to make IPsec enabled with Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans


07/14/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • Will probably use 3xspare ThunderX1 servers as CI build server/nomad cluster.
        • Two of the three ThunderX1 servers cannot be accessed.
        • Spare ThunderX servers are used for CI and included in the Nomad cluster. One debugging server for VPP Dev and 3 servers (2 TX and 1 TX2) are unreachable through IPMI; one of them is reachable through SSH. The IPMI unreachability is still being investigated by Vexxhost. CI functionality is restored with the spare TX servers. The TX2 server is unreachable through IPMI and VPP device jobs are not running. The faulty RAM on the TX server is not fixed and is yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been verified by Jieqiang, will send to Dave Wallace to use it for VPP Jenkins job.
        • Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
      • After fixing a software bug, VPP can boot up normally with 16K/64K page size.
        • There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install nomad service in those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation;
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark btw gcc-9 and clang-10 to decide which should be the default compiler, will sync up with Suresh.
      • 'make build-release CC=gcc' will override default clang-9 in vpp.
    • N1SDP enablement. - Lijian
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
      • Investigating various No. of rx_q_bufs & tx_q_bufs
      • Investigating various vector sizes, and checking their effect on throughput
      • Benchmark and compare PMU counters btw 4x and 2x loop unrolling on n1sdp
    • ACL optimization investigation on n1sdp - Govind
      • Investigating using SPE counters to profile ACL plugin bottleneck
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to make IPsec enabled with Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

07/07/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow up on the vexxhost ticket, or create a new one, to replace the faulty RAM.
        • Will probably use the 3 spare ThunderX1 servers as CI build servers/Nomad cluster.
        • Two of the three ThunderX1 servers cannot be accessed.
        • Spare ThunderX servers are used for CI and are included in the Nomad cluster; CI functionality is restored with them.
        • 1 debugging server for VPP dev and 3 servers (2 TX and 1 TX2) are unreachable through IPMI; one of them is reachable through SSH. vexxhost is still investigating the IPMI unreachability.
        • The TX2 server is unreachable through IPMI, and VPP device jobs are not running.
        • The faulty RAM on the TX server is not fixed and has yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been verified by Jieqiang and will be sent to Dave Wallace for use in the VPP Jenkins job.
        • Jieqiang will send an email to Dave Wallace about the CentOS-7 on Arm Jenkins job. - Jieqiang
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
      • With the software bug fixed, VPP boots up normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be made once the TX2 servers are shipped to the FD.io lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers on the market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per-patch regression: the 2-node topology is freely available. The new Arm setup can be made to run per-patch regression.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • The default VPP compiler is now clang-9, which does not support the optimization options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' overrides the default clang-9 in VPP.
    • N1SDP enablement. - Lijian
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
      • Investigating different numbers of rx_q_bufs and tx_q_bufs
      • Investigating different vector sizes and their effect on throughput
      • Benchmark and compare PMU counters between 4x and 2x loop unrolling on N1SDP
    • ACL optimization investigation on N1SDP - Govind
      • Investigating the use of SPE counters to profile the ACL plugin bottleneck
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
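
The Nomad sizing noted under FD.io lab above (16 cores per job and 6 jobs in total) can be sanity-checked with a minimal sketch. The 96-core total is an assumption chosen to be consistent with those two figures, and the function name is hypothetical; this is plain arithmetic, not Nomad's scheduler or API.

```python
def job_slots(total_cores: int, cores_per_job: int) -> int:
    """Return how many fixed-size jobs fit on one server at the same time."""
    if cores_per_job <= 0:
        raise ValueError("cores_per_job must be positive")
    if total_cores < 0:
        raise ValueError("total_cores must not be negative")
    # Integer division: leftover cores that cannot hold a full job are unused.
    return total_cores // cores_per_job


# Assumption: a 96-core server. The 16 cores per job comes from the minutes.
ASSUMED_TOTAL_CORES = 96
CORES_PER_JOB = 16

if __name__ == "__main__":
    print(job_slots(ASSUMED_TOTAL_CORES, CORES_PER_JOB))
```

With the assumed 96 cores, the result is 6 concurrent jobs, matching the figure in the minutes; a server with a different core count would yield a different number of slots.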


06/30/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow up on the vexxhost ticket, or create a new one, to replace the faulty RAM.
        • Will probably use the 3 spare ThunderX1 servers as CI build servers/Nomad cluster.
        • Two of the three ThunderX1 servers cannot be accessed.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been verified by Jieqiang and will be sent to Dave Wallace for use in the VPP Jenkins job.
        • Jieqiang will send an email to Dave Wallace about the CentOS-7 on Arm Jenkins job. - Jieqiang
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
      • With the software bug fixed, VPP boots up normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be made once the TX2 servers are shipped to the FD.io lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers on the market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per-patch regression: the 2-node topology is freely available. The new Arm setup can be made to run per-patch regression.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • The default VPP compiler is now clang-9, which does not support the optimization options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' overrides the default clang-9 in VPP.
    • N1SDP enablement. - Lijian
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
      • Investigating different numbers of rx_q_bufs and tx_q_bufs
      • Investigating different vector sizes and their effect on throughput
      • Benchmark and compare PMU counters between 4x and 2x loop unrolling on N1SDP
    • ACL optimization investigation on N1SDP - Govind
      • Investigating the use of SPE counters to profile the ACL plugin bottleneck
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

06/23/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow up on the vexxhost ticket, or create a new one, to replace the faulty RAM.
        • Two of the three ThunderX1 servers cannot be accessed.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been verified by Jieqiang and will be sent to Dave Wallace for use in the VPP Jenkins job.
        • Jieqiang will send an email to Dave Wallace about the CentOS-7 on Arm Jenkins job. - Jieqiang
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
      • With the software bug fixed, VPP boots up normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be made once the TX2 servers are shipped to the FD.io lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers on the market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per-patch regression: the 2-node topology is freely available. The new Arm setup can be made to run per-patch regression.
  • VPP
    • L3FWD status
    • CSIT status
    • EPIC plan
      • SVE2 investigation in VPP
      • VPP hoststack TCP/CPS (Connections per Second) investigation
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • The default VPP compiler is now clang-9, which does not support the optimization options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' overrides the default clang-9 in VPP.
    • N1SDP enablement. - Lijian
      • Profiling with NMU-600 counters.
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput drops as the number of flows increases on N1SDP.
    • ACL optimization investigation on N1SDP - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

06/16/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow up on the vexxhost ticket, or create a new one, to replace the faulty RAM.
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the Dockerfile upload. The Dockerfile needs to be tested in a local VPP sandbox before uploading.
        • The Dockerfile needs to be labelled by Dave Wallace so it can be used for the VPP Jenkins job.
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
        • gcc-9 is hard-coded, so the compilation issue is gone.
      • With the software bug fixed, VPP boots up normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be made once the TX2 servers are shipped to the FD.io lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers on the market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per-patch regression: the 2-node topology is freely available. The new Arm setup can be made to run per-patch regression.
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • The default VPP compiler is now clang-9, which does not support the optimization options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' overrides the default clang-9 in VPP.
    • N1SDP enablement. - Lijian
      • Profiling with NMU-600 counters.
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput drops as the number of flows increases on N1SDP.
    • ACL optimization investigation on N1SDP - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

06/09/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community will collect performance data with these CSIT machines.
      • IPSec tunnel configuration issue.
        • Issue is resolved.
        • Also failing on x86. CSIT maintainer is trying to root cause the problem.
          • Juraj to run the IPSec regression on Taishan server with the IPSec patch.
    • VPP Path
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the Dockerfile upload. The Dockerfile needs to be tested in a local VPP sandbox before uploading.
        • The Dockerfile needs to be labelled by Dave Wallace so it can be used for the VPP Jenkins job.
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
        • gcc-9 is hard-coded, so the compilation issue is gone.
      • With the software bug fixed, VPP boots up normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be made once the TX2 servers are shipped to the FD.io lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers on the market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per-patch regression: the 2-node topology is freely available. The new Arm setup can be made to run per-patch regression.
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • The default VPP compiler is now clang-9, which does not support the optimization options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' overrides the default clang-9 in VPP.
    • N1SDP enablement. - Lijian
      • Profiling with NMU-600 counters.
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput drops as the number of flows increases on N1SDP.
    • ACL optimization investigation on N1SDP - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

06/02/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • IPSec tunnel configuration issue.
        • Also failing on x86. CSIT maintainer is trying to root cause the problem.
          • Juraj to run the IPSec regression on Taishan server with the IPSec patch.
    • VPP Path
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the Dockerfile upload. The Dockerfile needs to be tested in a local VPP sandbox before uploading.
        • The Dockerfile needs to be labelled by Dave Wallace so it can be used for the VPP Jenkins job.
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
        • gcc-9 is hard-coded, so the compilation issue is gone.
      • With the software bug fixed, VPP boots up normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be made once the TX2 servers are shipped to the FD.io lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers on the market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per-patch regression: the 2-node topology is freely available. The new Arm setup can be made to run per-patch regression.
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • The default VPP compiler is now clang-9, which does not support the optimization options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' overrides the default clang-9 in VPP.
    • N1SDP enablement. - Lijian
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput drops as the number of flows increases on N1SDP.
    • ACL optimization investigation on N1SDP - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

05/26/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • IPSec tunnel configuration issue.
        • Also failing on x86. CSIT maintainer is trying to root cause the problem.
          • Juraj to run the IPSec regression on Taishan server with the IPSec patch.
    • VPP Path
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the Dockerfile upload. The Dockerfile needs to be tested in a local VPP sandbox before uploading.
        • The Dockerfile needs to be labelled by Dave Wallace so it can be used for the VPP Jenkins job.
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
        • gcc-9 is hard-coded, so the compilation issue is gone.
      • With the software bug fixed, VPP boots up normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers. Jieqiang will set up a meeting with Juraj regarding this documentation.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once the TX2 servers are shipped to the FD.io lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is now clang-9, which does not support the optimization options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
    • N1SDP enablement. - Lijian
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput performance drop as flow number increases in N1SDP.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
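
The Nomad sizing noted under FD.io lab (16 cores per job, 6 jobs in total) can be sanity-checked with a few lines of Python. This is only an arithmetic sketch: the function names and the `available_cores` argument are hypothetical, and the notes do not say whether the 6 jobs are spread over one server or both.

```python
def cores_required(cores_per_job: int, jobs: int) -> int:
    """Total CPU cores reserved when every job runs at the same time."""
    return cores_per_job * jobs


def plan_fits(cores_per_job: int, jobs: int, available_cores: int) -> bool:
    """True if the job plan fits into the given (hypothetical) core budget."""
    return cores_required(cores_per_job, jobs) <= available_cores


# Figures from the notes: 16 cores per job, 6 jobs in total.
total = cores_required(16, 6)
print(total)  # 96
```

With these figures the plan needs 96 cores when all jobs run concurrently.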


05/19/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • The other failure is related to the VPP image on Arm: an IPSec tunnel configuration issue.
        • Also failing on x86. CSIT maintainer is trying to root cause the problem.
    • VPP Path
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
      • After fixing a software bug, VPP boots normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is now clang-9, which does not support the optimization options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
    • N1SDP enablement. - Lijian
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput performance drop as flow number increases in N1SDP.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
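
The notes do not describe the software bug behind the 16K/64K page size boot problem. As a generic illustration of the kind of assumption that tends to break on such systems, the sketch below rounds a size up to the page size queried at run time instead of a hard-coded 4096. The helper name `round_up_to_page` is hypothetical and is not taken from VPP.

```python
import mmap


def round_up_to_page(nbytes: int, page_size: int = mmap.PAGESIZE) -> int:
    """Round nbytes up to a multiple of page_size.

    page_size defaults to the page size of the running system, so the
    same code works with 4K, 16K and 64K kernels.
    """
    if nbytes < 0:
        raise ValueError("nbytes must be non-negative")
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return (nbytes + page_size - 1) // page_size * page_size


# Explicit page sizes, to show how the result changes with the kernel config.
for size in (4096, 16384, 65536):
    print(size, round_up_to_page(70000, size))
```

On a 64K-page kernel, a 70000-byte request occupies two pages (131072 bytes), where a 4K-page kernel would use 18 pages (73728 bytes).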

04/28/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • Resolve VPP compiling issue with clang-6.
    • VPP default compiler is now clang-9, which does not support the optimization options -mcpu=neoverse-n1/-mtune=neoverse-n1
    • N1SDP enablement. - Lijian
      • Multi-arch, arch-specific compiling and dynamic function selection patch is merged.
      • IOMMU limitation issue is gone after upgrading the kernel and firmware
        • Share the upgraded kernel/firmware versions with Govind
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput performance drop as flow number increases in N1SDP.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
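
The idea behind the multi-arch build with dynamic function selection (several variants of one function compiled for different CPUs, with the best supported one chosen at run time) can be sketched conceptually as follows. This is not VPP's actual mechanism: the variant table, the feature names and the priorities are all hypothetical, and the two checksum functions are stand-ins with identical behavior.

```python
from typing import Callable, List, Set, Tuple

# (priority, required CPU feature, implementation); all values hypothetical.
Variant = Tuple[int, str, Callable[[bytes], int]]


def checksum_generic(data: bytes) -> int:
    """Portable fallback variant."""
    return sum(data) & 0xFFFF


def checksum_neoverse_n1(data: bytes) -> int:
    """Stand-in for a variant built with CPU-specific tuning."""
    return sum(data) & 0xFFFF


VARIANTS: List[Variant] = [
    (0, "generic", checksum_generic),
    (10, "neoverse-n1", checksum_neoverse_n1),
]


def select_variant(cpu_features: Set[str],
                   variants: List[Variant]) -> Callable[[bytes], int]:
    """Pick the highest-priority variant the running CPU supports."""
    supported = [v for v in variants
                 if v[1] == "generic" or v[1] in cpu_features]
    return max(supported, key=lambda v: v[0])[2]


fn = select_variant({"neoverse-n1"}, VARIANTS)
print(fn.__name__)  # checksum_neoverse_n1
```

The selection runs once at start-up, so the per-packet path calls the chosen variant directly and pays no feature check.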

04/28/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
    • Arthur Marshall
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
    • Investigate bihash operations, which are hot-spots in L2 throughput
      • Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
      • Confirm whether the CRC32 calculation is compiled with O2 or O3 options - Lijian
    • Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
    • N1SDP enablement. - Lijian
    • GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
      • Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
      • Dual loop unrolling seems to give better performance than quad loop unrolling.
    • The iova_mode == VA failure is not yet root-caused
      • DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA and then do the DMA mapping.
      • However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
        • This issue will not be seen with the latest N1 firmware. The firmware upgrade is pending.
    • Share L2/L3/ACL throughput with and without L3 cache - Govind
    • Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
      • The degradation is seen even when L3 cache is enabled.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Create Confluence page to record all the performance benchmarking data - Lijian
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
        • Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss running the Jenkins job on CentOS with Juraj.
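
The suspected cause of the iova_mode == VA failure (virtual addresses used as IOVAs, against an IOMMU that may accept fewer than 40 address bits) can be illustrated with a simple range check. The 40-bit width is the unconfirmed figure from the notes, and both example addresses are hypothetical values chosen only to show the two cases.

```python
IOVA_BITS = 40  # unconfirmed limit taken from the notes ("less than 40 bits?")


def fits_in_iova_space(addr: int, bits: int = IOVA_BITS) -> bool:
    """True if addr can be expressed in an IOVA space of the given width."""
    return 0 <= addr < (1 << bits)


# Hypothetical user-space virtual address on a 48-bit VA layout.
example_va = 0x7F00_0000_0000
# Hypothetical physical address below 1 TiB.
example_pa = 0x0080_0000_0000

print(fits_in_iova_space(example_va))  # False
print(fits_in_iova_space(example_pa))  # True
```

Under these assumptions a typical user-space VA falls outside the window, while a low physical address fits inside it.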

04/21/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Investigate bihash operations, which are hot-spots in L2 throughput
      • Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
      • Confirm whether the CRC32 calculation is compiled with O2 or O3 options - Lijian
    • Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
    • N1SDP enablement. - Lijian
    • gcc-10 is not working so far.
      • GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
      • Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
      • Dual loop unrolling seems to give better performance than quad loop unrolling.
    • The iova_mode == VA failure is not yet root-caused
      • DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA and then do the DMA mapping.
      • However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
        • This issue will not be seen with the latest N1 firmware. The firmware upgrade is pending.
    • Share L2/L3/ACL throughput with and without L3 cache - Govind
    • Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
      • The degradation is seen even when L3 cache is enabled.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Create Confluence page to record all the performance benchmarking data - Lijian
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
        • Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss running the Jenkins job on CentOS with Juraj.
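
To show why the CRC32 calculation sits on the hot path of a hash-table lookup, the sketch below maps a key to a bucket index with a CRC. It is an illustration only: `zlib.crc32` computes the IEEE CRC-32 and is used here as a stand-in for whatever CRC the data plane uses, and the key layout and table size are hypothetical.

```python
import zlib


def bucket_index(key: bytes, log2_buckets: int) -> int:
    """Map a key to a bucket by hashing it and masking the low bits.

    zlib.crc32 (IEEE CRC-32) is a stand-in hash for illustration.
    """
    if log2_buckets < 0:
        raise ValueError("log2_buckets must be non-negative")
    mask = (1 << log2_buckets) - 1
    return zlib.crc32(key) & mask


# Hypothetical L2 key: a 6-byte MAC address followed by a 2-byte bridge id.
key = bytes.fromhex("001122334455") + (7).to_bytes(2, "big")
idx = bucket_index(key, 10)  # table with 1024 buckets
print(0 <= idx < 1024)  # True
```

Every lookup hashes the key once before touching the bucket, so the cost of the hash and the cache miss on the bucket are paid per packet.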

04/14/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Investigate bihash operations, which are hot-spots in L2 throughput
      • Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
      • Confirm whether the CRC32 calculation is compiled with O2 or O3 options - Lijian
    • Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
    • N1SDP enablement. - Lijian
      • GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
      • Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
      • Dual loop unrolling seems to give better performance than quad loop unrolling.
    • The iova_mode == VA failure is not yet root-caused
      • DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA and then do the DMA mapping.
      • However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
        • This issue will not be seen with the latest N1 firmware. The firmware upgrade is pending.
    • Share L2/L3/ACL throughput with and without L3 cache - Govind
    • Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
      • The degradation is seen even when L3 cache is enabled.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Create Confluence page to record all the performance benchmarking data - Lijian
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
        • Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss running the Jenkins job on CentOS with Juraj.

04/07/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Vectorization
    • Investigate bihash operations, which are hot-spots in L2 throughput
      • Confirm with Damjan whether he plans to rewrite the l2 nodes - Lijian
      • Confirm whether the CRC32 calculation is compiled with O2 or O3 options - Lijian
    • Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
    • N1SDP enablement. - Lijian
      • GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
      • Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
      • Dual loop unrolling seems to give better performance than quad loop unrolling.
    • The iova_mode == VA failure is not yet root-caused
      • DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA and then do the DMA mapping.
      • However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
        • This issue will not be seen with the latest N1 firmware. The firmware upgrade is pending.
    • Share L2/L3/ACL throughput with and without L3 cache - Govind
    • Will try with L3 cache enabled to see whether the performance drop with increasing flow count is fixed. - Govind
      • The degradation is seen even when L3 cache is enabled.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Create Confluence page to record all the performance benchmarking data - Lijian
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
        • Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss running the Jenkins job on CentOS with Juraj.

03/31/2020

03/24/2020

03/17/2020

03/10/2020

03/03/2020

02/25/2020


02/18/2020


02/11/2020


02/04/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Juraj Linkes
    • Tina Tsou
  • General
  • CSIT
  • FD.io lab
    • Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one of them (ThunderX2-01).
      • Cables for Intel NICs have been ordered.
      • Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
    • Script/commands to verify the NICs are ready. Will try the Mellanox NIC on ThunderX2-02 first - Lijian
    • Confirm the power cable requirements for the Vexxhost lab. Inform the lab about the 2 servers coming - Juraj
    • Current Configurations:
      • RAM: 256G
      • Disk: 480G SSD
      • The boxes are coming with Qlogic cards which are not supported in VPP.
    • Changes required to the servers:
      • The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
      • Need 2 Intel NICs XL710-QDA2 for each server.
      • If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
      • Disk size to 480G
      • Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
      • Cables: N1, P1 to N2, P1 and so on
      • Cables for IPMI and Management port: 2
    • Will call a meeting once the ThunderX2 servers reach Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
    • Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
    • Require two ThunderX2 servers in the FD.io lab (currently there is only one ThunderX2 in the lab)
    • Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
    • ThunderX1
  • VPP
    • Align Arm patches with VPP release plan.
      • F0 2020-01-08 APIs frozen. Only low-risk changes accepted on main branch.
      • RC1 2020-01-15 (F0+7) Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
      • RC2 2020-01-22 (RC1+7) Second artifacts posted.
      • Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
    • Vectorization
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Benchmarking AVF drivers on Arm servers - Jieqiang
      • VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
      • Check whether the performance tests include the AVF driver.
    • AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
      • Current N1SDP does not support SR-IOV, so AVF cannot run on N1SDP.
      • Will try one patch to enable N1SDP board.
      • Please try AVF with Mcbin if possible.
    • Investigate bihash operations, which are hot-spots in L2 throughput
      • Applied prefetches and dual loop to the l2_fwd node; failed to apply them to l2_learn
      • Lock-free allocation/free gives 7%-11% improvement on ThunderX2, but no improvement on x86 and Cortex-A72.
      • Cache misses and CRC32 calculation are possible opportunities.
        • To check cycles by applying CRC32 calculation unrolling
    • Benchmark VPP on Dawn N1SDP board
      • Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
    • Investigating bi-hash lockless implementation - Jason
      • First apply make_working_copy to all bihash update operations, then apply RCU to make lookup lock-less
    • Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
    • EPIC for next quarter:
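
The AVF versus DPDK figures recorded above for ThunderX2 can be turned into a relative gain. The notes do not say what the two numbers in each pair represent, so the sketch assumes they are two runs that pair up in order.

```python
# Throughput in Mpps, copied from the notes (VPP+DPDK vs VPP+AVF on ThunderX2).
dpdk_mpps = (5.18, 5.21)
avf_mpps = (8.39, 8.38)


def gain_percent(baseline: float, candidate: float) -> float:
    """Relative improvement of candidate over baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# Assumption: the first AVF number pairs with the first DPDK number, and so on.
gains = [gain_percent(b, c) for b, c in zip(dpdk_mpps, avf_mpps)]
for b, c, g in zip(dpdk_mpps, avf_mpps, gains):
    print(f"{b} -> {c} Mpps: +{g:.1f}%")
```

Under that pairing, AVF is roughly 61% to 62% faster than the DPDK driver in these measurements.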

01/28/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
  • General
  • CSIT
  • FD.io lab
    • Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one of them (ThunderX2-01).
      • Cables for Intel NICs have been ordered.
      • Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
    • Script/commands to verify the NICs are ready. Will try the Mellanox NIC on ThunderX2-02 first - Lijian
    • Confirm the power cable requirements for the Vexxhost lab. Inform the lab about the 2 servers coming - Juraj
    • Current Configurations:
      • RAM: 256G
      • Disk: 480G SSD
      • The boxes are coming with Qlogic cards which are not supported in VPP.
    • Changes required to the servers:
      • The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
      • Need 2 Intel NICs XL710-QDA2 for each server.
      • If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
      • Disk size to 480G
      • Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
      • Cables: N1, P1 to N2, P1 and so on
      • Cables for IPMI and Management port: 2
    • Will call a meeting once the ThunderX2 servers reach Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
    • Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
    • Require two ThunderX2 servers in the FD.io lab (currently there is only one ThunderX2 in the lab)
    • Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
    • ThunderX1
  • VPP
    • Align Arm patches with VPP release plan.
      • F0 2020-01-08 APIs frozen. Only low-risk changes accepted on main branch.
      • RC1 2020-01-15 (F0+7) Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
      • RC2 2020-01-22 (RC1+7) Second artifacts posted.
      • Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
    • Vectorization
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Benchmarking AVF drivers on Arm servers - Jieqiang
      • VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
      • Check whether the performance tests include the AVF driver.
    • AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
      • Current N1SDP does not support SR-IOV, so AVF cannot run on N1SDP.
      • Will try one patch to enable N1SDP board.
      • Please try AVF with Mcbin if possible.
    • Investigate bihash operations, which are hot-spots in L2 throughput
      • Applied prefetches and dual loop to the l2_fwd node; failed to apply them to l2_learn
      • Lock-free allocation/free gives 7%-11% improvement on ThunderX2, but no improvement on x86 and Cortex-A72.
      • Cache misses and CRC32 calculation are possible opportunities.
        • To check cycles by applying CRC32 calculation unrolling
    • Benchmark VPP on Dawn N1SDP board
      • Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
    • Investigating bi-hash lockless implementation - Jason
      • First apply make_working_copy to all bihash update operations, then apply RCU to make lookup lock-less
    • Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
    • EPIC for next quarter:
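
The 20.01 release plan listed above is defined by offsets (RC1 = F0+7, RC2 = RC1+7, release = RC2+7). A short sketch that derives the dates from the F0 date makes it easy to line patches up against each milestone. The dictionary layout is an illustrative choice; the dates and offsets come from the notes.

```python
from datetime import date, timedelta

# F0 (API freeze) date and the 7-day offsets are taken from the notes.
f0 = date(2020, 1, 8)
step = timedelta(days=7)

milestones = {
    "F0": f0,
    "RC1": f0 + step,
    "RC2": f0 + 2 * step,
    "Release": f0 + 3 * step,
}

for name, day in milestones.items():
    print(f"{name}: {day.isoformat()}")
```

The derived dates match the ones written in the plan: 2020-01-15, 2020-01-22 and 2020-01-29.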

01/21/2020


01/14/2020

01/07/2020


12/17/2019

12/10/2019


12/03/2019

11/26/2019

11/19/2019

11/12/2019

10/29/2019

10/22/2019

10/15/2019

10/08/2019

10/01/2019

09/24/2019

09/17/2019

09/10/2019

09/03/2019

08/27/2019

08/20/2019

08/13/2019

08/06/2019

07/30/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
    • Christian Hopps
  • General
  • CSIT
  • FD.io lab
  • VPP
    • https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
    • Align Arm patches with VPP release plan.
      • Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
      • Will check the VPP release schedule and map it to the Arm quarterly plan.
      • Note down patches in community review and align them to VPP release plan.
      • It has been challenging to do that in VPP.
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue: optimize it with relaxed atomic intrinsics - Lijian
    • Vectorization
      • The eth_input_adv_and_flags_x4 optimization is upstreamed and under community review.
      • The patch is also enabled for x86. Will ask maintainer to review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
      • Spinlock with inner loop showed improvement on both x86 and Arm.
      • Read/write lock showed a slight degradation with the patch.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
      • Jieqiang checked the video by Sirshak
    • Apply dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Submitted patches applying it to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
      • Will do benchmarking and profiling on mcbin/Bluefield.
    • Think of memory usage and optimization for smaller device/memory
    • Think about running VPP on big-endian CPU, as there is mixed usage of gcc vector extensions and vector intrinsics
    • Confirm the firmware and VPP-side changes needed to enable mcbin cache stashing - Honnappa
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for aarch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
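
The dual/quad optimization mentioned above processes packets two or four at a time, with a single-packet loop for the remainder. The sketch below shows only the control-flow shape of a quad loop: Python gains nothing from unrolling, and `process_quad` and `handle` are hypothetical names, not VPP functions.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def process_quad(packets: Sequence[T], handle: Callable[[T], R]) -> List[R]:
    """Apply handle to every packet, four per iteration plus a tail loop."""
    out: List[R] = []
    i = 0
    n = len(packets)

    # Quad loop: four independent packets per iteration.
    while n - i >= 4:
        p0, p1, p2, p3 = packets[i:i + 4]
        out.extend((handle(p0), handle(p1), handle(p2), handle(p3)))
        i += 4

    # Single loop: whatever is left (0 to 3 packets).
    while i < n:
        out.append(handle(packets[i]))
        i += 1

    return out


print(process_quad(list(range(10)), lambda x: x * 2))
```

In compiled code the four independent operations give the CPU more work to overlap with memory latency. The notes from other meetings suggest the best unroll factor differs by core, so it is worth measuring dual and quad variants per platform.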

07/23/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
    • Christian Hopps
  • General
  • CSIT
    • VPP Performance Test
    • Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
    • Trying to fix all the failures in the daily performance test. Almost all the tests pass locally.
    • Only 1 out of 199 test cases failed; 8 test cases show random 'show interface' failures.
    • Some failures are related to 'show hardware'/'show interface'/'show vhost dump' time-outs.
    • Adding Taishan test bed to CSIT. Status: https://gerrit.fd.io/r/#/c/16850/
      • Creating a job - everything is ready except the Docker image
    • Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
    • VPP Path
      • Working on MAC learning test failures on Cortex-A72 server - Jieqiang
        • Increasing the duration fixes the failure, but will investigate in more detail.
        • Issues have been fixed in the latest master branch. Investigating the details.
      • Cross compilation in VPP Path. Jira (Juraj): https://jira.fd.io/browse/CTP-3
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not on stable/1901.
        • Send an email with the current debug details to the community, calling for a volunteer to fix it. - Lijian
        • pmalloc module test cases failed on Arm server.
      • Changes are uploaded to community gerrit.
      • VPP VMs seem to come up well. Will work on the init script and bring up VPP.
        • VM tests passed. Patches are to be submitted for community review.
        • All the patches are merged and all images are built.
      • Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
      • Ed to help set up Nomad cluster with dual ThunderX and one ThunderX2
  • FD.io lab
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue: optimize it with relaxed atomic intrinsics - Lijian
    • Vectorization
      • The eth_input_adv_and_flags_x4 optimization is upstreamed and under community review.
      • The patch is also enabled for x86. Will ask maintainer to review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
      • Spinlock with inner loop showed improvement on both x86 and Arm.
      • Read/write lock showed a slight degradation with the patch.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
      • Inform MAP owner that Jieqiang will take care of MAP on VPP. - Lijian
    • Apply dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Submitted patches applying it to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
      • Will do benchmarking and profiling on mcbin/Bluefield.
    • Think of memory usage and optimization for smaller device/memory
    • Think about running VPP on big-endian CPU, as there is mixed usage of gcc vector extensions and vector intrinsics
    • Confirm the firmware and VPP-side changes needed to enable mcbin cache stashing - Honnappa
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for aarch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

07/16/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
    • Christian Hopps
  • General
  • CSIT
    • VPP Performance Test
    • Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
    • Adding Taishan test bed to CSIT. Status: https://gerrit.fd.io/r/#/c/16850/
      • Creating a job - everything is ready except the Docker image
    • Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
    • VPP Path
      • Working on MAC learning test failures on Cortex-A72 server - Jieqiang
        • Increasing the duration fixes the failure, but will investigate in more detail.
        • Issues have been fixed in the latest master branch. Investigating the details.
      • Cross compilation in VPP Path. Jira (Juraj): https://jira.fd.io/browse/CTP-3
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not on stable/1901.
        • Send an email with the current debug details to the community, calling for a volunteer to fix it. - Lijian
      • Changes are uploaded to community gerrit.
      • VPP VMs seem to come up well. Will work on the init script and bring up VPP.
        • VM tests passed. Patches are to be submitted for community review.
        • The patch is split into three smaller pieces. Two patches (kernel image for VM test; generic CSIT changes to support the ThunderX2 testbed) are merged. The third patch, covering code changes for the VM test (Arm-specific code and use of the kernel image), is still to be merged.
      • Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
        • Docker images for both Arm and x86 are merged and available.
        • Docker image is verified on an Arm server; still need to verify it on an x86 server and try it in Jenkins.
      • Ed to help set up Nomad cluster with dual ThunderX and one ThunderX2
  • FD.io lab
    • Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
      • It’s a 1RU ThunderX2 blade.
      • The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
      • The machine should have large RAM: more than 120G, with 256G preferred.
      • The machine should have three NICs (XL710-QDA2, 2x40G).
      • The script assumes the two ThunderX2 servers have the same NIC type and the same fiber SFP type, and that the NICs are plugged into the same PCI slots.
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue: optimize it with relaxed atomic intrinsics - Lijian
    • Vectorization
      • The patch optimizing eth_input_adv_and_flags_x4 is upstreamed and under community review.
      • The patch is also enabled for x86. Will ask the maintainer to review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
      • Spinlock with inner loop shows improvement on both x86 and Arm.
      • Read/write lock shows a slight degradation with the patch.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Apply dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Submitted patches applying it to the dpdk-input, ethernet-input, ip4-input, and ip4-rewrite nodes
      • Will do benchmarking and profiling on mcbin.
    • Think of memory usage and optimization for smaller device/memory
    • Think about running VPP on big-endian CPUs, as the code mixes GCC vector extensions and vector intrinsics
    • Confirm the firmware and VPP-side changes needed to enable mcbin cache stashing - Honnappa
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
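
The "spinlock with inner loop" item in the notes above does not include the patch itself. As a rough illustration only, a lock of this kind is often written as a test-and-test-and-set loop: the outer step tries to take the lock, and the inner loop waits using plain loads. The sketch below assumes C11 atomics; the `sketch_spinlock_*` names are hypothetical and this is not VPP's spinlock implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration only (not a VPP type). */
typedef struct
{
  atomic_bool locked;
} sketch_spinlock_t;

static inline void
sketch_spinlock_init (sketch_spinlock_t *l)
{
  atomic_init (&l->locked, false);
}

static inline void
sketch_spinlock_lock (sketch_spinlock_t *l)
{
  /* Outer loop: attempt to take the lock with an acquire exchange. */
  while (atomic_exchange_explicit (&l->locked, true, memory_order_acquire))
    {
      /* Inner loop: wait with relaxed loads only, so a waiting core reads
         its cached copy instead of repeatedly requesting the cache line
         for writing. */
      while (atomic_load_explicit (&l->locked, memory_order_relaxed))
        ;
    }
}

static inline void
sketch_spinlock_unlock (sketch_spinlock_t *l)
{
  /* Unlocking a lock that is not held is a caller bug. */
  assert (atomic_load_explicit (&l->locked, memory_order_relaxed));
  /* The release store publishes the critical section to the next owner. */
  atomic_store_explicit (&l->locked, false, memory_order_release);
}
```

Whether the inner loop helps depends on the core and the contention level, which is consistent with the notes reporting different results for the spinlock and the read/write lock.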

07/09/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
    • Christian Hopps
  • General
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
    • Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
      • creating a job. - Everything is ready except the docker image
    • Target: master trending job - first create a trending graph from daily data, then create a release report (requires some manual work)
    • VPP Path
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
        • Send an email with the current debug details to the community, calling for a volunteer to fix it. - Lijian
      • Changes are uploaded to community gerrit.
      • VPP VMs seem to come up well. Will work on the init script and bring up VPP.
        • VM tests passed. Patches are to be submitted for community review.
      • Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
        • Docker images for both Arm and x86 are merged and available.
      • Ed to help set up Nomad cluster with dual ThunderX and one ThunderX2
  • FD.io lab
    • Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
      • Update the current status to Pravin. - Lijian
      • The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
      • Requires more than 120G of RAM; 256G preferred.
      • Three NICs and each has two ports.
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue: optimize it with relaxed atomic intrinsics - Lijian
    • Vectorization
      • The patch optimizing eth_input_adv_and_flags_x4 is upstreamed and under community review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
      • Spinlock with inner loop shows improvement on both x86 and Arm.
      • Read/write lock shows a slight degradation with the patch.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Apply dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Submitted patches applying it to the dpdk-input, ethernet-input, ip4-input, and ip4-rewrite nodes
      • Will do benchmarking and profiling on mcbin.
    • Think of memory usage and optimization for smaller device/memory
    • Think about running VPP on big-endian CPUs, as the code mixes GCC vector extensions and vector intrinsics
    • Confirm the firmware and VPP-side changes needed to enable mcbin cache stashing - Honnappa
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
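
The "dual/quad optimization" item in the notes above refers to processing packets two or four at a time in a graph node. The notes do not show the patches, so the following is only a minimal sketch of the quad-loop shape under stated assumptions: the per-item work (`sketch_process_one`) and all names are hypothetical stand-ins for real node processing, and `__builtin_prefetch` assumes a GCC or Clang compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-packet work: decrement a TTL-like byte. */
static inline void
sketch_process_one (uint8_t *ttl)
{
  *ttl = (uint8_t) (*ttl - 1);
}

/* Quad loop: handle four items per iteration, then the leftovers singly. */
static inline void
sketch_process_quad (uint8_t *ttl, size_t n)
{
  assert (ttl != NULL || n == 0);
  size_t i = 0;

  while (n - i >= 4)
    {
      /* Prefetch the next group while working on the current one. */
      if (n - i >= 8)
        __builtin_prefetch (&ttl[i + 4]);

      sketch_process_one (&ttl[i + 0]);
      sketch_process_one (&ttl[i + 1]);
      sketch_process_one (&ttl[i + 2]);
      sketch_process_one (&ttl[i + 3]);
      i += 4;
    }

  /* Single loop for the remaining 0 to 3 items. */
  while (i < n)
    {
      sketch_process_one (&ttl[i]);
      i++;
    }
}
```

The four independent calls per iteration give the compiler and the core more work to overlap, and loop overhead is paid once per four items. The benchmarking mentioned in the notes is what would show whether this pays off on a given core such as Cortex-A72.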


07/02/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
  • General
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
    • Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
      • creating a job. - Everything is ready except the docker image
    • Target: master trending job - first create a trending graph from daily data, then create a release report (requires some manual work)
    • VPP Path
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
        • Send an email with the current debug details to the community, calling for a volunteer to fix it. - Lijian
      • VPP VMs seem to come up well. Will work on the init script and bring up VPP.
      • Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
      • Set up Nomad cluster with dual ThunderX and one ThunderX2
  • FD.io lab
    • Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
      • Update the current status to Pravin. - Lijian
      • The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
      • Requires more than 120G of RAM; 256G preferred.
      • Three NICs and each has two ports.
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue, remove atomic intrinsics and use lock version only - Lijian
      • Have verified removing atomic intrinsics in message_queue alloc/free APIs, and require confirmation from Florin.
    • Vectorization
      • The patch optimizing eth_input_adv_and_flags_x4 is upstreamed and under community review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
    • Fix ip4_forward compiling - Jason
      • Will check the Gerrit CI/CD runs related to that patch, and why it does not raise a warning in Gerrit Jenkins.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Spread dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Will do benchmarking and profiling on mcbin.
    • Think of memory usage and optimization for smaller device/memory
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

06/25/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
  • General
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
    • Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
      • creating a job. - Everything is ready except the docker image
    • Target: master trending job - first create a trending graph from daily data, then create a release report (requires some manual work)
    • VPP Path
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
      • Crypto test cases: will use the DPDK driver if configured, otherwise the native VPP implementation, falling back to OpenSSL
        • Will try Crypto test cases next week - Juraj
      • Juraj to send Lijian the details of vpp VMs, Lijian will confirm internally
  • FD.io lab
    • Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
      • Firstly will sponsor the machine
      • The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
      • Requires more than 120G of RAM; 256G preferred.
      • Three NICs and each has two ports.
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue, remove atomic intrinsics and use lock version only - Lijian
      • Have verified removing atomic intrinsics in message_queue alloc/free APIs, and require confirmation from Florin.
    • Vectorization
      • The patch optimizing eth_input_adv_and_flags_x4 is upstreamed and under community review.
    • Spinlock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Spread dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Will do benchmarking and profiling on mcbin.
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

06/18/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina Tsou
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
  • General
  • CSIT
  • FD.io lab
    • Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
      • Will confirm with Florin to remove atomic intrinsics in message_queue alloc/free APIs
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - Upstreamed.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Investigate hyperscan plugin in VPP - Sirshak
    • Spread dual/quad optimization - ethernet-input
    • Redo perf/MAP profiling/benchmarking
      • DPI plugin?
    • EPIC for next quarter:
      • Apply dual/quad optimization on more data path nodes
      • Investigate and optimize VPP hash and bihash library
      • VPP translation overhead analysis between Mbuf and VLIB buffer ENTNET-1293
      • VPP Memif performance analysis and optimization ENTNET-1292
      • VPP l3fwd performance analysis and optimization ENTNET-751
      • Using MAP with VPP ENTNET-1288

06/11/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina Tsou
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj
  • General
  • CSIT
  • FD.io lab
    • Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
      • Will confirm with Florin to remove atomic intrinsics in message_queue alloc/free APIs
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - Upstreamed.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Investigate hyperscan plugin in VPP - Sirshak
    • Spread dual/quad optimization - ethernet-input
    • Redo perf/MAP profiling/benchmarking
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

06/04/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina Tsou
    • Lijian Zhang
    • Jieqiang Wang
    • Stan
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - Upstreamed.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Investigate hyperscan plugin in VPP - Sirshak
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

05/28/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
    • Lijian Zhang
    • Vijay (vijayakumar.rajamanickam@nokia.com)
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - under internal review.
    • MAP with VPP - Tried internal patch; still failing. Continuing to work on it.
    • Investigate hyperscan plugin in VPP - Sirshak
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

05/21/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
    • Lijian Zhang
    • Vijay (vijayakumar.rajamanickam@nokia.com)
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - under internal review.
    • MAP with VPP - Tried internal patch; still failing. Continuing to work on it.
    • Investigate hyperscan plugin in VPP - Sirshak
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

05/14/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
    • Lijian Zhang
    • Vijay (vijayakumar.rajamanickam@nokia.com)
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • VPP generic distro package building patch - Patch updated. Requires Damjan's follow-up review.
    • Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - under internal review.
    • MAP with VPP - Tried internal patch; still failing. Continuing to work on it.
    • Investigate hyperscan plugin in VPP - Sirshak
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

05/07/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
    • Lijian Zhang
    • Vijay (vijayakumar.rajamanickam@nokia.com)
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
      • Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input.
    • VPP generic distro package building patch - Patch updated. Requires Damjan's follow-up review.
    • Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • Vectorization in esp-encrypt, optimize memcpy_le. Upstreamed(https://gerrit.fd.io/r/#/c/18398/). - Lijian
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - under internal review.
    • MAP (Arm proprietary performance analysis tool) with VPP - Tried internal patch; still failing. Continuing to work on it.
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

04/30/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
      • Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input.
    • VPP generic distro package building patch - Patch updated. Requires Damjan's follow-up review.
    • Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • Vectorization in esp-encrypt, optimize memcpy_le. Upstreamed(https://gerrit.fd.io/r/#/c/18398/). - Lijian
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - under internal review.
    • MAP (Arm proprietary performance analysis tool) with VPP - Tried internal patch; still failing. Continuing to work on it.
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

04/23/2019

  • Attendees
    • Sirshak Das
    • Lijian Zhang
    • Juraj Linkeš
    • Vijay
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
    • Package Installation error Status(Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - first create a trending graph from daily data, then create a release report (requires some manual work)
    • Add license header/copy right to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are well connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection through QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
      • Investigate why these three blades have only one NUMA node - Juraj
  • VPP
    • Investigate session_queue_node_fn/vlib_worker_loop.
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
      • Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input
    • Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
    • Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
    • Vectorization
    • TAS patch will be ready soon (Sirshak)
    • MAP with VPP is ongoing - Sirshak
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
  • Action Items - Last Week
  • Action Items - Next Week
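
Illustrative sketch for the ATOMIC_ACQUIRE item under VPP above (not VPP code): one common way to reduce acquire-ordered loads in a polling loop is to poll with a relaxed load and issue an acquire fence only when work is actually found. The demo_queue_t type and the demo_* functions are hypothetical names invented for this example, written against C11 <stdatomic.h>; whether the pattern applies to foreach_device_and_queue would have to be checked against the real code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-slot queue; this is NOT a VPP data structure. */
typedef struct {
    _Atomic uint32_t pending; /* set to 1 by the producer when payload is valid */
    uint32_t payload;         /* plain data, published through 'pending' */
} demo_queue_t;

/* Producer: write the payload first, then publish it with release ordering. */
static void demo_publish(demo_queue_t *q, uint32_t value)
{
    q->payload = value;
    atomic_store_explicit(&q->pending, 1, memory_order_release);
}

/* Consumer: the empty (common) case costs only a relaxed load.
 * The acquire fence is paid only when there is something to read. */
static bool demo_poll(demo_queue_t *q, uint32_t *out)
{
    if (atomic_load_explicit(&q->pending, memory_order_relaxed) == 0)
        return false;

    /* Pairs with the producer's release store, so 'payload' is visible. */
    atomic_thread_fence(memory_order_acquire);
    *out = q->payload;

    /* Release, so a producer that waits for pending == 0 with acquire
     * ordering (not shown here) could safely reuse the slot. */
    atomic_store_explicit(&q->pending, 0, memory_order_release);
    return true;
}
```

On a weakly ordered architecture such as AArch64 the saving, if any, comes from the empty-poll path; it would need to be measured rather than assumed.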

04/16/2019

  • Attendees
    • Sirshak Das
    • Lijian Zhang
    • Juraj Linkeš
    • Vijay
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
    • Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are properly connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection through the QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical: the first has two NUMA nodes, and the other three have one NUMA node each.
      • Investigate why these three blades have only one NUMA node - Juraj
  • VPP
    • Investigate session_queue_node_fn/vlib_worker_loop. - https://jira.arm.com/browse/ENTNET-1179 - Done
      • Will create two Jira tickets to track the findings. - Lijian
    • Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
    • Investigating message queue - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
    • Vectorization
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
  • Action Items - Last Week
  • Action Items - Next Week

04/09/2019

  • Attendees
    • Sirshak Das
    • Lijian Zhang
    • Juraj Linkeš
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
    • Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are properly connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection through the QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical: the first has two NUMA nodes, and the other three have one NUMA node each.
      • Investigate why these three blades have only one NUMA node - Juraj
  • VPP
  • VPP Hoststack
    • Investigate session_queue_node_fn/vlib_worker_loop. - https://jira.arm.com/browse/ENTNET-1179 - Done
    • Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
    • Investigating message queue - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
    • Vectorization
      • Vectorization in esp-encrypt, https://gerrit.fd.io/r/#/c/18398/ - Improvement on ThunderX/OcteonTX/Taishan, but degradation on ThunderX2 - Lijian
      • ethernet-input - will implement for AArch64 128-bit only
      • Create vectorization specific EPIC - Lijian
  • Action Items - Last Week
  • Action Items - Next Week

04/02/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPUs to VPP main, VPP worker, and the iperf3 server, VPP hoststack performs better than the Linux stack on both ThunderX2 and Taishan servers.
    • Investigate session_queue_node_fn/message queue data structure. - Investigating the source code
    • Review https://gerrit.fd.io/r/#/c/18398/ - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed. Perf degradation is fixed.
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • List all the blockers on aarch64 in CSIT wiki page - Stan or Juraj
    • Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Build both binaries and packages with the generic option by default, and provide the Makefile variable NATIVE_OPTIMIZE=Y for end users to build natively optimized images.
      • Prepare email and a draft patch asking for comments from the community - Lijian
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are properly connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection through the QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical: the first has two NUMA nodes, and the other three have one NUMA node each.
      • Investigate why these three blades have only one NUMA node - Juraj
  • VPP
    • Write description/expectation for the two NEON-related patches - Lijian
    • Investigating performance degradation on CortexA72 - Sirshak
    • Message queue - Sirshak
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - To close it.
    • Vectorization
      • ethernet-input - no progress yet
    • 128B cache line size
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week
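
Illustrative sketch for the 128B cache line size item above (not VPP code): the example aligns a per-worker structure to an assumed 128-byte line so that two workers' counters cannot share a line, and counts how many lines a buffer touches for a given line size. DEMO_CACHE_LINE_BYTES, demo_worker_counters_t and demo_lines_spanned are hypothetical names chosen for this example; the real line size is platform-dependent.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed line size for this sketch only. */
#define DEMO_CACHE_LINE_BYTES 128

/* Aligning the first member aligns the whole struct, and the compiler
 * pads its size up to a multiple of that alignment. */
typedef struct {
    alignas(DEMO_CACHE_LINE_BYTES) uint64_t rx_packets;
    uint64_t rx_bytes;
} demo_worker_counters_t;

/* Each array element therefore occupies exactly one assumed cache line. */
static_assert(sizeof(demo_worker_counters_t) == DEMO_CACHE_LINE_BYTES,
              "one element per cache line");
static_assert(alignof(demo_worker_counters_t) == DEMO_CACHE_LINE_BYTES,
              "elements start on a line boundary");

/* Number of cache lines touched by the byte range [offset, offset + len). */
static size_t demo_lines_spanned(size_t offset, size_t len, size_t line_bytes)
{
    if (len == 0)
        return 0;
    size_t first = offset / line_bytes;
    size_t last = (offset + len - 1) / line_bytes;
    return last - first + 1;
}
```

A larger line size reduces the number of lines a buffer touches but also increases padding per structure, which is one possible reason a build-time line size can shift performance in either direction on a given platform.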

03/26/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPUs to VPP main, VPP worker, and the iperf3 server, VPP hoststack performs better than the Linux stack on both ThunderX2 and Taishan servers.
    • Investigate session_queue_node_fn/message queue data structure. - Investigating the source code
    • Review https://gerrit.fd.io/r/#/c/18398/ - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed. Perf degradation is fixed.
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Build both binaries and packages with the generic option by default, and provide the Makefile variable NATIVE_OPTIMIZE=Y for end users to build natively optimized images.
      • Prepare email and a draft patch asking for comments from the community - Lijian
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are properly connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection through the QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical: the first has two NUMA nodes, and the other three have one NUMA node each.
      • Investigate why these three blades have only one NUMA node - Juraj
  • VPP
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - To close it.
    • Vectorization
      • ethernet-input - no progress yet
    • 128B cache line size
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

03/19/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPUs to VPP main, VPP worker, and the iperf3 server, VPP hoststack performs better than the Linux stack on both ThunderX2 and Taishan servers.
    • vlib_worker_loop and session_queue_node_fn are two major hot-spots. - Just started
    • Enable NEON instruction in Buffer pool free function. Patch is committed.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed, but still working on issues, e.g., performance degradation
      • Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Done by Malvika.
    • Running VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
      • Prepare email and a draft patch asking for comments from the community - Lijian
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node. Also blocked by QSFP+ issue.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj setup call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - In internal review.
    • Vectorization
    • 128B cache line size
      • VPP image with 128B cache line size crashed on ThunderX2 - Cannot reproduce crash with my setup
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week
    • Commit VPP distro making patch - Lijian
    • Plug the 25G NIC into the Taishan server, and connect the 25G ports to the x86 25G NIC - Lijian
    • Follow Jianlin's suggestion, update U-Boot and kernel, and then sync up with Juraj - Lijian

03/12/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
    • Tina to update the meeting notice.
  • VPP Hoststack
    • After assigning dedicated CPUs to VPP main, VPP worker, and the iperf3 server, VPP hoststack performs better than the Linux stack on both ThunderX2 and Taishan servers.
    • vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
    • Enable NEON instruction in Buffer pool free function. Patch is committed.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. All test failures are resolved.
      • Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Switched to Malvika.
    • Running VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
      • Prepare email and a draft patch asking for comments from the community - Lijian
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj setup call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - In internal review.
    • Vectorization
    • 128B cache line size
      • VPP image with 128B cache line size crashed on ThunderX2
    • thunderx2 crashing - No update
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week
    • Commit VPP distro making patch - Lijian
    • Plug the 25G NIC into the Taishan server, and connect the 25G ports to the x86 25G NIC - Lijian
    • Follow Jianlin's suggestion, update U-Boot and kernel, and then sync up with Juraj - Lijian

03/05/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPUs to VPP main, VPP worker, and the iperf3 server, VPP hoststack performs better than the Linux stack on both ThunderX2 and Taishan servers.
    • vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. All test failures are resolved.
      • Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Switched to Malvika.
    • Running VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj setup call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - No progress
      • Investigate with the latest VPP code on an x86 server - Lijian - Email the vpp-dev mailing list if there is a problem. Will not put much effort into this.
    • Vectorization
      • ethernet-input
      • buffer pools
    • 128B cache line size
      • Will try this on Taishan server - Slight performance degradation with a 128-byte cache line
    • thunderx2 crashing - No update
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

02/26/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPUs to VPP main, VPP worker, and the iperf3 server, VPP hoststack performs better than the Linux stack on both ThunderX2 and Taishan servers.
    • el0_sys hot-spot on Taishan D05 only, no plan to fix it.
    • vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
    • memcpy optimization
      • memcpy patch verification on Taishan by Khem, L3 forwarding use case - Lijian. Status (Khem): No updates.
      • memcpy patch consumes more clocks in OcteonTX2 - updated by Nitin.
      • Stopped working on this patch.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Test failure on SCTP, not root-caused yet.
      • Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Switched to Malvika
    • Running VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj setup call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Buffer Pools per NUMA
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
      • Investigate with the latest VPP code on an x86 server - Lijian - Email the vpp-dev mailing list if there is a problem. Will not put much effort into this.
    • Vectorization
      • ethernet-input
      • buffer pools
    • 128B cache line size
      • Will try this on Taishan server - Slight performance degradation with a 128-byte cache line
    • Qualcomm no change iperf3
    • thunderx2 crashing - No update
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

02/19/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPUs to VPP main, VPP worker, and the iperf3 server, VPP hoststack performs better than the Linux stack on both ThunderX2 and Taishan servers.
    • memcpy optimization
      • memcpy patch verification on Taishan by Khem, L3 forwarding use case - Lijian. Status (Khem): No updates.
      • memcpy patch consumes more clocks in OcteonTX2 - updated by Nitin.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch.
      • Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation.
    • Running VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
    • Target: master trending job
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj has yet to try them.
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj setup call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Buffer Pools per NUMA
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
    • 1GB page taking long time Status: fixed.
      • Investigate with latest VPP code on x86 server
    • Vectorization
      • ethernet-input
      • buffer pools
      • memcpy
    • 128B cache line size
      • Will try this on Taishan server - Lijian
    • Qualcomm no change iperf3
    • thunderx2 crashing - No update
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

02/11/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • memcpy optimization
      • memcpy patch verification on Taishan by Khem (L3 forwarding use case) - Lijian. Status (Khem): No updates.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch.
      • Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
  • CSIT
    • VPP Performance Test
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible.
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config
      • b. merging CSIT patch.
      • c. creating a job.
    • Target: master trending job
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj has yet to try them.
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj setup call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Buffer Pools per NUMA
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
    • 1GB page taking long time Status: fixed.
    • Vectorization
      • ethernet-input
      • buffer pools
      • memcpy
    • 128B cache line size
    • Qualcomm no change iperf3
    • thunderx2 crashing
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

02/05/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • memcpy optimization
      • Check that the optimized memory copy versions are deployed on Taishan and ThunderX2 at runtime - Lijian
      • Send memcpy patch to Khem and Fede for further verification - Lijian. Status: Fede: small improvement on mcbin with iperf3; Khem to try it with L3 forwarding.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation
      • Working on svm_fifo alternate version with front and back pointers synchronized instead of cursize.
    • Verifying per NUMA node buffer pool https://gerrit.fd.io/r/#/c/16638/
      • Sirshak created a Jira issue in the FD.io Jira: https://jira.fd.io/browse/VPP-1560
      • Hanging of VPP is actually VPP taking a lot of time to allocate 400K chunks for 1GB - Damjan has this in his todo list
      • gcc-8 compilation still fails on ARM.
      • Octeon-Tx failure. Status: unknown
    • Gorka is trying some optimal configs for VCL. Status: no updates.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • OcteonTx boots to buildroot with no dhclient hence an impasse. Still not clear how to use USB stick.
  • CSIT
    • VPP Path
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx. Status: no updates.
      • Kernel Migration on mcbin. Status:
      • ThunderX2:
    • VPP Performance Test
      • Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
      • Juraj to come up with a solution for the NUMA node anomaly on Taishan.
      • https://gerrit.fd.io/r/#/c/16850/ Status: Juraj has a version all ready to work. Package installation blocker.
      • Package installation error Status: Juraj to investigate logs.
  • FD.io lab
    • ThunderX1 -
      • New QSFP+ switch for ThunderX1 is available now: QSFP+ to be connected SFP+ switch.
      • Juraj to setup a call with LF folks on.
    • ThunderX2 -
      • Andy is still waiting for cables.
      • Juraj to remind Andy of when the cable will be available.
      • Juraj to follow up on ssh connectivity to thunderx2.
  • VPP
    • Vectorization
      • [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
      • [Lijian] Check whether setting the default cache line size to 128 will degrade throughput on Taishan/Qualcomm/ThunderX2
        • no perf diff in Qualcomm
        • vpp crashes on thunderx2
        • waiting for results on A72 (Taishan)
      • [Sirshak] on ethernet-input node, investigate vectorized buffer index, Damjan's per numa node buffer pool patch. Status: No updates
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week
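Note on the svm_fifo item above: the following is a minimal sketch of the general idea of a single-producer/single-consumer FIFO that tracks free-running head and tail indices instead of a shared cursize counter. It is an illustration under assumptions, not the svm_fifo patch discussed in the meeting. The names SpscFifo, enqueue, and dequeue are hypothetical, and Python is used only to show the index arithmetic; a C implementation would also need acquire/release ordering on the index loads and stores, and would rely on unsigned wraparound of the counters.

```python
class SpscFifo:
    """Single-producer/single-consumer byte FIFO (illustrative sketch only)."""

    def __init__(self, capacity):
        # Capacity must be a power of two so indices can be masked.
        assert capacity > 0 and capacity & (capacity - 1) == 0
        self.buf = bytearray(capacity)
        self.mask = capacity - 1
        self.head = 0  # advanced only by the consumer
        self.tail = 0  # advanced only by the producer

    def enqueue(self, data):
        # Occupancy is derived from the two indices; no shared size counter.
        free = len(self.buf) - (self.tail - self.head)
        n = min(free, len(data))
        for i in range(n):
            self.buf[(self.tail + i) & self.mask] = data[i]
        self.tail += n  # publish only after the bytes are in place
        return n

    def dequeue(self, max_bytes):
        used = self.tail - self.head
        n = min(used, max_bytes)
        out = bytes(self.buf[(self.head + i) & self.mask] for i in range(n))
        self.head += n  # release space only after the bytes are copied out
        return out
```

Because each index has exactly one writer, the producer and consumer never update the same variable, which is the property that a shared cursize field does not have.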

01/29/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Lijian Zhang
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Honnappa Nagarahalli
    • John Ddigilio
  • General
  • VPP Hoststack
    • TaiShan server with Debian distro crashed on the 'ip probe-neighbor' command while running VPP hoststack with iperf3.
    • With 64-byte packets on ThunderX2 with a 10G NIC, VPP hoststack bandwidth is about 1/2 that of the Linux kernel stack.
    • With 64-byte packets on Taishan with a 10G NIC, VPP hoststack bandwidth is about 2x that of the Linux kernel stack.
    • Memory copy patch gives a 4% improvement in VPP hoststack on Taishan server.
    • Check that the optimized memory copy versions are deployed on Taishan and ThunderX2 at runtime - Lijian
    • Send memcpy patch to Khem and Fede for further verification - Lijian
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo, ip4_local_forward node and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Verifying https://gerrit.fd.io/r/#/c/16638/ - Supposed to give better performance, but VPP hangs with this patch on some Arm machines.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • OcteonTX has been received in the Arm lab. Will boot it up first and then start profiling with it.
  • FD.io lab
    • ThunderX1 -
      • New Arista switch for ThunderX1 is available now. Gathering details required by the LF lab before sending the switch to the CSIT lab. - Juraj - Andy will try to send the switch to CSIT this Thursday.
    • ThunderX2 -
      • Cable type is confirmed. Procurement is in progress.
      • Juraj to remind Andy of when the cable will be available.
      • Require access to these servers in FD.io lab. Anton gives the IP to access them.(ADMIN/ADMIN)
  • CSIT
    • VPP Path
      • So far so good.
      • ARM CI results are overwritten by x86 machines. Should be a Jenkins issue. Monitor whether this corner case happens again. - Juraj
    • VPP Device
      • thunderx: 1-node topology on Cavium ThunderX. Basic skeleton of the Docker topology is done. Moving on to deploying the scripts. Juraj is able to modify and execute the scripts in a container. Remaining fixes: scripts for the 1-link 1-node topology and binding interfaces to VPP. A traffic test ran successfully.
      • Kernel migration on mcbin. Juraj is able to build all the images, but got a kernel panic. Plan was to try a more recent U-Boot version; the latest U-Boot image still has the same issue.
      • Juraj to investigate further work once ThunderX2 is available.
    • VPP Performance Test
  • VPP
    • Vectorization
      • [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
      • [Lijian] Check whether setting the default cache line size to 128 will degrade throughput on Taishan/Qualcomm/ThunderX2
      • [Sirshak] on ethernet-input node, investigate vectorized buffer index.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
    • [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable both tests are merged. No test cases remain in the blacklist for AArch64 machines.
    • [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
  • Action Items - Next Week
    • [Sirshak] -

01/22/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Lijian Zhang
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Honnappa Nagarahalli
    • John Ddigilio
  • General
  • VPP Hoststack
    • TaiShan server with Debian distro crashed on the 'ip probe-neighbor' command while running VPP hoststack with iperf3.
    • With 64-byte packets on ThunderX2 with a 10G NIC, VPP hoststack bandwidth is about 1/4 that of the Linux kernel stack.
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo, ip4_local_forward node and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • OcteonTX has been received in the Arm lab. Will boot it up first and then start profiling with it.
  • FD.io lab
    • ThunderX1 -
      • New Arista switch for ThunderX1 is available now. Gathering details required by the LF lab before sending the switch to the CSIT lab. - Juraj - Andy will try to send the switch to CSIT this Thursday.
    • ThunderX2 -
      • Cable type is confirmed. Procurement is in progress.
      • Require access to these servers in FD.io lab.
  • CSIT
    • VPP Path
      • So far so good.
      • ARM CI results are overwritten by x86 machines. Should be a Jenkins issue. Monitor whether this corner case happens again. - Juraj
    • VPP Device
      • thunderx: 1-node topology on Cavium ThunderX. Basic skeleton of the Docker topology is done. Moving on to deploying the scripts. Juraj is able to modify and execute the scripts in a container. Remaining fixes: scripts for the 1-link 1-node topology and binding interfaces to VPP.
      • Kernel Migration on mcbin. Juraj is able to build all the images, but got kernel panic. Try with the more recent uBoot version.
      • Juraj to investigate further work once ThunderX2 is available.
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • Development of the L2 test script is under way.
      • Khem will get L2 working in CI first, then IP4 and other test cases.
      • Stan has started working on performance scripts with Khem and is able to connect to the Taishan machines in the CSIT lab.
      • The performance topology on the wiki is to be updated per the file below.
      • https://github.com/FDio/csit/blob/master/docs/lab/Testbeds_Xeon_Skx_Arm_Atom.md
      • Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
        • Install Ubuntu-18.04 on Huawei Taishan servers first, and then investigate upstreaming the performance test framework to enable AArch64
        • Lijian to verify Ubuntu-18.04 on Taishan server.
      • Stan installed the latest CSIT scripts on the packet generator server (x86 NEON) and Taishan servers in the FD.io lab.
      • https://gerrit.fd.io/r/#/c/16850/
      • Some of L2 and L3 test cases passed.
  • VPP
    • Vectorization
      • [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
      • [Sirshak] on ethernet-input node, investigate vectorized buffer index.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
    • [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable both tests are merged. No test cases remain in the blacklist for AArch64 machines.
    • [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
  • Action Items - Next Week
    • [Sirshak] - To update patch list in VPP/Aarch64 wiki

01/15/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Lijian Zhang
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Honnappa Nagarahalli
    • John Ddigilio
  • General
  • VPP Hoststack
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • OcteonTX has been received in the Arm lab. Will boot it up first and then start profiling with it.
  • FD.io lab
    • ThunderX2 -
      • New Arista switch is available now. Gathering details required by the LF lab before sending the switch to the CSIT lab. - Juraj
      • Cable type is confirmed. Procurement is in progress.
  • CSIT
    • VPP Path
      • IP4 reassembly and GBP failures are fixed. Patches to enable both tests are merged. No test cases remain in the blacklist for AArch64 machines.
      • We have voting verify on bionic. Upload nexus disabled but merge job working. - Juraj created LF ticket for nexus upload. Both master merge job and verifying job are working fine.
      • ARM CI results are overwritten by x86 machines. Should be a Jenkins issue. Monitor whether this corner case happens again. - Juraj
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
      • Kernel Migration on mcbin. Juraj is able to build all the images.
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • Development of the L2 test script is under way.
      • Khem will get L2 working in CI first, then IP4 and other test cases.
      • Stan has started working on performance scripts with Khem and is able to connect to the Taishan machines in the CSIT lab.
      • The performance topology on the wiki is to be updated per the file below.
      • https://github.com/FDio/csit/blob/master/docs/lab/Testbeds_Xeon_Skx_Arm_Atom.md
      • Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
  • VPP
    • Vectorization
      • [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
    • [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable both tests are merged. No test cases remain in the blacklist for AArch64 machines.
    • [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
  • Action Items - Next Week
    • [Sirshak] - To update patch list in VPP/Aarch64 wiki

01/08/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Lijian Zhang
    • Stanislav Chlebec
    • Khemendra Kumar
  • General
  • VPP Hoststack
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
    • [Lijian] Working on IP4 reassembly and GBP failures. - fixed. Juraj has upstreamed patches to enable these two tests.
    • [Sirshak] Kernel Migration mcbin. Juraj is working on it based on Jianlin's suggestion.
    • [Andy] Getting a new Arista switch next year.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy - Macro benchmarking is done and data is updated to Jira.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
  • CSIT
    • VPP Path
  • VPP Path Failures
      • We have voting verify on bionic. Upload nexus disabled but merge job working. - Juraj created LF ticket for nexus upload. Both merge job and verifying job are working fine.
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
      • thunderx2: Juraj working with LF to get this resolved.
      • mcbin: Juraj can contact Jianlin if needed.
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • Development of the L2 test script is under way.
      • Khem will get L2 working in CI first, then IP4 and other test cases.
      • Stan is starting to work on the VPP performance test. Khem to email Stan about VPP performance testing.
  • FD.io lab
    • New Arista switch to be procured next year.
    • ThunderX2 - Racked. Andy is trying to buy cables compatible with Intel XL710. Juraj to confirm the info required by lab people before the cables are sent out.
  • Action Items - Next Week

12/18/2018

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Tina Tsou
    • Stanislav Chlebec
    • Avinash
    • Khemendra
  • General
  • VPP Hoststack
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working.
    • [Lijian] Working on IP4 reassembly and GBP failures. - Some preliminary on gbp waiting Neale. Juraj to give access to Lijian to investigate on ThunderX.
    • [Sirshak] Kernel Migration mcbin. Status: Jianlin to work with Juraj to get fd.io mcbins up and running. Sirshak to setup a meeting.
    • [Andy] Getting a new Arista switch next year.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy - Still benchmarking and setting it up for internal review.
      • [Lijian] Patch for compiling issue with GCC-8.x is under community review. Status: No updates.
      • [Lijian] Patch for fixing StringTest failure is under community review. Status: Abandoned.
      • [Lijian] Patch for CDP failure is under community review. Status: No updates.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC.
  • CSIT
    • VPP Path
  • VPP Path Failures
    • https://jira.fd.io/browse/VPP-1475 - IP4 random reassembly failure in master, also seen on x86
    • https://jira.fd.io/browse/VPP-1491 - GBP L3/L2 Endpoint Learning failure
      • We have voting verify on bionic. Upload nexus disabled but merge job working. Juraj to create LF ticket for nexus upload.
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
      • thunderx2: Sirshak working with LF to get this resolved.
      • mcbin: Sirshak to setup a meeting between Juraj and Jianlin.
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • Development of the L2 test script is under way.
      • Khem will get L2 working in CI first, then IP4 and other test cases.
  • FD.io lab
    • New Arista switch to be procured next year.
    • ThunderX2 - Racked. IPMI Static IP configuration missing. Sirshak with LF.
  • Action Items - Next Week

12/11/2018

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Tina Tsou
    • Stanislav Chlebec
  • VPP Hoststack
    • iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking, comparing kernel and VPP hoststack performance. Three cases: kernel to kernel; kernel to VPP hoststack; VPP hoststack to VPP hoststack
    • ongoing perf analysis. One patch(https://gerrit.fd.io/r/#/c/16184/) is merged, and the other one is under internal review.
    • Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
    • Gorka is trying some optimal configs for VCL.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. Two scripts of L2 performance suites for CI management repository are done, investigating on for CSIT repository, and three more scripts to be developed.
    • [Lijian] Working on IP4 reassembly and GBP failures
    • [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from garcia and damjan. - no progress so far. - To confirm with Jianling and Joyce.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy - Second priority, no update so far.
      • [Lijian] Patch for compiling issue with GCC-8.x is under community review.
      • [Lijian] Patch for fixing StringTest failure is under community review.
      • [Lijian] Patch for CDP failure is under community review.
    • Memory Ordering
      • [Sirshak] To start work on Arithmetic and Logic relaxed functions.
  • VPP Path failures
  • CSIT
    • VPP Path
      • Actually, everything is ready. The only thing is to get CI patch merged.
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx is in place, but there are errors. Will continue investigation.
      • thunderx2: Racked. Lack of static IP. Sirshak gave Anton a workaround for the missing static IP.
      • mcbin: Kernel issue yet to try suggestion from Garcia and Damjan. To confirm with Jianling and Joyce - Lijian
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • Development of the L2 test script is under way. Khem will get L2 working in CI first, then IP4 and other test cases.
  • FD.io lab
    • Arista switch is still not working. Andy and Anton are working out the exact requirements for the switch. Two possible options: Andy replaces the Arista or buys a new one.
    • ThunderX2 - Racked. Lack of IP.
  • Action Items - Next Week
    • [Lijian] to continue to investigate make test failures.
    • [Andy] to work with Anton to resolve Arista problem.

12/04/2018

  • Attendees
    • Sirshak Das
    • Andy Wang
    • Juraj Linkeš
    • Khemendra
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Tina Tsou
  • VPP Hoststack
    • iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking, comparing kernel and VPP hoststack performance. Three cases: kernel to kernel; kernel to VPP hoststack; VPP hoststack to VPP hoststack
    • Ongoing perf analysis. Two patches in progress: one is upstreamed and the other is under internal review. Hotspots in memory copy or possibly elsewhere.
    • Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
    • Gorka is trying some optimal configs for VCL.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. Two scripts of L2 performance suites for CI management repository are done, investigating on for CSIT repository, and three more scripts to be developed.
    • [Lijian] VPP dlmalloc crash issue root-caused and fixed by maintainer. Florin Coras fixed time-out issues.
    • [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from garcia and damjan. - no progress so far. - To confirm with Jianling and Joyce.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy - Second priority, no update so far.
      • [Lijian] Patch for compiling issue with GCC-8.x is under internal review.
      • [Lijian] Patch for fixing StringTest failure is under internal review.
    • Memory Ordering
      • [Sirshak] To start work on Arithmetic and Logic relaxed functions.
  • CSIT
    • VPP Path
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx is in place, but there are errors. Will continue investigation.
      • thunderx2: Racked. Lack of IP. To confirm with Anton.
      • mcbin: Kernel issue yet to try suggestion from Garcia and Damjan. To confirm with Jianling and Joyce - Lijian
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • Development of the L2 test script is under way. Khem will get L2 working in CI first, then IP4 and other test cases.
  • FD.io lab
    • Arista switch is still not working. Andy and Anton are working out the exact requirements for the switch. Two possible options: Andy replaces the Arista or buys a new one.
    • ThunderX2 - Racked. Lack of IP.
  • Action Items - Next Week
    • [Lijian] to continue to investigate make test failures.
    • [Andy] to work with Anton to resolve Arista problem.


11/27/2018

  • Attendees
    • Juraj Linkeš
    • Khemendra
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Tina Tsou
  • VPP Hoststack
    • iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking, comparing kernel and VPP hoststack performance.
    • Ongoing perf analysis, two patches in progress. Hotspots in memory copy or possibly elsewhere. Will share patches with community. - Sirshak
    • Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
    • Gorka is trying some optimal configs for VCL.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • Alternate test cases.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs. - Need to prepare some scripts. First to understand how the script works and then add more options.
    • [Lijian] Status on VPP path failures. Status: Still debugging. Still at early study stage.
    • [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from garcia and damjan. - no progress so far.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy
    • Memory Ordering
      • [Sirshak] To start work on Arithmetic and Logic relaxed functions.
  • CSIT
    • VPP Path
      • 3 failures currently stalling deployment.
      • VPP-1476, VPP-1475, VPP-1478
      • These failures are seen on Debian x86 VM also.
      • Parallelization (n=32) is resulting in failures. These also seem to be caused by the two patches below.
      • VPP-1490, caused by https://gerrit.fd.io/r/#/c/15106/ and https://gerrit.fd.io/r/#/c/15534/.
      • VPP-1491 and VPP-1497, about parallelization and GBP failure, are filed.
      • Get CSIT/Aarch64 pass with partial test cases - Juraj
    • VPP Device
      • thunderx: Juraj created an LF ticket for wiring the 1-node topology on Cavium ThunderX.
      • thunderx2: to be racked by this Friday.
      • mcbin: Kernel issue yet to try suggestion from Garcia and Damjan.
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • L2 test is now working manually. Khem is trying to get it working in CI, and then IP4 and other test cases.
  • FD.io lab
    • Arista switch is missing cable. Andy will send tracking no. for cables.
    • ThunderX2 - to be racked by this Friday.
  • Action Items - Next Week
    • [Lijian] to investigate VPP-1490 issue.
    • [Andy] Andy will send tracking no. for cables.

11/20/2018

  • Attendees
    • Sirshak Das
    • Andy Wang
    • Juraj Linkeš
    • Khemendra
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Tina Tsou
  • VPP Hoststack
    • iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking, comparing kernel and VPP hoststack performance.
    • Ongoing perf analysis, two patches in progress. Hotspots in memory copy or possibly elsewhere. Will share patches with community. - Sirshak
    • Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
    • Gorka is trying some optimal configs for VCL.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • Alternate test cases.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only the L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs. Some scripts still need to be prepared: first understand how the script works, then add more options.
    • [Lijian] Status on VPP path failures. Status: Still debugging; at an early study stage.
    • [Sirshak] Kernel migration on mcbin. Status: Sirshak to try inputs from Garcia and Damjan. No progress so far.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy
    • Memory Ordering
      • [Sirshak] To start work on Arithmetic and Logic relaxed functions.
  • CSIT
    • VPP Path
      • 3 failures currently stalling deployment.
      • VPP-1476, VPP-1475, VPP-1478
      • These failures are seen on Debian x86 VM also.
      • Parallelization (n=32) is resulting in failures. This also seems to be caused by the two patches below.
      • VPP-1490, caused by https://gerrit.fd.io/r/#/c/15106/ and https://gerrit.fd.io/r/#/c/15534/.
      • VPP-1491 and VPP-1497, covering the parallelization and GBP failures, have been filed.
      • Get CSIT/AArch64 passing with partial test cases - Juraj
    • VPP Device
      • thunderx: Juraj created an LF ticket for wiring the 1-node topology on Cavium ThunderX.
      • thunderx2: to be racked by this Friday.
      • mcbin: Kernel issue; suggestions from Garcia and Damjan are yet to be tried.
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • L2 test is now working manually. Khem is trying to get it working in CI, followed by IP4 and other test cases.
  • FD.io lab
    • Arista switch is missing a cable. Andy will send the tracking number for the cables.
    • ThunderX2 - to be racked by this Friday.
  • Action Items - Next Week
    • [Lijian] to investigate VPP-1490 issue.
    • [Andy] to send the tracking number for the cables.


11/12/2018

  • Attendees
    • Sirshak Das
    • Andy Wang
    • Juraj Linkeš
    • Khemendra
    • Garcia
    • Gorka
  • VPP Hoststack
    • iperf3 performance with Hoststack: Sirshak has done some preliminary benchmarking, comparing kernel and VPP hoststack performance.
    • Perf analysis is ongoing, with two patches in progress. Hotspots are in memory copy or possibly elsewhere. - Sirshak
    • Sirshak is trying to set up a single CSIT environment for everyone debugging VPP hoststack. Will share setup info.
    • Gorka is trying some optimal configs for VCL.
    • Running VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • Alternate test cases.
    • Khem to get more information on benchmarking DMM. Khem to send the information to

Status Report Ligato/Contiv

Capture LandC.PNG