Difference between revisions of "VPP/AArch64"

(Meeting Minutes)
 
(386 intermediate revisions by 10 users not shown)
Line 4: Line 4:
 
=== Meeting Details ===
 
=== Meeting Details ===
  
* Regular AArch64 meeting: [https://zoom.us/my/fastdata Tuesdays at 06:00 PT (Pacific Time)] (weekly). [http://www.thetimezoneconverter.com/?t=06:00&tz=PT%20%28Pacific%20Time%29 Convert to your timezone.]
+
* Regular AArch64 meeting: 1st and 3rd Tuesdays of every month at 06:00 PT (Pacific Time) (biweekly). [http://www.thetimezoneconverter.com/?t=06:00&tz=PT%20%28Pacific%20Time%29 Convert to your timezone.]
** [https://zoom.us/my/fastdata FD.io Zoom Meeting room ]
+
** [https://zoom.us/my/fastdata?pwd=Z3Z0UnJyUmRIMlU3eTJLcGF6VEptQT09 FD.io Zoom Meeting room ]
  
 
=== IRC Channel ===
 
=== IRC Channel ===
  
 
'''<code>#fdio-arm</code>''' on <code>freenode.net</code>
 
'''<code>#fdio-arm</code>''' on <code>freenode.net</code>
 +
 +
=== Slack ===
 +
 +
Request invitation at https://slack.fd.io/
  
 
=== Jira ===
 
=== Jira ===
Line 18: Line 22:
  
 
* [https://schd.ws/hosted_files/fdiominisummitatkubeconeu20/aa/kubecon_fdio_brooks.pdf The path to Fast Data on Arm] [pdf] - FD.io Mini-Summit at KC+CNC EU 2018
 
* [https://schd.ws/hosted_files/fdiominisummitatkubeconeu20/aa/kubecon_fdio_brooks.pdf The path to Fast Data on Arm] [pdf] - FD.io Mini-Summit at KC+CNC EU 2018
 +
* [https://www.youtube.com/watch?v=T7za89oBZtw&t=79s Vector Packet Processing (VPP) Arm Story: Now and Beyond] [youtube] - FD.io Mini-summit at KC+CNC NA 2018
  
 
== Release Milestones ==
 
== Release Milestones ==
Line 41: Line 46:
 
* [https://jenkins.fd.io/computer/ '''CI build servers'''] integrated into Jenkins
 
* [https://jenkins.fd.io/computer/ '''CI build servers'''] integrated into Jenkins
  
* [https://wiki.fd.io/view/CSIT/fdio_csit_lab_ext_lld_draft '''CSIT test beds'''] (''under construction'')
+
* [https://github.com/FDio/csit/blob/master/docs/lab/testbed_specifications.md '''CSIT testbed specifications''']
  
 
{| class="wikitable"
 
{| class="wikitable"
Line 56: Line 61:
 
! Distro
 
! Distro
 
|-
 
|-
| [https://softiron.com/development-tools/overdrive-1000/ SoftIron OverDrive 1000] || CI build server || Running in CI || softiron-1 || 10.30.51.12 || N/A || 4 || 8GB || || openSUSE
+
| [https://www.marvell.com/server-processors/thunderx-arm-processors/ Marvell ThunderX] || VPP dev debug server|| Running || vpp-marvell-dev || 10.30.51.38 || 10.30.50.38 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 18.04.4
 +
|-
 +
| || CI build server|| Running in Nomad || s53-nomad || 10.30.51.39 || 10.30.50.39 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 18.04.4
 +
|-
 +
| || CI build server|| Running in Nomad || s54-nomad || 10.30.51.40 || 10.30.50.40 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 18.04.4
 +
|-
 +
| || CI build server || Running in Nomad || s52-nomad || 10.30.51.65 || 10.30.50.65 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
 +
|-
 +
| || CI build server || Running in Nomad || s51-nomad || 10.30.51.66 || 10.30.50.66 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
 
|-
 
|-
| || CI build server || Running in CI || softiron-2 || 10.30.51.13 || N/A || 4 || 8GB || || openSUSE
+
| || CI build server || Running in Nomad || s49-nomad || 10.30.51.67 || 10.30.50.67 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
 
|-
 
|-
| || CI build server || Running || softiron-3 || 10.30.51.14 || N/A || 4 || 8GB || || openSUSE
+
| || CI build server || Running in Nomad || s50-nomad || 10.30.51.68 || 10.30.50.68 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.4
 
|-
 
|-
| [https://cavium.com/product-thunderx-arm-processors.html Cavium ThunderX] || CI build server || Running in CI || nomad3arm || 10.30.51.38 || 10.30.50.38 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 16.04
+
| [https://www.marvell.com/server-processors/thunderx2-arm-processors/ Marvell ThunderX2] || Perf DUT candidate || Running || s27-t13-sut1 || 10.30.51.69 || 10.30.50.69 || 224 || 128GB || 3x40GbE QSFP+ XL710-QDA2 || Ubuntu 18.04.2
 
|-
 
|-
| || CI build server || Running in CI || nomad4arm || 10.30.51.39 || 10.30.50.39 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 16.04
+
| || VPP device server || Running in Nomad || s55-t36-sut1 || 10.30.51.70 || 10.30.50.70 || 256 || 256GB || 2x40GbE QSFP+ XL710-QDA2 / 2x10/25GE SFP+ ConnectX5 || Ubuntu 18.04.4
 
|-
 
|-
| || CI build server || Running in CI || nomad5arm || 10.30.51.40 || 10.30.50.40 || 96 || 128GB || 3x40GbE QSFP+ / 4x10GbE SFP+ || Ubuntu 16.04
+
| || VPP device server || Running in Nomad || s56-t37-sut1 || 10.30.51.71 || 10.30.50.71 || 256 || 256GB || 2x40GbE QSFP+ XL710-QDA2 / 2x10/25GE SFP+ ConnectX5 || Ubuntu 18.04.4
 
|-
 
|-
| || CI build server || Running || fdio-cavium4 || 10.30.51.65 || 10.30.50.65 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.1
+
| Huawei TaiShan 2280 || CSIT testbed || Running in CI || s17-t33-sut1 || 10.30.51.36 || 10.30.50.36 || 64 || 128GB || 2x10GbE SFP+ Intel X520-DA2 / 2x25GbE SFP28 Mellanox CX-4 || Ubuntu 18.04.1
 
|-
 
|-
| || VPP dev debug server || Running || fdio-cavium5 || 10.30.51.66 || 10.30.50.66 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 18.04.1
+
| || CSIT testbed || Running in CI || s18-t33-sut2 || 10.30.51.37 || 10.30.50.37 || 64 || 128GB || 2x10GbE SFP+ Intel X520-DA2 / 2x25GbE SFP28 Mellanox CX-4 || Ubuntu 18.04.1
 
|-
 
|-
| || CI build server || Running || fdio-cavium6 || 10.30.51.67 || 10.30.50.67 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 16.04.1
+
| [http://macchiatobin.net/ Marvell MACCHIATObin] || N/A || Decommissioned || s20-t34-sut1 || 10.30.51.41 || 10.30.51.49, then connect to /dev/ttyUSB0 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.4
 
|-
 
|-
| || VPP dev debug server || Running || fdio-cavium7 || 10.30.51.68 || 10.30.50.68 || 96 || 256GB || 2xQSFP+ / USB Ethernet || Ubuntu 16.04.1
+
| || N/A || Decommissioned || s21-t34-sut2 || 10.30.51.42 || 10.30.51.49, then connect to /dev/ttyUSB1 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.5
 
|-
 
|-
| Huawei TaiShan 2280 || CSIT testbed || Running || s15-t33-sut1 || 10.30.51.36 || 10.30.50.36 || 64 || 128GB || 2x10GbE SFP+ Intel 82599 / 2x25GbE SFP28 Mellanox CX-4 || Ubuntu 17.10
+
| || N/A || Decommissioned || fdio-mcbin3 || 10.30.51.43 || 10.30.51.49, then connect to /dev/ttyUSB2 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.5
 
|-
 
|-
| || CSIT testbed || Running || s16-t33-sut2 || 10.30.51.37 || 10.30.50.37 || 64 || 128GB || 2x10GbE SFP+ Intel 82599 / 2x25GbE SFP28 Mellanox CX-4 || Ubuntu 17.10
+
| || Power Cycler || Operational || || 10.30.50.80 || || || || ||
 
|-
 
|-
| [http://macchiatobin.net/ Marvell MACCHIATObin] || CSIT testbed || Running || s20-t34-sut1 || 10.30.51.41 || 10.30.51.49, then connect to /dev/ttyUSB0 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.4
+
| [https://softiron.com/development-tools/overdrive-1000/ SoftIron OverDrive 1000] || N/A || Decommissioned || softiron-1 || 10.30.51.12 || N/A || 4 || 8GB || || openSUSE
 
|-
 
|-
| || CSIT testbed || Running || s21-t34-sut2 || 10.30.51.42 || 10.30.51.49, then connect to /dev/ttyUSB1 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.5
+
| || N/A || Decommissioned || softiron-2 || 10.30.51.13 || N/A || 4 || 8GB || || openSUSE
 
|-
 
|-
| || VPP dev debug server || Running || fdio-mcbin3 || 10.30.51.43 || 10.30.51.49, then connect to /dev/ttyUSB2 || 4 || 16GB || 2x10GbE SFP+ || Ubuntu 16.04.5
+
| || N/A || Decommissioned || softiron-3 || 10.30.51.14 || N/A || 4 || 8GB || || openSUSE
 
|-
 
|-
 
|}
 
|}
  
Note: to get lab access, open a ticket at https://rt.linuxfoundation.org/
+
Note: to get lab access, create a GPG key, upload it to a keyserver, and have it signed by a trusted anchor in a video call (the fingerprint will be needed). An Arm authority (Tina) then needs to send an e-mail to helpdesk@fd.io with your name, e-mail, keygrip and key fingerprint.
  
 
== CI ==
 
== CI ==
Line 150: Line 163:
 
=== Recent Patches ===
 
=== Recent Patches ===
 
{| class="wikitable"
 
{| class="wikitable"
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/34716 misc: vppctl fix heap-buffer-overflow & memleaks] || Merged 12/14 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/34634 crypto-native: fix build error on Arm using clang-13] || Merged 12/14 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33306 snort: fix unused result warning for gcc-10] || Merged 11/06 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33307 l2: fix array-bounds error for prefetch on Arm] || Merged 11/07 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33422 ip6: fix IPv6 address calculation error using "ip route add" CLI] || Merged 10/21 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31694 ipsec: Performance improvement of ipsec4_output_node using flow cache] || Merged 10/13 || || Govindarajan Mohandoss
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33999 build: fix centos rpm build] || Merged 10/08 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/33324 vppinfra: fix potential memory access error in _pool_init_fixed] || Merged 10/05 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32885 svm: fix asan check failed @svm_map_region on arm ] || Merged 06/24 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32638 l2: fix vrrp prefix mac comparison ] || Merged 06/09 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32565 build: fix build error after make wipe ] || Merged 06/04 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32367 memif: fix input node buffer prefetch ] || Merged 05/21 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/32366 memif: fix gcc-10 build error on arm platform ] || Merged 05/21 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31972 papi: fix ubuntu 1804 make test socket.close error] || Merged 04/16 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31960 rdma: fix skip_ipv4_cksum behavior in scalar path] || Merged 04/15 || || Tianyu Li
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31985 vppinfra: correct intrinsic called by u16x16_from_u8x16] || Merged 04/15 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/31421 vppinfra: fix compiling error due to incompatible udphdr field names] || Merged 03/05 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/30458 avf: optimized with NEON SIMD instruction] || Merged 12/18 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/28252 ip: fix compiling error with gcc-10] || Merged 09/01 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/28044 build: Fix 'make install-deps' errors on aarch64 CentOS 7] || Merged 07/29 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/28034 acl: correct acl vat help message] || Merged 07/24 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/27417 build: add libssl-dev library for ubuntu 20.04] || Merged 06/04 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/26949 dpdk: fix compiling issue with clang] || Merged 05/08 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/26950 vppinfra: fix u32x4_byte_swap on Arm] || Merged 05/08 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/26804 build: support arch-specific compiling for Neoverse N1] || Merged 04/30 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/26023 dpdk: false link down issue with ixgbe NIC] || Merged 03/23 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/25896 vlib: fix error when creating avf interface on SMP system] || Merged 03/21 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/25906 vlib: leave SIGPROF signal with its default handler] || Merged 03/21 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/25259 build: add libssl-dev for ubuntu 16.04 and 18.04] || Merged 03/11 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/25195 vlib: fix code of getting numa node with specific cpu_id] || Merged 02/17 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/23083 docs: add physmem section in configuration parameters] || Merged 12/19 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/23082 vlib: add max-size configuration parameter for pmalloc] || Merged 12/18 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/23075 crypto: not use vec api with opt_data[VNET_CRYPTO_N_OP_IDS]] || Merged 11/13 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/23084 acl: add missing square brackets to vat_help option in acl api] || Merged 10/31 || || Jieqiang Wang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21968 dpdk: apply dual loop unrolling in DPDK TX] || Merged 09/12 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21969 ip: apply dual loop unrolling in ip4_rewrite] || Merged 09/12 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21970 ip: apply dual loop unrolling in ip4_input] || Merged 09/12 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21940 build: fix running error with vmxnet3_test_plugin.so] || Merged 09/11 || || Jianlin Lv
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21741 build: fix unsupported CMake comparison operation] || Merged 09/05 || || Jianlin Lv
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/21469 tap: fix tap interface not working on Arm issue] || Merged 09/04 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20379 build: fix vpp compilation failure on ThunderX2 and Amp] || Merged 08/19 || || Jianlin Lv
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/18564 vppinfra: Update "show cpu" output for AArch64 chips] || Merged 08/19 || || Nitin Saxena
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20861 vppinfra: refactor test_and_set spinlocks to use clib_spinlock_t] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20862 vppinfra: added performance test for clib_rwlock_t (test_rwlock.c)] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20863 vppinfra: refactor clib_rwlock_t to use single condition variable] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20860 vppinfra: refactor clib_spinlock_t to use compare and swap] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20859 vppinfra: added lock performance test for clib_spinlock_t (test_spinlock.c)] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20857 vppinfra: refactor use of CLIB_MEMORY_BARRIER ()] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/c/vpp/+/20856 vppinfra: conformed spinlocks to use CLIB_PAUSE] || Merged 08/02 || || Jason Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/20272/ vppinfra: add u64x2_scatter/u32x4_scatter] || Merged 06/21 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/20271/ vppinfra: add u64x2_gather/u32x4_gather] || Merged 06/21 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/20064/ fix compiling error with marvell pp2 plugin] || Merged 06/11 || || Jianlin Lv
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/19930/ Switch atomic release API from __sync to __atomic builtin] || Merged 06/05 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/19929/ Switch atomic test and set API from __sync to __atomic builtin] || Merged 06/05 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/18278/ Build packages for generic Arm architecture] || Merged 05/15 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/19135/ Enable NEON instructions in memcpy_le] || Merged 05/01 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/18223/ svm_fifo rework to avoid contention on cursize] || Merged 04/17 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/18405/ Re-enable aarch64 neon instruction in vlib_buffer_free_inline] || Merged 03/20 || || Lijian Zhang
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/18077/ sctp chunk_len fix] || Merged 03/06 || || Sirshak Das
 +
|-
 +
| [https://gerrit.fd.io/r/#/c/16184/ Use acquire/release ordering when accessing svm_fifo shared variable cursize] || Merged 11/29 || || Sirshak Das
 +
|-
 
| [https://gerrit.fd.io/r/#/c/15756/ Optimize xxx_zero_byte_mask NEON function.] || Merged 11/07 || || Lijian Zhang
 
| [https://gerrit.fd.io/r/#/c/15756/ Optimize xxx_zero_byte_mask NEON function.] || Merged 11/07 || || Lijian Zhang
 
|-
 
|-
Line 293: Line 427:
 
|}
 
|}
  
=== Meeting Minutes ===
+
== Meeting Minutes ==
 +
'''11/21/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Niyaz Murshed
 +
** Jieqiang Wang
  
'''11/20/2018'''
+
* CSIT
 +
** Status
 +
*** Dave Wallace helps monitor the AArch64 CI/CD status, which looks fine
 +
*** Replace old ThunderX2 with Ampere Altra; budget got approved, still in progress
 +
**** Sync with CSIT folks in the call when possible -- Juraj
 +
*** Maciek asked about the availability of N2-based hardware
 +
**** Plans to ship N2-based servers (Nvidia Grace (V2) / Ampere One (in-house design by Ampere)) to FD.io lab next year
 +
**** Timeline TBD
 +
*** IPSec test cases
 +
**** Patch already merged
 +
**** QAT cards in Austin labs, plan to ship them to FD.io lab
 +
*** RDMA test cases
 +
**** MLX DPDK test cases are enabled on AArch64; RDMA test cases are not
 +
 
 +
* VPP
 +
** Detailed planning for VPP projects in the next call
 +
** Refactor OpenSSL usage in VPP IPsec -- Lijian
 +
*** Move key generation and initialization steps out of data plane to control plane, see performance boost
 +
** Investigate make test framework in VPP -- Lijian
 +
*** Patch broke wireguard test cases so need to figure out the work flow
 +
** VPP ramp-up -- Niyaz
 +
*** Investigate VPP graph node mechanism and how to add nodes to the group
 +
** IPSec scalability tests -- Jieqiang
 +
*** Try to figure out dpdk-rss-flows.py and how to generate balanced rss flows for IPSec tests
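The balanced RSS flow item above comes down to choosing flow tuples whose RSS hash spreads evenly over the receive queues. The following is a minimal sketch of that idea, not a description of how dpdk-rss-flows.py works. It assumes the NIC applies the standard Toeplitz hash to the IPv4 source address, destination address, source port and destination port, and that the queue is simply the hash modulo the queue count (real NICs index a redirection table, and IPsec traffic may be hashed on other fields such as the SPI). The key, the addresses, and the helper names <code>toeplitz_hash</code>, <code>flow_queue</code> and <code>balanced_ports</code> are hypothetical.

```python
import ipaddress
import struct


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Toeplitz hash: XOR the 32-bit key window that starts at every set input bit."""
    key_bits = len(key) * 8
    if len(data) * 8 > key_bits - 32:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i, byte in enumerate(data):
        for b in range(8):
            if byte & (0x80 >> b):
                # Window of 32 key bits starting at input bit position i*8 + b.
                shift = key_bits - 32 - (i * 8 + b)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def flow_queue(key, src_ip, dst_ip, src_port, dst_port, n_queues):
    """Queue for one flow, assuming queue = hash % n_queues (a simplification)."""
    data = (
        ipaddress.IPv4Address(src_ip).packed
        + ipaddress.IPv4Address(dst_ip).packed
        + struct.pack("!HH", src_port, dst_port)
    )
    return toeplitz_hash(key, data) % n_queues


def balanced_ports(key, src_ip, dst_ip, dst_port, n_queues, per_queue):
    """Scan source ports until every queue has exactly per_queue flows."""
    chosen = {q: [] for q in range(n_queues)}
    for src_port in range(1024, 65536):
        q = flow_queue(key, src_ip, dst_ip, src_port, dst_port, n_queues)
        if len(chosen[q]) < per_queue:
            chosen[q].append(src_port)
        if all(len(ports) == per_queue for ports in chosen.values()):
            return chosen
    raise RuntimeError("not enough source ports to balance the queues")


# Placeholder 40-byte key; a real test would use the key programmed into the NIC.
EXAMPLE_KEY = bytes(range(1, 41))
```

Calling <code>balanced_ports(EXAMPLE_KEY, "10.0.0.1", "10.0.0.2", 4500, 4, 2)</code> returns a dictionary that maps each of the four queues to two source ports, which could then seed a traffic profile.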
 +
 
 +
'''07/18/2023'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Jieqiang Wang
** Andy Wang
+
** Tianyu Li
** Juraj Linkeš
+
** Juraj Linkes
** Khemendra
+
* CSIT
** Garcia
+
** Timeout issue happens periodically on Taishan server, even in release testing.
** Manuel
+
*** Setting CPU affinity only after VMs boot up fully.
** Gorka
+
**** https://gerrit.fd.io/r/c/csit/+/38550
** Fede
+
*** Another issue may be related to Taishan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
*** Increasing the timeout bypasses the issue and has no effect on VPP VM perf
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
*** QAT cards are planned to be shipped
 +
*** need to pay attention to the execution time for IPSec release testing
 +
*** Need to investigate further on performance degradation issue
 +
** Verify job, merge job, device testing, and release testing are fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
*** ConnectX6 NIC info will be updated in doc first
 +
*** Decide which test cases to run on the testbeds (time consideration/iterative test/coverage test)
 +
** MRR failed cases
 +
*** Probably due to latest DPDK upgrade, not an arm-specific issue.
 +
** New test cases list on 3n-alt
 +
*** NAT tests cannot be added because they are running on 2-node testbed only
 +
*** enable IPSec flow cache(arm)/IPSec SPD fast path feature
 +
** Release testing
 +
*** 23.06 release testing is done
 +
*** New CSIT page https://csit.fd.io/
 +
** Plan to replace TX2 with Altra as VPP device testing testbed
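The CPU-affinity item above ("only after VMs boot up fully") can be pictured as a wait-then-pin step. The following is a minimal sketch under stated assumptions, not the code from the linked CSIT change: <code>pin_when_ready</code> and its <code>is_ready</code> probe are hypothetical names, and the default pinning call <code>os.sched_setaffinity</code> is available on Linux only.

```python
import os
import time


def pin_when_ready(pid, cpus, is_ready, set_affinity=None, timeout=60.0, poll=0.5):
    """Wait until is_ready() returns True, then pin pid to the given CPUs.

    Returns True if the affinity was applied, False if the timeout expired first.
    set_affinity is injectable so the logic can be exercised without a real VM.
    """
    if set_affinity is None:
        set_affinity = os.sched_setaffinity  # Linux-only system call wrapper
    deadline = time.monotonic() + timeout
    while not is_ready():
        if time.monotonic() >= deadline:
            return False  # VM never became ready; leave affinity untouched
        time.sleep(poll)
    set_affinity(pid, set(cpus))
    return True
```

In a real testbed, <code>is_ready</code> might check that the guest answers on its management channel; that probe is left to the caller here because the minutes do not say how readiness is detected.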
 +
 
 +
'''06/20/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj Linkes
 +
* CSIT
 +
** Timeout issue happens periodically on Taishan server, even in release testing.
 +
*** Setting CPU affinity only after VMs boot up fully.
 +
**** https://gerrit.fd.io/r/c/csit/+/38550
 +
*** Another issue may be related to Taishan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
*** Increasing the timeout bypasses the issue and has no effect on VPP VM perf
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
*** QAT cards are planned to be shipped
 +
*** need to pay attention to the execution time for IPSec release testing
 +
*** Need to investigate further on performance degradation issue
 +
** Verify job, merge job, device testing, and release testing are fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
*** ConnectX6 NIC info will be updated in doc first
 +
*** Decide which test cases to run on the testbeds (time consideration/iterative test/coverage test)
 +
** MRR failed cases
 +
*** Probably due to latest DPDK upgrade, not an arm-specific issue.
 +
** New test cases list on 3n-alt
 +
*** NAT tests cannot be added because they are running on 2-node testbed only
 +
*** enable IPSec flow cache(arm)/IPSec SPD fast path feature
 +
 
 +
'''05/16/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
* CSIT
 +
** Timeout issue happens periodically on Taishan server, even in release testing.
 +
*** Setting CPU affinity only after VMs boot up fully.
 +
**** https://gerrit.fd.io/r/c/csit/+/38550
 +
*** Another issue may be related to Taishan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
*** Increasing the timeout bypasses the issue and has no effect on VPP VM perf
 +
** The link issue in DPDK testpmd test cases on Ampere Altra is still there.
 +
*** Try switching cables while upgrading NIC firmware and drivers
 +
*** Try to reproduce the tests after the NIC firmware upgrade
 +
*** Try different port pairs of the same two NICs
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
*** QAT cards are planned to be shipped
 +
*** need to pay attention to the execution time for IPSec release testing
 +
*** Need to investigate further on performance degradation issue
 +
** Verify job, merge job, device testing, and release testing are fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
*** ConnectX6 NIC info will be updated in doc first
 +
*** Decide which test cases to run on the testbeds (time consideration/iterative test/coverage test)
 +
** MRR failed cases
 +
*** Probably due to latest DPDK upgrade, not an arm-specific issue.
 +
* VPP
 +
'''04/18/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
* CSIT
 +
** Timeout issue happens periodically on Taishan server, even in release testing.
 +
*** Setting CPU affinity only after VMs boot up fully.
 +
**** https://gerrit.fd.io/r/c/csit/+/38550
 +
*** Another issue may be related to Taishan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
** The link issue in DPDK testpmd test cases on Ampere Altra is still there.
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
*** QAT cards are planned to be shipped
 +
*** need to pay attention to the execution time for IPSec release testing
 +
*** Need to investigate further on performance degradation issue
 +
** Verify job, merge job, device testing, and release testing are fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
*** ConnectX6 NIC info will be updated in doc first
 +
* VPP
 +
 
 +
'''04/04/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
* CSIT
 +
** Timeout issue happens periodically on Taishan server, even in release testing.
 +
*** Setting CPU affinity only after VMs boot up fully.
 +
**** https://gerrit.fd.io/r/c/csit/+/38550
 +
*** Another issue may be related to Taishan NUMA topology
 +
**** https://gerrit.fd.io/r/c/csit/+/35772
 +
** The link issue in DPDK testpmd test cases on Ampere Altra is still there.
 +
** IPSec & VxLAN performance drop issue on Ampere Altra
 +
***
 +
** Verify job, merge job, device testing, and release testing are fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported?
 +
*** Will have a debug meeting with RDMA maintainers on the issues.
 +
* VPP
 +
 
 +
'''03/07/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
* CSIT
 +
** Timeout issue happens periodically on Taishan server, even in release testing.
 +
** The link issue in DPDK testpmd test cases on Ampere Altra is still there.
 +
** Verify job, merge job, device testing, and release testing are fine so far.
 +
** RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported?
 +
* VPP
 +
 
 +
'''2/21/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
 +
****** Not reproduced on a local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Confirm with Vexxhost people if replacing intel NICs is feasible
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
****** Will talk to dpdk i40e maintainer to seek their help
 +
******* DPDK port/link status broken - l3fwd has the same issue
 +
******* Sent a detailed email to the i40e maintainer on the dpdk-dev mailing list; waiting for response
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
****** Compiler version change seems to be one of the factors for perf degradation
 +
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on 3-tsh testbed
 +
****** Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
 +
****** Configuring isolcpus via kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
 +
******* isolcpus seems to be working fine
 +
******* still need to root cause the timeout issue- sometimes slower
 +
******* run dpdk build, just use the non-isolated cores for build
 +
******* both VM and VPP start slower than before
 +
******* VPP loading plugins and timeout happens
 +
******* Is VPP crashing? - no crash
 +
******* Is the VM bound with isolated core? - need to check
 +
******* Will set up a live debug session for Tianyu and Juraj
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Enable VPP device testing per patch
 +
**** Voting right for VPP device testing on Arm is enabled
 +
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
** MLX NICs Planning
 +
*** CX6 and CX7 - CX7 is hard to get on market - MLX Nics will be used and reported
 +
*** CX6 vpp native rdma driver has issues, dpdk mlx driver is fine.
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corruption - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
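The automated RFC 2544 no-drop-rate (NDR) item above is, at its core, a search for the highest offered rate at which a trial loses no packets. The following is a minimal sketch of one common way to run that search (plain bisection), assuming loss is monotonic in offered rate. The <code>run_trial</code> callback stands in for whatever drives the traffic generator; it is hypothetical and does not represent an Ixia API.

```python
from typing import Callable


def find_ndr(run_trial: Callable[[float], int], line_rate: float, precision: float) -> float:
    """Bisect for the highest rate with zero packet loss.

    run_trial(rate) must return the number of packets lost at that offered rate.
    The result is a rate that passed, within `precision` of the first failing rate.
    """
    if precision <= 0:
        raise ValueError("precision must be positive")
    if run_trial(line_rate) == 0:
        return line_rate  # device sustains line rate; nothing to search
    lo, hi = 0.0, line_rate  # invariant: lo passes (trivially at 0), hi fails
    while hi - lo > precision:
        mid = (lo + hi) / 2.0
        if run_trial(mid) == 0:
            lo = mid
        else:
            hi = mid
    return lo
```

For example, with a stand-in device that starts dropping above 7.3 Gbps, <code>find_ndr(lambda r: 0 if r <= 7.3 else 100, 10.0, 0.01)</code> converges to a rate just below 7.3. Real trials are noisy, so a production search would normally repeat trials or add a tolerance (as PDR does).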
 +
 
 +
 
 +
'''2/7/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Juraj
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** DPDK testpmd XL710 interface not up failure(2 ampere back to back) - VPP and other apps, the same interface works fine.
 +
****** Not reproduced on a local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Confirm with Vexxhost people if replacing intel NICs is feasible
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
****** Will talk to dpdk i40e maintainer to seek their help
 +
******* DPDK port/link status broken - l3fwd has the same issue
 +
******* Sent a detailed email to the i40e maintainer on the dpdk-dev mailing list; waiting for response
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
****** Compiler version change seems to be one of the factors for perf degradation
 +
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on 3-tsh testbed
 +
****** Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
 +
****** Configuring isolcpus via kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
 +
******* isolcpus seems to be working fine
 +
******* Still need to root-cause the timeout issue - sometimes slower
 +
******* run dpdk build, just use the non-isolated cores for build
 +
******* both VM and VPP start slower than before
 +
******* VPP loading plugins and timeout happens
 +
******* Is VPP crashing? - not crash
 +
******* Is the VM bound with isolated core? - need to check
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Enable VPP device testing per patch
 +
**** Voting right for VPP device testing on Arm is enabled
 +
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
** MLX NICs Planning
 +
*** CX6 and CX7 - CX7 is hard to get on the market - MLX NICs will be used and reported
 +
*** CX6 vpp native rdma driver has issues, dpdk mlx driver is fine.
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern help to benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''1/17/2023'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 
** Tina Tsou
 
* VPP Hoststack
+
 
** iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking and compared kernel and VPP hoststack performance.
+
* Miscellaneous
** Ongoing perf analysis, two patches in progress. Hotspots are in memory copy or possibly elsewhere. Will share patches with community. - Sirshak
+
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
** Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
+
 
** Gorka is trying some optimal configs for VCL.
+
* CSIT
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
+
** VPP Performance Test
** Alternate test cases.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
**  
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
* Action Items - Last Week
+
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
***** DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
** [Khem] Deployment of only L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs. - Need to prepare some scripts. First to understand how the script works and then add more options.
+
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
** [Lijian] Status on VPP path failures. Status: Still debugging. Still at early study stage.
+
****** Confirm with Vexxhost people if replacing intel NICs is feasible
** [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from Garcia and Damjan. - No progress so far.
+
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
****** Will talk to dpdk i40e maintainer to seek their help
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
****** Compiler version change seems to be one of the factors for perf degradation
 +
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on 3-tsh testbed
 +
****** Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
 +
****** Configuring isolcpus via kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Enable VPP device testing per patch
 +
**** Voting right for VPP device testing on Arm is enabled
 +
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 
* VPP
 
** Vectorization
+
** VPP SVE implementation - Lijian
*** [Lijian] working on vectorized memory copy
+
*** SVE validation on FPGA platform - Confluence page ready
** Memory Ordering
+
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
*** [Sirshak] To start work on Arithmetic and Logic relaxed functions.
+
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern help to benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''12/20/2022'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
 +
 
 
* CSIT
 
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Confirm with Vexxhost people if replacing intel NICs is feasible
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
****** Will talk to dpdk i40e maintainer to seek their help
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
****** Compiler version change seems to be one of the factors for perf degradation
 +
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on 3-tsh testbed
 +
****** Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
 +
****** Configuring isolcpus via kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
 +
 
** VPP Path
 
*** 3 failures currently stalling deployment.
+
*** Voting and working fine.
*** VPP-1476, VPP-1475, VPP-1478
+
*** CentOS-8 jobs have been removed.
*** These failures are seen on Debian x86 VM also.
+
 
*** Parallelization (n=32) is resulting in failures. This also seems to be caused by the two patches below.
+
*** VPP-1490, caused by https://gerrit.fd.io/r/#/c/15106/ and https://gerrit.fd.io/r/#/c/15534/.
+
*** VPP-1491 and VPP-1497, about parallelization and GBP failures, have been filed.
+
*** Get CSIT/Aarch64 pass with partial test cases - Juraj
+
 
** VPP Device
 
*** thunderx: Juraj created a LF tkt for wiring the 1-node topology on cavium thunderx.
+
*** Enable VPP device testing per patch
*** thunderx2: to be racked by this Friday.
+
**** Voting right for VPP device testing on Arm is enabled
*** mcbin: Kernel issue yet to try suggestion from Garcia and Damjan.
+
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern help to benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''12/06/2022'''
 +
* Attendees
 +
** Juraj Linkes
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
 
 +
* Miscellaneous
 +
** Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
 +
 
 +
* CSIT
 
** VPP Performance Test
 
*** Work is ongoing on writing scripts for Performance Jobs.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
*** L2 test now works manually. Khem is trying to get it to work in CI, and then IP4 and other test cases.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
* FD.io lab
+
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
** Arista switch is missing cable. Andy will send tracking no. for cables.
+
***** DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
** ThunderX2 - to be racked by this Friday.
+
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
* Action Items - Next Week
+
****** Confirm with Vexxhost people if replacing intel NICs is feasible
** [Lijian] to investigate VPP-1490 issue.
+
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
** [Andy] Andy will send tracking no. for cables.
+
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
****** Will talk to dpdk i40e maintainer to seek their help
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
****** Compiler version change seems to be one of the factors for perf degradation
 +
******* Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
 +
******* New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
 +
***** VM testcase timeout issue on 3-tsh testbed
 +
****** Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
 +
****** Configuring isolcpus via kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
  
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
  
'''11/12/2018'''
+
** VPP Device
 +
*** Enable VPP device testing per patch
 +
**** Voting right for VPP device testing on Arm is enabled
 +
**** VPP device testing on Arm runs per VPP/CSIT patch
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern help to benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''11/15/2022'''
 
* Attendees
 
** Sirshak Das
+
** Juraj Linkes
** Andy Wang
+
** Lijian Zhang
** Juraj Linkeš
+
** Jieqiang Wang
** Khemendra
+
** Tianyu Li
** Garcia
+
 
** Gorka
+
* Miscellaneous
* VPP Hoststack
+
** Reschedule the meeting to 9 am for Juraj and 3 pm for Shanghai folks
** iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking and compared kernel and VPP hoststack performance.
+
 
** Ongoing perf analysis, two patches in progress. Hotspots are in memory copy or possibly elsewhere. - Sirshak
+
* CSIT
** Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
+
** VPP Performance Test
** Gorka is trying some optimal configs for VCL.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** Alternate test cases.
+
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
** Khem to get more information on benchmarking DMM and send it to the community if there is more.
+
***** CSIT perf numbers VS local perf numbers
* Action Items - Last Week
+
****** VPP cloud image in CSIT VS native built VPP in local env
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
****** One DPDK patch introduced perf degradation on Arm platform
** [Khem] Deployment of only L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs. - Need to prepare some scripts. First to understand how the script works and then add more options.
+
****** Configuration difference between CSIT env and local env(Hugepage size, startup.conf parameters and etc)
** [Lijian] Status on VPP path failures. Status: Still debugging. Still at early study stage.
+
***** DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
** [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from Garcia and Damjan. - No progress so far.
+
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
** [Andy] has sent out on Nov 12th. Juraj has sent the info to LF.
+
****** Confirm with Vexxhost people if replacing intel NICs is feasible
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Need to investigate 22.10 release testing result
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
 
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
**** Good news, No more slow down after 200 rounds of testing.
 +
***** Thunderx2 servers need to upgrade from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
 +
***** Distro upgrade to ubuntu 22.04 is still ongoing - no ETA yet
 +
****** Server configuration will remain the same, already integrated in ansible playbook
 +
***** Re-enable voting if there are no more issues with 22.04 device testing
 +
****** Submit a patch to enable voting right after meeting
 +
*** Test meltdown/spectre vulnerabilities
 +
**** CSIT maintainers asked whether tools exist to test for these vulnerabilities on Arm platforms (not limited to Arm)
 +
**** Will confirm this issue with support team - Lijian
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** VM cases failed only on 3n-alt performance testbed; the error log reports some missing files, likely a configuration issue
 +
**** Another intermittent VM failure happens on tx2 and alt; need to figure out the above case first
 +
**** Rebooted the server to recover; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to iavf driver, AVF interface - VPP native driver has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs when mounting /dev/vfio
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.  
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 
* VPP
 
** Vectorization
+
** VPP SVE implementation - Lijian
*** [Lijian] Zero byte mask NEON implementation. - Merged.
+
*** SVE validation on FPGA platform - Confluence page ready
*** [Lijian] ip4 lookup buffer index to buffer pointer optimizations. - Both micro and macro prove it not worth implementing vectorization here.
+
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
*** [Lijian] working on vectorized memory copy - Khem to send the vectorized memory copy done previously.
+
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
** Memory Ordering
+
**** Investigate SVE vs NEON packet checksum comparison
*** [Sirshak] To start work on Arithmetic and Logic relaxed functions.
+
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern help to benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
 
 +
'''10/18/2022'''
 +
* Attendees
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
* Miscellaneous
 +
** Reschedule the meeting to 9 am for Juraj and 3 pm for Shanghai folks
 
* CSIT
 
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration difference between CSIT env and local env(Hugepage size, startup.conf parameters and etc)
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR/PDR data difference - deep dive needed, waiting ampere folks engagement
 +
***** DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
 +
****** Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Tried to wait more time, interface still not up, restart is not enough either - Need to figure out reliable workaround.
 +
****** Replace XL710 NIC? - try asking tomorrow.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware. - check local xl710 firmware version
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** NUMA issue
 +
***** Will run performance report on Arm testbed once the patch to resolve the NUMA issue is merged
 +
***** Dave will help merge the patch into the corresponding branches
 +
 +
 
** VPP Path
 
*** 3 failures currently stalling deployment.
+
*** Voting and working fine.
*** VPP-1476, VPP-1475, VPP-1478
+
*** CentOS-8 jobs have been removed.
*** These failures are seen on Debian x86 VM also.
+
*** Parallelization (n=32) is resulting in failures. This also seems to be caused by the two patches below.
+
*** VPP-1490, caused by https://gerrit.fd.io/r/#/c/15106/ and https://gerrit.fd.io/r/#/c/15534/.
+
 
** VPP Device
 
*** thunderx: Juraj created a LF tkt for wiring the 1-node topology on cavium thunderx.
+
*** Device Testing on ThunderX2 servers
*** thunderx2: no updates. Anton is figuring out where/how to put these servers.
+
**** Juraj will commit the patch to disable the failing test cases
*** mcbin: Kernel issue yet to try suggestion from Garcia and Damjan.
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
**** Good news, No more slow down after 200 rounds of testing.
 +
***** ThunderX2 servers need to be upgraded from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
 +
***** Suggest rerunning tests after the upgrade to 22.04
 +
***** Re-enable voting once there are no more issues with 22.04 device testing
 +
 
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
 +
**** Rebooted the server to recover; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns not allowed.
 +
***** Not related to iavf driver; AVF interface - VPP native driver - has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
 +
******* Patch has been merged
 +
******* Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
 +
******* Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstalled the tx2 server with an older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration aligned with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and then enable voting
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comments addressed, will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when the per-patch job is enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT cards can be seen with new kernel update
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automating RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact NDR/PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and awaiting review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% line-rate IPsec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corruption - may be related to memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''9/20/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
 
 +
* CSIT
 
** VPP Performance Test
 
*** Work is ongoing on writing scripts for Performance Jobs.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
*** L2 test is now working manually. Khem is trying to get it to work in CI, and then IP4 and other test cases.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
* FD.io lab
+
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
** Arista switch and power supply. Setup was sent out and tracking no. info was sent to LF.
+
***** CSIT perf numbers VS local perf numbers
** ThunderX2 - No updates.
+
****** VPP cloud image in CSIT VS native built VPP in local env
* Action Items - Next Week
+
****** One DPDK patch introduced perf degradation on Arm platform
** [Lijian] to investigate VPP-1490 issue.
+
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Juraj will write down the procedures for setting up the Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR/PDR data difference - deep dive needed, waiting for engagement from Ampere folks
 +
***** DPDK testpmd fails to bring up the XL710 interface (2 Ampere back to back) - with VPP and other apps, the same interface works fine.
 +
****** Not reproduced on local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Tried waiting longer, interface still not up; restart is not enough either - need to figure out a reliable workaround.
 +
****** Replace XL710 NIC? - try asking tomorrow.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try an older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware - check local XL710 firmware version
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
**** Good news, No more slow down after 200 rounds of testing.
 +
***** ThunderX2 servers need to be upgraded from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
 +
***** Suggest rerunning tests after the upgrade to 22.04
 +
***** Re-enable voting once there are no more issues with 22.04 device testing
 +
 
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
 +
**** Rebooted the server to recover; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns not allowed.
 +
***** Not related to iavf driver; AVF interface - VPP native driver - has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
 +
******* Patch has been merged
 +
******* Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
 +
******* Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstalled the tx2 server with an older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration aligned with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and then enable voting
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comments addressed, will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when the per-patch job is enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
**** QAT-enabled kernel patch release expected around October; kernel upgrade required.
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
** VPP build servers - 2 new Ampere Altra servers; whether to decommission the 2 old ThunderX1 servers - need to confirm
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automating RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact NDR/PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and awaiting review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% line-rate IPsec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corruption - may be related to memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
  
  
'''11/06/2018'''
+
'''9/6/2022'''
 
* Attendees
 
** Sirshak Das
+
** Govindarajan Mohandoss
** Honnappa
+
** Juraj Linkes
 +
** Tianyu Li
 +
** Lijian Zhang
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Juraj will write down the procedures for setting up the Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR/PDR data difference - deep dive needed, waiting for engagement from Ampere folks
 +
***** DPDK testpmd fails to bring up the XL710 interface (2 Ampere back to back) - with VPP and other apps, the same interface works fine.
 +
****** Not reproduced on local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
 +
****** Tried waiting longer, interface still not up; restart is not enough either - need to figure out a reliable workaround.
 +
****** Replace XL710 NIC? - try asking tomorrow.
 +
****** Tried old version of DPDK - 21.08 does not work. May need to try an older version.
 +
****** Is it related to the NIC's speed, duplex and auto-negotiation configuration?
 +
****** May try to upgrade the NIC's firmware - check local XL710 firmware version
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
**** Good news, No more slow down after 200 rounds of testing.
 +
***** ThunderX2 servers need to be upgraded from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
 +
***** Suggest rerunning tests after the upgrade to 22.04
 +
***** Re-enable voting once there are no more issues with 22.04 device testing
 +
 
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
 +
**** Rebooted the server to recover; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns not allowed.
 +
***** Not related to iavf driver; AVF interface - VPP native driver - has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
 +
******* Patch has been merged
 +
******* Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
 +
******* Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstalled the tx2 server with an older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration aligned with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and then enable voting
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comments addressed, will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when the per-patch job is enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
**** QAT-enabled kernel patch release expected around October; kernel upgrade required.
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
** VPP build servers - 2 new Ampere Altra servers; whether to decommission the 2 old ThunderX1 servers - need to confirm
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automating RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact NDR/PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and awaiting review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% line-rate IPsec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corruption - may be related to memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''8/16/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Masksym Vynnvk
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Lijian Zhang
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Juraj will write down the procedures for setting up the Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR/PDR data difference - deep dive needed
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
 
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
 +
**** Rebooted the server to recover; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns not allowed.
 +
***** Not related to iavf driver; AVF interface - VPP native driver - has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
 +
******* Patch has been merged
 +
******* Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
 +
******* Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstalled the tx2 server with an older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration aligned with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and then enable voting
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comments addressed, will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when the per-patch job is enabled
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
** VPP build servers - 2 new Ampere Altra servers; whether to decommission the 2 old ThunderX1 servers - need to confirm
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automating RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact NDR/PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Picchi
 +
***** Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and awaiting review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% line-rate IPsec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corruption - may be related to memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''8/2/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Masksym Vynnvk
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures for setting up the Ampere Altra setup in FD.io lab
 +
****** And the procedures for developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
***** NDR PDR data difference - deep dive needed, MRR is
 +
******
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting for release testing to be done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli) - tx2 node
 +
**** Rebooted the server to recover; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns not allowed.
 +
***** Not related to iavf driver; AVF interface - VPP native driver - has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
 +
******* Patch has been merged
 +
******* Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx in use
 +
******* Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstalled the tx2 server with an older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration aligned with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and then enable voting
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comments addressed, will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when the per-patch job is enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
** VPP build servers - 2 new Ampere Altra servers; whether to decommission the 2 old ThunderX1 servers - need to confirm
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
****** IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
 +
 
 +
'''7/19/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration difference between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures of developing/developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
 
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern help to benchmark VPP using 100G MLX NIC
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
 
 +
 
 +
'''7/5/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration difference between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures of developing/developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
***** 22.06 release testing will happen soon
 +
 
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
 +
*** Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
 +
** Investigate One Terabit throughput test on Arm platform
 +
*** Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
 +
*** Kernel cmdline may impact on NDR PDR results - Jieqiang
 +
*** Intern help to benchmark VPP on N1 platforms
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Tested perfmon patch - Jieqiang
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
******* VPP rdma native driver only - VPP metadata corrupted - may be related to memory barrier
 +
****** QAT single core test done - investigate multiple core QAT case
 +
 
 +
'''6/21/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 
** Tina Tsou
 
** Tina Tsou
** Andy Wang
+
 
** Khemendra
+
* CSIT
** Garcia
+
** VPP Performance Test
** Manuel
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Fede
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
* VPP Hoststack
+
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
** iperf3 performance with Hoststack.
+
***** CSIT perf numbers VS local perf numbers
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
+
****** VPP cloud image in CSIT VS native built VPP in local env
** Alternate test cases.
+
****** One DPDK patch introduced perf degradation on Arm platform
** khem to get more information on benchmarking DMM.  
+
****** Configuration difference between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
* Action Items - Last Week
+
***** Check whether customer support can help with the PEX installation issue - Juraj
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
** [Khem] Deployment of only L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs.
+
****** And the procedures of developing/developing test cases in CSIT (performance & device testing)
** [Lijian] Status on VPP path failures. Status: Still debugging.
+
****** Juraj should have already sent to Jieqiang previously.
** [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from garcia and damjan.
+
***** 22.06 release testing will happen soon
** [Andy] to send tracking no for Arista power supply. Status: PO is still being worked on internally.
+
 
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.  
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 
* VPP
 
* VPP
** Vectorization
+
** VPP SVE implementation - Lijian
*** [Lijian] msb patch. - merged
+
*** SVE validation on FPGA platform - Confluence page ready
*** [Lijian] Zero byte mask NEON implementation. - internal review completed, to be upstreamed soon.
+
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
*** [Lijian] ip4 lookup buffer index to buffer pointer optimizations. - internal investigations ongoing.  
+
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
** Memory Ordering
+
**** Investigate SVE vs NEON packet checksum comparison
*** [Sirshak] atomic exchange acquire and release macro patch.- Merged.
+
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
*** [Sirshak] To start work on Arithmetic and Logic relaxed functions.
+
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Tested perfmon patch - Jieqiang
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
 
 +
'''6/7/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 
* CSIT
 
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
 +
***** CSIT perf numbers VS local perf numbers
 +
****** VPP cloud image in CSIT VS native built VPP in local env
 +
****** One DPDK patch introduced perf degradation on Arm platform
 +
****** Configuration difference between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures of developing/developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 
** VPP Path
 
** VPP Path
*** 3 failures currently stalling deployment.
+
*** Voting and working fine.
*** VPP-1476, VPP-1475, VPP-1478
+
*** CentOS-8 jobs have been removed.
*** These failures are seen on Debian x86 VM also.
+
*** Parallelization(n=32) is resulting in failures.
+
 
** VPP Device
 
** VPP Device
*** thunderx: Juraj created a LF tkt for wiring the 1-node topology on cavium thunderx.
+
*** Device Testing on ThunderX2 servers
*** thunderx2: no updates.
+
**** Juraj will commit the patch to disable the failing test cases
*** mcbin: Kernel issue yet to try suggestion from Garcia and Damjan.
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Reboot server recover the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
 +
***** Not related to iavf driver; the AVF interface (VPP native driver) has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
**** Investigate SVE vs NEON packet checksum comparison
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Tested perfmon patch - Jieqiang
 +
*** Review SPD flow cache patch from Intel folks - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
****** Investigating crash issue with 90% linerate IPSec traffic with QAT card
 +
 
 +
'''5/17/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Lijian Zhang
 +
** Tina Tsou
 +
 
 +
* CSIT
 
** VPP Performance Test
 
** VPP Performance Test
*** Work is ongoing on writing scripts for Performance Jobs.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
* FD.io lab
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** Arista switch power supply and rack rails. Status: being worked internally
+
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
** ThunderX2 - No updates.
+
***** CSIT perf numbers VS local perf numbers
* Action Items - Next Week
+
****** VPP cloud image in CSIT VS native built VPP in local env
**
+
****** One DPDK patch introduced perf degradation on Arm platform
'''10/30/2018'''
+
****** Configuration difference between CSIT env and local env(Hugepage size, startup.conf parameters and etc)
 +
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures of developing/developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
 
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns not allowed.
 +
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
 +
***** dpdk iavf ignore the error and continue initialization, while vpp abort the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
****** Needs a kernel patch to resolve crash issue for QAT card
 +
******* Patch made by Yoan is upstream and waits for review
 +
******* Try patched VPP to verify QAT card usage
 +
 
 +
'''4/5/2022'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Govindarajan Mohandoss
** Honnappa
+
** Juraj Linkes
 
** Lijian Zhang
 
** Lijian Zhang
 
** Tina Tsou
 
** Tina Tsou
** Andy Wang
+
 
** Khemendra
+
* CSIT
* Action Items - Last Week
+
** VPP Performance Test
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** [Khem] To start with deployment of only L2 CSIT performance suite. - Discussed in CSIT meeting.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** [Khem] to send the ip4 failure logs csit-dev, vpp-dev. - Sent csit-dev and vpp-dev.
+
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
** [Juraj] Anton to install NICs. Juraj has sent the instructions. - Sent
+
***** Check whether customer support can help with the PEX installation issue - Juraj
 +
***** Juraj will write down the procedures on setting up Ampere Altra setup in FD.io lab
 +
****** And the procedures of developing/developing test cases in CSIT (performance & device testing)
 +
****** Juraj should have already sent to Jieqiang previously.
 +
 
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
******* Juraj will commit the patch and get it confirmed with Zachary
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** Device Testing on ThunderX2 servers
 +
**** Juraj will commit the patch to disable the failing test cases
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns not allowed.
 +
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
 +
***** dpdk iavf ignore the error and continue initialization, while vpp abort the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
 +
** QAT cards
 +
*** Govind will ship another 2x QAT from Austin to FD.io lab
 +
*** Will procure 2x QAT cards and verify them internally first.
 +
*** The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
 +
*** QAT test cases are developed based on Python APIs / CLIs
 +
 
 
* VPP
 
* VPP
** Vectorization
+
** VPP SVE implementation - Lijian
*** [Lijian] Internal review going on msb patch.
+
*** SVE validation on FPGA platform - Confluence page ready
*** [Lijian] Zero byte mask neon function not consistent with x86. Working on optimizing it.
+
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
*** [Lijian] ip4 lookup buffer index to buffer pointer optimization. - Still investigating.
+
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
** Memory Ordering:
+
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
*** [Sirshak] Internal review going on atomic exchange acquire and release macro patch.
+
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
 
 +
'''3/15/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 
* CSIT
 
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
****** Investigate Inbound SPD test cases - Juraj
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
***** Release testing for 22.02 is done
 +
****** https://s3-docs.fd.io/csit/rls2202/report/index.html
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 
** VPP Path
 
** VPP Path
*** 2 failures currently stalling deployment.
+
*** Voting and working fine.
*** https://jira.fd.io/browse/VPP-1476
+
*** CentOS-8 jobs have been removed.
*** https://jira.fd.io/browse/VPP-1475
+
*** VPP-1478 - L2FIB failures in Taishan.
+
*** 18.04 does not solve the problem
+
*** ThunderX2 - 4-5 failures in L2BD cases.
+
*** ThunderX1 - l2fib and juraj
+
*** [Lijian] to take a look.
+
 
** VPP Device
 
** VPP Device
*** Waiting for NICs to be installed on ThunderX2.
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
*** mcbin : Linux Kernel not able to use the data ports. Needed for scapy. [Sirshak] to take a look
+
*** VM cases failed only on Arm
*** mcbin: New Kernel is not working.  
+
**** Tried to increase the timeout to see if it will fix the issue
** Performance Test
+
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
*** Discussed in CSIT meeting
+
**** Reboot server recover and monitoring
*** Researching how to create jobs
+
**** Need to look into it, try manually
*** More things to be discussed in the CSIT meeting.
+
***** May need to upgrade iavf driver
* FD.io lab
+
*** Server inaccessible
** Arista switch needs additional hardware. - Andy to send tracking no.
+
**** Rebooting the server recovered the service
** ThunderX2 - Waiting on LF for NICs installing and rack installing in general.
+
*** AVF interface creation issue:
* Action Items - Next Week
+
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns not allowed.
** [Khem] Deployment of only L2 CSIT performance suite.
+
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
** [Lijian] Status on VPP path failures.
+
***** dpdk iavf ignore the error and continue initialization, while vpp abort the init process
** [Sirshak] Kernel Migration mcbin.
+
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
** [Andy] to send tracking no for Arista power supply.
+
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.  
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Performance servers have arrived at FD.io lab
 +
**** Servers are in the process of being wired, expected to be operational soon
 +
**** Will follow the trend for Arm servers if more mlx NICs are installed on X86
 +
**** Plan to install QAT cards on performance servers
 +
**** Juraj to get QAT card availability from CSIT community
  
'''10/23/2018'''
+
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
*** Rebase the patch and final round of benchmarking for frag/reassembly nodes
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
****** Kernel with aarch64 patch is expected to release soon
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
***** Patch to resolve iommu issue for mlx NIC when using with QAT card
 +
****** Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
 +
'''3/1/2022'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Govindarajan Mohandoss
** Juraj Linkeš
+
** Jieqiang Wang
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried to increase the timeout to see if it will fix the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Reboot server recover and monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns not allowed.
 +
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
 +
***** dpdk iavf ignore the error and continue initialization, while vpp abort the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Paperwork for shipment is done
 +
*** Build servers will arrive at end of Jan
 +
*** Performance servers will arrive in Feb
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
 +
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
 +
*** reassembly node opt by adding prefetch
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
** VPP IPv6 fragmentation
 +
*** Multi-arch node and batch memcpy - src, dst, bytes.
 +
*** VPP performance drop seen in CSIT after bump dpdk version to 21.11
 +
**** https://gerrit.fd.io/r/c/vpp/+/34705
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
 
 +
'''1/25/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 
** Lijian Zhang
 
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tianyu Li
 
** Tina Tsou
 
** Andy Wang
+
 
** Khemendra
+
* CSIT
* Action Items - Last Week
+
** VPP Performance Test
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** [Khem] VPP performance suite. L2XC, L2BD working. IPv4 failing.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** [Juraj and Sirshak] We need to check RC2 after 17th of October. [Khem] To send the test failure logs to vpp-dev.
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node
 +
**** Rebooting the server recovered it; monitoring
 +
**** Need to look into it, try manually
 +
***** May need to upgrade iavf driver
 +
*** Server inaccessible
 +
**** Rebooting the server recovered the service
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns not allowed.
 +
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Need to figure out how to reproduce the error - Juraj
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
*** Paperwork for shipment is done
 +
*** Build servers will arrive at end of Jan
 +
*** Performance servers will arrive in Feb
 +
 
 
* VPP
 
** Vectorization
+
** VPP SVE implementation - Lijian
*** [Lijian] Looking at shuffle and msb patches.
+
*** SVE validation on FPGA platform - Confluence page ready
*** [Lijian] Sent the shuffle analysis to Nitin for feedback.
+
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
*** [Lijian] Preliminary analysis says we take more instructions, hence the slowdown. To reevaluate the algorithm itself.
+
**** Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
** Memory Ordering:
+
** VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
*** [Sirshak] Part 1 of the patch merged which now introduces macros. Working on part 2.
+
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
 
 +
'''1/18/2022'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Tina Tsou
 +
 
 
* CSIT
 
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591 -- Merged
 
** VPP Path
 
*** To be deployed as a part of CI after 1810 release.
+
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 
** VPP Device
 
*** Waiting for NICs to be installed on ThunderX2.
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
*** mcbins working. Juraj to start work on getting one instance of the VPP device test running.
+
*** VM cases failed only on Arm
*** Docker working, musdk working. Currently facing issues with the 1-node topology.
+
**** Tried increasing the timeout to see if it fixes the issue
** Performance Test
+
*** VM cases failed on 1 min timeout on creating VM (qemu cli)- tx2 node  
*** L2XC, L2BD working. IPv4 failing. [Khem] to send the failure logs to csit-dev and vpp-dev.
+
**** Rebooting the server recovered it; monitoring
*** [Khem] To start with deployment of only L2 performance suite.
+
**** Need to look into it, try manually
* FD.io lab
+
***** May need to upgrade iavf driver
** Arista switch needs additional hardware. To be shipped in 2 weeks.
+
*** Server inaccessible
** NICs and wires for ThunderX2 - Received. Anton to install NICs. Juraj has sent the instructions. 
+
**** Rebooting the server recovered the service
** Power Cycler to be ordered by LF - It's available and working.
+
*** AVF interface creation issue:
* Documentation
+
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
** Lijian's patch merged.
+
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns not allowed.
* Action Items - Next Week
+
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
** [Khem] To start with deployment of only L2 performance suite.
+
***** Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
** [Khem] to send the failure logs to csit-dev and vpp-dev.
+
**** Race condition occurs on /dev/vfio mounting
** [Juraj] Anton to install NICs. Juraj has sent the instructions.
+
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
'''10/16/2018'''
+
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.  
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
 +
*** 2 performance servers waiting for Intel NICs
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
 
 +
'''1/11/2022'''
 
* Attendees
 
** Sirshak Das
+
** Govindarajan Mohandoss
** Juraj Linkeš
+
 
** Lijian Zhang
 
 +
** Jieqiang Wang
 +
** Juraj Linkes
 
** Tina Tsou
 
** Andy Wang
+
 
* Action Items - Last Week
+
* CSIT
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
** VPP Performance Test
** [Khem] to try VPP performance suite to see change after Vectorization and Loop unrolling patch.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Remaining issues in L2 and IPv4 - Sirshak to try debug. Status: With inputs from Neale issue fixed.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns not allowed.
 +
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 
* VPP
 
** Vectorization
+
** VPP SVE implementation - Lijian
*** [Lijian] Studying about Vectorization and Memory ordering.
+
*** SVE validation on FPGA platform - Confluence page ready
** Memory Ordering:
+
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
*** [Sirshak] One patch up-streamed. Relaxed memory ordering patch being reviewed internally.
+
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
*** Benchmark IPv4 fragmentation node using rdma plugin
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
 
 +
'''12/14/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
 
 
* CSIT
 
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591
 
** VPP Path
 
*** Should be deployed on 25th Oct after 1810 release.
+
*** Voting and working fine.
*** We need to check RC2 after 17th of October [Juraj and Sirshak].
+
*** CentOS-8 jobs have been removed.
 
** VPP Device
 
*** Waiting for ThunderX2 NICs. Mcbin has issues with 2 VPP instances and traffic being sent.
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
** Performance Test
+
*** VM cases failed only on Arm
*** No updates
+
**** Tried increasing the timeout to see if it fixes the issue
* FD.io lab
+
*** AVF interface creation issue:
** QSFP+ switch for Cavium blades. - Received
+
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
** NICs and wires for ThunderX2 - Andy to confirm that the NICs have been sent. Juraj: LF ticket to be opened.
+
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns not allowed.
** Power Cycler to be ordered by LF - It's available and should be operational this week.
+
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
* Documentation
+
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
** Scott has reviewed the changes from Lijian; he will merge them this week after a few modifications of his own.
+
***** Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
* Action Items - Next Week
+
**** Race condition occurs on /dev/vfio mounting
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
** [Khem] to try VPP performance suite to see change after Vectorization and Loop unrolling patch.
+
******* Patch has been merged
** [Juraj and Sirshak] We need to check RC2 after 17th of October.
+
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.  
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
  
'''10/09/2018'''
+
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
** VPP IPv6 Benchmarking and Profiling - Jieqiang
 +
*** IPv6 profiling
 +
**** No perf bump for lookup_x2 function in FD.io Gerrit
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
 
 +
'''12/07/2021'''
 
* Attendees
 
** Sirshak Das
+
** Tianyu Li
** Maciek Konstantinowicz
+
** Govindarajan Mohandoss
** Juraj Linkeš
+
 
** Lijian Zhang
 
 +
** Jieqiang Wang
 +
** Juraj Linkes
 
** Tina Tsou
 
** Andy Wang
+
 
* Action Items - Last Week
+
* CSIT
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
** VPP Performance Test
** [Juraj] to try musdk enabled kernel. - Juraj retried it worked.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** [Khem] to try VPP performance suite - Difference in numbers based on previous patches.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting release testing done - ETA 1 week or 2
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
***** Patch ready for review and to be merged https://gerrit.fd.io/r/c/csit/+/34591
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VM cases failed only on Arm
 +
**** Tried increasing the timeout to see if it fixes the issue
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns not allowed.
 +
***** Not related to iavf driver, AVF interface - vpp native driver have this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable vote right then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
******* Dave gave minor comments https://gerrit.fd.io/r/c/ci-management/+/34679 - comment addressed and will be merged soon
 +
******* Periodic job will stop when per patch job enabled
 +
** New Arm servers shipment to the FD.io lab - about Jan 2022
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm with RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 
* VPP
 
** Vectorization
+
** VPP SVE implementation - Lijian
*** Do perf analysis of the compiled code, i.e. compare the buffer-indices code with the buffer-pointers code, with and without the quad loop.
+
*** SVE validation on FPGA platform - Confluence page ready
*** Lijian to rebase the patch and try a few experiments
+
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
*** Khem Updates: No updates.
+
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
*** Memory Ordering Patch reintroduced. Broken down into smaller patches and the first patch has been upstreamed, others will be phased in gradually.
+
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
** VPP IPv6 Benchmarking and Profiling - Jieqiang
 +
*** IPv6 profiling
 +
**** No perf bump for lookup_x2 function in FD.io Gerrit
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Depends on kernel patch
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
***** Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
 +
 
 +
'''11/30/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
 
 
* CSIT
 
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting for release testing to finish - ETA 1 or 2 weeks
 +
****** Outbound SPD test patch has been enabled
 +
******* https://s3-docs.fd.io/csit/master/trending/trending/ipsec-2n-tx2-xl710.html#b-ipsec-spe-ip4routing-base-scale
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 
** VPP Path
 
** VPP Path
*** Set up publicly accessible Cavium machine for debugging purposes
+
*** Voting and working fine.
*** 2 tickets resolved by Neale.
+
*** CentOS-8 jobs have been removed.
*** Remaining issues in L2 and IPv4 - Sirshak to try to debug; Neale will look into it in spare cycles
+
 
** VPP Device
 
** VPP Device
*** SRIOV reservation system code exists and is being tested
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
** Performance Test
+
*** AVF interface creation issue:
*** NDR issue Status:
+
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
*** Issues in ip4. Status:
+
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns "not allowed".
* FD.io lab
+
***** Not related to iavf driver; AVF interface - VPP native driver has this issue
** QSFP+ switch for Cavium blades. - Has been shipped
+
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
** NICs and wires for ThunderX2 - Wires have been shipped; NICs will be delayed
+
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
** Power Cycler to be ordered by LF.
+
**** Race condition occurs on /dev/vfio mounting
* Documentation
+
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
** Started upstreaming, no response yet
+
******* Patch has been merged
* Action Items - Next Week
+
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.  
** [Khem] to try VPP performance suite to see change after Vectorization and Loop unrolling patch.
+
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
** Remaining issues in L2 and IPv4 - Sirshak to try to debug. Status: Issue fixed with inputs from Neale.
+
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable voting rights then
 +
****** Ping Dave about enabling VPP device testing per patch
 +
** New Arm servers shipment to the FD.io lab
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm RAM/disk size for the new build servers
 +
**** 500G disk/256G RAM
 +
**** Each job will consume about 16G memory
 +
*** Intel NIC firmware upgrade on Arm - not supported
  
'''10/02/2018'''
+
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation - Tianyu & Jieqiang
 +
*** Add multi-arch support for ip4-frag node but see no perf bump
 +
*** Apply loop unrolling on ip4-frag node
 +
** VPP IPv6 Benchmarking and Profiling - Jieqiang
 +
*** IPv6 profiling
 +
**** No perf bump for lookup_x2 function in Fd.io gerrit
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
** CNF PoC proposal preparation- Tianyu
 +
*** Add support for VPP aarch64 docker image build
 +
*** Calico use cases exploration on VPP
 +
*** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
'''11/23/2021'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Tianyu Li
** Honnappa
+
** Juraj Linkeš
+
 
** Lijian Zhang
 
** Lijian Zhang
** Khemendra
+
** Jieqiang Wang
 +
** Juraj Linkes
 
** Tina Tsou
 
** Tina Tsou
** Andy Wang
+
 
** Honnappa
+
* CSIT
* Action Items - Last Week
+
** VPP Performance Test
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** [Khem] VPP-1391 - VPP 'make verify' failed on Huawei Taishan server: Similar failures to cavium setup.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** [Sirshak] - mlnx patch merged.
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** [Sirshak] to setup meeting with andy, tina and honnappa regarding fd.io lab purchase planning. - Not needed.
+
**** IPsec SPD input/output test case ongoing - Juraj
** [Juraj] to try musdk enabled kernel. - Failing; Sirshak to debug this.
+
***** Enable the SPD outbound tests
** [Lijian] to resolve upstream comments. - Patch merged upstream.
+
****** Patches ready, waiting for release testing to finish - ETA 1 or 2 weeks
** [Khem] to try VPP performance suite with Lijian's Patch. - No need; can try with master now.
+
****** Outbound SPD test patch merged and running; report expected next week.
** [Khem] To try to debug traffic flow from TG to DUT with current master. - No updates.
+
****** Inbound patch pending on merge, need maintainer's review
** [Khem] To send current status. - Sent.
+
****** https://gerrit.fd.io/r/c/csit/+/34256
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
**** 3 nodes Taishan crypto test case failed - related to CSIT change
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns "not allowed".
 +
***** Not related to iavf driver; AVF interface - VPP native driver has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* Patch has been merged
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.  
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
******* VPP Device configuration align with VPP Performance configuration - no issue yet
 +
***** Enable VPP device testing per patch
 +
****** Monitor for a week and enable voting rights then
 +
** New Arm servers shipment to the FD.io lab
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
**** Need to confirm RAM/disk size for the new build servers
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 
* VPP
 
* VPP
** Vectorization
+
** VPP SVE implementation - Lijian
*** Understand Performance degradation.
+
*** SVE validation on FPGA platform - Confluence page ready
**** msb correct version.
+
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
**** ip4_forward buffer index to buffer pointers.
+
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
** Tuning dual/quad loop
+
**** FPGA team promises to provide FPGA image with DMC-620
*** Patch merged.
+
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
*** Khem Updates: No updates.
+
** VPP IPv4 fragmentation
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Performance degradation with quad loop unrolling applied on ip6_lookup_inline
 +
**** Patch the current kernel to enable perfmon plugin on VPP
 +
**** Need to check performance for IPv6 subnet routing
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
 
 +
 
 +
'''11/16/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
 
 
* CSIT
 
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** Patches ready, waiting for release testing to finish - ETA 1 or 2 weeks
 +
****** Inbound patch pending on merge, need maintainer's review
 +
****** https://gerrit.fd.io/r/c/csit/+/34256
 +
***** Release testing for 21.10 is done
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 
** VPP Path
 
** VPP Path
*** Primarily 2 categories of failures. 8 tickets opened.
+
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 
** VPP Device
 
** VPP Device
*** shim layer to be leaner and most of the functionality will reside in jenkins-slave.
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
** Performance Test
+
*** AVF interface creation issue:
*** L2-basic(L2XC, L2BD) PDR, MDR passing. NDR has issues but debugged.  
+
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
*** Issues in ip4.
+
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns "not allowed".
* FD.io lab
+
***** Not related to iavf driver; AVF interface - VPP native driver has this issue
** mcbin - Sirshak to help debug connectivity issues
+
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
** QSFP+ switch for Cavium blades. - Andy working on getting a refurbished one.
+
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
** Power Cycler to be ordered by LF.
+
**** Race condition occurs on /dev/vfio mounting
* Documentation
+
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
**  
+
******* Patch has been merged
* Action Items - Next Week
+
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
+
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.  
** [Juraj] to try musdk enabled kernel. - Juraj retried; it worked.
+
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
** [Khem] to try VPP performance suite - Difference in numbers based on previous patches.
+
******* Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
 +
***** Enable VPP device testing per patch
 +
** New Arm servers shipment to the FD.io lab
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** SVE validation on FPGA platform - Confluence page ready
 +
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
 +
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
 +
**** FPGA team promises to provide FPGA image with DMC-620
 +
**** Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
 +
** VPP IPv4 fragmentation
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Performance degradation with quad loop unrolling applied on ip6_lookup_inline
 +
**** Patch the current kernel to enable perfmon plugin on VPP
 +
**** Need to check performance for IPv6 subnet routing
 +
**** Try Mellanox NICs for IPv6 routing tests
 +
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
  
'''9/25/2018'''
+
'''11/09/2021'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Tianyu Li
** Honnappa
+
** Govindarajan Mohandoss
** Juraj Linkeš
+
 
** Lijian Zhang
 
** Lijian Zhang
** Khemendra
+
** Juraj Linkes
 
** Tina Tsou
 
** Tina Tsou
** Andy Wang
+
 
** Honnappa
+
* CSIT
* Action Items - Last Week
+
** VPP Performance Test
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. Khem to measure verify and test timings.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** Sirshak to verify if Damjan's suggestion on VPP compiling with Mellanox PMD driver works. - Does not. Fix suggested by Lijian merged by Damjan; additional patch needed, under internal review.
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** [VPP performance Suite] shm issues seen sporadically. Not seeing currently.
+
**** IPsec SPD input/output test case ongoing - Juraj
** Sirshak to set up meeting with Andy, Tina and Honnappa regarding fd.io lab purchase planning. - Not done; will do it this week.
+
***** Enable the SPD outbound tests
** Sirshak to send musdk instructions. - Done.
+
****** Patches ready, waiting for release testing to finish - ETA 1 or 2 weeks
** Juraj to try musdk enabled kernel and see if musdk and docker can coexist in the dirty kernel. - Trying them out.
+
****** Inbound patch pending on merge, need maintainer's review
 +
****** https://gerrit.fd.io/r/c/csit/+/34256
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** AVF interface creation issue:
 +
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
***** Related to PF i40e driver: VLAN stripping configured on VF, PF driver returns "not allowed".
 +
***** Not related to iavf driver; AVF interface - VPP native driver has this issue
 +
***** DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
 +
***** Intel will fix the issue from PF driver - workaround: use old i40e driver (from ubuntu 20.04)
 +
**** Race condition occurs on /dev/vfio mounting
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
 +
******* By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
 +
******* Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.  
 +
******* Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
 +
******* Will enable voting rights soon after the patch gets merged
 +
** New Arm servers shipment to the FD.io lab
 +
*** New servers are in the procurement process
 +
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
*** Intel NIC firmware upgrade on Arm - not supported
 +
 
 
* VPP
 
* VPP
** Mellanox NIC not working with VPP
+
** VPP SVE implementation - Lijian
*** [Sirshak] glue library not detected by VPP. Compiling static has cmake issues. - Fix done pending internal review.
+
*** SVE validation on FPGA platform - Confluence page ready
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. Khem to try things suggested by Juraj.
+
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
** Vectorization
+
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
*** Understand Performance degradation.
+
**** FPGA team promises to provide FPGA image with DMC-620
**** msb correct version.
+
*** VPP IPv4 fragmentation
**** ip4_forward buffer index to buffer pointers.
+
** VPP IPv6 Benchmarking and Profiling
** Tuning dual/quad loop
+
*** IPv6 profiling
*** Patch upstreamed pending review.
+
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
*** Lijian to resolve upstream comments.
+
**** Performance degradation with quad loop unrolling applied on ip6_lookup_inline
*** Khem to try VPP performance suite with Lijian's Patch. (affects 1 route configuration more than 10k routes.)
+
**** Patch the current kernel to enable perfmon plugin on VPP
 +
**** Need to check performance for IPv6 subnet routing
 +
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
 
 +
'''11/02/2021'''
 +
* Attendees
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
 
 
* CSIT
 
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
**** IPsec SPD input/output test case ongoing - Juraj
 +
***** Enable the SPD outbound tests
 +
****** https://gerrit.fd.io/r/c/csit/+/34256
 +
**** New links for VPP perf trending/report pages
 +
***** Daily trending: https://s3-docs.fd.io/csit/master/trending/
 +
***** Release report: https://s3-docs.fd.io/csit/master/report/
 
** VPP Path
 
** VPP Path
*** Current Test Cases Failure: 8
+
*** Voting and working fine.
*** [Juraj] Mail sent regarding failures. Need community support regarding failures.
+
*** CentOS-8 jobs have been removed.
*** [Khem] To see if he can take up one test case failure and resolve it.  
+
 
** VPP Device
 
** VPP Device
*** In case of 2 loopbacks we can use 2 physical devices.
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
*** Facing issues with console connection to mcbin.
+
*** AVF interface creation issue:
** Performance Test
+
**** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
*** [Khem] working on IPv4 testing failures with CSIT script. Will be starting on v4 suite. To debug at DPDK level and talk to VPP community.
+
***** Race condition occurs
*** [Khem] To try to debug traffic flow from TG to DUT with current master.
+
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
*** [Khem] To send current status.
+
******* Addressed comments, waiting Peter's review.
* FD.io lab
+
******* Will enable voting rights soon after the patch gets merged
** mcbin - trying new kernel images to resolve connectivity issues.
+
** New Arm servers shipment to the FD.io lab
** QSFP+ switch for Cavium blades. - Waiting for reply from Anton.
+
*** New servers are in the procurement process
* Documentation
+
*** Plan to replace old thunderx1 build servers with more advanced Arm servers
** Trevor to help with contiv documentation. - Trevor has made changes, Lijian has sent for comments.
+
*** Intel NIC firmware upgrade on Arm - not supported
* Action Items - Next Week
+
 
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
* VPP
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server.
+
** VPP SVE implementation - Lijian
** Sirshak - mlnx patch merged.
+
*** SVE validation on FPGA platform - Confluence page ready
** Sirshak to setup meeting with andy, tina and honnappa regarding fd.io lab purchase planning. - Not needed.
+
*** Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
** Juraj to try musdk enabled kernel. - Failing; Sirshak to debug this.
+
**** Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
** Lijian to resolve upstream comments. - Patch merged upstream.
+
**** FPGA team promises to provide FPGA image with DMC-620
** Khem to try VPP performance suite with Lijian's Patch.
+
** VPP IPv6 Benchmarking and Profiling
** [Khem] To try to debug traffic flow from TG to DUT with current master.
+
*** IPv6 profiling
** [Khem] To send current status.
+
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
'''9/18/2018'''
+
**** Performance degradation with quad loop unrolling applied on ip6_lookup_inline
 +
**** Patch the current kernel to enable perfmon plugin on VPP
 +
**** Need to check performance for IPv6 subnet routing
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
 +
 
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Improve ansible scripts to deploy VPP&snort on K8S pods automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
 +
*** v10 kernel patch is ready, which fixes intermittent large statistic number for events
 +
**** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Picchi
 +
 
 +
 
 +
'''10/26/2021'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Juraj Linkes
** Honnappa
+
** Tianyu Li
** Juraj Linkeš
+
** Govindarajan Mohandoss
 
** Lijian Zhang
 
** Lijian Zhang
 +
** Jieqiang Wang
 
** Tina Tsou
 
** Tina Tsou
* Action Items - Last Week
+
 
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
* CSIT
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server, and requires steps on reproducing the issue from Juraj.
+
** VPP Performance Test
** Lijian to update dual/quad loop code review per comments. - Resolved.  
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Sirshak to verify if Damjan's suggestion on VPP compiling with Mellanox PMD driver works. - It doesn't. Send him a reminder.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** Juraj to investigate the compiler versions on fd.io lab machines. Upgrade to latest GCC-7.3.
+
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week. - closed
** [VPP performance Suite] shm issues seen sporadically. Khem to send an email to vpp-dev.
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** Honnappa to confirm whether we order NICs for ThunderX1, and to specify if they are external NICs.
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** Sirshak to setup meeting with andy, tina and honnappa regarding fd.io lab purchase planning.
+
***** Inbound IPsec: reproduced and need to investigate - Juraj
** Sirshak to send musdk instructions.
+
***** Flow cache with 1, 10 SPD entries slower, still investigating. Manual test local vs CSIT have different result on 1-10 SPD policies.
 +
**** IPsec SPD input/output case ongoing
 +
***** Adding IPsec SPD outbound test cases 64B 1, 100 and 1k SPD entries, 1, 2, 4 cores, on tx2 testbed - clarified
 +
****** Flow cache on and off cases need to be measured.
 +
***** L2 BD 20k test cases take too long to execute; removed on Taishan.
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 3n-tsh testbed unreachable, investigating right now - Juraj
 +
***** TG firmware upgrade is in progress
 +
***** Server unreachable due to firmware & driver update - resolved - update all done
 +
**** Release testing for 21.10 starts
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Try to reproduce with another set of firmware and etc but issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: different error related to moving VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Resulting in the same failure as before; only happens on AArch64 platform
 +
***** Still see /dev/vfio resource busy error after Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Narrow down whether some of those arguments are the reason behind this and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
****** Race condition occurs
 +
****** try mounting a part of /dev/vfio to see if issue can be resolved
 +
****** Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and looks good right now
 +
******* Addressed comments, waiting for Peter's review.
 +
******* Will enable voting rights soon after the patch gets merged
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple link between DUT for testing LACP.
 +
*** Two advanced servers are in plan to ship
 +
**** Need the vexxhost team to confirm Ethernet / power cable type info.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
 +
**** New servers are in the procurement process
 +
**** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
****** CPU not fully utilized on Arm, need further investigation
 +
** Intel NIC firmware upgrade on Arm - not supported
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang have got VPN access now
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 
* VPP
 
* VPP
** Mellanox NIC not working with VPP
+
** VPP SVE implementation - Lijian
*** [Sirshak] glue library not detected by VPP. Compiling static has cmake issues.
+
**** SVE validation on FPGA platform - Confluence page ready
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. - make build-release and make test work. Khem to see if make verify is still failing.
+
***** Run unit tests from DPDK and VPP bihash on FPGA
** Vectorization
+
***** Try Lijian's SVE patch to see any cycle count improvement
*** [Sirshak] Patches submitted. Patch widening the usage has performance issues.
+
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
** Tuning dual/quad loop
+
****** Run standalone SVE test cases on FPGA
*** [Lijian] 3%- throughput improvement (mcbin).  
+
****** Ask for DMC 620 images to run for FPGA
*** Include mcbin nos in commit message.
+
****** Enabling DMC 620 is closer to a real system, but performance will drop
** [Khem] working on IPv4 testing failures with CSIT script
+
****** Build a system using VPP memif and pktgen
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
**** Plan to try quad loop unrolling for ip6_lookup_inline function
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
 +
 
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Try to use ansible to deploy VPP automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Upstreaming the patch depends on a kernel patch; waiting for it before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from maintainer and Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''10/19/2021'''
 +
* Attendees
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
 
 
* CSIT
 
* CSIT
** CSIT-1139 - parallelize 'make verify'
+
** VPP Performance Test
*** make verify - working. Test Failure still there
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
*** Juraj investigating FAILED test cases in make verify.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** CSIT VPP Device updates
+
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week. - closed
*** Juraj to try musdk enabled kernel and see if musdk and docker can coexist in the dirty kernel.
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** CSIT Performance Test Suite Updates
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
* FD.io lab
+
***** Inbound IPsec: reproduced and need to investigate - Juraj
** mcbin issue
+
***** Flow cache with 1 and 10 SPD entries is slower; still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
** QSFP+ ports for Cavium blades.
+
**** IPsec SPD input/output case ongoing
* Documentation
+
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
** Trevor to help with contiv documentation.
+
**** 3n-tsh testbed unreachable, investigating right now - Juraj
* Action Items - Next Week
+
***** TG firmware is being upgraded
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
***** Server unreachable due to firmware & driver update - resolved - update all done
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server, and requires steps on reproducing the issue from Juraj.
+
**** Release testing for 21.10 starts
** Sirshak to verify if Damjan's suggestion on VPP compiling with Mellanox PMD driver works. -  
+
** VPP Path
** [VPP performance Suite] shm issues seen sporadically. Khem to send an email to vpp-dev.
+
*** Voting and working fine.
** Sirshak to set up a meeting with Andy, Tina and Honnappa regarding fd.io lab purchase planning.
+
*** CentOS-8 jobs have been removed.
** Sirshak to send musdk instructions.
+
** VPP Device
** Juraj to try musdk enabled kernel and see if musdk and docker can coexist in the dirty kernel.
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
'''9/11/2018'''
+
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues persist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
***** Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Narrowed down whether any of those arguments is the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
****** A race condition occurs
 +
****** try mounting a part of /dev/vfio to see if issue can be resolved
 +
****** Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045; results look good so far
 +
******* Will enable voting soon after the patch gets merged
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
**** Need vexxhost to confirm the Ethernet / power cable type.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
 +
**** New servers are in the procurement process
 +
**** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
****** CPU not fully utilized on Arm, need further investigation
 +
** Intel NIC firmware upgrade on Arm - not supported
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
****** Enabling DMC 620 is closer to a real system, but performance will drop
 +
****** Build a system using VPP memif and pktgen
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
**** Plan to try quad loop unrolling for ip6_lookup_inline function
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
 +
 
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
***** Try to use ansible to deploy VPP automatically
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input - Zach
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Upstreaming the patch depends on a kernel patch; waiting for it before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from maintainer and Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''10/12/2021'''
 
* Attendees
 
* Attendees
** Honnappa Nagarahalli
 
** Juraj Linkeš
 
** Sirshak Das
 
 
** Lijian Zhang
 
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Govindarajan Mohandoss
 
** Tina Tsou
 
** Tina Tsou
** Khemendra Kumar
+
 
* Action Items - Last Week
+
* CSIT
** Khem and Sachin to verify Sirshak's vectorization patches. - Ongoing(Khem)
+
** VPP Performance Test
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. - try make build-release and then make test. - Continue investigation on this issue.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
*** [Juraj] sends his steps to Khem to reproduce this issue.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** Lijian to talk to Damjan regarding adding Architecture specific TAG in make.- Code review internally
+
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week. - closed
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** Juraj to investigate the compiler versions on fd.io lab machines. - No update
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** [VPP performance Suite] shm issues seen sporadically. Khem to send an email to vpp-dev. - Not reproducible so far, will keep observing.
+
***** Inbound IPsec: reproduced and need to investigate - Juraj
 +
***** Flow cache with 1 and 10 SPD entries is slower; still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 3n-tsh testbed unreachable, investigating right now - Juraj
 +
***** TG firmware is being upgraded
 +
***** Server unreachable due to firmware & driver update - resolved - update all done
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues persist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
***** Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Narrowed down whether any of those arguments is the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
****** A race condition occurs
 +
****** try mounting a part of /dev/vfio to see if issue can be resolved
 +
****** Talked with Peter, Juraj is working on prototype of mounting part of /dev/vfio
 +
****** x86 VPP device job is fine because its firmware & driver are old
 +
****** Arm VPP device servers have updated drivers; VLAN stripping is not allowed, and the VLAN configuration cannot be removed from lab view.
 +
******  only performance testbeds have NIC drivers updated
 +
******  maintainer doesn't want to a option from vpp config
 +
****** May need to check whether x86 has the same issue with the same driver version before reaching out to Intel
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
**** Need vexxhost to confirm the Ethernet / power cable type.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
 +
**** New servers are in the procurement process
 +
**** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
****** CPU not fully utilized on Arm, need further investigation
 +
** Intel NIC firmware upgrade on Arm - not supported
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 
* VPP
 
* VPP
** Mellanox NIC not working with VPP caused by libmnl.so missing in cmake
+
** VPP SVE implementation - Lijian
*** [Sirshak]
+
**** SVE validation on FPGA platform - Confluence page ready
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. - make build-release and make test work. Khem to see if make verify is still failing.
+
***** Run unit tests from DPDK and VPP bihash on FPGA
** Vectorization
+
***** Try Lijian's SVE patch to see any cycle count improvement
*** [Sirshak] Patches ready and sent for review. Need arm community feedback on performance implications.
+
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
*** [Khem/Nitin] verify how those patches effect
+
****** Run standalone SVE test cases on FPGA
** Tuning dual/quad loop
+
****** Ask for DMC 620 images to run for FPGA
*** [Lijian] Dual/Quad loop code change is under internal review. Sirshak gives some comments, and Lijian will update diff accordingly.
+
****** Enabling DMC 620 is closer to a real system, but performance will drop
** [Khem] working on IPv4 testing failures with CSIT script
+
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
 +
 
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Upstreaming the patch depends on a kernel patch; waiting for it before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from maintainer and Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''09/28/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
 
 
* CSIT
 
* CSIT
** CSIT-1139 - parallelize 'make verify'
+
** VPP Performance Test
*** Parallelization is stable and working fine. We might see feature improvement requirements or new bugs.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
*** No new requirement so far.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** CSIT VPP Device updates
+
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
*** Two tickets on VPP FD.io done. Have to verify to make sure they are working.
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
*** x86 has improvement and some experiments are done. Juraj will try those experiments on ARM platforms.
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** CSIT Performance Test Suite Updates
+
***** Inbound IPsec: reproduced and need to investigate - Juraj
*** Facing issues with ip4 forwarding test case.
+
***** Flow cache with 1 and 10 SPD entries is slower; still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
*** shm issues seen sporadically. Khem to send an email to vpp-dev.
+
**** Release testing done.
* FD.io lab
+
**** IPsec SPD input/output case ongoing
** 3 LF tkts - 1 Node Topology wiring mcbin, 3-node topology wiring Mcbin, 1-node Topology wiring Cavium. - Finished
+
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
* Documentation
+
**** 3n-tsh testbed unreachable, investigating right now - Juraj
** Sirshak to take a look.
+
***** TG firmware is being upgraded
* Action Items - Next Week
+
** VPP Path
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
*** Voting and working fine.
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server, and requires steps on reproducing the issue from Juraj.
+
*** CentOS-8 jobs have been removed.
** Lijian to update dual/quad loop code review per comments
+
** VPP Device
** Sirshak to verify if Damjan's suggestion on VPP compiling with Mellanox PMD driver works
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
** Juraj to investigate the compiler versions on fd.io lab machines. Upgrade to latest GCC-8.2.0
+
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
** [VPP performance Suite] shm issues seen sporadically. Khem to send an email to vpp-dev.
+
***** Tried to reproduce with another set of firmware, etc., but the issues persist
** Honnappa to confirm: if we order NICs for ThunderX1, specify whether they are external NICs
+
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
***** Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Narrowed down whether any of those arguments is the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
****** A race condition occurs
 +
****** try mounting a part of /dev/vfio to see if issue can be resolved
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
**** Need vexxhost to confirm the Ethernet / power cable type.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
 +
**** New servers are in the procurement process
 +
**** Plan to replace old thunderx1 build servers with more advanced Arm servers
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu
 +
**** Two patches unmerged
 +
***** l2: fix array-bounds error for prefetch on Arm https://gerrit.fd.io/r/c/vpp/+/33307
 +
***** ioam: fix prefetch out bound on Arm https://gerrit.fd.io/r/c/vpp/+/33506
 +
 
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Upstreaming the patch depends on a kernel patch; waiting for it before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from maintainer and Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
  
'''9/4/2018'''
+
'''09/14/2021'''
 
* Attendees
 
* Attendees
** Sirshak Das
 
 
** Lijian Zhang
 
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 
** Tina Tsou
 
** Tina Tsou
** Khemendra Kumar
+
* CSIT
* Action Items - Last Week
+
** VPP Performance Test
** Lijian to try merging it upstream - Mellanox Changes. - Resolved.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Sachin to introduce RFC for IPsec offload support in DPDK plugin. - No updates.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. - try make build-release and then make test.
+
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
** Lijian to talk to Damjan regarding adding Architecture specific TAG in make.- Discuss Internally.  
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** Juraj to investigate the compiler versions on fd.io lab machines. - No updates.
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate - Juraj
 +
******** Learn more about the RFC; need time to understand it - on hold - waiting for Neale's response
 +
***** Flow cache with 1 and 10 SPD entries is slower; still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
***** A few more jobs run for release 21.06 and will be finished soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues persist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
***** Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
***** Narrowed down whether any of those arguments is the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
 +
**** AVF interface creation issue:
 +
***** Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
**** Need vexxhost to confirm the Ethernet / power cable type.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang now have VPN access
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 
* VPP
 
* VPP
** VPP-1339 - Mellanox NIC not working with VPP. - Resolved
+
** VPP SVE implementation - Lijian
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. - make build-release and make test work. Khem to see if make verify is still failing.
+
*** Vector length specific patch is ready
** Vectorization
+
*** SVE patch ready and upstreamed, under review - Lijian
*** [Sirshak] Patches ready and sent for review. Need arm community feedback on performance implications.
+
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
** Tuning dual/quad loop
+
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
*** [Lijian/Brian] Investigate dynamic function selection. Brian to look at this.
+
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied instead of referenced from src port to all dst ports
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
**** Try IPv4 multicasting & L2 flood testing which works fine
 +
**** ip4-replicate node in IPv4 multicasting/l2-flood node in L2 flood testing
 +
***** show mbuf is copied so that ref_cnt will always be one
 +
****** DPDK 21.08 has the patches; need to verify on VPP
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.  
 +
**** For 64B cacheline size native build on Arm, may need to change code.  
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Patch split into 3 components
 +
***** acl: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33597 (Merged)
 +
***** dpdk: fix prefetch assert on Arm https://gerrit.fd.io/r/c/vpp/+/33598 (Merged)
 +
***** session: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33599 (Merged)
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from the maintainer and an Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''09/07/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 
* CSIT
 
** CSIT-1139 - parallelize 'make verify'
+
** VPP Performance Test
*** No updates
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** CSIT VPP Device updates
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
*** No updates
+
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
** CSIT Performance Test Suite Updates
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
*** Facing issues with ip4 forwarding test case.
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
*** shm issues seen sporadically. Khem to send an email to vpp-dev.
+
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
* FD.io lab
+
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
** 3 LF tkts - 1 Node Topology wiring mcbin, 3-node topology wiring Mcbin, 1-node Topology wiring Cavium.
+
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
* Documentation
+
******* Outbound IPsec finished.
** Sirshak to take a look.
+
******* Waiting for new version of patchset to verify test cases
* Action Items - Next Week
+
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
******* Inbound IPsec: reproduced and need to investigate - Juraj
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server.
+
******** Learn more about RFC and need time to understand more - on hold - waiting for Neale's response
** Lijian to talk to Damjan regarding adding Architecture specific TAG in make.
+
***** Flow cache with 1, 10 SPD entries slower, still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
** Juraj to investigate the compiler versions on fd.io lab machines.
+
**** Release testing done.
** [VPP performance Suite] shm issues seen sporadically. Khem to send an email to vpp-dev.
+
***** A few more jobs run for release 21.06 and will be finished soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
***** Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
 +
***** Dig into the log for more details - Juraj
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
**** Need Vexxhost to confirm the Ethernet / power cable type info.
 +
**** Lijian has summarized feedback from Juraj and raised a Jira ticket to the DevOps team
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang have VPN access now
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan hit and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied rather than referenced from the src port to all dst ports
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
**** Try IPv4 multicasting & L2 flood testing which works fine
 +
**** ip4-replicate node in IPv4 multicasting/l2-flood node in L2 flood testing
 +
***** show mbuf is copied so that ref_cnt will always be one
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Patch split into 3 components
 +
***** acl: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33597 (Merged)
 +
***** dpdk: fix prefetch assert on Arm https://gerrit.fd.io/r/c/vpp/+/33598 (Under review)
 +
***** session: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33599 (Merged)
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
***** Patch has been upstreamed and received comments from one Intel engineer
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
  
'''8/28/2018'''
+
'''08/31/2021'''
 
* Attendees
 
** Sirshak Das
+
** Lijian Zhang
** Sachin Saxena
+
** Govindarajan Mohandoss
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate - Juraj
 +
******** Learn more about RFC and need time to understand more - on hold - waiting for Neale's response
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
***** A few more jobs run for release 21.06 and will be finished soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Juraj modified script to reproduce the issue - Lijian will try it locally
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Lijian has slightly different firmware and driver versions
 +
****** Tried Mellanox card (rdma driver) multiple times - did not see the issue that happens on the XL710 NIC - Juraj
 +
****** Lijian can use Juraj's script to reproduce the issue on local tx2 server
 +
******* Reducing the numa buffer allocation size resolves this issue
 +
******* Observed from the error log of numa buffer allocation
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
**** Need Vexxhost to confirm the Ethernet / power cable type info.
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang have VPN access now
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan hit and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
****** Run standalone SVE test cases on FPGA
 +
****** Ask for DMC 620 images to run for FPGA
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
**** VPP CLI configuration and hotspot function are recorded in Confluence page
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied rather than referenced from the src port to all dst ports
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
**** Try IPv4 multicasting & L2 flood testing which works fine
 +
**** ip4-replicate node in IPv4 multicasting/l2-flood node in L2 flood testing
 +
***** show mbuf is copied so that ref_cnt will always be one
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Patch split into 3 components
 +
***** acl: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33597 (Merged)
 +
***** dpdk: fix prefetch assert on Arm https://gerrit.fd.io/r/c/vpp/+/33598 (Under review)
 +
***** session: fix prefetch out of struct bound on Arm https://gerrit.fd.io/r/c/vpp/+/33599 (Merged)
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Wait for v10 kernel patch to fix intermittent large statistic number for events
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
'''08/24/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 
** Juraj Linkes
 
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Outbound IPsec finished.
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate - Juraj
 +
******** Learn more about RFC and need time to understand more - on hold - waiting for Neale's response
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing done.
 +
***** A few more jobs run for release 21.06 and will be finished soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Juraj modified script to reproduce the issue - Lijian will try it locally
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Lijian has slightly different firmware and driver versions
 +
****** Tried Mellanox card (rdma driver) multiple times - did not see the issue that happens on the XL710 NIC - Juraj
 +
****** Lijian can use Juraj's script to reproduce the issue on local tx2 server
 +
******* Reducing the numa buffer allocation size resolves this issue
 +
******* Observed from the error log of numa buffer allocation
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
**** Need Vexxhost to confirm the Ethernet / power cable type info.
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang have VPN access now
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan hit and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
***** Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied rather than referenced from the src port to all dst ports
 +
**** Will try L2 flood test case & understand VPP/multicast code
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Issues about prefetch on current VPP code base
 +
***** Issue 1 support 128B/64B cache-line size in Arm image
 +
***** Issue 2 prefetch 'overflow' for native build
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 +
'''08/17/2021'''
 +
* Attendees
 
** Lijian Zhang
 
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 
** Tina Tsou
 
** Khemendra Kumar
+
* CSIT
** Honnappa Nagarahalli
+
** VPP Performance Test
** Brian Brooks
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
* Action Items - Last Week
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** Khem to create Jira Tkt - startup-config issues (NUMA node and memory issues). Jira ID [VPP-1405]
+
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
** Sirshak to ask Juraj to create a LF tkt for Power cycling. - Done
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** Lijian to follow up Mellanox issue. - Done. Patch Verified. To try merging it upstream.
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** Andy following up on cavium. - Done. Unavailability of resource from cavium. Box not priority right now, will take up later.
+
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
** Khem to create Jira IDs for Jumbo frames. [CSIT-1259]In CSIT performance suite, Jumbo frames TCs failing on ARM servers
+
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
** Sachin to introduce RFC for IPsec offload support in DPDK plugin. - No updates, trying to resolve cmake issues.
+
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
** Sirshak to add porting and tuning section to wiki. - Done
+
******* Outbound IPsec finished.
** Sirshak/Juraj talk about Mellanox issue in CSIT call. - Done
+
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate - Juraj
 +
******** Learn more about RFC and need time to understand more
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
 +
**** Release testing ongoing
 +
***** A few more jobs run for release 21.06 and will be finished soon
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.  
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 by making the iavf PMD the default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Juraj modified script to reproduce the issue - Lijian will try it locally
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Lijian has slightly different firmware and driver versions
 +
****** Tried Mellanox card (rdma driver) multiple times - did not see the issue that happens on the XL710 NIC - Juraj
 +
****** Lijian can use Juraj's script to reproduce the issue on local tx2 server
 +
******* Reducing the numa buffer allocation size resolves this issue
 +
******* Observed from the error log of numa buffer allocation
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian/Jieqiang have VPN access now
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 
* VPP
 
** VPP-1339 - Mellanox NIC not working with VPP - Mellanox provided DPDK Patch ready, Lijian to try upstream it to VPP.
+
** VPP SVE implementation - Lijian
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. - try make build-release and then make test.
+
*** Vector length specific patch is ready
** Vectorization
+
*** SVE patch ready and upstreamed, under review - Lijian
*** [Sirshak] Rough patch ready. Currently facing a crash due to it.
+
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
** Tuning dual/quad loop
+
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
*** Discussion ongoing with Damjan.
+
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
** Lijian to talk to Damjan regarding adding Architecture specific TAG in make.
+
**** SVE validation on FPGA platform - Confluence page ready
** Also to proceed with generic compilation to build for all micro architecture and do dynamic selection.
+
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
**** Patch upstreamed https://gerrit.fd.io/r/c/vpp/+/33422
 +
*** IPv6 profiling
 +
**** Hotspot function - ip6_lookup_node/ip6_rewrite_node
 +
**** Will try perfmon & understand two node functions
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
*** Try IPv4 multicast testing to verify the scenario when refcnt > 1
 +
**** GDB shows that mbufs are copied rather than referenced from src port to all dst ports
 +
**** Will try L2 flood test case & understand VPP/multicast code
 +
**** Direct/Indirect mbuf for VPP multicast testing
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.  
 +
**** For 64B cacheline size native build on Arm, may need to change code.  
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
**** Issues about prefetch on current VPP code base
 +
***** Issue 1 support 128B/64B cache-line size in Arm image
 +
***** Issue 2 prefetch 'overflow' for native build
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** Discussion on the default action on the IPsec inbound interface which does not match
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
***** Modify the commit message and upstream the perfmon patch - Zach
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
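
The VPP Prefetch item above concerns builds for 64B versus 128B cache-line sizes. A minimal Python sketch of why the line size matters: the number of cache lines (and therefore prefetches) needed to cover a buffer depends on both its offset and the line size. The function name and the sample sizes are illustrative assumptions and are not taken from the VPP code base.

```python
def cache_lines_touched(offset: int, size: int, line_size: int) -> int:
    """Return how many cache lines are covered by `size` bytes
    starting at byte `offset`, for a given cache-line size."""
    if size <= 0:
        return 0
    first = offset // line_size               # index of first line touched
    last = (offset + size - 1) // line_size   # index of last line touched
    return last - first + 1


if __name__ == "__main__":
    # Hypothetical 192-byte structure, aligned at offset 0.
    for line_size in (64, 128):
        n = cache_lines_touched(0, 192, line_size)
        print(f"{line_size}B lines: {n} prefetches")
```

Code that hard-codes one line size issues either too few or too many prefetches when run on a CPU with the other size, which is one reason to derive the count from the actual line size.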
 +
 
 +
'''08/10/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 
* CSIT
 
* CSIT
** Juraj to investigate the compiler versions on fd.io lab machines.
+
** VPP Performance Test
** CSIT-1139 - parallelize 'make verify'
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
*** Discussing with EdK, how to use that in jenkins job.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** CSIT VPP Device updates
+
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
*** No problems. Trying basic package installation of container topology.
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
*** aarch64 vpp packages not built for 18.04 LTS, potential problem when we switch from 16.04->18.04.  
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** CSIT Performance Test Suite Updates
+
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
*** No updates.  
+
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
* FD.io lab
+
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
** 3 LF tkts - 1 Node Topology wiring mcbin, 3-node topology wiring Mcbin, 1-node Topology wiring Cavium.
+
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
* Documentation
+
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
* Action Items - Next Week
+
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
** Lijian to try merging it upstream - Mellanox changes.
+
******* Outbound IPsec finished.
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
******* Waiting for new version of patchset to verify test cases
** Khem VPP-1391 - VPP 'make verify' failed on Huawei Taishan server. - try make build-release and then make test.
+
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
** Lijian to talk to Damjan regarding adding Architecture specific TAG in make.
+
******* Inbound IPsec: reproduced and need to investigate
'''8/21/2018'''
+
***** Flow cache with 1, 10 SPD entries slower, still investigating. Manual test local vs CSIT have different result on 1-10 SPD policies.
 +
**** Release testing ongoing
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.  
 +
****** This is fixed in DPDK 21.05 version by making iavf PMD as default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Juraj modified script to reproduce the issue - Lijian will try it locally
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Lijian has a slightly different firmware version and driver version
 +
****** Tried Mellanox card (rdma driver) multiple times - the issue seen on the XL710 NIC did not occur - Juraj
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian has got VPN access now
 +
*** Juraj signed Jieqiang's key
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128, CLI issue only, CSIT's python API works fine.
 +
*** Internal patch to resolve this issue under review - upstreamed
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.  
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** 4 loop unrolling decreasing performance
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
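
For the IPv6 CLI issue noted above (wrong routes for prefix lengths 123-128), the routes a correct implementation should produce can be cross-checked independently of VPP. The sketch below uses only Python's standard <code>ipaddress</code> module; the helper name and the sample address (from the 2001:db8::/32 documentation range) are illustrative assumptions, and nothing here calls the VPP CLI or API.

```python
import ipaddress


def expected_route(addr: str, prefix_len: int) -> str:
    """Return the network that `addr/prefix_len` should resolve to,
    with host bits cleared (strict=False permits host bits in the input)."""
    return str(ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False))


if __name__ == "__main__":
    # Print the reference prefix for each of the affected mask lengths.
    for plen in range(123, 129):
        print(plen, expected_route("2001:db8::ff", plen))
```

Comparing this output with the routes shown after adding them through the CLI gives a quick reference for which prefix lengths are handled incorrectly.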
 +
 
 +
'''08/03/2021'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Lijian Zhang
** Sachin Saxena
+
** Govindarajan Mohandoss
 
** Juraj Linkes
 
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
 +
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
 +
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
 +
******* Waiting for new version of patchset to verify test cases
 +
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
 +
******* Inbound IPsec: reproduced and need to investigate
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Manual test local vs CSIT have different result on 1-10 SPD policies.
 +
**** Release testing ongoing
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 version by making iavf PMD as default.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
*** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
**** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
***** Results in the same failure as before; only happens on the AArch64 platform
 +
**** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
***** config option set to N, /dev/vfio device or resource busy error
 +
***** config option set to Y & iommu_passthrough = 1, IP packets Rx timeout
 +
****** The longer the server runs, the more test cases fail
 +
***** Next to do
 +
****** Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
 +
******* Also seen in Intel QAT card from Zach
 +
****** Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
 +
****** Will try Mellanox card to see if same issue happens - Juraj
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian has got VPN access now
 +
*** Juraj signed Jieqiang's key
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
 +
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform - Confluence page ready
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
** VPP IPv6 Benchmarking and Profiling
 +
*** VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
 +
*** Internal patch to resolve this issue under review
 +
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
**** Current VPP does not support 64B cacheline size compilation for Arm images.
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** 4 loop unrolling decreasing performance
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
***** Calico use cases exploration on VPP
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code - Lijian & Govind
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
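
The IPsec unit tests listed above cover flow cache hash collisions and stale entry overwrites. The following Python sketch shows the general behaviour such tests exercise, using a direct-mapped cache with an epoch counter for invalidation. It is a generic illustration under assumed names (<code>FlowCache</code>, <code>invalidate_all</code>) and does not reproduce the VPP implementation in the patches referenced above.

```python
from dataclasses import dataclass
from typing import Hashable, List, Optional


@dataclass
class _Entry:
    key: Hashable   # full flow key, kept to detect hash collisions
    policy: str     # cached lookup result (hypothetical policy name)
    epoch: int      # cache epoch at insertion time, used to detect staleness


class FlowCache:
    """Direct-mapped cache: one entry per bucket, newest insert wins."""

    def __init__(self, n_buckets: int = 8) -> None:
        self.n_buckets = n_buckets
        self.buckets: List[Optional[_Entry]] = [None] * n_buckets
        self.epoch = 0

    def _index(self, key: Hashable) -> int:
        return hash(key) % self.n_buckets

    def invalidate_all(self) -> None:
        # Bumping the epoch makes every existing entry stale in O(1).
        self.epoch += 1

    def lookup(self, key: Hashable) -> Optional[str]:
        entry = self.buckets[self._index(key)]
        if entry is None or entry.key != key or entry.epoch != self.epoch:
            return None  # empty bucket, collision with another flow, or stale
        return entry.policy

    def insert(self, key: Hashable, policy: str) -> None:
        # Overwrites whatever occupies the bucket (colliding or stale entry).
        self.buckets[self._index(key)] = _Entry(key, policy, self.epoch)
```

With a single bucket every flow collides, which makes the overwrite and staleness paths easy to exercise: insert flow A, insert flow B, and a lookup of A misses; after <code>invalidate_all()</code> a lookup of B also misses until it is re-inserted.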
 +
 +
'''07/27/2021'''
 +
* Attendees
 
** Lijian Zhang
 
** Lijian Zhang
** Andy Wang
+
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 
** Tina Tsou
 
** Tina Tsou
** Khemendra Kumar
+
* CSIT
** Honnappa Nagarahalli
+
** VPP Performance Test
** Brian Brooks
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
* Action Items - Last Week
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** [Sirshak] Create LF RT ticket for power cycling mcbins - Not Done yet
+
**** 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
** [Honnappa] Add module owners list and performance analysis items to wiki page - Discussion Still going on.
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** [Lijian] Check if DPDK 18.08 helps Mellanox NIC issues. - Waiting for patch from Mellanox
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** [Sirshak] Create Jira ticket to see impact of Florin's patch : VPP-1401
+
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
** [Sirshak] Create Jira ticket for msb : VPP-1402
+
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
** [Khem] Try dual loop ip4_lookup_inline patch to see if it helps on A72-based D05. : Problems with Ipv4 forwarding(startup-config issues- NUMA Node and Memory issues).  
+
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
** [Brian] [https://projects.linaro.org/browse/LTN-10 LTN-10] - Help resolve VPP build failure on mcbins in FD.io lab
+
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
** [Juraj] Enable VPP Device on 1-node SoC now that SFP+ cables have arrived. : No Response from LF.
+
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
** [Sirshak] Follow up with Cavium regarding Ubuntu installation on cavium-4. Status: Andy Following up.
+
******* Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
** [Khem] Create Jira ticket for CSIT failures with jumbo frames
+
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
** [Khem] Create Jira ticket for running a subset of tests via a tag : [CSIT1250]In ARM Perf verify CI, running a subset of tests via a tag
+
******* Inbound IPsec: reproduced and need to investigate
 +
***** Flow cache with 1, 10 SPD entries slower, still investigating. Manual test local vs CSIT have different result on 1-10 SPD policies.
 +
**** Release testing ongoing
 +
***** Comparison between 21.06 and 21.01.1 is ongoing.
 +
**** IPsec SPD input/output case ongoing
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 21.06 vs 21.01 see performance drop on https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
 +
***** May need to check VM and IPsec cases
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
 +
****** This is fixed in DPDK 21.05 version by making iavf PMD as default.  
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Random issue, more frequently happening on Arm
 +
******* Not seen in CI recently or manually.
 +
**** scapy unexpected timeout issue: packet drop or slow issue?
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** Connection issue between Jenkins and the build executor in FD.io lab
 +
** Shipment of new advanced server to the FD.io lab
 +
*** One link between TG and DUT, multiple links between DUTs for testing LACP.
 +
*** Two advanced servers are planned for shipment
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian has got VPN access now
 +
*** Juraj signed Jieqiang's key
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 
* VPP
 
* VPP
** VPP-1339 - Mellanox NIC not working with VPP
+
** VPP SVE implementation - Lijian
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan server
+
*** Vector length specific patch is ready
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
*** SVE patch ready and upstreamed, under review - Lijian
** Vectorization 
+
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
*** Stalled due to Mellanox NIC issues as benchmarking patches is not possible.
+
***** Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
*** hadd and msb - Done.
+
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
*** extendto and shuffle going on.
+
**** SVE validation on FPGA platform - Confluence page ready
*** Shuffle using __built_in gives same performance as vector intrinsic as at -O2 neither compile tbl instruction.
+
***** Run unit tests from DPDK and VPP bihash on FPGA
** Tuning dual/quad loop
+
***** Try Lijian's SVE patch to see any cycle count improvement
** Sirshak to add Porting and Tuning Section to Wiki.
+
** VPP mbuf-fast-free tx offload
 +
*** Vector path shows performance improvement, still need to investigate scalar path
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
**** Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
 +
**** For 64B cacheline size native build on Arm, may need to change code.
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** 4 loop unrolling decreasing performance
 +
** VPP memif - Tianyu
 +
*** CNF PoC proposal preparation
 +
***** Add support for VPP aarch64 docker image build
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Review perfmon code: having some questions/comments, would like a review meeting - Lijian
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
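
The VPP Device notes above report that <code>/sys/module/vfio/parameters/enable_unsafe_noiommu_mode</code> influences the failure. When comparing test beds it can help to record that setting alongside the test results. A minimal Python sketch, assuming the usual Linux convention that boolean module parameters read back as <code>Y</code> or <code>N</code>; the helper names are illustrative.

```python
from pathlib import Path
from typing import Optional

# Path quoted in the minutes above.
NOIOMMU_PARAM = Path("/sys/module/vfio/parameters/enable_unsafe_noiommu_mode")


def parse_noiommu(text: str) -> bool:
    """Interpret the parameter file content ('Y' or 'N', with a newline)."""
    return text.strip().upper() == "Y"


def read_noiommu(path: Path = NOIOMMU_PARAM) -> Optional[bool]:
    """Return True/False for the current setting, or None when the file
    cannot be read (for example, the vfio module is not loaded)."""
    try:
        return parse_noiommu(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    state = read_noiommu()
    print("enable_unsafe_noiommu_mode:", "unknown" if state is None else state)
```

Logging this value at the start of a device test run makes it easier to correlate the "Device or resource busy" and Rx timeout symptoms with the host configuration.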
 +
 
 +
'''07/20/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 
* CSIT
 
* CSIT
** CSIT-1139 - parallelize 'make test'
+
** VPP Performance Test
*** dave barach to take a final look and merge.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Sirshak/Juraj to talk about having Mellanox in CSIT seeing current compatibility issues post-release.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
** CSIT VPP Device updates
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
*** Trying to get the 1-node topology: mcbin and cavium thunderx.
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** CSIT Performance Test Suite Updates
+
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
*** Current issues: NDR, PDR Jumbo frames failure but MRR passing. Memory and Numa Nodes issues in Taishan.  
+
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
* FD.io lab
+
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
** 3 LF tkts - Ubuntu Installation cavium-4, 1 Node Topology, Power Cycling mcbins(to be opened).
+
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
* Documentation
+
****** Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
** Documentation changes by Lijian Merged.
+
****** Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
* Action Items - Next Week
+
***** Flow cache with 1, 10 SPD entries slower, still investigating. Manual test local vs CSIT have different result on 1-10 SPD policies.
** Khem to create Jira Tkt - startup-config issues (NUMA node and memory issues).
+
**** Release testing ongoing
** Sirshak to ask Juraj to create a LF tkt for Power cycling.
+
***** Comparison between 21.06 and 21.01.1 is ongoing.  
** Lijian to follow up Mellanox issue.
+
**** IPsec SPD input/output case ongoing
** Andy following up on cavium.
+
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
** Khem to create Jira IDs for Jumbo frames.
+
**** 21.06 vs 21.01 see performance drop on https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
** Sachin to introduce RFC for IPsec offload support in DPDK plugin.
+
***** May need to check VM and IPsec cases
** Sirshak to add porting and tuning section to wiki.
+
** VPP Path
** Sirshak/Juraj talk about Mellanox issue in CSIT call.
+
*** Voting and working fine.
+
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated.  
 +
****** This is fixed in DPDK 21.05 version by making iavf PMD as default.  
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Random issue, more frequently happening on Arm
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** Connection issue between Jenkins and the build executor in FD.io lab
 +
** Shipment of new advanced server to the FD.io lab
 +
*** Two advanced servers are planned for shipment
 +
** VPN access request to FD.io Arm servers
 +
*** Lijian has got VPN access now
 +
*** Juraj signed Jieqiang's key
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  
'''8/14/2018'''
+
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform
 +
***** Run unit tests from DPDK and VPP bihash on FPGA
 +
***** Try Lijian's SVE patch to see any cycle count improvement
 +
** VPP mbuf-fast-free tx offload
 +
*** Performance improvement for IPv4 routing test cases using vector path
 +
** VPP Prefetch
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
**** https://gerrit.fd.io/r/c/vpp/+/33062
 +
**** https://gerrit.fd.io/r/c/vpp/+/33063
 +
**** https://gerrit.fd.io/r/c/vpp/+/33061
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
***** Patches have been upstreamed and waiting for review
 +
****** https://gerrit.fd.io/r/c/vpp/+/32420
 +
***** Prepare the memif readout - Tianyu
 +
***** Add support for VPP aarch64 docker image build
 +
** VPP IPsec on Arm - Govind
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
*** SPD prototype change on ipsec_output/encryption node - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Upstream patch depends on a kernel patch; waiting for that before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Zach
 +
 
 +
'''07/13/2021'''
 
* Attendees
 
* Attendees
 +
** Lijian Zhang
 
** Juraj Linkes
 
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Expected to be merged soon
 +
***** Flow cache is slower with 1 and 10 SPD entries; still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
 +
****** Hugepage size, NUMA node, core isolation, etc. may need to be checked.
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
 +
**** Release testing ongoing
 +
**** IPsec SPD input/output case ongoing
 +
**** Juraj may share the steps for how CSIT handles new configuration changes
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms - the 4-core container case failed on x86 and Arm - fixed and passing.
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 21.06 vs 21.01: performance drop seen in https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
 +
***** May need to check VM and IPsec cases
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; reproduction script provided; Intel is busy with other work; debugging in progress.
 +
****** Intel debugged and tried updating firmware/drivers; the VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK is still needed.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled for deprecation. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Random issue, more frequently happening on Arm
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
** VPN access request to FD.io Arm servers
 +
*** Will remind Machiek to sign Lijian's GPG public key.
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 +
 +
'''07/06/2021'''
 +
* Attendees
 
** Lijian Zhang
 
** Lijian Zhang
** Andy Wang
+
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 
** Tina Tsou
 
** Tina Tsou
** Khemendra Kumar
+
* CSIT
** Honnappa Nagarahalli
+
** VPP Performance Test
** Brian Brooks
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
* FD.io lab
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
** SFP+ cables shipment showing as delivered
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
 +
****** Expected to be merged soon
 +
***** Flow cache is slower with 1 and 10 SPD entries; still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
 +
****** Hugepage size, NUMA node, core isolation, etc. may need to be checked.
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
 +
**** Release testing ongoing
 +
**** IPsec SPD input/output case ongoing
 +
**** Juraj may share the steps for how CSIT handles new configuration changes
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms - the 4-core container case failed on x86 and Arm - fixed and passing.
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 21.06 vs 21.01: performance drop seen in https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
 +
***** May need to check VM and IPsec cases
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** CentOS-8 jobs have been removed.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; reproduction script provided; Intel is busy with other work; debugging in progress.
 +
****** Intel debugged and tried updating firmware/drivers; the VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK is still needed.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled for deprecation. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Random issue, more frequently happening on Arm
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
** VPN access request to FD.io Arm servers
 +
*** Will remind Machiek to sign Lijian's GPG public key.
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 
* VPP
 
* VPP
** VPP-1339 - Mellanox NIC not working with VPP
+
** VPP SVE implementation - Lijian
*** Lijian noticed DPDK version updated to 18.08 and might help - https://gerrit.fd.io/r/#/c/14154/
+
*** Vector length specific patch is ready
*** Tina helping find someone from Mellanox to help
+
*** SVE patch ready and upstreamed, under review - Lijian
** VPP-1391 - VPP 'make verify' failed on Huawei Taishan servers
+
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
*** Khem looking into this
+
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
** No updates on crypto
+
**** SVE validation on FPGA platform
** No updates on vectorization
+
** VPP Prefetch
** Tuning dual/quad loop
+
*** Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
*** DaveB suggests looking at MULTIARCH macros
+
**** Repeat the same test on Ampere server - fewer PMU cache misses for write always
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
***** Made some NEON changes; seeing some microbenchmark improvement
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang
 +
**** There may be a CSIT case named iacldstbase
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and are waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
***** Add support for VPP aarch64 docker image build
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics have been updated by the maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Upstream patch depends on a kernel patch; waiting for that before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Zach
 +
 
 +
'''06/29/2021'''
 +
* Attendees
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Zachary Leaf
 +
** Tina Tsou
 
* CSIT
 
* CSIT
** CSIT-1139 - parallelize 'make test'
+
** VPP Performance Test
*** Juraj updated patch with comments from Klement
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Khem seeing failures with jumbo frames
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
** Khem noticed new CSIT machines using tag to run a subset of tests
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
* Documentation
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
** Lijian working on patch to add Arm to Architecture section and Arm-based CSIT testbeds to CSIT section
+
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
* Action Items - Next Week
+
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
** [Sirshak] Create LF RT ticket for power cycling mcbins
+
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
** [Honnappa] Add module owners list and performance analysis items to wiki page
+
****** Test cases with 1, 10, 100, 1000 SPD entries - still under review
** [Lijian] Check if DPDK 18.08 helps Mellanox NIC issues
+
****** Expected to be merged soon
** [Sirshak] Create Jira ticket to see impact of Florin's patch
+
***** Flow cache is slower with 1 and 10 SPD entries; still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
** [Sirshak] Create Jira ticket for msb
+
****** Hugepage size, NUMA node, core isolation, etc. may need to be checked.
** [Khem] Try dual loop ip4_lookup_inline patch to see if it helps on A72-based D05
+
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
** [Brian] Help resolve VPP build failure on mcbins in FD.io lab
+
**** Release testing ongoing
** [Juraj] Enable VPP Device on 1-node SoC now that SFP+ cables have arrived
+
**** IPsec SPD input/output case ongoing
** [Sirshak] Follow up with Cavium regarding Ubuntu installation on cavium-4
+
**** Juraj may share the steps for how CSIT handles new configuration changes
** [Khem] Create Jira ticket for CSIT failures with jumbo frames
+
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
** [Khem] Create Jira ticket for running a subset of tests via a tag
+
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms - the 4-core container case failed on x86 and Arm - fixed and passing.
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
**** 21.06 vs 21.01: performance drop seen in https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
 +
***** May need to check VM and IPsec cases
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** Community plans to drop the support for CentOS-8.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; reproduction script provided; Intel is busy with other work; debugging in progress.
 +
****** Intel debugged and tried updating firmware/drivers; the VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK is still needed.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled for deprecation. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
 +
****** Debugging
 +
****** vfio-pci driver may be the root cause - bind/unbind
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
** VPN access request to FD.io Arm servers
 +
*** mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
 +
 
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform
 +
** VPP Prefetch
 +
*** Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
 +
**** Repeat the same test on Ampere server - fewer PMU cache misses for write always
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** 3 patches: prefetch, key-value compare simd improvement, cache to look up
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
***** Made some NEON changes; seeing some microbenchmark improvement
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang
 +
**** There may be a CSIT case named iacldstbase
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and are waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
***** Add support for VPP aarch64 docker image build
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics have been updated by the maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Upstream patch depends on a kernel patch; waiting for that before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Zach
  
'''8/7/2018'''
+
'''06/22/2021'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Govindarajan Mohandoss
 
** Juraj Linkes
 
** Juraj Linkes
** Lijian
+
** Tianyu Li
** Andrew Pinski
+
** Jieqiang Wang
** Andy Wang
+
** Zachary Leaf
 
** Tina Tsou
 
** Tina Tsou
** Nitin Saxena
+
* CSIT
** Khemendra
+
** VPP Performance Test
** Sachin Saxena
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Test cases with 1, 10, 100, 1000 SPD entries
 +
****** Expected to be merged soon
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
 +
**** Release testing ongoing
 +
**** IPsec SPD input/output case ongoing
 +
**** Juraj may share the steps for how CSIT handles new configuration changes
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms - the 4-core container case fails on x86 and Arm
 +
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** Community plans to drop the support for CentOS-8.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is actively responding to this issue. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; reproduction script provided; Intel is busy with other work; debugging in progress.
 +
****** Intel debugged and tried updating firmware/drivers; the VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK is still needed.
 +
***** Tried to reproduce with another set of firmware, etc., but the issues still exist
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses the i40evf PMD, which is old and scheduled for deprecation. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
****** Results in the same failure as before; only happens on the AArch64 platform
 +
****** vfio-pci driver may be the root cause
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
** VPN access request to FD.io Arm servers
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** Juan encountered and is fixing some issues running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform
 +
** VPP Prefetch
 +
*** Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
 +
**** Repeat the same test on Ampere server
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
***** Made some NEON changes; seeing some microbenchmark improvement
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang
 +
**** There may be a CSIT case named iacldstbase
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and are waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
***** Add support for VPP aarch64 docker image build
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics have been updated by the maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Internal review for IPsec input node flow cache implementation - Zach & Govind
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Upstream patch depends on a kernel patch; waiting for that before upstreaming to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on Arm to test IPsec crypto - Zach
  
* General Topic
+
'''06/15/2021'''
* Action Items - Last Week
+
* Attendees
** [Khem] make verify on Taishan failure Status: No Status. Khem to create a Jira Tkt.
+
** Lijian Zhang
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status: [Andy] Cables to be sent today.
+
** Govindarajan Mohandoss
** [Sirshak] Open Jira tkt look at Florin's patch. Status: To be done next week
+
** Juraj Linkes
** [Sirshak] DPDK 18.05 mlnx bug (VPP-1339). Status: Failing in a different place, e.g. rx-error; reported to Mellanox people by Lijian. [Lijian] To send the mail to vpp-dev. [Honnappa] To talk to the DPDK Mellanox community.
+
** Tianyu Li
** [Sirshak] Share Mellanox settings with Nitin.
+
** Jieqiang Wang
** [Sirshak] To send email to Yi and Lijian for documentation. Status: Lijian has done the documentation; under internal review.
+
** Zachary Leaf
** [Honnappa/Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status: [Sachin] To include Nitin suggestions and upstream.
+
** Tina Tsou
** [Sirshak] Send mail to LF for power cycler access for mcbin due to lack of IPMI interface Status: Sirshak to open LF Tkt
+
** [Nitin] Create a Jira Tkt - ip cksum 128 bit vector support. Status: Nitin using ip incremental cksum.
+
** [Sirshak] To create an LF tkt for Ubuntu 16.04 installation on cavium-4. Status: Anton tried 16.04 but it didn't work; sent mail to Cavium contact for help.
+
* VPP
+
** [Sirshak] Vectorization
+
*** msb is already implemented; verifying correctness and performance.
+
*** [Sirshak] To raise a Jira Tkt for msb changes.
+
*** Have communicated with the Arm compiler team regarding vtbl performance.
+
*** planning to add cvt (extend_to) and hadd(horizontal) equivalents.
+
** [Brian/Sirshak] Tuning Dual or Quad loop.
+
*** [Khem/Sachin] Updates on the changes seen after applying Brian's Patch. Status: No Updates.
+
** [Sachin] To create Jira Card, DPDK IOVA issue. (Created VPP-1377)
+
** [Honnappa] Module Ownership Discussion. Status: To come back to discussion next time. Community feedback to move to more use-case based approach.
+
 
* CSIT
 
* CSIT
** [Juraj] Parallelizing 'make test' (CSIT-1139). Status: Scheduling done; waiting for community review. Got some internal comments, which Juraj is working on. To try this patch on the Jenkins sandbox.
+
** VPP Performance Test
** [Sirshak] replying to cavium regarding Ubuntu 18.04/16.04 installation problem cavium-4. Done. Status: Following up with Cavium
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** [Khem] Performance Suite: 64B, 9000Jumbo. Jumbo Frames is failing.(khem to jira tkt: startup.conf, Frame size, NIC Card, Hugepages configuration).
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
** [Khem] Have a subset of tests running with tag.
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** [Juraj/Sirshak] VPP Device SoC one node topology constraints Status: Orchestration still under discussion.  
+
**** Add new IPSec NULL encryption & decryption test cases - Juraj
* fd.io lab
+
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
** [Juraj] mcbin access Status: Accessible; mcbin build failing, waiting for Brian for help.
+
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
** [Sirshak] cavium blades. Status: [Sirshak] Following up with cavium
+
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
* Documentation
+
***** VPP exits with the IPsec startup config, try startup config from Zach's email - Juraj
** Need to update the working ARM boards in the documentation section.
+
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
**** Release testing ongoing
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
**** IPsec SPD input/output case ongoing
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
**** Juraj may share the steps how CSIT handle new configuration changes
** Subscribe to: docs@lists.fd.io
+
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
* Action Items - Next Week
+
***** Juraj is investigating running those test cases with 2N-TX2 topology.
** [Khem] make verify on Taishan failure, Khem to create a Jira Tkt. Status:
+
***** Some container test cases failed on all platforms. - 4-core container case fails on x86 and Arm
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status:
+
**** Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
** [Sirshak] Open Jira tkt look at Florin's patch. Status: Not done to be done next week
+
 
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status:
+
** VPP Path
** [Lijian] To send the mail vpp-dev (VPP-1339) Status:
+
*** Voting and working fine.
** [Honnappa] To talk to DPDK Mellanox DPDK community. Status:
+
*** Community plans to drop the support for CentOS-8.
** [Sachin] To include Nitin suggestions and upstream.(ARMv8 Crypto changes) Status:
+
** VPP Device
** [Sirshak] To Open a LF Tkt regarding power cycler remote access for mcbin. Status:
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
** [Sirshak] To raise a Jira Tkt for msb changes. Status:  
+
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
'''7/30/2018'''
+
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from arm server to x86 server can easily update firmware, given script to reproduce, intel busy with other stuff, debugging in progress.
 +
****** Intel folks debugged, tried updating firmware/drivers, VF driver updated, old driver: i40evf, new driver: iavf. After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
 +
***** Tried to reproduce with another set of firmware etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated. After using iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 version by making iavf PMD as default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
 +
***** New issue: different error related to moving VF from host to container (not involving DPDK/VPP) - just started investigating
 +
****** /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly. - DaveW
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
* VPP
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** Juan encountered and is fixing an issue running SVE in a QEMU VM
 +
**** SVE validation on FPGA platform
 +
** VPP Prefetch
 +
*** Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
 +
**** Repeat the same test on Ampere server
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
***** Made some NEON changes, seeing some microbenchmark improvement
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang - maybe there is a CSIT case named iacldstbase
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface(slave) with VPP memif interface(master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Waiting for review comments on outbound side before upstream to VPP
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
**** Building Intel QAT driver on arm to test IPsec crypt - Zach
 +
 
 +
 
 +
'''06/08/2021'''
 
* Attendees
 
** Sirshak Das
+
** Lijian Zhang
 +
** Govindarajan Mohandoss
 
** Juraj Linkes
 
** Lijian
+
** Tianyu Li
** Andrew Pinski
+
** Jieqiang Wang
** Andy Wang
+
** Zachary Leaf
 
** Tina Tsou
 
** Nitin Saxena
+
* CSIT
** Khemendra
+
** VPP Performance Test
** Sachin Saxena
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
***** New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
***** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
***** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
***** VPP exits with the IPsec startup config, try startup config from Zach's email - Juraj
 +
***** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Work in progress.
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Some container test cases failed on all platforms.
 +
** VPP Path
 +
*** Voting and working fine.
 +
*** Community plans to drop the support for CentOS-8.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from arm server to x86 server can easily update firmware, given script to reproduce, intel busy with other stuff, debugging in progress.
 +
****** Intel folks debugged, tried updating firmware/drivers, VF driver updated, old driver: i40evf, new driver: iavf. After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
 +
***** Tried to reproduce with another set of firmware etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated. After using iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 version by making iavf PMD as default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for VPP 20.09 release.
 +
***** Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results.
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
** Shipment of new advanced server to the FD.io lab
 +
*** New servers are in shortage.
 +
* VPP
 +
** VPP default compiler on Arm platform
 +
*** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
**** Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
 +
***** No obvious performance improvement, keep the original default compiler
 +
** VPP SVE implementation - Lijian
 +
*** Vector length specific patch is ready
 +
*** SVE patch ready and upstreamed, under review - Lijian
 +
**** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
**** SVE validation on FPGA platform
 +
** VPP Prefetch
 +
*** Benchmark VPP using prefetch read always vs prefetch write always - Jieqiang
 +
*** Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
 +
** VPP Classifier - Lijian
 +
*** Investigating VPP classify function, use case, benchmarking - Lijian
 +
**** Start with simple use case
 +
**** VPP Classify basic inbound L3 src ip / prot case
 +
**** Benchmark VPP classifier on Arm/X86 platform
 +
*** investigate CSIT case
 +
**** No classify test case in CSIT. - Jieqiang
 +
** VPP memif - Tianyu
 +
*** Investigating VPP memif - Tianyu
 +
**** Benchmarking DPDK memif vs VPP memif
 +
***** Race condition occurred when connecting DPDK memif PMD interface(slave) with VPP memif interface(master)
 +
***** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
***** Patches have been upstreamed and waiting for review
 +
***** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
*** Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
 +
** VPP IPsec on Arm - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input/output nodes - Govind & Zach
 +
**** VPP uses linear search on SPD lookups
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Waiting for review comments on outbound side before upstream to VPP
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** VPP Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
  
* General Topic
+
 
* Action Items - Last Week
+
'''06/01/2021'''
** [Khem] make verify on Taishan failure Status: No Status
+
* Attendees
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status: [Andy] Still working internally, expecting to be done this week.
+
** Govindarajan Mohandoss
** [Sirshak] mlnx tx non vector version used for no-multiseg. Status: Jira Tkt VPP-1355
+
** Juraj Linkes
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status: Yet to be verified , if fixed.
+
** Zachary Leaf
** [Sirshak] look at Florin's patch. Status: No status, [Sirshak] Open Jira tkt.
+
** Tina Tsou
** [Tina] to get back on New ARMv8 Crypto. Status: Bob to schedule meeting with Cavium. To be tracked by Nitin, bob, tina.
+
** [Sirshak] Why Quad to Dual loop improves performance. Status: VPP-1356
+
** [Sirshak] To update VPP documentation with fd.io lab devices. Status: Not yet done. [Sirshak] to send email to yi and lijian.
+
** [Sirshak] VPP Vectorization Jira Tkt. Status: VPP-1357
+
** [Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status: Waiting for Nitin to help on changes for Internal DPDK. External DPDK support patch done. [Sachin : created VPP-1378] To create a Jira Tkt for Internal Tkt. [Honnappa] To comment on current gerrit item to get it moving.
+
** [Sirshak] replying to cavium regarding Ubuntu 18.04 installation problem cavium-4. Status: Sent
+
** [Sirshak] Get credentials from Brian for mcbin Status: Done
+
** [Sirshak] Send mail to LF for power cycler access for mcbin due to lack of IPMI interface Status: Done
+
* VPP
+
** [Sirshak] Vectorization
+
*** Almost done with shuffle.
+
*** Will get to working with msb.
+
*** AARCH32 compilation to be discussed.(Shuffle Vector Intrinsic AARCH64 ARMv8 specific)
+
*** There are no specific requirements on aarch32 at this time.
+
** [Lijian && Yi] To continue effort on analyzing IPv4 nos on available platforms with Intel and Mellanox NICs
+
*** [Sirshak] Why is Mellanox NIC not used in CSIT ? Performance Suite Designed for Intel and Cisco NICs.
+
** [Brian/Sirshak] Tuning Dual or Quad loop.
+
*** [Khem/Sachin] Updates on the changes seen after applying Brian's Patch. Status: No Updates.
+
** [Khem] Updates on Benchmarking on taishan. Status: Held up hardware.
+
** [Nitin] Any new findings from IPv4 VPP test case. Status: Working HW offloading.
+
** [Sachin] To create Jira Card, DPDK IOVA issue. (Created VPP-1377)
+
** [Lijian] ipcksum - No Degradation on Qualcomm.
+
** [Nitin] Create a Jira Tkt - ip cksum 128 bit vector support.
+
 
* CSIT
 
** [Juraj] Parallelizing the make test(CSIT-1139) Status: All VPP instances running on same core. Tried scheduling cores. Dynamically finding available cores. Sweetspot currently:  8 containers with 96 core.
+
** VPP Performance Test
** [Juraj] Test features listed by talking to dave.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** [Sirshak] replying to cavium regarding Ubuntu 18.04 installation problem cavium-4. Done. Status: To open a new LF tkt to ask for 16.04 installation.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
** [Juraj/Sirshak] VPP Device SoC one node topology constraints Status: Orchestration still under discussion.  
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** [Sirshak] to ask brian about mcbin credentials. Status: Done.
+
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 firstly(daily testing)
* fd.io lab
+
***** IPSec policy test cases are not running by default.
** [Juraj] mcbin access Status: Created LF tkt.
+
****** Some of the IPSec test cases(Policy tests) has been added to daily testing.
** [Sirshak] cavium blades. Status: [Sirshak] To create a LF tkt for Ubuntu 16.04 installation on cavium-4.  
+
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
* Documentation
+
****** Add new IPSec NULL encryption & decryption test cases - Juraj
** Need to update the working ARM boards in the documentation section.
+
******* New IPSec SPD test cases will not have NULL encrypt/decrypt config.
*** [Lijian/Yi] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).
+
******* IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
*** [Lijian/Yi] Add only fd.io lab devices.
+
****** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
*** [Sirshak] To send email with details.
+
****** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
****** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Work in progress.
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
** Subscribe to: docs@lists.fd.io
+
***** Juraj is investigating running those test cases with 2N-TX2 topology.
* Action Items - Next Week
+
***** Juraj will investigate cabling issue on Taishan performance test-bed - resolved.
** [Khem] make verify on Taishan failure Status:
+
***** Some container cases seem to fail on all platforms.
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status: [Andy]
+
** VPP Path
** [Sirshak] Open Jira tkt look at Florin's patch. Status:
+
*** Voting and working fine.
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status:
+
** VPP Device
** [Sirshak] to send email to yi and lijian for documentation.
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
** [Honnappa/Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status:
+
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
** [Sirshak] Send mail to LF for power cycler access for mcbin due to lack of IPMI interface Status:
+
**** VPP community is responding to this issue actively. - Juraj
** [Nitin] Create a Jira Tkt - ip cksum 128 bit vector support.
+
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
** [Sirshak] To create a LF tkt for Ubuntu 16.04 installation on cavium-4.  
+
***** Moved the NIC from arm server to x86 server can easily update firmware, given script to reproduce, intel busy with other stuff, debugging in progress.
 +
****** Intel folks debugged, tried updating firmware/drivers, VF driver updated, old driver: i40evf, new driver: iavf. After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
 +
***** Tried to reproduce with another set of firmware etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** DPDK uses i40evf PMD and it is old and scheduled to be deprecated. After using iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 version by making iavf PMD as default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for VPP 20.09 release.
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
*** Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
 +
** Vector length specific patch is ready
 +
** Investigating VPP classify function, use case, benchmarking - Lijian
 +
*** Start with simple use case
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE Remaining works - variable type convention - need some workaround for 256bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
*** Review memif patch
 +
**** Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
 +
*** VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case - No classify test case in CSIT. - Jieqiang
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface(slave) with VPP memif interface(master)
 +
**** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
** Work on IPsec input/output nodes - VPP uses linear search on SPD lookups - Govind & Zach
 +
*** SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
 +
**** https://gerrit.fd.io/r/c/vpp/+/31694
 +
**** IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
 +
***** Testing of flow cache functionality, including hash collisions and stale entry overwrites
 +
*** IPSec input node/decryption flow cache implemented in a separate patch - Zach
 +
**** Waiting for review comments on outbound side before upstream to VPP
 +
**** Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
 +
** Perfmon plugin enablement on Arm - Zach
 +
*** Implemented statistics from PMUv3 - done
 +
**** Patch upstream has dependency on kernel patch, waiting for this before upstream to VPP
 +
***** https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/
 +
*** Investigated CMN-600 stats in perfmon plugin
 +
**** Abandoned, CMN-600 only gives system level view, no useful stats at node level - linux perf tool can give the same result
  
'''7/24/2018'''
+
'''05/25/2021'''
 
* Attendees
 
** Sirshak Das
+
** Lijian Zhang
 +
** Govindarajan Mohandoss
 
** Juraj Linkes
 
** Lijian
+
** Jieqiang Wang
** Andrew Pinski
+
** Zachary Leaf
** Andy Wang
+
** Tianyu Li
 
** Tina Tsou
 
** Nitin Saxena
+
* CSIT
** Khemendra
+
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 firstly(daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases(Policy tests) has been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
******* New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
******* IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
 +
****** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
****** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
 +
****** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - will look into it
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate cabling issue on Taishan performance test-bed - resolved.
 +
***** Some container cases seem to fail on all platforms.
 +
** VPP Path
 +
*** Voting and working fine.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from arm server to x86 server can easily update firmware, given script to reproduce, intel busy with other stuff, debugging in progress.
 +
****** Intel folks debugged, tried updating firmware/drivers, VF driver updated, old driver: i40evf, new driver: iavf. After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
 +
***** Tried to reproduce with another set of firmware etc., but the issue still exists
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Tried the new version of DPDK, but it does not help
 +
****** Contact Intel devs for the possible advice
 +
****** Workaround may impact too much to all test cases
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
*** Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
 +
** Vector length specific patch is ready
 +
** Investigating VPP classify function, use case, benchmarking - Lijian
 +
*** Start with simple use case
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE Remaining works - variable type convention - need some workaround for 256bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
*** Review memif patch
 +
**** Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
 +
*** VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case - No classify test case in CSIT. - Jieqiang
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface(slave) with VPP memif interface(master)
 +
**** Performance improvement using loop unrolling for memif nodes - running CSIT perftest
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** IPsec input node optimization work in progress - Zach & Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** SPD prototype change, introducing flow cache with hash, has performance improvement, discussing with community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** IPSec unit test - make test new cases implementation
 +
***** Make test cases for IPSec policy mode - Done, included in Govind's patch, waiting for maintainer review - Zach
 +
****** Add/Remove/Add+Remove+Readd/Hash collisions/Multiple interfaces & rules
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
 +
** perfmon CMN-600 investigating - Zach
 +
*** VPP perfmon CMN-600 patch abandoned: stats are system level, not VPP node level; Linux perf can give the same result - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec flow cache outbound done, working on inbound side in separate patch - Zach
 +
*** IPSec decryption / input node - Zach
  
* General Topic
+
'''05/18/2021'''
** .
+
* Attendees
* Action Items - Last Week
+
** Lijian Zhang
** [Honnappa/Nitin] Aarch64 rpms not building - DPDK Neon Build Break. Status: Nitin: compiler does not support Arm NEON intrinsics. Honnappa working with compiler team: NEON intrinsics are supported, but the #defines are not present. Temporary solution available. Honnappa to follow up in DPDK.
+
** Govindarajan Mohandoss
** [Nitin/Sachin] Follow up: Add Virtual addressing support in IOVA dmap Status: patch committed by sachin merged.
+
** Juraj Linkes
** [Khem] make verify on Taishan failure Status: No updates.
+
** Zachary Leaf
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status: PO Approved. Should get going in few days.
+
** Tianyu Li
** [Sirshak] mlnx tx non vector version used for no-multiseg. Status: Waiting for x86 hotspots for confirmation and will then open a ticket.
+
** Tina Tsou
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status: Seems to be fixed but have not tried all the test cases to confirm.
+
** [Sirshak] look at Florin's patch. Status: Not done yet
+
** [Tina] to get back on New ARMv8 Crypto. Status: No updates. Close to complete but not upstreamed yet.
+
** [Sirshak] Why Quad to Dual loop improves performance. Status: Not saturating the number of outstanding prefetches. AI to raise a Jira bug.
+
** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000 (Check with Sachin). Status: Not done yet.
+
* VPP
+
** [Sirshak] vectorization patch effects
+
*** Made a few changes; no visible change.
+
*** Plan to read mlnx drivers DPDK to understand how neon intrinsics accelerate the vectors.
+
*** Add Jira Tkt.
+
** [Sirshak] Anomalies with mlx5 and VPP.
+
** [Honnappa <-> Nitin] Nitin okay with ARM contacting Customer Support for help on TX2 optimal settings.
+
** [Brian/Sirshak] Tuning Dual or Quad loop.
+
*** Visible change in A72.
+
*** Sirshak sent patch to Sachin and Khem to analyze if they see any improvement.
+
*** [Khem->Sirshak] Why moving from Quad to Dual improves performance.
+
** [Lijian] x86 numbers reported: 9.5 Mpps is not the same as reported by Nitin. Status: Could be because of the Broadwell and Skylake difference.
+
** [Khem] Updates on IPv4 Benchmarking on taishan. Status: CSIT performance bringup on fd.io lab. 18.04 gcc 7.3 trex. Workaround done. DUT VPP crashing. Plan for running L2 test cases.
+
** [Nitin] Any new findings from IPv4 VPP test case. Status: not available to discuss
+
**  
+
 
* CSIT
 
* CSIT
** [Juraj] Parallelizing the make test (CSIT-1139). Status: Sent for review; figuring out the optimal number of threads.
+
** VPP Performance Test
** [Sirshak] replying to cavium regarding Ubuntu 18.04 installation problem cavium-4.  
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** [Juraj/Sirshak] VPP Device SoC one node topology constraints Status: [Sirshak] Access to one of the three consoles.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
** [Sirshak] to ask brian about mcbin credentials.  
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** [Adarsh] VPP Path/Device Efforts: Nested Container, trying VM inside a container facing some issues. Status: Work on hold as adarsh moved out of the project.  
+
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 firstly(daily testing)
* fd.io lab
+
***** IPSec policy test cases are not running by default.
** [Sirshak] Installation of TG pending. Status: Done
+
****** Some of the IPSec test cases (Policy tests) have been added to daily testing.
** [Juraj] mcbin access. Status: Two of them can be accessed; the other one can't.
+
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
** [Sirshak] cavium blades connected need SFP and DACs. Status: Up and running, still need SFP and DACs
+
****** Add new IPSec NULL encryption & decryption test cases - Juraj
* Documentation
+
******* New IPSec SPD test cases will not have NULL encrypt/decrypt config.
** Need to update the working ARM boards in the documentation section.
+
******* IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
*** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).
+
****** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
*** [Sirshak] Add only fd.io lab devices.
+
****** New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
****** Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
** Subscribe to: docs@lists.fd.io
+
***** Juraj is investigating running those test cases with 2N-TX2 topology.
* Action Items - Next Week
+
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
** [Khem] make verify on Taishan failure Status:
+
** VPP Path
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status:
+
*** Voting and working fine.
** [Sirshak] mlnx tx non vector version used for no-multiseg. Status: Jira Tkt VPP-1355
+
** VPP Device
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status:
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
** [Sirshak] look at Florin's patch. Status:
+
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
** [Tina] to get back on New ARMv8 Crypto. Status:
+
**** VPP community is responding to this issue actively. - Juraj
** [Sirshak] Why Quad to Dual loop improves performance. Status: VPP-1356
+
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
** [Sirshak] To update VPP documentation with fd.io lab devices. Status:
+
***** Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; script to reproduce provided; Intel busy with other work; debugging in progress.
** [Sirshak] VPP Vectorization Jira Tkt. Status: VPP-1357
+
***** Tried to reproduce with another set of firmware, etc., but the issue still exists
** [Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Last Status: Waiting for Nitin to help on changes for Internal DPDK. Current Status:
+
***** https://doc.dpdk.org/guides/nics/i40e.html
** [Sirshak] replying to cavium regarding Ubuntu 18.04 installation problem cavium-4. Status: Sent
+
***** Internal ticket has been raised
** [Sirshak] Get credentials from Brian for mcbin Status: Done
+
****** Tried the new version of DPDK, but it did not help
** [Sirshak] Send mail to LF for power cycler access for mcbin due to lack of IPMI interface Status: Done
+
****** Contact Intel devs for the possible advice
'''7/17/2018'''
+
****** Workaround may impact all test cases too much
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Lab move started stage 2; moved part of the servers to make sure the CI service is not down.
 +
**** Lab move is done, some issues with taishan testbed
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
*** Plan to benchmark gcc-10 vs clang-12
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE Remaining works - variable type convention - need some workaround for 256bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
*** Review memif patch
 +
**** Functional bug related to C11 atomics has been resolved by VPP maintainer.
 +
*** VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case. - Jieqiang
 +
*** Make test cases for IPSec policy mode - Zach
 +
**** Add/Remove/Add+Remove+Readd/Hash collisions/Multiple interfaces & rules - Add more test cases
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Performance improvement using loop unrolling for memif nodes
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** IPsec input node optimization work in progress - Zach & Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with the community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** IPSec unit test - make test new cases implementation
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
 +
** perfmon CMN-600 investigating - Zach
 +
*** VPP perfmon CMN-600 patch abandoned: it is system level, not VPP node level, and Linux perf can give the same result - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec flow cache outbound done, working on inbound side in separate patch - Zach
 +
*** IPSec decryption / input node - Zach
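The SPD flow-cache idea noted above (a hash lookup placed in front of the linear SPD search) can be sketched roughly as follows. This is a minimal illustration only, not code from the VPP patch: every name here (<code>Policy</code>, <code>Spd</code>, <code>lookup</code>, and so on) is hypothetical, and it assumes policies match on source/destination address ranges with first-match-wins ordering.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical packet key: (src_ip, dst_ip, proto, src_port, dst_port) as integers.
FiveTuple = Tuple[int, int, int, int, int]


@dataclass(frozen=True)
class Policy:
    """Illustrative SPD rule matching on inclusive address ranges."""
    name: str
    src_range: Tuple[int, int]
    dst_range: Tuple[int, int]
    action: str

    def matches(self, pkt: FiveTuple) -> bool:
        src, dst = pkt[0], pkt[1]
        return (self.src_range[0] <= src <= self.src_range[1]
                and self.dst_range[0] <= dst <= self.dst_range[1])


class Spd:
    """Ordered policy list with a hash-based flow cache in front of it."""

    def __init__(self, policies: List[Policy]) -> None:
        self.policies: List[Policy] = list(policies)
        self.flow_cache: Dict[FiveTuple, Policy] = {}
        self.linear_lookups = 0  # counts slow-path searches, for illustration

    def linear_lookup(self, pkt: FiveTuple) -> Optional[Policy]:
        # Slow path: walk the rules in order, first match wins.
        self.linear_lookups += 1
        for policy in self.policies:
            if policy.matches(pkt):
                return policy
        return None

    def lookup(self, pkt: FiveTuple) -> Optional[Policy]:
        # Fast path: exact-match hash lookup on the 5-tuple.
        cached = self.flow_cache.get(pkt)
        if cached is not None:
            return cached
        policy = self.linear_lookup(pkt)
        if policy is not None:
            self.flow_cache[pkt] = policy
        return policy

    def add_policy(self, policy: Policy) -> None:
        # Conservative choice: any rule change invalidates the whole cache.
        self.policies.append(policy)
        self.flow_cache.clear()
```

In this sketch, repeated packets of the same flow skip the linear search entirely; the cost is that the cache must be invalidated whenever the rule set changes.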
 +
 
 +
'''05/11/2021'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Lijian Zhang
** Sachin Saxena
+
** Govindarajan Mohandoss
** Khemendra Kumar
+
 
** Juraj Linkes
 
** Juraj Linkes
** Lijian
+
** Zachary Leaf
 +
** Tianyu Li
 +
** Tina Tsou
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 firstly(daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases (Policy tests) have been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
******* New IPSec SPD test cases will not have NULL encrypt/decrypt config.
 +
******* IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
 +
****** CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** Voting and working fine.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; script to reproduce provided; Intel busy with other work; debugging in progress.
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Tried the new version of DPDK, but it did not help
 +
****** Contact Intel devs for the possible advice
 +
****** Workaround may impact all test cases too much
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Lab move started stage 2; moved part of the servers to make sure the CI service is not down.
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
**** Almost all except performance testbed, which will be moved this week, everything is smooth so far.
 +
**** ubuntu 1804 -> 2004
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE Remaining works - variable type convention - need some workaround for 256bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** SVE patch sent to Nitin, Nitin will review the patch when back to work.
 +
*** Review memif patch
 +
*** VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case.
 +
*** Make test cases for IPSec policy mode - Jieqiang
 +
** Review memif test cases/memif cases
 +
** Finished coding of SVE string library, bihash key compare functions
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** SVE unit testing based on test_vec, fix test_vec issues
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do readout presentation with extended people - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Performance improvement using loop unrolling for memif nodes
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** IPsec input node optimization work in progress - Zach & Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with the community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** IPSec unit test - make test new cases implementation
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
 +
** Using startup parameter to enable the IPsec flow cache feature
 +
** Adding Python test case to test IPSec node behavior - Jieqiang
 +
** perfmon CMN-600 investigating - Zach
 +
*** VPP perfmon CMN-600 patch abandoned: it is system level, not VPP node level, and Linux perf can give the same result - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec flow cache outbound done, working on inbound side in separate patch - Zach
 +
*** IPSec decryption / input node - Zach
  
* General Topic
+
'''04/27/2021'''
** Austin Folks leaving early meeting. If needs be somebody can takeover after 1 hour (9 am CT).
+
* Attendees
* Action Items - Last Week
+
** Govindarajan Mohandoss
** [Honnappa/Nitin] Aarch64 rpms not building - DPDK Neon Build Break
+
** Lijian Zhang
** [Nitin/Sachin]Follow up: Add Virtual addressing support in IOVA dmap Status: No updates.
+
** Juraj Linkes
** [Khem] make test on Taishan timings: Status: Done. To look at why make verify.
+
** Tianyu Li
** [Sirshak] cavium USB-Ethernet adapters to Quantta Switch. Status: Andy waiting for cables to reach him.
+
** Jieqiang Wang
** [Sirshak] mlnx tx non vector version used for no-multiseg. Jira Tkt Status: Not yet done. Will do this week.
+
** Tina Tsou
** [Sirshak] DPDK 18.05 mlnx bug. Status: Sirshak to open Jira Tkt - VPP-1339
+
** [Sirshak] look at Florin's patch. Status: Not yet done.
+
** [Tina] to get back on New ARMv8 Crypto.
+
* VPP
+
** [Sirshak] vectorization patch effects
+
*** Made a few changes; no visible change.
+
*** Plan to read mlnx drivers DPDK to understand how neon intrinsics accelerate the vectors.
+
** [Brian/Sirshak] Tuning Dual or Quad loop.
+
*** Visible change in A72.
+
*** None in Qualcomm because of pfrm not being hotspot.
+
*** [Khem->Sirshak] Why moving from Quad to Dual improves performance.
+
*** Community-wide investigation needed.
+
** [Lijian] x86 numbers reported: 9.5 Mpps is not the same as reported by Nitin. Status: Investigation.
+
** [Khem] Updates on IPv4 Benchmarking on taishan. Status: Stuck with pktgen.
+
** [Nitin] Any new findings from IPv4 VPP test case. Status:
+
** [Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status: Waiting for Nitin to help on changes for Internal DPDK.
+
**  
+
 
* CSIT
 
* CSIT
** [Juraj] Parallelizing the make test(CSIT-1139) Status: Almost done, need to work on polishing.
+
** VPP Performance Test
** [Juraj/Sirshak] SoC devices as non voting VPP device targets. Status: [Sirshak] pending on TG credentials.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** [Adarsh] VPP Path/Device Efforts: Nested Container, trying VM inside a container facing some issues. Status: No update, Adarsh replaced on the project; postponed
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
* fd.io lab
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** [Sirshak] Installation of TG pending. Status: No update from LF - Anton
+
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 firstly(daily testing)
** [Sirshak] cavium blades connected need SFP and DACs. Status: Up and running, still need SFP and DACs
+
***** IPSec policy test cases are not running by default.
* Documentation
+
****** Some of the IPSec test cases (Policy tests) have been added to daily testing.
** Need to update the working ARM boards in the documentation section.
+
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
*** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).
+
****** Add new IPSec NULL encryption & decryption test cases - Juraj
*** [ARM community] Waiting for feedback from Khem and other companies
+
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
***** Juraj is investigating running those test cases with 2N-TX2 topology.
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
** Subscribe to: docs@lists.fd.io
+
** VPP Path
* Action Items - Next Week
+
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
** [Honnappa/Nitin] Aarch64 rpms not building - DPDK Neon Build Break Status:
+
***** https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
** [Nitin/Sachin] Follow up: Add Virtual addressing support in IOVA dmap Status:
+
****** Its voting right is enabled on Arm.
** [Khem] make test on Taishan failure Status:
+
** VPP Device
** [Sirshak] cavium(4,5,6,7) USB-Ethernet adapters to Quantta Switch. Status:
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
** [Sirshak] mlnx tx non vector version used for no-multiseg. Status:
+
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
** [Sirshak] DPDK 18.05 mlnx bug(VPP-1339). Status: 
+
**** VPP community is responding to this issue actively. - Juraj
** [Sirshak] look at Florin's patch. Status:
+
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
** [Tina] to get back on New ARMv8 Crypto.
+
***** https://doc.dpdk.org/guides/nics/i40e.html
** [Sirshak] Why Quad to Dual loop improves performance.
+
***** Internal ticket has been raised
** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).
+
****** Tried the new version of DPDK, but it did not help
 +
****** Contact Intel devs for the possible advice
 +
****** Workaround may impact all test cases too much
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Lab move started stage 2; moved part of the servers to make sure the CI service is not down.
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE Remaining works - variable type convention - need some workaround for 256bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
** SVE patch ready and upstreamed, under review - Lijian
 +
*** Make test cases for IPSec policy mode - Jieqiang
 +
** Review memif test cases/memif cases
 +
** Finished coding of SVE string library, bihash key compare functions
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** SVE unit testing based on test_vec, fix test_vec issues
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do readout presentation with extended people - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Performance improvement using loop unrolling for memif nodes
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** IPsec input node optimization work in progress - Zach & Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** SPD prototype change introducing a hash-based flow cache shows a performance improvement; discussing with the community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** IPSec unit test - make test new cases implementation
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
 +
** Using startup parameter to enable the IPsec flow cache feature
 +
** Adding Python test case to test IPSec node behavior - Jieqiang
 +
** perfmon CMN-600 investigating - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec decryption / input node - Zach
  
'''7/10/2018'''
+
'''04/13/2021'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Govindarajan Mohandoss
** Sachin Saxena
+
** Lijian Zhang
** Khemendra Kumar
+
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 
** Tina Tsou
 
** Tina Tsou
** Nitin Saxena
+
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 firstly(daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
****** Some of the IPSec test cases (Policy tests) have been added to daily testing.
 +
****** Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
 +
****** Add new IPSec NULL encryption & decryption test cases - Juraj
 +
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
 +
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
 +
***** Some issues occurred during the upgrade.
 +
***** Patch to resolve the building error of DPDK on 3n-tsh testbed.
 +
***** Root cause is the change of build system of DPDK on 3n-tsh related to SOC id detection.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
***** https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Its voting right is enabled on Arm.
 +
** VPP Device
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Tried the new version of DPDK, but it did not help
 +
****** Contact Intel devs for the possible advice
 +
**** Will try to reproduce the issue with x86 servers.
 +
**** This issue is common to all platforms(Arm & Intel)
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Jieqiang helped to verify most fixed size vector wrappers - unit test code
 +
** SVE Remaining works - variable type convention - need some workaround for 256bit convention
 +
** VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
 +
*** Make test cases for IPSec policy mode - Jieqiang
 +
** Review memif test cases/memif cases
 +
** Finished coding of SVE string library, bihash key compare functions
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** SVE unit testing based on test_vec, fix test_vec issues
 +
** Test template update - Jieqiang
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do readout presentation with extended people - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
 +
**** Prepare the memif readout - Tianyu
 +
*** Try to apply C11 weak memory model on VPP memif - Tianyu
 +
**** Use 'show runtime'/perfmon to see cycle improvement
 +
**** Run memif unit test
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
 +
** Using startup parameter to enable the IPsec flow cache feature
 +
** Adding Python test case to test IPSec node behavior - Jieqiang
 +
** perfmon CMN-600 investigating - Zach
 +
*** Plan to upstream perfmon plugin - resolving review comments - Zach
 +
*** IPSec decryption - Zach
 +
 
 +
 
 +
'''03/30/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 
** Juraj Linkes
 
** Juraj Linkes
** Brian Brooks
+
** Tianyu Li
** Lijian
+
** Jieqiang Wang
** Tom Herbert
+
** Tina Tsou
* General Topic
+
** Austin folks leaving the meeting early. If need be, somebody can take over after 1 hour (9 am CT).
+
** [Tom] Aarch64 rpms not building - anyone can help?
+
* Action Items - Last Week
+
** [Nitin/Sachin]Follow up: Add Virtual addressing support in IOVA dmap Status: No updates.
+
** [Nitin] make test on Thunderx2 timings Status: Send error report of make test.
+
** [Khem] make test on Taishan timings: Status: 22 mins. Try make verify.
+
** [Sirshak] cavium USB-Ethernet adapters to Quantta Switch. Status: Done for cavium 1,2,3. Need cables for 4,5,6,7. Cables ordered
+
** [Khem] to update on nested VMs on performance test cases. Status: No updates. Could be a naming problem.
+
** [Sirshak] Q to Maciek: buildroot image with VPP device(within container)? Status: No updates. Check with Brian to see if buildroot works on arm.
+
** [Sirshak] mlnx tx non vector version used for no-multiseg. Reason ? Status: No updates. Sirshak to open Jira Tkt.
+
** [Sirshak] DPDK 18.05 mlnx bug. Status: Asked in the community need to look at backtrace as pointed by damjan. Sirshak to open Jira Tkt.
+
* VPP
+
** [Sirshak] vectorization patch effects. https://gerrit.fd.io/r/#/c/13229/
+
*** I see around 15% improvement on Qualcomm with Mellanox from some patch other than the vectorization patch; need to find which one.
+
*** Do others see a similar improvement in the past 2 weeks?
+
*** [Sirshak] look at Florin's patch.
+
** [Lijian] x86 nos, checking with Nitin for sync on configuration. Skylake Single Core Single Thread: IPv4 forwarding 64B 15 Mpps.
+
** [Khem] Updates on IPv4 Benchmarking on taishan. Status: No Updates
+
** [Nitin] Any known comparison between AVF nos on aarch64 and DPDK nos? On Intel it's ~25% and ARM ~20%.
+
** [Nitin/Sachin] Upstreaming ARMv8 Crypto Changes with external DPDK. Status: Internal DPDK changes effort. Wait for status on New ARMv8 Crypto.
+
** [Sirshak->Nitin] Thunderx2(high core count)coremask for DPDK config in VPP startup conf.
+
** [Tina] to get back on New ARMv8 Crypto.
+
 
* CSIT
 
* CSIT
** [Juraj] Parallelizing the make test(CSIT-1139) Discussion: On Plan and if anybody wants to join hands.
+
** VPP Performance Test
** [Juraj/Sirshak] SoC devices as non voting VPP device targets. Discussion: mcbin console access will be available once TG credentials are available.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** [Adarsh] VPP Path/Device Efforts: Nested Container, trying VM inside a container facing some issues.  
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
* fd.io lab
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** [Sirshak] Taishan connected need to verify once we get TG credentials. [Khem] Checked from Taishan side ports connected to TG are up.
+
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 firstly(daily testing)
** [Sirshak] mcbin connected need to verify once we get TG credentials.
+
***** IPSec policy test cases are not running by default.
** [Sirshak] cavium blades connected need to switch the network adapters before using it for CI.
+
****** 2 node IPsec SPD policy test case patch is ready, starting with 1 and 1k tunnels. (40, 400 tunnels in separate patch)
* Documentation
+
****** https://gerrit.fd.io/r/c/csit/+/31605
** Need to update the working ARM boards in the documentation section.
+
****** Fixed the wrong CLI commands, but the configuration still has problems.
*** [Sirshak] To update Qualcomm Centriq, mcbin, Thunderx1, Thunderx2, Taishan 2280, OD 1000 and OD 3000(Check with Sachin).
+
****** Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
**** 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
***** Some issues occurred during the upgrade.
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
***** Patch to resolve the building error of DPDK on arm testbed.(taishan dpdk cases still have issues, investigating)
** Subscribe to: docs@lists.fd.io
+
***** Juraj is investigating running those test cases with 2N-TX2 topology.
* Action Items - Next Week
+
***** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
** [Honnappa/Nitin] Aarch64 rpms not building - DPDK Neon Build Break
+
** VPP Path
** [Nitin/Sachin]Follow up: Add Virtual addressing support in IOVA dmap Status: No updates.
+
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
** [Khem] make test on Taishan timings: Status:
+
***** https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
** [Sirshak] cavium USB-Ethernet adapters to Quantta Switch.
+
****** Its voting right is enabled on Arm.
** [Sirshak] mlnx tx non vector version used for no-multiseg. Jira Tkt Status:
+
** VPP Device
** [Sirshak] DPDK 18.05 mlnx bug. Status: Sirshak to open Jira Tkt.
+
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
** [Sirshak] look at Florin's patch.
+
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
***** Internal ticket has been raised
 +
****** Tried the new version of DPDK, but it did not help
 +
****** Contact Intel devs for the possible advice
 +
**** Will try to reproduce the issue with x86 servers.
 +
*** "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0].  The root cause is being addressed and should be fixed shortly.
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
** Review memif test cases/memif cases
 +
** Finished coding of SVE string library, bihash key compare functions
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** SVE unit testing based on test_vec, fix test_vec issues
 +
** Test template update
 +
** SVE unit test in qemu-vm, met compiling issue, investigating
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do readout presentation with extended people - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
**** Review the confluence page and prepare the memif readout - Lijian & Tianyu
 +
**** Race condition occurred when connecting DPDK memif PMD interface(slave) with VPP memif interface(master)
 +
**** Prepare the memif readout - Tianyu
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
 +
***** https://gerrit.fd.io/r/c/vpp/+/31694
 +
***** Review the patch and grasp the basics about IPSec - Lijian
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
 +
** Using startup parameter to enable the IPsec flow cache feature
 +
** Discuss with jieqiang adding python test case to test ipsec node behavior
 +
** perfmon CMN-600 investigating - Zach
  
'''7/3/2018'''
+
'''03/16/2021'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Govindarajan Mohandoss
** Sachin Saxena
+
** Lijian Zhang
** Khemendra Kumar
+
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 
** Tina Tsou
 
** Tina Tsou
** Nitin Saxena
+
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 firstly(daily testing)
 +
***** IPSec policy test cases are not running by default.
 +
***** Juraj is investigating running those test cases with 2N-TX2 topology.
 +
*** Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
 +
** VPP Path
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Its voting right is enabled on Arm.
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** The issue could be reproduced on Arm servers with the NIC with latest firmware version.
 +
***** https://doc.dpdk.org/guides/nics/i40e.html
 +
**** Will try to reproduce the issue with x86 servers.
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Vector length specific patch is ready
 +
*** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
*** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
***** Will do readout presentation with extended people - Tianyu
 +
** Investigating VPP memif - Tianyu
 +
*** Benchmarking DPDK memif vs VPP memif
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
 +
**** Perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''03/09/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 
** Juraj Linkes
 
** Juraj Linkes
** Brian Brooks
+
** Tianyu Li
** Ed Kern
+
** Jieqiang Wang
** Song
+
** Tina Tsou
** Lijian
+
* CSIT
* General Topic
+
** VPP Performance Test
** Architecture Section in Documentation.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
* Action Items - Last Week
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
** Khem: Ipv4 layer investigation. To Share some findings next week on parameters for CSIT Status: Done. If yes cover in VPP section.
+
**** CSIT official release 21.01 is available
** Nitin Follow up: Sachin: Upstreaming ARMv8 Crypto Changes with external DPDK. Status: Nitin to provide help on using Internal DPDK  
+
***** https://docs.fd.io/csit/rls2101/report/
** Nitin Follow up: Add Virtual addressing support in IOVA dmap Status: Waiting for response from Damjan
+
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
** Nitin make test on Thunderx2 timings :
+
******* 20.09 vs 21.01 show run vector per call drop from 256 to 200 - need to check dpdk version changes
** Khem: status on make test failures: CSIT-1148 Status: Fixed.
+
******* Perf drop only observed for VM cases
** Khem: make test on Taishan timings: Status: No status
+
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
** Sirshak: cavium USB-Ethernet adapters to Quantta Switch. Status: Still working with LF guys
+
****** Check the number for CSIT 2101 release 
** Khem to update on nested VMs on performance test cases. Status: No Updates
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** Sirshak & Khem: Documentation review. Status: Done. continuous effort.  
+
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 firstly(daily testing)
** Sirshak: Q to Maciek: buildroot image with VPP device(within container) ? Status: No updates.
+
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
***** Will investigate IPSec test cases on 2n-tx2 - Juraj
 +
****** Maintainer confirm that it is feasible
 +
******* Patch merged, https://gerrit.fd.io/r/c/csit/+/31309
 +
******* Patch created for daily running https://gerrit.fd.io/r/c/csit/+/31478
 +
******* crypto tests will be enabled on daily and report Jenkins job
 +
******* IPv6 / policy mode crypto test cases to be investigated and added
 +
****** Add memif test case to 2n-tx2 once the release testing is done.
 +
******* Take ~ 1 or 1.5 hour for one round of memif testing.
 +
**** release testing for 2n-tx2
 +
***** Performance data added to daily trending page
 +
****** https://docs.fd.io/csit/master/trending/introduction/dashboard.html#n-tx2
 +
****** Test cases include L2/IPv4/IPv6/Classifier/ACL
 +
** VPP Path
 +
*** CentOS-7 will be enabled with master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will not be supported.
 +
**** CentOS-8 will be supported by the end of this year by Redhat.
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Job is enabled https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Running per patch and voting right is enabled
 +
***** Maintainer ask for more servers for sake of redundancy
 +
****** Sync with Dave for ARM server requirement
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
 +
**** Dependency on maintainers to fix this issue
 +
**** Intel will ship a new NIC with latest firmware
 +
***** Shipment takes a long time empirically
 +
****** NIC has been shipped to vexxhost, wait for NIC arrival.
 +
***** Try to reproduce the issue on this NIC on Arm platform
 +
***** Updating firmware on the current NIC is risky
 +
**** Voting rights will be enabled once this issue is fixed
 +
****** Maintainer raised the ticket to get intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable by CSIT maintainers
 +
*** Arm is required to present Arm achievement and plan to TSC.
 +
**** Govind will prepare the slides
 +
**** Will show Arm roadmap in the next TSC meeting
 +
**** Proposals have been sent to VPP maintainer on verifying SVE/SVE2
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
 
 
* VPP
 
* VPP
** Sirshak: Investigate mlnx_burst_rx_vec used in case of no multi-seg but plain mlnx_tx_burst used. Movement of hotspot seen for rx. Probable reason SRIOV(VFs) used. Root cause yet to be found.
+
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
** Sirshak: VPP DPDK 18.05 change done by damjan. mlnx drivers on Qualcomm are a problem. Urge Everyone to test respective sanity in their setup. set interface state <InterfaceName> up - stuck
+
** Multi-arch support - Lijian
** Khem: Discuss various parameters in CSIT for IPv4 Testing.
+
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
** Sirshak: TCP termination performance nos ?
+
** Investigate VPP Intel AVF PMD driver - Lijian
** Sirshak: vectorization patch effects. https://gerrit.fd.io/r/#/c/13229/
+
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to split into two parts required by VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
**** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
**** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
 +
***** Remove interrupts on altra but no performance improvement seen
 +
***** instruction cache misses are higher on altra than N1
 +
*** IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottle-neck.
 +
** Investigate VPP test cases in container
 +
*** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
**** Record the benchmarking results of VPP CNF 3 test cases in excel template
 +
** VPP compiling error on CentOS 7 - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/31421
 +
**** CentOS 7 build issue has been fixed
 +
*** Developing NEON wrapper to SVE 128/256bit on qemu
 +
 
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
 +
**** perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''02/23/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 
* CSIT
 
* CSIT
** Juraj Make test bottlenecks: Updates: One plausible solution available. Parallelizing the make test(CSIT-1139)
+
** VPP Performance Test
** Juraj to start looking at SoC devices as non voting VPP device targets.  
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Adarsh: openssl issues ? Issue still persists.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
** Adarsh: VPP Path Tasks.
+
**** CSIT official release 21.01 is available
** Tkt updates:
+
***** https://docs.fd.io/csit/rls2101/report/
*** CSIT-1021: Handle Scapy pcap limit Khem(brief on patch, updates): Status: To check with CSIT team for jenkins build failure. Status: No Updates. Not Priority.
+
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
* fd.io lab
+
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
** Sirshak: Update from LF guys
+
****** Check the number for CSIT 2101 release 
* Documentation
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 firstly(daily testing)
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
***** Tests are running fine
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
** Subscribe to: docs@lists.fd.io
+
****** Suitable time to run release testing on 2n-tx2 testbed.
* Action Items - Next Week
+
****** Will investigate IPSec test cases on 2n-tx2 - Juraj
** [Nitin/Sachin]Follow up: Add Virtual addressing support in IOVA dmap Status:
+
******* Maintainer confirm that it is feasible
** [Nitin] make test on Thunderx2 timings :
+
******* Patch created, https://gerrit.fd.io/r/c/csit/+/31309
** [Khem] make test on Taishan timings: Status:
+
******* crypto tests will be enabled on daily and report Jenkins job
** [Sirshak] cavium USB-Ethernet adapters to Quantta Switch. Status:
+
****** Add memif test case to 2n-tx2 once the release testing is done.
** [Khem] to update on nested VMs on performance test cases. Status:
+
******* Take ~ 1 or 1.5 hour for one round of memif testing.
** [Sirshak] Q to Maciek: buildroot image with VPP device(within container)? Status:
+
**** release testing for 2n-tx2
** [Sirshak] mlnx tx non vector version used for no-multiseg. Reason ? Status:
+
***** Performance data added to daily trending page
** [Sirshak] DPDK 18.05 mlnx bug. Status:
+
****** https://docs.fd.io/csit/master/trending/introduction/dashboard.html#n-tx2
 +
****** Test cases include L2/IPv4/IPv6/Classifier/ACL
 +
****** Release testing done for 2n-tx2, ongoing for 3n-tsh(due to next week)
 +
****** Release report plan to be published on 10th Feb
 +
** VPP Path
 +
*** CentOS-7 will be enabled with master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Job is enabled https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Running per patch and voting right is enabled
 +
***** Maintainer ask for more servers for sake of redundancy
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
 +
**** Dependency on maintainers to fix this issue
 +
**** Intel will ship a new NIC with latest firmware
 +
***** Shipment takes a long time empirically
 +
***** Try to reproduce the issue on this NIC on Arm platform
 +
***** Updating firmware on the current NIC is risky
 +
**** Voting rights will be enabled once this issue is fixed
 +
****** Maintainer raised the ticket to get intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable by CSIT maintainers
 +
*** Arm is required to present Arm achievement and plan to TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker - Lijian
 +
**** Latest VPP binary crashes on the QEMU docker
 +
***** System call fails inside QEMU docker when running VPP
 +
**** Verify SVE/SVE2 features inside ARM QEMU VM
 +
**** VPP maintainers want real hardware to verify SVE code
 +
***** This solution will be abandoned.
 +
**** 'make test' execution is slow
 +
**** Sync with DPDK team/VPP community to decide the solution
 +
**** Proposals have been sent to VPP maintainer on verifying SVE/SVE2
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
  
'''6/26/2018'''
+
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to split into two parts required by VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
**** 128 and 256 fixed size vector wrappers are ready, needs verification
 +
**** Verify SVE vector length specific wrappers - Jieqiang
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
**** Extend vector length agnostic opportunities
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No positive effect from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
 +
***** Remove interrupts on altra but no performance improvement seen
 +
***** instruction cache misses are higher on altra than N1
 +
*** IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottle-neck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP memif test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
**** 3 use cases are investigated.
 +
**** Will explore the memif logic and share the progress.
 +
**** Will share the link on details about how to run VPP in container.
 +
*** Investigate VPP agent usage - Tianyu
 +
**** Focus more on data-plane performance benchmarking and optimization - Tianyu
 +
** VPP compiling error on CentOS 7 - Jieqiang
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node and VPP uses linear search on SPD lookup.
 +
**** perfmon plugin enablement on Arm - Zach
 +
***** patch upstream has dependency on kernel patch
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''02/09/2021'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Govindarajan Mohandoss
** Sachin Saxena
+
** Lijian Zhang
** Khemendra Kumar
+
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 
** Tina Tsou
 
** Tina Tsou
** Nitin Saxena
+
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
 +
**** CSIT official release 20.09 is available
 +
***** https://docs.fd.io/csit/rls2009/report/
 +
**** CSIT official release 21.01 is ongoing
 +
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
 +
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
 +
****** Check the number for CSIT 2101 release 
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
***** Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 firstly(daily testing)
 +
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
****** Will investigate IPSec test cases on 2n-tx2 - Juraj
 +
******* Maintainer confirm that it is feasible
 +
****** Add memif test case to 2n-tx2 once the release testing is done.
 +
**** release testing for 2n-tx2
 +
***** Performance data added to daily trending page
 +
****** https://docs.fd.io/csit/master/trending/introduction/dashboard.html#n-tx2
 +
****** Test cases include L2/IPv4/IPv6/Classifier/ACL
 +
****** Release testing is done for 2n-tx2 and ongoing for 3n-tsh (due next week)
 +
****** Release report is planned to be published on 10th Feb
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Job is enabled https://jenkins.fd.io/job/vpp-verify-master-centos8-aarch64/
 +
****** Running per patch and voting right is enabled
 +
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
****** Will verify the image uploaded by Dave if it is ready.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
**** Jenkins job to verify runs fine but slow
 +
***** https://gerrit.fd.io/r/c/ci-management/+/31083
 +
***** Maintainer asked for more servers for the sake of redundancy
 +
**** 'make test' failure on Ubuntu 20.04 AArch64
 +
***** Dave has sent an email with the details
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
 +
**** Dependency on maintainers to fix this issue
 +
**** Intel will ship a new NIC with latest firmware
 +
***** Shipment takes a long time, based on past experience
 +
***** Try to reproduce the issue on this NIC on Arm platform
 +
***** Updating firmware on the current NIC is risky
 +
**** Voting rights will be enabled once this issue is fixed
 +
****** Maintainer raised the ticket to get Intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** This is acceptable to CSIT maintainers
 +
*** Arm is required to present its achievements and plans to the TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker - Lijian
 +
**** Latest VPP binary crashes on the QEMU docker
 +
***** System call fails inside QEMU docker when running VPP
 +
**** Verify SVE/SVE2 features inside ARM QEMU VM
 +
**** 'make test' execution is slow
 +
**** Sync with DPDK team/VPP community to decide the solution
 +
**** Proposals have been sent to VPP maintainer on verifying SVE/SVE2
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** Two SVE/SVE2 proposals are upstreamed; will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No positive result from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with Ampere or N1 team on how to enable CMN-600 counters for Ampere Altra
 +
***** Removed interrupts on Altra, but no performance improvement seen
 +
***** Instruction cache misses are higher on Altra than on N1
 +
*** IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP memif test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
**** 3 use cases are investigated.
 +
**** Will explore the memif logic and share the progress.
 +
**** Will share the link on details about how to run VPP in container.
 +
*** Investigate VPP agent usage - Tianyu
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''02/02/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 
** Juraj Linkes
 
** Brian Brooks
+
** Tianyu Li
** Ed Kern
+
** Jieqiang Wang
** Song
+
** Tina Tsou
* General Topic
+
* CSIT
** Introduce Song, Yi and Lijian
+
** VPP Performance Test
* Action Items - Last Week
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Adarsh: Updates on Jira tkt for openssl issues. Updates: none
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
** Adarsh: Update on topology for Kubernetes Functional Tests. Updates: Kubernetes, Docker
+
**** CSIT official release 20.09 is available
** Sirshak Tuning Section - Not Done
+
***** https://docs.fd.io/csit/rls2009/report/
** Khem: Ipv4 layer investigation. CSIT: IPv4. To Share some findings next week on parameters for CSIT
+
**** CSIT official release 21.01 is ongoing
** Nitin: Send old dpdk input node patch - Done
+
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
** Sachin: Upstreaming ARMv8 Crypto Changes with external DPDK. - Nitin to send mail
+
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
** Add Virtual addressing support in IOVA dmamap: Updates - nitin to send mail
+
****** Check the number for CSIT 2101 release 
** Nitin: Measure make and make test on ThunderX2
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
** Khem: measure make and make test on Taishan (Juraj tested it; it failed: https://jira.fd.io/browse/CSIT-1148)
+
***** Start with basic L2/L3/ACL/classifier tests and use the DPDK PMD on 2n-thx2 first (daily testing)
** Sirshak: try to switch eth-usb for regular eth ports on ThunderXs - Created a LF tkt have follow up meeting today.
+
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
****** Will investigate IPSec test cases on 2n-tx2 - Juraj
 +
******* Maintainer confirmed that it is feasible
 +
****** Add memif test case to 2n-tx2 once the release testing is done.
 +
**** release testing for 2n-tx2
 +
***** Performance data added to daily trending page
 +
****** https://docs.fd.io/csit/master/trending/introduction/dashboard.html#n-tx2
 +
****** Test cases include L2/IPv4/IPv6/Classifier/ACL
 +
****** Release report is planned to be published on 10th Feb
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
****** Will verify the image uploaded by Dave if it is ready.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
**** Jenkins job to verify runs fine but slow
 +
***** https://gerrit.fd.io/r/c/ci-management/+/31083
 +
***** Maintainer asked for more servers for the sake of redundancy
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
 +
**** Dependency on maintainers to fix this issue
 +
**** Voting rights will be enabled once this issue is fixed
 +
***** Implementation is ready and will be tested with actual patches.
 +
***** Apply a file locking mechanism so that only one VPP instance runs at a time.
 +
****** https://gerrit.fd.io/r/c/csit/+/30425
 +
****** Patches are under review
 +
****** Maintainer raised the ticket to get Intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** This is acceptable to CSIT maintainers
 +
*** Arm is required to present its achievements and plans to the TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker - Lijian
 +
**** Latest VPP binary crashes on the QEMU docker
 +
***** System call fails inside QEMU docker when running VPP
 +
**** Verify SVE/SVE2 features inside ARM QEMU VM
 +
**** 'make test' execution is slow
 +
**** Sync with DPDK team/VPP community to decide the solution
 +
**** Proposals have been sent to VPP maintainer on verifying SVE/SVE2
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 
* VPP
 
** Discuss vec_en_rx/tx=1 parameters.
+
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
** Discuss Vectorized rx and tx functions in mlx5 (in case of no multi-seg)
+
** Multi-arch support - Lijian
** rxd,txd nos in VPP config.
+
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
** mbcache any configuring done from VPP side ?
+
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** Two SVE/SVE2 proposals are upstreamed; will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No positive result from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with Ampere or N1 team on how to enable CMN-600 counters for Ampere Altra
 +
***** Removed interrupts on Altra, but no performance improvement seen
 +
***** Instruction cache misses are higher on Altra than on N1
 +
*** IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP memif test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
**** 3 use cases are investigated.
 +
**** Will explore the memif logic and share the progress.
 +
**** Will share the link on details about how to run VPP in container.
 +
*** Investigate VPP agent usage - Tianyu
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
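
The minutes above mention a file locking workaround so that only one VPP instance starts at a time (the actual change is under review at https://gerrit.fd.io/r/c/csit/+/30425 and is not reproduced here). As a rough illustration of the general technique only, the sketch below takes a non-blocking exclusive advisory lock with <code>fcntl.flock</code>. The lock file path and the helper names <code>single_instance_lock</code> and <code>try_start</code> are assumptions made for this example, not names from CSIT or VPP.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock file location; the path used by the real patch is not
# stated in the minutes.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "vpp_instance.lock")


@contextmanager
def single_instance_lock(path=LOCK_PATH):
    """Hold an exclusive advisory lock on `path` for the duration of the block.

    Raises BlockingIOError if another holder already has the lock.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            yield fd
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def try_start(path=LOCK_PATH):
    """Return True if the lock could be taken (and released), False if held."""
    try:
        with single_instance_lock(path):
            # A real test harness would start its VPP instance here.
            return True
    except BlockingIOError:
        return False
```

A caller would wrap the instance start-up in <code>with single_instance_lock():</code> and either retry or skip when <code>BlockingIOError</code> is raised. The lock is advisory, so it only helps if every process that starts an instance goes through the same lock file.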
 +
'''01/19/2021'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tianyu Li
 +
** Jieqiang Wang
 +
** Tina Tsou
 
* CSIT
 
** make test failures Taishan Khem/adarsh (https://jira.fd.io/browse/CSIT-1148)
+
** VPP Performance Test
** Juraj Make test bottlenecks: Updates: Ran 4 containers (85 mins) (CSIT-1139)
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** mcbin, OD(1000/3000), cavium thunderX as one of the targets for VPP Device Test.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
** Future role of devices. Status: Existing Taishan Servers to be used for performance suite only.
+
**** CSIT official release 20.09 is available
** Khem to update on nested VMs on performance test cases.
+
***** https://docs.fd.io/csit/rls2009/report/
** buildroot image with VPP device(within container) ? Sirshak to ask maciek
+
***** Jieqiang will compare the performance data with release 20.09
** Tkt updates:
+
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
*** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Juraj Updates: Merged and Closed
+
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP.
*** CSIT-990 (buildroot package) Juraj Updates: Postponed
+
****** Check the number for CSIT 2101 release 
*** CSIT-1021: Handle Scapy pcap limit Khem(brief on patch, updates): Status: To check with CSIT team for jenkins build failure.
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
* fd.io lab
+
**** Hardware configurations/wiring are done; Physical connection to the TG is done.
** Sirshak to have follow up LF guys.
+
**** almost done, two steps need to be done
* Documentation
+
***** Start with basic L2/L3/ACL/classifier tests and use the DPDK PMD on 2n-thx2 first (daily testing)
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
***** Take the execution time into consideration if we want to run release testing on 2n-thx2.
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
***** It takes 9 hours to finish one round of testing.
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
***** Tests are running fine
** Subscribe to: docs@lists.fd.io
+
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
** Sirshak and Khem to try doing some reviews this week.
+
****** Suitable time to run release testing on 2n-tx2 testbed.
* Action Items - Next Week
+
****** Will investigate IPSec test cases on 2n-tx2 - Juraj
** Khem: Ipv4 layer investigation. To Share some findings next week on parameters for CSIT
+
****** Add memif test case to 2n-tx2 once the release testing is done.
** Nitin Follow up: Sachin: Upstreaming ARMv8 Crypto Changes with external DPDK.
+
** VPP Path
** Nitin Follow up: Add Virtual addressing support in IOVA dmap
+
*** CentOS-7 will be enabled on the master branch to support the LTS release
** Nitin make test on Thunderx2 timings :
+
**** CentOS-7 Jenkins on Arm will be supported.
** Khem: status on make test failures: CSIT-1148
+
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
** Khem: make test on Taishan timings:
+
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
** Sirshak: cavium USB-Ethernet adapters to Quantta Switch.
+
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
****** Will verify the image uploaded by Dave if it is ready.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
 +
***** Implementation is ready and will be tested with actual patches.
 +
***** Apply a file locking mechanism so that only one VPP instance runs at a time.
 +
****** https://gerrit.fd.io/r/c/csit/+/30425
 +
****** Patches are under review
 +
****** Maciek raised the ticket to get Intel people involved
 +
****** Will not update the firmware because the release testing is ongoing
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** This is acceptable to CSIT maintainers
 +
*** Arm is required to present its achievements and plans to the TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker
 +
**** Latest VPP binary crashes on the QEMU docker - Lijian
 +
*** Lab move for the fd.io lab
 +
**** Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** Two SVE/SVE2 proposals are upstreamed; will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No positive result from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
**** Analysis of benchmarking results for Ampere Altra
 +
***** A lot of context switches occur on Ampere Altra compared to N1SDP
 +
***** perf tools used to capture the perf events
 +
***** Talk with Ampere or N1 team on how to enable CMN-600 counters for Ampere Altra
 +
*** IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP memif test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
**** 3 use cases are investigated.
 +
**** Will explore the memif logic and share the progress.
 +
**** Will share the link on details about how to run VPP in container.
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
  
** Sirshak: try to switch eth-usb for regular eth ports on ThunderXs - Created a LF tkt have follow up meeting today.
 
  
'''6/19/2018'''
+
 
 +
'''01/05/2021'''
 
* Attendees
 
** Sirshak Das
+
** Govindarajan Mohandoss
** Sachin Saxena
+
** Lijian Zhang
** Khemendra Kumar
+
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tianyu Li
 
** Tina Tsou
 
** Nitin Saxena
+
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
**** CSIT official release 20.09 is available
 +
***** https://docs.fd.io/csit/rls2009/report/
 +
***** Jieqiang will compare the performance data with release 20.09
 +
****** Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
 +
****** DPDK testpmd running inside VM, l2 cross connect running inside VPP. 
 +
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done; Physical connection to the TG is done.
 +
**** almost done, two steps need to be done
 +
***** Start with basic L2/L3/IPSec/ACL/classifier tests and use the DPDK PMD on 2n-thx2 first (daily testing)
 +
***** Take the execution time into consideration if we want to run release testing on 2n-thx2.
 +
***** Tests are running fine
 +
****** L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
 +
****** Suitable time to run release testing on 2n-tx2 testbed.
 +
** VPP Path
 +
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
 +
***** Implementation is ready and will be tested with actual patches.
 +
***** Apply a file locking mechanism so that only one VPP instance runs at a time.
 +
****** https://gerrit.fd.io/r/c/csit/+/30425
 +
****** Patches are under review
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** This is acceptable to CSIT maintainers
 +
*** Arm is required to present its achievements and plans to the TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features inside QEMU docker
 +
**** Latest VPP binary crashes on the QEMU docker - Lijian
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** Two SVE/SVE2 proposals are upstreamed; will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
**** Plan to talk with VPP maintainers on this topic
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No positive result from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
*** IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
 +
** Investigate VPP test cases in container
 +
*** Investigate VPP test cases in VPP CSIT - Jieqiang
 +
*** Investigate VPP use cases proposals in containers - Tianyu
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''12/22/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 
** Juraj Linkes
 
** Brian Brooks
+
** Jieqiang Wang
** Ed Kern
+
** Tina Tsou
** Song
+
* General
* General Topic
+
** Will cancel the meeting on Dec 29th;
** Introduce Yi ,Lijian and Song
+
* CSIT
* Action Items - Last Week
+
** VPP Performance Test
** Brian: mcbin Status:
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** Sirshak: Follow up clang changes. Status: Merged updated wiki.
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
** Khem: Update Kernel Image based on Sanil's input to move past known VM issues. Try this internally.
+
**** CSIT official release 20.09 is available
** Khem: LF tkt for Taishan BIOS updates.
+
***** https://docs.fd.io/csit/rls2009/report/
*** No update for the ticket
+
***** Jieqiang will compare the performance data with release 20.05
** Adarsh: openssl updates. Status:
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
*** Raised Jira ticket, needs to be discussed with VPP folks
+
**** Hardware configurations/wiring are done; Physical connection to the TG is done.
** Adarsh: Kubernetes
+
**** almost done, two steps need to be done
*** Working with K8s folks, planning on creating topology from containers for functional tests
+
***** Code to update the Jenkins job needs to be merged
** Khem: VM(s) in container, VFs for containers
+
***** Start with basic L2/L3/IPSec/ACL/classifier tests and use the DPDK PMD on 2n-thx2 first (daily testing)
** Sirshak: Summarize tkts in the Tuning Section. Status: Not Done
+
***** Take the execution time into consideration if we want to run release testing on 2n-thx2.
** Khem: Investigation on ipv4 layer. Status: Not Done
+
** VPP Path
** Nitin: Send old patch on dpdk_input node tuning
+
*** CentOS-7 will be enabled on the master branch to support the LTS release
 +
**** CentOS-7 Jenkins on Arm will be supported.
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
***** Will ask Dave if he needs help with testing CentOS-8 on Arm - Juraj.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** VPP community is responding to this issue actively. - Juraj
 +
**** Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
 +
***** Implementation is ready and will be tested with actual patches.
 +
***** Apply a file locking mechanism so that only one VPP instance runs at a time.
 +
****** https://gerrit.fd.io/r/c/csit/+/30425
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** This is acceptable to CSIT maintainers
 +
*** LF will provide QSFP+ fiber switch for FD.io lab.
 +
**** Basically done. LF just procured the existing fiber switch currently rented by Arm in the FD.io lab.
 +
**** Send the progress to relevant people in Arm - Lijian
 +
**** Confirm with Tina to ensure Arm is not charged - Lijian
 +
*** Arm is required to present its achievements and plans to the TSC.
 +
**** Govind will prepare the slides
 +
*** Verify SVE/SVE2 features on VPP CSIT
 
* VPP
 
** Sachin: Upstreaming armv8 crypto changes. Status: Sachin will try to upstream a patch related to external DPDK
+
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
** Sirshak: Vectorization - Presentation.
+
** Multi-arch support - Lijian
** Any new findings on hotspots or optimizations. Brian: adjusting queue sizes seem to have an effect
+
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
** https://gerrit.fd.io/r/#/c/12932/ discussion: Need to understand the usecase(s) for iommu inside VPP
+
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
**** avf-input node with neon optimization is merged.
 +
**** ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** Two SVE/SVE2 proposals are upstreamed; will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
*** Investigate the scalable SIMD instructions on RISC-V - Lijian
 +
*** Investigate how to run traffic tests for VPP in docker - Lijian
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No positive result from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
*** IO testing is doable with specific PCIe slots, for which PCIe bandwidth is not the bottleneck.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
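The IPsec items above mention trying loop unrolling on a linear SPD lookup. A minimal sketch in C of what that could look like is shown here. The `policy_t` type, the single address-range match, and both function names are hypothetical simplifications for illustration and are not taken from the VPP source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified policy entry: one inclusive address range. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} policy_t;

static inline int policy_match(const policy_t *p, uint32_t addr)
{
    return addr >= p->lo && addr <= p->hi;
}

/* Baseline: return the index of the first matching policy, or -1. */
ptrdiff_t spd_lookup_linear(const policy_t *spd, size_t n, uint32_t addr)
{
    for (size_t i = 0; i < n; i++)
        if (policy_match(&spd[i], addr))
            return (ptrdiff_t)i;
    return -1;
}

/* 4x unrolled variant: evaluate four entries per iteration, then check the
 * results in order so the first (highest-priority) match still wins. */
ptrdiff_t spd_lookup_unrolled4(const policy_t *spd, size_t n, uint32_t addr)
{
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        int m0 = policy_match(&spd[i + 0], addr);
        int m1 = policy_match(&spd[i + 1], addr);
        int m2 = policy_match(&spd[i + 2], addr);
        int m3 = policy_match(&spd[i + 3], addr);

        if (m0) return (ptrdiff_t)(i + 0);
        if (m1) return (ptrdiff_t)(i + 1);
        if (m2) return (ptrdiff_t)(i + 2);
        if (m3) return (ptrdiff_t)(i + 3);
    }

    /* Tail: handle the remaining 0 to 3 entries one at a time. */
    for (; i < n; i++)
        if (policy_match(&spd[i], addr))
            return (ptrdiff_t)i;

    return -1;
}
```

Both functions return the same index for any input; whether the unrolled form is actually faster depends on the CPU and the table size, so it would need to be benchmarked on the target platform.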
 +
 
 +
 
 +
'''12/15/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
** Will cancel the meeting on Dec 29th;
 
* CSIT
 
* CSIT
** Discuss current make test time bottleneck.
+
** VPP Performance Test
** AI Nitin: measure make and make test on ThunderX
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** AI Khem: measure make and make test on Taishan
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
** AI Sirshak: try to switch eth-usb for regular eth ports on Thunderxs
+
**** CSIT official release 20.09 is available
** Future role of devices. Status: will be decided when we have more info (performance on different devices etc.)
+
***** https://docs.fd.io/csit/rls2009/report/
** Question to Nitin/Anyone of how to individually run one test case of the performance suite. Status: no performance testcase can run on 2-node topologies
+
***** Jieqiang will compare the performance data with release 20.05
** Tkt updates:
+
*** Leverage current spare TX2 server as 2-node topology performance test-bed.
*** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Juraj Updates: Sent a patch. Status: Patch is waiting to be merged
+
**** Hardware configurations/wiring are done; Physical connection to the TG is done.
*** CSIT-990 (buildroot package) Juraj Updates: No updates
+
** VPP Path
*** CSIT-1021: Handle Scapy pcap limit Khem(brief on patch, updates): Submitted. Jobs still failing, Khem to investigate. Patch related to Jumbo pkts.
+
*** CentOS-7 will be enabled with the master branch to support the LTS release
* fd.io lab
+
**** CentOS-7 Jenkins on Arm will be supported.
** mcbin get them up, discuss with LF. Status: Brian - No Updates
+
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
** Cavium Blades LF ticket #56713 Status: Tina - Need to have a meeting
+
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
* Documentation
+
**** https://gerrit.fd.io/r/c/ci-management/+/28960
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
**** https://gerrit.fd.io/r/c/ci-management/+/28022
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
** VPP Device
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
** Subscribe to: docs@lists.fd.io
+
**** VPP community is actively responding to this issue. - Juraj
 +
**** Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
 +
***** Implementation is ready, and will be tested with actual patches.
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable to CSIT maintainers
 +
*** LF will provide QSFP+ fiber switch for FD.io lab.
 +
**** Basically done. LF just procured the existing fiber switch currently rented by Arm in FD.io lab.
 +
**** Send the progress to relevant people in Arm - Lijian
 +
*** Arm is required to present Arm achievement and plan to TSC.
 +
**** Govind will prepare the slides
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** Optimize ethernet-input and avf-input node with NEON intrinsics
 +
**** Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
 +
*** Try to capture some software benchmarking results
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** No positive result from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Performance on Altra is about 30%-40% lower than 8268.
 +
*** Performance on Altra is slightly better than N1SDP.
 +
*** IO testing is doable with specific PCIe slots, for which PCIe bandwidth is not the bottleneck.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** Benchmark Altra vs Cascade 8268
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
  
* Action Items - Next Week
 
  
'''6/12/2018'''
+
'''12/08/2020'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Govindarajan Mohandoss
** Brian Brooks
+
** Lijian Zhang
** John Bromhead
+
** Juraj Linkes
** Sachin Saxena
+
** Khemendra Kumar
+
** Adarsh
+
** Andy Wang
+
 
** Tina Tsou
 
** Tina Tsou
** Andrew Pinski
+
* General
** Nitin Saxena
+
* CSIT
** Natalie Samsonov
+
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
*** Use the spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done.
 +
**** Physical connection to the TG is done.
 +
**** Software installation for the perf tests is pending.
 +
**** Execution is much slower due to ThunderX
 +
***** Code changes related to SSH calls give a 4x speedup.
 +
** VPP Path
 +
*** Dave will add CentOS-8 Jenkins on Arm job
 +
*** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022
 +
** VPP Device
 +
*** VPP device testing issues may be caused by XL710 i40e fw or kernel module.
 +
**** Working with VPP/DPDK/Intel to root cause this issue. - Juraj
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
**** Which is acceptable to CSIT maintainers
 +
*** LF will provide QSFP+ fiber switch for FD.io lab.
 +
**** Vexxhost just has a spare one, and LF will buy it for FD.io lab, which will probably happen this month.
 +
*** N1SDP shipment to FD.io
 +
**** Govind will track the status
 +
*** CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
 +
**** Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Machiek to provide 10G switch.
 +
**** Arm is required to present Arm achievement and plan to TSC.
 +
***** Govind will prepare the slides
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** SOC id will be available on /proc entry starting from kernel version 5.9
 +
**** Will investigate the details - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Benchmarked cross-connect and TX queue is dropping packets
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** To optimize ethernet-input and avf-input node with NEON intrinsics
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 two proposals upstreamed
 +
*** https://gerrit.fd.io/r/c/vpp/+/29942
 +
*** https://gerrit.fd.io/r/c/vpp/+/30326
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
 +
*** Have to repeat the testing in the future.
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Ampere altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches and loop-unrolling with ipsec-out node
 +
**** Didn't observe much performance improvement (2%) so far
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
  
* Action Items - Last Week
+
'''12/1/2020'''
** Brian: mcbin status: Updates from Trishan LF tkt #54490. - No updates
+
* Attendees
** Sirshak: Follow up clang changes. Sent: Follow up patch.
+
** Govindarajan Mohandoss
** Khem: Update Kernel Image based on Sanil's input to move past known VM issues. Try this internally and then do it in fd.io lab.
+
** Jieqiang Wang
** Khem: LF tkt for Taishan BIOS updates. LF #56898 Status: Not done. Will follow up.
+
** Juraj Linkes
** Adarsh: openssl updates. Status: IPSEC SA add entry error. To open a Jira tkt tracking this.  
+
** Tina Tsou
** Sirshak: Summarize tkts in the Tuning Section. Didn't get a chance to do this this week; will try to complete it by next week.
+
* General
** Sirshak: Schedule a Meeting between Juraj and Khem. Done
+
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/ - Done
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/ - Done
 +
*** 20% perf-drop with L2 learning 1Mx flows, 4T4C, in release-2005
 +
**** Issue caused by - https://gerrit.fd.io/r/c/vpp/+/26549 - Sync up with Lijian
 +
*** Perf data capture for CSIT official release is done, so MRR testing with Taishan server is resolved.
 +
**** Huge-pages are not configured on Taishan, or previous 4K huge-pages are not enough.
 +
***** The issues are gone with 32k huge pages configured on the Taishan servers.
 +
**** Some random failed test cases due to SSH connection failures.
 +
*** Use the spare TX2 server as 2-node topology performance test-bed.
 +
**** Hardware configurations/wiring are done.
 +
**** Physical connection to the TG is done.
 +
**** Software installation for the perf tests is pending.
 +
**** Execution is much slower due to ThunderX
 +
***** Code changes related to SSH calls give a 4x speedup.
 +
** VPP Path
 +
*** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022 - auto-generate docker image
 +
**** Will keep the CentOS 7 with master branch.
 +
** VPP Device
 +
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
 +
*** LF will provide QSFP+ fiber switch for FD.io lab.
 +
*** CSIT will install normally used os distro and kernel.
 +
*** 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
 +
**** To enable voting right for the VPP device jobs. - Juraj
 +
***** Failed tests due to sw_interface_dump api issue. - Juraj
 +
**** VPP device job is unstable
 +
***** Race condition occurs when multiple VPP instances are starting.
 +
***** Will try to update the i40e driver & firmware.
 +
*** N1SDP shipment to FD.io
 +
**** Govind will update Juraj and Machiek on the shipment status.
 +
**** Will still have to ship N1SDP to FD.io lab, and Machiek confirmed that there will be enough rack space.
 +
*** CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
 +
**** Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Machiek to provide 10G switch.
 +
**** Arm is required to present Arm achievement and plan to TSC.
 
* VPP
 
* VPP
** Brian: Talk on mcbin perf analysis. Nitin to send an old patch on tuning prefetch on dpdk_input node.
+
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
** Sirshak: VPP Multi-arch optimizations Guidelines
+
** Multi-arch support - Lijian
** Sirshak: Vectorization - Plan to present something next week. Any thoughts ?
+
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
** Nitin: anybody willing to take up ipv4 layer ? Khem to take a look.
+
*** SOC id will be available on /proc entry starting from kernel version 5.9
** Sachin: Upstreaming armv8 crypto changes.
+
**** Will investigate the details - Lijian
** Nitin: memcpy updates ?
+
** Investigate VPP Intel AVF PMD driver - Lijian
** Sirshak: clang patch status
+
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** To optimize ethernet-input and avf-input node with NEON intrinsics
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 proposal
 +
*** Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
 +
*** Patches are upstreamed for comments
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Ampere altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches with ipsec-out node
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
 
 +
'''11/24/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Tina Tsou
 +
* General
 
* CSIT
 
* CSIT
** Sirshak: Explain VPP Path and VPP Device
+
** VPP Performance Test
** Open Questions and Answers surrounding VPP Device
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
*** Q. Do the Intel onboard NICs support VFs via SRIOV on machiattobin boards ?
+
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
*** A.[Natalie] We support it but it’s not formally released yet. Will be formally delivered in 18.09.
+
*** 20% perf-drop with L2 learning 1Mx flows, 4T4C, in release-2005
*** BB - Kernel bypass uses UIO possible to do. [natalie] check support for VF for onboard NICs
+
**** Issue caused by - https://gerrit.fd.io/r/c/vpp/+/26549
*** Q. If Yes, is it a hardware level support or supported in musdk also ?
+
*** Perf data capture for CSIT official release is done, so MRR testing with Taishan server is resolved.
*** A.[Natalie] MUSDK is not relevant here. Intel NICs are using DPDK and ARM infrastructure directly. We support PCIE SR-IOV with both v4.4 and v4.14 kernels
+
**** Huge-pages are not configured on Taishan, or previous 4K huge-pages are not enough.
*** Q. Has anybody tested containers (docker) and any container orchestration system on mcbin (e.g Docker Swarm or Kubernetes) ?
+
*** Use the spare TX2 server as 2-node topology performance test-bed.
*** A.[Natalie] Yes.
+
**** Hardware configurations/wiring are done.
*** Q. K8s or Docker Swarm ?
+
** VPP Path
*** A. [Bin Arm Internal] K8s (version 1.9.4) is a good choice. Use kubeadm to install the k8s cluster.
+
*** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
*** Q. VM inside a container works on ARM ?
+
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
*** A. [Bin ARM Internal] Use Kata and Runv. Kata/Runv is the solution for hardware-virtualized containers.
+
**** CentOS-8 docker image on Arm is working fine, but not triggered per patch yet.
*** Q. Container within a Container(nested) works on ARM ?
+
**** https://gerrit.fd.io/r/c/ci-management/+/28960
*** A.[Bin ARM Internal] ‘Docker in docker’ or ‘Docker of Docker’ works well on the Arm platform.
+
**** https://gerrit.fd.io/r/c/ci-management/+/28022 - auto-generate docker image
** Sirshak: Explain the proposed role of Cavium Blades for functional tests.
+
** VPP Device
** Tkt updates:
+
*** Current VPP device testing on TX2 is around 40 mins - 45 mins
*** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Juraj Updates: Sent a patch.  
+
*** LF will provide QSFP+ fiber switch for FD.io lab.
*** CSIT-990 (buildroot package) Juraj Updates:
+
*** CSIT will install normally used os distro and kernel.
*** CSIT-1021: Handle Scapy pcap limit Khem(brief on patch, updates): Submitted. Jobs failing; Khem to investigate. Patch related to Jumbo pkts.
+
*** 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
*** Sachin: To open tkt to track ARMv8 crypto.
+
**** To enable voting right for the VPP device jobs. - Juraj
* fd.io lab
+
***** Failed tests due to sw_interface_dump api issue. - Juraj
** mcbin Status: Brian - No Updates
+
*** N1SDP shipment to FD.io
** Cavium Blades #56713 Status: Tina
+
**** Govind will update Juraj and Machiek on the shipment status.
* Documentation
+
**** Will still have to ship N1SDP to FD.io lab, and Machiek confirmed that there will be enough rack space.
** Link to Pull Request: https://github.com/fdioDocs/vpp-docs/pull/7
+
*** CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
** Changes Shown Here: https://github.com/fdioDocs/vpp-docs/pull/7/files
+
**** Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Machiek to provide 10G switch.
** Docs Page: https://a-olechtchoukvpp-docs.readthedocs.io/en/latest/tasks/writingdocs/index.html
+
**** Arm is required to present Arm achievement and plan to TSC.
** Subscribe to: docs@lists.fd.io
+
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** SOC id will be available on /proc entry starting from kernel version 5.9
 +
**** Will investigate the details - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** To optimize ethernet-input and avf-input node with NEON intrinsics
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 proposal
 +
*** Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
 +
*** Patches are upstreamed for comments
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Ampere altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches with ipsec-out node
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
  
* Action Items - Next Week
+
'''11/17/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
*** 20% perf-drop with L2 learning 1Mx flows, 4T4C, in release-2005
 +
**** Issue caused by - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** Use the spare TX2 server as 2-node topology performance test-bed.
 +
**** Juraj to check with Peter about the feasibility.
 +
** VPP Path
 +
*** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
*** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022 - auto-generate docker image
 +
** VPP Device
 +
*** CSIT will install normally used os distro and kernel.
 +
*** 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
 +
**** To enable voting right for the VPP device jobs. - Juraj
 +
***** Failed tests due to sw_interface_dump api issue. - Juraj
 +
*** N1SDP shipment to FD.io
 +
**** Will still have to ship N1SDP to FD.io lab, and Machiek confirmed that there will be enough rack space.
 +
*** CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
 +
**** Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Machiek to provide 10G switch.
 +
**** Arm is required to present Arm achievement and plan to TSC.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** SOC id will be available on /proc entry starting from kernel version 5.9
 +
**** Will investigate the details - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Initial benchmarking and analysis is done, and profiling result is recorded.
 +
*** To optimize ethernet-input and avf-input node with NEON intrinsics
 +
*** Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
 +
** SVE/SVE2 proposal
 +
*** Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
 +
*** Patches are upstreamed for comments
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Ampere altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
** IPsec on Arm platform. - Govind
 +
*** Apply prefetches with ipsec-out node
 +
*** Work on IPsec input node; VPP uses linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
  
** Brian: mcbin Status:
+
'''11/10/2020'''
** Sirshak: Follow up clang changes. Status: Merged updated wiki.
+
* Attendees
** Khem: Update Kernel Image based on Sanil's input to move past known VM issues. Try this internally.
+
** Govindarajan Mohandoss
** Khem: LF tkt for Taishan BIOS updates.
+
** Lijian Zhang
** Adarsh: openssl updates. Status:
+
** Jieqiang Wang
** Sirshak: Summarize tkts in the Tuning Section. Status: Not Done
+
** Juraj Linkes
** Khem: Investigation on ipv4 layer. Status:
+
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
 +
***** Already done by Juraj; the data is published in the CSIT 2009 report.
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
**** Repeat tests on local N1SDP and cascade server. - Jieqiang
 +
**** Repeat the test case with latest master branch. - Jieqiang
 +
**** The patch that introduced this perf drop needs to be analyzed. - Jieqiang, Lijian
 +
**** This patch needs to be analysed on VPP 2005 and 2001 releases. - Jieqiang, Lijian
 +
**** The perf drop rate is ~5-8% on latest VPP code compared to the original data.
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
**** The iterative jobs for VPP 2009 are still running.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-report-iterative-2009-3n-tsh/
 +
**** Still running for one more week.
 +
**** Still running for more time due to Jenkins issues like Jenkins restart.
 +
**** Daily performance jobs only run MRR tests, while iterative jobs run MRR tests, NDR/PDR tests, etc., which take longer.
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
**** Move the thx2 to the same rack for tg and install the same nic on tg.
 +
**** 1g NIC for management installed on thx2, but cannot be net-booted.
 +
***** Able to net-boot from the built-in 10G NIC.
 +
***** The tx2 has been moved to the same rack where the tg is located.
 +
***** Plan to set up the weekly perf tests on the new topo.
 +
**** Port the Robot Framework configuration steps for tsh testbeds from thx1 to thx2 to speed up perf tests. - Juraj
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022 automate the generation of docker builder images.
 +
**** Plan to drop the support for CentOS 7 from Dave.
 +
**** Tried Dave's patch to generate docker image on Arm and saw some errors. - Juraj
 +
**** Test arm centos7 jenkins builder image. - Juraj.
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
 +
** VPP Device
 +
*** CSIT will install normally used os distro and kernel.
 +
*** 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
 +
**** Figure out which of the two hosts should run the Jenkins job.
 +
**** Revert to old kernel version 4.15.0-55 to avoid AVF issue.
 +
**** AVF issue is common across the platform.
 +
***** Differences between avf driver versions may be the root cause of behavior changes.
 +
**** New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 45 minutes.
 +
***** Python runs slower on new thx2 servers than on 1-node Skylake.
 +
***** Try a newer version of Python (such as 3.8) or split the device tests into two parts.
 +
***** Check how many CPUs get utilized for robot framework execution on thx2 server.
 +
***** Two thunderx2 are running fine right now and the VPP device jobs are almost done.
 +
***** Disabling hyperthreading on new thx2 will speed up the VPP device tests.
 +
***** Enable the voting right for the VPP device jobs. - Juraj
 +
****** Failed tests due to sw_interface_dump api issue. - Juraj
 +
*** N1SDP shipment to FD.io
 +
**** Get response from Maciek about the rack space and traffic generator availability.
 +
*** CSIT budget plan for 10g switch purchase in FD.io lab. - Juraj, Tina
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
 +
*** Summarize the meeting minutes and action items. - Lijian
 +
*** SOC id will be available on /proc entry from kernel version 5.9 - Lijian, Honnappa
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
*** Start benchmarking AVF PMD driver in VPP on N1SDP.
 +
*** Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
 +
*** avf_input and avf_output nodes don't consume more CPU cycles than dpdk-related nodes do.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
*** Apply the SVE/SVE2 intrinsics on refactoring ethernet-input node. - Lijian
 +
**** SVE/SVE2 functionality to be tested on the new development platform.
 +
**** Verify SVE/SVE2 code changes on simulator.
 +
**** Try to run standalone SVE code on the new FPGA platform.
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Ampere Altra server has some PCIe bugs.
 +
*** Try the VFs with DPDK plugin. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Find out the tuned configuration for cross connect test cases using AVF PMD driver.
 +
**** Figure out corresponding configurations in CSIT scripts.
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
**** Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
 +
** Plans
  
'''6/4/2018'''
+
'''11/03/2020'''
 
* Attendees
 
* Attendees
** Sirshak Das
+
** Govindarajan Mohandoss
** Brian Brooks
+
** Lijian Zhang
** John Bromhead
+
** Jieqiang Wang
** Sachin Saxena
+
** Juraj Linkes
** Khemendra Kumar
+
** Adarsh
+
** Andy Wang
+
 
** Tina Tsou
 
** Tina Tsou
** Andrew Pinski
+
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
**** Repeat tests on local N1SDP and cascade server. - Jieqiang
 +
**** Repeat the test case with latest master branch. - Jieqiang
 +
**** The patch that introduced this perf drop needs to be analyzed. - Jieqiang, Lijian
 +
**** Look into the patch to get some ideas about the code changes. - Jieqiang, Lijian
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
**** The iterative jobs for VPP 2009 are still running.
 +
**** Still running for one more week.
 +
***** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-report-iterative-2009-3n-tsh/
 +
***** Daily performance jobs only run MRR tests, while iterative jobs run MRR tests, NDR/PDR tests, etc., which takes longer.
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
**** Move the thx2 to the same rack for tg and install the same nic on tg.
 +
**** 1g NIC for management installed on thx2, but cannot be net-booted.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in the Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28022 automate the generation of docker builder images.
 +
**** Test arm centos7 jenkins builder image. - Juraj.
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of the CentOS-8 Gerrit Jenkins job on aarch64. - Juraj
 +
** VPP Device
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
 +
**** Figure out which of the two hosts should run the Jenkins job.
 +
**** Revert to old kernel version 4.15.0-55 to avoid AVF issue.
 +
**** AVF issue is common across the platform.
 +
***** Differences between avf driver versions may be the root cause of behavior changes.
 +
**** New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 45 minutes.
 +
***** Python runs slower on new thx2 servers than 1-node skylake.
 +
***** Try a newer version of Python (such as 3.8) or split the device tests into two parts.
 +
***** Check how many CPUs get utilized for robot framework execution on thx2 server.
 +
***** Two thunderx2 are running fine right now and the VPP device jobs are almost done.
 +
*** N1SDP shipment to FD.io
 +
**** Get response from Maciek about the rack space and traffic generator availability.
 +
*** CSIT budget plan for 10g switch purchase in FD.io lab. - Juraj, Tina
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
 +
*** Summarize the meeting minutes and action items. - Lijian
 +
*** SOC id will be available on /proc entry from kernel version 5.9 - Lijian, Honnappa
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
*** Start benchmarking AVF PMD driver in VPP on N1SDP.
 +
*** Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
 +
*** avf_input and avf_output nodes don't consume more CPU cycles than dpdk-related nodes do.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
*** Apply the SVE/SVE2 intrinsics on refactoring ethernet-input node. - Lijian
 +
**** SVE/SVE2 functionality to be tested on the new development platform.
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
 +
** Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Find out the tuned configuration for cross connect test cases using AVF PMD driver.
 +
**** Figure out corresponding configurations in CSIT scripts
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
**** Review akshitha's PPT on SLC eviction and share it with the team. - Govind.
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
** Plans
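
The loop-unrolling idea noted for the SPD lookup can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: <code>toy_policy_t</code>, <code>toy_match</code>, <code>toy_spd_lookup</code> and <code>toy_spd_lookup_unrolled4</code> are hypothetical names, and the policy entry is a simplified address range rather than VPP's actual SPD data structure. The sketch only shows how a linear first-match search can be unrolled 4x while returning the same result as the plain loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified policy entry: matches an inclusive range of
 * destination addresses. This is NOT VPP's real SPD layout. */
typedef struct {
    uint32_t dst_lo;
    uint32_t dst_hi;
} toy_policy_t;

/* Returns non-zero when dst falls inside the policy's range. */
static inline int toy_match(const toy_policy_t *p, uint32_t dst)
{
    return dst >= p->dst_lo && dst <= p->dst_hi;
}

/* Baseline linear search: one policy per loop iteration.
 * Returns the index of the first matching policy, or -1. */
static int toy_spd_lookup(const toy_policy_t *policies, size_t n, uint32_t dst)
{
    assert(policies != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (toy_match(&policies[i], dst))
            return (int)i;
    return -1;
}

/* Same search unrolled 4x: four policies are checked per iteration,
 * in order, so first-match semantics are preserved. A scalar tail
 * loop handles the remaining 0..3 entries. */
static int toy_spd_lookup_unrolled4(const toy_policy_t *policies, size_t n,
                                    uint32_t dst)
{
    assert(policies != NULL || n == 0);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        if (toy_match(&policies[i], dst))     return (int)i;
        if (toy_match(&policies[i + 1], dst)) return (int)(i + 1);
        if (toy_match(&policies[i + 2], dst)) return (int)(i + 2);
        if (toy_match(&policies[i + 3], dst)) return (int)(i + 3);
    }
    for (; i < n; i++)
        if (toy_match(&policies[i], dst))
            return (int)i;
    return -1;
}
```

Whether unrolling helps on a given core would still have to be measured; the sketch makes no performance claim.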
 +
 
 +
'''10/27/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 
** Juraj Linkes
 
** Juraj Linkes
** Nitin Saxena
+
** Tina Tsou
** Natalie Samsonov
+
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
**** Repeat tests on local N1SDP and cascade server. - Jieqiang
 +
**** Look into the patch to get some ideas about the code changes. - Jieqiang, Lijian
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
**** The iterative jobs for VPP 2009 are still running.
 +
**** Still running for one or two weeks.
 +
***** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-report-iterative-2009-3n-tsh/
 +
***** Daily performance jobs only run MRR tests, while iterative jobs run MRR tests, NDR/PDR tests, etc., which takes longer.
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
**** Move the thx2 to the same rack for tg and install the same nic on tg.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in the Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of the CentOS-8 Gerrit Jenkins job on aarch64. - Juraj
 +
** VPP Device
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
 +
**** Figure out which of the two hosts should run the Jenkins job.
 +
**** Revert to old kernel version 4.15.0-55 to avoid AVF issue.
 +
***** Differences between avf driver versions may be the root cause of behavior changes.
 +
**** New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 40 minutes.
 +
***** Python runs slower on new thx2 servers than 1-node skylake.
 +
***** Try a newer version of Python (such as 3.8) or split the device tests into two parts.
 +
***** Check how many CPUs get utilized for robot framework execution on thx2 server.
  
* Action Items - Last Week
+
* VPP
** Sirshak: To create a LF tkt for mcbin - Didn't create as Brian is handling it offline. If things remain unresolved this week, will create one. - LF Tkt created #54490. [BB]Trishan to follow up over email.
+
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
** Sirshak: Follow up on cavium-3 : It's integrated into the Arm CI job.
+
** Multi-arch support - Lijian
** Sirshak: Upstream clang changes: Failing on Cavium TX1 host up-streamed related patch working on review comments.
+
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
** Sirshak: Discuss with Maciek and get a signoff for moving the x86 Hosts to arm rack: Done
+
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
** Honnappa: Provide inputs on how to proceed with comments on Marvell dpdk patch.
+
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
** Honnappa: VPP-1284: To look at this patch to provide comments on performance implications of the fix
+
*** Summarize the meeting minutes and action items. - Lijian
** Juraj estimate moving CSIT functional tests to make test. - 1-2 months for 1 person. Others CSIT looking into this. Better estimate soon.
+
** Investigate VPP Intel AVF PMD driver - Lijian
** Khem: Create LF tkt for Performance Suite Topology Creation. : Created LF #56736
+
*** Start investigating AVF code in VPP.
** Adarsh: Create a Jira to document Automation Task. Created Jira Tkt.
+
*** Start benchmarking AVF PMD driver in VPP on N1SDP.
** Khem: Follow up Sanil : Known taishan vm issues. Update Kernel Image
+
*** Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
** Khem: LF tkt for Taishan BIOS updates. LF #56898
+
*** avf_input and avf_output nodes don't consume more CPU cycles than dpdk-related nodes do.
** Adarsh: openssl updates. Updated openssl dpdk. VPP is now stable. Will test soon. Adarsh to close the tkt.
+
** SVE/SVE2 proposal
** Nitin: VPP-1064 multiple cache line size patch. Nitin to raise to LF tkt to remove DPDK package from Nexus server.
+
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
*** Apply the SVE/SVE2 on ethernet-input node. - Lijian
 +
** Repeat the 4x and 2x loop unrolling tests on Ampere server. - Jieqiang
 +
** Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
**** Figure out corresponding configurations in CSIT scripts
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
*** Work on the IPsec input node; VPP uses a linear search for SPD lookup.
 +
**** Will try loop unrolling on the SPD lookup.
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
**** Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
** Plans
  
* fd.io lab
+
'''10/20/2020'''
** mcbin onboarding issue. - Comments in Action Items - Last Week.
+
* Attendees
** new cavium boxes status - JohnB : Blade 1-4 racked. CSIT Functional.
+
** Govindarajan Mohandoss
** Sirshak : Summarize tkts.
+
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
**** The iterative jobs for VPP 2009 are still running.
 +
***** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-report-iterative-2009-3n-tsh/
 +
***** Daily performance jobs only run MRR tests, while iterative jobs run MRR tests, NDR/PDR tests, etc., which takes longer.
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in the Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of the CentOS-8 Gerrit Jenkins job on aarch64. - Juraj
 +
**** Errors happen when running latest VPP debug image, which was introduced by https://gerrit.fd.io/r/c/vpp/+/29490 - Lijian
 +
** VPP Device
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
 +
**** Figure out which of the two hosts should run the Jenkins job.
 +
**** Two failed test cases related to AVF plugin.
 +
***** The root cause is the newer kernel version - 4.15.0-118-generic fails, 4.15.0-72-generic works.
 +
***** Downgrade the kernel version to 4.15.0-72-generic and continue the VPP device testing.
 +
***** Try the same experiment on X86 to see if this issue is arm-specific or not. - Juraj
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
*** Start benchmarking AVF PMD driver in VPP on N1SDP.
 +
*** Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Finished the benchmarking and shared the data with the team. - Jieqiang
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
**** Figure out corresponding configurations in CSIT scripts
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
*** Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
 +
*** Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
 +
** Plans
  
 +
'''10/13/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in the Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
**** Check with CSIT maintainers about the concrete plan for the enablement of the CentOS-8 Gerrit Jenkins job on aarch64. - Juraj
 +
** VPP Device
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
 +
**** Figure out which of the two hosts should run the Jenkins job.
 +
**** Two failed test cases related to AVF plugin.
 
* VPP
 
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
*** Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
*** No further comments from VPP community.
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Finished the benchmarking and shared the data with the team.
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
**** Figure out corresponding configurations in CSIT scripts
 +
**** Repeat the ACL ingress SL test cases locally for N1SDP.
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
** Plans
  
** memcpy patch updates/closure: Abandon. Jira to be updated with more data.
+
'''10/06/2020'''
** clang compilation Sirshak: Working on getting the patch upstreamed.
+
* Attendees
** mcbin performance analysis Brian: To talk about this next week.
+
** Govindarajan Mohandoss
** vectorization sirshak(Problem, Plausible Solution, Volunteers): SSE2NEON
+
** Juraj Linkes
** Sachin: upstreaming armv8 crypto changes.
+
** Tina Tsou
** Sirshak: Add Tuning section in Wiki
+
** Honnappa Nagarahalli
** Sirshak: Summarize Jira Tkts
+
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
 +
**** Juraj to check with Peter about the feasibility.
 +
** VPP Path
 +
*** 6x ThunderX1 servers in total in the Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
** VPP Device
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
**** The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs and other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Finished the benchmarking and shared the data with the team.
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Figure out corresponding configurations in CSIT scripts
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
** Plans
  
 +
'''09/29/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 
* CSIT
 
* CSIT
** Performance Suite Roadmap(topology, work distribution(khem, juraj)):
+
** VPP Performance Test
** Sirshak to Schedule a Meeting between Juraj and Khem.
+
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Juraj Updates: Seen by Juraj. Seeing the issue in ipv6 suite. happens during pcie rescan.
+
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
** CSIT-990 (buildroot package) Juraj Updates: Peter from pantheon replied Juraj still looking into it.
+
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
** CSIT-1021: Handle Scapy pcap limit Khem(brief on patch, updates):
+
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
** Sirshak : Summarize CSIT tkts
+
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
** Sachin: To open tkt to track ARMv8 crypto.
+
** VPP Path
 +
*** 6x ThunderX1 servers in total in the Nomad cluster
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
** VPP Device
 +
*** CSIT will install the commonly used OS distro and kernel.
 +
*** 2x ThunderX2 servers are set up in FD.io lab. SSH and IPMI connections are working.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
*** Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
** SVE/SVE2 proposal
 +
*** Will send email to Damjan asking him to review
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Finished the benchmarking and shared the data with the team.
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Figure out corresponding configurations in CSIT scripts
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
** Plans
  
*Documentation
+
'''09/22/2020'''
** Special VPP installations (e.g., dpaa).
+
* Attendees
** ARMv8 crypto needs to be documented.
+
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
**
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
**** https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
 +
*** L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
 +
**** The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
** VPP Path
 +
*** VexxHost will replace the faulty RAM with a new one, and get the expense reimbursed by LF.
 +
**** Issue is resolved by re-plugging the previous RAM, and the server is alive now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
 +
**** Add CentOS-7 on Arm - Second step;
 +
**** https://gerrit.fd.io/r/c/ci-management/+/28960
 +
** VPP Device
 +
*** 3x SoftIron servers will be decommissioned directly to free rack space for 2x ThunderX2 servers.
 +
*** Two ThunderX2 servers have been received by VexxHost and are currently in the storage warehouse.
 +
*** VexxHost people will set up the servers and provide IP connectivity.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Multi-arch support - Lijian
 +
*** Key point is how to differentiate vendor CPUs from other Perseus CPUs
 +
** Investigate VPP Intel AVF PMD driver - Lijian
 +
*** Start investigating AVF code in VPP.
 +
** SVE/SVE2 proposal
 +
*** SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
 +
** Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
 +
*** Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
 +
*** Figure out corresponding configurations in CSIT scripts
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** IPsec on Arm platform. - Govind
 +
** Plans
  
* Action Items - Next Week
+
'''09/15/2020'''
** Brian: mcbin status: Updates from Trishan LF tkt #54490.
+
* Attendees
** Sirshak: Follow up clang changes.
+
** Govindarajan Mohandoss
** Khem: Update Kernel Image based on Sanil's input to move past known VM issues.
+
** Juraj Linkes
** Khem: LF tkt for Taishan BIOS updates. LF #56898 Status:
+
** Jieqiang Wang
** Adarsh: openssl updates.
+
** Tina Tsou
** Sirshak: Summarize tkts in the Tuning Section.
+
* General
** Sirshak: Schedule a Meeting between Juraj and Khem.  
+
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and data will be included in the next CSIT release report.
 +
*** On L2 learning 1Mx flows, 4T4C, with release-2005, there is about 20% performance drop.
 +
**** The patch that caused this issue has been identified.
 +
***** https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
** VPP Path
 +
*** LF will pay for the expense, and Vexxhost has placed or will place the order for a new RAM module.
 +
*** Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to Vexxhost.
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
*** Check with Juraj with the latest news about the faulty RAMs.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - first step;
 +
**** Adding CentOS-7 on Arm will be the second step.
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
*** We can decommission the 3x SoftIron servers directly, but for the existing ThunderX2 servers the decommissioning could be temporary. We will probably reinstall them in the near future.
 +
*** Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
 +
** Two ThunderX2 servers are received by Vexx host and currently in the storage warehouse.
 +
** Vexx host people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
** Budget plan for CSIT FD.io lab.
 +
*** We have enough servers for VPP path & device tests.
 +
*** We can ask the CSIT FD.io lab folks for saving rack space for arm servers.
 +
*** We may plan to send new advanced servers for perf tests in future but we won't mention the specific server type.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** Vendor CPU server enablement in VPP - Lijian
 +
*** Ready for internal review
 +
*** Will discuss with VPP maintainer
 +
** Investigate VPP Intel AVF driver - Lijian
 +
** SVE
 +
*** SVE intrinsics wrapper is done. Proposal patch is ready for review.
 +
*** ACLE, architecture, sve-sve2-programming-example
 +
*** SVE intrinsics is preferred.
 +
*** Share SVE knowledge with the DPDK team.
 +
** Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
 +
*** Will repeat scalability testing on N1SDP.
 +
** Benchmark AVF driver between Cascade Lake and N1SDP - Jieqiang
 +
*** Will investigate AVF drivers on Arm. - Lijian
 +
** Jieqiang will figure out performance data for 1x, 10Kx flows on Cascade Lake in CSIT.
 +
*** Confirm whether the system is the same for the local Dell server and the Cascade Lake server in CSIT. - Jieqiang
 +
*** Check if there are any test cases with 1t1c/2t2c/4t4c configured for 2n-clx testbed in CSIT - Jieqiang
 +
*** Performance data; Configurations;
 +
** VM2VM
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Started system tuning on PMD TX direction.
 +
*** Investigate mempool configuration.
 +
*** Change the descriptor size by modifying the DPDK source code.
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
  
 +
'''09/08/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** CSIT DPDK test cases will be enabled on Arm servers and the data will be included in the next CSIT release report.
 +
*** On L2 learning 1Mx flows, 4T4C, with release-2005, there is about 20% performance drop.
 +
**** The patch that caused this issue has been identified.
 +
***** https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
** VPP Path
 +
*** LF will pay for the expense, and Vexxhost has placed or will place the order for a new RAM module.
 +
*** Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to Vexxhost.
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Dave is preparing scripts to generate docker images automatically on both x86 and Arm - first step;
 +
**** Adding CentOS-7 on Arm will be the second step.
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
*** We can decommission the 3x SoftIron servers directly, but for the existing ThunderX2 servers the decommissioning could be temporary. We will probably reinstall them in the near future.
 +
*** Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
 +
** Two ThunderX2 servers are received by Vexx host and currently in the storage warehouse.
 +
** Vexx host people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
* VPP
 +
** In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** SVE
 +
*** SVE intrinsics wrapper is done. Proposal patch is ready for review.
 +
*** ACLE, architecture, sve-sve2-programming-example
 +
*** SVE intrinsics is preferred.
 +
** Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
 +
*** Will repeat scalability testing on N1SDP.
 +
** Benchmark AVF driver between Cascade Lake and N1SDP - Jieqiang
 +
*** Will investigate AVF drivers on Arm. - Lijian
 +
** Jieqiang will figure out performance data for 1x, 10Kx flows on Cascade Lake in CSIT.
 +
*** Performance data; Configurations;
 +
** VM2VM
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Started system tuning on PMD TX direction.
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
  
'''5/29/2018'''
+
'''09/01/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** On L2 learning 1Mx flows, 4T4C, with release-2005, there is about 20% performance drop.
 +
**** The patch that caused this issue has been identified.
 +
***** https://gerrit.fd.io/r/c/vpp/+/26549
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
** VPP Path
 +
*** Will confirm with Dean if Arm can pay for the expense. If yes, will send the proposal to Vexxhost.
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
**** It seems that plugging working RAM modules into empty slots will resolve the problem.
 +
**** Juraj will send email to Machiek about the ownership of any FD.io lab servers, and who should pay for the charge.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
**** IPMI IP is configured via SSH Linux prompt. It's working fine now.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
**** Pending with Vexx host to proceed further.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
*** Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
 +
** Two ThunderX2 servers are received by Vexx host and currently in the storage warehouse.
 +
** Vexx host people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
* VPP
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 has compiling errors with latest VPP source code.
 +
**** This issue is fixed by Jieqiang and available for internal review.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** gcc-10 compiling issue is resolved and merged.
 +
** SVE
 +
*** ACLE, architecture, sve-sve2-programming-example
 +
*** SVE intrinsics is preferred.
 +
** Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
 +
** VM2VM
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Started system tuning on PMD TX direction.
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
'''08/25/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** On L2 learning 1Mx flows, 4T4C, with release-2005, there is about 20% performance drop.
 +
**** Jieqiang is trying to narrow down the patch that causes the issue.
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
**** It seems that plugging working RAM modules into empty slots will resolve the problem.
 +
**** Juraj will send email to Machiek about the ownership of any FD.io lab servers, and who should pay for the charge.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
**** IPMI IP is configured via SSH Linux prompt. It's working fine now.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
**** Pending with Vexx host to proceed further.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
*** Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
 +
** Two ThunderX2 servers are received by Vexx host and currently in the storage warehouse.
 +
** Vexx host people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
* VPP
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 has compiling errors with latest VPP source code.
 +
**** This issue is fixed by Jieqiang and available for internal review.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** SVE
 +
*** ACLE, architecture, sve-sve2-programming-example
 +
*** SVE intrinsics is preferred.
 +
** Benchmarked VPP on n1sdp on scalability, on 3x CPUs.
 +
** VM2VM
 +
** Transport use cases on VPP. - Govind
 +
*** Discussed the node graph and topology.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
 
 +
'''08/18/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Jieqiang is investigating some performance drop (between 2005 and 2008 releases) cases on Taishan servers.
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** Jieqiang will share the investigation report, but so far there are no apparent performance differences.
 +
**** Some 4T4C test cases on Taishan have obvious performance drop. Will compare the trending with x86 machines.
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
**** Pending with Vexx host to proceed further.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
**** Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
 +
** VPP Device
 +
** Two ThunderX2 servers are received by Vexx host and currently in the storage warehouse.
 +
** Vexx host people will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
*** Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 has compiling errors with latest VPP source code.
 +
**** This issue is fixed by Jieqiang and available for internal review.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
 
 +
'''08/11/2020'''
 
* Attendees
 
* Attendees
** Sirshak Das
 
** Brian Brooks
 
** John Bromhead
 
** Sachin Saxena
 
** Khemendra Kumar
 
** Adarsh
 
** Andy Wang
 
 
** Honnappa Nagarahalli
 
** Honnappa Nagarahalli
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 
** Tina Tsou
 
** Tina Tsou
** Andrew Pinski
+
** Lijian Zhang
 +
** Filip Varga
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Jieqiang is investigating some performance drop cases on Taishan servers.
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** Jieqiang will share the investigation report, but so far there are no apparent performance differences.
 +
**** Some 4T4C test cases on Taishan have obvious performance drop. Will compare the trending with x86 machines.
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
*** Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 has compiling errors with latest VPP source code.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
 
 +
'''08/04/2020'''
 +
* Attendees
 +
** Honnappa Nagarahalli
 +
** Govindarajan Mohandoss
 
** Juraj Linkes
 
** Juraj Linkes
** Nitin Saxena
+
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
** Filip Varga
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** Jieqiang will share the investigation report, but so far there are no apparent performance differences.
 +
**** Some 4T4C test cases on Taishan have obvious performance drop. Will compare the trending with x86 machines.
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
*** Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
*** gcc-10.1.0 has compiling errors with latest VPP source code.
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Currently working on non-encryption optimization with PMD driver.
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
  
* Action Items - Last Week
 
** Sirshak: To update LF RT #54919 to follow up on cavium-2 status. - Not Needed as cavium-2 is present.
 
** Sirshak: Release Machine to EdK as soon as ThunderX is up. - Done
 
** Sirshak: Status on new ThunderXs: Will be decided after talks with Maciek. - Yet to decide.
 
** Sirshak: vm unresponsive issue. Tried again still got 27 errors for ipv4 handed over to Juraj for further investigation.
 
** Sirshak: To ask about CSIT performance topology connection status. Didn't get time; the meeting was mostly spent discussing the VIRL job.
 
** Sirshak: to add OS version to fd.io lab machines. -Done by somebody else.
 
** Sirshak: to add Porting and Tuning section. Check with Honnappa
 
** Sirshak: to track arm master build failure. - Damjan has sent a fix.
 
** Juraj: Access to fd.io lab. - Done.
 
** Khem: to create a Jira tkt to document automation task of CSIT. - Still Working on it.
 
** Khem: to reach out to Sanil (Huawei) regarding known Taishan problems with KVM. - No response from Sanil yet.
 
** Khem: BIOS patch for NUMA node numbering issue. - Khem to create LF RT tkt to do this in fd.io lab.
 
** Nitin: VPP-1064 Support multiple cache line sizes per architecture. - Still in discussion with Dave.
 
** Adarsh: openssl updates. VPP crashing.
 
  
 +
'''07/28/2020'''
 +
* Attendees
 +
** Honnappa Nagarahalli
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** They have finished collecting data with performance testing setup, and the mrr daily is resumed
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** Jieqiang will share the investigation report, but so far there are no apparent performance differences.
 +
*** VPP performance testing is running once a week.
 +
** VPP Path
 +
*** One ThunderX1 has faulty RAM. Will try to replace all the RAMs.
 +
*** The second ThunderX1 has IPMI problem, but SSH is working fine.
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in Nomad cluster.
 +
**** Faulty RAM on TX server is not fixed and yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will confirm with Dave W. if he will add this Jenkins job and if he requires any help - Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been upstreamed for review and merge.
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
*** Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
** N1SDP enablement. - Lijian
 +
*** Preparing patches to enable creating big tables on huge-pages
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed and are using csit testing to verify the patch.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
*** Focus on both non-encryption and encryption cases.
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
  
* fd.io lab
+
'''07/21/2020'''
** mcbin powering on ? Sirshak to create LF tkt. Reach out to Brian offline.
+
* Attendees
** Cavium-3 role. Make decision based on feedback Edk. Sirshak to check availability.  
+
** Honnappa Nagarahalli
** Sirshak to ask Brian to forward old LF tkt to JohnB.
+
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
 +
*** VPP performance testing is running once a week.
 +
** VPP Path
 +
*** Juraj will follow or create new vexxhost ticket to replace faulty RAM.
 +
**** 3x spare ThunderX servers are used for CI and included in the Nomad cluster. 1 debugging server for VPP Dev and 3 servers (2 TX and 1 TX2) are unreachable through IPMI, and one of them is reachable through SSH. IPMI unreachability is still being investigated by Vexx host. CI functionality is restored with the spare TX servers. The TX2 server is unreachable through IPMI and VPP device jobs are not running. The faulty RAM on the TX server is not fixed and is yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.
 +
**** The Dockerfile has been verified by Jieqiang, will send to Dave Wallace to use it for VPP Jenkins job.
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Arm has
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
*** Initial benchmarking with VPP hoststack on N1SDP was done: 29 Gb/s on N1SDP and 26 Gb/s on Haswell.
 +
*** Investigating vlib_timer and timer wheel in VPP.
 +
** Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
 +
** N1SDP enablement. - Lijian
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
**** Upstreamed; CSIT testing is being used to verify the patch.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
 +
** Plans
  
 +
 +
'''07/14/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community has started collecting performance data with these CSIT machines.
 +
** VPP Path
 +
*** Juraj will follow up on the existing vexxhost ticket or create a new one to replace the faulty RAM.
 +
**** Will probably use the 3 spare ThunderX1 servers as CI build servers/Nomad cluster.
 +
**** Two of the three ThunderX1 servers cannot be accessed.
 +
**** Spare ThunderX servers are used for CI and are included in the Nomad cluster. One debugging server for VPP development and three servers (2 TX and 1 TX2) are unreachable through IPMI; one of them is reachable through SSH. The IPMI unreachability is still being investigated by vexxhost. CI functionality has been restored with the spare TX servers. The TX2 server is unreachable through IPMI and VPP device jobs are not running. The faulty RAM on the TX server has not been fixed and has yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.
 +
**** Jieqiang has verified the Dockerfile and will send it to Dave Wallace for use in the VPP Jenkins job.
 +
**** Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
*** Investigating various numbers of rx_q_bufs & tx_q_bufs
 +
*** Investigating various vector sizes and their effect on throughput
 +
*** Benchmark and compare PMU counters between 4x and 2x loop unrolling on n1sdp
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Investigating using SPE counters to profile the ACL plugin bottleneck
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
  
** ARMv8 crypto patch from Sachin related to dpdk_plugin only.
** memcpy issue: going with memcpy and not hand-crafted memcpy.
** clang compilation: Sirshak to upstream clang-related changes; add all other aarch64 leads.
** Brian to use cache stashing result. Updates: No effect for VPP, but there is an improvement on the musdk sample application.
** VPP-1267 (Marvell dpdk patch mcbin): How to move forward based on Damjan's comments. Still discussing. Honnappa to provide some inputs next week.
** VPP-1276 (rpm issues aarch64): Not a priority. Status: No updates.
** VPP-1284: TLS corruption on aarch64: Status (after Sachin's suggestion): Resolved. Might have performance implications but currently the only possible solution. HN to look at this Jira card in order to talk to the compiler team if need be.
+
'''07/07/2020'''
+
* Attendees
+
** Govindarajan Mohandoss
+
** Juraj Linkes
+
** Jieqiang Wang
+
** Tina Tsou
+
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community has started collecting performance data with these CSIT machines.
 +
** VPP Path
 +
*** Juraj will follow up on the existing vexxhost ticket or create a new one to replace the faulty RAM.
 +
**** Will probably use the 3 spare ThunderX1 servers as CI build servers/Nomad cluster.
 +
**** Two of the three ThunderX1 servers cannot be accessed.
 +
**** Spare ThunderX servers are used for CI and are included in the Nomad cluster. One debugging server for VPP development and three servers (2 TX and 1 TX2) are unreachable through IPMI; one of them is reachable through SSH. The IPMI unreachability is still being investigated by vexxhost. CI functionality has been restored with the spare TX servers. The TX2 server is unreachable through IPMI and VPP device jobs are not running. The faulty RAM on the TX server has not been fixed and has yet to be debugged.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.
 +
**** Jieqiang has verified the Dockerfile and will send it to Dave Wallace for use in the VPP Jenkins job.
 +
**** Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
*** Investigating various numbers of rx_q_bufs & tx_q_bufs
 +
*** Investigating various vector sizes and their effect on throughput
 +
*** Benchmark and compare PMU counters between 4x and 2x loop unrolling on n1sdp
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Investigating using SPE counters to profile the ACL plugin bottleneck
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
  
 +
 +
'''06/30/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
** Honnappa Nagarahalli
 +
* General
 
* CSIT
** TG status in fd.io lab and internal Huawei Lab. - Sirshak to discuss with Maciek. Khem to create LF ticket.
** CSIT-1019 (timeout of PacketVerifier.RxQueue is not working): Done. (Upstreamed/merged?) Status: Merged.
** CSIT-1023 (Crypto Func Tests): VPP still crashing - Adarsh
** CSIT-1043 (Guest OS becomes unresponsive during CSIT): Sirshak tried pinning the VMs to physical CPUs but tests are still failing. Juraj to take over.
** CSIT-990 (buildroot package) Brian Status: build issue with grub.
** Juraj: Estimate on moving CSIT Functional tests to make test. Maciek's proposal does consider all the implications of letting go of VIRL, especially the parallelization VIRL offers.
+
** VPP Performance Test
+
*** VPP performance testing is running once a week.
+
*** Community has started collecting performance data with these CSIT machines.
+
** VPP Path
+
*** Juraj will follow up on the existing vexxhost ticket or create a new one to replace the faulty RAM.
+
**** Will probably use the 3 spare ThunderX1 servers as CI build servers/Nomad cluster.
 +
**** Two of the three ThunderX1 servers cannot be accessed.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.
 +
**** Jieqiang has verified the Dockerfile and will send it to Dave Wallace for use in the VPP Jenkins job.
 +
**** Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** IP4-rewrite refactor patch brings performance improvement especially with 10K flows
 +
*** Investigating various numbers of rx_q_bufs & tx_q_bufs
 +
*** Investigating various vector sizes and their effect on throughput
 +
*** Benchmark and compare PMU counters between 4x and 2x loop unrolling on n1sdp
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Investigating using SPE counters to profile the ACL plugin bottleneck
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
  
* Action Items - Next Week
** Sirshak: To create a LF tkt for mcbin
** Sirshak: Follow up on cavium-3.
** Sirshak: Upstream clang changes.
** Honnappa: Provide inputs on how to proceed with comments on Marvell dpdk patch.
** Honnappa: VPP-1284: To look at this patch to provide comments on performance implications of the fix
** Juraj: Estimate moving CSIT functional tests to make test.
** Sirshak: Discuss with Maciek and get a signoff for moving the x86 Hosts to arm rack.
** Khem: Create LF tkt for Performance Suite Topology Creation.
** Adarsh: Create a Jira to document Automation Task
** Khem: Follow up Sanil: Known taishan vm issues.
** Khem: LF tkt for Taishan BIOS updates.
** Nitin: VPP-1064 multiple cache line size patch.
** Adarsh: openssl updates.
+
'''06/23/2020'''
+
* Attendees
+
** Govindarajan Mohandoss
+
** Juraj Linkes
+
** Jieqiang Wang
+
** Tina Tsou
+
** Lijian Zhang
+
* General
+
* CSIT
+
** VPP Performance Test
+
*** VPP performance testing is running once a week.
+
*** Community has started collecting performance data with these CSIT machines.
+
** VPP Path
+
*** Juraj will follow up on the existing vexxhost ticket or create a new one to replace the faulty RAM.
 +
**** Two of the three ThunderX1 servers cannot be accessed.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the docker file upload.  
 +
**** Jieqiang has verified the Dockerfile and will send it to Dave Wallace for use in the VPP Jenkins job.
 +
**** Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.  
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** L3FWD status
 +
** CSIT status
 +
** EPIC plan
 +
*** SVE2 investigation in VPP;
 +
*** VPP hoststack TCP/CPS (Connections per Second) investigation;
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** Profiling with NMU-600 counters.
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput drops as the number of flows increases on N1SDP.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnels, IPsec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
  
'''5/22/2018'''
* Attendees
** Sirshak Das
** Stanislav Chlebec
** John Bromhead
** Sachin Saxena
** Khemendra Kumar
** Andy Wang
+
'''06/16/2020'''
 
* Attendees
+
** Govindarajan Mohandoss
+
** Juraj Linkes
+
** Jieqiang Wang
+
** Tina Tsou
+
** Lijian Zhang
+
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community has started collecting performance data with these CSIT machines.
 +
** VPP Path
 +
*** Juraj will follow up on the existing vexxhost ticket or create a new one to replace the faulty RAM.
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** Patch is merged.
 +
**** libssl.so is missing in dependencies in vpp Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the Dockerfile upload. The Dockerfile needs to be tested with a local VPP sandbox before uploading. The Dockerfile needs to be labelled by Dave Wallace to use it for the VPP Jenkins job.
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page sizes.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** Profiling with NMU-600 counters.
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput drops as the number of flows increases on N1SDP.
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''06/09/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VPP performance testing is running once a week.
 +
*** Community will collect performance data with these CSIT machines.
 +
*** IPSec tunnel configuration issue.
 +
**** Issue is resolved.
 +
**** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
***** Juraj to run the IPSec regression on Taishan server with the IPSec patch.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** Patch is merged.
 +
**** libssl.so is missing in dependencies in vpp Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the Docker file upload. The Docker file needs to be tested in a local VPP sandbox before uploading, and it needs to be labelled by Dave Wallace before it can be used for the VPP Jenkins job.
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
*** Two ThunderX2 information will be confirmed with FD.io CSIT lab admin. - Jieqiang
 +
*** Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** If vexxhost can collect the hardware, will ship the servers asap.
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** Profiling with NMU-600 counters.
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''06/02/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** IPSec tunnel configuration issue.
 +
**** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
***** Juraj to run the IPSec regression on Taishan server with the IPSec patch.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** libssl.so is missing in dependencies in vpp Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the Docker file upload. The Docker file needs to be tested in a local VPP sandbox before uploading, and it needs to be labelled by Dave Wallace before it can be used for the VPP Jenkins job.
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers.
 +
**** Internal patch is committed. Requires legal permission.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
*** 'make build-release CC=gcc' will override default clang-9 in vpp.
 +
** N1SDP enablement. - Lijian
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''05/26/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Juraj Linkes
 +
** Jieqiang Wang
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** IPSec tunnel configuration issue.
 +
**** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
***** Juraj to run the IPSec regression on Taishan server with the IPSec patch.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** libssl.so is missing in dependencies in vpp Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Questions on the Docker file upload. The Docker file needs to be tested in a local VPP sandbox before uploading, and it needs to be labelled by Dave Wallace before it can be used for the VPP Jenkins job.
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
*** Update the document with server information before shipping the servers. Jieqiang will set up a meeting with Juraj regarding this documentation.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
*** This change can be done once TX2 servers are shipped to FDIO lab.
 +
** Dave Wallace - Install Nomad service on those two servers - Juraj & Jieqiang
 +
*** Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
** N1SDP enablement. - Lijian
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
 
 +
'''05/19/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 
** Honnappa Nagarahalli
 +
** Juraj Linkes
 
** Tina Tsou
** Andrew Pinski
+
** Jieqiang Wang
** John Bromhead
+
** Lijian Zhang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** the other failure is related with VPP image on Arm, IPSec tunnel configuration issue.
 +
**** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
**** libssl.so is missing in dependencies in vpp Makefile.
 +
**** Committed internal code review to address the issue - https://gerrit.oss.arm.com/#/c/162878/
 +
**** using gcc-9.3 now.
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Vanessa Valderrama <vvalderrama@linuxfoundation.org>
 +
**** 'Dave Wallace' <dwallacelf@gmail.com>
 +
**** https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
* FD.io lab
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
** Ed Kern - Install nomad service in those two servers - Juraj & Jieqiang
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
*** Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
 +
** N1SDP enablement. - Lijian
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
*** Upstream the ACL patch for CSIT performance testing experiment.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
*** Basic IPsec functions are working. Will do benchmarking per CPU core.
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
 
 +
'''04/28/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 
** Juraj Linkes
** rkinsell
+
** Tina Tsou
** Nitin Saxena
+
** Jieqiang Wang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Two failures in performance testing
 +
**** One failure is related to the CSIT script: the NAT44 issue is a common one that also fails on x86.
 +
***** Has been fixed already.
 +
**** the other failure is related with VPP image on Arm, IPSec tunnel configuration issue.
 +
***** Also failing on x86. CSIT maintainer is trying to root cause the problem.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Jieqiang
 +
*** Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** gcc-9 is hard-coded and used, so compilation issue is gone.
 +
*** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Dean will schedule shipping these two TX2 servers to FD.io lab.
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
** Ed Kern - Install nomad service in those two servers - Juraj & Jieqiang
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** Resolve vectorized endianness conversion error in Mellanox RDMA driver.
 +
**** Patch (https://gerrit.fd.io/r/c/vpp/+/26950) is merged.
 +
*** To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
 +
*** Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
 +
** Resolve VPP compiling issue with clang-6.
 +
*** Patch (https://gerrit.fd.io/r/c/vpp/+/26949) is merged.
 +
** VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
 +
** N1SDP enablement. - Lijian
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is merged.
 +
**** https://gerrit.fd.io/r/c/vpp/+/26804
 +
*** IOMMU limitation issue is gone after upgrading the kernel and firmware
 +
**** Share kernel/fw upgrade version to Govind
 +
*** Investigate 4x loop unrolling performance degradation issue.
 +
*** Throughput performance drop as flow number increases in N1SDP. 
 +
** ACL optimization investigation on n1sdp - Govind
 +
*** Patch to remove redundant prefetches is committed - Govind
 +
*** Filed a confluence page to record the ACL investigation.
 +
** Trying to enable IPsec on the Arm platform. - Govind
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
  
* Action Items - Last Week
+
'''04/28/2020'''
** Sirshak: To update LF RT #54919 to follow up on cavium-2 status. - having troubles with login will sort it out today.
+
* Attendees
** Sirshak: Release Machine to EdK as soon as ThunderX is up: cavium-1 done cavium-2 still has issues with network connectivity.
+
** Govindarajan Mohandoss
** Sirshak: Status on new ThunderXs: Will be decided after talks with Maciek.
+
** Honnappa Nagarahalli
** Sirshak: VM unresponsive issue: No updates; didn't get time to try, will try this week.
+
** Juraj Linkes
** Sirshak: To ask about CSIT performance topology connection status. - TBD after call with Maciek.
+
** Tina Tsou
** Nitin: VPP-1064 (Patch rejected by Dave Barach) Discuss cross compilation with Sachin. (Separate or one unified Makefile). - No updates.
+
** Jieqiang Wang
** HN: memcpy benchmarking updates honnappa - 2 more tests to be done based on Ola's suggestion.  
+
** Arthur Marshall
** Adarsh openssl issues: Will communicate with Sachin to get this resolved. Made changes based on Sachin's suggestions; some issues remain to be resolved.
+
* General
** Adarsh preparing a sheet updated with his progress on CSIT. - Added to the google sheets.
+
* CSIT
 +
** VPP Performance Test
 +
*** Two failures in performance testing
 +
**** one failure is related with CSIT script, NAT44 is common issue, failing with x86 also.
 +
**** the other failure is related with VPP image on Arm, IPSec tunnel configuration issue.
 +
*** iommu_passthrough=1 does not make any differences on Taishan server - Lijian
 +
*** We cannot do kernel upgrade with Ubuntu-18.04.1/Ubuntu-18.04.2/Ubuntu-18.04.3/Ubuntu-18.04.4 on Taishan.
 +
**** For now, can the kernel of the Taishan server be left as it is (linux-4.15.0.54)? - Juraj
 +
**** One possible option/improvement is to port FD.io CSIT performance testing to some more advanced Arm servers, e.g., Ampere
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** Will send email to community about two options to resolve gcc-7 issue with CentOS-7
 +
***** 1. update gcc-7 requirement to gcc-8 in Makefile
 +
***** 2. remove gcc-7 limitation in Makefile, and have the user install gcc-8 manually
 +
*** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Update server information to CSIT documentation. - Juraj & Jieqiang
 +
** Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
 +
*** https://gerrit.oss.arm.com/#/c/160812/
 +
** Investigate bihash operations in L2 throughput are hot-spots
 +
*** To confirm with Damjan if he has plan to rewrite l2-nodes or not - Lijian
 +
*** To confirm CRC32 calculation is compiled with O2 or O3 options - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** iova_mode == VA not working issue is not root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP requires a limited memory space (less than 40 bits?).
 +
**** This issue will not be seen in the latest N1 Firmware. Upgrading to latest Firmware is pending.
 +
** Share L2/L3/ACL throughput wiz & wo L3 cache - Govind
 +
** Will try with L3 cache enabled to see if performance drop as flow number increasing issue is fixed or not. - Govind
 +
*** The degradation is seen even when L3 cache is enabled.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
**** Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss with Juraj to run the jenkins job on CentOS.
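The iova_mode == VA item above notes that VPP and DPDK use the virtual address as the IOVA, while the N1SDP IOMMU may accept only a limited address range. A minimal sketch of that mismatch follows. The 40-bit limit is the unconfirmed figure from the minutes, and the two sample addresses are hypothetical values chosen for illustration, not taken from any N1SDP log.

```python
# Assumption: the IOMMU accepts only addresses that fit in 40 bits.
# The minutes flag this limit as unconfirmed ("less than 40 bits?").
IOMMU_ADDR_BITS = 40


def fits_iommu(iova: int, bits: int = IOMMU_ADDR_BITS) -> bool:
    """Return True if the address can be represented in `bits` bits."""
    return 0 <= iova < (1 << bits)


# Hypothetical addresses for illustration only.
low_va = 0x0000_0010_0000_0000   # bit 36 set: fits in a 40-bit space
high_va = 0x0000_FFFF_8000_0000  # high 48-bit user-space address: does not fit

for name, addr in (("low_va", low_va), ("high_va", high_va)):
    print(f"{name}={addr:#x} usable as IOVA: {fits_iommu(addr)}")
```

If the limit holds, a VA-as-IOVA scheme would only work when the mapped virtual addresses happen to fall below it, which is consistent with the issue depending on firmware configuration.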
  
 +
'''04/21/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** iommu_passthrough=1 does not make any difference on the Taishan server - Lijian
 +
*** We cannot do a kernel upgrade with Ubuntu-18.04.1/Ubuntu-18.04.2/Ubuntu-18.04.3/Ubuntu-18.04.4 on Taishan.
 +
**** For now, can the kernel on the Taishan server be left as it is? Please confirm with Peter. - Juraj
 +
**** One possible option/improvement is to port FD.io CSIT performance testing to some more advanced Arm servers, e.g., Ampere
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** CentOS-8 is working fine. Will try CentOS-7 later.
 +
**** Is there any gcc version requirement in VPP official release?
 +
**** AES instructions in the VPP source code require a gcc version newer than gcc-8.
 +
**** 'make install-deps' fails with CentOS-7 on Arm.
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** The servers, Intel NICs, and Mellanox NICs work well so far.
 +
*** Root-causing the RDMA issue with Mellanox NIC.
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Investigate bihash operations, which are hot spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
** gcc-10 is not working so far.
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** The iova_mode == VA not-working issue has not been root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP requires a limited memory space (less than 40 bits?).
 +
**** This issue will not be seen in the latest N1 Firmware. Upgrading to latest Firmware is pending.
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop with an increasing number of flows is fixed. - Govind
 +
*** The degradation is seen even when L3 cache is enabled.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
**** Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss with Juraj to run the jenkins job on CentOS.
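For the 16K/64K page size item in the VPP Path notes above, it can help to confirm which page size a test system is actually running before interpreting 'make test' failures. The following is a small sketch using only the Python standard library on Linux; the helper name is illustrative and not part of VPP or CSIT.

```python
import os


def page_size_kib() -> int:
    """Return the kernel page size of the running system in KiB."""
    # SC_PAGE_SIZE is the standard sysconf name for the page size in bytes.
    return os.sysconf("SC_PAGE_SIZE") // 1024


print(f"Kernel page size: {page_size_kib()}K")
```

On an AArch64 kernel built with 16K or 64K pages this would report 16K or 64K respectively; on most default builds it reports 4K.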
  
* fd.io lab
+
'''04/14/2020'''
** cavium-2 follow up via LF #54919.
+
* Attendees
** Talk to Macek regarding TG physical placement on rack.
+
** Govindarajan Mohandoss
** Juraj : Needs access to fd.io lab. Tina to help Juraj with this.
+
** Honnappa Nagarahalli
** Juraj to send email to EdW to get access to fd.io lab.
+
** Juraj Linkes
** Sirshak to add OS version to fd.io lab machines.
+
** Tina Tsou
 +
** Jieqiang Wang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** After upgrading the kernel to version 72, the server cannot boot up normally, so it has been reverted to the previous version.
 +
**** Ubuntu-18.04 lts version is supposed to be kernel 4.15.72?
 +
**** Will try fresh install with local Taishan servers.
 +
***** Will try with Ubuntu-18.04.1/Ubuntu-18.04.2/Ubuntu-18.04.3/Ubuntu-18.04.4
 +
***** Will do fresh installation with Ubuntu-18.04.2 and then install kernel 4.15.72
 +
** VPP Path
 +
*** Try iommu_passthrough=1 on Taishan servers and see if it makes any difference - Lijian
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm Jenkins jobs - Juraj & Jieqiang
 +
**** CentOS-8 is working fine. Will try CentOS-7 later.
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Investigate bihash operations, which are hot spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** The iova_mode == VA not-working issue has not been root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP requires a limited memory space (less than 40 bits?).
 +
**** This issue will not be seen in the latest N1 Firmware. Upgrading to latest Firmware is pending.
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop with an increasing number of flows is fixed. - Govind
 +
*** The degradation is seen even when L3 cache is enabled.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
 +
*** After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
**** Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss with Juraj to run the jenkins job on CentOS.
  
 +
'''04/07/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** After upgrading the kernel to version 72, the server cannot boot up normally, so it has been reverted to the previous version.
 +
**** Ubuntu-18.04 lts version is supposed to be kernel 4.15.72?
 +
**** Will try cobbler with local Taishan servers, to try fresh install.
 +
***** Jieqiang will try fresh installation of kernel 4.15.72 in local Taishan through cobbler.
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
**** Jieqiang updated the Dockerfile locally to add CentOS as part of CI and is facing some issues.
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
**** Need 2 ThunderX2 servers to run the jobs for every VPP/CSIT patch submission instead of every half hour with a new VPP build. The current ThunderX2 server doesn't respond when the jobs are requested to run for every patch submission. No voting rights (+1 from CI) for VPP device suite.
 +
* FD.io lab
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 
* VPP
 
** HN->Nitin: Stick with memcpy. Nitin's concern: SIMD unit being idle with new GCC. Feedback from the Arm compiler team is that vector instructions don't perform as expected on many platforms. 1 ns better (dpdk_input node) if using SIMD memcpy on ThunderX. Nitin to try using restricted on non-SIMD memcpy.
+
** Vectorization
** 1019: CSIT. Py-lint issues. Patch submitted. Khem to merge with Lucian's Patch.
+
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
** 1023: Khem, Adarsh to talk to Sachin to resolve openssl issue. - Sachin suggested some config changes resulted in VPP being unstable. Still working it out.
+
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
** 1043: No updates. Sirshak to investigate this and Khem to reach out to Sanil regarding known Taishan problems with KVM.
+
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
** 990: Brian Updates - Sirshak to get status offline.  
+
***** These patches are kept in backlog for now.
** 1267: l3fwd performance tuning: Status on Marvell patch: - No updates. Nitin to submit his modified patch with -2.
+
** Investigate bihash operations, which are hot spots in L2 throughput
** VPP-1276: Sachin facing issues with building rpm. - Any change in status? No updates. Low priority for Sachin. Needs help.
+
*** Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
** VPP-1284: TLS corruption: Dynamic linking related to Thread local storage. Logs recorded with this tkt.
+
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
** Sirshak to add Porting and Tuning section.
+
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
** Sirshak to track arm master build failure.
+
** N1SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** The iova_mode == VA not-working issue has not been root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP requires a limited memory space (less than 40 bits?).
 +
**** This issue will not be seen in the latest N1 Firmware. Upgrading to latest Firmware is pending.
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop with an increasing number of flows is fixed. - Govind
 +
*** The degradation is seen even when L3 cache is enabled.
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
 +
**** Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian to discuss with Juraj to run the jenkins job on CentOS.
 +
 
 +
'''03/31/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
** Michaela Tahiri
 +
* General
 
* CSIT
 
** Adarsh openssl issues:  
+
** VPP Performance Test
** Performance Testing Khem : NUMA node numbering issue. Last Update: Still working internally. Status: Internal patch for BIOS.
+
*** After upgrading the kernel to version 72, the server cannot boot up normally, so it has been reverted to the previous version.
** Khem: to create a Jira tkt to document automation task of CSIT.
+
**** Ubuntu-18.04 lts version is supposed to be kernel 4.15.72?
** Khem : trex installation- Having x86 TG internally. Any luck ?
+
**** Will try cobbler with local Taishan servers, to try fresh install.
** Brian to use cache stashing result. Updates:  
+
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** https://docs.fd.io/csit/master/trending/introduction/failures.html#n-tsh
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
 +
*** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-tsh/161/archives/log.html.gz
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** Investigate bihash operations, which are hot spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** The iova_mode == VA not-working issue has not been root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP requires a limited memory space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop with an increasing number of flows is fixed. - Govind
 +
** Trying to make IPsec enabled with Arm platform. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
  
* Action Items - Next Week
+
'''03/24/2020'''
** Sirshak: To update LF RT #54919 to follow up on cavium-2 status.
+
* Attendees
** Sirshak: Release Machine to EdK as soon as ThunderX is up.
+
** Govindarajan Mohandoss
** Sirshak: Status on new ThunderXs: Will be decided after talks with Maciek.
+
** Honnappa Nagarahalli
** Sirshak: vm unresponsive issue: No updates; didn't get time to try, will try this week.
+
** Lijian Zhang
** Sirshak: To ask about CSIT performance topology connection status.
+
** Juraj Linkes
** Sirshak: to add OS version to fd.io lab machines.
+
** Tina Tsou
** Sirshak: to add Porting and Tuning section.
+
** Jieqiang Wang
** Sirshak: to track arm master build failure.
+
** Michaela Tahiri
** Juraj: Access to fd.io lab.
+
* General
** Nitin: VPP-1064 Support multiple cache line sizes per architecture.
+
* CSIT
** HN: memcpy benchmarking updates honnappa - 2 more tests to be done based on Ola's suggestion.  
+
** VPP Performance Test
** Adarsh openssl updates
+
*** After upgrading the kernel to version 72, the server cannot boot up normally, so it has been reverted to the previous version.
** Khem: to create a Jira tkt to document automation task of CSIT.
+
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
***** make build/build-release TARGET_PLATFORM=n1sdp  // for n1sdp cross compiling
 +
***** make build/build-release  // for generic vpp image
 +
***** make build/build-release TARGET_PLATFORM=native  // for native vpp image
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** Investigate bihash operations, which are hot spots in L2 throughput
 +
*** Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
 +
*** Confirm whether the CRC32 calculation is compiled with the O2 or O3 option - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** N1SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** The iova_mode == VA not-working issue has not been root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP requires a limited memory space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Will try with L3 cache enabled to see whether the performance drop with an increasing number of flows is fixed. - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
  
 +
'''03/17/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** After upgrading the kernel to version 72, the server cannot boot up normally, so it has been reverted to the previous version.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** The Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - Usage of MAP is recorded in confluence
 +
*** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Patch is upstreamed for community review
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Confirm if community agrees with patch - Lijian
 +
*** Check how DPDK is detecting numa-id for a specific NIC device - Lijian
 +
** Investigate bihash operations in L2 throughput are hot-spots
 +
*** To confirm with Damjan if he has plan to rewrite l2-nodes or not - Lijian
 +
*** To confirm CRC32 calculation is compiled with O2 or O3 options - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
**** Sending Govind the steps on installing GCC-9.2.0
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
**** Will sync up with James Yang about cache line fill buffers
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% macro-benchmarking by adding prefetches to adj table on ThunderX2
 +
** iova_mode == VA not working issue is not root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP seems to require a limited address space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
 +
** Plans
 +
*** N1SDP performance investigation and improvement - Planned - Lijian
 +
*** ACL plugin investigation - Planned - Govind & Lijian
 +
*** IPsec investigation - Indicative - Govind
 +
*** Lockless data-plane investigation by Govind in backlog
 +
*** Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
  
'''5/15/2018'''
+
'''03/10/2020'''
 
* Attendees
 
** Sirshak Das
+
** Govindarajan Mohandoss
** Stanislav Chlebec
+
** Sachin Saxena
+
** Khemendra Kumar
+
** Andy Wang
+
 
** Honnappa Nagarahalli
 
 +
** Lijian Zhang
 +
** Juraj Linkes
 
** Tina Tsou
 
** Andrew Pinski
+
** Jieqiang Wang
** John Bromhead
+
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** After upgrading the kernel to version 72, the server cannot boot up normally, so it has been reverted to the previous version.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two more ThunderX2 servers have just been ordered and are expected to arrive in the Arm lab in April.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - Usage of MAP is recorded in confluence
 +
*** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Check if detecting the source of SIGPROF is possible - Govind
 +
*** Confirm with Community about the possible solutions to this issue - Lijian
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Confirm if community agrees with patch - Lijian
 +
*** Check how DPDK is detecting numa-id for a specific NIC device - Lijian
 +
** Investigate bihash operations in L2 throughput are hot-spots
 +
*** To confirm with Damjan if he has plan to rewrite l2-nodes or not - Lijian
 +
*** To confirm CRC32 calculation is compiled with O2 or O3 options - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
*** Patch is upstreamed for code review - https://gerrit.fd.io/r/c/vpp/+/25259
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
**** Sending Govind the steps on installing GCC-9.2.0
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
**** Will sync up with James Yang about cache line fill buffers
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% macro-benchmarking by adding prefetches to adj table on ThunderX2
 +
** iova_mode == VA not working issue is not root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP seems to require a limited address space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
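The iova_mode item above can be illustrated with a small check. This is only a sketch under assumptions: the minutes tentatively suggest that the N1SDP IOMMU accepts addresses narrower than 40 bits, so the 40-bit default, the helper names and the sample addresses below are all hypothetical and are not taken from VPP or DPDK.

```python
# Sketch only: checks whether a virtual address, reused as an IOVA,
# fits in an assumed IOMMU address width. The 40-bit default is an
# assumption based on the open question in the minutes.

def fits_iova_width(addr: int, width_bits: int = 40) -> bool:
    """Return True if addr is representable in width_bits bits."""
    if addr < 0:
        raise ValueError("address must be non-negative")
    return addr < (1 << width_bits)


def first_unmappable(addrs, width_bits: int = 40):
    """Return the first address that does not fit, or None if all fit."""
    for addr in addrs:
        if not fits_iova_width(addr, width_bits):
            return addr
    return None


if __name__ == "__main__":
    # Hypothetical addresses: a low one, and one typical of a 48-bit
    # user-space mapping.
    samples = [0x1000_0000, 0xFFFF_B7E0_0000]
    bad = first_unmappable(samples)
    print("all fit" if bad is None else f"does not fit: {bad:#x}")
```

If the limit turns out to be real, a virtual address from a 48-bit address space would fail such a check when used directly as an IOVA, which is consistent with iova_mode == VA not working while PA-based mapping does.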
 +
 
 +
'''03/03/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 
** Juraj Linkes
 
** rkinsell
+
** Tina Tsou
** Nitin Saxena
+
** Jieqiang Wang
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Investigate Ubuntu-20.04 on Arm servers - Juraj & Jieqiang
 +
*** Investigate adding CentOS on Arm to Jenkins jobs - Juraj & Jieqiang
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** The current ThunderX2 in Arm lab are pre-production servers.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - Usage of MAP is recorded in confluence
 +
*** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate bihash operations in L2 throughput are hot-spots
 +
*** To confirm with Damjan if he has plan to rewrite l2-nodes or not - Lijian
 +
*** To confirm CRC32 calculation is compiled with O2 or O3 options - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
*** Patch is upstreamed for code review - https://gerrit.fd.io/r/c/vpp/+/25259
 +
** Investigating memory copy in ip4-rewrite on ThunderX2 - Govind
 +
*** Check the assembly code with other Arm CPU also.
 +
*** Send Govind the memory copy with fixed length.
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
**** Sending Govind the steps on installing GCC-9.2.0
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
**** Will sync up with James Yang about cache line fill buffers
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% macro-benchmarking by adding prefetches to adj table on ThunderX2
 +
** iova_mode == VA not working issue is not root-caused
 +
*** DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
 +
*** However, the IOMMU on N1SDP seems to require a limited address space (less than 40 bits?).
 +
** Share L2/L3/ACL throughput with and without L3 cache - Govind
 +
** Create Confluence page to record all the performance benchmarking data - Lijian
  
* Action Items - Last Week
+
'''02/25/2020'''
** Nitin: Run a VPP performance test to understand if the memcpy neon version provides any benefits. - Able to run with l3fwd test case. Gives better numbers.
+
* Attendees
** Sirshak: Create a higher LF ticket so that it is easier for Trishan/Acton/Venessa/Mohammed to follow up on bringing up ThunderX/mcbin - Not created yet, as I think we are close to solving the issue. If it's not solved after today's call, will create the ticket.
+
** Govindarajan Mohandoss
** Nitin: Start email discussion with Dave to address the creation of a single makefile for all ARMv8 devices. Still understanding how cross compilation works. Communicating with Sachin.
+
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
** Jieqiang Wang
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Govind will talk with George Zhao about which Taishan firmware version addresses the Meltdown issue.
 +
*** Huawei is investigating which Taishan server firmware version addresses the Meltdown issue and will update us soon.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending wiki: https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
*** We are required to provide justification and use case for cross-compilation for VPP on Arm - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and externally with NXP.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** The current ThunderX2 in Arm lab are pre-production servers.
 +
*** We are about to purchase two official ThunderX2 servers in market.
 +
*** Raise the budget requirement from CE-OSS - Dean & Honnappa
 +
*** Check the ThunderX2 configurations required - Govind & Juraj
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - Usage of MAP is recorded in confluence
 +
*** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate bihash operations in L2 throughput are hot-spots
 +
*** To confirm with Damjan if he has plan to rewrite l2-nodes or not - Lijian
 +
*** To confirm CRC32 calculation is compiled with O2 or O3 options - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
*** Patch is upstreamed for code review - https://gerrit.fd.io/r/c/vpp/+/25259
 +
** Investigating memory copy in ip4-rewrite on ThunderX2 - Govind
 +
*** Check the assembly code with other Arm CPU also.
 +
*** Send Govind the memory copy with fixed length.
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
**** Sending Govind the steps on installing GCC-9.2.0
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
**** Will sync up with James Yang about cache line fill buffers
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% macro-benchmarking by adding prefetches to adj table on ThunderX2
  
* New Joinees
 
** Stanislav Chlebec - pantheon
 
  
* fd.io lab
+
'''02/18/2020'''
** Follow up on ThunderX to getting mgmt IP - IP addresses are assigned, but are not up yet.- Have a call today to discuss this with Mohammed
+
* Attendees
** USB to Ethernet Question: Andrew: shows up as Ethernet interface.
+
** Govindarajan Mohandoss
** Release Machine to EdK as soon as ThunderX is up. - Sirshak to set mgmt IP and handover the machine.
+
** Honnappa Nagarahalli
** Cavium has shipped more machines as well - Delivered a week back. Tina to follow up with Trishan: 2 delivered. Sirshak to ask in today's meeting for status on new ThunderX.
+
** Lijian Zhang
** See the Taishan setup for any VM issue. - Sirshak is trying to reproduce the issue. - Reproduced still debugging.
+
** Juraj Linkes
** Khemendra : Topology is correct. Sirshak to ask about CSIT performance topology connection status.
+
** Tina Tsou
** Khemendra: Intel NIC to be used or Mellanox. HN: Initially use Intel, later move to Mellanox.
+
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VM-VHost test failing on 3n-tsh server.
 +
**** Issue with testpmd failure in VM has been resolved and merged.
 +
*** Govind will talk with George about which Taishan firmware version addresses the Meltdown issue.
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
*** Will discuss the cross-compilation with QEMU emulation solution in the monthly VPP call tomorrow - Juraj
 +
**** Govind will lead the cross compilation justification discussion internally and with NXP.
 +
*** VPP crash issue on Taishan server is resolved and the patch is merged.
 +
**** ThunderX2 has the same issue and has been resolved also.
 +
** VPP Device
 +
*** Issue of huge pages running out has been resolved by resetting the servers.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Customer engineers claim ThunderX2 does not support the Intel i40e NIC, which seems incorrect.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 
* VPP
 
** VPP-1064: Dave Barach rejected the patch based on the solution Damjan and Nitin had decided upon, on the grounds that the current approach breaks cross compilation. - NXP has upstreamed the DPAA2 patch, which uses a separate segment makefile (dpaa.mk) for DPAA2. NXP does cross compilation most of the time. The approach could be that all platforms create a segment makefile and combine all of them into a single ARMv8 segment makefile. - Nitin is still discussing cross compilation with Sachin.
+
** Vectorization
** One solution suggested was creating a platform specific Makefile for ThunderX - Any Decisions - Same as above.
+
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
** memcpy benchmarking updates honnappa - 2 more tests to be done based on Ola's suggestion. Nitin tested with restrict.
+
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
** 1019: No update. Few rough edges to clean up.
+
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
** 1021: Is it Closed ? Closed.
+
** MAP with VPP - error is resolved. Sort of working. Record the details.
** 1023: migrated to openssl using DPDK manual but facing failed TCs - openSSL is integrated in his local environment - VPP not stable in his environment - Updated in the ticket. Status: Aadarsh still trying to get help from community. Khem, Aadarsh to talk to Sachin regarding openssl issues.
+
*** Usage of MAP is recorded in confluence
** 1043: No updates. Sirshak to investigate this.
+
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
** 990: Brian Updates:
+
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
** 1267: l3fwd performance tuning: Marvell to upstream a patch to enable dpdk on mcbin by making changes to dpdk plugin in vpp. Updates: natalie sent a email. Working on upstreaming changes to VPP for dpdk_plugin. Working on comparing musdk vs dpdk.  
+
**** Patch is updated by adding more comments. - Jieqiang
** Auto-detection of memory channels: Startup conf solution decided. Updates: No updates; not a priority now. Bug raised by Nitin.
+
** Benchmarking AVF drivers on Arm servers - Jieqiang
** Sachin is facing issues with build rpm, currently on 1801; will open a Jira ticket if issues persist with 1804. Updates: Jira VPP-1276 to track this issue.
+
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate bihash operations in L2 throughput are hot-spots
 +
*** To confirm with Damjan if he has plan to rewrite l2-nodes or not - Lijian
 +
*** To confirm CRC32 calculation is compiled with O2 or O3 options - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
*** Patch is ready for code review.
 +
** Investigating memory in ip4-rewrite on ThunderX2 - Govind
 +
*** Check the assembly with other Arm CPU also.
 +
** N1 SDP enablement. - Lijian
 +
*** GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
 +
*** Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
 +
*** It seems dual loop-unrolling gives better performance over quad-loop-unrolling.
 +
** Investigating mtrie data structure on control-plane. - Govind
 +
*** 3% macro-benchmarking by adding prefetches to adj table on ThunderX2
 +
 
 +
 
 +
'''02/11/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
* General
 
* CSIT
 
** Adarsh openssl issues: Will communicate with Sachin to get this resolved
+
** VPP Performance Test
** Adarsh preparing a sheet updated with his progress on CSIT.
+
*** VM-VHost test failing on 3n-tsh server.
** Performance Testing Khem : NUMA node numbering issue Updates: No updates. Still working internally.
+
*** Tina to confirm which BIOS version on the Taishan server supports Meltdown.
** Khem facing issues with trex installation on ARM hence he will try getting a x86 machine as TG. Updates: Still working on getting an x86 in internal lab.
+
**** NICs cannot be bound to the VFIO_PCI driver in the VM, which caused the failure.
** brian to use cache stashing result. Updates:  
+
**** Will try iommu-passthrough=0/1 - Juraj
 +
*** Will confirm with Joyce about this issue - Lijian
 +
*** Please double-check whether there are any failures in the weekly test - Juraj
 +
*** https://docs.fd.io/csit/rls1908/report/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** After fixing a software bug, VPP boots up normally with 16K/64K page size.
 +
**** There are about 4-5 test failures in 'make test' when the system is configured with 16K/64K page size - Lijian
 +
*** Will discuss the cross-compilation with QEMU emulation solution in the monthly VPP call tomorrow - Juraj
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in Arm lab.
 +
*** All Intel and Mellanox NICs have been verified on ThunderX2-02.
 +
*** Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Customer engineers claim ThunderX2 does not support the Intel i40e NIC, which seems incorrect.
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
* VPP
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) vs VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate bihash operations in L2 throughput are hot-spots
 +
*** To confirm with Damjan if he has plan to rewrite l2-nodes or not - Lijian
 +
*** To confirm CRC32 calculation is compiled with O2 or O3 options - Lijian
 +
** Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with team. - Lijian
 +
** Fix Makefile issue recently introduced on Arm machine - Jieqiang
 +
** Investigating memory in ip4-rewrite on ThunderX2 - Govind
 +
*** Check the assembly with other Arm CPU also.
  
* Action Items - Next Week
 
** Sirshak: To update LF RT #54919 to follow up on cavium-2 status. - having troubles with login will sort it out today.
 
** Sirshak: Release Machine to EdK as soon as ThunderX is up: cavium-1 done cavium-2 still has issues with network connectivity.
 
** Sirshak: Status on new ThunderXs: Will be decided after talks with Maciek.
 
** Sirshak: VM unresponsive issue: No updates; didn't get time to try, will try this week.
 
** Sirshak: To ask about CSIT performance topology connection status. - TBD after call with Maciek.
 
** Nitin: VPP-1064 (Patch rejected by Dave Barach) Discuss cross compilation with Sachin. (Separate or one unified Makefile).
 
** HN: memcpy benchmarking updates honnappa - 2 more tests to be done based on Ola's suggestion.
 
** Adarsh openssl issues: Will communicate with Sachin to get this resolved
 
** Adarsh preparing a sheet updated with his progress on CSIT.
 
  
   
+
'''02/04/2020'''
 +
* Attendees
 +
** Govindarajan Mohandoss
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Juraj Linkes
 +
** Tina Tsou
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VM-VHost test failing on 3n-tsh server.
 +
*** Govind to send background details about Taishan kernel upgrade to Tina to confirm with George Zhao.
 +
**** The VM-VHost test cases have never passed before as per the previous logs in Taishan server.
 +
**** Issue is not reproducible locally - VHost/Virtual Ethernet interface creation passes in Taishan server in local setup.
 +
**** Next Steps: Follow up with Peter Mikus to debug the issue in Taishan server in CSIT lab.
 +
***** Build a local test setup to run the Testpmd application in VM.
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** 'show pci' command will cause crash issue, which affects all performance tests on Taishan server only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change performance job to be running daily from current weekly running. - Juraj
 +
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
***** Will send email to CSIT-dev on how to avoid the similar case/issue.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
**** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** Will investigate 4-5 test failures in 'make test' - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one of them, ThunderX2-01.
 +
*** Cables for intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with Mellanox NIC on ThunderX2-02 firstly - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab. - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently there is only one ThunderX2 in the lab).
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08 APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7) Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
 +
*** Will try one patch to enable N1SDP board.
 +
*** Please try AVF with Mcbin if possible.
 +
** Investigate bihash operations that are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Lock-free allocation/free give 7%-11% improvement on ThunderX2, but no improvement on x86 and CortexA72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
 +
** EPIC for next quarter:
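
The 20.01 release plan recorded under "Align Arm patches with VPP release plan" is defined by fixed 7-day offsets (F0+7, RC1+7, RC2+7). A minimal Python sketch of that arithmetic follows; the function name <code>release_milestones</code> is hypothetical and is not part of any VPP tooling, and only the F0 date and the 7-day step are taken from the minutes.

```python
from datetime import date, timedelta

# F0 date and the 7-day step come from the release plan in the minutes.
F0 = date(2020, 1, 8)
STEP = timedelta(days=7)


def release_milestones(f0: date, step: timedelta = STEP) -> dict:
    """Derive RC1, RC2 and the formal release date from the F0 date."""
    rc1 = f0 + step        # RC1 = F0 + 7
    rc2 = rc1 + step       # RC2 = RC1 + 7
    release = rc2 + step   # Formal release = RC2 + 7
    return {"F0": f0, "RC1": rc1, "RC2": rc2, "Release": release}


if __name__ == "__main__":
    for name, day in release_milestones(F0).items():
        print(f"{name}: {day.isoformat()}")
```

Arm patches that carry new features would need to land before the F0 date computed here, since only low-risk changes are accepted on the main branch after that point.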
  
'''5/8/2018'''
+
'''01/28/2020'''
 
* Attendees
 
 +
** Govindarajan Mohandoss
 
** Honnappa Nagarahalli
 
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** VM-VHost test failing on 3n-tsh server.
 +
**** The VM-VHost test cases have never passed before as per the previous logs in Taishan server.
 +
**** Issue is not reproducible locally - VHost/Virtual Ethernet interface creation passes in Taishan server in local setup.
 +
**** Next Steps: Follow up with Peter Mikus to debug the issue in Taishan server in CSIT lab.
 +
***** Build a local test setup to run the Testpmd application in VM.
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on the Taishan server only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
***** Will send email to CSIT-dev on how to avoid the similar case/issue.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
**** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** Will investigate 4-5 test failures in 'make test' - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one of them, ThunderX2-01.
 +
*** Cables for intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with Mellanox NIC on ThunderX2-02 firstly - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab. - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently there is only one ThunderX2 in the lab).
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08 APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
 +
*** Will try one patch to enable N1SDP board.
 +
*** Please try AVF with Mcbin if possible.
 +
** Investigate bihash operations that are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Lock-free allocation/free give 7%-11% improvement on ThunderX2, but no improvement on x86 and CortexA72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
 +
** EPIC for next quarter:
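
The AVF benchmarking item quotes two throughput figures per driver on ThunderX2 (VPP+DPDK 5.18/5.21 Mpps, VPP+AVF 8.39/8.38 Mpps). The minutes do not say what the two figures in each pair represent, so the Python sketch below simply pairs them by position, which is an assumption. The helper name <code>relative_gain</code> is hypothetical.

```python
def relative_gain(baseline_mpps: float, candidate_mpps: float) -> float:
    """Return the fractional throughput gain of candidate over baseline."""
    if baseline_mpps <= 0:
        raise ValueError("baseline throughput must be positive")
    return (candidate_mpps - baseline_mpps) / baseline_mpps


# Figures copied from the minutes; positional pairing is an assumption.
DPDK_MPPS = (5.18, 5.21)
AVF_MPPS = (8.39, 8.38)

gains = [relative_gain(d, a) for d, a in zip(DPDK_MPPS, AVF_MPPS)]

if __name__ == "__main__":
    for dpdk, avf, gain in zip(DPDK_MPPS, AVF_MPPS, gains):
        print(f"DPDK {dpdk} Mpps -> AVF {avf} Mpps: {gain:+.1%}")
```

With these numbers the native AVF driver comes out roughly 61 to 62 percent ahead of the DPDK path, which is why the follow-up question about AVF coverage in the CSIT performance tests matters.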
 +
 +
'''01/21/2020'''
 +
* Attendees
 
** Tina Tsou
 
** Andrew Pinski
+
** Govindarajan Mohandoss
** Natalie Samsonov
+
** Honnappa Nagarahalli
** John Bromhead
+
** Michaela Tahiri
** Sachin Saxena
+
* General
** Khemendra Kumar
+
* CSIT
** Andy Wang
+
** VPP Performance Test
 +
*** VM-VHost test failing on 3n-tsh server.
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on the Taishan server only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
***** Will send email to CSIT-dev on how to avoid the similar case/issue.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
**** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** Will investigate 4-5 test failures in 'make test' - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one of them, ThunderX2-01.
 +
*** Cables for intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with Mellanox NIC on ThunderX2-02 firstly - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab. - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently there is only one ThunderX2 in the lab).
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08 APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
 +
*** Will try one patch to enable N1SDP board.
 +
*** Please try AVF with Mcbin if possible.
 +
** Investigate bihash operations that are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Lock-free allocation/free give 7%-11% improvement on ThunderX2, but no improvement on x86 and CortexA72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
 +
** EPIC for next quarter:
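
The bihash lockless item describes a two-step plan: route every update through a working copy, then let look-ups run without a lock. The Python sketch below illustrates only that general copy-then-publish pattern. It is not VPP's bihash code, the class <code>CopyOnWriteTable</code> is hypothetical, and it assumes CPython, where rebinding an attribute is a single reference swap and garbage collection stands in for the RCU grace period that a C implementation would need before freeing the old copy.

```python
import threading


class CopyOnWriteTable:
    """Illustrative table: writers copy and publish, readers never lock."""

    def __init__(self):
        self._table = {}                     # currently published snapshot
        self._write_lock = threading.Lock()  # serializes writers only

    def lookup(self, key, default=None):
        # Readers take one reference to the published snapshot and use it.
        snapshot = self._table
        return snapshot.get(key, default)

    def update(self, key, value):
        with self._write_lock:
            working = dict(self._table)  # make a working copy
            working[key] = value         # modify the copy, not the original
            self._table = working        # publish with one reference swap

    def delete(self, key):
        with self._write_lock:
            working = dict(self._table)
            working.pop(key, None)
            self._table = working


if __name__ == "__main__":
    table = CopyOnWriteTable()
    table.update("00:11:22:33:44:55", 3)
    print(table.lookup("00:11:22:33:44:55"))
```

A reader that picked up the old snapshot keeps seeing a consistent table while a writer publishes a new one. The cost is a full copy per update, so the approach suits look-up-heavy workloads.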
 +
 
 +
 
 +
'''01/14/2020'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 
** Juraj Linkes
 
** rkinsell
+
** Christian Hopps
** Nitin Saxena
+
** Dean Arnold
** Ed Kern
+
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on the Taishan server only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
***** Will send email to CSIT-dev on how to avoid the similar case/issue.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
**** By fixing software bug, VPP can boot up normally with 16K/64K page size.
 +
**** Will investigate 4-5 test failures in 'make test' - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one of them, ThunderX2-01.
 +
*** Cables for intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with Mellanox NIC on ThunderX2-02 firstly - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab. - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently there is only one ThunderX2 in the lab).
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08 APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
*** Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
 +
*** Will try one patch to enable N1SDP board.
 +
*** Please try AVF with Mcbin if possible.
 +
** Investigate bihash operations that are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Lock-free allocation/free give 7%-11% improvement on ThunderX2, but no improvement on x86 and CortexA72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
 +
** EPIC for next quarter:
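
The cross-compilation notes list "proper architecture string included in the DEB package name" as open packaging work. Debian names the 64-bit Arm architecture <code>arm64</code>, while the kernel and compilers report <code>aarch64</code>, and binary package files follow the <code>name_version_arch.deb</code> convention. The Python sketch below shows that mapping only. It does not reproduce what the patch under review does; the helper <code>deb_filename</code> and the version string in the example are hypothetical.

```python
# Machine name (as reported by `uname -m`) to Debian architecture name.
DEB_ARCH = {
    "x86_64": "amd64",
    "aarch64": "arm64",
}


def deb_filename(package: str, version: str, machine: str) -> str:
    """Build a Debian-style file name: <package>_<version>_<arch>.deb."""
    try:
        arch = DEB_ARCH[machine]
    except KeyError:
        raise ValueError(f"no Debian architecture known for {machine!r}") from None
    return f"{package}_{version}_{arch}.deb"


if __name__ == "__main__":
    # "20.01" is only an illustrative version string.
    print(deb_filename("vpp", "20.01", "aarch64"))
```

When cross-compiling, the machine name has to come from the target platform rather than from the build host, otherwise an x86 build host would label an aarch64 package as <code>amd64</code>.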
  
* Action Items - Last Week
+
'''01/07/2020'''
** Sirshak: Follow up with Mohammed regarding ThunderX mgmt connectivity and mcbin - IP addresses allocated cavium-2 has IPMI connectivity but console still hanging. cavium-1,3 - Not able to connect to IPMI. - Create a higher LF ticket so that it is easier for Trishan/Acton/Venessa/Mohammed to follow up.
+
* Attendees
** Sirshak to hold a call with Khem and Adarsh to understand the vm_vhost issue caused by nested VMs - Contact established, still working on analyzing the setup.
+
** Tina Tsou
** Sirshak to create consolidated ARM ecosystem xls to reflect CSIT effort. (Need to add the link to the excel sheet to AArch64 page) - Not Done will do it next week.
+
** Honnappa Nagarahalli
** Honnappa: memcpy benchmarking - Micro benchmarks run on mcbin, qualcomm - vector Load/Store usually go to the LSU unit
+
** Lijian Zhang
** Brian : CSIT-990(buildroot) - Nitin ran on mcbin, it is failing at a different place - Brian to continue next week
+
** Jieqiang Wang
** Brian to publish a pictorial representation of rx queues and tx queues in multicore case for mcbin. - Moved to next week
+
** Jason Zhang
** Khem to analyze make test failure in Taishan - 1804 - Tested with the latest code (make test), all test cases passing
+
** Juraj Linkes
** ARM - For TG for deciding connectivity - MCBin and Taishan - Sirshak/Brian working on it.
+
** Christian Hopps
** Sirshak/Brian to recheck validity of ASLR issue. - Not Done. Next Week.
+
** Dean Arnold
 +
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on Taishan servers only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Patch to resolve "show pci" crash issue is merged. Will ask CSIT team to remove the workaround. - Lijian
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
***** Will send email to CSIT-dev on how to avoid the similar case/issue.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
*** Trying to update kernel to 64K page size on CentOS - No update - Lijian
 +
**** VPP can boot up normally with 16K/64K page size. Will investigate 4-5 test failures in 'make test' - Lijian
 +
**** Will try with CentOS 8 which seems to be working fine with 64K page size.
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Cables for intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with Mellanox NIC on ThunderX2-02 firstly - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7) Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** Will update the patch to ignore the SIGPROF signal - Jieqiang
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
*** VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
 +
*** Check whether the performance tests include the AVF driver.
 +
** AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Lock-free allocation/free gives a 7%-11% improvement on ThunderX2, but no improvement on x86 and Cortex-A72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** EPIC for next quarter:
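
The VPP+DPDK versus VPP+AVF figures recorded above for ThunderX2 can be turned into a relative gain with simple arithmetic. The sketch below is only illustrative: the minutes do not say what the two figures per configuration represent, so pairing them in order is an assumption, and the helper name <code>relative_gain</code> is hypothetical.

```python
# Throughput figures quoted in the minutes, in Mpps, on ThunderX2.
# Assumption: the first DPDK figure pairs with the first AVF figure, and so on.
dpdk_mpps = (5.18, 5.21)
avf_mpps = (8.39, 8.38)


def relative_gain(baseline, candidate):
    """Return the gain of candidate over baseline as a percentage."""
    return (candidate - baseline) / baseline * 100.0


# One gain value per assumed pair of measurements.
gains = [relative_gain(d, a) for d, a in zip(dpdk_mpps, avf_mpps)]

for d, a, g in zip(dpdk_mpps, avf_mpps, gains):
    print(f"DPDK {d:.2f} Mpps -> AVF {a:.2f} Mpps: +{g:.1f}%")
```

Under that pairing assumption, both gains come out at a little over 60%.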
  
* New Joinees
 
** Yuval Caduri - from Marvell responsible for MUSDK driver - packet processor 8K chips
 
** Natalie - responsible for network PMD DPDK driver
 
** Dmitri Epshtein - crypto driver expert
 
  
* fd.io lab
+
'''12/17/2019'''
** Follow up on ThunderX to getting mgmt IP - IP addresses are assigned, but are not up yet.
+
* Attendees
** Release Machine to EdK as soon as ThunderX is up.
+
** Tina Tsou
** Cavium has shipped more machines as well - Delivered a week back. Tina to follow up with Trishan.
+
** Honnappa Nagarahalli
** See the Taishan setup for any VM issue. - Sirshak is trying to reproduce the issue.
+
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
** Xiaoming Jiang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** The 'show pci' command causes a crash, which affects all performance tests on Taishan servers only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Have upgraded Python2 to Python3 successfully.
 +
**** Ask CSIT community how to identify performance hold/stop issue asap - Juraj
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
*** Trying to update kernel to 64K page size on CentOS - No update - Lijian
 +
**** Will try with CentOS 8 which seems to be working fine with 64K page size.
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on ThunderX2-01.
 +
*** Cables for intel NICs have been ordered.
 +
*** Universal rails will be tried with ThunderX2 servers. If it works, will send the rails to FD.io lab.
 +
** Script/commands to verify the NICs are ready. Will try with Mellanox NIC on ThunderX2-02 firstly - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** VPP-1064 Dave Barach rejected the patch based on the solution Damjan and Nitin had decided upon following the reason that current approach breaks cross compilation. - NXP has upstreamed the DPAA2 patch, uses a separate segment makefile (dpaa.mk) for DPAA2. NXP does cross compilation most of the time. The approach could be that all platforms create a segment makefile and combine all of them into a single ARMv8 segment makefile.
+
** Align Arm patches with VPP release plan.
** One solution suggested was creating a platform specific Makefile for ThunderX
+
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
** Honnappa suggested that, as this is not just a ThunderX issue but also a Qualcomm issue (128-byte cache line size), an ARM-specific Makefile would be better.
+
*** RC1 2020-01-15 (F0+7) Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
** Honnappa no update on memcpy benchmarking will do that next week
+
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
** 1019: fixed in local will upstream soon - Patch has issues and some of the issues are fixed
+
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
** 1021: Patch submitted centos env issue CSIT follow up. - This can be closed
+
** Vectorization
** 1023: migrated to openssl using DPDK manual but facing failed TCs - openSSL is integrated in his local environment - VPP not stable in his environment - Updated in the ticket.
+
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
** 1043: No updates
+
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
** 990: Brian to Retry on mcbin
+
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
** 1267: l3fwd performance tuning: Marvell to upstream a patch to enable dpdk on mcbin by making changes to dpdk plugin in vpp.
+
** MAP with VPP - error is resolved. Sort of working. Record the details.
** Auto-detection of memory channels: per Andrew's comment there is no real way to do that, hence go with making it a runtime argument via startup conf instead of hard coding it.
+
*** Usage of MAP is recorded in confluence
** Sachin is facing issues building the rpm, currently on 1801; will open a Jira ticket if the issues persist with 1804.
+
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
** Benchmarking AVF drivers on Arm servers - Jieqiang
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
*** Patches are upstreamed, but not reviewed yet.
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Lock-free allocation/free gives a 7%-11% improvement on ThunderX2, but no improvement on x86 and Cortex-A72.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check checking meta data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** EPIC for next quarter:
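
The release plan above expresses each milestone as an offset from the previous one (F0+7, RC1+7, RC2+7). As a quick sanity check, the dates can be derived from the F0 date alone. This is a minimal sketch: the function name <code>build_schedule</code> and the <code>OFFSETS</code> table are hypothetical, and only the dates and offsets come from the minutes.

```python
from datetime import date, timedelta

# F0 date as listed in the minutes; later milestones are derived from it.
F0 = date(2020, 1, 8)

# (milestone name, days after the previous milestone), following the "+7" notation.
OFFSETS = [("RC1", 7), ("RC2", 7), ("Formal Release", 7)]


def build_schedule(start, offsets):
    """Return an ordered list of (milestone, date) pairs starting at F0."""
    schedule = [("F0", start)]
    current = start
    for name, days in offsets:
        current = current + timedelta(days=days)
        schedule.append((name, current))
    return schedule


schedule = build_schedule(F0, OFFSETS)
for name, day in schedule:
    print(f"{name}: {day.isoformat()}")
```

The derived dates match the ones listed in the plan (2020-01-15, 2020-01-22 and 2020-01-29).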
 +
 
 +
'''12/10/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
** Xiaoming Jiang
 +
* General
 
* CSIT
 
* CSIT
** Adarsh stalled with failure of test cases after using openssl.
+
** VPP Performance Test
** Performance Testing Khem : NUMA node numbering issue.
+
*** Performance data on Arm in official release 19.08 is available.
** NUMA node no issue not seen in ThunderX. Khem to post the details of issue and the workaround on Taishan.
+
**** https://docs.fd.io/csit/rls1908/report/
** Khem facing issues with trex installation on ARM hence he will try getting a x86 machine as TG.
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
** Nitin known issue with trex with arm and mellanox card.
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
** Khem to try L2BD and L2XC.
+
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
** Brian to use cache stashing and see the results.
+
*** The 'show pci' command causes a crash, which affects all performance tests on Taishan servers only.
 +
**** Issue is root-caused and the patch is in community review - https://gerrit.fd.io/r/c/vpp/+/23849
 +
**** 'show pci' is replaced with 'show ver' temporarily. Now performance test is running fine.
 +
**** Will change the performance job from weekly to daily runs. - Juraj
 +
**** Have upgraded Python2 to Python3 successfully.
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
*** Trying to update kernel to 64K page size on CentOS - Lijian
 +
**** Will try with CentOS 8 which seems to be working fine with 64K page size.
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Two ThunderX2 servers are installed in the Arm lab, but the Intel NIC cannot be enumerated on one of them.
 +
*** What's the preferred work method with Mellanox NIC, using DPDK pmd or RDMA? - Juraj
 +
*** Check BIOS version - Lijian
 +
*** Make sure all NICs are plugged into same PCI slot number - Lijian
 +
*** Verify intel i40e driver/firmware version - Lijian
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Usage of MAP is recorded in confluence
 +
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** MAP can give profiling data at certain different time-line spots
 +
*** MAP cannot do profiling with specific CPU cores, and cannot give assembly views
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Getting entries in ACL cache-line aligned, and bench-mark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check checking meta data.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** EPIC for next quarter:
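
The lock-less bihash idea noted above (update a working copy, then let readers see it without taking a lock) follows a general copy-then-publish pattern. The sketch below illustrates only that general pattern on a toy table. It is not VPP's bihash code: the class <code>SnapshotTable</code> and its methods are hypothetical, and it relies on Python's garbage collector where a real RCU scheme would wait for a grace period before freeing the old copy.

```python
import threading


class SnapshotTable:
    """Toy key/value table: readers never lock, writers copy then publish."""

    def __init__(self):
        # The published snapshot is treated as immutable once assigned.
        self._snapshot = {}
        # The lock serialises writers only; readers never take it.
        self._writer_lock = threading.Lock()

    def lookup(self, key, default=None):
        # Grab the current snapshot reference once and read from it.
        snapshot = self._snapshot
        return snapshot.get(key, default)

    def update(self, key, value):
        with self._writer_lock:
            working_copy = dict(self._snapshot)  # private copy of the table
            working_copy[key] = value            # modify only the copy
            self._snapshot = working_copy        # publish via one reference assignment


table = SnapshotTable()
before = table._snapshot      # what a reader holding the old snapshot sees
table.update("mac-1", "port-3")
print(table.lookup("mac-1"))  # the new snapshot contains the entry
print(before)                 # the old snapshot was never modified
```

A reader that picked up the old snapshot before the update keeps a consistent view, which is the property the make_working_copy plus RCU approach in the minutes is after.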
  
* Action Items - Next Week
 
** Nitin: Run a VPP performance test to understand if the memcpy neon version provides any benefits.
 
** Sirshak: Create a higher LF ticket so that it is easier for Trishan/Acton/Venessa/Mohammed to follow up on bringing up ThunderX/mcbin
 
** Nitin: start email discussion with Dave to address the creation of single makefile for all ARMv8 devices
 
  
''' 5/1/2018 '''
+
'''12/03/2019'''
* New Joinees
+
* Attendees
** Natalie and Yuval from Marvell for engineering input.
+
** Tina Tsou
* fd.io lab
+
** Honnappa Nagarahalli
** Follow up on ThunderX to getting mgmt IP
+
** Lijian Zhang
** Release Machine to EdK as soon as ThunderX is up.
+
** Jieqiang Wang
** Cavium has shipped more machines as well.
+
** Jason Zhang
** See the Taishan setup for any VM issue.
+
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
** Xiaoming Jiang
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** There's a Python API issue which affects all performance tests on Taishan server only.
 +
**** The failure turns out to be caused by PCI show with Mellanox NICs on Taishan servers.
 +
**** Talk to Peter to temporarily remove 'PCI dump' for Taishan servers - Juraj
 +
**** Could you try debug version of VPP with the setup and capture the traceback log? - Juraj
 +
**** Will try to root cause the problem with Taishan + Mellanox NIC - Lijian
 +
** VPP Path
 +
*** Verifying VPP on Centos/Arm - Juraj
 +
*** Trying to update kernel to 64K page size on CentOS - Lijian
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
*** VPP device failed after Python3 upgrade
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** VPP-1064 Dave Barach rejected the patch based on the solution Damjan and Nitin had decided upon following the reason that current approach breaks cross compilation.  
+
** Align Arm patches with VPP release plan.
** One solution suggested was creating a platform specific Makefile for ThunderX
+
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
** Honnappa suggested that, as this is not just a ThunderX issue but also a Qualcomm issue (128-byte cache line size), an ARM-specific Makefile would be better.
+
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
** Honnappa: no update on memcpy benchmarking; will do that next week.
+
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
** 1019: fixed locally; will upstream soon.
+
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
** 1021: Patch submitted; CentOS environment issue, CSIT to follow up.
+
** Vectorization
** 1023: migrated to openssl using DPDK manual but facing failed TCs
+
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
** 1043: No updates
+
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
** 990: Brian to Retry on mcbin
+
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
** 1267: l3fwd performance tuning: Marvell to upstream a patch to enable dpdk on mcbin by making changes to dpdk plugin in vpp.
+
** MAP with VPP - error is resolved. Sort of working. Record the details.
** Auto-detection of memory channels: Andrew commented that there is no real way to do that, so the plan is to make it a runtime argument via startup conf instead of hard-coding it.
+
*** Usage of MAP is recorded in confluence
** Sachin is facing issues building the RPM, currently on 1801; will open a Jira ticket if the issues persist with 1804.
+
**** https://confluence.arm.com/display/BSGSoftware/An+introduction+to+using+MAP+with+VPP
 +
*** MAP can give profiling data at certain different time-line spots
 +
*** MAP cannot do profiling with specific CPU cores, and cannot give assembly views
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Getting entries in ACL cache-line aligned, and bench-mark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check to check metadata.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** EPIC for next quarter:
 +
 
 +
'''11/26/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Dean Arnold
 +
** Michaela Tahiri
 +
** Xiaoming Jiang
 +
* General
 
* CSIT
 
* CSIT
** Adarsh stalled with failure of test cases after using openssl.
+
** VPP Performance Test
** Performance Testing Khem : NUMA node numbering issue.
+
*** Performance data on Arm in official release 19.08 is available.
** NUMA node numbering issue not seen on ThunderX. Khem to post the details of the issue and the workaround on Taishan.
+
**** https://docs.fd.io/csit/rls1908/report/
** Khem is facing issues with TRex installation on ARM, so he will try getting an x86 machine as TG.
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
** Nitin: known issue with TRex on Arm with a Mellanox card.
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
** Khem to try L2BD and L2XC.
+
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
** Brian to use cache stashing and see the results.
+
*** There's a Python API issue which affects all performance tests on Taishan server only.
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP for distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab. - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 arrives with Dean, inviting Juraj and the Vexxhost people in the FD.io lab, to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Getting entries in ACL cache-line aligned, and bench-mark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check to check metadata.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** EPIC for next quarter:
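
The 20.01 release plan dates listed under "Align Arm patches with VPP release plan" follow a fixed seven-day cadence (RC1 = F0+7, RC2 = RC1+7, release = RC2+7). The following is a minimal sketch of that date arithmetic, useful for checking the listed dates; the function name <code>release_schedule</code> and its parameters are hypothetical and not part of any FD.io tooling.

```python
from datetime import date, timedelta


def release_schedule(f0, step_days=7):
    """Derive milestone dates from the API-freeze date (F0).

    Assumes each milestone is exactly step_days after the previous one,
    as in the minutes (F0+7, RC1+7, RC2+7).
    """
    step = timedelta(days=step_days)
    rc1 = f0 + step
    rc2 = rc1 + step
    release = rc2 + step
    return {"F0": f0, "RC1": rc1, "RC2": rc2, "Release": release}


# F0 date taken from the minutes: 2020-01-08
schedule = release_schedule(date(2020, 1, 8))
for name, day in schedule.items():
    print(f"{name}: {day.isoformat()}")
```

With F0 set to 2020-01-08 this reproduces the RC1, RC2 and formal release dates recorded in the minutes.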
  
* Action Items - Next Week
+
'''11/19/2019'''
** Sirshak: Follow up with Mohammed regarding ThunderX mgmt connectivity and mcbin.
+
* Attendees
** Sirshak to hold a call with Khem and Adarsh to understand the vm_vhost issue caused by nested VMs - Not done yet; will do it next week.
+
** Tina Tsou
** Sirshak to create consolidated ARM ecosystem xls to reflect CSIT effort. - Not Done will do it next week.
+
** Honnappa Nagarahalli
** Honnappa: memcpy benchmarking
+
** Lijian Zhang
** Brian : CSIT-990(buildroot)
+
** Jieqiang Wang
** Brian to publish a pictorial representation of rx queues and tx queues in multicore case for mcbin. - Moved to next week
+
** Jason Zhang
** Khem to analyze make test failure in Taishan - 1804 - Next Week
+
** Juraj Linkes
** ARM - For TG for deciding connectivity - MCBin and Taishan - Working on it.
+
** Christian Hopps
** CSIT 990 brian to try - Next Week
+
** Dean Arnold
** Sirshak/Brian to recheck validity of ASLR issue. - Not Done. Next Week.
+
** Michaela Tahiri
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
*** There's a Python API issue which affects all performance tests on Taishan server only.
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP for distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
*** Loop Nitin, Sachin, Honnappa, Lijian in container cross-compilation discussion.
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab. - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 arrives with Dean, inviting Juraj and the Vexxhost people in the FD.io lab, to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Getting entries in ACL cache-line aligned, and bench-mark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Bench-mark VPP on Dawn N1SDP board
 +
*** Use rte_mbuf_sanity_check to check metadata.
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
 +
** EPIC for next quarter:
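
The first patch listed under "Vectorization" concerns converting buffer pointers to buffer indices with 128-bit SIMD. As a rough illustration of the scalar arithmetic that such a patch would apply lane by lane, the sketch below assumes an index is the byte offset from a buffer memory base, shifted right by the log2 of a 64-byte alignment. The names <code>BUFFER_MEM_START</code>, <code>LOG2_ALIGN</code> and both functions are hypothetical; none of this is taken from the patch itself.

```python
# Assumption: buffers are aligned to 64 bytes, so an index is offset >> 6.
LOG2_ALIGN = 6
# Hypothetical base address of the buffer memory region.
BUFFER_MEM_START = 0x7F0000000000


def pointer_to_index(ptr, base=BUFFER_MEM_START, shift=LOG2_ALIGN):
    """Scalar form: subtract the base, then shift by the alignment."""
    return (ptr - base) >> shift


def pointers_to_indices(ptrs, base=BUFFER_MEM_START, shift=LOG2_ALIGN):
    """Batch form: the same subtract-and-shift applied to every element,
    which is the per-lane operation a SIMD version would perform."""
    return [(p - base) >> shift for p in ptrs]


# Four hypothetical buffers spaced 2048 bytes apart.
ptrs = [BUFFER_MEM_START + i * 2048 for i in range(4)]
indices = pointers_to_indices(ptrs)
print(indices)
```

Because each element is processed independently with the same subtract and shift, the operation maps naturally onto vector registers.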
  
* Action Items - Last Week
+
'''11/12/2019'''
** Khem to ask mohammed, anton for power clearance for 2 new taishan. - Ok for Power Clearance
+
* Attendees
** Sirshak to hold a call with Khem and Adarsh to understand the vm_vhost issue caused by nested VMs - Not done yet; will do it next week.
+
** Tina Tsou
** Sirshak and Brian to discuss on TG connectivity. - Done
+
** Honnappa Nagarahalli
** Sirshak to create consolidated ARM ecosystem xls to reflect CSIT effort. - Not Done will do it next week.
+
** Lijian Zhang
** Nitin: To post vlib_main 1804_rc2 issue to community. - Done
+
** Jieqiang Wang
** Sirshak: to check if vlib_main is an issue on Centriq. - Done
+
** Jason Zhang
** Nitin: AI for creating Jira for number of memory channel identification. - Done
+
** Juraj Linkes
** Brian to publish a pictorial representation of rx queues and tx queues in multicore case for mcbin. - Moved to next week
+
** Christian Hopps
** John B - 1G to USB adapters Ship to lab. - Done
+
** Stanislav Clebec
** Khem to analyze make test failure in Taishan - 1802 rc2 - Next Week
+
* General
** ARM - For TG for deciding connectivity - MCBin and Taishan - Working on it.
+
* CSIT
** CSIT 990 brian to try - Next Week
+
** VPP Performance Test
** Sirshak to take 1103 and 1114 - Done
+
*** Performance data on Arm in official release 19.08 is available.
** Nitin to Create l3fwd tkt - Done
+
**** https://docs.fd.io/csit/rls1908/report/
** Brian to create a mcbin crash tkt. Next Week
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
** Maen to provide contact for IO Stashing on mcbin. - Contacted Brian. Brian to provide further input.
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
** Sirshak/Brian to recheck validity of ASLR issue. - Not Done. Next Week.
+
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP for distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab. - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 arrives with Dean, inviting Juraj and the Vexxhost people in the FD.io lab, to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Getting entries in ACL cache-line aligned, and bench-mark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lockless
 +
** EPIC for next quarter:
  
''' 4/25/2018 '''
+
'''10/29/2019'''
* Meeting Time
+
* Attendees
** Proposed time 6-8am Tuesday PST.
+
** Tina Tsou
** Tina to update wiki with new meeting time.
+
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP for distros.
 +
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** ThunderX
+
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab. - Juraj
*** OS installed on ThunderX. Switch being sent.
+
** Current Configurations:
*** 1 ThunderX booted.
+
*** RAM: 256G
*** Plan to use 1G to USB adapters.
+
*** Disk: 480G SSD
*** Varun POC for Cavium.
+
*** The boxes are coming with Qlogic cards which are not supported in VPP.
** Taishan
+
** Changes required to the servers:
*** It's up and connected to the Internet.
+
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
*** Build and make test 2 TCs failing (VCL TCs failing) - 1802 rc2 used.
+
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
*** Brian no update for TG - Meeting on it next week.
+
*** Need 2 Intel NICs XL710-QDA2 for each server.
*** Khem to ask mohammed, anton for power clearance for 2 new taishan.
+
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
** MCBin
+
*** Disk size to 480G
*** Maen POC - To Contact Mohammed.
+
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
*** Maen to provide engineering contact for help to Nitin.
+
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 arrives with Dean, inviting Juraj and the Vexxhost people in the FD.io lab, to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 servers in the FD.io lab (currently only one ThunderX2 is in the lab)
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** Round Table status on Porting tkts.
+
** Align Arm patches with VPP release plan.
** Nitin: vlib_main taking a lot of time on both mcbin and thunderx2
+
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
** Sirshak to take on ARM tkts.
+
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Jieqiang
 +
** Investigate whether bihash operations are hot-spots in L2 throughput
 +
*** Apply prefetches and dual loop with l2_fwd node, and failed to l2_learn
 +
*** Getting entries in ACL cache-line aligned, and bench-mark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lockless
 +
** EPIC for next quarter:
 +
 
 +
'''10/22/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 
* CSIT
 
* CSIT
** Adarsh looking at IPv4 failed test cases with priority.
+
** VPP Performance Test
** Sirshak to hold a call with Khem and Adarsh to understand the vm_vhost issue caused by nested VMs
+
*** Performance data on Arm in official release 19.08 is available.
** Cavium to publish mcbin CSIT performance numbers, but low priority. Nitin faced a build-root issue with this.
+
**** https://docs.fd.io/csit/rls1908/report/
** Maciek to host a kick off call.
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
** Sirshak and Brian to discuss on TG connectivity.
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
** Sirshak to create consolidated ARM ecosystem xls to reflect CSIT effort.
+
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
* Performance Benchmarking
+
** VPP Path
** Nitin: To post vlib_main 1804_rc2 issue to community.
+
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
** Nitin: vlib_main issue in mcbin and thunderx2 at different points within the function. Not a hotspot in x86.
+
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
** Sirshak: to check if vlib_main is an issue on Centriq.
+
*** Tried cross-compiling with DPDK only.
** Nitin: AI for creating Jira for number of memory channel identification.
+
*** Initial cross-compiling is working fine.
** AI for creating Jira for the crash on Mcbin – Brian
+
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
** Khem to get started on CSIT performance suite this week and publish on shared xls.
+
*** There are issues building VPP for distros.
** Brian to publish a pictorial representation of rx queues and tx queues in multicore case for mcbin.  
+
**** what is not finished is packaging - currently we support Ubuntu's DEB packages for aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb) this is what I currently am trying to sort out ...
* Action Items - Last Week
+
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
** Sirshak to add link to xls to wiki page. - Done by somebody else.
+
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
** Brian to raise LF RT ticket about MACCHIATObins - Done. Pinged Mohammed; yet to hear back from him.
+
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
** Nitin to check 'make test' on MACCHIATObin (16GB DRAM) - Failed. Error related to Python scripts.
+
***** setup proper platform specific folder names in DEB packages
** Honnappa, Khem to check Clang build on arm64. - Tried the clang build on Centriq and made some changes; it still fails. The clang build on x86 has errors but still passes. 'make test' fails on x86. Jira card to be created - '''AI(Sirshak)'''. Khem to try.
+
***** proper architecture string included in the DEB package name
* Action Items
+
** VPP Device
** John B: ship 1G-to-USB adapters to the lab.
+
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
** Khem to analyze make test failure in Taishan - 1802 rc2
+
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
** ARM - For TG for deciding connectivity - MCBin and Taishan
+
*** Currently VPP device is not executed per patch. Issue is still under investigation.
** CSIT-990: Brian to try
+
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
** Sirshak to take 1103 and 1114
+
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
** Nitin to create l3fwd ticket
+
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
** Brian to create an mcbin crash ticket.
+
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
** Maen to provide contact for IO Stashing on mcbin.
+
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
** Sirshak/Brian to recheck validity of ASLR issue.
+
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
** Sirshak to track down issues.
+
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 480G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7) Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Add max-size parameter to pmalloc module. - Lijian
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Applied prefetches and dual loop to the l2_fwd node; failed to apply them to l2_learn
 +
*** Getting entries in ACL cache-line aligned, and bench-mark it.
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Investigating bi-hash lockless implementation - Jason
 +
*** Firstly apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lockless
 +
** EPIC for next quarter:
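
The release milestones listed above follow a fixed cadence: each one is 7 days after the previous (RC1 = F0+7, RC2 = RC1+7, formal release = RC2+7). As a minimal sketch for mapping patches to the plan, the dates can be derived from the F0 date alone. The function name <code>release_milestones</code> and the <code>step_days</code> parameter are illustrative assumptions, not part of any FD.io tooling.

```python
from datetime import date, timedelta

# F0 (API freeze) date taken from the minutes; every later milestone is +7 days.
F0 = date(2020, 1, 8)

def release_milestones(f0, step_days=7):
    """Return milestone dates derived from the F0 date (hypothetical helper)."""
    names = ["F0", "RC1", "RC2", "Formal Release"]
    return {name: f0 + timedelta(days=i * step_days) for i, name in enumerate(names)}

milestones = release_milestones(F0)
for name, day in milestones.items():
    print(f"{name}: {day.isoformat()}")
```

With the F0 date from the minutes, this reproduces the RC1, RC2, and formal release dates listed in the plan.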
  
''' 4/18/2018 '''
+
'''10/15/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** What is not finished is packaging. Currently Ubuntu DEB packages are supported for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is still being sorted out.
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
** Temporarily borrow 1x ThunderX to be used for ONAP demo at OpenStack Summit (end of May)? Yes.
+
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
** OS exists on ThunderXs; Varun will keysign with EdW; need to resolve OS netdev connectivity over 10/40GbE
+
** Current Configurations:
** OS exists on TaiShan2280; no connectivity to the Internet
+
*** RAM: 256G
 +
*** Disk: 240G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
** RC2
+
** Align Arm patches with VPP release plan.
*** 'make' passes, 'make test' fail, 'make test-all' ???  - MACCHIATObin (4GB DRAM)
+
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
*** 'make' passes, 'make test' pass, 'make test-all' fails - Centriq
+
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
*** 'make' passes, 'make test' pass, 'make test-all' fails - x86
+
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
** Build
+
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
*** Testing Verify and Merge jobs for 18.04 master on arm64 today
+
** Vectorization
*** Clang build fails on arm? 'CC=clang CXX=clang make'
+
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Investigating bi-hash lockless implementation - Jason
 +
** EPIC for next quarter:
 +
 
 +
'''10/08/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 
* CSIT
 
** Adarsh updated CSIT status in xls
+
** VPP Performance Test
** CSIT-1023: decided to go with OpenSSL instead of ARMv8 crypto library, in DPDK, due to number of algorithms supported
+
*** Performance data on Arm in official release 19.08 is available.
*** e.g. AES-GCM not supported by ARMv8 crypto library
+
**** https://docs.fd.io/csit/rls1908/report/
** Nitin updated CSIT-990 (buildroot) with more information
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
* Action Items
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
** Sirshak to add link to xls to wiki page.
+
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
** Brian to raise LF RT ticket about MACCHIATObins
+
** VPP Path
** Nitin to check 'make test' on MACCHIATObin (16GB DRAM)
+
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
** Honnappa, Khem to check Clang build on arm64
+
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** What is not finished is packaging. Currently Ubuntu DEB packages are supported for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is still being sorted out.
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure - Merged.
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
 +
** Current Configurations:
 +
*** RAM: 256G
 +
*** Disk: 240G SSD
 +
*** The boxes are coming with Qlogic cards which are not supported in VPP.
 +
** Changes required to the servers:
 +
*** The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
 +
**** https://www.amazon.com/SF-Cable-Power-Extension-IEC320/dp/B007O0EIRU
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** Align Arm patches with VPP release plan.
 +
*** F0 2020-01-08        APIs frozen. Only low-risk changes accepted on main branch.
 +
*** RC1 2020-01-15 (F0+7)  Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
 +
*** RC2 2020-01-22 (RC1+7) Second artifacts posted.
 +
*** Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
 +
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
*** Will do software profiling with MAP on VPP.
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
**** To check cycles by applying CRC32 calculation unrolling
 +
** Investigating bi-hash lockless implementation - Jason
 +
** EPIC for next quarter:
  
''' 4/11/2018 '''
+
'''10/01/2019'''
* Proposal to keep meeting at current time with additional overflow meeting at 8AM PST
+
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Stanislav Clebec
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** Looks like it would be only build and unit test. VPP device and performance tests would run on the physical devices in the lab.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** What is not finished is packaging. Currently Ubuntu DEB packages are supported for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is still being sorted out.
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
** MACCHIATObins just arrived at VEXXHOST
+
** Confirm the power cable requirements for the Vexxhost lab and inform them about the 2 servers coming to the lab - Juraj
** Nitin working on getting IPMI login credentials to provision OS on ThunderX
+
** Current Configurations:
** Need to connect Skylake TG machines to Arm machines
+
*** RAM: 256G
*** ETA: 1wk
+
*** Disk: 240G SSD
** Khem working with Aton (LF) to provision OS on TaiShan2280
+
*** The boxes are coming with Qlogic cards which are not supported in VPP.
*** ETA: 1wk, Ubuntu 17.10
+
** Changes required to the servers:
 +
*** Need 2 Intel NICs XL710-QDA2 for each server.
 +
*** If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
 +
*** Disk size to 480G
 +
*** Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
 +
*** Cables: N1, P1 to N2, P1 and so on
 +
*** Cables for IPMI and Management port: 2
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** OS: Ubuntu 18.04
 +
*** Server info in CSIT docs:
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
*** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
** Brian to do more benchmarking on MACCHIATObin
+
** Align Arm patches with VPP release plan. - Lijian
** Khem working on benchmarking clib_memcpy64_x4()
+
** Vectorization
 +
**** https://gerrit.fd.io/r/c/vpp/+/22391 - vlib: vectorized buffer pointer to index with 128-bit SIMD
 +
**** https://gerrit.fd.io/r/c/vpp/+/22392 - ethernet: 128-bits vectorized next node selection
 +
**** https://gerrit.fd.io/r/c/vpp/+/20273 - vppinfra: vectorize eth_input_adv_and_flags_x4 with 128-bit SIMD
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
** Investigating bi-hash lockless implementation - Jason
 +
** EPIC for next quarter:
 +
 
 +
'''09/24/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info with Greeshma and Khem.
 +
** Finished PPT and demo to Pravin - Will share with Juraj and Honnappa.
 
* CSIT
 
** Lucian submitted patches for CSIT-1019, CSIT-1021
+
** VPP Performance Test
** Lucian looking for contact for ARMv8 crypto driver in DPDK for CSIT-1023
+
*** Investigate DPDK performance job - Juraj
*** See CSIT-1023 for details; looks like DPDK issue?
+
*** Performance data on Arm in official release 19.08 is available.
** Nitin to add more details to CSIT-990
+
**** https://docs.fd.io/csit/rls1908/report/
* Action Items
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
** Sirshak to move JIRA tickets to xls
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
** Lucian to work with Nitin/Jerin on CSIT-1023
+
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
 +
**** Currently trending data could be monitored manually only.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/ takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures in the daily performance test. Almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distro packages.
 +
**** What is not finished is packaging. Currently Ubuntu DEB packages are supported for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is still being sorted out.
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
*** https://gerrit.fd.io/r/#/c/vpp/+/21152/ - to fix occasional VPP device failure
 +
** A total of 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Will call a meeting once the ThunderX2 reaches Dean, inviting Juraj and the Vexxhost people in the FD.io lab to make sure the hardware is ready.
 +
** Inform Prashant in the FD.io lab about the incoming ThunderX2 - Lijian
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have large RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
**** Vectorize the data buffer index to data buffer pointer function.
 +
**** Jieqiang has finished code reviewing. Honnappa to review the patches.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** There's no crash issue with latest VPP code.
 +
** Investigate whether bihash operations are hot spots in L2 throughput
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
*** Finished reviewing the patches. Honnappa to review the patches.
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPUs, as there's mixed usage of gcc vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 4/4/2018 '''
+
'''09/17/2019'''
* Propose to move the meeting +2 hours?
+
* Attendees
* RC1 cut today
+
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info with Greeshma and Khem.
 +
** Show CSIT CI/CD, Jenkins status, log and the voting right if there's any failure - Juraj & Lijian
 +
*** Will sync up with Juraj/Stan on Thursday on CSIT demo to Arm product manager.
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance data on Arm in official release 19.08 is available.
 +
**** https://docs.fd.io/csit/rls1908/report/
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
 +
**** Currently trending data could be monitored manually only.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/ takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** Allocate 3 ThunderX for EdK to integrate into CI
+
** Inform Prashant in FD.io lab for the incoming ThunderX2 - Lijian
*** JohnB from Cavium agreed to supply 3 more ThunderX for CSIT (will pre-install FW & OS)
+
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
** Brian working on provisioning SSDs for MACCHIATObins
+
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
** Khem can ping IPMI interfaces on TaiShan2280s; also needs an OS to be installed
+
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have plenty of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** Discussed [https://schd.ws/hosted_files/onsna18/6c/ons_fdio_brooks.pdf ONS slides]
+
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
** Khem has patch for clib_memcpy64_x4() and needs help benchmarking
+
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
**** Vectorize the data buffer index to data buffer pointer function.
 +
**** Jieqiang has finished code reviewing. Honnappa to review the patches.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Crash issue is reproduced - Jieqiang
 +
*** Crash is gone after applying the patch.
 +
**** There's crash issue when executing 'show hardwares'
 +
**** https://gerrit.oss.arm.com/#/c/131831/
 +
** Investigate bihash operations, which are hot-spots in L2 throughput
 +
*** Cache misses and CRC32 calculation are possible opportunities.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
*** Finished reviewing the patches. Honnappa to review the patches.
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixture usage of gcc vector extension and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''09/10/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info to Greeshma and Khem.
 +
** Show CSIT CI/CD, Jenkins status, log and the voting right if there's any failure - Juraj & Lijian
 +
*** Talk to Song about it.
 
* CSIT
 
* CSIT
** Lucian found and created JIRA tickets for 3 issues while running CSIT
+
** VPP Performance Test
** Nitin created JIRA ticket for buildroot issue
+
*** Performance data on Arm in official release 19.08 is available.
** Khem seeing issues with VM
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
* Action Items
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
** Nitin/Varun to help provision Ubuntu 16.04 and firmware update on ThunderX machines
+
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
 +
**** Currently trending data could be monitored manually only.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/ takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
**** https://gerrit.fd.io/r/#/c/vpp/+/21035/, VPP path cross compilation
 +
***** setup proper platform specific folder names in DEB packages
 +
***** proper architecture string included in the DEB package name
 +
** VPP Device
 +
** In total, 29 VPP device test cases were executed, and all passed on Arm servers.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Inform Prashant in FD.io lab for the incoming ThunderX2 - Lijian
 +
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have plenty of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Crash issue is reproduced - Jieqiang
 +
*** Crash is gone after applying the patch
 +
**** There's crash issue when executing 'show hardwares'
 +
**** https://gerrit.oss.arm.com/#/c/131831/
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixture usage of gcc vector extension and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 3/28/2018 '''
+
'''09/03/2019'''
* Sachin Saxena from NXP joined the call, welcome
+
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info to Greeshma and Khem.
 +
** Show CSIT CI/CD, Jenkins status, log and the voting right if there's any failure - Juraj & Lijian
 +
*** Talk to Song about it.
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
 +
**** Currently trending data could be monitored manually only.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/ takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Issue is root-caused. Patch is in community review - https://gerrit.fd.io/r/c/vpp/+/21469
 +
** In total, 29 VPP device test cases were executed; 26 passed, and only 3 tap-related tests failed.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** Khemendra is having issues with Rudy's emails. Hence, not been able to access Taishan servers
+
** Inform Prashant in FD.io lab for the incoming ThunderX2 - Lijian
** Nitin will try to access the servers this week
+
** Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
** MACCHIATObin setup under progress
+
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
** OD1000 is added to Jenkins slave. The build is failing currently. The build can be triggered manually.
+
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have plenty of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** Discuss Single core, L3Fwd sample perf numbers and analysis next week
+
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
** Sachin is working on compiling 18.01. Native compilation works fine, cross compilation is failing
+
** Align Arm patches with VPP release plan. - Lijian
** Nitin still working on patch for cache line size
+
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
** VPP-1126 is being used in DPDK input node. Khemendra will take a look at it this week.
+
*** Will check VPP release schedule and map with Arm Quarterly plan.
** VPP-1129 Brian/Sirshak will take a look. Looks like it can be closed.
+
*** Note down patches in community review and align them to VPP release plan.
** VPP-1114 Patch under internal review
+
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Crash issue is reproduced - Jieqiang
 +
**** https://gerrit.oss.arm.com/#/c/131831/
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixture usage of gcc vector extension and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''08/27/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info to Greeshma and Khem.
 
* CSIT
 
* CSIT
** Khemendra having issues with interface bring up failing intermittently. Nitin suggested to add delay.
+
** VPP Performance Test
** Nicolas/Lucian debugging TC-07
+
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
** Khemendra having issues with TG VM crashing randomly with Ubuntu 16.04, QEMU 2.10. Solved by moving to Ubuntu 17.10, QEMU 2.10
+
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
** Nitin using Ubuntu 16.04 with 4.13 kernel
+
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
* Action Items
+
**** If there's performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
** Discuss Single core, L3Fwd sample perf numbers and analysis next week - Brian
+
**** Currently trending data could be monitored manually only.
** VPP-1126 Take a look this week as it affects DPDK input node - Khemendra
+
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/ takes a lot of time: 3 days, 61 hours (28 hours on x86)
** Need more attention on solution for buildroot issue, need more information on failure [https://jira.fd.io/browse/CSIT-990 CSIT-990] - Nitin
+
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
** Create an excel sheet with the test case status - Nicolas/Lucian
+
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** There are issues building VPP distros.
 +
**** What is not finished is packaging. Currently we support Ubuntu's DEB packages for the aarch64 architecture (make PLATFORM=aarch64-generic pkg-deb); this is what I am currently trying to sort out ...
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Issue is root-caused. Patch is in community review - https://gerrit.fd.io/r/c/vpp/+/21469
 +
** In total, 29 VPP device test cases were executed; 26 passed, and only 3 tap-related tests failed.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 +
* FD.io lab
 +
** Inform Prashant in FD.io lab for the incoming ThunderX2 - Lijian
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have plenty of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** Optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Run VPP with MAP and reproduce the previous crash/failures - Jieqiang
 +
*** Got latest license to install MAP on Shanghai server.
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do bench-marking profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixture usage of gcc vector extension and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 3/21/2018 '''
+
'''08/20/2019'''
* Key signing party! Thank you Ed!
+
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info to Greeshma and Khem.
 +
* CSIT
 +
** VPP Performance Test
 +
*** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
 +
*** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
 +
*** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
**** If there's performance drop in CSIT performance testing, what action will be taken and who will take care of the drop?
 +
**** Currently trending data could be monitored manually only.
 +
*** https://jenkins.fd.io/job/csit-vpp-perf-verify-1908-3n-tsh/ takes a lot of time: 3 days, 61 hours (28 hours on x86)
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-skx/5/archives/log.html.gz
 +
**** https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-1908-3n-tsh/2/archives/log.html.gz
 +
*** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
*** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
**** Some common failures due to Python bindings or something inside VPP image.
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine.
 +
*** Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** on Arm, different default memory map regions for normal page and huge page;
 +
**** vring with huge-page mapped to normal page region addresses is not working.
 +
**** 1. Reserve 16G VA space for future usage, automatic, private, anonymous and without HUGETLB option.
 +
***** base = mmap ((void *) 0x410000000, 16ULL << 30, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 +
**** 2. From the 16G VA space, pick up a 40M unused space, redo mmap() with the HUGETLB option, address fixed
 +
***** vaWithinBase = mmap (base, 40 << 20, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED | MAP_HUGETLB | MAP_LOCKED, fd, 0);
 +
**** 3. Use vaWithinBase to initialize vring and vring_desc
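The three steps above can be sketched roughly as follows. This is a minimal, Linux-only illustration and not the actual VPP patch: the function name <code>vring_region_demo</code>, the 2 MiB alignment, the use of an anonymous private mapping instead of a hugetlbfs <code>fd</code>, the omission of <code>MAP_LOCKED</code>, the added <code>MAP_NORESERVE</code> flag, and the fallback to normal pages when no huge pages are configured are all assumptions made so the sketch can run on a machine without huge page setup.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

#define RESERVE_SIZE ((size_t)16 << 30)   /* 16G of reserved VA space */
#define RING_SIZE    ((size_t)40 << 20)   /* 40M window used for the ring */
#define HUGE_ALIGN   ((uintptr_t)2 << 20) /* assumed 2 MiB huge page size */

/* Returns 0 on success, -1 on failure. Hypothetical helper, 64-bit only. */
int vring_region_demo(void)
{
    /* Step 1: reserve VA space only (PROT_NONE, private, anonymous, no
     * HUGETLB). The address is a hint, not a fixed placement. */
    void *base = mmap((void *)0x410000000UL, RESERVE_SIZE, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (base == MAP_FAILED) {
        perror("mmap reserve");
        return -1;
    }

    /* Round up so a huge page mapping is possible even if the hint
     * was not honored. */
    uintptr_t aligned = ((uintptr_t)base + HUGE_ALIGN - 1) & ~(HUGE_ALIGN - 1);

    /* Step 2: remap a 40M window inside the reservation at a fixed address,
     * first with MAP_HUGETLB, then (assumption) with normal pages. */
    int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED;
    void *va = MAP_FAILED;
#ifdef MAP_HUGETLB
    va = mmap((void *)aligned, RING_SIZE, PROT_READ | PROT_WRITE,
              flags | MAP_HUGETLB, -1, 0);
#endif
    if (va == MAP_FAILED)
        va = mmap((void *)aligned, RING_SIZE, PROT_READ | PROT_WRITE,
                  flags, -1, 0);
    if (va == MAP_FAILED) {
        perror("mmap window");
        munmap(base, RESERVE_SIZE);
        return -1;
    }

    /* Step 3: the window is now usable memory; a real caller would lay out
     * vring and vring_desc here. The sketch only touches both ends. */
    volatile unsigned char *p = va;
    p[0] = 0xA5;
    p[RING_SIZE - 1] = 0x5A;
    int ok = (p[0] == 0xA5 && p[RING_SIZE - 1] == 0x5A);

    munmap(base, RESERVE_SIZE);
    return ok ? 0 : -1;
}
```

Calling <code>vring_region_demo()</code> from a <code>main()</code> and checking for a zero return is enough to exercise it. Reserving with <code>PROT_NONE</code> first keeps the later fixed-address mapping inside a range that no other allocation can claim, which appears to be the point of step 1.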
 +
** In total, 29 VPP device test cases were executed; 26 passed, and only 3 tap-related tests failed.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-csit-verify-hourly/
 +
**** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-vpp-verify-hourly/
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** VEXXHOST currently working on getting another PDU because there are not enough power ports
+
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
** Received SSDs for MACCHIATObins
+
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** Discuss high level plan for VPP on Arm
+
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
** Nitin still working on patch for cache line size
+
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** The eth_input_adv_and_flags_x4 optimization is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Run VPP with MAP and reproduce the previous crash/failures - Jieqiang
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches applying it to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do benchmarking and profiling on mcbin/Bluefield.
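As a rough illustration of the dual/quad pattern mentioned above, the sketch below handles four packets per iteration and falls back to a single-packet loop for the remainder. The <code>pkt_t</code> type, the <code>NEXT_*</code> values and the TTL check are hypothetical stand-ins, not VPP's real buffer or node APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NEXT_DROP = 0, NEXT_FORWARD = 1 };   /* hypothetical next-node indices */

typedef struct { uint8_t ttl; } pkt_t;      /* hypothetical packet metadata */

/* Per-packet work: drop when the TTL would expire, otherwise decrement it. */
static inline uint32_t process_one(pkt_t *p)
{
    if (p->ttl <= 1)
        return NEXT_DROP;
    p->ttl--;
    return NEXT_FORWARD;
}

/* Quad loop: four independent packets per iteration give the compiler and
 * the CPU more instruction-level parallelism; the tail loop handles the
 * remaining 0..3 packets. Returns the number of forwarded packets. */
static size_t process_quad(pkt_t *pkts, size_t n, uint32_t *next)
{
    size_t i = 0, forwarded = 0;

    while (n - i >= 4) {
        next[i + 0] = process_one(&pkts[i + 0]);
        next[i + 1] = process_one(&pkts[i + 1]);
        next[i + 2] = process_one(&pkts[i + 2]);
        next[i + 3] = process_one(&pkts[i + 3]);
        forwarded += next[i + 0] + next[i + 1] + next[i + 2] + next[i + 3];
        i += 4;
    }
    while (i < n) {
        next[i] = process_one(&pkts[i]);
        forwarded += next[i];
        i++;
    }
    return forwarded;
}
```

Real graph nodes typically also prefetch the packets of the next iteration inside the quad loop; that is omitted here to keep the sketch portable.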
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there is mixed usage of GCC vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''08/13/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
** Khemendra Kumar
 +
** Greeshma Katarki
 +
* General
 +
** Share VPN application and FD.io server access info to Greeshma and Khem.
 
* CSIT
 
* CSIT
** Need more attention on solution for buildroot issue [https://jira.fd.io/browse/CSIT-990 CSIT-990]
+
** VPP Performance Test
** Nitin moving towards L2 & L3 perf test cases
+
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
** VM crash due to buffer overflow when multiple VMs share NVRAM; resolved in Fedora27
+
** Trending data recorded with https://docs.fd.io/csit/master/trending/introduction/introduction.html
''' 3/14/2018 '''
+
** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
* Key signing party! Thank you Ed!
+
** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
** Only 1 out of 199 test cases failed, 8 test cases show random 'show hardware-interfaces' failure.
 +
** Some failures are time-outs related to 'show hardware-interfaces'/'show vhost dump'.
 +
*** Juraj to send Lijian the commands/APIs in random dump failure.
 +
*** https://jira.fd.io/browse/CSIT-1453
 +
*** SFP eeprom dump is enabled with 'show hardware-interfaces detail' only. Patch is merged.
 +
*** Juraj will change CSIT script with 'show hardware-interfaces verbose', https://gerrit.fd.io/r/#/c/csit/+/21085/
 +
**** CSIT patch is merged.
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
*** Patch to generate daily data and trending graph is committed.
 +
**** https://gerrit.fd.io/r/#/c/csit/+/20962/
 +
**** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** Initial cross-compiling is working fine. Patch is under review. https://gerrit.fd.io/r/#/c/vpp/+/21035/
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Have gone through the whole patch, pmalloc module and tap interface code, but cannot identify the root cause - Lijian
 +
**** pmalloc-based buffer allocate/free seems to be causing the problem.
 +
**** mmap() regions with normal page and huge-page have separate VA spaces.
 +
** 29 VPP device test cases executed in total: 26 passed and only 3 tap-related tests failed.
 +
*** Currently VPP device is not executed per patch. Issue is still under investigation.
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** ToR switch issue resolved; confirm mgmt IP address assignment to racked Huawei/Cavium machines
+
** Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
** Started provisioning MACCHIATObins; Andy ordered SSDs to go with them
+
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** No updates
+
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
 +
** Align Arm patches with VPP release plan. - Lijian
 +
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** Vectorization
 +
**** The eth_input_adv_and_flags_x4 optimization is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** All 7 patches are merged.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Jieqiang checked the video by Sirshak
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches applying it to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do benchmarking and profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there is mixed usage of GCC vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''08/06/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
* General
 
* CSIT
 
* CSIT
** Adarsh started running CSIT on virtual topology; moved past a paramiko issue, seeing other test failures
+
** VPP Performance Test
** Ongoing discussions on getting Adrian access to machines
+
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
''' 3/7/2018 '''
+
** Daily job is running twice a day on x86; on Arm, it takes 16 hours and will run one time each day.
 +
** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
** Only 1 out of 199 test cases failed, 8 test cases show random 'show hardware-interfaces' failure.
 +
** Some failures are time-outs related to 'show hardware-interfaces'/'show vhost dump'.
 +
*** Juraj to send Lijian the commands/APIs in random dump failure.
 +
*** https://jira.fd.io/browse/CSIT-1453
 +
*** SFP eeprom dump is enabled with 'show hardware-interfaces detail' only. Patch is merged.
 +
*** Juraj will change CSIT script with 'show hardware-interfaces verbose', https://gerrit.fd.io/r/#/c/csit/+/21085/
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
*** Patch to generate daily data and trending graph is committed.
 +
**** https://gerrit.fd.io/r/#/c/csit/+/20962/
 +
**** Trending page: https://docs.fd.io/csit/master/trending/index.html
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Have gone through the whole patch, pmalloc module and tap interface code, but cannot identify the root cause - Lijian
 +
**** pmalloc-based buffer allocate/free seems to be causing the problem.
 +
** 29 VPP device test cases executed in total: 26 passed and only 3 tap-related tests failed.
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** Trishan (LF) to help follow up on progress in FD.io lab
+
** Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** More discussion on patch for cache line size; use MIDR register exported by proc fs
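For reference, a MIDR_EL1 value can be split into its architectural fields as sketched below. How the value is obtained (the notes only say it is exported by the kernel) and how a given part number maps to a cache line size are not covered by these minutes, so both are left out; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of the AArch64 Main ID Register (MIDR_EL1). */
typedef struct {
    uint8_t  implementer;   /* bits [31:24], e.g. 0x41 for Arm */
    uint8_t  variant;       /* bits [23:20] */
    uint8_t  architecture;  /* bits [19:16] */
    uint16_t part_num;      /* bits [15:4]  */
    uint8_t  revision;      /* bits [3:0]   */
} midr_fields_t;

/* Decode a raw MIDR value (for example, one parsed from a kernel-exported
 * text file) into its individual fields. */
static midr_fields_t midr_decode(uint64_t midr)
{
    midr_fields_t f;
    f.implementer  = (uint8_t)((midr >> 24) & 0xff);
    f.variant      = (uint8_t)((midr >> 20) & 0xf);
    f.architecture = (uint8_t)((midr >> 16) & 0xf);
    f.part_num     = (uint16_t)((midr >> 4) & 0xfff);
    f.revision     = (uint8_t)(midr & 0xf);
    return f;
}
```

A caller could then key a lookup table on the implementer and part number pair to choose a cache line size at run time.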
+
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
** Decision has been made to use wrappers for atomics
+
** Align Arm patches with VPP release plan. - Lijian
** Damjan reworked PCI handling code and added native driver for Intel AVF (XL710 i.e. Fortville)
+
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
*** Measuring 132 clocks per packet on Skylake (ip4 routing) with VLIB_FRAME_SIZE 256 (default); +1Mpps over DPDK avf/i40e PMD
+
*** Will check VPP release schedule and map with Arm Quarterly plan.
** Damjan reworked memcpy() in MEMIF; achieve 2x25GbE line rate with these changes
+
*** Note down patches in community review and align them to VPP release plan.
** Sirshak working on getting VPP running on Qualcomm Centriq with Mellanox NIC
+
*** It has been challenging to do that in VPP.
*** Seeing issues with external DPDK; static works but not shared; is VPP build system missing -libverbs -lmlx5 in LDFLAGS?
+
** Vectorization
*** Nitin noticed DPDK 17.11 Mellanox PMD does not compile
+
**** The eth_input_adv_and_flags_x4 optimization is upstreamed and under community review.
*** Mellanox recently submitted a patch to VPP to support dynamic loading of Mellanox libraries
+
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** All 7 patches are merged.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Jieqiang checked the video by Sirshak
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches applying it to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do benchmarking and profiling on mcbin/Bluefield.
 +
** Lockless patch with IPv4 mtrie - Jason
 +
** Investigating bi-hash lockless implementation - Jason
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there is mixed usage of GCC vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''07/30/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
* General
 
* CSIT
 
* CSIT
** Adrian does not have machines to work with in Bucharest; machine in Paris that Gabriel was using no longer available
+
** VPP Performance Test
*** AndyW to help resolve
+
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
** Adarsh moved past VM issues; able to launch VPP in VM with virtio interface; starting to run CSIT scripts
+
** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
''' 2/28/2018 '''
+
** Only 1 out of 199 test cases failed, 8 test cases show random 'show hardware-interfaces' failure.
 +
** Some failures are time-outs related to 'show hardware-interfaces'/'show vhost dump'.
 +
*** Juraj to send Lijian the commands/APIs in random dump failure.
 +
*** https://jira.fd.io/browse/CSIT-1453
 +
*** Will also check details on an x86 server. It is slow on x86 too, but takes only 5 sec there versus 40 sec on Taishan - Lijian
 +
*** 'show hardware-interfaces' is quite time-consuming because it reads the EEPROM of the SFP via a software-emulated I2C bus.
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
*** Tried cross-compiling with DPDK only.
 +
*** We can put the cross-compiling knowledge into section 'for developers', vpp-docs project.
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Have gone through the whole patch, pmalloc module and tap interface code, but cannot identify the root cause - Lijian
 +
**** pmalloc module test cases failed on Arm server due to sudo privilege.
 +
** Totally 35 VPP device test cases passed, and only 3 tap related tests failed.
 +
** VPP device job is running now and will be triggered per VPP patch and CSIT patch
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2/
 +
*** https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-1n-tx2/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-semiweekly/
 +
*** https://jenkins.fd.io/view/csit/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/
 
* FD.io lab
 
* FD.io lab
** Ed Kern to try containerized CI on one OD1000 in parallel with Vanessa
+
** Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
** Received MACCHIATObins in Austin
+
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 
* VPP
 
* VPP
** Adarsh trying to run VPP in VM but getting PCI mapping issue; trying to connect to Linux bridge on host
+
** https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
** Patches for build breakage were committed; arm64 build stable now
+
** Align Arm patches with VPP release plan.
** Brian able to reproduce low PPS numbers seen on MACCHIATObin
+
*** Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
 +
*** Will check VPP release schedule and map with Arm Quarterly plan.
 +
*** Note down patches in community review and align them to VPP release plan.
 +
*** It has been challenging to do that in VPP.
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue: optimize it with relaxed atomic intrinsics - Lijian
 +
** Vectorization
 +
**** The eth_input_adv_and_flags_x4 optimization is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
*** Spinlock with inner loop showed improvement on both x86 and Arm.
 +
*** Read/write lock showed a slight degradation with the patch.
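The minutes do not spell out what "spinlock with inner loop" means. One common reading is the test-and-test-and-set pattern, sketched below with C11 atomics under that assumption; the <code>ttas_*</code> names are hypothetical and this is not the merged VPP code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_bool locked;
} ttas_lock_t;

/* Try once: the atomic exchange returns the previous value, so the lock
 * was acquired only if it was previously free. */
static bool ttas_trylock(ttas_lock_t *l)
{
    return !atomic_exchange_explicit(&l->locked, true, memory_order_acquire);
}

static void ttas_lock(ttas_lock_t *l)
{
    while (!ttas_trylock(l)) {
        /* Inner loop: wait with relaxed loads only, so contending cores
         * read a shared cache line instead of repeatedly issuing atomic
         * read-modify-write operations on it. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;
    }
}

/* Release ordering makes writes in the critical section visible to the
 * next thread that acquires the lock. */
static void ttas_unlock(ttas_lock_t *l)
{
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

The acquire/release pairing is the minimum ordering the lock needs on a weakly ordered CPU such as AArch64; the inner loop can stay relaxed because the exchange that follows it re-checks the lock with acquire semantics.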
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Jieqiang checked the video by Sirshak
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches applying it to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do benchmarking and profiling on mcbin/Bluefield.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there is mixed usage of GCC vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''07/23/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
* General
 
* CSIT
 
* CSIT
** Adarsh can reproduce a crash in qemu 2.10 Ubuntu 16.04; going to try Ubuntu 17.10
+
** VPP Performance Test
** Need to partition func test cases across people
+
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
''' 2/21/2018 '''
+
** Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
 +
** Only 1 out of 199 test cases failed, 8 test cases show random 'show interface' failure.
 +
** Some failures are time-outs related to 'show hardware'/'show interface'/'show vhost dump'.
 +
*** https://jira.fd.io/browse/CSIT-1453
 +
*** Will also check details on an x86 server. It is slow on x86 too, but takes only 5 sec there versus 40 sec on Taishan - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
**** 1. All tests are failing. 'show hardware' takes too much time. https://jira.fd.io/browse/VPP-1722
 +
**** 2. To figure out which test cases are executed
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Enlarging the duration fixes the failure, but will investigate further.
 +
**** Issues have been fixed in latest master branch. Investigating the details.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Send email and current debug details to community calling for volunteer to fix it. - Lijian
 +
**** pmalloc module test cases failed on Arm server.
 +
*** Changes are uploaded to community gerrit.
 +
*** VPP VMs seem to come up well. Will work on init script and bring up VPP.
 +
**** VM tests passed. Patches are to be submitted for community review.
 +
**** All the patches are merged and all images are built.
 +
*** Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
 +
**** Docker images for both Arm and x86 are merged and available.
 +
**** https://jenkins.fd.io/sandbox/job/csit-vpp-device-master-ubuntu1804-1n-tx2-weekly/1/console
 +
**** Docker image is verified on Arm server; still need to verify it on x86 server and try it in Jenkins.
 +
**** Arm and x86 have separate docker image. Arm docker image is to be built.
 +
**** Totally 35 test cases, and only 3 tap related tests failed.
 +
*** Ed to help set up Nomad cluster with dual ThunderX and one ThunderX2
 
* FD.io lab
 
* FD.io lab
 +
** Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s 1RU blade ThunderX2.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
**** Server info in CSIT docs:
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD
 +
**** https://gerrit.fd.io/r/gitweb?p=csit.git;a=blob;f=docs/lab/testbed_specifications.md;h=afa36ff56c7be09621e85bae6a1498aadf3a1981;hb=HEAD#l495
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue: optimize it with relaxed atomic intrinsics - Lijian
 +
** Vectorization
 +
**** The eth_input_adv_and_flags_x4 optimization is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
*** Spinlock with inner loop showed improvement on both x86 and Arm.
 +
*** Read/write lock showed a slight degradation with the patch.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
*** Inform MAP owner that Jieqiang will take care of MAP on VPP. - Lijian
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches applying it to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do benchmarking and profiling on mcbin/Bluefield.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there is mixed usage of GCC vector extensions and vector intrinsics
 +
** To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 +
'''07/16/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
* General
 
* CSIT
 
* CSIT
** Gabriel updated CSIT/AArch64 wiki with PASS/FAIL/OTHER list
+
** VPP Performance Test
*** OTHER - failure due to expect-like parsing of output(?)
+
** Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
*** FAIL - ssh timeout during PCIe rescan(?)
+
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
** Moved past first UEFI crash; still seeing crashing on startup (Gabriel)
+
*** creating a job. - Everything is ready except the docker image
*** Setup new Ubuntu environment
+
**** 1. All tests are failing. 'show hardware' takes too much time. https://jira.fd.io/browse/VPP-1722
*** Continue debugging UEFI issue on Fedora with JeremyL
+
**** 2. To figure out which test cases are executed
** Ubuntu is used pretty much everywhere except for additional CentOS CSIT perf
+
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
** Nitin working on upstreaming changes to CSIT
+
** VPP Path
** Adarsh working on getting VM interfaces working
+
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Enlarging the duration fixes the failure, but will investigate further.
 +
**** Issues have been fixed in latest master branch. Investigating the details.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Send email and current debug details to community calling for volunteer to fix it. - Lijian
 +
*** Changes are uploaded to community gerrit.
 +
*** VPP VMs seem to come up well. Will work on init script and bring up VPP.
 +
**** VM tests passed. Patches are to be submitted for community review.
 +
**** Patch is split into three smaller pieces. Two patches (kernel image for VM test, and generic CSIT changes to support ThunderX2 testbed) are merged. The third patch, with the code changes for VM test (Arm-specific code and use of the kernel image), is still to be merged.
 +
*** Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
 +
**** Docker images for both Arm and x86 are merged and available.
 +
**** Docker image is verified on Arm server; still need to verify it on x86 server and try it in Jenkins.
 +
*** Ed to help set up Nomad cluster with dual ThunderX and one ThunderX2
 +
* FD.io lab
 +
** Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
 +
*** It’s a 1RU ThunderX2 blade.
 +
*** The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** The machine should have a large amount of RAM: more than 120G, with 256G preferred.
 +
*** The machine should have three NICs (XL710-QDA2, 2x40G).
 +
*** The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
 +
** ThunderX1
 
* VPP
 
* VPP
** More discussion on how to handle cache line size
+
** VPP host-stack Hotspots
** Sync'd on patches for build breakage
+
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue: optimize it with relaxed atomic intrinsics - Lijian
 +
** Vectorization
 +
**** Patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
**** The patch is also enabled for x86. Will ask maintainer to review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
*** Spinlock with inner loop shows improvement on both x86 and Arm.
 +
*** Read/write lock shows a slight degradation with the patch.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches applying it to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do benchmarking and profiling on mcbin.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixed usage of GCC vector extensions and vector intrinsics
 +
** Confirm what is needed on the firmware and VPP side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 2/14/2018 '''
+
'''07/09/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
** Christian Hopps
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** IPSEC test cases are failing and skipped on Arm server in CI/CD
 +
**** https://jira.fd.io/browse/VPP-1714
 +
**** Create a Jira ticket to track all the info related to this issue - Juraj
 +
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Enlarging the duration fixes the failure, but will investigate in more detail.
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Send email and current debug details to community calling for volunteer to fix it. - Lijian
 +
*** Changes are uploaded to community gerrit.
 +
*** VPP VMs seem to come up well. Will work on the init script and bring up VPP.
 +
**** VM tests passed. Patches are to be submitted for community review.
 +
*** Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
 +
**** Docker images for both Arm and x86 are merged and available.
 +
*** Ed to help set up Nomad cluster with dual ThunderX and one ThunderX2
 
* FD.io lab
 
* FD.io lab
** Working on getting access to LF lab in order to setup OD1000 environment
+
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
** Check with tykeal & zxiiro on trust policy for getting others access (Brian)
+
** Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
** VEXXHOST
+
*** Update the current status to Pravin. - Lijian
*** Mohammed says they do not have extra rack shelf - we need to send one for 3x MACCHIATObin
+
*** The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
*** LF RT tickets: #52434 (ThunderX), #52435 (TaiShan2280), #52436 (MACCHIATObin)
+
*** Require more than 120G of RAM; 256G preferred
 +
*** Three NICs, each with two ports.
 +
** ThunderX1
 
* VPP
 
* VPP
** Build, unit test, deb/rpm
+
** VPP host-stack Hotspots
*** 64B/128B cache line size - working on passing this configuration to rest of build system i.e. DPDK (Nitin)
+
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
*** RPi3 32-bit
+
** Message queue: optimize it with relaxed atomic intrinsics - Lijian
**** Some parts of patch are 32-bit related, some RPi3 related
+
** Vectorization
**** If there is justification, look into maintaining a 32-bit build on ARM
+
**** Patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
** Porting & Tuning
+
** Spinlock/read-write lock optimization - Jason
*** If patches need to be tested on multiple Arm chips, please use DO_NOT_MERGE and Code Review -2
+
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
*** Two NEON-related patches merged, work in progress on others, Nitin testing CLASSIFY_USE_SSE
+
*** Spinlock with inner loop shows improvement on both x86 and Arm.
 +
*** Read/write lock shows a slight degradation with the patch.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Apply dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Submitted patches applying it to the dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
 +
*** Will do benchmarking and profiling on mcbin.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** Think about running VPP on big-endian CPU, as there's mixed usage of GCC vector extensions and vector intrinsics
 +
** Confirm what is needed on the firmware and VPP side to enable mcbin cache stashing - Honnappa
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
 
 +
'''07/02/2019'''
 +
* Attendees
 +
** Tina Tsou
 +
** Honnappa Nagarahalli
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Jason Zhang
 +
** Juraj Linkes
 +
* General
 
* CSIT
 
* CSIT
** Please open JIRA ticket with details on VM crashing on startup. DONE: [https://jira.fd.io/browse/CSIT-922 CSIT-922]
+
** VPP Performance Test
** Khem working on running VPP func tests on internal setup
+
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** IPSEC test cases are failing and skipped on Arm server in CI/CD
 +
**** https://jira.fd.io/browse/VPP-1714
 +
**** Create a Jira ticket to track all the info related to this issue - Juraj
 +
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Enlarging the duration fixes the failure, but will investigate in more detail.
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - Lijian
 +
**** The current default C compiler identification is GNU 8.3.0
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
**** Send email and current debug details to community calling for volunteer to fix it. - Lijian
 +
*** VPP VMs seem to come up well. Will work on the init script and bring up VPP.
 +
*** Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
 +
*** Set up Nomad cluster with dual ThunderX and one ThunderX2
 +
* FD.io lab
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
 +
*** Update the current status to Pravin. - Lijian
 +
*** The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** Require more than 120G of RAM; 256G preferred
 +
*** Three NICs, each with two ports.
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Message queue, remove atomic intrinsics and use lock version only - Lijian
 +
*** Have verified removing atomic intrinsics in message_queue alloc/free APIs, and require confirmation from Florin.
 +
** Vectorization
 +
**** Patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
 +
** Spinlock/read-write lock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
** Fix ip4_forward compiling - Jason
 +
*** Will check gerrit CI/CD related with that patch. Check why it's not warning in gerrit Jenkins.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Spread dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Will do benchmarking and profiling on mcbin.
 +
** Think of memory usage and optimization for smaller device/memory
 +
*** http://espressobin.net/announcing-espressobin-v7-revision/
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 2/7/2018 '''
+
'''06/25/2019'''
* LF lab
+
* Attendees
** OD1000 - last machine was racked; Vanessa needs credentials
+
** Tina Tsou
** Taishan2280 - machines arrived at Vexxhost; confirm with Rudy/Mohammed
+
** Honnappa Nagarahalli
** ThunderX - machines arrived at Vexxhost; send board details to Mohammed
+
** Lijian Zhang
** MACCHIATObin - boards arrived in Arm SJC waiting for enclosures (Andy)
+
** Jieqiang Wang
* Build, unit test, packaging
+
** Jason Zhang
** 64B/128B cache line size - working on it (Nitin)
+
** Juraj Linkes
** Interest in ILP32 from Cavium; customer coming from MIPS32
+
* General
*** [https://www.slideshare.net/linaroorg/bkk16305b-ilp32-performance-on-aarch64 BKK16-305B ILP32 Performance on AArch64]
+
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** IPSEC test cases are failing and skipped on Arm server in CI/CD
 +
**** https://jira.fd.io/browse/VPP-1714
 +
**** Create a Jira ticket to track all the info related to this issue - Juraj
 +
*** Working on MAC learning test failures on Cortex-A72 server - Jieqiang
 +
**** Enlarging the duration fixes the failure, but will investigate in more detail.
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - Juraj
 +
**** The current default C compiler identification is GNU 8.3.0
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
 +
*** Crypto test cases will use the DPDK driver if configured, then the native VPP implementation, falling back to OpenSSL
 +
**** Will try Crypto test cases next week - Juraj
 +
*** Juraj to send Lijian the details of vpp VMs, Lijian will confirm internally
 +
* FD.io lab
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
 +
*** Firstly will sponsor the machine
 +
*** The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
 +
*** Require more than 120G of RAM; 256G preferred
 +
*** Three NICs, each with two ports.
 +
** ThunderX1
 
* VPP
 
* VPP
** NEON usage in vhost - sent first patch for review (Nitin)
+
** VPP host-stack Hotspots
*** Need to verify how it performs on other Arm-based machines (Brian)
+
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
*** VPP maintainers prefer to use SIMD wrappers, but it might not always be possible
+
** Message queue, remove atomic intrinsics and use lock version only - Lijian
**** Cavium/Arm had to rewrite the algorithm for AArch64 instead of using SIMD wrappers in DPDK
+
*** Have verified removing atomic intrinsics in message_queue alloc/free APIs, and require confirmation from Florin.
** CLIB_HAVE_VEC128 - working on it (Gabriel)
+
** Vectorization
** Discussed compiler builtins for atomics in VPP call; need to spin another patch with wrappers based on architecture (Kevin)
+
**** Patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
** Seeing prefetch hotspots on TX2+MlnxCX4en (similar to Armada8040) (Nitin)
+
** Spinlock optimization - Jason
 +
*** Refactored spinlock and added test file for spinlock. Patches are under internal review.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Spread dual/quad optimization - Lijian
 +
*** Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
 +
*** Will do benchmarking and profiling on mcbin.
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''06/18/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina Tsou
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Juraj Linkes
 +
* General
 
* CSIT
 
* CSIT
** libvirt crashing on VM startup (Hierofalcon) (Gabriel)
+
** VPP Performance Test
*** Need someone who can reproduce this issue (Arm TBD)
+
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
** Huawei also seeing VM issues (Khem)
+
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
** buildroot doesn't work on Arm (Nitin)
+
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
*** Root issue: no support in GRUB for AArch64 in buildroot (?)
+
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
**** Need someone who can reproduce this issue (Arm TBD)
+
*** creating a job. - Everything is ready except the docker image
*** Peter Mikus replied to Nitin on csit-dev mail list
+
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
*** Using a temporary workaround: use a different VM image (Ubuntu Cloud) instead of one produced by buildroot
+
** VPP Path
**** Working on patching DPDK in VM image (Ubuntu Cloud) just like done in buildroot
+
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - Juraj
* Misc
+
**** The current default C compiler identification is GNU 8.3.0
** OpenFlow (Nitin, Damjan)
+
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
*** Is there an OpenFlow agent for VPP, and can VPP implement OpenFlow rules/tables?
+
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
*** VPP is not flow-based like OVS is; they are different
+
** VPP Device
*** Can ODL/Honeycomb be used?
+
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
*** Will confirm with Florin to remove atomic intrinsics in message_queue alloc/free APIs
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin showing some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - Upstreamed.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
** Spread dual/quad optimization - ethernet-input
 +
** Redo perf/MAP profiling/benchmarking
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** Apply dual/quad optimization on more data path nodes
 +
*** Investigate and optimize VPP hash and bihash library
 +
*** VPP translation overhead analysis between mbuf and vlib buffer ENTNET-1293
 +
*** VPP Memif performance analysis and optimization ENTNET-1292
 +
*** VPP l3fwd performance analysis and optimization ENTNET-751
 +
*** Using MAP with VPP ENTNET-1288
  
''' 1/31/2018 '''
+
'''06/11/2019'''
* LF lab
+
* Attendees
** OD1000 - 1 replacement being installed this week
+
** Sirshak Das
** Huawei & Cavium boards should arrive at colo this week; confirm with Rudy
+
** Honnappa Nagarahalli
* Build, unit test, packaging
+
** Tina Tsou
** Kubeproxy/NAT failures
+
** Lijian Zhang
*** Not arch related
+
** Jieqiang Wang
*** Part of extended unit tests, so does not block CI
+
** Juraj
** `make test` passes on D03 & D05 (Ubuntu)
+
* General
* MACCHIATObin
+
** Seeing hotspots in VPP graph nodes
+
*** L3 forwarding - ip4 rewrite node
+
*** L2 cross-connect
+
*** Try reducing quad loop to a dual loop
+
*** dpdk-input node is highly optimized for x86 (could contribute to low perf), but hotspots are still in rte_mbuf_t conversion(?)
+
** Some examples of runtime code selection based on uarch exist in the codebase
+
 
* CSIT
 
* CSIT
** Adrian Oanca joined from Enea
+
** VPP Performance Test
** Gabriel seeing VM crashing during boot; related to # interfaces assigned (6)
+
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
** Nitin ran into issue with buildroot on arm64; see thread on csit-dev
+
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - Juraj
 +
**** The current default C compiler identification is GNU 8.3.0
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require two ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
*** Will confirm with Florin to remove atomic intrinsics in message_queue alloc/free APIs
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin showing some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - Upstreamed.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
** Spread dual/quad optimization - ethernet-input
 +
** Redo perf/MAP profiling/benchmarking
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 1/24/2018 '''
+
'''06/04/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina Tsou
 +
** Lijian Zhang
 +
** Jieqiang Wang
 +
** Stan
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 
* VPP
 
* VPP
** DPDK issue with non-pci network cards
+
** VPP host-stack Hotspots
** build & test status updated
+
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
** VPP-1127 (VEC_128 enable) under discussion. Should we enable this by default ?
+
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
** add Nitin to review Neon commits
+
** iperf3 performance with Hoststack.
** VPP-1114 currently internal review
+
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
** VPP-1064 under rework after review by Damjan
+
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin showing some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - Upstreamed.
 +
** MAP with VPP - error is resolved. Sort of working. Record the details.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''05/28/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina
 +
** Lijian Zhang
 +
** Vijay (vijayakumar.rajamanickam@nokia.com)
 +
* General
 
* CSIT
 
* CSIT
** first 3-nodes functional tests status list
+
** VPP Performance Test
** TODO Gabriel: share CSIT VM setup env
+
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
** nested VM: build-root package support for ARM. Create Jira ticket for Brian.
+
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2 (currently only one ThunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Investigating message queue: understand the use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin showing some unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP with VPP - Tried internal patch; still failing. Continuing to work on it.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). Will implement for AArch64 128-bit only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 1/17/2018 '''
+
'''05/21/2019'''
* Tina to send calendar invite for meeting
+
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina
 +
** Lijian Zhang
 +
** Vijay (vijayakumar.rajamanickam@nokia.com)
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 
* FD.io lab
 
** Cavium shipping
+
** Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 
* VPP
 
** Kubeproxy tests failing
+
** VPP host-stack Hotspots
** Khem trying to find out the PCIe address for a given netdev interface
+
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** Investigating message queue; understand use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin shows unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, all traffic from a NIC with promiscuous mode enabled falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP with VPP - Tried the internal patch; still failing. Continuing to work on it.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''05/14/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina
 +
** Lijian Zhang
 +
** Vijay (vijayakumar.rajamanickam@nokia.com)
 +
* General
 
* CSIT
 
** Gabriel setting up 3 node topo with VMs
+
** VPP Performance Test
** Gabriel working on PASS/FAIL status
+
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
* [https://docs.fd.io/csit/rls1710/report/index.html CSIT 17.10 report]
+
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
** VPP generic distro package building patch - Patch updated. Requires Damjan's follow-up review.
 +
** Investigating message queue; understand use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin shows unstable performance.
 +
** Vectorization
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, all traffic from a NIC with promiscuous mode enabled falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP with VPP - Tried the internal patch; still failing. Continuing to work on it.
 +
** Investigate hyperscan plugin in VPP - Sirshak
 +
*** DPI plugin?
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 1/10/2018 '''
+
'''05/07/2019'''
* Meeting moved 2 hours earlier - 6AM PT / 3PM CET / 7:30PM IST / 10PM CST
+
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina
 +
** Lijian Zhang
 +
** Vijay (vijayakumar.rajamanickam@nokia.com)
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 
* FD.io lab
 
** Cavium ThunderX shipping soon
+
** Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 
* VPP
 
** Kumar to look at VPP-1126
+
** VPP host-stack Hotspots
** Gabriel proposed https://gerrit.fd.io/r/#/c/10049/ as follow-up to Damjan's patch
+
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
*** Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input.
 +
** VPP generic distro package building patch - Patch updated; Damjan's follow-up review required.
 +
** Investigating message queue; understand use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin shows unstable performance.
 +
** Vectorization
 +
*** Vectorization in esp-encrypt, optimize memcpy_le. Upstreamed(https://gerrit.fd.io/r/#/c/18398/). - Lijian
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, all traffic from a NIC with promiscuous mode enabled falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP (Arm proprietary performance analysis tool) with VPP - Tried the internal patch; still failing. Continuing to work on it.
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
 
 +
'''04/30/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Honnappa Nagarahalli
 +
** Tina
 +
* General
 
* CSIT
 
** Gabriel's patch for aarch64 support in CSIT merged
+
** VPP Performance Test
** VirtualBox not supported on Arm / Vagrant unknown
+
** Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
*** This is OK for upstream since automation expects VMs to already exist
+
** Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
* Performance
+
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
** Need plan for 1T; use TaiShans that were sent to lab
+
** Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
* AIs
+
*** creating a job. - Everything is ready except the docker image
** Brian: Follow up with Vanessa and EdW regarding 'resource issue'
+
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
** Gabriel: Update CSIT wiki page; which tests are passing/failing?
+
** VPP Path
** Brian: Check with Vanessa how to split machines between CI jobs and CSIT jobs
+
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status:  
 +
*** mcbin: Kernel Migration on mcbin. Status:
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status:
 +
* FD.io lab
 +
** Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
* VPP
 +
** VPP host-stack Hotspots
 +
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
 +
*** Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input.
 +
** VPP generic distro package building patch - Patch updated; Damjan's follow-up review required.
 +
** Investigating message queue; understand use case with svm queue and discuss the ideas with Florin - ongoing - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** VPP on MACCHIATObin shows unstable performance.
 +
** Vectorization
 +
*** Vectorization in esp-encrypt, optimize memcpy_le. Upstreamed(https://gerrit.fd.io/r/#/c/18398/). - Lijian
 +
*** ethernet-input causes performance drop on AArch64.
 +
**** Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, all traffic from a NIC with promiscuous mode enabled falls into the so-called slow path.
 +
**** A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
 +
** TAS patch - internal Review.
 +
** MAP (Arm proprietary performance analysis tool) with VPP - Tried the internal patch; still failing. Continuing to work on it.
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
  
''' 1/3/2018 '''
+
'''04/23/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Lijian Zhang
 +
** Juraj Linkeš
 +
** Vijay
 +
** Nitin
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Honnappa Nagarahalli
 +
* General
 +
* CSIT
 +
** VPP Performance Test
 +
** List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** QSFP+ is available and working now.
 +
**** Juraj to resend email to Mahamad about the details, including Sirshak and Tina
 +
*** mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status: server is set up. Management connection works. Intel NICs are connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
 
* FD.io lab
 
** One OD1000 sent for RMA
+
** Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
** Huawei PO sent out
+
** ThunderX1
** Cavium PO sent out (?)
+
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
 +
*** Will use these four new ThunderX1 servers for CI/Jenkins to replace the previous three old ThunderX1 servers.
 +
*** These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
 +
*** Investigate why these three blades have only one numa node - Juraj
 
* VPP
 
** Gabriel working on patch for "show cpu" to display MIDR as human readable
+
** Investigate session_queue_node_fn/vlib_worker_loop.
** Nitin sent preliminary patch for vhost-user NEON impl
+
*** Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
*** Seeing perf differences on different cores; tradeoff is single-threaded perf vs. NEON
+
*** Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input
** Kumar built and unit test successfully on D03
+
** Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
** Nitin to resume patch for supporting different cache line sizes for the same arch
+
** Investigating message queue; understand use case with svm queue and discuss the ideas with Florin - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
 +
** With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate. (Reason unknown.)
 +
** Vectorization
 +
*** Vectorization in esp-encrypt, https://gerrit.fd.io/r/#/c/18398/ - Improvement on ThunderX/OcteonTX/Taishan, but degradation on ThunderX2 - Lijian
 +
** TAS patch will be ready soon (Sirshak)
 +
** MAP with VPP is ongoing - Sirshak
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
* Action Items - Last Week
 +
* Action Items - Next Week
 +
 
 +
'''04/16/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Lijian Zhang
 +
** Juraj Linkeš
 +
** Vijay
 +
** Nitin
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Honnappa Nagarahalli
 +
* General
 
* CSIT
 
** Gabriel cleaned up WIP patch; ready for review
+
** VPP Performance Test
** Kumar starting CSIT func tests with Ubuntu VMs
+
** List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
*** Scripts for running on dedicated hardware need to be modified, e.g. PCIe resources
+
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
** Kumar to send doc on testing
+
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
* Performance
+
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
** Kumar to start thread on performance testing
+
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
* AIs
+
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
** Brian: Check with Tina on shipping and open LF RT ticket once they have arrived
+
*** b. merging CSIT patch. - Closing done
** Brian: Need a way to choose either SW or NEON impl based on chip
+
*** c. creating a job. - Everything is ready except the docker image
** Gabriel: Create list of broken CSIT tests for 2-node topology
+
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
''' 12/20/2017 '''
+
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
'''No meeting next week - Dec 27'''
+
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** QSFP+ is available and working now.
 +
**** Juraj to resend email to Mahamad about the details, including Sirshak and Tina
 +
*** mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status: server is set up. Management connection works. Intel NICs are connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
 
* FD.io lab
 
** OD1000s - build only
+
** Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
*** 1 of 3 needs to be RMAd
+
** ThunderX1
*** Can these be up in time to show 'make test' passes on ARM for 18.01 release report?
+
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
** TaiShan
+
*** Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
*** PO in progress
+
*** Will use these four new ThunderX1 servers for CI/Jenkins to replace the previous three old ThunderX1 servers.
** ThunderX - build only
+
*** These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
*** PO went out
+
*** Investigate why these three blades have only one numa node - Juraj
 
* VPP
 
** Patches / JIRAs
+
** Investigate session_queue_node_fn/vlib_worker_loop. - https://jira.arm.com/browse/ENTNET-1179 - Done
*** Patch for extended test failure, but still more (new) extended test failures - Gabriel
+
*** Will create two Jira tickets to track the findings. - Lijian
*** Nitin to post vhost-user.c changes for NEON
+
** Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
**** Nitin will finish Gabriel's original NEON patch to add CLIB_HAVE_VEC_128
+
** Investigating message queue - Lijian
** Can we share code on Github e.g. NEON perf tests?
+
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
 +
**** Will resume Taishan host-stack setup - Lijian
 +
*** Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
 +
** With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate. (Reason unknown.)
 +
** Vectorization
 +
*** Vectorization in esp-encrypt, https://gerrit.fd.io/r/#/c/18398/ - Improvement on ThunderX/OcteonTX/Taishan, but degradation on ThunderX2 - Lijian
 +
** EPIC for next quarter:
 +
*** ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
 +
*** Message Queue - Planned (Lijian)
 +
*** VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
 +
*** TAS patch (Sirshak)
 +
*** MAP with VPP - Planned (Sirshak)
 +
*** Roadmap for TCP optimization
 +
**** Timer implementation - (Sirshak) - Indicative
 +
**** perf analysis - Planned (Sirshak)
 +
***** TCP state machine from weak memory model perspective
 +
* Action Items - Last Week
 +
* Action Items - Next Week
 +
 
 +
'''04/09/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Lijian Zhang
 +
** Juraj Linkeš
 +
** Nitin
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Honnappa Nagarahalli
 +
* General
 
* CSIT
 
** Leading question: How many CSIT test cases are passing/failing?
+
** VPP Performance Test
** Environment issues preventing running through all CSIT test cases; Gabriel needs dedicated machines or more RAM
+
** List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
** Cavium & Huawei will join Gabriel in CSIT replication on ARM hardware next week
+
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
*** Cavium previously ran vhost test cases manually, now moving to CSIT
+
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
*** Patch to resolve the issue is in community review. https://gerrit.fd.io/r/#/c/18278/ - Lijian
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** QSFP+ is available and working now.
 +
**** Juraj to resend email to Mahamad about the details, including Sirshak and Tina
 +
*** mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** thunderx2: Status: server is set up. Management connection works. Intel NICs are connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
 +
* FD.io lab
 +
** Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
 +
*** Will use these four new ThunderX1 servers for CI/Jenkins to replace the previous three old ThunderX1 servers.
 +
*** These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
 +
*** Investigate why these three blades have only one numa node - Juraj
 +
* VPP
 +
* VPP Hoststack
 +
** Investigate session_queue_node_fn/vlib_worker_loop. - https://jira.arm.com/browse/ENTNET-1179 - Done
 +
** Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
 +
** Investigating message queue - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
 +
*** Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
 +
** With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate. (Reason unknown.)
 +
** Vectorization
 +
*** Vectorization in esp-encrypt, https://gerrit.fd.io/r/#/c/18398/ - Improvement on ThunderX/OcteonTX/Taishan, but degradation on ThunderX2 - Lijian
 +
*** ethernet-input - will implement for aarch64 128bits only
 +
*** Create vectorization specific EPIC - Lijian
 +
* Action Items - Last Week
 +
* Action Items - Next Week
  
''' 12/13/2017 '''
+
'''04/02/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Nitin
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** After assigning dedicated CPU cores to VPP main/VPP worker/iperf3 server, the VPP hoststack performs better than the Linux stack on both ThunderX2 and Taishan servers.
 +
** Investigate session_queue_node_fn/message queue data structure. - Investigating the source code
 +
** Review https://gerrit.fd.io/r/#/c/18398/ - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed. Perf degradation is fixed.
 +
*** Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK on it now.
 +
** With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate. (Reason unknown.)
 +
* CSIT
 +
** VPP Performance Test
 +
** List all the blockers on aarch64 in CSIT wiki page - Stan or Juraj
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
** Build both binaries and packages with the generic option by default, and provide the Makefile variable NATIVE_OPTIMIZE=Y so end users can build natively optimized images.
 +
*** Prepare email and a draft patch asking for comments from the community - Lijian
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** QSFP+ is available and working now.
 +
**** Juraj to resend email to Mahamad about the details, including Sirshak and Tina
 +
*** mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
 +
*** thunderx2: Status: server is set up. Management connection works. Intel NICs are connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
 +
* FD.io lab
 +
** Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Management connection through the QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
 +
*** Will use these four new ThunderX1 servers for CI/Jenkins, replacing the previous three old ThunderX1 servers.
 +
*** These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
 +
*** Investigate why these three blades have only one numa node - Juraj
 
* VPP
 
** Quick overview of work items
+
** Write description/expectation for the two NEON-related patches - Lijian
** Waiting to hear back from LF about OD1000 connectivity
+
** Investigating performance degradation on Cortex-A72 - Sirshak
*** Changes needed to ci-mgmt
+
** Message queue - Sirshak
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - To close it.
 +
** Vectorization
 +
*** ethernet-input - no progress yet
 +
** 128B cache line size
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
 +
 
 +
'''03/26/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Nitin
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** After assigning dedicated CPU cores to VPP main/VPP worker/iperf3 server, both ThunderX2 and Taishan server VPP hoststack perform better than the Linux stack.
 +
** Investigate session_queue_node_fn/message queue data structure. - Investigating the source code
 +
** Review https://gerrit.fd.io/r/#/c/18398/ - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed. Perf degradation is fixed.
 +
*** Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK on it now.
 +
** With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
 
* CSIT
 
** Starting to reproduce CSIT on x86 and ARM (with Gabriel's WIP patch)
+
** VPP Performance Test
*** Some issues with environment variables (perf tests on 2-node)
+
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
** Need Nexus to support aarch64 packages
+
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
*** Need a contact for Nexus
+
** Both binaries and packages are built with the generic option by default; a Makefile variable NATIVE_OPTIMIZE=Y is provided so end users can build native-optimized images.
* Share known issues on wiki!
+
*** Prepare email and a draft patch asking comments from community - Lijian
* Request CSIT 'deep dive'
+
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** QSFP+ is available and working now.
 +
**** Juraj to resend email to Mahamad about the details, including Sirshak and Tina
 +
*** mcbin: Kernel Migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
 +
*** thunderx2: Status: server is set up. Management connection works. Intel NICs are properly connected. Will prepare the server for VPP device testing. Now working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
 +
* FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Management connection through the QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
 +
*** Will use these four new ThunderX1 servers for CI/Jenkins, replacing the previous three old ThunderX1 servers.
 +
*** These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
 +
*** Investigate why these three blades have only one numa node - Juraj
 +
* VPP
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - To close it.
 +
** Vectorization
 +
*** ethernet-input - no progress yet
 +
** 128B cache line size
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
  
''' 12/06/2017 '''
+
'''03/19/2019'''
* Can we access the OD1000 in csit lab ?
+
* Attendees
** currently mainly working with VMs
+
** Sirshak Das
* added dedicated wiki page for CSIT : https://wiki.fd.io/view/CSIT/AArch64
+
** Juraj Linkeš
* WIP : https://gerrit.fd.io/r/#/c/9474/
+
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** After assigning dedicated CPU cores to VPP main/VPP worker/iperf3 server, both ThunderX2 and Taishan server VPP hoststack perform better than the Linux stack.
 +
** vlib_worker_loop and session_queue_node_fn are two major hot-spots. - Just started
 +
** Enable NEON instruction in Buffer pool free function. Patch is committed.
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed, but still working on issues, e.g., performance degradation
 +
*** Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Done by Malvika.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
*** Prepare email and a draft patch asking comments from community - Lijian
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
**** Juraj to resend email to Mahamad about the details, including Sirshak and Tina
 +
*** mcbin: Kernel Migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
**** Confirm with Gorka if their mcbin can support docker. If yes, then ask them to provide image with their latest kernel/file system/dtd
 +
*** Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
 +
*** thunderx2: Status: Talk to edk about deployment strategy with 1-node. Also blocked by QSFP+ issue.
 +
* FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Juraj setup call with LF people. Status: Done.
 +
** ThunderX2
 +
*** Cables: Sent. Juraj to open another tkt for wiring the ThunderX2.
 +
* VPP
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - In internal review.
 +
** Vectorization
 +
*** ethernet-input - no progress yet
 +
*** buffer pools - https://jira.fd.io/browse/VPP-1560. In internal review
 +
** 128B cache line size
 +
*** VPP image with 128B cache line size crashed on ThunderX2 - Cannot reproduce crash with my setup
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
 +
** Commit VPP distro making patch - Lijian
 +
** Plugin 25G NIC Taishan server, and connect the 25G ports to x86 25G NIC - Lijian
 +
** Follow Jianlin's suggestion, update Uboot and Kernel, and then sync up with Juraj - Lijian
  
''' 11/29/2017 '''
+
'''03/12/2019'''
*VPP
+
* Attendees
** vhost-user.c - SSE4.2 only. Implement range search using NEON. (nitin)
+
** Sirshak Das
** OD1000 status ?
+
** Juraj Linkeš
*** build only
+
** Stanislav Chlebec
*** can we access them ?
+
** Khemendra Kumar
*** what can we do to help in general ?
+
** Tina Tsou
** x86 intrinsic review
+
** Andy Wang
** build VPP on ARM VM on x86
+
** Gorka
*CSIT
+
** Fede
** what platforms will be made available
+
** Honnappa Nagarahalli
 +
* General
 +
** Tina to update the meeting notice.
 +
* VPP Hoststack
 +
** After assigning dedicated CPU cores to VPP main/VPP worker/iperf3 server, both ThunderX2 and Taishan server VPP hoststack perform better than the Linux stack.
 +
** vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
 +
** Enable NEON instruction in Buffer pool free function. Patch is committed.
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. All test failures are resolved.
 +
*** Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Switched to Malvika.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
*** Prepare email and a draft patch asking comments from community - Lijian
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** mcbin: Kernel Migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
*** Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
 +
*** thunderx2: Status: Talk to edk about deployment strategy with 1-node.
 +
* FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Juraj setup call with LF people. Status: Done.
 +
** ThunderX2
 +
*** Cables: Sent. Juraj to open another tkt for wiring the ThunderX2.
 +
* VPP
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - In internal review.
 +
** Vectorization
 +
*** ethernet-input - no progress yet
 +
*** buffer pools - https://jira.fd.io/browse/VPP-1560. In internal review
 +
** 128B cache line size
 +
*** VPP image with 128B cache line size crashed on ThunderX2
 +
** thunderx2 crashing - No update
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
 +
** Commit VPP distro making patch - Lijian
 +
** Plugin 25G NIC Taishan server, and connect the 25G ports to x86 25G NIC - Lijian
 +
** Follow Jianlin's suggestion, update Uboot and Kernel, and then sync up with Juraj - Lijian
  
''' 11/22/2017 '''
+
'''03/05/2019'''
* VPP CI
+
* Attendees
** 3 ThunderX for Christmas
+
** Sirshak Das
 +
** Juraj Linkeš
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** After assigning dedicated CPU cores to VPP main/VPP worker/iperf3 server, both ThunderX2 and Taishan server VPP hoststack perform better than the Linux stack.
 +
** vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. All test failures are resolved.
 +
*** Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Switched to Malvika.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
 
* CSIT
 
** func on VM vs perfs on HW
+
** VPP Performance Test
** func on x86 VMs OK with 2 nodes
+
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
** DPDK integration WIP : https://gerrit.fd.io/r/#/c/9474/
+
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
** issues
+
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
*** how to access the lab ?
+
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
* Next steps
+
*** b. merging CSIT patch. - Closing done
** VPP
+
*** c. creating a job. - Everything is ready except the docker image
** CSIT
+
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
*** structure work & send email (Gabriel)
+
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
*** is xxhash vs crc32 finished ? (Gabriel)
+
** VPP Path
*** ask Maciek & setup a presentation meeting with someone from CSIT (Tina)
+
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
*** find a time to reschedule this meeting before the CSIT weekly call (Brian)
+
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** mcbin: Kernel Migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
*** Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
 +
*** thunderx2: Status: Talk to edk about deployment strategy with 1-node.
 +
* FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Juraj setup call with LF people. Status: Done.
 +
** ThunderX2
 +
*** Cables: Sent. Juraj to open another tkt for wiring the ThunderX2.
 +
* VPP
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - No progress
 +
*** Investigate with latest VPP code on x86 server - Lijian - Email the vpp-dev mailing list if there's a problem. Will not put much effort into this.
 +
** Vectorization
 +
*** ethernet-input
 +
*** buffer pools
 +
** 128B cache line size
 +
*** Will try this on Taishan server - Slight performance degradation with 128-byte cache line
 +
** thunderx2 crashing - No update
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
  
''' 11/15/2017 '''
+
'''02/26/2019'''
* VPP upstream status
+
* Attendees
** build && build-release OK
+
** Sirshak Das
** "make test" && "make test-debug" OK
+
** Juraj Linkeš
** packaging:
+
** Stanislav Chlebec
*** Ubuntu 16.04 OK
+
** Khemendra Kumar
*** Ubuntu 17.10 ? (TBC)
+
** Tina Tsou
*** fedora-26 OK
+
** Andy Wang
* vpp continuous test
+
** Gorka
** all tasks required for the Jenkins "verify" job are ready
+
** Fede
** TODO: request gerrit hook from Dave Barach / vpp-dev (NB & GG)
+
** Honnappa Nagarahalli
** set up ci in fdio lab
+
* General
 +
* VPP Hoststack
 +
** After assigning dedicated CPU cores to VPP main/VPP worker/iperf3 server, both ThunderX2 and Taishan server VPP hoststack perform better than the Linux stack.
 +
** el0_sys hot-spot on Taishan D05 only, no plan to fix it.
 +
** vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
 +
** memcpy optimization
 +
*** memcpy patch verification on taishan by khem l3 forwarding usecase- Lijian Status(khem): No updates.
 +
*** memcpy patch consumes more clocks in OcteonTX2 - updated by Nitin.
 +
*** Stopped working on this patch.
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Test failure on SCTP, not root-caused yet.
 +
*** Octeon-Tx Status(Sirshak): yet to try steps from Gorka for USB Ubuntu rootfs installation. - Switched to Malvika.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
 
* CSIT
 
** setting up env
+
** VPP Performance Test
** ThunderX platforms should arrive this week
+
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
** csit work sharing
+
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
 +
*** b. merging CSIT patch. - Closing done
 +
*** c. creating a job. - Everything is ready except the docker image
 +
** Target: master trending job - first create a trending graph from daily data, then create the release report (requires some manual work)
 +
** Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559 - No update
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
**** Doesn't work; seems to be caused by improper cross-building tools. https://wiki.fd.io/view/VPP/Build_System_Deep_Dive
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** mcbin: Kernel Migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
*** Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
 +
*** thunderx2: Status: Talk to edk about deployment strategy with 1-node.
 +
* FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Juraj setup call with LF people. Status: Done.
 +
** ThunderX2
 +
*** Cables: Sent. Juraj to open another tkt for wiring the ThunderX2.
 +
* VPP
 +
** Buffer Pools per NUMA
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
 +
*** Investigate with latest VPP code on x86 server - Lijian - Email the vpp-dev mailing list if there's a problem. Will not put much effort into this.
 +
** Vectorization
 +
*** ethernet-input
 +
*** buffer pools
 +
** 128B cache line size
 +
*** Will try this on Taishan server - Slight performance degradation with 128-byte cache line
 +
** Qualcomm no change iperf3
 +
** thunderx2 crashing - No update
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
  
''' 11/8/2017 '''
+
'''02/19/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** After assigning dedicated CPU cores to VPP main/VPP worker/iperf3 server, both ThunderX2 and Taishan server VPP hoststack perform better than the Linux stack.
 +
** memcpy optimization
 +
*** memcpy patch verification on taishan by khem l3 forwarding usecase- Lijian Status(khem): No updates.
 +
*** memcpy patch consumes more clocks in OcteonTX2 - updated by Nitin.
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch.
 +
*** Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible. https://jira.fd.io/browse/VPP-1566
 +
** Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/
 +
*** b. merging CSIT patch.
 +
*** c. creating a job.
 +
** Target: master trending job
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** mcbin: Kernel Migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
*** Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
 +
*** thunderx2: Status: Talk to edk about deployment strategy with 1-node.
 +
* FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Juraj setup call with LF people. Status: Done.
 +
** ThunderX2
 +
*** Cables: Sent. Juraj to open another tkt for wiring the ThunderX2.
 +
* VPP
 +
** Buffer Pools per NUMA
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
 +
** 1GB page taking long time Status: fixed.
 +
*** Investigate with latest VPP code on x86 server
 +
** Vectorization
 +
*** ethernet-input
 +
*** buffer pools
 +
*** memcpy
 +
** 128B cache line size
 +
*** Will try this on Taishan server - Lijian
 +
** Qualcomm no change iperf3
 +
** thunderx2 crashing - No update
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
  
* Unit tests
+
'''02/11/2019'''
** Tests pass except for random initialization failures
+
* Attendees
** Need to hear back from upstream about Extended unit tests
+
** Sirshak Das
* Should we run plugins such as NSH SFC?
+
** Juraj Linkeš
* Hardware to lab
+
** Stanislav Chlebec
** Huawei h/w stalled
+
** Khemendra Kumar
** 3x ThunderX shipping to FD.io lab
+
** Tina Tsou
* CSIT replication
+
** Andy Wang
** Cavium replicating on ThunderX2; getting started
+
** Gorka
* Let's track our work in Jira; Brian to migrate tasks to Jira
+
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** memcpy optimization
 +
*** memcpy patch verification on taishan by khem l3 forwarding usecase- Lijian Status(khem): No updates.
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
 +
*** svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch.
 +
*** Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
* CSIT
 +
** VPP Performance Test
 +
** Package Installation error Status(Juraj): interfaces configured numa node 2,3 are not visible.
 +
** Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
 +
*** a. Host Config
 +
*** b. merging CSIT patch.
 +
*** c. creating a job.
 +
** Target: master trending job
 +
** VPP Path
 +
*** gcc-8 compilation: Jira(Sirshak): https://jira.fd.io/browse/VPP-1559
 +
*** cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
 +
**** Status: Juraj to bring this up in CSIT call. (start with just cross-compilation)
 +
** VPP Device
 +
*** thunderx  Status: 1-node topology was rewired because of QSFP+ switch.
 +
*** mcbin: Kernel Migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jialin suggested different boot parameters; Juraj has yet to try them.
 +
*** thunderx2: Status: Talk to edk about deployment strategy with 1-node.
 +
* FD.io lab
 +
** ThunderX1
 +
*** QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
 +
*** Juraj setup call with LF people. Status: Done.
 +
** ThunderX2
 +
*** Cables: Sent. Juraj to open another tkt for wiring the ThunderX2.
 +
* VPP
 +
** Buffer Pools per NUMA
 +
** Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
 +
** 1GB page taking long time Status: fixed.
 +
** Vectorization
 +
*** ethernet-input
 +
*** buffer pools
 +
*** memcpy
 +
** 128B cache line size
 +
** Qualcomm no change iperf3
 +
** thunderx2 crashing
 +
** Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
 +
**
 +
'''02/05/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
* General
 +
* VPP Hoststack
 +
** memcpy optimization
 +
*** Check that the optimized memory copy version is deployed on Taishan and ThunderX2 at runtime - Lijian
 +
*** Send memcpy patch to Khem and Fede for further verification - Lijian Status: fede: small improvement in mcbin with iperf3, khem to try them with l3 forwarding
 +
** iperf3 performance with Hoststack.
 +
*** ip4_local_inline quad loop under investigation
 +
*** Working on svm_fifo alternate version with front and back pointers synchronized instead of cursize.
 +
** Verifying per NUMA node buffer pool https://gerrit.fd.io/r/#/c/16638/
 +
*** sirshak create jira id in fd.io jira. https://jira.fd.io/browse/VPP-1560
 +
*** The apparent VPP hang is actually VPP taking a long time to allocate 400K chunks for 1GB - Damjan has this on his to-do list
 +
*** gcc-8 compilation still fails on ARM.
 +
**** sirshak create a jira id in fd.io jira. Status: https://jira.fd.io/browse/VPP-1559
 +
*** Octeon-Tx failure. Status: unknown
 +
** Gorka is trying some optimal configs for VCL. Status: no updates.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** OcteonTx boots to buildroot with no dhclient hence an impasse. Still not clear how to use USB stick.
 +
* CSIT
 +
** VPP Path
 +
*** Sirshak to keep track of gcc-8 compilation, once clean we can switch to gcc-8. https://jira.fd.io/browse/VPP-1559
 +
*** ARM CI results are overwritten by x86 machines. Should be a Jenkins issue. Monitor whether this corner case happens again. - Juraj
 +
*** Add cross compilation CI Juraj: https://jira.fd.io/browse/CTP-3
 +
** VPP Device
 +
*** thunderx: 1-node topology on cavium thunderx. Status: no updates.
 +
*** Kernel Migration on mcbin. Status:
 +
*** ThunderX2:
 +
** VPP Performance Test
 +
*** Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
 +
*** Juraj to come up with a solution for the NUMA node anomaly in Taishan.
 +
*** https://gerrit.fd.io/r/#/c/16850/ Status: Juraj has a version all ready to work. Package installation blocker.
 +
*** Package installation error Status: Juraj to investigate logs.
 +
* FD.io lab
 +
** ThunderX1 -
 +
*** New QSFP+ switch for ThunderX1 is available now: QSFP+ switch to be connected to the SFP+ switch.
 +
*** Juraj to set up a call with LF folks.
 +
** ThunderX2 -
 +
*** Andy is still waiting for cables.
 +
*** Juraj to remind Andy of when the cable will be available.
 +
*** Juraj to follow up on ssh connectivity to thunderx2.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
 +
*** [Lijian] Check if setting the default cache line size to 128 will degrade throughput on Taishan/Qualcomm/ThunderX2
 +
**** no perf diff in Qualcomm
 +
**** vpp crashes on thunderx2
 +
**** waiting for results on A72 (Taishan)
 +
*** [Sirshak] on ethernet-input node, investigate vectorized buffer index, Damjan's per numa node buffer pool patch. Status: No updates
 +
**** open fd.io jira tkt. https://jira.fd.io/browse/VPP-1560
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
* Action Items - Next Week
 +
**
  
''' 10/25/2017 '''
+
'''01/29/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Lijian Zhang
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Garcia
 +
** Manuel
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
** John Ddigilio
 +
* General
 +
** DPDK multi-core scheduler
 +
** https://gerrit.fd.io/r/#/c/15084/
 +
** Please join slack.
 +
** Merge TCP optimization meeting into VPP/Aarch64 community public meeting.
 +
* VPP Hoststack
 +
** TaiShan server with Debian distro crashed on the 'ip probe-neighbor' command when running VPP hoststack with iperf3
 +
** With 64-byte packets, on ThunderX2, 10G NIC, VPP hoststack bandwidth is about 1/2 of the Linux kernel stack.
 +
** With 64-byte packets, on Taishan, 10G NIC, VPP hoststack bandwidth is about 2x that of the Linux kernel stack.
 +
** Memory copy patch gives 4% improvement on VPP hoststack on Taishan server.
 +
** Check that the optimized memory copy versions are deployed on Taishan and ThunderX2 at runtime - Lijian
 +
** Send memcopy patch to Khem and Fede for further verification - Lijian
 +
** iperf3 performance with Hoststack.
 +
*** Investigating lock-less fifo, ip4_local_forward node and memory reordering for VPP hoststack - Sirshak
 +
** Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
 +
** Verifying https://gerrit.fd.io/r/#/c/16638/ - Supposed to give better performance, but VPP hangs with this patch on some Arm machines.
 +
** Gorka is trying some optimal configs for VCL. - No Updates.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** OcteonTX has been received in the ARM lab. Will boot it up first and then start profiling with it.
 +
* FD.io lab
 +
** ThunderX1 -
 +
*** New Arista switch for ThunderX1 is available now. Gathering the details required by the LF lab before sending the switch to the CSIT lab. - Juraj - Andy will try to send the switch to CSIT this Thursday.
 +
** ThunderX2 -
 +
*** Cable type is confirmed. Procurement is in progress.
 +
*** Juraj to remind Andy of when the cable will be available.
 +
*** Access to these servers in the FD.io lab is required. Anton provides the IP to access them (ADMIN/ADMIN).
 +
* CSIT
 +
** VPP Path
 +
*** So far so good.
 +
*** ARM CI results are overwritten by x86 machines. This is probably a Jenkins issue. Monitor whether this corner case happens again. - Juraj
 +
** VPP Device
 +
*** thunderx: 1-node topology on Cavium ThunderX. Basic skeleton of the Docker topology is done. Moving on to deploying the scripts. Juraj is able to modify and execute the scripts in the container. Still to fix: the scripts for the 1-link 1-node topology and the binding of interfaces to VPP. A traffic test runs successfully.
 +
*** Kernel Migration on mcbin. Juraj is able to build all the images, but got a kernel panic. To try with a more recent U-Boot version. Tried the latest U-Boot image, but the issue persists.
 +
*** Juraj to investigate further work once ThunderX2 is available.
 +
** VPP Performance Test
 +
*** perftest - https://jenkins.fd.io/job/vpp-csit-verify-perf-master-2n-skx - Triggered manually now if patch is perf sensitive.
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** Development of the L2 test script is under way.
 +
*** Khem will get L2 working in CI first, then IP4 and other test cases.
 +
*** Stan has started working on performance scripts with Khem and is able to connect to the Taishan machines in the CSIT lab.
 +
*** The performance topology in the wiki link is to be updated per the file below.
 +
*** https://github.com/FDio/csit/blob/master/docs/lab/Testbeds_Xeon_Skx_Arm_Atom.md
 +
*** Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
 +
**** Install Ubuntu 18.04 on the Huawei Taishan servers first, and then investigate upstreaming the performance test framework to enable AArch64
 +
**** Taishan server works with Ubuntu 18.04, CSIT lab updated Ubuntu 18.04 in Taishan
 +
**** Install the packages on Taishan server from cloud repository, to check if VPP can get intel NICs on Taishan - Lijian
 +
**** https://packagecloud.io/app/fdio/master/search?q=19.01-rc0%7E642-g31fe7aa3&filter=debs&filter=debs&dist=ubuntu%2Fbionic
 +
*** Stan installed the latest CSIT scripts on the packet generator server (x86 NEON) and the Taishan servers in the FD.io lab.
 +
*** https://gerrit.fd.io/r/#/c/16850/
 +
*** Some of L2 and L3 test cases passed.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
 +
*** [Lijian] Check if setting the default cache line size to 128 will degrade throughput on Taishan/Qualcomm/ThunderX2
 +
*** [Sirshak] on ethernet-input node, investigate vectorized buffer index.
 +
** Memory Ordering
 +
*** [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
 +
** [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable both tests are merged. No test cases remain in the blacklist for AArch64 machines.
 +
** [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
 +
* Action Items - Next Week
 +
** [Sirshak] -
 +
 
 +
'''01/22/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Lijian Zhang
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Garcia
 +
** Manuel
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
** John Ddigilio
 +
* General
 +
** DPDK multi-core scheduler
 +
** https://gerrit.fd.io/r/#/c/15084/
 +
** Please join slack.
 +
** Merge TCP optimization meeting into VPP/Aarch64 community public meeting.
 +
* VPP Hoststack
 +
** TaiShan server with Debian distro crashed on the 'ip probe-neighbor' command when running VPP hoststack with iperf3
 +
** With 64-byte packets, on ThunderX2, 10G NIC, VPP hoststack bandwidth is about 1/4 of the Linux kernel stack.
 +
** iperf3 performance with Hoststack.
 +
*** Investigating lock-less fifo, ip4_local_forward node and memory reordering for VPP hoststack - Sirshak
 +
** Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
 +
** Gorka is trying some optimal configs for VCL. - No Updates.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** OcteonTX has been received in the ARM lab. Will boot it up first and then start profiling with it.
 +
* FD.io lab
 +
** ThunderX1 -
 +
*** New Arista switch for ThunderX1 is available now. Gathering the details required by the LF lab before sending the switch to the CSIT lab. - Juraj - Andy will try to send the switch to CSIT this Thursday.
 +
** ThunderX2 -
 +
*** Cable type is confirmed. Procurement is in progress.
 +
*** Require access to these servers in FD.io lab.
 +
* CSIT
 +
** VPP Path
 +
*** So far so good.
 +
*** ARM CI results are overwritten by x86 machines. This is probably a Jenkins issue. Monitor whether this corner case happens again. - Juraj
 +
** VPP Device
 +
*** thunderx: 1-node topology on Cavium ThunderX. Basic skeleton of the Docker topology is done. Moving on to deploying the scripts. Juraj is able to modify and execute the scripts in the container. Still to fix: the scripts for the 1-link 1-node topology and the binding of interfaces to VPP.
 +
*** Kernel Migration on mcbin. Juraj is able to build all the images, but got a kernel panic. To try with a more recent U-Boot version.
 +
*** Juraj to investigate further work once ThunderX2 is available.
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** Development of the L2 test script is under way.
 +
*** Khem will get L2 working in CI first, then IP4 and other test cases.
 +
*** Stan has started working on performance scripts with Khem and is able to connect to the Taishan machines in the CSIT lab.
 +
*** The performance topology in the wiki link is to be updated per the file below.
 +
*** https://github.com/FDio/csit/blob/master/docs/lab/Testbeds_Xeon_Skx_Arm_Atom.md
 +
*** Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
 +
**** Install Ubuntu 18.04 on the Huawei Taishan servers first, and then investigate upstreaming the performance test framework to enable AArch64
 +
**** Lijian to verify Ubuntu-18.04 on Taishan server.
 +
*** Stan installed the latest CSIT scripts on the packet generator server (x86 NEON) and the Taishan servers in the FD.io lab.
 +
*** https://gerrit.fd.io/r/#/c/16850/
 +
*** Some of L2 and L3 test cases passed.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
 +
*** [Sirshak] on ethernet-input node, investigate vectorized buffer index.
 +
** Memory Ordering
 +
*** [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
 +
** [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable both tests are merged. No test cases remain in the blacklist for AArch64 machines.
 +
** [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
 +
* Action Items - Next Week
 +
** [Sirshak] - To update patch list in VPP/Aarch64 wiki
 +
 
 +
'''01/15/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Lijian Zhang
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
** Tina Tsou
 +
** Andy Wang
 +
** Garcia
 +
** Manuel
 +
** Gorka
 +
** Fede
 +
** Honnappa Nagarahalli
 +
** John Ddigilio
 +
* General
 +
** DPDK multi-core scheduler
 +
** https://gerrit.fd.io/r/#/c/15084/
 +
** Please join slack.
 +
** Merge TCP optimization meeting into VPP/Aarch64 community public meeting.
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack.
 +
*** Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
 +
** Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
 +
** Gorka is trying some optimal configs for VCL. - No Updates.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** OcteonTX has been received in the ARM lab. Will boot it up first and then start profiling with it.
 +
* FD.io lab
 +
** ThunderX2 -
 +
*** New Arista switch is available now. Gathering the details required by the LF lab before sending the switch to the CSIT lab. - Juraj
 +
*** Cable type is confirmed. Procurement is in progress.
 +
* CSIT
 +
** VPP Path
 +
*** IP4 reassembly and GBP failures are fixed. Patches to enable both tests are merged. No test cases remain in the blacklist for AArch64 machines.
 +
*** We have voting verify on bionic. Upload nexus disabled but merge job working. - Juraj created LF ticket for nexus upload. Both master merge job and verifying job are working fine.
 +
*** ARM CI results are overwritten by x86 machines. This is probably a Jenkins issue. Monitor whether this corner case happens again. - Juraj
 +
** VPP Device
 +
*** thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
 +
*** Kernel Migration on mcbin. Juraj is able to build all the images.
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** Development of the L2 test script is under way.
 +
*** Khem will get L2 working in CI first, then IP4 and other test cases.
 +
*** Stan has started working on performance scripts with Khem and is able to connect to the Taishan machines in the CSIT lab.
 +
*** The performance topology in the wiki link is to be updated per the file below.
 +
*** https://github.com/FDio/csit/blob/master/docs/lab/Testbeds_Xeon_Skx_Arm_Atom.md
 +
*** Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
 +
** Memory Ordering
 +
*** [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
 +
** [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable both tests are merged. No test cases remain in the blacklist for AArch64 machines.
 +
** [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
 +
* Action Items - Next Week
 +
** [Sirshak] - To update patch list in VPP/Aarch64 wiki
 +
 
 +
'''01/08/2019'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Lijian Zhang
 +
** Stanislav Chlebec
 +
** Khemendra Kumar
 +
* General
 +
** DPDK multi-core scheduler
 +
** https://gerrit.fd.io/r/#/c/15084/
 +
** Please join slack.
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack.
 +
*** Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
 +
** Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
 +
** Gorka is trying some optimal configs for VCL. - No Updates.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
 +
** [Lijian] Working on IP4 reassembly and GBP failures. - Fixed. Juraj has upstreamed patches to enable these two tests.
 +
** [Sirshak] Kernel Migration mcbin. Juraj is working on it based on Jianlin's suggestion.
 +
** [Andy] Getting a new Arista switch next year.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] working on vectorized memory copy - Macro benchmarking is done and data is updated to Jira.
 +
** Memory Ordering
 +
*** [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
 +
* CSIT
 +
** VPP Path
 +
* VPP Path Failures
 +
*** We have voting verify on bionic. Upload nexus disabled but merge job working. - Juraj created LF ticket for nexus upload. Both merge job and verifying job are working fine.
 +
** VPP Device
 +
*** thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
 +
*** thunderx2: Juraj working with LF to get this resolved.
 +
*** mcbin: Juraj can contact Jianlin if needed.
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** Development of the L2 test script is under way.
 +
*** Khem will get L2 working in CI first, then IP4 and other test cases.
 +
*** Stan is starting to work on VPP performance tests. Khem to send Stan an email about VPP performance testing.
 +
* FD.io lab
 +
** New Arista switch to be procured next year.
 +
** ThunderX2 - Racked. Andy is trying to buy cables compatible with Intel XL710. Juraj to confirm the info required by the lab people before the cables are sent out.
 +
* Action Items - Next Week
 +
 
 +
'''12/18/2018'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Tina Tsou
 +
** Stanislav Chlebec
 +
** Avinash
 +
** Khemendra
 +
* General
 +
** DPDK multi-core scheduler
 +
** https://gerrit.fd.io/r/#/c/15084/
 +
** Cancelling calls on 25th of Dec and 1st of Jan. Next meeting 8th Jan.
 +
** Please join slack.
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack.
 +
*** Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
 +
** Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
 +
** Gorka is trying some optimal configs for VCL. - No Updates.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working.
 +
** [Lijian] Working on IP4 reassembly and GBP failures. - Some preliminary work on GBP, waiting on Neale. Juraj to give Lijian access to investigate on ThunderX.
 +
** [Sirshak] Kernel Migration mcbin. Status: Jianlin to work with Juraj to get fd.io mcbins up and running. Sirshak to setup a meeting.
 +
** [Andy] Getting a new Arista switch next year.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] working on vectorized memory copy - Still benchmarking and setting it up for internal review.
 +
*** [Lijian] Patch for compiling issue with GCC-8.x is under community review. Status: No updates.
 +
*** [Lijian] Patch for fixing StringTest failure is under community review. Status: Abandoned.
 +
*** [Lijian] Patch for CDP failure is under community review. Status: No updates.
 +
** Memory Ordering
 +
*** [Sirshak] svm_fifo lockless alternate algorithm for SPSC.
 +
* CSIT
 +
** VPP Path
 +
* VPP Path Failures
 +
** https://jira.fd.io/browse/VPP-1475 - IP4 random reassembly failure in master, also seen on x86
 +
** https://jira.fd.io/browse/VPP-1491 - GBP L3/L2 Endpoint Learning failure
 +
*** We have voting verify on bionic. Upload nexus disabled but merge job working. Juraj to create LF ticket for nexus upload.
 +
** VPP Device
 +
*** thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
 +
*** thunderx2: Sirshak working with LF to get this resolved.
 +
*** mcbin: Sirshak to setup a meeting between Juraj and Jianlin.
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** Development of the L2 test script is under way.
 +
*** Khem will get L2 working in CI first, then IP4 and other test cases.
 +
* FD.io lab
 +
** New Arista switch to be procured next year.
 +
** ThunderX2 - Racked. IPMI Static IP configuration missing. Sirshak with LF.
 +
* Action Items - Next Week
 +
 
 +
'''12/11/2018'''
 +
* Attendees
 +
** Sirshak Das
 +
** Juraj Linkeš
 +
** Tina Tsou
 +
** Stanislav Chlebec
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking and compared kernel and VPP hoststack performance. Three cases: kernel to kernel; kernel to VPP hoststack; VPP hoststack to VPP hoststack
 +
** ongoing perf analysis. One patch(https://gerrit.fd.io/r/#/c/16184/) is merged, and the other one is under internal review.
 +
** Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
 +
** Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
 +
** Gorka is trying some optimal configs for VCL.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
**
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. Two L2 performance suite scripts for the CI management repository are done; investigating the ones for the CSIT repository; three more scripts are to be developed.
 +
** [Lijian] Working on IP4 reassembly and GBP failures
 +
** [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from Garcia and Damjan. - No progress so far. - To confirm with Jianling and Joyce.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] working on vectorized memory copy - Second priority, no update so far.
 +
*** [Lijian] Patch for compiling issue with GCC-8.x is under community review.
 +
*** [Lijian] Patch for fixing StringTest failure is under community review.
 +
*** [Lijian] Patch for CDP failure is under community review.
 +
** Memory Ordering
 +
*** [Sirshak] To start work on Arithmetic and Logic relaxed functions.
 +
* VPP Path failures
 +
** https://jira.fd.io/browse/VPP-1475 - IP4 random reassembly failure in master, also seen on x86
 +
** https://jira.fd.io/browse/VPP-1491 - GBP L3/L2 Endpoint Learning failure
 +
* CSIT
 +
** VPP Path
 +
*** Actually, everything is ready. The only thing is to get CI patch merged.
 +
** VPP Device
 +
*** thunderx: 1-node topology on cavium thunderx is in place, but there are errors. Will continue investigation.
 +
*** thunderx2: Racked. Lacks a static IP. Sirshak gave Anton a workaround for the missing static IP.
 +
*** mcbin: Kernel issue; yet to try the suggestions from Garcia and Damjan. To confirm with Jianling and Joyce - Lijian
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** Development of the L2 test script is under way. Khem will get L2 working in CI first, then IP4 and other test cases.
 +
* FD.io lab
 +
** Arista switch is still not working. Andy and Anton are working out the exact requirements for the switch. There are two possible options: Andy replaces the Arista or buys a new one.
 +
** ThunderX2 - Racked. Lack of IP.
 +
* Action Items - Next Week
 +
** [Lijian] to continue to investigate make test failures.
 +
** [Andy] to work with Anton to resolve Arista problem.
 +
 
 +
'''12/04/2018'''
 +
* Attendees
 +
** Sirshak Das
 +
** Andy Wang
 +
** Juraj Linkeš
 +
** Khemendra
 +
** Garcia
 +
** Manuel
 +
** Gorka
 +
** Fede
 +
** Tina Tsou
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking and compared kernel and VPP hoststack performance. Three cases: kernel to kernel; kernel to VPP hoststack; VPP hoststack to VPP hoststack
 +
** Ongoing perf analysis. Two patches in progress. One is upstreamed and the other is under internal review. Hotspots are in memory copy or possibly elsewhere.
 +
** Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
 +
** Gorka is trying some optimal configs for VCL.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
**
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. Two L2 performance suite scripts for the CI management repository are done; investigating the ones for the CSIT repository; three more scripts are to be developed.
 +
** [Lijian] VPP dlmalloc crash issue root-caused and fixed by maintainer. Florin Coras fixed time-out issues.
 +
** [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from Garcia and Damjan. - No progress so far. - To confirm with Jianling and Joyce.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] working on vectorized memory copy - Second priority, no update so far.
 +
*** [Lijian] Patch for compiling issue with GCC-8.x is under internal review.
 +
*** [Lijian] Patch for fixing StringTest failure is under internal review.
 +
** Memory Ordering
 +
*** [Sirshak] To start work on Arithmetic and Logic relaxed functions.
 +
* CSIT
 +
** VPP Path
 +
*** https://jira.fd.io/browse/VPP-1475 - IP4 random reassembly failure in master, also seen on x86
 +
*** https://jira.fd.io/browse/VPP-1476 - L2FIB failures in master, also seen on x86 - fixed
 +
*** https://jira.fd.io/browse/VPP-1491 - GBP L3/L2 Endpoint Learning failure
 +
*** https://jira.fd.io/browse/VPP-1490 - Traffic doesn't work in make test, 1604 issue (pmalloc issue) - current status to be confirmed
 +
*** https://jira.fd.io/browse/VPP-1497 - Cannot run in parallel problem - fixed
 +
*** VPP-1476, VPP-1475, VPP-1478. These failures are seen on Debian x86 VM also.
 +
*** Get CSIT/Aarch64 pass with partial test cases - Juraj - https://gerrit.fd.io/r/#/c/16282/
 +
*** VPP dlmalloc crash issue root-caused and fixed by maintainer.
 +
*** Florin Coras fixed time-out issue.
 +
** VPP Device
 +
*** thunderx: 1-node topology on cavium thunderx is in place, but there are errors. Will continue investigation.
 +
*** thunderx2: Racked. Lack of IP. To confirm with Anton.
 +
*** mcbin: Kernel issue; yet to try the suggestions from Garcia and Damjan. To confirm with Jianling and Joyce - Lijian
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** Development of the L2 test script is under way. Khem will get L2 working in CI first, then IP4 and other test cases.
 +
* FD.io lab
 +
** Arista switch is still not working. Andy and Anton are working out the exact requirements for the switch. There are two possible options: Andy replaces the Arista or buys a new one.
 +
** ThunderX2 - Racked. Lack of IP.
 +
* Action Items - Next Week
 +
** [Lijian] to continue to investigate make test failures.
 +
** [Andy] to work with Anton to resolve Arista problem.
 +
 
 +
 
 +
'''11/27/2018'''
 +
* Attendees
 +
** Juraj Linkeš
 +
** Khemendra
 +
** Garcia
 +
** Manuel
 +
** Gorka
 +
** Fede
 +
** Tina Tsou
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking and compared kernel and VPP hoststack performance.
 +
** Ongoing perf analysis, two patches in progress. Hotspots are in memory copy or possibly elsewhere. Will share patches with community. - Sirshak
 +
** Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
 +
** Gorka is trying some optimal configs for VCL.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** Alternate test cases.
 +
**
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs. - Need to prepare some scripts. First to understand how the script works and then add more options.
 +
** [Lijian] Status on VPP path failures. Status: Still debugging. Still at early study stage.
 +
** [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from Garcia and Damjan. - No progress so far.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] working on vectorized memory copy
 +
** Memory Ordering
 +
*** [Sirshak] To start work on Arithmetic and Logic relaxed functions.
 +
* CSIT
 +
** VPP Path
 +
*** 3 failures currently stalling deployment.
 +
*** VPP-1476, VPP-1475, VPP-1478
 +
*** These failures are seen on Debian x86 VM also.
 +
*** Parallelization (n=32) is resulting in failures. This also seems to be caused by the two patches below.
 +
*** VPP-1490, caused by https://gerrit.fd.io/r/#/c/15106/ and https://gerrit.fd.io/r/#/c/15534/.
 +
*** VPP-1491 and VPP-1497, about the parallelization and GBP failures, are filed.
 +
*** Get CSIT/Aarch64 pass with partial test cases - Juraj
 +
** VPP Device
 +
*** thunderx: Juraj created an LF ticket for wiring the 1-node topology on Cavium ThunderX.
 +
*** thunderx2: to be racked by this Friday.
 +
*** mcbin: Kernel issue; yet to try the suggestions from Garcia and Damjan.
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** The L2 test now works when run manually. Khem is trying to get it working in CI, then IP4 and other test cases.
 +
* FD.io lab
 +
** Arista switch is missing cable. Andy will send tracking no. for cables.
 +
** ThunderX2 - to be racked by this Friday.
 +
* Action Items - Next Week
 +
** [Lijian] to investigate VPP-1490 issue.
 +
** [Andy] Andy will send tracking no. for cables.
 +
 
 +
'''11/20/2018'''
 +
* Attendees
 +
** Sirshak Das
 +
** Andy Wang
 +
** Juraj Linkeš
 +
** Khemendra
 +
** Garcia
 +
** Manuel
 +
** Gorka
 +
** Fede
 +
** Tina Tsou
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking and compared kernel and VPP hoststack performance.
 +
** Ongoing perf analysis, two patches in progress. Hotspots are in memory copy or possibly elsewhere. Will share patches with community. - Sirshak
 +
** Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
 +
** Gorka is trying some optimal configs for VCL.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** Alternate test cases.
 +
**
 +
* Action Items - Last Week
 +
** [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
 +
** [Khem] Deployment of only L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs. - Need to prepare some scripts. First to understand how the script works and then add more options.
 +
** [Lijian] Status on VPP path failures. Status: Still debugging. Still at early study stage.
 +
** [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from Garcia and Damjan. - No progress so far.
 +
* VPP
 +
** Vectorization
 +
*** [Lijian] working on vectorized memory copy
 +
** Memory Ordering
 +
*** [Sirshak] To start work on Arithmetic and Logic relaxed functions.
 +
* CSIT
 +
** VPP Path
 +
*** 3 failures currently stalling deployment.
 +
*** VPP-1476, VPP-1475, VPP-1478
 +
*** These failures are seen on Debian x86 VM also.
 +
*** Parallelization (n=32) is resulting in failures. This also seems to be caused by the two patches below.
 +
*** VPP-1490, caused by https://gerrit.fd.io/r/#/c/15106/ and https://gerrit.fd.io/r/#/c/15534/.
 +
*** VPP-1491 and VPP-1497, about the parallelization and GBP failures, are filed.
 +
*** Get CSIT/Aarch64 pass with partial test cases - Juraj
 +
** VPP Device
 +
*** thunderx: Juraj created an LF ticket for wiring the 1-node topology on Cavium ThunderX.
 +
*** thunderx2: to be racked by this Friday.
 +
*** mcbin: Kernel issue; yet to try the suggestions from Garcia and Damjan.
 +
** VPP Performance Test
 +
*** Work is ongoing on writing scripts for Performance Jobs.
 +
*** The L2 test now works when run manually. Khem is trying to get it working in CI, then IP4 and other test cases.
 +
* FD.io lab
 +
** Arista switch is missing cable. Andy will send tracking no. for cables.
 +
** ThunderX2 - to be racked by this Friday.
 +
* Action Items - Next Week
 +
** [Lijian] to investigate VPP-1490 issue.
 +
** [Andy] Andy will send tracking no. for cables.
 +
 
 +
 
 +
'''11/12/2018'''
 +
* Attendees
 +
** Sirshak Das
 +
** Andy Wang
 +
** Juraj Linkeš
 +
** Khemendra
 +
** Garcia
 +
** Gorka
 +
* VPP Hoststack
 +
** iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking and compared kernel and VPP hoststack performance.
 +
** Ongoing perf analysis, two patches in progress. Hotspots are in memory copy or possibly elsewhere. - Sirshak
 +
** Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
 +
** Gorka is trying some optimal configs for VCL.
 +
** VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
 +
** Alternate test cases.
 +
** Khem to get more information on benchmarking DMM. Khem to send the information to
  
* Gabriel working on vpp init failure in linux_pci_init()
+
== Status Report Ligato/Contiv ==
* Kumar to check with GeorgeZ on Huawei boards shipped to CSIT; need to verify tests also on this environment (package versions from distro)
+
[[File:Capture LandC.PNG]]
* Brian to check whether anything else needs to be done besides 'make test' for upstream enablement
+

Latest revision as of 15:13, 21 November 2023

Get Involved

Meeting Details

IRC Channel

#fdio-arm on freenode.net

Slack

Request invitation at https://slack.fd.io/

Jira

Jira issues with ARM64 label

Presentations

Release Milestones

18.10

18.07

18.04

  • CI
    • Upstream patch verification on ARMv8 machines
    • .deb packages

Machines

The FD.io lab is hosted at the VEXXHOST colocation centre in Montreal, Québec, Canada.

| Platform | Role | Status | Hostname | IP | IPMI | Cores | RAM | Ethernet | Distro |
| Marvell ThunderX | VPP dev debug server | Running | vpp-marvell-dev | 10.30.51.38 | 10.30.50.38 | 96 | 128GB | 3x40GbE QSFP+ / 4x10GbE SFP+ | Ubuntu 18.04.4 |
| | CI build server | Running in Nomad | s53-nomad | 10.30.51.39 | 10.30.50.39 | 96 | 128GB | 3x40GbE QSFP+ / 4x10GbE SFP+ | Ubuntu 18.04.4 |
| | CI build server | Running in Nomad | s54-nomad | 10.30.51.40 | 10.30.50.40 | 96 | 128GB | 3x40GbE QSFP+ / 4x10GbE SFP+ | Ubuntu 18.04.4 |
| | CI build server | Running in Nomad | s52-nomad | 10.30.51.65 | 10.30.50.65 | 96 | 256GB | 2xQSFP+ / USB Ethernet | Ubuntu 18.04.4 |
| | CI build server | Running in Nomad | s51-nomad | 10.30.51.66 | 10.30.50.66 | 96 | 256GB | 2xQSFP+ / USB Ethernet | Ubuntu 18.04.4 |
| | CI build server | Running in Nomad | s49-nomad | 10.30.51.67 | 10.30.50.67 | 96 | 256GB | 2xQSFP+ / USB Ethernet | Ubuntu 18.04.4 |
| | CI build server | Running in Nomad | s50-nomad | 10.30.51.68 | 10.30.50.68 | 96 | 256GB | 2xQSFP+ / USB Ethernet | Ubuntu 18.04.4 |
| Marvell ThunderX2 | Perf DUT candidate | Running | s27-t13-sut1 | 10.30.51.69 | 10.30.50.69 | 224 | 128GB | 3x40GbE QSFP+ XL710-QDA2 | Ubuntu 18.04.2 |
| | VPP device server | Running in Nomad | s55-t36-sut1 | 10.30.51.70 | 10.30.50.70 | 256 | 256GB | 2x40GbE QSFP+ XL710-QDA2 / 2x10/25GE SFP+ ConnectX5 | Ubuntu 18.04.4 |
| | VPP device server | Running in Nomad | s56-t37-sut1 | 10.30.51.71 | 10.30.50.71 | 256 | 256GB | 2x40GbE QSFP+ XL710-QDA2 / 2x10/25GE SFP+ ConnectX5 | Ubuntu 18.04.4 |
| Huawei TaiShan 2280 | CSIT testbed | Running in CI | s17-t33-sut1 | 10.30.51.36 | 10.30.50.36 | 64 | 128GB | 2x10GbE SFP+ Intel X520-DA2 / 2x25GbE SFP28 Mellanox CX-4 | 18.04.1 |
| | CSIT testbed | Running in CI | s18-t33-sut2 | 10.30.51.37 | 10.30.50.37 | 64 | 128GB | 2x10GbE SFP+ Intel X520-DA2 / 2x25GbE SFP28 Mellanox CX-4 | 18.04.1 |
| Marvell MACCHIATObin | N/A | Decommissioned | s20-t34-sut1 | 10.30.51.41 | 10.30.51.49, then connect to /dev/ttyUSB0 | 4 | 16GB | 2x10GbE SFP+ | Ubuntu 16.04.4 |
| | N/A | Decommissioned | s21-t34-sut2 | 10.30.51.42 | 10.30.51.49, then connect to /dev/ttyUSB1 | 4 | 16GB | 2x10GbE SFP+ | Ubuntu 16.04.5 |
| | N/A | Decommissioned | fdio-mcbin3 | 10.30.51.43 | 10.30.51.49, then connect to /dev/ttyUSB2 | 4 | 16GB | 2x10GbE SFP+ | Ubuntu 16.04.5 |
| | Power Cycler | Operational | 10.30.50.80 |
| SoftIron OverDrive 1000 | N/A | Decommissioned | softiron-1 | 10.30.51.12 | N/A | 4 | 8GB | | openSUSE |
| | N/A | Decommissioned | softiron-2 | 10.30.51.13 | N/A | 4 | 8GB | | openSUSE |
| | N/A | Decommissioned | softiron-3 | 10.30.51.14 | N/A | 4 | 8GB | | openSUSE |

Note: to get lab access, create a GPG key, upload it to a keyserver, and have it signed by a trusted anchor in a video call (the fingerprint will be needed). An ARM authority (Tina) then needs to send an e-mail to helpdesk@fd.io with your name, e-mail, keygrip and key fingerprint.

CI

Covers automated build, unit test, and packaging for various Linux distros on ARMv8 machines.

| Jenkins job | Status | Description |
| vpp-arm-verify-master-ubuntu1604 | Running | xxx |
| vpp-arm-merge-master-ubuntu1604 | Running | xxx |
| vpp-arm-verify-1804-ubuntu1604 | Running | xxx |
| vpp-arm-merge-1804-ubuntu1604 | Running | xxx |

Next steps:

  • make test added to verify jobs
  • Clang build
  • openSUSE Leap 15 | CentOS 7 | Ubuntu 18.04
  • vpp-csit-verify-virl-master or equivalent CSIT functional testing

CSIT

Covers automated functional and performance integration testing on ARMv8 3-node and 2-node testbeds.

https://wiki.fd.io/view/CSIT/AArch64

Contiv-VPP

This Kubernetes network plugin uses FD.io VPP to provide network connectivity between PODs.

https://github.com/contiv/vpp

The installation guide for Contiv-VPP on the Arm64 platform is at:

https://github.com/contiv/vpp/blob/master/docs/arm64/MANUAL_INSTALL_ARM64.md

Porting and Tuning Roadmap

  • VPP Vectorization: Expanding the Neon Library for IPv4 forwarding code path - Sirshak/Lijian
  • Tuning the quad loop/dual loop for small cores - Lijian
  • General performance analysis and tuning of various graph nodes for IPv4 forwarding test case - Sirshak/Lijian
  • Memory Ordering - Sirshak
  • CSIT Performance Test - Khemendra
  • CSIT Device Test - Juraj
  • CSIT Path Test - Juraj

Known Issues

GCC 5.3 hits internal compiler errors (ICEs) during FP register allocation. Please use GCC 5.4 or newer.

Activity

Recent Patches

misc: vppctl fix heap-buffer-overflow & memleaks Merged 12/14 Tianyu Li
crypto-native: fix build error on Arm using clang-13 Merged 12/14 Jieqiang Wang
snort: fix unused result warning for gcc-10 Merged 11/06 Tianyu Li
l2: fix array-bounds error for prefetch on Arm Merged 11/07 Tianyu Li
ip6: fix IPv6 address calculation error using "ip route add" CLI Merged 10/21 Jieqiang Wang
ipsec: Performance improvement of ipsec4_output_node using flow cache Merged 10/13 Govindarajan Mohandoss
build: fix centos rpm build Merged 10/08 Tianyu Li
vppinfra: fix potential memory access error in _pool_init_fixed Merged 10/05 Jieqiang Wang
svm: fix asan check failed @svm_map_region on arm Merged 06/24 Tianyu Li
l2: fix vrrp prefix mac comparison Merged 06/09 Tianyu Li
build: fix build error after make wipe Merged 06/04 Tianyu Li
memif: fix input node buffer prefetch Merged 05/21 Tianyu Li
memif: fix gcc-10 build error on arm platform Merged 05/21 Tianyu Li
papi: fix ubuntu 1804 make test socket.close error Merged 04/16 Tianyu Li
rdma: fix skip_ipv4_cksum behavior in scalar path Merged 04/15 Tianyu Li
vppinfra: correct intrinsic called by u16x16_from_u8x16 Merged 04/15 Lijian Zhang
vppinfra: fix compiling error due to incompatible udphdr field names Merged 03/05 Jieqiang Wang
avf: optimized with NEON SIMD instruction Merged 12/18 Lijian Zhang
ip: fix compiling error with gcc-10 Merged 09/01 Jieqiang Wang
build: Fix 'make install-deps' errors on aarch64 CentOS 7 Merged 07/29 Jieqiang Wang
acl: correct acl vat help message Merged 07/24 Lijian Zhang
build: add libssl-dev library for ubuntu 20.04 Merged 06/04 Jieqiang Wang
dpdk: fix compiling issue with clang Merged 05/08 Lijian Zhang
vppinfra: fix u32x4_byte_swap on Arm Merged 05/08 Lijian Zhang
build: support arch-specific compiling for Neoverse N1 Merged 04/30 Lijian Zhang
dpdk: false link down issue with ixgbe NIC Merged 03/23 Lijian Zhang
vlib: fix error when creating avf interface on SMP system Merged 03/21 Jieqiang Wang
vlib: leave SIGPROF signal with its default handler Merged 03/21 Jieqiang Wang
build: add libssl-dev for ubuntu 16.04 and 18.04 Merged 03/11 Jieqiang Wang
vlib: fix code of getting numa node with specific cpu_id Merged 02/17 Lijian Zhang
docs: add physmem section in configuration parameters Merged 12/19 Jieqiang Wang
vlib: add max-size configuration parameter for pmalloc Merged 12/18 Jieqiang Wang
crypto: not use vec api with opt_data[VNET_CRYPTO_N_OP_IDS] Merged 11/13 Lijian Zhang
acl: add missing square brackets to vat_help option in acl api Merged 10/31 Jieqiang Wang
dpdk: apply dual loop unrolling in DPDK TX Merged 09/12 Lijian Zhang
ip: apply dual loop unrolling in ip4_rewrite Merged 09/12 Lijian Zhang
ip: apply dual loop unrolling in ip4_input Merged 09/12 Lijian Zhang
build: fix running error with vmxnet3_test_plugin.so Merged 09/11 Jianlin Lv
build: fix unsupported CMake comparison operation Merged 09/05 Jianlin Lv
tap: fix tap interface not working on Arm issue Merged 09/04 Lijian Zhang
build: fix vpp compilation failure on ThunderX2 and Amp Merged 08/19 Jianlin Lv
vppinfra: Update "show cpu" output for AArch64 chips Merged 08/19 Nitin Saxena
vppinfra: refactor test_and_set spinlocks to use clib_spinlock_t Merged 08/02 Jason Zhang
vppinfra: added performance test for clib_rwlock_t (test_rwlock.c) Merged 08/02 Jason Zhang
vppinfra: refactor clib_rwlock_t to use single condition variable Merged 08/02 Jason Zhang
vppinfra: refactor clib_spinlock_t to use compare and swap Merged 08/02 Jason Zhang
vppinfra: added lock performance test for clib_spinlock_t (test_spinlock.c) Merged 08/02 Jason Zhang
vppinfra: refactor use of CLIB_MEMORY_BARRIER () Merged 08/02 Jason Zhang
vppinfra: conformed spinlocks to use CLIB_PAUSE Merged 08/02 Jason Zhang
vppinfra: add u64x2_scatter/u32x4_scatter Merged 06/21 Lijian Zhang
vppinfra: add u64x2_gather/u32x4_gather Merged 06/21 Lijian Zhang
fix compiling error with marvell pp2 plugin Merged 06/11 Jianlin Lv
Switch atomic release API from __sync to __atomic builtin Merged 06/05 Sirshak Das
Switch atomic test and set API from __sync to __atomic builtin Merged 06/05 Sirshak Das
Build packages for generic Arm architecture Merged 05/15 Lijian Zhang
Enable NEON instructions in memcpy_le Merged 05/01 Lijian Zhang
svm_fifo rework to avoid contention on cursize Merged 04/17 Sirshak Das
Re-enable aarch64 neon instruction in vlib_buffer_free_inline Merged 03/20 Lijian Zhang
sctp chunk_len fix Merged 03/06 Sirshak Das
Use acquire/release ordering when accessing svm_fifo shared variable cursize Merged 11/29 Sirshak Das
Optimize xxx_zero_byte_mask NEON function. Merged 11/07 Lijian Zhang
Enable atomic swap and store macro with acquire and release ordering. Merged 11/03 Sirshak Das
Add and enable msb mask vector intrinsic for aarch64. Merged 10/31 Lijian Zhang
vppinfra: add atomic macros for __sync builtins Merged 10/19 Sirshak Das
vppinfra: Fix extendto_high aarch64 NEON api. Merged 10/09 Sirshak Das
Support dynamic dual/quad loop selection on aarch64 Merged 10/01 Lijian Zhang
Enable verbose output during VPP cmake compiling Merged 9/25 Lijian Zhang
dpdk_plugin: fix mlx5 build and runtime issues Merged 9/27 Sirshak Das
Add and enable u32x4_extend_to_u64x2_high for aarch64 NEON intrinsics. Merged 9/12 Sirshak Das
Add horizontal add (hadd) vector intrinsic via NEON. Merged 9/11 Sirshak Das
Add u32x4_extend_to_u64x2 for aarch64 using NEON intrinsics Merged 9/11 Sirshak Das
Replacing vtbl NEON intrinsic with rev NEON intrinsic for byte_swap. Merged 9/11 Sirshak Das
Fix array bound failure in api_sr_localsid_add_del Merged 8/30 Lijian Zhang
cmake: fix marvell plugin build Merged 8/28 Brian Brooks
fix dpdk_plugin.so load failure with DPDK 18.08 Merged 8/23 Lijian Zhang
Fix a bug in function pipe_rx Merged 8/17 Lijian Zhang
fix compiling warnings with GCC Merged 8/17 Lijian Zhang
Update AArch64 CSIT machines into FD.io VPP docs Merged 8/17 Lijian Zhang
Add support for shuffle vector intrinsic via Neon in ARM Merged 8/1 Sirshak Das
Improve cpu { coremask-% } configure option Merged 8/1 Yi He
Fix undefined symbol: fformat_append_cr in vat plugins loading Merged 7/31 Yi He
pp2: increase recycle batch size Merged 7/10 Brian Brooks
pp2: change default queue size Merged 7/26 Brian Brooks
pp2: use configured RX queue size Merged 7/10 Brian Brooks
Fix load_unaligned undefined and other possible build failures Merged 6/26 Sirshak Das
Enable PMU cycle counter for graph node cycles Sirshak Das
Fix clang compilation on aarch64: extraneous parentheses Merged 6/13 Sirshak Das
Fix clang compilation on aarch64: value size does not match register size Merged 5/30 Sirshak Das
Fix clang compilation on aarch64: sizeof operator error Merged 5/30 Sirshak Das
Fix clang compilation on aarch64: replace -pie with -fPIE for dpdk compilation Merged 5/30 Sirshak Das
dpdk: set dmamap iova address value according to eal_iova_mode Merged 5/28 Sachin Saxena
Fixes make test errors with clang compiler on aarch64 Merged 5/27 Sirshak Das
Fix broken compilation for non-numa aware platforms Merged 5/16 Sachin Saxena
build-data: Common makefile for NXP DPAA1/DPAA2 platforms Merged 5/4 Sachin Saxena
arm64: Avoid setting march to corei7 when Cross Compiling for ARM Merged 5/4 Sachin Saxena
use restrict keyword VPP-1126 Khemendra Kumar
Autotools: Autodetection of cache line size VPP-1064 Nitin Saxena
add 'is_all_zero(x)' for NEON - fix build break Merged 2/20 Adrian Oanca
u8x16_compare_byte_mask optimization Merged 2/24 Adrian Oanca
Added u8x16,u32x4,u64x2 variants of _zero_byte_mask(x) for ARM/NEON platform Merged 2/26 VPP-1129 Adrian Oanca
add CLIB_HAVE_VEC128 with NEON intrinsics Merged 02/08 VPP-1127 Gabriel Ganne
Use neutral vector code for ethernet_frame_is_tagged Merged 2/19 Damjan Marion
vhost: Added ARMV8 NEON version of function map_guest_mem() Merged 2/7 VPP-1085 Nitin Saxena
vppinfra: use __atomic_fetch_add instead of __sync_fetch_and_add builtins VPP-1114 Kevin Wang
Arm system counter cleanup Merged 1/30 VPP-1125 Brian Brooks
svm: ... on autodetected VA space size (fixup again) Merged 01/10 Gabriel Ganne
svm: calc base address on AArch64 based on autodetected VA space size (fixup) Merged 01/10 Gabriel Ganne
svm: calc base address on AArch64 based on autodetected VA space size Merged 01/09 Damjan Marion
show cpu microarchitecture Merged 01/06 Gabriel Ganne
Fix Debian Packaging on AARCH64 Merged 01/06 Nitin Saxena
more extended tests fixes Merged 12/16 Gabriel Ganne
Use crc32 wrapper Merged 12/16 VPP-1086 Gabriel Ganne
implement clib_smp_pause() for arm and aarch64 platform Merged 12/15 VPP-1066 Kevin Wang
make "test-all" target pass again (for all platforms) Merged 12/13 Gabriel Ganne
fill "show cpu" Flag list on aarch64 platforms Merged 12/06 VPP-1065 Gabriel Ganne
remove smp dead code Merged 12/06 VPP-1066 Gabriel Ganne
net/virtio: support modern device id Merged 11/28 Gabriel Ganne
use REV on aarch64 for endianness swapping Merged 11/21 VPP-1067 Gabriel Ganne
armv8 crc32 - fix macro name Merged 11/15 Gabriel Ganne
bier - fix node table declaration Merged 11/14 Gabriel Ganne
Map SVM regions at a sane offset on arm64 Merged 11/10 Brian Brooks
bfd tests fix Merged 11/07 Gabriel Ganne
debian packaging fix Merged 11/06 Gabriel Ganne
lb test fix Merged 10/31 Gabriel Ganne
conditional x86intrin.h inclusion Merged 10/25 Gabriel Ganne
fix test_lb_ip4_gre6() cleanup Merged 10/24 Gabriel Ganne
null-terminate some formatted string Merged 10/20 Gabriel Ganne
lb plugin - fix format() type mismatches Merged 10/16 Gabriel Ganne
Use AESNI=y only on x86_64 machines Merged 10/14 Brian Brooks
Improved arm64 chip detection Merged 09/11 Brian Brooks
Native arm64 build: dpdk/Makefile change Merged 08/31 Brian Brooks

Meeting Minutes

11/21/2023

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Niyaz Murshed
    • Jieqiang Wang
  • CSIT
    • Status
      • Dave Wallace helps monitor the AArch64 CI/CD status, which looks fine
      • Replace old ThunderX2 with Ampere Altra; budget got approved, still in progress
        • Sync with CSIT folks in the call when possible -- Juraj
      • Maciek asked about the availability of N2-based hardware
        • Plans to ship N2-based servers (Nvidia Grace (V2) / Ampere One (in-house design by Ampere)) to FD.io lab next year
        • Timeline TBD
      • IPSec test cases
        • Patch already merged
        • QAT cards in Austin labs, plan to ship them to FD.io lab
      • RDMA test cases
        • MLX DPDK test cases are enabled; RDMA test cases are not enabled on AArch64
  • VPP
    • Detailed planning for VPP projects in the next call
    • Refactor OpenSSL usage in VPP IPsec -- Lijian
      • Move key generation and initialization steps out of data plane to control plane, see performance boost
    • Investigate make test framework in VPP -- Lijian
      • Patch broke wireguard test cases so need to figure out the work flow
    • VPP ramp-up -- Niyaz
      • Investigate VPP graph node mechanism and how to add nodes to the group
    • IPSec scalability tests -- Jieqiang
      • Try to figure out dpdk-rss-flows.py and how to generate balanced rss flows for IPSec tests
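The RSS item above is about choosing flows whose hash spreads evenly over RX queues. As a rough illustration only, here is a minimal Python sketch of the Toeplitz hash that NICs commonly use for RSS. It assumes the widely published default 40-byte key and a simple "hash modulo queue count" mapping; real NICs use an indirection table, the key in the lab may differ, and this is not a description of how dpdk-rss-flows.py works. The names `toeplitz_hash`, `ipv4_tuple_input` and `queue_histogram` are hypothetical.

```python
import ipaddress
import struct

# Widely published default 40-byte RSS key (an assumption here; the NIC or
# driver under test may be configured with a different key).
DEFAULT_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)


def toeplitz_hash(data: bytes, key: bytes = DEFAULT_KEY) -> int:
    """Return the 32-bit Toeplitz hash of data under key."""
    key_bits = len(key) * 8
    if len(data) * 8 + 32 > key_bits:
        raise ValueError("input too long for this key")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                # XOR in the 32-bit key window starting at this bit offset.
                shift = key_bits - 32 - (i * 8 + bit)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def ipv4_tuple_input(src_ip, dst_ip, src_port=None, dst_port=None) -> bytes:
    """Build the hash input: addresses, plus ports when both are given."""
    data = (ipaddress.IPv4Address(src_ip).packed
            + ipaddress.IPv4Address(dst_ip).packed)
    if src_port is not None and dst_port is not None:
        data += struct.pack("!HH", src_port, dst_port)
    return data


def queue_histogram(flows, n_queues):
    """Count flows per queue, assuming queue = hash % n_queues."""
    counts = [0] * n_queues
    for flow in flows:
        counts[toeplitz_hash(ipv4_tuple_input(*flow)) % n_queues] += 1
    return counts
```

A candidate flow set can be passed to `queue_histogram` as `(src_ip, dst_ip)` or `(src_ip, dst_ip, src_port, dst_port)` tuples; a flat histogram suggests the flows are balanced under the stated assumptions.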

07/18/2023

  • Attendees
    • Jieqiang Wang
    • Tianyu Li
    • Juraj Linkes
  • CSIT
    • Timeout issue happens periodically on the Taishan server, even in release testing.
    • IPSec & VxLAN performance drop issue on Ampere Altra
      • QAT cards are planned to be shipped
      • need to pay attention to the execution time for IPSec release testing
      • Need to investigate further on performance degradation issue
    • Verify job, Merge Job, Device Testing, and release testing are all fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
      • Will have a debug meeting with RDMA maintainers on the issues.
      • ConnectX6 NIC info will be updated in doc first
      • Decide which test cases to be run on the testbeds(time consideration/iterative test/coverage test)
    • MRR failed cases
      • Probably due to latest DPDK upgrade, not an arm-specific issue.
    • New test cases list on 3n-alt
      • NAT tests cannot be added because they are running on 2-node testbed only
      • enable IPSec flow cache(arm)/IPSec SPD fast path feature
    • Release testing
    • Plan to replace TX2 with Altra as VPP device testing testbed

06/20/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj Linkes
  • CSIT
    • Timeout issue happens periodically on the Taishan server, even in release testing.
    • IPSec & VxLAN performance drop issue on Ampere Altra
      • QAT cards are planned to be shipped
      • need to pay attention to the execution time for IPSec release testing
      • Need to investigate further on performance degradation issue
    • Verify job, Merge Job, Device Testing, and release testing are all fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
      • Will have a debug meeting with RDMA maintainers on the issues.
      • ConnectX6 NIC info will be updated in doc first
      • Decide which test cases to be run on the testbeds(time consideration/iterative test/coverage test)
    • MRR failed cases
      • Probably due to latest DPDK upgrade, not an arm-specific issue.
    • New test cases list on 3n-alt
      • NAT tests cannot be added because they are running on 2-node testbed only
      • enable IPSec flow cache(arm)/IPSec SPD fast path feature

05/16/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • CSIT
    • Timeout issue happens periodically on the Taishan server, even in release testing.
    • The link issue in DPDK testpmd test cases on Ampere Altra is still there.
      • Try switching cables while upgrading NIC firmware and drivers
      • Try to reproduce the tests after the NIC firmware upgrade
      • Try different port pairs of the same two NICs
    • IPSec & VxLAN performance drop issue on Ampere Altra
      • QAT cards are planned to be shipped
      • need to pay attention to the execution time for IPSec release testing
      • Need to investigate further on performance degradation issue
    • Verify job, Merge Job, Device Testing, and release testing are all fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
      • Will have a debug meeting with RDMA maintainers on the issues.
      • ConnectX6 NIC info will be updated in doc first
      • Decide which test cases to be run on the testbeds(time consideration/iterative test/coverage test)
    • MRR failed cases
      • Probably due to latest DPDK upgrade, not an arm-specific issue.
  • VPP

04/18/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • CSIT
    • Timeout issue happens periodically on the Taishan server, even in release testing.
    • The link issue in DPDK testpmd test cases on Ampere Altra is still there.
    • IPSec & VxLAN performance drop issue on Ampere Altra
      • QAT cards are planned to be shipped
      • need to pay attention to the execution time for IPSec release testing
      • Need to investigate further on performance degradation issue
    • Verify job, Merge Job, Device Testing, and release testing are all fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
      • Will have a debug meeting with RDMA maintainers on the issues.
      • ConnectX6 NIC info will be updated in doc first
  • VPP

04/04/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • CSIT
    • Timeout issue happens periodically on the Taishan server, even in release testing.
    • The link issue in DPDK testpmd test cases on Ampere Altra is still there.
    • IPSec & VxLAN performance drop issue on Ampere Altra
    • Verify job, Merge Job, Device Testing, and release testing are all fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
      • Will have a debug meeting with RDMA maintainers on the issues.
  • VPP

03/07/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • CSIT
    • Timeout issue happens periodically on the Taishan server, even in release testing.
    • The link issue in DPDK testpmd test cases on Ampere Altra is still there.
    • Verify job, Merge Job, Device Testing, and release testing are all fine so far.
    • RDMA PMD claims ConnectX4/5 support; is ConnectX6/7 supported as well?
  • VPP

2/21/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd fails to bring up the XL710 interface (2 Ampere servers back to back); the same interface works fine with VPP and other apps.
            • Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
            • Confirm with Vexxhost people if replacing intel NICs is feasible
            • Tried waiting longer; the interface still does not come up, and a restart is not enough either - need to figure out a reliable workaround.
            • Tried an older version of DPDK - 21.08 does not work either. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check the local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
              • DPDK port/link status broken - l3fwd has the same issue
              • Sent a detailed email to the i40e maintainer on the dpdk-dev mailing list; waiting for a response
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate 22.10 release testing result
            • Compiler version change seems to be one of the factors in the perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3-tsh testbed
            • Timeout issue occurred when starting VPP inside the VM, but not when starting testpmd
            • The isolcpus kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
              • isolcpus seems to be working fine
              • still need to root cause the timeout issue- sometimes slower
              • run dpdk build, just use the non-isolated cores for build
              • both VM and VPP start slower than before
              • VPP loading plugins and timeout happens
              • Is VPP crashing? - no, it does not crash
              • Is the VM bound to isolated cores? - need to check
              • Will set up a live debug session for Tianyu and Juraj
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
    • MLX NICs Planning
      • CX6 and CX7 - CX7 is hard to get on the market - MLX NICs will be used and reported
      • CX6 vpp native rdma driver has issues, dpdk mlx driver is fine.
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact on NDR PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corruption - may be related to memory barriers
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
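The open question in the VM timeout item above, whether the VM is bound to isolated cores, could be checked by comparing a process's CPU affinity with the kernel's isolated CPU list. The following is a minimal sketch, assuming a Linux host that exposes `/sys/devices/system/cpu/isolated` and plain `N` or `N-M` list entries; the helper names are hypothetical and this is not part of CSIT.

```python
import os


def parse_cpu_list(text: str) -> set:
    """Parse a kernel CPU list such as "1-3,8,10-11" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means no isolated CPUs
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def isolated_cpus(path="/sys/devices/system/cpu/isolated") -> set:
    """Read the isolated CPU set from sysfs (path is an assumption)."""
    with open(path) as f:
        return parse_cpu_list(f.read())


def bound_to_isolated(pid: int, isolated: set) -> bool:
    """True if every CPU the given task may run on is in the isolated set."""
    affinity = os.sched_getaffinity(pid)
    return bool(affinity) and affinity <= isolated
```

Calling `bound_to_isolated(pid, isolated_cpus())` with the PID of the VM process gives a first answer. Note that `os.sched_getaffinity` reports the affinity of a single task, so vCPU threads pinned separately would need to be checked by their own thread IDs.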


2/7/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Juraj
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd fails to bring up the XL710 interface (2 Ampere servers back to back); the same interface works fine with VPP and other apps.
            • Local setup not reproduced - need to schedule a debug session to reproduce the issue in FD.io lab.
            • Confirm with Vexxhost people if replacing intel NICs is feasible
            • Tried waiting longer; the interface still does not come up, and a restart is not enough either - need to figure out a reliable workaround.
            • Tried an older version of DPDK - 21.08 does not work either. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check the local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
              • DPDK port/link status broken - l3fwd has the same issue
              • Sent a detailed email to the i40e maintainer on the dpdk-dev mailing list; waiting for a response
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate 22.10 release testing result
            • Compiler version change seems to be one of the factors in the perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3-tsh testbed
            • Timeout issue occurred when starting VPP inside the VM, but not when starting testpmd
            • The isolcpus kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj would try
              • isolcpus seems to be working fine
              • still need to root cause the timeout issue- sometimes slower
              • run dpdk build, just use the non-isolated cores for build
              • both VM and VPP start slower than before
              • VPP loading plugins and timeout happens
              • Is VPP crashing? - no, it does not crash
              • Is the VM bound to isolated cores? - need to check
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
    • MLX NICs Planning
      • CX6 and CX7 - CX7 is hard to get on the market - MLX NICs will be used and reported
      • CX6 vpp native rdma driver has issues, dpdk mlx driver is fine.
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automate rfc2544 no drop rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact on NDR PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corruption - may be related to memory barriers
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate
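For the RFC 2544 no-drop-rate item above, the core of an automated throughput test is a search for the highest offered rate with zero packet loss. The sketch below shows a plain binary search under two assumptions: loss is monotonic in the offered rate, and a hypothetical `measure_loss(rate)` callback runs one trial on the traffic generator and returns the number of lost packets. No real Ixia API is used or implied.

```python
from typing import Callable


def find_ndr(measure_loss: Callable[[float], int],
             max_rate: float,
             precision: float) -> float:
    """Binary-search the highest offered rate with zero packet loss.

    measure_loss is a hypothetical callback: it runs one trial at the
    given rate and returns the number of packets lost in that trial.
    """
    if precision <= 0:
        raise ValueError("precision must be positive")
    # If the device is lossless at the maximum rate, no search is needed.
    if measure_loss(max_rate) == 0:
        return max_rate
    low, high = 0.0, max_rate  # low: treated as lossless, high: known lossy
    while high - low > precision:
        mid = (low + high) / 2
        if measure_loss(mid) == 0:
            low = mid   # no loss: the NDR is at least mid
        else:
            high = mid  # loss seen: the NDR is below mid
    return low
```

The returned value is within `precision` of the true no-drop rate when the monotonic-loss assumption holds. Trial duration and repeat counts, which matter for stable NDR/PDR results, are left to the callback.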

1/17/2023

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
            • Confirm with Vexxhost people if replacing Intel NICs is feasible
            • Tried waiting longer, interface still not up; restart is not enough either - need to figure out a reliable workaround.
            • Tried an older version of DPDK - 21.08 does not work either. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate 22.10 release testing result
            • Compiler version change seems to be one of the factors for perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3n-tsh testbed
            • Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
            • The isolcpus kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj will try
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when used with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corruption - may be related to memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

12/20/2022

  • Attendees
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
            • Confirm with Vexxhost people if replacing Intel NICs is feasible
            • Tried waiting longer, interface still not up; restart is not enough either - need to figure out a reliable workaround.
            • Tried an older version of DPDK - 21.08 does not work either. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate 22.10 release testing result
            • Compiler version change seems to be one of the factors for perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3n-tsh testbed
            • Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
            • The isolcpus kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj will try
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when used with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corruption - may be related to memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

12/06/2022

  • Attendees
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
  • Miscellaneous
    • Reschedule the meeting to 8:30 am for Juraj and 3:30 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
            • Confirm with Vexxhost people if replacing Intel NICs is feasible
            • Tried waiting longer, interface still not up; restart is not enough either - need to figure out a reliable workaround.
            • Tried an older version of DPDK - 21.08 does not work either. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
            • Will talk to the DPDK i40e maintainer to seek their help
        • New links for VPP perf trending/report pages
          • Daily trending: https://s3-docs.fd.io/csit/master/trending/
          • Release report: https://s3-docs.fd.io/csit/master/report/
          • Need to investigate 22.10 release testing result
            • Compiler version change seems to be one of the factors for perf degradation
              • Old version: clang 10.0.0-4ubuntu1, gcc Ubuntu 9.4.0-1ubuntu1~20.04.1
              • New version: clang 14.0.0-1ubuntu1, gcc Ubuntu 11.3.0-1ubuntu1~22.04
          • VM testcase timeout issue on 3n-tsh testbed
            • Timeout issue occurred when starting VPP inside VM, but not when starting testpmd
            • The isolcpus kernel boot parameter is deprecated; Tianyu proposed a solution that Juraj will try
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Enable VPP device testing per patch
        • Voting right for VPP device testing on Arm is enabled
        • VPP device testing on Arm runs per VPP/CSIT patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when used with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corruption - may be related to memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

11/15/2022

  • Attendees
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
  • Miscellaneous
    • Reschedule the meeting to 9 am for Juraj and 3 pm for Shanghai folks
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
        • Good news: no more slowdown after 200 rounds of testing.
          • ThunderX2 servers need to be upgraded from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
          • Distro upgrade to Ubuntu 22.04 is still ongoing - no ETA yet
            • Server configuration will remain the same, already integrated in Ansible playbook
          • Re-enable voting if there are no more issues with 22.04 device testing
            • Submit a patch to enable voting right after meeting
      • Test meltdown/spectre vulnerabilities
        • CSIT maintainers asked whether tools exist to test these vulnerabilities on Arm platforms (not limited to Arm)
        • Will confirm this issue with support team - Lijian
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • VM cases failed only on 3n-alt performance testbed; the error log reports some file missing, likely a configuration issue
        • Another intermittent VM failure happens on tx2 and alt; need to figure out the above case first
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; AVF interface - VPP native driver has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx entries in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting right
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when used with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corruption - may be related to memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate


10/18/2022

  • Attendees
    • Juraj Linkes
    • Tianyu Li
    • Lijian Zhang
    • Jieqiang Wang
  • Miscellaneous
    • Reschedule the meeting to 9 am for Juraj and 3 pm for Shanghai folks
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • CSIT perf numbers vs. local perf numbers
            • VPP cloud image in CSIT vs. natively built VPP in local env
            • One DPDK patch introduced perf degradation on Arm platform
            • Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
          • Juraj will write down the procedure for setting up the Ampere Altra testbed in FD.io lab
            • And the procedure for developing test cases in CSIT (performance & device testing)
            • Juraj should have already sent this to Jieqiang.
          • 22.06 release testing will happen soon
          • NDR/PDR data difference - deep dive needed, waiting for engagement from Ampere folks
          • DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
            • Tried waiting longer, interface still not up; restart is not enough either - need to figure out a reliable workaround.
            • Replace XL710 NIC? - try asking tomorrow.
            • Tried an older version of DPDK - 21.08 does not work either. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
        • New links for VPP perf trending/report pages
        • NUMA issue
          • Will run performance report on Arm testbed once the patch to resolve the NUMA issue is merged
          • Dave will help merge the patch into the corresponding branches


    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
        • Good news: no more slowdown after 200 rounds of testing.
          • ThunderX2 servers need to be upgraded from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
          • Suggest rerunning the tests after the upgrade to 22.04
          • Re-enable voting once there are no more issues with 22.04 device testing
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; AVF interface - VPP native driver has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx entries in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting right
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT cards can be seen with new kernel update
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when used with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and awaiting review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% line-rate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corruption - may be related to memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

9/20/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Lijian Zhang
    • Jieqiang Wang
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • CSIT perf numbers vs. local perf numbers
            • VPP cloud image in CSIT vs. natively built VPP in local env
            • One DPDK patch introduced perf degradation on Arm platform
            • Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
          • Juraj will write down the procedure for setting up the Ampere Altra testbed in FD.io lab
            • And the procedure for developing test cases in CSIT (performance & device testing)
            • Juraj should have already sent this to Jieqiang.
          • 22.06 release testing will happen soon
          • NDR/PDR data difference - deep dive needed, waiting for engagement from Ampere folks
          • DPDK testpmd XL710 interface fails to come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on local setup - need to schedule a debug session to reproduce the issue in FD.io lab.
            • Tried waiting longer, interface still not up; restart is not enough either - need to figure out a reliable workaround.
            • Replace XL710 NIC? - try asking tomorrow.
            • Tried an older version of DPDK - 21.08 does not work either. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check local XL710 firmware version
        • New links for VPP perf trending/report pages
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
        • Good news: no more slowdown after 200 rounds of testing.
          • ThunderX2 servers need to be upgraded from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
          • Suggest rerunning the tests after the upgrade to 22.04
          • Re-enable voting once there are no more issues with 22.04 device testing
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; AVF interface - VPP native driver has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and the results look good so far
              • Patch has been merged
              • Instead of mounting the whole /dev/vfio, mount only the /dev/vfio/xxx entries in use
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week, then enable voting right
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
        • QAT-enabled kernel patch expected to be released around October; kernel upgrade required.
      • QAT test cases are developed based on Python APIs / CLIs
    • VPP build servers - 2 new Ampere Altra servers; whether to decommission the 2 old ThunderX1 servers - need to confirm
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate


9/6/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Lijian Zhang
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-alt/
          • CSIT perf numbers VS local perf numbers
            • VPP cloud image in CSIT VS native built VPP in local env
            • One DPDK patch introduced perf degradation on Arm platform
            • Configuration differences between CSIT env and local env (hugepage size, startup.conf parameters, etc.)
          • Juraj will write down the procedure for setting up Ampere Altra servers in the FD.io lab
            • And the procedure for developing test cases in CSIT (performance & device testing)
            • Juraj should have already sent this to Jieqiang.
          • 22.06 release testing will happen soon
          • NDR/PDR data difference - deep dive needed, waiting for Ampere folks' engagement
          • DPDK testpmd failure: XL710 interface does not come up (2 Ampere servers back to back) - with VPP and other apps, the same interface works fine.
            • Not reproduced on the local setup - need to schedule a debug session to reproduce the issue in the FD.io lab.
            • Tried waiting longer, interface still not up; a restart is not enough either - need to figure out a reliable workaround.
            • Replace XL710 NIC? - try asking tomorrow.
            • Tried an older version of DPDK - 21.08 does not work. May need to try an even older version.
            • Could it be related to the NIC's speed, duplex and auto-negotiation configuration?
            • May try to upgrade the NIC's firmware - check the local XL710 firmware version
        • New links for VPP perf trending/report pages
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
        • Good news: no more slowdown after 200 rounds of testing.
          • ThunderX2 servers need to be upgraded from 20.04 to 22.04 - Peter - ongoing, ETA a few days to 2 weeks
          • Suggest rerunning the tests after the upgrade to 22.04
          • Re-enable voting once 22.04 device testing shows no more issues
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating a VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally firstly.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
        • QAT enabled Kernel patch release about October, upgrade kernel required.
      • QAT test cases are developed based on Python APIs / CLIs
    • VPP build servers - 2 new ampere altra server, 2 old thunder x1 servers decommission or not - need to confirm
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

8/16/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Masksym Vynnvk
    • Jieqiang Wang
    • Tianyu Li
    • Lijian Zhang
  • CSIT
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating a VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally firstly.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
    • VPP build servers - 2 new ampere altra server, 2 old thunder x1 servers decommission or not - need to confirm
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

8/2/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Masksym Vynnvk
    • Jieqiang Wang
    • Tianyu Li
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating a VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are being wired up, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on X86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally firstly.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
    • VPP build servers - 2 new ampere altra server, 2 old thunder x1 servers decommission or not - need to confirm
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX5 NIC - scalability test
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case
            • IPSec two core test with QAT offload, performance is poor on Ampere - need to investigate

7/19/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
  • CSIT
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating a VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are being wired up, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on X86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally firstly.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP using 100G MLX NIC
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
      • Work on IPsec input - Zach
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case


7/5/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
  • CSIT
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating a VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are being wired up, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on X86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally firstly.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • Investigate VPP cross compilation with buildroot - for running VPP on FVP - distro like ubuntu is slower than buildroot - Lijian
      • Depends on some libraries, dpdk, ipsec_mb, rdma-core and nasm etc - optional
    • Investigate One Terabit throughput test on Arm platform
      • Investigate automating the RFC 2544 no-drop-rate throughput test with Ixia on N1 platform - Tianyu
      • Kernel cmdline may impact NDR/PDR results - Jieqiang
      • Intern to help benchmark VPP on N1 platforms
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Tested perfmon patch - Jieqiang
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on arm to test IPsec crypt - Picchi
          • Run IPsec with QAT on skylake this week, run the same setup on Ampere is next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card
              • VPP rdma native driver only - VPP metadata corrupted - may be related to a memory barrier
            • QAT single core test done - investigate multiple core QAT case

6/21/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating a VM (qemu cli) - tx2 node
        • Rebooting the server recovered it; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: VLAN stripping is configured on the VF, and the PF driver returns "not allowed".
          • Not related to iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs when mounting /dev/vfio
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on x86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Tested perfmon patch - Jieqiang
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card

6/7/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooted the server to recover; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on x86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
        • Investigate SVE vs NEON packet checksum comparison
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Tested perfmon patch - Jieqiang
      • Review SPD flow cache patch from Intel folks - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage
            • Investigating crash issue with 90% linerate IPSec traffic with QAT card

5/17/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Tina Tsou
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooted the server to recover; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on x86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)
            • Needs a kernel patch to resolve crash issue for QAT card
              • Patch made by Yoan is upstream and waits for review
              • Try patched VPP to verify QAT card usage

4/5/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Tina Tsou
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • Device Testing on ThunderX2 servers
        • Juraj will commit the patch to disable the failing test cases
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooted the server to recover; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on x86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
    • QAT cards
      • Govind will ship another 2x QAT from Austin to FD.io lab
      • Will procure 2x QAT cards and verify them internally first.
      • The existing QAT cards will be removed and returned to Vexxhost/FD.io lab
      • QAT test cases are developed based on Python APIs / CLIs
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)

3/15/2022

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Lijian Zhang
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooted the server to recover; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Performance servers have arrived at FD.io lab
        • Servers are in the process of being wired, expected to be operational soon
        • Will follow the trend for Arm servers if more mlx NICs are installed on x86
        • Plan to install QAT cards on performance servers
        • Juraj to get QAT card availability from CSIT community
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
      • Rebase the patch and final round of benchmarking for frag/reassembly nodes
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
            • Kernel with aarch64 patch is expected to release soon
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step
          • Patch to resolve iommu issue for mlx NIC when using with QAT card
            • Benchmark IPSec test case with QAT card/mlx NIC(single-core/multi-core)

3/1/2022

  • Attendees
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooted the server to recover; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Paperwork for shipment is done
      • Build servers will arrive at end of Jan
      • Performance servers will arrive in Feb
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • reassembly node opt by adding prefetch
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
    • VPP IPv6 fragmentation
    • CNF PoC proposal preparation- Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

1/25/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating the VM (qemu cli) - tx2 node
        • Rebooted the server to recover; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to the PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to the iavf driver; the AVF interface (VPP native driver) has this issue
          • DPDK iavf ignores the error and continues initialization, while VPP aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from Ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Run six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and result looks good right now
              • Patch has been merged
              • By not mounting whole /dev/vfio, mount only /dev/vfio/xxx used
              • Addressed comments, waiting Peter's review - Peter approved, patch merged and monitoring.
              • Met tx2 server reboot issue when monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstall the tx2 server with older kernel version and VPP device testing works fine with Juraj's patch
              • VPP Device configuration align with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and enable vote right then
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm with RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
      • Paperwork for shipment is done
      • Build servers will arrive at end of Jan
      • Performance servers will arrive in Feb
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on arm internal platform
    • VPP IPv4 fragmentation & reassembly - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

1/18/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • VM cases failed on the 1 min timeout when creating a VM (qemu cli) - tx2 node
        • Rebooted the server to recover; monitoring
        • Need to look into it, try manually
          • May need to upgrade iavf driver
      • Server inaccessible
        • Rebooting the server recovered the service
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; AVF interface (VPP native driver) has this issue
          • dpdk iavf ignores the error and continues initialization, while vpp aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and then enable voting rights
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • 2 build servers ready for shipment - 1 RU, no pcie slot for NICs
      • 2 performance servers waiting for Intel NICs
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

1/11/2022

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; AVF interface (VPP native driver) has this issue
          • dpdk iavf ignores the error and continues initialization, while vpp aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and then enable voting rights
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
      • Benchmark IPv4 fragmentation node using rdma plugin
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

12/14/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; AVF interface (VPP native driver) has this issue
          • dpdk iavf ignores the error and continues initialization, while vpp aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and then enable voting rights
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
    • VPP IPv6 Benchmarking and Profiling - Jieqiang
      • IPv6 profiling
        • No perf bump for lookup_x2 function in FD.io Gerrit
        • Try Mellanox NICs for IPv6 routing tests
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

12/07/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VM cases failed only on Arm
        • Tried increasing the timeout to see if it fixes the issue
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; AVF interface (VPP native driver) has this issue
          • dpdk iavf ignores the error and continues initialization, while vpp aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and then enable voting rights
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab - about Jan 2022
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
    • VPP IPv6 Benchmarking and Profiling - Jieqiang
      • IPv6 profiling
        • No perf bump for lookup_x2 function in FD.io Gerrit
        • Try Mellanox NICs for IPv6 routing tests
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Rely on kernel patch https://lore.kernel.org/linux-arm-kernel/20210517195405.3079458-1-robh@kernel.org/ to enable the feature
      • v10 kernel patch is ready, which fixes intermittent large statistic number for events
        • Modify the commit message and upstream the perfmon patch - Zach
          • Depends on kernel patch
        • Building Intel QAT driver on Arm to test IPsec crypto - Picchi
          • Run IPsec with QAT on Skylake this week; running the same setup on Ampere is the next step

11/30/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; AVF interface (VPP native driver) has this issue
          • dpdk iavf ignores the error and continues initialization, while vpp aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and then enable voting rights
            • Ping Dave about enabling VPP device testing per patch
    • New Arm servers shipment to the FD.io lab
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
        • 500G disk/256G RAM
        • Each job will consume about 16G memory
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation - Tianyu & Jieqiang
      • Add multi-arch support for ip4-frag node but see no perf bump
      • Apply loop unrolling on ip4-frag node
    • VPP IPv6 Benchmarking and Profiling - Jieqiang
      • IPv6 profiling
        • No perf bump for lookup_x2 function in FD.io Gerrit
        • Try Mellanox NICs for IPv6 routing tests
    • CNF PoC proposal preparation - Tianyu
      • Add support for VPP aarch64 docker image build
      • Calico use cases exploration on VPP
      • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach

11/23/2021

  • Attendees
    • Tianyu Li
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; AVF interface (VPP native driver) has this issue
          • dpdk iavf ignores the error and continues initialization, while vpp aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
              • VPP Device configuration aligned with VPP Performance configuration - no issue yet
          • Enable VPP device testing per patch
            • Monitor for a week and then enable voting rights
    • New Arm servers shipment to the FD.io lab
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
        • Need to confirm RAM/disk size for the new build servers
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation
    • VPP IPv6 Benchmarking and Profiling
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Performance degradation with quad loop unrolling applied on ip6_lookup_inline
        • Patch the current kernel to enable perfmon plugin on VPP
        • Need to check performance for IPv6 subnet routing
        • Try Mellanox NICs for IPv6 routing tests
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach


11/16/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • AVF interface creation issue:
        • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
          • Related to PF i40e driver: with VLAN stripping configured on the VF, the PF driver returns "not allowed".
          • Not related to iavf driver; AVF interface (VPP native driver) has this issue
          • dpdk iavf ignores the error and continues initialization, while vpp aborts the init process
          • Intel will fix the issue in the PF driver - workaround: use the old i40e driver (from ubuntu 20.04)
        • Race condition occurs on /dev/vfio mounting
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045 and results look good so far
              • Patch has been merged
              • Mount only the /dev/vfio/xxx in use instead of the whole /dev/vfio
              • Addressed comments, waiting for Peter's review - Peter approved, patch merged and monitoring.
              • Hit a tx2 server reboot issue while monitoring - RAS CONTROLLER: Fatal unrecoverable error detected ** NBU Error **
              • Reinstalled the tx2 server with an older kernel version; VPP device testing works fine with Juraj's patch
          • Enable VPP device testing per patch
    • New Arm servers shipment to the FD.io lab
      • New servers are in the procurement process
      • Plan to replace old thunderx1 build servers with more advanced Arm servers
      • Intel NIC firmware upgrade on Arm - not supported
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
        • Liangxing will help to benchmark VPP with FPGA enabled with DMC-620
    • VPP IPv4 fragmentation
    • VPP IPv6 Benchmarking and Profiling
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Performance degradation with quad loop unrolling applied on ip6_lookup_inline
        • Patch the current kernel to enable perfmon plugin on VPP
        • Need to check performance for IPv6 subnet routing
        • Try Mellanox NICs for IPv6 routing tests
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach

11/09/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tina Tsou
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
      • VPP IPv4 fragmentation
    • VPP IPv6 Benchmarking and Profiling
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Performance degradation with quad loop unrolling applied on ip6_lookup_inline
        • Patch the current kernel to enable perfmon plugin on VPP
        • Need to check performance for IPv6 subnet routing
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach

11/02/2021

  • Attendees
    • Tianyu Li
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Tina Tsou
  • VPP
    • VPP SVE implementation - Lijian
      • SVE validation on FPGA platform - Confluence page ready
      • Run L3 traffic testing with NEON/SVE-VLA/SVE-VLS VPP version
        • Performance number: NEON>SVE-VLA>SVE-VLS on FPGA without DMC-620
        • FPGA team promises to provide FPGA image with DMC-620
    • VPP IPv6 Benchmarking and Profiling
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Performance degradation with quad loop unrolling applied on ip6_lookup_inline
        • Patch the current kernel to enable perfmon plugin on VPP
        • Need to check performance for IPv6 subnet routing
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Improve ansible scripts to deploy VPP&snort on K8S pods automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach


10/26/2021

  • Attendees
    • Juraj Linkes
    • Tianyu Li
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week. - closed
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • Inbound IPsec: reproduced and need to investigate - Juraj
          • Flow cache with 1 and 10 SPD entries is slower; still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
        • IPsec SPD input/output case ongoing
          • Adding IPsec SPD outbound test cases 64B 1, 100 and 1k SPD entries, 1, 2, 4 cores, on tx2 testbed - clarified
            • Flow cache on and off cases need to be measured.
          • L2 BD 20k test cases take too long to execute; removed on taishan.
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
        • 3n-tsh testbed unreachable, investigating right now - Juraj
          • TG firmware upgrade is in progress
          • Server unreachable due to firmware & driver update - resolved - update all done
        • Release testing for 21.10 starts
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware, etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving dpdk/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Narrowed down whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
            • A race condition occurs
            • Try mounting only part of /dev/vfio to see if the issue can be resolved
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045; results look good so far
              • Addressed comments, waiting for Peter's review
              • Will enable voting soon after the patch is merged
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
        • Need the vexxhost team to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
        • New servers are in the procurement process
        • Plan to replace old thunderx1 build servers with more advanced Arm servers
            • CPU not fully utilized on Arm, need further investigation
    • Intel NIC firmware upgrade on Arm - not supported
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free DPDK enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see if there is any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run on FPGA
            • Enabling DMC 620 is closer to a real system, but performance will drop
            • Build a system using VPP memif and pktgen
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot functions - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon and study the two node functions
        • VPP CLI configuration and hotspot functions are recorded on the Confluence page
        • Plan to try quad loop unrolling for ip6_lookup_inline function
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Try to use ansible to deploy VPP automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for it before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach
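
Note on the /dev/vfio item above: the idea of mounting only part of /dev/vfio, rather than passing --volume /dev/vfio:/dev/vfio, could look roughly like the sketch below. This is not the CSIT patch. It assumes the usual Linux sysfs layout, where /sys/bus/pci/devices/<addr>/iommu_group is a symlink whose last component is the IOMMU group number, and it uses the container runtime's --device option. The helper names iommu_group and vfio_device_args are hypothetical.

```python
import os


def iommu_group(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the IOMMU group number of a PCI device as a string.

    Assumes the usual sysfs layout: <sysfs_root>/<pci_addr>/iommu_group is a
    symlink whose last path component is the group number.
    """
    link = os.path.join(sysfs_root, pci_addr, "iommu_group")
    return os.path.basename(os.readlink(link))


def vfio_device_args(pci_addrs, sysfs_root="/sys/bus/pci/devices"):
    """Build container arguments exposing only the VFIO nodes that are needed.

    Hypothetical helper: instead of mounting all of /dev/vfio, pass the VFIO
    container node plus one node per IOMMU group in use.
    """
    groups = sorted({iommu_group(a, sysfs_root) for a in pci_addrs}, key=int)
    args = ["--device", "/dev/vfio/vfio"]
    for group in groups:
        args += ["--device", f"/dev/vfio/{group}"]
    return args
```

Whether exposing only these nodes avoids the "Device or resource busy" error is the open question recorded in the minutes; the sketch only shows how the argument list could be derived.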

10/19/2021

  • Attendees
    • Juraj Linkes
    • Tianyu Li
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware, etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from the host to a container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Narrowed down whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
            • A race condition occurs
            • Try mounting only part of /dev/vfio to see if the issue can be resolved
            • Ran six rounds of tests with patch https://gerrit.fd.io/r/c/csit/+/34045; results look good so far
              • Will enable voting soon after the patch is merged
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
        • Need the vexxhost team to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
        • New servers are in the procurement process
        • Plan to replace old thunderx1 build servers with more advanced Arm servers
            • CPU not fully utilized on Arm, need further investigation
    • Intel NIC firmware upgrade on Arm - not supported
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free DPDK enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see if there is any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run on FPGA
            • Enabling DMC 620 is closer to a real system, but performance will drop
            • Build a system using VPP memif and pktgen
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot functions - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon and study the two node functions
        • VPP CLI configuration and hotspot functions are recorded on the Confluence page
        • Plan to try quad loop unrolling for ip6_lookup_inline function
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
          • Try to use ansible to deploy VPP automatically
    • VPP IPsec on Arm - Govind
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for it before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach

10/12/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Govindarajan Mohandoss
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware, etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from the host to a container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Narrowed down whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
            • A race condition occurs
            • Try mounting only part of /dev/vfio to see if the issue can be resolved
            • Talked with Peter; Juraj is working on a prototype that mounts only part of /dev/vfio
            • x86 VPP device job is fine because its firmware & driver are old
            • Arm VPP device servers have drivers updated; VLAN stripping is not allowed, and the VLAN configuration cannot be removed from lab view.
            • Only performance testbeds have NIC drivers updated
            • maintainer doesn't want to a option from vpp config
            • May need to check whether x86 has the same issue with the same driver version before reaching out to the Intel folks
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
        • Need the vexxhost team to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
        • New servers are in the procurement process
        • Plan to replace old thunderx1 build servers with more advanced Arm servers
            • CPU not fully utilized on Arm, need further investigation
    • Intel NIC firmware upgrade on Arm - not supported
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free DPDK enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see if there is any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run on FPGA
            • Enabling DMC 620 is closer to a real system, but performance will drop
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot functions - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon and study the two node functions
        • VPP CLI configuration and hotspot functions are recorded on the Confluence page
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for it before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach
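
Note on the IPsec flow cache item above: the two behaviours the unit tests target, hash collisions and stale entry overwrites, can be illustrated with a toy model. This is not the VPP implementation. It assumes a direct-mapped cache in which a colliding flow simply overwrites the bucket, and an epoch counter that makes every cached entry stale when the policy database changes. The class and method names are hypothetical.

```python
class FlowCache:
    """Toy direct-mapped flow cache (illustrative only, not VPP's code)."""

    def __init__(self, n_buckets=8, hash_fn=hash):
        self.n_buckets = n_buckets
        self.hash_fn = hash_fn
        self.buckets = [None] * n_buckets  # each slot: (key, policy, epoch)
        self.epoch = 0  # bumped whenever the policy database changes

    def _index(self, key):
        return self.hash_fn(key) % self.n_buckets

    def invalidate(self):
        """Mark every cached entry stale without touching the buckets."""
        self.epoch += 1

    def lookup(self, key):
        """Return the cached policy, or None on a miss, collision or stale entry."""
        slot = self.buckets[self._index(key)]
        if slot is None:
            return None
        cached_key, policy, epoch = slot
        if cached_key != key or epoch != self.epoch:
            return None
        return policy

    def insert(self, key, policy):
        """Store the policy, overwriting whatever occupied the bucket."""
        self.buckets[self._index(key)] = (key, policy, self.epoch)
```

A lookup that returns None would fall back to the full SPD search and then call insert, so a collision or a stale entry costs one slow-path lookup but never returns a wrong policy.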

09/28/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware, etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from the host to a container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Narrowed down whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
            • A race condition occurs
            • Try mounting only part of /dev/vfio to see if the issue can be resolved
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
        • Need the vexxhost team to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
        • New servers are in the procurement process
        • Plan to replace old thunderx1 build servers with more advanced Arm servers
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free DPDK enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see if there is any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run on FPGA
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot functions - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon and study the two node functions
        • VPP CLI configuration and hotspot functions are recorded on the Confluence page
    • VPP Prefetch
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for it before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach

09/14/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 21.06 testing partially finished; 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and needs investigation - Juraj
                • Need to learn more about the RFC and need time to understand it - on hold - waiting for Neale's response
          • Flow cache with 1 and 10 SPD entries is slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
        • Release testing done.
          • A few more jobs are running for release 21.06 and will finish soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware, etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from the host to a container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
          • Narrowed down whether some of those arguments are the cause, and that is indeed the case: --volume /dev/vfio:/dev/vfio causes the issue.
        • AVF interface creation issue:
          • Can't create AVF interface on VFs with configured VLAN - happens with latest i40e driver on tx2
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
        • Need the vexxhost team to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free DPDK enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see if there is any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run on FPGA
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot functions - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon and study the two node functions
        • VPP CLI configuration and hotspot functions are recorded on the Confluence page
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied instead of referenced from the src port to all dst ports
        • Direct/Indirect mbuf for VPP multicast testing
        • Tried IPv4 multicast & L2 flood testing, both of which work fine
        • ip4-replicate node (IPv4 multicast) and l2-flood node (L2 flood testing)
          • show that the mbuf is copied, so ref_cnt is always one
            • DPDK 21.08 has the patches; need to verify on VPP
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for it before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach
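
Note on the VPP Prefetch item above: the reason the cache line size matters is that the number of prefetches needed to cover an object depends on it. The sketch below is only an illustration of that arithmetic, not the VPP refactoring itself. The function name prefetch_addresses is hypothetical, and 64 and 128 bytes are used purely as example line sizes.

```python
def prefetch_addresses(addr, size, line_size):
    """Return the cache-line-aligned addresses covering [addr, addr + size).

    line_size must be a positive power of two (for example 64 or 128 bytes).
    """
    if line_size <= 0 or line_size & (line_size - 1):
        raise ValueError("line_size must be a positive power of two")
    if size <= 0:
        return []
    first = addr & ~(line_size - 1)              # round the start down to a line
    last = (addr + size - 1) & ~(line_size - 1)  # line holding the last byte
    return list(range(first, last + line_size, line_size))
```

For a 128-byte object that starts on a line boundary, this yields two addresses with 64-byte lines and one with 128-byte lines, so code that hard-codes one line size either issues redundant prefetches or misses part of the object on the other.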

09/07/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 21.06 testing partially finished; 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and needs investigation - Juraj
                • Need to learn more about the RFC and need time to understand it - on hold - waiting for Neale's response
          • Flow cache with 1 and 10 SPD entries is slower, still investigating. Local manual tests and CSIT give different results for 1-10 SPD policies.
        • Release testing done.
          • A few more jobs are running for release 21.06 and will finish soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware, etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 by making the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from the host to a container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
          • Still see the /dev/vfio resource busy error after the Linux kernel upgrade, but less frequently than before
          • Dig into the log for more details - Juraj
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
        • Need the vexxhost team to confirm Ethernet / power cable type info.
        • Lijian has summarized feedback from Juraj and raised a Jira ticket with the DevOps team
    • VPN access request to FD.io Arm servers
      • Lijian and Jieqiang now have VPN access
      • mbuf-fast-free DPDK enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see if there is any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run on FPGA
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot functions - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon and study the two node functions
        • VPP CLI configuration and hotspot functions are recorded on the Confluence page
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied instead of referenced from the src port to all dst ports
        • Direct/Indirect mbuf for VPP multicast testing
        • Tried IPv4 multicast & L2 flood testing, both of which work fine
        • ip4-replicate node (IPv4 multicast) and l2-flood node (L2 flood testing)
          • show that the mbuf is copied, so ref_cnt is always one
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
        • Upstream patch depends on a kernel patch; waiting for it before upstreaming to VPP
        • Building Intel QAT driver on Arm to test IPsec crypto - Zach
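
Note on the mbuf-fast-free item above: the refcnt > 1 scenario matters because DPDK documents that, with the MBUF_FAST_FREE Tx offload, the application must guarantee that all mbufs on a queue come from the same mempool and have a reference count of one. The sketch below only models that precondition. Mbuf and fast_free_safe are hypothetical stand-ins, not DPDK or VPP types.

```python
from dataclasses import dataclass


@dataclass
class Mbuf:
    """Minimal stand-in for a packet buffer (hypothetical, not DPDK's rte_mbuf)."""
    pool: str
    refcnt: int = 1


def fast_free_safe(tx_burst):
    """Return True if every buffer in the burst meets the fast-free precondition.

    Encodes the two documented conditions: a single mempool for the queue
    and a reference count of exactly one on every buffer.
    """
    pools = {m.pool for m in tx_burst}
    return len(pools) <= 1 and all(m.refcnt == 1 for m in tx_burst)
```

Since the minutes record that ip4-replicate and l2-flood copy the mbuf, every buffer in those tests would keep refcnt == 1 and pass this check; a path that shares buffers by reference would not.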

08/31/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 21.06 testing partially finished; 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate - Juraj
                • Need more time to study the RFC - on hold - waiting for Neale's response
          • Flow cache with 1 and 10 SPD entries is slower, still investigating. Local manual tests and CSIT give different results with 1-10 SPD policies.
        • Release testing done.
          • A few more jobs run for release 21.06 and will be finished soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 version by making iavf PMD as default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • config option set to N: /dev/vfio "Device or resource busy" error
          • config option set to Y & iommu_passthrough = 1: IP packets Rx timeout
            • The longer the server runs, the more test cases fail
          • Next to do
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Juraj modified script to reproduce the issue - Lijian will try it locally
              • Also seen in Intel QAT card from Zach
            • Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
            • Lijian has slightly different firmware and driver versions
            • Tried Mellanox card (rdma driver) multiple times - did not see the issue that happens on the XL710 NIC - Juraj
            • Lijian can use Juraj's script to reproduce the issue on local tx2 server
              • Reducing the numa buffer allocation size resolves this issue
              • Observed from the error log of numa buffer allocation
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
        • Need the vexxhost team to confirm the Ethernet / power cable types.
    • VPN access request to FD.io Arm servers
      • Lijian/Jieqiang have got VPN access now
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan hit and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
            • Run standalone SVE test cases on FPGA
            • Ask for DMC 620 images to run for FPGA
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon & understand two node functions
        • VPP CLI configuration and hotspot function are recorded in Confluence page
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied instead of referenced from src port to all dst ports
        • Direct/Indirect mbuf for VPP multicast testing
        • Try IPv4 multicasting & L2 flood testing which works fine
        • ip4-replicate node in IPv4 multicasting/l2-flood node in L2 flood testing
          • show mbuf is copied so that ref_cnt will always be one
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
      • Review perfmon code - Lijian & Govind
      • Implemented statistics from PMUv3 - done
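
The IPsec flow cache item above mentions tests for hash collisions and stale entry overwrites. As a rough illustration of what such tests exercise (a minimal sketch, not VPP's implementation), the Python below puts a fixed-size direct-mapped cache in front of a slow policy lookup. A colliding flow simply overwrites the slot, and an epoch counter makes old entries stale after a policy change. The names `FlowCache`, `invalidate`, `lookup` and `slow_lookup` are hypothetical.

```python
class FlowCache:
    """Direct-mapped cache in front of a slow policy lookup (illustrative only)."""

    def __init__(self, n_slots, slow_lookup):
        self.slots = [None] * n_slots   # each slot holds (key, epoch, result)
        self.slow_lookup = slow_lookup  # fallback, e.g. a linear policy search
        self.epoch = 0                  # bumped whenever the policies change
        self.hits = 0
        self.misses = 0

    def invalidate(self):
        # Entries stored under an older epoch are treated as stale on lookup.
        self.epoch += 1

    def lookup(self, key):
        idx = hash(key) % len(self.slots)
        entry = self.slots[idx]
        if entry is not None and entry[0] == key and entry[1] == self.epoch:
            self.hits += 1
            return entry[2]
        # Miss: empty slot, collision with another flow, or stale entry.
        self.misses += 1
        result = self.slow_lookup(key)
        self.slots[idx] = (key, self.epoch, result)  # overwrite whatever was there
        return result
```

Creating the cache with a single slot forces every flow to collide, which is a simple way to drive the overwrite path in a unit test.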

08/24/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate - Juraj
                • Need more time to study the RFC - on hold - waiting for Neale's response
          • Flow cache with 1 and 10 SPD entries is slower, still investigating. Local manual tests and CSIT give different results with 1-10 SPD policies.
        • Release testing done.
          • A few more jobs run for release 21.06 and will be finished soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 version by making iavf PMD as default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • config option set to N: /dev/vfio "Device or resource busy" error
          • config option set to Y & iommu_passthrough = 1: IP packets Rx timeout
            • The longer the server runs, the more test cases fail
          • Next to do
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Juraj modified script to reproduce the issue - Lijian will try it locally
              • Also seen in Intel QAT card from Zach
            • Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
            • Lijian has slightly different firmware and driver versions
            • Tried Mellanox card (rdma driver) multiple times - did not see the issue that happens on the XL710 NIC - Juraj
            • Lijian can use Juraj's script to reproduce the issue on local tx2 server
              • Reducing the numa buffer allocation size resolves this issue
              • Observed from the error log of numa buffer allocation
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
        • Need the vexxhost team to confirm the Ethernet / power cable types.
    • VPN access request to FD.io Arm servers
      • Lijian/Jieqiang have got VPN access now
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan hit and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
          • Able to access FPGA platform, investigating adding vpp to buildroot filesystem - Lijian
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon & understand two node functions
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied instead of referenced from src port to all dst ports
        • Will try L2 flood test case & understand VPP/multicast code
        • Direct/Indirect mbuf for VPP multicast testing
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
        • For 64B cacheline size native build on Arm, may need to change code.
        • Current VPP does not support 64B cacheline size compilation for Arm images.
        • Issues about prefetch on current VPP code base
          • Issue 1 support 128B/64B cache-line size in Arm image
          • Issue 2 prefetch 'overflow' for native build
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
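
The prefetch items above hinge on the difference between 64B and 128B cache lines. The arithmetic behind that can be sketched in a few lines of Python: the number of cache lines a buffer spans depends on the line size, so a fixed number of prefetches tuned for one size covers more or less than the data needed on the other. This is only a sketch of the general idea, not VPP code, and `lines_touched` is a hypothetical helper.

```python
def lines_touched(offset, size, line_size):
    """Count cache lines covered by the byte range [offset, offset + size)."""
    if size <= 0:
        return 0
    first = offset // line_size               # index of the first line touched
    last = (offset + size - 1) // line_size   # index of the last line touched
    return last - first + 1
```

For example, a 128-byte object at offset 0 spans one 128B line but two 64B lines, and a 64-byte object that starts 32 bytes into a 64B line spans two lines.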

08/17/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate - Juraj
                • Need more time to study the RFC
          • Flow cache with 1 and 10 SPD entries is slower, still investigating. Local manual tests and CSIT give different results with 1-10 SPD policies.
        • Release testing ongoing
          • A few more jobs run for release 21.06 and will be finished soon
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 version by making iavf PMD as default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • config option set to N: /dev/vfio "Device or resource busy" error
          • config option set to Y & iommu_passthrough = 1: IP packets Rx timeout
            • The longer the server runs, the more test cases fail
          • Next to do
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Juraj modified script to reproduce the issue - Lijian will try it locally
              • Also seen in Intel QAT card from Zach
            • Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
            • Lijian has slightly different firmware and driver versions
            • Tried Mellanox card (rdma driver) multiple times - did not see the issue that happens on the XL710 NIC - Juraj
            • Lijian can use Juraj's script to reproduce the issue on local tx2 server
              • Reducing the numa buffer allocation size resolves this issue
              • Observed from the error log of numa buffer allocation
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
    • VPN access request to FD.io Arm servers
      • Lijian/Jieqiang have got VPN access now
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan hit and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • IPv6 profiling
        • Hotspot function - ip6_lookup_node/ip6_rewrite_node
        • Will try perfmon & understand two node functions
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
      • Try IPv4 multicast testing to verify the scenario when refcnt > 1
        • GDB shows that mbufs are copied instead of referenced from src port to all dst ports
        • Will try L2 flood test case & understand VPP/multicast code
        • Direct/Indirect mbuf for VPP multicast testing
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
        • For 64B cacheline size native build on Arm, may need to change code.
        • Current VPP does not support 64B cacheline size compilation for Arm images.
        • Issues about prefetch on current VPP code base
          • Issue 1 support 128B/64B cache-line size in Arm image
          • Issue 2 prefetch 'overflow' for native build
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • Discussion on the default action on the IPsec inbound interface which does not match
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
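
On the mbuf-fast-free item above: DPDK documents DEV_TX_OFFLOAD_MBUF_FAST_FREE as requiring that, per TX queue, all mbufs come from the same mempool and have a reference count of 1. That is why the multicast case with refcnt > 1 is worth checking. The toy Python model below only illustrates that precondition. `Mbuf` and `fast_free_safe` are hypothetical names, not DPDK or VPP APIs.

```python
from dataclasses import dataclass


@dataclass
class Mbuf:
    pool: str        # name of the mempool the buffer came from (toy stand-in)
    refcnt: int = 1  # reference count, 1 means a single owner


def fast_free_safe(tx_burst):
    """Return True if a burst meets the fast-free precondition:
    every mbuf comes from one mempool and has refcnt == 1."""
    if not tx_burst:
        return True
    pool = tx_burst[0].pool
    return all(m.pool == pool and m.refcnt == 1 for m in tx_burst)
```

If replicated packets are copied rather than referenced, as the GDB observation in the minutes suggests, each copy keeps refcnt == 1 and the check passes for multicast traffic as well.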

08/10/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Outbound IPsec finished.
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate
          • Flow cache with 1 and 10 SPD entries is slower, still investigating. Local manual tests and CSIT give different results with 1-10 SPD policies.
        • Release testing ongoing
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 version by making iavf PMD as default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • config option set to N: /dev/vfio "Device or resource busy" error
          • config option set to Y & iommu_passthrough = 1: IP packets Rx timeout
            • The longer the server runs, the more test cases fail
          • Next to do
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Juraj modified script to reproduce the issue - Lijian will try it locally
              • Also seen in Intel QAT card from Zach
            • Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
            • Lijian has slightly different firmware and driver versions
            • Tried Mellanox card (rdma driver) multiple times - did not see the issue that happens on the XL710 NIC - Juraj
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
    • VPN access request to FD.io Arm servers
      • Lijian has got VPN access now
      • Juraj signed Jieqiang's key
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan hit and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128, CLI issue only, CSIT's python API works fine.
      • Internal patch to resolve this issue under review - upstreamed
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
        • For 64B cacheline size native build on Arm, may need to change code.
        • Current VPP does not support 64B cacheline size compilation for Arm images.
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare simd improvement, cache to look up
      • 4 loop unrolling decreasing performance
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
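
For the IPv6 route item above, the expected network for prefix lengths 123-128 can be computed independently of VPP and compared with what the CLI displays. A minimal sketch using Python's standard `ipaddress` module follows. The helper name `expected_route` is hypothetical, and the address used in the note after the code is from the IPv6 documentation range, not from the minutes.

```python
import ipaddress


def expected_route(addr, prefix_len):
    """Return the canonical network string for addr/prefix_len.

    strict=False masks off any host bits instead of raising an error.
    """
    net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
    return str(net)
```

A /123 keeps all but the last five bits, so `expected_route("2001:db8::ff", 123)` gives `2001:db8::e0/123`, while a /128 leaves the address unchanged.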

08/03/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2106-3n-tsh/
        • 2106 testing partially finished. 21.01.1 ongoing, should be done sometime next week.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Try VPP IPsec test cases with a fix on outbound interface - Govind & Juraj
              • Outbound IPsec: 10 entries still slower with latest change, related to traffic pattern
              • Waiting for new version of patchset to verify test cases
            • Will try the fix on inbound IPsec tests when the Jenkins is back to normal - Juraj
              • Inbound IPsec: reproduced and need to investigate
          • Flow cache with 1 and 10 SPD entries is slower, still investigating. Local manual tests and CSIT give different results with 1-10 SPD policies.
        • Release testing ongoing
          • Comparison between 21.06 and 21.01.1 is ongoing.
        • IPsec SPD input/output case ongoing
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issues still exist
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses i40evf PMD and it is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05 version by making iavf PMD as default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
      • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
        • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
          • Results in the same failure as before; only happens on the AArch64 platform
        • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
          • config option set to N: /dev/vfio "Device or resource busy" error
          • config option set to Y & iommu_passthrough = 1: IP packets Rx timeout
            • The longer the server runs, the more test cases fail
          • Next to do
            • Need to figure out what arm-smmu-v3.4.auto: event 0x10 means
              • Also seen in Intel QAT card from Zach
            • Will try to reproduce this issue on local thx2 with 20.04 distro - Lijian
            • Will try Mellanox card to see if same issue happens - Juraj
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned for shipment
    • VPN access request to FD.io Arm servers
      • Lijian has got VPN access now
      • Juraj signed Jieqiang's key
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
    • VPP IPv6 Benchmarking and Profiling
      • VPP CLI 'ip route add ipv6_addr/mask' outputs wrong IPv6 routes with mask 123-128
      • Internal patch to resolve this issue under review
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
        • For 64B cacheline size native build on Arm, may need to change code.
        • Current VPP does not support 64B cacheline size compilation for Arm images.
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare simd improvement, cache to look up
      • 4x loop unrolling decreases performance
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
          • Calico use cases exploration on VPP
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach
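
The vfio investigation above hinges on the value of /sys/module/vfio/parameters/enable_unsafe_noiommu_mode (N gave the 'Device or resource busy' error, Y gave Rx timeouts). A minimal sketch of how that setting could be read when reproducing the issue is shown here; the helper name and the overridable path argument are illustrative assumptions, not part of VPP or CSIT.

```python
from pathlib import Path

# Kernel module parameter named in the minutes; boolean params read back as "Y" or "N".
NOIOMMU_PARAM = Path("/sys/module/vfio/parameters/enable_unsafe_noiommu_mode")


def noiommu_mode_enabled(param_path=NOIOMMU_PARAM):
    """Return True for Y, False for N, or None if the parameter cannot be read.

    Hypothetical helper: param_path is overridable only so the function can be
    exercised without a loaded vfio module.
    """
    try:
        value = Path(param_path).read_text().strip()
    except OSError:
        # vfio module not loaded, or no permission to read the parameter.
        return None
    return value in ("Y", "y", "1")


if __name__ == "__main__":
    state = noiommu_mode_enabled()
    print("enable_unsafe_noiommu_mode:", {True: "Y", False: "N", None: "unavailable"}[state])
```

Recording this value alongside each failing run would make it easier to tell which of the two failure modes a given log belongs to.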

07/27/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05, which makes the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Results in the same failure as before; only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Random issue, happens more frequently on Arm
              • Not seen recently, either in CI or in manual runs.
        • scapy unexpected timeout issue: packet drop or slow issue?
            • vfio-pci driver may be the root cause - bind/unbind
      • Connection issue between Jenkins and the build executor in FD.io lab
    • Shipment of new advanced server to the FD.io lab
      • One link between TG and DUT, multiple links between DUTs for testing LACP.
      • Two advanced servers are planned to ship
    • VPN access request to FD.io Arm servers
      • Lijian has got VPN access now
      • Juraj signed Jieqiang's key
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
          • Rebased SVE patch per Nitin's request, waiting for Nitin's feedback on running these patches
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform - Confluence page ready
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
    • VPP mbuf-fast-free tx offload
      • Vector path shows performance improvement, still need to investigate scalar path
      • Performance improvement for IPv4 routing test cases using vector path
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
        • Damjan has merged 2 patches, waiting for the last patch, for generic 128B cacheline size.
        • For 64B cacheline size native build on Arm, may need to change code.
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare simd improvement, cache to look up
      • 4x loop unrolling decreases performance
    • VPP memif - Tianyu
      • CNF PoC proposal preparation
          • Add support for VPP aarch64 docker image build
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch - merged
    • VPP Perfmon plugin enablement on Arm - Zach

07/20/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated.
            • This is fixed in DPDK 21.05, which makes the iavf PMD the default.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Results in the same failure as before; only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Random issue, happens more frequently on Arm
            • vfio-pci driver may be the root cause - bind/unbind
      • Connection issue between Jenkins and the build executor in FD.io lab
    • Shipment of new advanced server to the FD.io lab
      • Two advanced servers are planned to ship
    • VPN access request to FD.io Arm servers
      • Lijian has got VPN access now
      • Juraj signed Jieqiang's key
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform
          • Run unit tests from DPDK and VPP bihash on FPGA
          • Try Lijian's SVE patch to see any cycle count improvement
    • VPP mbuf-fast-free tx offload
      • Performance improvement for IPv4 routing test cases using vector path
    • VPP Prefetch
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
    • VPP IPsec on Arm - Govind
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
      • SPD prototype change on ipsec_output/encryption node - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach

07/13/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Expected to be merged soon
          • Flow cache is slower with 1 and 10 SPD entries, still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
            • Hugepage size, NUMA node, core isolation etc. may need to be checked.
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms. - 4-core container case failed on x86 and Arm - Fixed and passing.
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
        • 21.06 vs 21.01: performance drop seen in https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
          • May need to check VM and IPsec cases
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a reproduction script has been provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. After switching to the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05, which makes the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Results in the same failure as before; only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Random issue, happens more frequently on Arm
            • vfio-pci driver may be the root cause - bind/unbind
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
    • VPN access request to FD.io Arm servers
      • Will remind Machiek to sign Lijian's GPG public key.
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)


07/06/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Expected to be merged soon
          • Flow cache is slower with 1 and 10 SPD entries, still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
            • Hugepage size, NUMA node, core isolation etc. may need to be checked.
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms. - 4-core container case failed on x86 and Arm - Fixed and passing.
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
        • 21.06 vs 21.01: performance drop seen in https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
          • May need to check VM and IPsec cases
    • VPP Path
      • Voting and working fine.
      • CentOS-8 jobs have been removed.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a reproduction script has been provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. After switching to the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05, which makes the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Results in the same failure as before; only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Random issue, happens more frequently on Arm
            • vfio-pci driver may be the root cause - bind/unbind
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
    • VPN access request to FD.io Arm servers
      • Will remind Machiek to sign Lijian's GPG public key.
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
        • Repeat the same test on Ampere server - PMU cache-miss less for write always
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare simd improvement, cache to look up
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
          • Made some NEON changes, seeing some microbenchmark improvement
      • investigate CSIT case
        • No classify test case in CSIT. - Jieqiang
        • Maybe there is a CSIT case named iacldstbase
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
          • Add support for VPP aarch64 docker image build
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach
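
The SPD flow cache idea noted above (a hash-indexed cache in front of the linear SPD search, with tests for hash collisions and stale entry overwrites) can be illustrated with a small model. This is only a sketch in Python, not the code in https://gerrit.fd.io/r/c/vpp/+/31694; the class names, the direct-mapped bucket layout, and the epoch-based invalidation are assumptions made for illustration.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical 5-tuple flow key: (src ip, dst ip, protocol, src port, dst port).
FlowKey = Tuple[str, str, int, int, int]


class Policy:
    """Illustrative SPD policy: a name plus a match predicate."""

    def __init__(self, name: str, match: Callable[[FlowKey], bool]):
        self.name = name
        self.match = match


def linear_spd_lookup(policies: List[Policy], key: FlowKey) -> Optional[Policy]:
    """Slow path: first matching policy wins, cost grows with the SPD size."""
    for policy in policies:
        if policy.match(key):
            return policy
    return None


class FlowCache:
    """Direct-mapped cache: one entry per bucket, a colliding flow overwrites it."""

    def __init__(self, policies: List[Policy], n_buckets: int = 1024):
        self.policies = policies
        self.n_buckets = n_buckets
        self.buckets = [None] * n_buckets  # each entry: (key, policy, epoch)
        self.epoch = 0                     # bumped whenever the SPD changes
        self.hits = 0
        self.misses = 0

    def policy_changed(self) -> None:
        """Mark every cached entry stale without walking the table."""
        self.epoch += 1

    def lookup(self, key: FlowKey) -> Optional[Policy]:
        idx = hash(key) % self.n_buckets
        entry = self.buckets[idx]
        # A hit needs the same key (not just the same bucket) and a current epoch.
        if entry is not None and entry[0] == key and entry[2] == self.epoch:
            self.hits += 1
            return entry[1]
        # Miss, collision, or stale entry: fall back and overwrite the bucket.
        self.misses += 1
        policy = linear_spd_lookup(self.policies, key)
        self.buckets[idx] = (key, policy, self.epoch)
        return policy
```

In this model every miss still pays for the full linear search plus the cache update, which is one possible reason a cache might not help when the SPD holds only a handful of entries; whether that explains the 1 and 10 entry results above is not established here.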

06/29/2021

  • Attendees
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries - still under review
            • Expected to be merged soon
          • Flow cache is slower with 1 and 10 SPD entries, still investigating. Manual local tests and CSIT give different results for 1-10 SPD policies.
            • Hugepage size, NUMA node, core isolation etc. may need to be checked.
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms. - 4-core container case failed on x86 and Arm - Fixed and passing.
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
        • 21.06 vs 21.01: performance drop seen in https://docs.fd.io/csit/master/report/_static/vpp/performance-changes-3n-tsh-1t1c-pdr.txt
          • May need to check VM and IPsec cases
    • VPP Path
      • Voting and working fine.
      • Community plans to drop the support for CentOS-8.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a reproduction script has been provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. Need to find a proper solution in DPDK.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. After switching to the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05, which makes the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Results in the same failure as before; only happens on the AArch64 platform
            • /sys/module/vfio/parameters/enable_unsafe_noiommu_mode affects the behavior in some way
            • Debugging
            • vfio-pci driver may be the root cause - bind/unbind
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
    • VPN access request to FD.io Arm servers
      • mbuf-fast-free dpdk enablement with VPP (DEV_TX_OFFLOAD_MBUF_FAST_FREE)
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
        • Repeat the same test on Ampere server - PMU cache-miss less for write always
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • 3 patches: prefetch, key-value compare simd improvement, cache to look up
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
          • Made some NEON changes, seeing some microbenchmark improvement
      • investigate CSIT case
        • No classify test case in CSIT. - Jieqiang
        • Maybe there is a CSIT case named iacldstbase
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
          • Add support for VPP aarch64 docker image build
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Upstreamed https://gerrit.fd.io/r/c/vpp/+/32903
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach

06/22/2021

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Test cases with 1, 10, 100, 1000 SPD entries
            • Expected to be merged soon
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04, everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms. - 4-core container case failed on x86 and Arm
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • Community plans to drop the support for CentOS-8.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; a script to reproduce was provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
            • Results in the same failure as before; only happens on the AArch64 platform
            • vfio-pci driver may be the root cause
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
    • VPN access request to FD.io Arm servers
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
        • Repeat the same test on Ampere server
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
          • Made some NEON changes; seeing some microbenchmark improvement
      • Investigate CSIT case
        • No classify test case in CSIT. - Jieqiang
        • There may be a CSIT case named iacldstbase
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
          • Add support for VPP aarch64 docker image build
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Internal review for IPsec input node flow cache implementation - Zach & Govind
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach
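
The IPsec items above describe putting a hash-indexed flow cache in front of the linear SPD search, with tests for hash collisions and stale entry overwrites. A minimal sketch of that idea is shown below. It is an illustrative Python model only, not the code in the Gerrit change: the names `Policy`, `linear_spd_lookup` and `FlowCache`, the 5-tuple flow layout, and the direct-mapped table are all assumptions made for this example.

```python
from typing import NamedTuple


class Policy(NamedTuple):
    """Hypothetical SPD rule: match a source-address range, yield an action."""
    src_lo: int
    src_hi: int
    action: str


def linear_spd_lookup(policies, flow):
    """Slow path: walk the rules in order; the first matching rule wins."""
    src = flow[0]  # assumed layout: (src, dst, proto, sport, dport)
    for policy in policies:
        if policy.src_lo <= src <= policy.src_hi:
            return policy
    return None


class FlowCache:
    """Direct-mapped cache indexed by a hash of the flow tuple (assumed design)."""

    def __init__(self, policies, size=8):
        self.policies = policies
        self.size = size
        self.slots = [None] * size  # each slot holds (flow, policy) or None
        self.hits = 0
        self.misses = 0

    def lookup(self, flow):
        index = hash(flow) % self.size
        entry = self.slots[index]
        # Compare the full key so a colliding flow never returns a wrong policy.
        if entry is not None and entry[0] == flow:
            self.hits += 1
            return entry[1]
        # Miss or collision: fall back to the linear search and overwrite
        # whatever entry currently occupies this slot.
        self.misses += 1
        policy = linear_spd_lookup(self.policies, flow)
        if policy is not None:
            self.slots[index] = (flow, policy)
        return policy
```

With `size=1` every flow maps to the same slot, which makes the collision and overwrite path easy to exercise. A real implementation would also need to invalidate cached entries when SPD rules change, which this sketch leaves out.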

06/15/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
          • VPP exits with the IPsec startup config, try startup config from Zach's email - Juraj
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Fixed
        • Release testing ongoing
        • IPsec SPD input/output case ongoing
        • Juraj may share the steps for how CSIT handles new configuration changes
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms. - The 4-core container case fails on x86 and Arm
        • Steps to enable test case in CSIT https://gerrit.fd.io/r/c/csit/+/31863
    • VPP Path
      • Voting and working fine.
      • Community plans to drop the support for CentOS-8.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; a script to reproduce was provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results - not reproduced so far.
          • New issue: a different error related to moving a VF from host to container (not involving DPDK/VPP) - just started investigating
            • /usr/bin/vpp[3789]: pci: 0000:91:02.0: open_vfio_iommu_group: open '/dev/vfio/141': Device or resource busy
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms(Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly. - DaveW
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
  • VPP
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • Juan encountered and is fixing some issues running SVE in a QEMU VM
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always on N1SDP - Jieqiang
        • Repeat the same test on Ampere server
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
          • Made some NEON changes; seeing some microbenchmark improvement
      • Investigate CSIT case
        • No classify test case in CSIT. - Jieqiang - there may be a CSIT case named iacldstbase
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach - waiting for maintainer review
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Waiting for review comments on outbound side before upstream to VPP
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach
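
The prefetch item above (refactoring prefetches according to the CPU's actual cache line size) can be illustrated with a small calculation. When a byte range is prefetched at a fixed stride that is smaller than the real cache line, several prefetches land on the same line; deriving the stride from the actual line size avoids that. The helper below is a hypothetical Python sketch, not VPP code, and the 64-byte and 128-byte line sizes in the usage note are illustrative values only.

```python
def prefetch_addresses(start, length, cache_line_bytes):
    """Return the cache-line-aligned addresses covering [start, start + length).

    Each returned address stands for one prefetch that would need to be issued.
    """
    if cache_line_bytes <= 0 or cache_line_bytes & (cache_line_bytes - 1):
        raise ValueError("cache line size must be a positive power of two")
    if length <= 0:
        return []
    mask = ~(cache_line_bytes - 1)
    first = start & mask                  # line containing the first byte
    last = (start + length - 1) & mask    # line containing the last byte
    return list(range(first, last + 1, cache_line_bytes))
```

For a 128-byte object at address 0, `prefetch_addresses(0, 128, 64)` yields two lines while `prefetch_addresses(0, 128, 128)` yields one, so code that assumes 64-byte lines would issue a redundant prefetch on a machine with 128-byte lines.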


06/08/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Add new IPSec NULL encryption & decryption test cases - Juraj
          • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
          • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
          • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
          • VPP exits with the IPsec startup config, try startup config from Zach's email - Juraj
          • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Work in progress.
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Some container test cases failed on all platforms.
    • VPP Path
      • Voting and working fine.
      • Community plans to drop the support for CentOS-8.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; a script to reproduce was provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
          • Use iavf PMD instead of i40evf on all VPP branches, waiting for the test results.
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms(Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
    • Shipment of new advanced server to the FD.io lab
      • New servers are in shortage.
  • VPP
    • VPP default compiler on Arm platform
      • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
        • Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
          • No obvious performance improvement, keep the original default compiler
    • VPP SVE implementation - Lijian
      • Vector length specific patch is ready
      • SVE patch ready and upstreamed, under review - Lijian
        • SVE patch sent to Nitin, Nitin will review the patch when back to work.
        • SVE validation on FPGA platform
    • VPP Prefetch
      • Benchmark VPP using prefetch read always vs prefetch write always - Jieqiang
      • Refactor prefetch implementation in VPP per CPU's actual cache line size - Tianyu & Jieqiang
    • VPP Classifier - Lijian
      • Investigating VPP classify function, use case, benchmarking - Lijian
        • Start with simple use case
        • VPP Classify basic inbound L3 src ip / prot case
        • Benchmark VPP classifier on Arm/X86 platform
      • investigate CSIT case
        • No classify test case in CSIT. - Jieqiang
    • VPP memif - Tianyu
      • Investigating VPP memif - Tianyu
        • Benchmarking DPDK memif vs VPP memif
          • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
          • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
          • Patches have been upstreamed and waiting for review
          • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
      • Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
    • VPP IPsec on Arm - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input/output nodes - Govind & Zach
        • VPP uses linear search on SPD lookups
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Waiting for review comments on outbound side before upstream to VPP
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • VPP Perfmon plugin enablement on Arm - Zach
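
The compiler item above gives a priority order (gcc-10 > gcc-9.2.0 > clang-10). One way to express "use the highest-priority compiler that is actually installed" is sketched below. This is a hypothetical Python helper, not part of the VPP build system; the binary names in the usage note, in particular `gcc-9` standing in for gcc-9.2.0, are assumptions.

```python
import shutil


def pick_compiler(candidates, which=shutil.which):
    """Return (name, path) for the first candidate found on PATH, else None.

    `candidates` is ordered from highest to lowest priority. `which` is
    injectable so the selection logic can be exercised without touching PATH.
    """
    for name in candidates:
        path = which(name)
        if path:
            return name, path
    return None
```

A call such as `pick_compiler(["gcc-10", "gcc-9", "clang-10"])` would follow the order from the minutes; the result depends on what is installed on the build host.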


06/01/2021

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Zachary Leaf
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests and use the DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
              • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
              • IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
            • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
            • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - Work in progress.
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate the cabling issue on the Taishan performance test-bed - resolved.
          • Some container cases seem to fail on all platforms.
    • VPP Path
      • Voting and working fine.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; a script to reproduce was provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • DPDK uses the i40evf PMD, which is old and scheduled to be deprecated. With the iavf PMD, the issue is not seen. This is fixed in DPDK 21.05 by making the iavf PMD the default. This fix will be ported to all the older VPP LTS release branches. Currently, the fix is planned only for the VPP 20.09 release.
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms(Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
      • Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
    • Vector length specific patch is ready
    • Investigating VPP classify function, use case, benchmarking - Lijian
      • Start with simple use case
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • SVE remaining work - variable type convention - needs some workaround for the 256-bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • SVE patch sent to Nitin, Nitin will review the patch when back to work.
      • Review memif patch
        • Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
      • VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case - No classify test case in CSIT. - Jieqiang
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
    • Work on IPsec input/output nodes - VPP uses linear search on SPD lookups - Govind & Zach
      • SPD prototype change on ipsec_output/encryption node, introducing flow cache with hash, has performance improvements, discussing with community - Govind
        • https://gerrit.fd.io/r/c/vpp/+/31694
        • IPSec unit tests done - 'make test' cases implemented & included in patch - Zach
          • Testing of flow cache functionality, including hash collisions and stale entry overwrites
      • IPSec input node/decryption flow cache implemented in a separate patch - Zach
        • Waiting for review comments on outbound side before upstream to VPP
        • Discovered issue with SPD policy counter/statistics on input side, to be fixed in additional standalone patch
    • Perfmon plugin enablement on Arm - Zach

05/25/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Zachary Leaf
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests and use the DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
              • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
              • IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
            • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
            • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308 - patch fully tested, waiting for review - Juraj
            • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready - will look into it
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate the cabling issue on the Taishan performance test-bed - resolved.
          • Some container cases seem to fail on all platforms.
    • VPP Path
      • Voting and working fine.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; a script to reproduce was provided; Intel is busy with other work; debugging in progress.
            • Intel folks debugged and tried updating firmware/drivers. The VF driver was updated (old driver: i40evf, new driver: iavf). After modifying the DPDK code to use iavf, the issue is fixed. A proper solution in DPDK still needs to be found.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • Tried the new version of DPDK, but it did not help
            • Contact Intel devs for possible advice
            • Workaround may have too much impact on all test cases
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms(Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
      • Plan to benchmark gcc-10 vs clang-12 on arm L2/L3, 1/10k - Jieqiang
    • Vector length specific patch is ready
    • Investigating VPP classify function, use case, benchmarking - Lijian
      • Start with simple use case
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • SVE remaining work - variable type convention - needs some workaround for the 256-bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • SVE patch sent to Nitin, Nitin will review the patch when back to work.
      • Review memif patch
        • Memif C11 atomics has been updated by maintainer, not using atomic_relaxed - Tianyu
      • VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case - No classify test case in CSIT. - Jieqiang
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes - running CSIT perftest
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • IPsec input node optimization work in progress - Zach & Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses linear search on SPD lookup.
        • SPD prototype change introducing a flow cache with hash shows performance improvement; discussing with community.
          • https://gerrit.fd.io/r/c/vpp/+/31694
          • IPSec unit test - make test new cases implementation
          • Make test cases for IPSec policy mode - Done, included in Govind's patch, waiting for maintainer review - Zach
            • Add/Remove/Add+Remove+Readd/Hash collisions/Multiple interfaces & rules
          • Review the patch and grasp the basics about IPSec - Lijian
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache; discussed with Neal; maintainer agrees with the change
    • perfmon CMN-600 investigating - Zach
      • VPP perfmon CMN-600 patch abandoned: it works at system level, not VPP node level, and Linux perf can give the same result - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec flow cache outbound done, working on inbound side in separate patch - Zach
      • IPSec decryption / input node - Zach

05/18/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Zachary Leaf
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifier tests and use the DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
              • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
              • IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
            • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
            • New IPsec test cases - https://gerrit.fd.io/r/c/csit/+/32308
            • Enable flow cache option in startup.conf for VPP CSIT IPsec test cases when the patch is ready
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
      • Voting and working fine.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue can be reproduced on Arm servers with the NIC running the latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where firmware can be updated easily; a script to reproduce was provided; Intel is busy with other work; debugging in progress.
          • Tried to reproduce with another set of firmware etc., but the issue still exists
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • Try the new version of DPDK but it does not help
            • Contact Intel devs for the possible advice
            • Workaround may impact too much to all test cases
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms (Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
        • Stage 2 of the lab move has started; only part of the servers were moved so that the CI service stays up.
        • Lab move is done, some issues with taishan testbed
        • Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
      • Plan to benchmark gcc-10 vs clang-12
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • SVE Remaining works - variable type convention - need some workaround for 256bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • SVE patch sent to Nitin, Nitin will review the patch when back to work.
      • Review memif patch
        • Functional bug related to C11 atomics has been resolved by VPP maintainer.
      • VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case. - Jieqiang
      • Make test cases for IPSec policy mode - Zach
        • Add/Remove/Add+Remove+Readd/Hash collisions/Multiple interfaces & rules - Add more test cases
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • IPsec input node optimization work in progress - Zach & Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
    • perfmon CMN-600 investigating - Zach
      • VPP perfmon CMN-600 patch abandoned: it is system level, not VPP node level, and Linux perf can give the same result - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec flow cache outbound done, working on inbound side in separate patch - Zach
      • IPSec decryption / input node - Zach

05/11/2021

  • Attendees
    • Lijian Zhang
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Zachary Leaf
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
              • New IPSec SPD test cases will not have NULL encrypt/decrypt config.
              • IPSec SPD test cases will be ready next week, how to make SPD policy change - IP address range changes - Juraj
            • CSIT Maintainers approved to add new IPSec Policy mode test cases with multiple SPD rules.
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
      • Voting and working fine.
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
          • Moved the NIC from the Arm server to an x86 server, where the firmware can be updated easily; a script to reproduce the issue has been provided; Intel is busy with other work; debugging in progress.
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • Try the new version of DPDK but it does not help
            • Contact Intel devs for the possible advice
            • Workaround may impact too much to all test cases
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms (Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
        • Stage 2 of the lab move has started; only part of the servers were moved so that the CI service stays up.
        • Arm servers documented in https://gerrit.fd.io/r/c/csit/+/30662
        • Almost all servers have been moved except the performance testbed, which will be moved this week; everything is smooth so far.
        • Ubuntu 18.04 -> 20.04
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • SVE Remaining works - variable type convention - need some workaround for 256bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • SVE patch sent to Nitin, Nitin will review the patch when back to work.
      • Review memif patch
      • VPP Classify basic inbound L3 src ip / prot case, investigate CSIT case.
      • Make test cases for IPSec policy mode - Jieqiang
    • Review memif test cases/memif cases
    • Finished coding of SVE string library, bihash key compare functions
    • SVE unit testing based on test_vec, fix test_vec issues
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • IPsec input node optimization work in progress - Zach & Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
    • Using startup parameter to enable the IPsec flow cache feature
    • Adding Python test case to test IPSec node behavior - Jieqiang
    • perfmon CMN-600 investigating - Zach
      • VPP perfmon CMN-600 patch abandoned: it is system level, not VPP node level, and Linux perf can give the same result - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec flow cache outbound done, working on inbound side in separate patch - Zach
      • IPSec decryption / input node - Zach
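
The SPD lookup items above (a linear search over the policy database, with a hash-based flow cache proposed in front of it) can be illustrated with a short sketch. This is not the VPP implementation: the names `Policy`, `linear_lookup` and `FlowCache`, the 5-tuple key, and the policy fields are all assumptions made for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Dict, List, Optional, Tuple

# Hypothetical 5-tuple key: (src, dst, protocol, src port, dst port).
FlowKey = Tuple[str, str, int, int, int]


@dataclass(frozen=True)
class Policy:
    """Illustrative SPD rule; the fields are assumptions, not VPP's layout."""
    src_net: str
    dst_net: str
    protocol: int
    action: str  # e.g. "protect", "bypass", "discard"

    def matches(self, key: FlowKey) -> bool:
        src, dst, proto, _sport, _dport = key
        return (proto == self.protocol
                and ip_address(src) in ip_network(self.src_net)
                and ip_address(dst) in ip_network(self.dst_net))


def linear_lookup(spd: List[Policy], key: FlowKey) -> Optional[Policy]:
    """Walk the rules in priority order; cost grows with the rule count."""
    for policy in spd:
        if policy.matches(key):
            return policy
    return None


class FlowCache:
    """Hash table in front of the linear search, keyed by the exact 5-tuple."""

    def __init__(self, spd: List[Policy]) -> None:
        self.spd = spd
        self.cache: Dict[FlowKey, Optional[Policy]] = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, key: FlowKey) -> Optional[Policy]:
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        # Cache miss: fall back to the linear search and remember the result.
        self.misses += 1
        policy = linear_lookup(self.spd, key)
        self.cache[key] = policy
        return policy

    def invalidate(self) -> None:
        """Call whenever the SPD changes, otherwise stale results are served."""
        self.cache.clear()
```

In this sketch only the first packet of a flow pays for the linear walk; later packets with the same 5-tuple are answered from the dictionary. The cost is that any SPD change has to invalidate the cache.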

04/27/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
          • https://doc.dpdk.org/guides/nics/i40e.html
          • Internal ticket has been raised
            • Try the new version of DPDK but it does not help
            • Contact Intel devs for the possible advice
            • Workaround may impact too much to all test cases
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms (Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • SVE Remaining works - variable type convention - need some workaround for 256bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
    • SVE patch ready and upstreamed, under review - Lijian
      • Make test cases for IPSec policy mode - Jieqiang
    • Review memif test cases/memif cases
    • Finished coding of SVE string library, bihash key compare functions
    • SVE unit testing based on test_vec, fix test_vec issues
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Performance improvement using loop unrolling for memif nodes
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • IPsec input node optimization work in progress - Zach & Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
    • Using startup parameter to enable the IPsec flow cache feature
    • Adding Python test case to test IPSec node behavior - Jieqiang
    • perfmon CMN-600 investigating - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec decryption / input node - Zach

04/13/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • Some of the IPSec test cases (policy tests) have been added to daily testing.
            • Enabled the policy tests in mrr-daily testing and it's now running on both 2n-tx2 and 3n-tsh (and also available for per-patch on-demand testing) - Juraj
            • Add new IPSec NULL encryption & decryption test cases - Juraj
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Some issues occurred during the upgrade.
          • Patch to resolve the building error of DPDK on 3n-tsh testbed.
          • Root cause is the change of build system of DPDK on 3n-tsh related to SOC id detection.
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
    • VPP Device
      • https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-device-master-ubuntu2004-aarch64-1n-tx2/
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
        • Will try to reproduce the issue with x86 servers.
        • This issue is common to all platforms (Arm & Intel)
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Jieqiang helped to verify most fixed size vector wrappers - unit test code
    • SVE Remaining works - variable type convention - need some workaround for 256bit convention
    • VLA patch, coding and verification done. string memcpy/memset, bihash key compare functions, rdma/bonding node
      • Make test cases for IPSec policy mode - Jieqiang
    • Review memif test cases/memif cases
    • Finished coding of SVE string library, bihash key compare functions
    • SVE unit testing based on test_vec, fix test_vec issues
    • Test template update - Jieqiang
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Prepare the memif readout - Tianyu
      • Try to apply C11 weak memory model on VPP memif - Tianyu
        • Use 'show runtime'/perfmon to see cycle improvement
        • Run memif unit test
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
    • Using startup parameter to enable the IPsec flow cache feature
    • Adding Python test case to test IPSec node behavior - Jieqiang
    • perfmon CMN-600 investigating - Zach
      • Plan to upstream perfmon plugin - resolving review comments - Zach
      • IPSec decryption - Zach


03/30/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
          • Start with basic L2/L3/ACL/classifiers tests and use DPDK PMD on 2n-thx2 first (daily testing)
          • IPSec policy test cases are not running by default.
            • 2-node IPsec SPD policy test case patch is ready, starting with 1 and 1k tunnels. (40 and 400 tunnels in a separate patch)
            • https://gerrit.fd.io/r/c/csit/+/31605
            • Fixed the wrong CLI commands, but the configuration still has problems.
            • Send the correct robot framework tags for IPSec policy test cases to Govind - Juraj
        • 2n-tx2 & 3n-tsh have been upgraded to Ubuntu 20.04; everything is working fine now.
          • Some issues occurred during the upgrade.
          • Patch to resolve the DPDK build error on the Arm testbed. (Taishan DPDK cases still have issues, investigating)
          • Juraj is investigating running those test cases with 2N-TX2 topology.
          • Juraj will investigate adding IPSec test cases on Taishan performance test-bed.
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • The issue could be reproduced on Arm servers with the NIC with latest firmware version. (still reproduced, no update yet)
        • Will try to reproduce the issue with x86 servers.
      • "Make Test Testcase Error or Failure" --> There was an intermittent VPP anomaly introduced last week with the change from shmem to socket transport for PAPI which causes the MAP unittests to fail [0]. The root cause is being addressed and should be fixed shortly.
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
    • Review memif test cases/memif cases
    • Finished coding of SVE string library, bihash key compare functions
    • SVE unit testing based on test_vec, fix test_vec issues
    • Test template update
    • SVE unit test in qemu-vm, met compiling issue, investigating
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
        • Review the confluence page and prepare the memif readout - Lijian & Tianyu
        • Race condition occurred when connecting DPDK memif PMD interface (slave) with VPP memif interface (master)
        • Prepare the memif readout - Tianyu
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Working on IPsec flow cache, discussed with Neal, maintainer agrees with the change
    • Using startup parameter to enable the IPsec flow cache feature
    • Discuss with Jieqiang adding a Python test case to test IPsec node behavior
    • perfmon CMN-600 investigating - Zach

03/16/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Vector length specific patch is ready
      • 128 and 256 fixed size vector wrappers are ready, needs verification
      • Verify SVE vector length specific wrappers - Jieqiang
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
        • Extend vector length agnostic opportunities
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
          • Will do readout presentation with extended people - Tianyu
    • Investigating VPP memif - Tianyu
      • Benchmarking DPDK memif vs VPP memif
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
        • Perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
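
The compiler preference noted above (gcc-10 > gcc-9.2.0 > clang-10) amounts to picking the first available entry from an ordered list. A minimal sketch follows, assuming the executable names `gcc-10`, `gcc-9` and `clang-10` (the minutes give versions, not binary names) and a hypothetical helper `pick_compiler`; it does not show how the VPP build system actually makes the choice.

```python
import shutil
from typing import Callable, Optional, Sequence

# Assumed executable names for the preference order given in the minutes.
PREFERRED_COMPILERS = ("gcc-10", "gcc-9", "clang-10")


def pick_compiler(
    candidates: Sequence[str] = PREFERRED_COMPILERS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None
```

The `which` parameter defaults to `shutil.which` and exists only so the lookup can be replaced in tests.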

03/09/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • CentOS-7 will be enabled with master branch to support the LTS release
        • CentOS-7 Jenkins on Arm will not be supported.
        • CentOS-8 will be supported by the end of this year by Red Hat.
      • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is responding to this issue actively. - Juraj
        • Working on a workaround to make sure not starting multiple VPP instances at the same time - Juraj
        • Dependency on maintainers to fix this issue
        • Intel will ship a new NIC with latest firmware
          • Shipment takes a long time empirically
            • NIC has been shipped to vexxhost, wait for NIC arrival.
          • Try to reproduce the issue on this NIC on Arm platform
          • Updating firmware on the current NIC is risky
        • Voting rights will be enabled once this issue is fixed
            • Maintainer raised the ticket to get intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable to CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
        • Will show Arm roadmap in the next TSC meeting
        • Proposals have been sent to VPP maintainer on verifying SVE/SVE2
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to split into two parts required by VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
        • 128 and 256 fixed size vector wrappers are ready, needs verification
        • Verify SVE vector length specific wrappers - Jieqiang
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
        • Extend vector length agnostic opportunities
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
          • Remove interrupts on altra but no performance improvement seen
          • instruction cache misses are higher on altra than N1
      • IO testing is doable with specific PCIe slots, with which PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Focus more on data-plane performance benchmarking and optimization - Tianyu
        • Record the benchmarking results of VPP CNF 3 test cases in excel template
    • VPP compiling error on CentOS 7 - Jieqiang
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • SPD prototype change, introducing flow cache with hash, have performance improvement, discussing with community.
        • perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

02/23/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
      • CentOS-7 will be enabled with master branch to support the LTS release
        • CentOS-7 Jenkins on Arm will be supported.
      • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
        • Dependency on maintainers to fix this issue
        • Intel will ship a new NIC with latest firmware
          • Shipment takes a long time empirically
          • Try to reproduce the issue on this NIC on Arm platform
          • Updating firmware on the current NIC is risky
        • Voting rights will be enabled once this issue is fixed
            • Maintainer raised the ticket to get intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker - Lijian
        • Latest VPP binary crashes in the QEMU docker
          • System call fails inside QEMU docker when running VPP
        • Verify SVE/SVE2 features inside ARM QEMU VM
        • VPP maintainers want real hardware to verify SVE code
          • This solution will be abandoned.
        • 'make test' execution is slow
        • Sync with DPDK team/VPP community to decide the solution
        • Proposals have been sent to VPP maintainer on verifying SVE/SVE2
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
        • 128 and 256 fixed size vector wrappers are ready, needs verification
        • Verify SVE vector length specific wrappers - Jieqiang
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
        • Extend vector length agnostic opportunities
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
          • Remove interrupts on altra but no performance improvement seen
          • instruction cache misses are higher on altra than N1
      • IO testing is doable with specific PCIe slots, where PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP memif test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
        • 3 use cases are investigated.
        • Will explore the memif logic and share the progress.
        • Will share the link on details about how to run VPP in container.
      • Investigate VPP agent usage - Tianyu
        • Focus more on data-plane performance benchmarking and optimization - Tianyu
    • VPP compiling error on CentOS 7 - Jieqiang
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses linear search for SPD lookup.
        • perfmon plugin enablement on Arm - Zach
          • patch upstream has dependency on kernel patch
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

02/09/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
        • Dependency on maintainers to fix this issue
        • Intel will ship a new NIC with latest firmware
          • Shipment takes a long time empirically
          • Try to reproduce the issue on this NIC on Arm platform
          • Updating firmware on the current NIC is risky
        • Voting rights will be enabled once this issue is fixed
            • Maintainer raised the ticket to get intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker - Lijian
        • Latest VPP binary crashes in the QEMU docker
          • System call fails inside QEMU docker when running VPP
        • Verify SVE/SVE2 features inside ARM QEMU VM
        • 'make test' execution is slow
        • Sync with DPDK team/VPP community to decide the solution
        • Proposals have been sent to VPP maintainer on verifying SVE/SVE2
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
          • Remove interrupts on altra but no performance improvement seen
          • instruction cache misses are higher on altra than N1
      • IO testing is doable with specific PCIe slots, where PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP memif test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
        • 3 use cases are investigated.
        • Will explore the memif logic and share the progress.
        • Will share the link on details about how to run VPP in container.
      • Investigate VPP agent usage - Tianyu
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

02/02/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
        • Dependency on maintainers to fix this issue
        • Voting rights will be enabled once this issue is fixed
          • Implementation is ready and will be tested with actual patches.
          • Apply a file locking mechanism so that only one VPP instance runs at a time.
            • https://gerrit.fd.io/r/c/csit/+/30425
            • Patches are under review
            • Maintainer raised the ticket to get intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker - Lijian
        • Latest VPP binary crashes in the QEMU docker
          • System call fails inside QEMU docker when running VPP
        • Verify SVE/SVE2 features inside ARM QEMU VM
        • 'make test' execution is slow
        • Sync with DPDK team/VPP community to decide the solution
        • Proposals have been sent to VPP maintainer on verifying SVE/SVE2
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
          • Remove interrupts on altra but no performance improvement seen
          • instruction cache misses are higher on altra than N1
      • IO testing is doable with specific PCIe slots, where PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP memif test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
        • 3 use cases are investigated.
        • Will explore the memif logic and share the progress.
        • Will share the link on details about how to run VPP in container.
      • Investigate VPP agent usage - Tianyu
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
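
The file locking workaround mentioned under VPP Device in these minutes (serializing startup so that multiple VPP instances do not start at the same time) could look roughly like the sketch below. It is a minimal illustration using advisory `flock` locks, not the content of the CSIT patch under review; the helper names and the lock file path are assumptions.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is free
        yield fd
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def try_lock(lock_path):
    """Non-blocking variant: return an open fd holding the lock, or None if busy."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd  # caller releases the lock by closing the fd
```

A test harness would wrap the step that launches VPP in `with exclusive_lock(path):` (the path is hypothetical, for example a file under `/tmp`), so a second job waits until the first has finished starting its instance. The lock is advisory, so it only helps if every launcher takes it.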

01/19/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tianyu Li
    • Jieqiang Wang
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2101-3n-tsh/
        • CSIT official release 20.09 is available
          • https://docs.fd.io/csit/rls2009/report/
          • Jieqiang will compare the performance data with release 20.09
            • Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
            • DPDK testpmd running inside VM, l2 cross connect running inside VPP.
            • Check the number for CSIT 2101 release
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Hardware configurations/wiring are done; Physical connection to the TG is done.
        • almost done, two steps need to be done
          • Start with basic L2/L3/ACL/classifier tests using DPDK PMD on 2n-thx2 first (daily testing)
          • Take the execution time into consideration if we want to run release testing on 2n-thx2.
          • One round of testing takes 9 hours to finish.
          • Tests are running fine
            • L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
            • Suitable time to run release testing on 2n-tx2 testbed.
            • Will investigate IPSec test cases on 2n-tx2 - Juraj
            • Add memif test case to 2n-tx2 once the release testing is done.
    • VPP Path
      • CentOS-7 will be enabled on the master branch to support the LTS release
        • CentOS-7 Jenkins on Arm will be supported.
      • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge'
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
          • Implementation is ready and will be tested with actual patches.
          • Apply a file locking mechanism so that only one VPP instance runs at a time.
            • https://gerrit.fd.io/r/c/csit/+/30425
            • Patches are under review
            • Machiek raised the ticket to get intel people involved
            • Will not update the firmware because the release testing is ongoing
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker
        • Latest VPP binary crashes in the QEMU docker - Lijian
      • Lab move for the fd.io lab
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
        • Analysis of benchmarking results for Ampere Altra
          • A lot of context switches occur on Ampere Altra compared to N1SDP
          • perf tools used to capture the perf events
          • Talk with Ampere or N1 team on how to enable CMN-600 counters for ampere altra
      • IO testing is doable with specific PCIe slots, where PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP memif test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
        • 3 use cases are investigated.
        • Will explore the memif logic and share the progress.
        • Will share the link on details about how to run VPP in container.
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind


01/05/2021

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Jieqiang Wang
    • Tianyu Li
    • Tina Tsou
  • CSIT
    • VPP Performance Test
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-mrr-weekly-master-3n-tsh/
      • https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-report-iterative-2009-3n-tsh/
        • CSIT official release 20.09 is available
          • https://docs.fd.io/csit/rls2009/report/
          • Jieqiang will compare the performance data with release 20.09
            • Will investigate test case eth-l2xcbase-eth-2vhostvr1024-1vm - Jieqiang
            • DPDK testpmd running inside VM, l2 cross connect running inside VPP.
      • Leverage current spare TX2 server as 2-node topology performance test-bed.
        • Hardware configurations/wiring are done; Physical connection to the TG is done.
        • almost done, two steps need to be done
          • Start with basic L2/L3/IPSec/ACL/classifier tests using DPDK PMD on 2n-thx2 first (daily testing)
          • Take the execution time into consideration if we want to run release testing on 2n-thx2.
          • Tests are running fine
            • L2/L3 tests are running fine, IPSec tests are not supported on 2-node topo, ACL/Classifiers needs investigation.
            • Suitable time to run release testing on 2n-tx2 testbed.
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
          • Implementation is ready and will be tested with actual patches.
          • Apply a file locking mechanism so that only one VPP instance runs at a time.
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features inside QEMU docker
        • Latest VPP binary crashes in the QEMU docker - Lijian
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
      • Try to capture some software benchmarking results
      • https://gerrit.fd.io/r/c/vpp/+/29942 - first proposal - preferred
      • https://gerrit.fd.io/r/c/vpp/+/30326 - second proposal - not preferred
      • Investigate the scalable SIMD instructions on RISC-V - Lijian
      • Investigate how to run traffic tests for VPP in docker - Lijian
        • Plan to talk with VPP maintainers on this topic
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
      • IO testing is doable with specific PCIe slots, where PCIe bandwidth is not the bottleneck.
    • Investigate VPP test cases in container
      • Investigate VPP test cases in VPP CSIT - Jieqiang
      • Investigate VPP use cases proposals in containers - Tianyu
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

12/22/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
    • Will cancel the meeting on Dec 29th;
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • LF will provide QSFP+ fiber switch for FD.io lab.
        • Basically done. LF just procured the existing fiber switch currently rented by Arm in FD.io lab.
        • Send the progress to relevant people in Arm - Lijian
        • Confirm with Tina to ensure Arm is not charged - Lijian
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
      • Verify SVE/SVE2 features on VPP CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
        • avf-input node with neon optimization is merged.
        • ethernet-input patch needs to be split into two parts, as required by the VPP maintainer
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
      • IO testing is doable with specific PCIe slots, where PCIe bandwidth is not the bottleneck.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
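
The compiler preference order recorded above for Ubuntu-20.04 on Arm (gcc-10 > gcc-9.2.0 > clang-10) amounts to picking the first toolchain that is installed. A minimal sketch of that selection logic follows. It is not how the VPP build system implements it, and the binary names (in particular `gcc-9` standing in for gcc-9.2.0) are assumptions.

```python
import shutil

# Preference order from the minutes; binary names are assumed, not confirmed.
PREFERRED_COMPILERS = ["gcc-10", "gcc-9", "clang-10"]


def pick_compiler(candidates=PREFERRED_COMPILERS, which=shutil.which):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None
```

The `which` parameter exists only so the lookup can be replaced in tests; calling `pick_compiler()` with no arguments searches the real `PATH`.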


12/15/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
    • Will cancel the meeting on Dec 29th;
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • VPP community is actively responding to this issue. - Juraj
        • Working on a workaround to avoid starting multiple VPP instances at the same time - Juraj
          • Implementation is ready and will be tested with actual patches.
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable by CSIT maintainers
      • LF will provide QSFP+ fiber switch for FD.io lab.
        • Basically done. LF just procured the existing fiber switch currently rented by Arm in FD.io lab.
        • Send the progress to relevant people in Arm - Lijian
      • Arm is required to present Arm achievement and plan to TSC.
        • Govind will prepare the slides
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • SOC ID will be available on /proc entry starting from kernel version 5.9 to differentiate vendor CPUs.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • Optimize ethernet-input and avf-input node with NEON intrinsics
        • Benchmarking result shows some improvement from vectorization with ethernet-input and avf-input node
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals are upstreamed, will discuss the proposals with maintainers
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • No gain from 4x loop unrolling on Ampere, so will keep 2x unrolling for Neoverse N1
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Performance on Altra is about 30%-40% lower than 8268.
      • Performance on Altra is slightly better than N1SDP.
      • IO testing is doable with specific PCIe slots, where PCIe bandwidth is not the bottleneck.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • Benchmark Altra vs Cascade 8268
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node; VPP uses linear search for SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
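
For context on the multi-arch item above: on AArch64 Linux, `/proc/cpuinfo` already reports `CPU implementer` and `CPU part` fields per core. The sketch below parses those two fields from cpuinfo text. The function names and the small lookup table are illustrative assumptions, and the table is deliberately incomplete.

```python
# Illustrative (implementer, part) pairs; not an exhaustive or authoritative list.
KNOWN_CORES = {
    (0x41, 0xD0C): "Arm Neoverse N1",
    (0x43, 0x0AF): "Marvell ThunderX2",
}


def parse_cpu_ids(cpuinfo_text):
    """Return the set of (implementer, part) integer pairs found in cpuinfo text."""
    ids = set()
    implementer = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line between processor entries
        key = key.strip()
        value = value.strip()
        if key == "CPU implementer":
            implementer = int(value, 16)
        elif key == "CPU part" and implementer is not None:
            ids.add((implementer, int(value, 16)))
    return ids


def describe(cpuinfo_text):
    """Map each (implementer, part) pair to a name, or mark it as unknown."""
    return sorted(
        KNOWN_CORES.get(pair, "unknown (0x%02x, 0x%03x)" % pair)
        for pair in parse_cpu_ids(cpuinfo_text)
    )
```

Because several vendors can ship the same core design, these two fields alone may not tell vendor CPUs apart, which is the gap the SOC ID item in the minutes is meant to close.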


12/08/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Juraj Linkes
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • VPP device testing issues may be caused by XL710 i40e fw or kernel module.
        • Working with VPP/DPDK/Intel to root cause this issue. - Juraj
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
        • Which is acceptable to CSIT maintainers
      • LF will provide QSFP+ fiber switch for FD.io lab.
        • Vexxhost just has a spare one, and LF will buy it for FD.io lab, which will probably happen this month.
      • N1SDP shipment to FD.io
        • Govind will track the status
      • CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
        • Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Maciek to provide 10G switch.
        • Arm is required to present Arm achievement and plan to TSC.
          • Govind will prepare the slides
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • SOC id will be available on /proc entry starting from kernel version 5.9
        • Will investigate the details - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Benchmarked cross-connect and TX queue is dropping packets
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • To optimize ethernet-input and avf-input node with NEON intrinsics
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 two proposals upstreamed
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch.
      • Have to repeat the testing in the future.
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • IPsec on Arm platform. - Govind
      • Apply prefetches and loop-unrolling with ipsec-out node
        • Didn't observe much performance improvement (2%) so far
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
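
The "prefetches and loop-unrolling" item for the ipsec-out node follows a common pattern in packet-processing loops: handle two packets per iteration while prefetching the next pair. The sketch below is a generic illustration under stated assumptions, not the ipsec-out node itself: `pkt_t`, `process_one` and `process_burst` are hypothetical names, and `__builtin_prefetch` is the GCC/Clang builtin rather than a VPP helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet descriptor; not VPP's vlib_buffer_t. */
typedef struct {
    uint32_t len;
    uint32_t seq;
} pkt_t;

/* Hypothetical per-packet work standing in for the node's real processing. */
static inline uint64_t process_one(pkt_t *p)
{
    p->seq += 1;
    return p->len;
}

/* Process n packets two at a time, prefetching the following pair.
 * Returns the total number of bytes seen. */
uint64_t process_burst(pkt_t **pkts, size_t n)
{
    uint64_t bytes = 0;
    size_t i = 0;

    assert(pkts != NULL || n == 0);

    for (; i + 1 < n; i += 2) {
        /* Prefetch the next pair for writing (rw = 1) with high
         * temporal locality (3), but only if both entries exist. */
        if (i + 3 < n) {
            __builtin_prefetch(pkts[i + 2], 1, 3);
            __builtin_prefetch(pkts[i + 3], 1, 3);
        }
        bytes += process_one(pkts[i]);
        bytes += process_one(pkts[i + 1]);
    }

    /* Tail: handle a single leftover packet when n is odd. */
    for (; i < n; i++)
        bytes += process_one(pkts[i]);

    return bytes;
}
```

If the per-packet work is already short or the data is already cache-resident, the benefit can be small, which would be consistent with the roughly 2% improvement noted above.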

12/1/2020

  • Attendees
    • Govindarajan Mohandoss
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
      • LF will provide QSFP+ fiber switch for FD.io lab.
      • CSIT will install normally used os distro and kernel.
      • 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
        • To enable voting right for the VPP device jobs. - Juraj
          • Failed tests due to sw_interface_dump api issue. - Juraj
        • VPP device job is unstable
          • Race condition occurs when multiple VPP instances are starting.
          • Will try to update the i40e driver & firmware.
      • N1SDP shipment to FD.io
        • Govind will update Juraj and Maciek on the shipment status.
        • Will still have to ship N1SDP to FD.io lab, and Maciek confirmed that there will be enough rack space.
      • CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
        • Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Maciek to provide 10G switch.
        • Arm is required to present Arm achievement and plan to TSC.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • SOC id will be available on /proc entry starting from kernel version 5.9
        • Will investigate the details - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • To optimize ethernet-input and avf-input node with NEON intrinsics
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 proposal
      • Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
      • Patches are upstreamed for comments
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • IPsec on Arm platform. - Govind
      • Apply prefetches with ipsec-out node
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

11/24/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
    • VPP Device
      • Current VPP device testing on TX2 is around 40 mins - 45 mins
      • LF will provide QSFP+ fiber switch for FD.io lab.
      • CSIT will install normally used os distro and kernel.
      • 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
        • To enable voting right for the VPP device jobs. - Juraj
          • Failed tests due to sw_interface_dump api issue. - Juraj
      • N1SDP shipment to FD.io
        • Govind will update Juraj and Maciek on the shipment status.
        • Will still have to ship N1SDP to FD.io lab, and Maciek confirmed that there will be enough rack space.
      • CSIT budget plan for 10G switch purchase in FD.io lab. - Juraj, Tina
        • Trishan de Lanerolle <tdelanerolle@linuxfoundation.org> from LF is working with Maciek to provide 10G switch.
        • Arm is required to present Arm achievement and plan to TSC.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • SOC id will be available on /proc entry starting from kernel version 5.9
        • Will investigate the details - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • To optimize ethernet-input and avf-input node with NEON intrinsics
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 proposal
      • Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
      • Patches are upstreamed for comments
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • IPsec on Arm platform. - Govind
      • Apply prefetches with ipsec-out node
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

11/17/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Tina Tsou
  • General
  • CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • SOC id will be available on /proc entry starting from kernel version 5.9
        • Will investigate the details - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Initial benchmarking and analysis is done, and profiling result is recorded.
      • To optimize ethernet-input and avf-input node with NEON intrinsics
      • Try cross-connect with AVF PMD driver to check avf-input node only - Lijian
    • SVE/SVE2 proposal
      • Refactored ethernet-input node with SVE/SVE2 intrinsics per Damjan's suggestion
      • Patches are upstreamed for comments
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
    • IPsec on Arm platform. - Govind
      • Apply prefetches with ipsec-out node
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind

11/10/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • CSIT DPDK test cases will be enabled on Arm servers and data will be included next CSIT release report.
      • L2 learning 1Mx flows, 4T4C, with release-2005, about 20% performance drop.
        • The patch that caused this issue has been identified - https://gerrit.fd.io/r/c/vpp/+/26549
        • Repeat tests on local N1SDP and cascade server. - Jieqiang
        • Repeat the test case with latest master branch. - Jieqiang
        • The patch that introduced this perf drop needs to be analyzed. - Jieqiang, Lijian
        • This patch needs to be analysed on VPP 2005 and 2001 releases. - Jieqiang, Lijian
        • The perf drop rate is ~5-8% on latest VPP code compared to the original data.
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
      • 1 Spare TX2 server can be used to create 2 node topology to run performance tests.
        • Juraj to check with Peter about the feasibility.
        • Move the thx2 to the same rack for tg and install the same nic on tg.
        • 1g NIC for management installed on thx2, but cannot be net-booted.
          • Able to net-boot from the built-in 10G NIC.
          • The tx2 has been moved to the same rack where the tg is located.
          • Plan to set up the weekly perf tests on the new topo.
        • Port the Robot Framework configuration steps for tsh testbeds from thx1 to thx2 to speed up perf tests. - Juraj
    • VPP Path
      • 6x ThunderX1 servers in total in Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
        • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • https://gerrit.fd.io/r/c/ci-management/+/28022 automate the generation of docker builder images.
        • Dave plans to drop support for CentOS 7.
        • Tried Dave's patch to generate docker image on Arm and saw some errors. - Juraj
        • Test arm centos7 jenkins builder image. - Juraj.
        • Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
    • VPP Device
      • CSIT will install normally used os distro and kernel.
      • 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Revert to old kernel version 4.15.0-55 to avoid AVF issue.
        • AVF issue is common across the platform.
          • Differences between avf driver versions may be the root cause of behavior changes.
        • New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 45 minutes.
          • Python runs slower on new thx2 servers than on 1-node skylake.
          • Try a newer version of Python (such as 3.8) or split the device tests into two parts.
          • Check how many CPUs get utilized for robot framework execution on thx2 server.
          • Two thunderx2 are running fine right now and the VPP device jobs are almost done.
          • Disabling hyperthreading on new thx2 will speed up the VPP device tests.
          • Enable the voting right for the VPP device jobs. - Juraj
            • Failed tests due to sw_interface_dump api issue. - Juraj
      • N1SDP shipment to FD.io
        • Get response from Maciek about the rack space and traffic generator availability.
      • CSIT budget plan for 10g switch purchase in FD.io lab. - Juraj, Tina
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
      • Summarize the meeting minutes and action items. - Lijian
      • SOC id will be available on /proc entry from kernel version 5.9 - Lijian, Honnappa
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
      • Start benchmarking AVF PMD driver in VPP on N1SDP.
      • Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
      • avf_input and avf_output nodes don't consume more CPU cycles than dpdk-related nodes do.
    • SVE/SVE2 proposal
      • Will send email to Damjan asking him to review
      • SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
      • No further comments from VPP community.
      • Apply the SVE/SVE2 intrinsics on refactoring ethernet-input node. - Lijian
        • SVE/SVE2 functionality to be tested on the new development platform.
        • Verify SVE/SVE2 code changes on simulator.
        • Try to run standalone SVE codes on the new FPGA platform.
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
      • Ampere altra server has some PCIe bugs.
      • Try the VFs with DPDK plugin. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Find out the tuned configuration for cross connect test cases using AVF PMD driver.
        • Figure out corresponding configurations in CSIT scripts.
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
        • Will investigate the prefetches and loop unrolling on IPsec input node. - Govind
    • Plans
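
The 64B versus 128B cache line item under multi-arch support concerns a build-time property, which is why a single generic image cannot be optimal for both. As a rough illustration only, the sketch below shows how a process might query the L1 data cache line size at run time and fall back to a conservative default. It assumes glibc's `_SC_LEVEL1_DCACHE_LINESIZE` extension, which may be absent or may report 0 on some AArch64 systems. The helper names are hypothetical and this is not how VPP selects its cache line size.

```c
#include <assert.h>
#include <unistd.h>

/* Accept only the two line sizes discussed in the minutes (64 or 128 bytes);
 * anything else, including 0 or -1, maps to the caller's fallback. */
long normalize_cache_line(long reported, long fallback)
{
    assert(fallback == 64 || fallback == 128);

    if (reported == 64 || reported == 128)
        return reported;
    return fallback;
}

/* Query the L1 data cache line size where the C library exposes it.
 * _SC_LEVEL1_DCACHE_LINESIZE is a glibc extension, hence the guard. */
long detect_cache_line_bytes(long fallback)
{
    long reported = -1;

#ifdef _SC_LEVEL1_DCACHE_LINESIZE
    reported = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
#endif
    return normalize_cache_line(reported, fallback);
}
```

Run-time detection like this can guide choices such as prefetch stride, but structure padding and alignment are fixed when the image is compiled.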

11/03/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
        • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • https://gerrit.fd.io/r/c/ci-management/+/28022 automate the generation of docker builder images.
        • Test arm centos7 jenkins builder image. - Juraj.
        • Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
    • VPP Device
      • CSIT will install normally used os distro and kernel.
      • 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Revert to old kernel version 4.15.0-55 to avoid AVF issue.
        • AVF issue is common across the platform.
          • Differences between avf driver versions may be the root cause of behavior changes.
        • New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 45 minutes.
          • Python runs slower on new thx2 servers than on 1-node skylake.
          • Try a newer version of Python (such as 3.8) or split the device tests into two parts.
          • Check how many CPUs get utilized for robot framework execution on thx2 server.
          • Two thunderx2 are running fine right now and the VPP device jobs are almost done.
      • N1SDP shipment to FD.io
        • Get response from Maciek about the rack space and traffic generator availability.
      • CSIT budget plan for 10g switch purchase in FD.io lab. - Juraj, Tina
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
      • Summarize the meeting minutes and action items. - Lijian
      • SOC id will be available on /proc entry from kernel version 5.9 - Lijian, Honnappa
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
      • Start benchmarking AVF PMD driver in VPP on N1SDP.
      • Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
      • avf_input and avf_output nodes don't consume more CPU cycles than dpdk-related nodes do.
    • SVE/SVE2 proposal
      • Will send email to Damjan asking him to review
      • SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
      • No further comments from VPP community.
      • Apply the SVE/SVE2 intrinsics on refactoring ethernet-input node. - Lijian
        • SVE/SVE2 functionality to be tested on the new development platform.
    • Repeat the 4x and 2x loop unrolling tests on Ampere server for L3 forwarding with Lijian's patch. - Jieqiang
    • Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Find out the tuned configuration for cross connect test cases using AVF PMD driver.
        • Figure out corresponding configurations in CSIT scripts
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
        • Review akshitha's PPT on SLC eviction and share it with the team. - Govind.
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
    • Plans

10/27/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
        • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
    • VPP Device
      • CSIT will install normally used os distro and kernel.
      • 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Revert to old kernel version 4.15.0-55 to avoid AVF issue.
          • Differences between avf driver versions may be the root cause of behavior changes.
        • New VPP device job takes about 55 minutes to finish, which needs to be reduced to around 40 minutes.
          • Python runs slower on new thx2 servers than on 1-node skylake.
          • Try a newer version of Python (such as 3.8) or split the device tests into two parts.
          • Check how many CPUs get utilized for robot framework execution on thx2 server.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
      • Summarize the meeting minutes and action items. - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
      • Start benchmarking AVF PMD driver in VPP on N1SDP.
      • Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
      • avf_input and avf_output nodes don't consume more CPU cycles than dpdk-related nodes do.
    • SVE/SVE2 proposal
      • Will send email to Damjan asking him to review
      • SVE proposal patch is upstreamed, call for comments - https://gerrit.fd.io/r/c/vpp/+/28986
      • No further comments from VPP community.
      • Apply the SVE/SVE2 on ethernet-input node. - Lijian
    • Repeat the 4x and 2x loop unrolling tests on Ampere server. - Jieqiang
    • Benchmark the performance of L2/L3/ACL tests using AVF PMD driver on Ampere server. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
        • Figure out corresponding configurations in CSIT scripts
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
      • Work on IPsec input node and VPP uses linear search on SPD lookup.
        • Will try loop unrolling on the SPD lookup.
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
        • Tuned version has higher SLC eviction than the default version, talk with CPU team to figure out the reason.
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
    • Plans

10/20/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Check with Dave about what we should do with CentOS-7 on Arm Jenkins if CentOS-8 is the main distro for VPP verification. - Juraj
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
        • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
        • Errors happen when running latest VPP debug image, which was introduced by https://gerrit.fd.io/r/c/vpp/+/29490 - Lijian
    • VPP Device
      • CSIT will install normally used os distro and kernel.
      • 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Two failed test cases related to AVF plugin.
          • The root cause is the newer kernel version - 4.15.0-118-generic fails, 4.15.0-72-generic works.
          • Downgrade the kernel version to 4.15.0-72-generic and continue the VPP device testing.
          • Try the same experiment on X86 to see if this issue is arm-specific or not. - Juraj
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
      • Start benchmarking AVF PMD driver in VPP on N1SDP.
      • Investigate the performance gap using AVF PMD driver between N1SDP and Cascade Lake. - Lijian
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Finished the benchmarking and shared the data to team. - Jieqiang
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
        • Figure out corresponding configurations in CSIT scripts
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
      • Tune PMD config to achieve zero cache eviction for System Level Cache on N1SDP - kamalakshitha
      • Prefetches on IPsec and learn about the cache behavior. - kamalakshitha
    • Plans

10/13/2020

  • Attendees
    • Govindarajan Mohandoss
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Dave is preparing scripts to generate docker images automatically on both x86 and Arm - First step;
        • CentOS-8 on Arm Jenkins is created and could be triggered manually with 'beta-verify' and 'beta-merge' (Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
        • Check with CSIT maintainers about the concrete plan for the enablement of Centos8 gerrit Jenkins job on aarch64. - Juraj
    • VPP Device
      • CSIT will install normally used os distro and kernel.
      • 2x ThunderX2 servers are setup in FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. pxeboot (N/W boot) works fine with 10G NIC (Inband) and not with 1G NIC. One of the servers works after reboot and the other server loses N/W connectivity.
        • Figure out which of the two hosts should run the Jenkins job.
        • Two failed test cases related to AVF plugin.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compiler as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • Key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires VPP generic image supporting 64B and 128B cache line CPU optimally at the same time - cannot be satisfied so far.
      • Detect the CPU type from firmware for Perseus-CPU servers, need to confirm with customers. - Lijian
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Finished the benchmarking and shared the data with the team.
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
        • Figure out corresponding configurations in CSIT scripts
        • Repeat the ACL ingress SL test cases locally for N1SDP.
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
    • Plans

10/06/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Tina Tsou
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • 6x ThunderX1 servers in total in the Nomad cluster
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • First step: Dave is preparing scripts to generate Docker images automatically on both x86 and Arm.
        • The CentOS-8 on Arm Jenkins job is created and can be triggered manually with 'beta-verify' and 'beta-merge' (the Ubuntu-20.04 job will also be triggered)
        • https://gerrit.fd.io/r/c/ci-management/+/28960
    • VPP Device
      • CSIT will install the commonly used OS distro and kernel.
      • 2x ThunderX2 servers are set up in the FD.io lab. SSH and IPMI connections are working.
        • The servers are physically installed. Packages are installed. CSIT tests are run on these servers outside of Jenkins. PXE boot (network boot) works fine with the 10G NIC (in-band) but not with the 1G NIC. One of the servers works after reboot and the other server loses network connectivity.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compilers as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • The key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires a generic VPP image that optimally supports both 64B and 128B cache-line CPUs at the same time; this cannot be satisfied so far.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Finished the benchmarking and shared the data with the team.
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Figure out corresponding configurations in CSIT scripts
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
    • Plans

09/29/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compilers as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • The key point is how to differentiate vendor CPUs from other Perseus CPUs
      • Nitin requires a generic VPP image that optimally supports both 64B and 128B cache-line CPUs at the same time; this cannot be satisfied so far.
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Finished the benchmarking and shared the data with the team.
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Figure out corresponding configurations in CSIT scripts
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
    • Plans

09/22/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compilers as gcc-10 > gcc-9.2.0 > clang-10
    • Multi-arch support - Lijian
      • The key point is how to differentiate vendor CPUs from other Perseus CPUs
    • Investigate VPP Intel AVF PMD driver - Lijian
      • Start investigating AVF code in VPP.
    • SVE/SVE2 proposal
    • Benchmark VPP scalability on N1SDP vs CascadeLake, with 3x CPUs.
      • Repeat scalability testing on N1SDP, e.g., 1T1C, 2T2C, 3T3C
      • Figure out corresponding configurations in CSIT scripts
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • IPsec on Arm platform. - Govind
    • Plans

09/15/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • LF will pay for the expense, and Vexxhost has ordered or will order the new RAM module.
      • Will confirm with Dean whether Arm can pay for the expense. If yes, will send the proposal to Vexxhost.
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
      • Check with Juraj for the latest news about the faulty RAM.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • First step: Dave is preparing scripts to generate Docker images automatically on both x86 and Arm.
        • Second step: add CentOS-7 on Arm.
        • Will confirm with Dave W. whether he will add this Jenkins job and whether he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
      • We can decommission the 3x SoftIron servers directly, but for the existing ThunderX2 servers the decommissioning could be temporary. We will probably reinstall them in the near future.
      • Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
    • Two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
    • Budget plan for CSIT FD.io lab.
      • We have enough servers for VPP path & device tests.
      • We can ask the CSIT FD.io lab folks to reserve rack space for Arm servers.
      • We may plan to send new advanced servers for perf tests in the future, but we won't mention the specific server type.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compilers as gcc-10 > gcc-9.2.0 > clang-10
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • The IP4-rewrite refactor patch brings a performance improvement, especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • Vendor CPU server enablement in VPP - Lijian
      • Ready for internal review
      • Will discuss with VPP maintainer
    • Investigate VPP Intel AVF driver - Lijian
    • SVE
      • The SVE intrinsics wrapper is done. The proposal patch is ready for review.
      • ACLE, architecture, sve-sve2-programming-example
      • SVE intrinsics are preferred.
      • Share SVE knowledge with the DPDK team.
    • Benchmarked VPP scalability on N1SDP, with 3x CPUs.
      • Will repeat scalability testing on N1SDP.
    • Benchmark AVF driver between Cascade Lake and N1SDP - Jieqiang
      • Will investigate AVF drivers on Arm. - Lijian
    • Jieqiang will figure out performance data for 1x, 10Kx flows on Cascade Lake in CSIT.
      • Confirm whether the system is the same for the local Dell server and the Cascade Lake server in CSIT. - Jieqiang
      • Check if there are any test cases with 1t1c/2t2c/4t4c configured for 2n-clx testbed in CSIT - Jieqiang
      • Performance data; Configurations;
    • VM2VM
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Started system tuning on PMD TX direction.
      • Investigate mempool configuration.
      • Change the descriptor size by modifying the DPDK source code.
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnel, IPsec.
    • Plans

09/08/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • LF will pay for the expense, and Vexxhost has ordered or will order the new RAM module.
      • Will confirm with Dean whether Arm can pay for the expense. If yes, will send the proposal to Vexxhost.
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • First step: Dave is preparing scripts to generate Docker images automatically on both x86 and Arm.
        • Second step: add CentOS-7 on Arm.
        • Will confirm with Dave W. whether he will add this Jenkins job and whether he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
      • We can decommission the 3x SoftIron servers directly, but for the existing ThunderX2 servers the decommissioning could be temporary. We will probably reinstall them in the near future.
      • Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
    • Two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
  • VPP
    • In Ubuntu-20.04, VPP on Arm will reprioritize compilers as gcc-10 > gcc-9.2.0 > clang-10
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • The IP4-rewrite refactor patch brings a performance improvement, especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • SVE
      • The SVE intrinsics wrapper is done. The proposal patch is ready for review.
      • ACLE, architecture, sve-sve2-programming-example
      • SVE intrinsics are preferred.
    • Benchmarked VPP scalability on N1SDP, with 3x CPUs.
      • Will repeat scalability testing on N1SDP.
    • Benchmark AVF driver between Cascade Lake and N1SDP - Jieqiang
      • Will investigate AVF drivers on Arm. - Lijian
    • Jieqiang will figure out performance data for 1x, 10Kx flows on Cascade Lake in CSIT.
      • Performance data; Configurations;
    • VM2VM
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Started system tuning on PMD TX direction.
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnel, IPsec.
    • Plans

09/01/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Will confirm with Dean whether Arm can pay for the expense. If yes, will send the proposal to Vexxhost.
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
        • It seems that plugging working RAM modules into empty slots will resolve the problem.
        • Juraj will send an email to Machiek about the ownership of FD.io lab servers and who should pay the charge.
      • The second ThunderX1 has an IPMI problem, but SSH is working fine.
        • The IPMI IP was configured from the Linux prompt over SSH. It is working fine now.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
        • 3x spare ThunderX servers are used for CI and are included in the Nomad cluster.
        • The faulty RAM on the TX server is not fixed and is yet to be debugged.
        • Pending on Vexxhost to proceed further.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. whether he will add this Jenkins job and whether he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
      • Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
    • Two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
  • VPP
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 produces compile errors with the latest VPP source code.
        • Jieqiang has fixed this issue; the fix is available for internal review.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • The IP4-rewrite refactor patch brings a performance improvement, especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • The gcc-10 compile issue is resolved and the fix is merged.
    • SVE
      • ACLE, architecture, sve-sve2-programming-example
      • SVE intrinsics are preferred.
    • Benchmarked VPP scalability on N1SDP, with 3x CPUs.
    • VM2VM
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Started system tuning on PMD TX direction.
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnel, IPsec.
    • Plans

08/25/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
        • It seems that plugging working RAM modules into empty slots will resolve the problem.
        • Juraj will send an email to Machiek about the ownership of FD.io lab servers and who should pay the charge.
      • The second ThunderX1 has an IPMI problem, but SSH is working fine.
        • The IPMI IP was configured from the Linux prompt over SSH. It is working fine now.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
        • 3x spare ThunderX servers are used for CI and are included in the Nomad cluster.
        • The faulty RAM on the TX server is not fixed and is yet to be debugged.
        • Pending on Vexxhost to proceed further.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. whether he will add this Jenkins job and whether he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
      • Mention the rack space request for the two ThunderX2 servers in CSIT meeting. - Juraj
    • Two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
  • VPP
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 produces compile errors with the latest VPP source code.
        • Jieqiang has fixed this issue; the fix is available for internal review.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • The IP4-rewrite refactor patch brings a performance improvement, especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • SVE
      • ACLE, architecture, sve-sve2-programming-example
      • SVE intrinsics are preferred.
    • Benchmarked VPP scalability on N1SDP, with 3x CPUs.
    • VM2VM
    • Transport use cases on VPP. - Govind
      • Discussed the node graph and topology.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnel, IPsec.
    • Plans


08/18/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • Jieqiang is investigating some cases of performance drop (between the 2005 and 2008 releases) on Taishan servers.
      • https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh/
      • They have finished collecting data with the performance testing setup, and the MRR daily job has resumed.
      • FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
      • Jieqiang will share the investigation report, but so far there are no apparent performance differences.
        • Some 4T4C test cases on Taishan show an obvious performance drop. Will compare the trend with x86 machines.
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
      • The second ThunderX1 has an IPMI problem, but SSH is working fine.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
        • 3x spare ThunderX servers are used for CI and are included in the Nomad cluster.
        • The faulty RAM on the TX server is not fixed and is yet to be debugged.
        • Pending on Vexxhost to proceed further.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. whether he will add this Jenkins job and whether he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
        • Juraj/Jieqiang to help Dave Wallace to fix the script issues. Currently, the build process is done manually and will be automated.
    • VPP Device
    • Two ThunderX2 servers have been received by Vexxhost and are currently in the storage warehouse.
    • Vexxhost staff will set up the servers and provide IP connectivity. Juraj will install the necessary software after that.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 produces compile errors with the latest VPP source code.
        • Jieqiang has fixed this issue; the fix is available for internal review.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • The IP4-rewrite refactor patch brings a performance improvement, especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • ACL optimization investigation on n1sdp - Govind
      • The patch to remove redundant prefetches is committed - Govind
    • Trying to enable IPsec on the Arm platform. - Govind
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnel, IPsec.
    • Plans


08/11/2020

  • Attendees
    • Honnappa Nagarahalli
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
    • Filip Varga
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
      • The second ThunderX1 has an IPMI problem, but SSH is working fine.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
        • 3x spare ThunderX servers are used for CI and are included in the Nomad cluster.
        • The faulty RAM on the TX server is not fixed and is yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. whether he will add this Jenkins job and whether he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • The VPP device job is running now and will be triggered for each VPP patch and CSIT patch
  • FD.io lab
    • Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 produces compile errors with the latest VPP source code.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • The IP4-rewrite refactor patch brings a performance improvement, especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • ACL optimization investigation on n1sdp - Govind
      • The patch to remove redundant prefetches is committed - Govind
    • Trying to enable IPsec on the Arm platform. - Govind
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnel, IPsec.
    • Plans

08/04/2020

  • Attendees
    • Honnappa Nagarahalli
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
    • Filip Varga
  • General
  • CSIT
    • VPP Performance Test
      • They have finished collecting data with the performance testing setup, and the MRR daily job has resumed.
      • FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
      • Jieqiang will share the investigation report, but so far there are no apparent performance differences.
        • Some 4T4C test cases on Taishan show an obvious performance drop. Will compare the trend with x86 machines.
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
      • The second ThunderX1 has an IPMI problem, but SSH is working fine.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
        • 3x spare ThunderX servers are used for CI and are included in the Nomad cluster.
        • The faulty RAM on the TX server is not fixed and is yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. whether he will add this Jenkins job and whether he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • The VPP device job is running now and will be triggered for each VPP patch and CSIT patch
  • FD.io lab
    • Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
      • gcc-10.1.0 produces compile errors with the latest VPP source code.
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • The IP4-rewrite refactor patch brings a performance improvement, especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • ACL optimization investigation on n1sdp - Govind
      • The patch to remove redundant prefetches is committed - Govind
    • Trying to enable IPsec on the Arm platform. - Govind
      • Currently working on non-encryption optimization with PMD driver.
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnel, IPsec.
    • Plans


07/28/2020

  • Attendees
    • Honnappa Nagarahalli
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • They have finished collecting data with the performance testing setup, and the MRR daily job has resumed.
      • FD.io CSIT-2005 Release Report was released, https://docs.fd.io/csit/rls2005/report/
      • Jieqiang will share the investigation report, but so far there are no apparent performance differences.
      • VPP performance testing is running once a week.
    • VPP Path
      • One ThunderX1 has faulty RAM. Will try to replace all the RAM modules.
      • The second ThunderX1 has an IPMI problem, but SSH is working fine.
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
        • 3x spare ThunderX servers are used for CI and are included in the Nomad cluster.
        • The faulty RAM on the TX server is not fixed and is yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Will confirm with Dave W. whether he will add this Jenkins job and whether he requires any help - Jieqiang
        • Questions on the Dockerfile upload.
        • The Dockerfile has been upstreamed for review and merge.
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • The VPP device job is running now and will be triggered for each VPP patch and CSIT patch
  • FD.io lab
    • Two ThunderX2 servers have been collected and shipped, and are targeted to arrive on July 30th.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
    • N1SDP enablement. - Lijian
      • Preparing patches to enable creating big tables on huge-pages
      • The IP4-rewrite refactor patch brings a performance improvement, especially with 10K flows
        • Upstreamed; CSIT testing is being used to verify the patch.
    • ACL optimization investigation on n1sdp - Govind
      • The patch to remove redundant prefetches is committed - Govind
    • Trying to enable IPsec on the Arm platform. - Govind
      • Focus on both non-encryption and encryption cases.
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, tunnel, IPsec.
    • Plans

07/21/2020

  • Attendees
    • Honnappa Nagarahalli
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
    • VPP Path
      • Juraj will follow up on or create a new Vexxhost ticket to replace the faulty RAM.
        • 3x spare ThunderX servers are used for CI and are included in the Nomad cluster. 1 debugging server for VPP Dev and 3 servers (2 TX and 1 TX2) are unreachable through IPMI, and one of them is reachable through SSH. The IPMI unreachability is still being investigated by Vexxhost. CI functionality is restored with the spare TX servers. The TX2 server is unreachable through IPMI and VPP device jobs are not running. The faulty RAM on the TX server is not fixed and is yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • The VPP device job is running now and will be triggered for each VPP patch and CSIT patch
  • FD.io lab
    • Arm has
    • If Vexxhost can collect the hardware, will ship the servers ASAP.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
      • Initial benchmarking with VPP hoststack on N1SDP was done. 29Gb/s on N1SDP and 26Gb/s on Haswell.
      • Investigating vlib_timer and timer wheel in VPP.
    • Benchmarking between gcc-10.1.0, clang-10/clang-9 and gcc-9.2.0 on Arm machines - Jieqiang
    • N1SDP enablement. - Lijian
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
        • The patch is upstreamed; CSIT testing is being used to verify it.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans


07/14/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • Will probably use 3x spare ThunderX1 servers as CI build server/Nomad cluster.
        • Two of the three ThunderX1 servers cannot be accessed.
        • Spare ThunderX servers are used for CI and are included in the Nomad cluster. One debugging server for VPP Dev and 3 servers (2 TX and 1 TX2) are unreachable through IPMI; one of them is reachable through SSH. The IPMI unreachability is still being investigated by vexxhost. CI functionality has been restored with the spare TX servers. The TX2 server is unreachable through IPMI and VPP device jobs are not running. The faulty RAM on the TX server has not been fixed and is yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been verified by Jieqiang, will send to Dave Wallace to use it for VPP Jenkins job.
        • Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
      • With a software bug fixed, VPP boots up normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' will override default clang-9 in vpp.
    • N1SDP enablement. - Lijian
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
      • Investigating various numbers of rx_q_bufs & tx_q_bufs
      • Investigating various vector sizes and their effect on throughput
      • Benchmark and compare PMU counters between 4x and 2x loop unrolling on N1SDP
    • ACL optimization investigation on n1sdp - Govind
      • Investigating the use of SPE counters to profile the ACL plugin bottleneck
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

07/07/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • Will probably use 3x spare ThunderX1 servers as CI build server/Nomad cluster.
        • Two of the three ThunderX1 servers cannot be accessed.
        • Spare ThunderX servers are used for CI and are included in the Nomad cluster. One debugging server for VPP Dev and 3 servers (2 TX and 1 TX2) are unreachable through IPMI; one of them is reachable through SSH. The IPMI unreachability is still being investigated by vexxhost. CI functionality has been restored with the spare TX servers. The TX2 server is unreachable through IPMI and VPP device jobs are not running. The faulty RAM on the TX server has not been fixed and is yet to be debugged.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been verified by Jieqiang, will send to Dave Wallace to use it for VPP Jenkins job.
        • Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
      • With a software bug fixed, VPP boots up normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' will override default clang-9 in vpp.
    • N1SDP enablement. - Lijian
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
      • Investigating various numbers of rx_q_bufs & tx_q_bufs
      • Investigating various vector sizes and their effect on throughput
      • Benchmark and compare PMU counters between 4x and 2x loop unrolling on N1SDP
    • ACL optimization investigation on n1sdp - Govind
      • Investigating the use of SPE counters to profile the ACL plugin bottleneck
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang


06/30/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • Will probably use 3x spare ThunderX1 servers as CI build server/Nomad cluster.
        • Two of the three ThunderX1 servers cannot be accessed.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been verified by Jieqiang, will send to Dave Wallace to use it for VPP Jenkins job.
        • Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
      • With a software bug fixed, VPP boots up normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • VPP hoststack TCP/CPS (Connections per Second) investigation
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' will override default clang-9 in vpp.
    • N1SDP enablement. - Lijian
      • IP4-rewrite refactor patch brings performance improvement especially with 10K flows
      • Investigating various numbers of rx_q_bufs & tx_q_bufs
      • Investigating various vector sizes and their effect on throughput
      • Benchmark and compare PMU counters between 4x and 2x loop unrolling on N1SDP
    • ACL optimization investigation on n1sdp - Govind
      • Investigating the use of SPE counters to profile the ACL plugin bottleneck
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

06/23/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
        • Two of the three ThunderX1 servers cannot be accessed.
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload.
        • The Dockerfile has been verified by Jieqiang, will send to Dave Wallace to use it for VPP Jenkins job.
        • Jieqiang will send email to Dave Wallace about CentOS-7 on Arm Jenkins job. - Jieqiang
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
      • With a software bug fixed, VPP boots up normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • L3FWD status
    • CSIT status
    • EPIC plan
      • SVE2 investigation in VPP
      • VPP hoststack TCP/CPS (Connections per Second) investigation
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' will override default clang-9 in vpp.
    • N1SDP enablement. - Lijian
      • Profiling with NMU-600 counters.
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput drops as the number of flows increases on N1SDP.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

06/16/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community has started collecting performance data with these CSIT machines.
    • VPP Path
      • Juraj will follow or create new vexxhost ticket to replace faulty RAM.
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the Dockerfile upload. The Dockerfile needs to be tested in a local VPP sandbox before uploading. The Dockerfile needs to be labelled by Dave Wallace to use it for the VPP Jenkins job.
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
        • gcc-9 is hard-coded and used, so compilation issue is gone.
      • With a software bug fixed, VPP boots up normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' will override default clang-9 in vpp.
    • N1SDP enablement. - Lijian
      • Profiling with NMU-600 counters.
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput drops as the number of flows increases on N1SDP.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

06/09/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • VPP performance testing is running once a week.
      • Community will collect performance data with these CSIT machines.
      • IPSec tunnel configuration issue.
        • Issue is resolved.
        • Also failing on x86. CSIT maintainer is trying to root cause the problem.
          • Juraj to run the IPSec regression on Taishan server with the IPSec patch.
    • VPP Path
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the Dockerfile upload. The Dockerfile needs to be tested in a local VPP sandbox before uploading. The Dockerfile needs to be labelled by Dave Wallace to use it for the VPP Jenkins job.
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
        • gcc-9 is hard-coded and used, so compilation issue is gone.
      • With a software bug fixed, VPP boots up normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
      • Information on the two ThunderX2 servers will be confirmed with the FD.io CSIT lab admin. - Jieqiang
      • Commit internal patch to support ThunderX2 in VPP device testing. - Jieqiang
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • If vexxhost can collect the hardware, will ship the servers asap.
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' will override default clang-9 in vpp.
    • N1SDP enablement. - Lijian
      • Profiling with NMU-600 counters.
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput drops as the number of flows increases on N1SDP.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • After VPP device scripts can run well on local servers, Jieqiang can investigate more features, IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

06/02/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • IPSec tunnel configuration issue.
        • Also failing on x86. CSIT maintainer is trying to root cause the problem.
          • Juraj to run the IPSec regression on Taishan server with the IPSec patch.
    • VPP Path
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload. The docker file needs to be tested with a local VPP sandbox before uploading, and it needs to be labelled by Dave Wallace before it can be used for the VPP Jenkins job.
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
        • gcc-9 is hard-coded and used, so compilation issue is gone.
      • After fixing a software bug, VPP boots normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers.
        • Internal patch is committed. Requires legal permission.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
      • 'make build-release CC=gcc' overrides the default clang-9 in VPP.
    • N1SDP enablement. - Lijian
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput performance drop as flow number increases in N1SDP.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

05/26/2020

  • Attendees
    • Govindarajan Mohandoss
    • Juraj Linkes
    • Jieqiang Wang
    • Tina Tsou
  • General
  • CSIT
    • VPP Performance Test
      • IPSec tunnel configuration issue.
        • Also failing on x86. CSIT maintainer is trying to root cause the problem.
          • Juraj to run the IPSec regression on Taishan server with the IPSec patch.
    • VPP Path
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
        • Questions on the docker file upload. The docker file needs to be tested with a local VPP sandbox before uploading, and it needs to be labelled by Dave Wallace before it can be used for the VPP Jenkins job.
        • Vanessa Valderrama <vvalderrama@linuxfoundation.org>
        • 'Dave Wallace' <dwallacelf@gmail.com>
        • https://gerrit.fd.io/r/gitweb?p=ci-management.git;a=summary
        • gcc-9 is hard-coded and used, so compilation issue is gone.
      • After fixing a software bug, VPP boots normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
    • Dean will schedule shipping these two TX2 servers to FD.io lab.
      • Update the document with server information before shipping the servers. Jieqiang will set up a meeting with Juraj regarding this documentation.
    • Update server information to CSIT documentation. - Juraj & Jieqiang
    • Prepare CSIT script patch for adding those two ThunderX2 servers. - Juraj & Jieqiang
      • This change can be done once TX2 servers are shipped to FDIO lab.
    • Dave Wallace - Install the Nomad service on those two servers - Juraj & Jieqiang
      • Nomad takes care of redundancy and resources like CPU/Memory. 16 cores per job and 6 jobs in total.
    • The servers, Intel NICs, and Mellanox NICs work well so far.
      • Root-causing the RDMA issue with Mellanox NIC.
    • ThunderX2 servers are in Arm local lab. Dean is setting up the hardware.
    • Two more ThunderX2 have just been ordered and are expected to arrive in Arm lab in April.
      • We are about to purchase two official ThunderX2 servers in market.
      • Raise the budget requirement from CE-OSS - Dean & Honnappa
      • Check the ThunderX2 configurations required - Govind & Juraj
    • Two ThunderX2 servers are installed in Arm lab.
    • Per patch regression: 2 node topology is freely available. New ARM setup can be made to run per patch regression.
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
    • N1SDP enablement. - Lijian
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput performance drop as flow number increases in N1SDP.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang


05/19/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
    • Lijian Zhang
  • General
  • CSIT
    • VPP Performance Test
      • The other failure is related to the VPP image on Arm: an IPSec tunnel configuration issue.
        • Also failing on x86. CSIT maintainer is trying to root cause the problem.
    • VPP Path
      • Investigate Ubuntu-20.04 on Arm servers - Jieqiang
      • Investigate adding CentOS-7 on Arm Jenkins jobs - Juraj & Jieqiang
      • After fixing a software bug, VPP boots normally with 16K/64K page sizes.
        • There are about 4-5 test failures in 'make test' when the system is configured with a 16K/64K page size - Lijian
    • VPP Device
    • VPP device job is running now and will be triggered per VPP patch and CSIT patch
  • FD.io lab
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
      • Will benchmark gcc-9 against clang-10 to decide which should be the default compiler; will sync up with Suresh.
    • N1SDP enablement. - Lijian
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput performance drop as flow number increases in N1SDP.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
      • Upstream the ACL patch for CSIT performance testing experiment.
    • Trying to enable IPsec on the Arm platform. - Govind
      • Basic IPsec functions are working. Will do benchmarking per CPU core.
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

04/28/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
      • Resolve vectorized endianness conversion error in Mellanox RDMA driver.
      • To benchmark Mellanox DPDK PMD driver vs RDMA driver in VPP. - Lijian
      • Confirm with Suresh on his benchmarking data/scripts with Mellanox NICs
    • Resolve VPP compiling issue with clang-6.
    • VPP default compiler is clang-9 now, which does not support optimized options -mcpu=neoverse-n1/-mtune=neoverse-n1
    • N1SDP enablement. - Lijian
      • Multi-arch, arch-specific compiling and dynamic function selection patch is merged.
      • IOMMU limitation issue is gone after upgrading the kernel and firmware
        • Share the upgraded kernel/firmware versions with Govind
      • Investigate 4x loop unrolling performance degradation issue.
      • Throughput performance drop as flow number increases in N1SDP.
    • ACL optimization investigation on n1sdp - Govind
      • Patch to remove redundant prefetches is committed - Govind
      • Filed a confluence page to record the ACL investigation.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Will try to run VPP device testing on local ThunderX2 servers with XL710 25G NICs.
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang

04/28/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
    • Arthur Marshall
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Enabled RDMA driver on Arm, and Mellanox DPDK PMD driver is also working on ThunderX2
    • Investigate bihash operations, which are hot spots in L2 throughput
      • Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
      • Confirm whether the CRC32 calculation is compiled with O2 or O3 options - Lijian
    • Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
    • N1SDP enablement. - Lijian
    • GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
      • Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
      • Dual-loop unrolling seems to give better performance than quad-loop unrolling.
    • The iova_mode == VA not working issue has not been root-caused
      • DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
      • However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
        • This issue will not be seen in the latest N1 firmware. Upgrading to the latest firmware is pending.
    • Share L2/L3/ACL throughput with and without L3 cache - Govind
    • Will try with L3 cache enabled to see whether the performance drop as the flow number increases is fixed. - Govind
      • The degradation is seen even when L3 cache is enabled.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Create Confluence page to record all the performance benchmarking data - Lijian
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
        • Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian will discuss running the Jenkins job on CentOS with Juraj.

04/21/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Investigate bihash operations, which are hot spots in L2 throughput
      • Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
      • Confirm whether the CRC32 calculation is compiled with O2 or O3 options - Lijian
    • Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
    • N1SDP enablement. - Lijian
    • gcc-10 is not working so far.
      • GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
      • Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
      • Dual-loop unrolling seems to give better performance than quad-loop unrolling.
    • The iova_mode == VA not working issue has not been root-caused
      • DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
      • However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
        • This issue will not be seen in the latest N1 firmware. Upgrading to the latest firmware is pending.
    • Share L2/L3/ACL throughput with and without L3 cache - Govind
    • Will try with L3 cache enabled to see whether the performance drop as the flow number increases is fixed. - Govind
      • The degradation is seen even when L3 cache is enabled.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Create Confluence page to record all the performance benchmarking data - Lijian
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
        • Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian will discuss running the Jenkins job on CentOS with Juraj.

04/14/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Investigate bihash operations, which are hot spots in L2 throughput
      • Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
      • Confirm whether the CRC32 calculation is compiled with O2 or O3 options - Lijian
    • Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
    • N1SDP enablement. - Lijian
      • GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
      • Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
      • Dual-loop unrolling seems to give better performance than quad-loop unrolling.
    • The iova_mode == VA not working issue has not been root-caused
      • DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
      • However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
        • This issue will not be seen in the latest N1 firmware. Upgrading to the latest firmware is pending.
    • Share L2/L3/ACL throughput with and without L3 cache - Govind
    • Will try with L3 cache enabled to see whether the performance drop as the flow number increases is fixed. - Govind
      • The degradation is seen even when L3 cache is enabled.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Create Confluence page to record all the performance benchmarking data - Lijian
    • Trying to reproduce the CSIT VPP device testing on local servers - Jieqiang
      • Once the VPP device scripts run well on local servers, Jieqiang can investigate more features: IPv4, IPv6, Tunnel, IPSec.
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
        • Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian will discuss running the Jenkins job on CentOS with Juraj.
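
The dual-loop versus quad-loop comparison in the N1SDP notes above refers to how many packets a node handles per loop iteration. The following is a Python sketch of the control flow only: VPP implements this pattern in C, and Python says nothing about the performance difference. `process_unrolled` and the `handle` callback are illustrative names, not VPP APIs.

```python
from typing import Callable, List, Sequence

def process_unrolled(packets: Sequence[int],
                     handle: Callable[[int], int],
                     width: int = 2) -> List[int]:
    """Process `width` packets per iteration, then a one-at-a-time tail."""
    if width < 1:
        raise ValueError("width must be at least 1")
    out: List[int] = []
    i = 0
    n = len(packets)
    # Main loop: a full group of `width` packets per iteration.
    while n - i >= width:
        group = packets[i:i + width]
        out.extend(handle(p) for p in group)
        i += width
    # Tail loop: leftover packets that do not fill a group.
    while i < n:
        out.append(handle(packets[i]))
        i += 1
    return out

pkts = list(range(7))
dual = process_unrolled(pkts, lambda p: p + 1, width=2)
quad = process_unrolled(pkts, lambda p: p + 1, width=4)
print(dual == quad)  # True: both widths give the same result
```

The unroll width changes only how the work is grouped, not the result, which is why the choice between dual and quad can be made purely on measured throughput.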

04/07/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Juraj Linkes
    • Tina Tsou
    • Jieqiang Wang
  • General
  • CSIT
  • FD.io lab
  • VPP
    • Vectorization
    • Investigate bihash operations, which are hot spots in L2 throughput
      • Confirm with Damjan whether he plans to rewrite the l2-nodes - Lijian
      • Confirm whether the CRC32 calculation is compiled with O2 or O3 options - Lijian
    • Benchmarking, profiling and comparison between VPP and testpmd are done. Will review with the team. - Lijian
    • N1SDP enablement. - Lijian
      • GCC-9.2.0 is used with "-march=armv8.2-a+crc+crypto -mtune=neoverse-n1" compiler options.
      • Multi-arch, arch-specific compiling and dynamic function selection patch is ready.
      • Dual-loop unrolling seems to give better performance than quad-loop unrolling.
    • The iova_mode == VA not working issue has not been root-caused
      • DMA mapping between IOVA and PA: VPP and DPDK use the VA as the IOVA, and then do the DMA mapping.
      • However, the IOMMU on N1SDP requires a limited address space (less than 40 bits?).
        • This issue will not be seen in the latest N1 firmware. Upgrading to the latest firmware is pending.
    • Share L2/L3/ACL throughput with and without L3 cache - Govind
    • Will try with L3 cache enabled to see whether the performance drop as the flow number increases is fixed. - Govind
      • The degradation is seen even when L3 cache is enabled.
    • Trying to enable IPsec on the Arm platform. - Govind
    • Create Confluence page to record all the performance benchmarking data - Lijian
    • Plans
      • N1SDP performance investigation and improvement - Planned - Lijian
      • ACL plugin investigation - Planned - Govind & Lijian
      • IPsec investigation - Indicative - Govind
      • Lockless data-plane investigation by Govind in backlog
      • Continue investigating CSIT and CI management script, and run CSIT script on local servers - Jieqiang
        • Jieqiang needs 2 Intel NICs to make the test bed ready for VPP Path tests. Jieqiang and Lijian will discuss running the Jenkins job on CentOS with Juraj.
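
The IOVA discussion above can be illustrated with a small check. When the virtual address is reused as the IOVA, a buffer is only mappable if its whole range fits inside the address width the IOMMU accepts. The 40-bit width below is the tentative figure from the minutes (recorded there with a question mark), and the sample addresses are made up for illustration; `fits_iova_width` is a hypothetical helper, not a VPP or DPDK function.

```python
def fits_iova_width(addr: int, length: int, width_bits: int = 40) -> bool:
    """True if [addr, addr + length) lies entirely below 2**width_bits."""
    if addr < 0 or length <= 0:
        raise ValueError("addr must be non-negative and length positive")
    return addr + length <= (1 << width_bits)

# Hypothetical addresses, for illustration only.
low_va = 0x0000_0040_0000_0000    # 2**38, inside a 40-bit space
high_va = 0x0000_FFFF_8000_0000   # high address in a 48-bit address space

print(fits_iova_width(low_va, 2 * 1024 * 1024))   # True
print(fits_iova_width(high_va, 2 * 1024 * 1024))  # False
```

Under this assumption, any buffer whose virtual address lies above the limit cannot be used directly as an IOVA, which would be consistent with VA mode failing while the limit is in place.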

03/31/2020

03/24/2020

03/17/2020

03/10/2020

03/03/2020

02/25/2020


02/18/2020


02/11/2020


02/04/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Juraj Linkes
    • Tina Tsou
  • General
  • CSIT
  • FD.io lab
    • Two ThunderX2 servers are installed in Arm lab, but the Intel NIC cannot be enumerated on ThunderX2-01.
      • Cables for Intel NICs have been ordered.
      • Universal rails will be tried with ThunderX2 servers. If they work, will send the rails to FD.io lab.
    • Script/commands to verify the NICs are ready. Will try with the Mellanox NIC on ThunderX2-02 first - Lijian
    • Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab - Juraj
    • Current Configurations:
      • RAM: 256G
      • Disk: 480G SSD
      • The boxes are coming with Qlogic cards which are not supported in VPP.
    • Changes required to the servers:
      • The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
      • Need 2 Intel NICs XL710-QDA2 for each server.
      • If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
      • Disk size to 480G
      • Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
      • Cables: N1, P1 to N2, P1 and so on
      • Cables for IPMI and Management port: 2
    • Will call a meeting once the ThunderX2 servers reach Dean, inviting Juraj and the Vexxhost people at the FD.io lab to make sure the hardware is ready.
    • Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
    • Require two ThunderX2 servers in the FD.io lab (currently there is only one ThunderX2 in the lab)
    • Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
    • ThunderX1
  • VPP
    • Align Arm patches with VPP release plan.
      • F0 2020-01-08 APIs frozen. Only low-risk changes accepted on main branch.
      • RC1 2020-01-15 (F0+7) Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
      • RC2 2020-01-22 (RC1+7) Second artifacts posted.
      • Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
    • Vectorization
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Benchmarking AVF drivers on Arm servers - Jieqiang
      • VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
      • Check whether the performance tests include the AVF driver.
    • AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
      • Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
      • Will try one patch to enable N1SDP board.
      • Please try AVF with Mcbin if possible.
    • Investigate bihash operations, which are hot spots in L2 throughput
      • Applied prefetches and dual loop to the l2_fwd node, but failed to apply them to l2_learn
      • Lock-free allocation/free gives 7%-11% improvement on ThunderX2, but no improvement on x86 and Cortex-A72.
      • Cache misses and CRC32 calculation are possible opportunities.
        • To check cycles by applying CRC32 calculation unrolling
    • Bench-mark VPP on Dawn N1SDP board
      • Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
    • Investigating bi-hash lockless implementation - Jason
      • First apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
    • Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
    • EPIC for next quarter:
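
The release plan quoted in the VPP section above spaces each milestone seven days after the previous one (RC1 = F0+7, RC2 = RC1+7, formal release = RC2+7). A minimal Python sketch of that date arithmetic, starting only from the F0 date given in the minutes; the function name `milestones` is illustrative and not part of any VPP tooling.

```python
from datetime import date, timedelta

def milestones(f0: date, step_days: int = 7) -> dict:
    """Derive RC1, RC2 and release dates from the F0 (API freeze) date."""
    step = timedelta(days=step_days)
    rc1 = f0 + step
    rc2 = rc1 + step
    release = rc2 + step
    return {"F0": f0, "RC1": rc1, "RC2": rc2, "Release": release}

plan = milestones(date(2020, 1, 8))
for name, day in plan.items():
    print(f"{name}: {day.isoformat()}")
```

Starting from F0 on 2020-01-08, this reproduces the RC1, RC2 and release dates listed in the minutes, which is a quick way to check that Arm patches are lined up against the right cut-off.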

01/28/2020

  • Attendees
    • Govindarajan Mohandoss
    • Honnappa Nagarahalli
  • General
  • CSIT
  • FD.io lab
    • Two ThunderX2 servers are installed in Arm lab, but the Intel NIC cannot be enumerated on ThunderX2-01.
      • Cables for Intel NICs have been ordered.
      • Universal rails will be tried with ThunderX2 servers. If they work, will send the rails to FD.io lab.
    • Script/commands to verify the NICs are ready. Will try with the Mellanox NIC on ThunderX2-02 first - Lijian
    • Confirm the power cable requirements for the Vexxhost lab. Inform them about the 2 servers coming to the lab - Juraj
    • Current Configurations:
      • RAM: 256G
      • Disk: 480G SSD
      • The boxes are coming with Qlogic cards which are not supported in VPP.
    • Changes required to the servers:
      • The power cable specifications will be 4 x 6ft 14 AWG C13 to C14 Power Cables.
      • Need 2 Intel NICs XL710-QDA2 for each server.
      • If there is space for 2 more cards, we should add 2 Mellanox cards with SR-IOV capability.
      • Disk size to 480G
      • Both the servers should have the NICs in the same PCIe slots. It does not matter which PCIe slots the cards are in.
      • Cables: N1, P1 to N2, P1 and so on
      • Cables for IPMI and Management port: 2
    • Will call a meeting once the ThunderX2 servers reach Dean, inviting Juraj and the Vexxhost people at the FD.io lab to make sure the hardware is ready.
    • Require and install 1G NIC for the management port because of the 1G management switch. - Lijian
    • Require two ThunderX2 servers in the FD.io lab (currently there is only one ThunderX2 in the lab)
    • Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting rights for Arm servers in CSIT
    • ThunderX1
  • VPP
    • Align Arm patches with VPP release plan.
      • F0 2020-01-08 APIs frozen. Only low-risk changes accepted on main branch.
      • RC1 2020-01-15 (F0+7) Code complete, pull first release throttle branch, only bug fixes accepted on throttle train. After pull: main branch reopens for new feature / risky commits. First artifacts posted.
      • RC2 2020-01-22 (RC1+7) Second artifacts posted.
      • Formal Release 2020-01-29 (RC2+7) 20.01 release artifacts available
    • Vectorization
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Benchmarking AVF drivers on Arm servers - Jieqiang
      • VPP+DPDK (5.18Mpps/5.21Mpps) VS VPP+AVF (8.39Mpps/8.38Mpps) on ThunderX2.
      • Check whether the performance tests include the AVF driver.
    • AVF failed to create AVF interface on SMP CPU on N1SDP/Qualcomm - Jieqiang
      • Current N1SDP does not support SRIOV, so cannot run AVF on N1SDP.
      • Will try one patch to enable N1SDP board.
      • Please try AVF with Mcbin if possible.
    • Investigate bihash operations, which are hot spots in L2 throughput
      • Applied prefetches and dual loop to the l2_fwd node, but failed to apply them to l2_learn
      • Lock-free allocation/free gives 7%-11% improvement on ThunderX2, but no improvement on x86 and Cortex-A72.
      • Cache misses and CRC32 calculation are possible opportunities.
        • To check cycles by applying CRC32 calculation unrolling
    • Bench-mark VPP on Dawn N1SDP board
      • Finished single-flow benchmarking with L2/L3/input-ACL on the N1SDP board; will share the data.
    • Investigating bi-hash lockless implementation - Jason
      • First apply make_working_copy for all bihash update operations, and then apply RCU to make look-up lock-less
    • Internal CI is not working due to Python3.6 upgrade in vpp code repository. - Jieqiang
    • EPIC for next quarter:
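
The ThunderX2 figures above put the AVF driver ahead of the DPDK driver. A small Python sketch of the ratio, using only the two pairs of Mpps figures recorded in the minutes. The minutes do not say what the two values in each pair represent, so they are simply compared position by position; `speedup` is an illustrative helper.

```python
def speedup(baseline_mpps: float, candidate_mpps: float) -> float:
    """Return candidate throughput as a multiple of the baseline."""
    if baseline_mpps <= 0:
        raise ValueError("baseline throughput must be positive")
    return candidate_mpps / baseline_mpps

dpdk = (5.18, 5.21)  # VPP+DPDK, Mpps, from the minutes
avf = (8.39, 8.38)   # VPP+AVF, Mpps, from the minutes

ratios = [speedup(d, a) for d, a in zip(dpdk, avf)]
for d, a, r in zip(dpdk, avf, ratios):
    print(f"{a} / {d} Mpps = {r:.2f}x")
```

Both pairs work out to roughly 1.6x in favour of AVF on this platform.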

01/21/2020


01/14/2020

01/07/2020


12/17/2019

12/10/2019


12/03/2019

11/26/2019

11/19/2019

11/12/2019

10/29/2019

10/22/2019

10/15/2019

10/08/2019

10/01/2019

09/24/2019

09/17/2019

09/10/2019

09/03/2019

08/27/2019

08/20/2019

08/13/2019

08/06/2019

07/30/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
    • Christian Hopps
  • General
  • CSIT
  • FD.io lab
  • VPP
    • https://tools.ietf.org/html/draft-hopps-ipsecme-iptfs-01 - From Christian
    • Align Arm patches with VPP release plan.
      • Once our work items are added to release plan, the community is forced to review the patches and provide the feedback in a timely manner.
      • Will check the VPP release schedule and map it to the Arm quarterly plan.
      • Note down patches in community review and align them to VPP release plan.
      • It has been challenging to do that in VPP.
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue: optimize it with relaxed atomic intrinsics - Lijian
    • Vectorization
      • The eth_input_adv_and_flags_x4 optimization is upstreamed and under community review.
      • The patch is also enabled for x86. Will ask maintainer to review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
      • Spinlock with inner loop got improvement on both x86 and Arm.
      • Read/write lock got a little degradation with the patch.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
      • Jieqiang checked the video by Sirshak
    • Apply dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
      • Will do bench-marking profiling on mcbin/Bluefield.
    • Think of memory usage and optimization for smaller device/memory
    • Think about running VPP on big-endian CPUs, as there is mixed usage of GCC vector extensions and vector intrinsics
    • To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
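
The "spinlock with inner loop" item above is not spelled out in the minutes. One common reading is a test-and-test-and-set lock, where a waiter spins on a plain load and only retries the atomic exchange once the lock looks free. A minimal sketch under that assumption follows; the `sketch_` names are hypothetical and this is not VPP's spinlock code.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_bool locked; } sketch_spinlock_t;

static inline void sketch_spinlock_init(sketch_spinlock_t *l)
{
  atomic_init(&l->locked, false);
}

/* Single attempt: returns true if the lock was taken. */
static inline bool sketch_spinlock_trylock(sketch_spinlock_t *l)
{
  return !atomic_exchange_explicit(&l->locked, true, memory_order_acquire);
}

static inline void sketch_spinlock_lock(sketch_spinlock_t *l)
{
  /* Outer loop: the atomic exchange that actually takes the lock. */
  while (atomic_exchange_explicit(&l->locked, true, memory_order_acquire))
    {
      /* Inner loop: wait with relaxed loads only, so waiters read a
       * shared cache line instead of repeatedly writing to it. */
      while (atomic_load_explicit(&l->locked, memory_order_relaxed))
        ;
    }
}

static inline void sketch_spinlock_unlock(sketch_spinlock_t *l)
{
  atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

Whether the inner loop helps depends on contention and on the core's exclusive-access implementation, which is consistent with the minutes reporting different results for the spinlock and the read/write lock.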

07/23/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
    • Christian Hopps
  • General
  • CSIT
    • VPP Performance Test
    • Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
    • Trying to fix all the failures with daily test in performance test. Basically almost all the tests passed locally.
    • Only 1 out of 199 test cases failed, 8 test cases show random 'show interface' failure.
    • Some failures are related to 'show hardware'/'show interface'/'show vhost dump' time-outs.
    • Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
      • creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • VPP Path
      • Working on MAC learning test failures on Cortex-A72 server - Jieqiang
        • Enlarging the duration fixes the failure, but will investigate in more detail.
        • Issues have been fixed in latest master branch. Investigating the details.
      • cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
        • Send email and current debug details to community calling for volunteer to fix it. - Lijian
        • pmalloc module test cases failed on Arm server.
      • Changes are uploaded to community gerrit.
      • VPP VMs seem to come up well. Will work on the init script and bring up VPP.
        • VM tests passed. Patches are to be submitted for community review.
        • All the patches are merged and all images are built.
      • Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
      • Ed to help set up numad cluster with dual ThunderX and one ThunderX2
  • FD.io lab
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue: optimize with relaxed atomic intrinsics - Lijian
    • Vectorization
        • Patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
        • The patch is also enabled for x86. Will ask maintainer to review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
      • Spinlock with inner loop got improvement on both x86 and Arm.
      • Read/write lock got a little degradation with the patch.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
      • Inform MAP owner that Jieqiang will take care of MAP on VPP. - Lijian
    • Apply dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
      • Will do bench-marking profiling on mcbin/Bluefield.
    • Think of memory usage and optimization for smaller device/memory
    • Think about running VPP on big-endian CPUs, as there is mixed usage of GCC vector extensions and vector intrinsics
    • To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
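
For the items above on reducing acquire atomics and using relaxed atomic intrinsics in the message queue, the general idea can be illustrated with a single-producer, single-consumer ring. This is an assumption-laden sketch, not the VPP message queue: the `sketch_` names, the ring size, and the one-producer/one-consumer restriction are all invented for the example. Each side reads its own index with a relaxed load and uses acquire/release only on the index owned by the other side.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 8u /* hypothetical; must be a power of two */

typedef struct
{
  _Atomic uint32_t head; /* written only by the consumer */
  _Atomic uint32_t tail; /* written only by the producer */
  uint32_t slots[SKETCH_RING_SIZE];
} sketch_ring_t;

static void sketch_ring_init(sketch_ring_t *r)
{
  atomic_init(&r->head, 0);
  atomic_init(&r->tail, 0);
}

/* Producer side. Returns 0 on success, -1 if the ring is full. */
static int sketch_ring_enqueue(sketch_ring_t *r, uint32_t msg)
{
  /* Own index: nobody else writes it, so a relaxed load is enough. */
  uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
  /* Other side's index: acquire, so the slot is known to be consumed. */
  uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
  if (tail - head == SKETCH_RING_SIZE)
    return -1;
  r->slots[tail & (SKETCH_RING_SIZE - 1)] = msg;
  /* Release publishes the slot contents together with the new tail. */
  atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
  return 0;
}

/* Consumer side. Returns 0 on success, -1 if the ring is empty. */
static int sketch_ring_dequeue(sketch_ring_t *r, uint32_t *msg)
{
  uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
  uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
  if (head == tail)
    return -1;
  *msg = r->slots[head & (SKETCH_RING_SIZE - 1)];
  /* Release tells the producer the slot may be reused. */
  atomic_store_explicit(&r->head, head + 1, memory_order_release);
  return 0;
}
```

On a weakly ordered architecture such as AArch64, each acquire or release that can be downgraded to relaxed removes a barrier-like instruction from the hot path, which is the motivation behind these work items.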

07/16/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
    • Christian Hopps
  • General
  • CSIT
    • VPP Performance Test
    • Performance job is merged. https://jenkins.fd.io/view/csit/job/csit-vpp-perf-mrr-daily-master-3n-tsh
    • Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
      • creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • VPP Path
      • Working on MAC learning test failures on Cortex-A72 server - Jieqiang
        • Enlarging the duration fixes the failure, but will investigate in more detail.
        • Issues have been fixed in latest master branch. Investigating the details.
      • cross compilation in VPP PATH. Jira(Juraj): https://jira.fd.io/browse/CTP-3
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
        • Send email and current debug details to community calling for volunteer to fix it. - Lijian
      • Changes are uploaded to community gerrit.
      • VPP VMs seem to come up well. Will work on the init script and bring up VPP.
        • VM tests passed. Patches are to be submitted for community review.
        • The patch is split into three smaller pieces. Two patches (kernel image for VM test, and generic CSIT changes to support the ThunderX2 testbed) are merged. The third patch, with the Arm-specific code changes for the VM test that use the kernel image, is still to be merged.
      • Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
        • Docker images for both Arm and x86 are merged and available.
        • Docker image is verified on an Arm server; still to verify it on an x86 server and try it in Jenkins.
      • Ed to help set up numad cluster with dual ThunderX and one ThunderX2
  • FD.io lab
    • Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • Require one more ThunderX2 to form a normal cluster (1xThunderX + 2xThunderX2), to enable voting right for Arm servers in CSIT
      • It’s 1RU blade ThunderX2.
      • The machine will be handled by Dean’s team. Cambridge folk will set up the machine before sending it to FD.io lab.
      • The machine should have large RAM: more than 120G, with 256G preferred.
      • The machine should have three NICs (XL710-QDA2, 2x40G).
      • The script assumes the two ThunderX2 have the same NIC type, same fiber SFP type, and NICs are plugged into same PCI slots.
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue: optimize with relaxed atomic intrinsics - Lijian
    • Vectorization
        • Patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
        • The patch is also enabled for x86. Will ask maintainer to review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
      • Spinlock with inner loop got improvement on both x86 and Arm.
      • Read/write lock got a little degradation with the patch.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Apply dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
      • Will do bench-marking profiling on mcbin.
    • Think of memory usage and optimization for smaller device/memory
    • Think about running VPP on big-endian CPUs, as there is mixed usage of GCC vector extensions and vector intrinsics
    • To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

07/09/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
    • Christian Hopps
  • General
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
    • Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
      • creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • VPP Path
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
        • Send email and current debug details to community calling for volunteer to fix it. - Lijian
      • Changes are uploaded to community gerrit.
      • VPP VMs seem to come up well. Will work on the init script and bring up VPP.
        • VM tests passed. Patches are to be submitted for community review.
      • Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
        • Docker images for both Arm and x86 are merged and available.
      • Ed to help set up numad cluster with dual ThunderX and one ThunderX2
  • FD.io lab
    • Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
      • Update the current status to Pravin. - Lijian
      • The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
      • Require more than 120G RAM; 256G preferred
      • Three NICs and each has two ports.
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue: optimize with relaxed atomic intrinsics - Lijian
    • Vectorization
        • Patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
      • Spinlock with inner loop got improvement on both x86 and Arm.
      • Read/write lock got a little degradation with the patch.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Apply dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Submitted patches on applying on dpdk-input, ethernet-input, ip4-input, ip4-rewrite nodes
      • Will do bench-marking profiling on mcbin.
    • Think of memory usage and optimization for smaller device/memory
    • Think about running VPP on big-endian CPUs, as there is mixed usage of GCC vector extensions and vector intrinsics
    • To confirm the firmware, vpp side to enable mcbin cache stashing - Honnappa
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective


07/02/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
  • General
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
    • Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
      • creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • VPP Path
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
        • Send email and current debug details to community calling for volunteer to fix it. - Lijian
      • VPP VMs seem to come up well. Will work on the init script and bring up VPP.
      • Will confirm with Ed about where to upload VPP docker for VPP device - Juraj
      • Set up numad cluster with dual ThunderX and one ThunderX2
  • FD.io lab
    • Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
      • Update the current status to Pravin. - Lijian
      • The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
      • Require more than 120G RAM; 256G preferred
      • Three NICs and each has two ports.
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue, remove atomic intrinsics and use lock version only - Lijian
      • Have verified removing atomic intrinsics in message_queue alloc/free APIs, and require confirmation from Florin.
    • Vectorization
        • Patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
    • Spinlock/read-write lock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
    • Fix ip4_forward compiling - Jason
      • Will check gerrit CI/CD related with that patch. Check why it's not warning in gerrit Jenkins.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Spread dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Will do bench-marking profiling on mcbin.
    • Think of memory usage and optimization for smaller device/memory
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

06/25/2019

  • Attendees
    • Tina Tsou
    • Honnappa Nagarahalli
    • Lijian Zhang
    • Jieqiang Wang
    • Jason Zhang
    • Juraj Linkes
  • General
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server: https://jira.fd.io/browse/VPP-1569
    • Adding Taishan test bed to CSIT Status: https://gerrit.fd.io/r/#/c/16850/
      • creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • VPP Path
    • VPP Device
      • VPP tap interface is not working on all Arm servers. Works on stable/1810, but not working on stable/1901.
      • Crypto test cases will use the DPDK driver if configured, otherwise the native VPP implementation, with OpenSSL as the fallback
        • Will try Crypto test cases next week - Juraj
      • Juraj to send Lijian the details of vpp VMs, Lijian will confirm internally
  • FD.io lab
    • Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • Discuss requiring another ThunderX2 1U blade with Pravin and Tina - Lijian
      • Firstly will sponsor the machine
      • The machine will be handled by Jingjing's team. Cambridge folk will set up the machine before sending it to FD.io lab.
      • Require more than 120G RAM; 256G preferred
      • Three NICs and each has two ports.
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Message queue, remove atomic intrinsics and use lock version only - Lijian
      • Have verified removing atomic intrinsics in message_queue alloc/free APIs, and require confirmation from Florin.
    • Vectorization
        • Patch to optimize eth_input_adv_and_flags_x4 is upstreamed and under community review.
    • Spinlock optimization - Jason
      • Refactored spinlock and added test file for spinlock. Patches are under internal review.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Spread dual/quad optimization - Lijian
      • Benchmarking on Cortex-A72 with dpdk, ethernet-input, ip4 rewrite, tx-output nodes
      • Will do bench-marking profiling on mcbin.
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

06/18/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina Tsou
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj Linkes
  • General
  • CSIT
  • FD.io lab
    • Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue, understand use case with svm queue, talk the ideas with Florin - ongoing - Lijian
      • Will confirm with Florin to remove atomic intrinsics in message_queue alloc/free APIs
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - Upstreamed.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Investigate hyperscan plugin in VPP - Sirshak
    • Spread dual/quad optimization - ethernet-input
    • Redo perf/MAP profiling/bench-marking
      • DPI plugin?
    • EPIC for next quarter:
      • Apply dual/quad optimization on more data path nodes
      • Investigate and optimize VPP hash and bihash library
      • VPP translation overhead analysis btw Mbuf and VLIB buffer ENTNET-1293
      • VPP Memif performance analysis and optimization ENTNET-1292
      • VPP l3fwd performance analysis and optimization ENTNET-751
      • Using MAP with VPP ENTNET-1288
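
The dual/quad optimization mentioned above refers to processing two or four packets per loop iteration, with a single-packet loop for the remainder. The sketch below shows only the loop shape. The per-packet work (a TTL decrement) and all function names are hypothetical stand-ins, and the prefetching of upcoming buffers that a real graph node would typically do is omitted to keep the example portable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-packet work: decrement a TTL byte, saturating at 0. */
static inline void sketch_dec_ttl(uint8_t *ttl)
{
  if (*ttl > 0)
    (*ttl)--;
}

/* Reference version: one packet per iteration. */
static void sketch_dec_ttl_single(uint8_t *ttl, size_t n)
{
  for (size_t i = 0; i < n; i++)
    sketch_dec_ttl(&ttl[i]);
}

/* Quad-loop version: four independent packets per iteration give the
 * compiler and the core more instruction-level parallelism to exploit. */
static void sketch_dec_ttl_quad(uint8_t *ttl, size_t n)
{
  size_t i = 0;

  while (n - i >= 4)
    {
      sketch_dec_ttl(&ttl[i + 0]);
      sketch_dec_ttl(&ttl[i + 1]);
      sketch_dec_ttl(&ttl[i + 2]);
      sketch_dec_ttl(&ttl[i + 3]);
      i += 4;
    }

  /* Single loop handles the 0 to 3 packets left over. */
  while (i < n)
    {
      sketch_dec_ttl(&ttl[i]);
      i++;
    }
}
```

Both versions must produce identical results; the benefit, if any, shows up only in cycles per packet, which is why the minutes pair this work with benchmarking on specific cores.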

06/11/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina Tsou
    • Lijian Zhang
    • Jieqiang Wang
    • Juraj
  • General
  • CSIT
  • FD.io lab
    • Require two ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue, understand use case with svm queue, talk the ideas with Florin - ongoing - Lijian
      • Will confirm with Florin to remove atomic intrinsics in message_queue alloc/free APIs
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - Upstreamed.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Investigate hyperscan plugin in VPP - Sirshak
    • Spread dual/quad optimization - ethernet-input
    • Redo perf/MAP profiling/bench-marking
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

06/04/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina Tsou
    • Lijian Zhang
    • Jieqiang Wang
    • Stan
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue, understand use case with svm queue, talk the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - Upstreamed.
    • MAP with VPP - error is resolved. Sort of working. Record the details.
    • Investigate hyperscan plugin in VPP - Sirshak
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

05/28/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
    • Lijian Zhang
    • Vijay (vijayakumar.rajamanickam@nokia.com)
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue, understand use case with svm queue, talk the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - internal Review.
    • MAP with VPP - Tried internal patch; still failing. Continuing to work on it.
    • Investigate hyperscan plugin in VPP - Sirshak
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

05/21/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
    • Lijian Zhang
    • Vijay (vijayakumar.rajamanickam@nokia.com)
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • Investigating message queue, understand use case with svm queue, talk the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - internal Review.
    • MAP with VPP - Tried internal patch; still failing. Continuing to work on it.
    • Investigate hyperscan plugin in VPP - Sirshak
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

05/14/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
    • Lijian Zhang
    • Vijay (vijayakumar.rajamanickam@nokia.com)
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
    • VPP generic distro package building patch - Patch updated. Require Damjan's follow up review.
    • Investigating message queue, understand use case with svm queue, talk the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on MACCHIATObin shows some unstable performance.
    • Vectorization
      • ethernet-input causes performance drop on AArch64.
        • There is a performance drop after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - internal Review.
    • MAP with VPP - Tried internal patch; still failing. Continuing to work on it.
    • Investigate hyperscan plugin in VPP - Sirshak
      • DPI plugin?
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective

05/07/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
    • Lijian Zhang
    • Vijay (vijayakumar.rajamanickam@nokia.com)
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
      • Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input.
    • VPP generic distro package building patch - Patch updated; Damjan's follow-up review required.
    • Investigating message queue, understand use case with svm queue, talk the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on the MACCHIATObin is showing some unstable performance.
    • Vectorization
      • Vectorization in esp-encrypt, optimize memcpy_le. Upstreamed(https://gerrit.fd.io/r/#/c/18398/). - Lijian
      • ethernet-input causes performance drop on AArch64.
        • Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - internal Review.
    • MAP (Arm proprietary performance analysis tool) with VPP - Tried an internal patch; still failing. Continuing to work on it.
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
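To illustrate the ATOMIC_ACQUIRE and weak-memory-model items above: on AArch64 a load-acquire constrains how later memory accesses may be reordered, so it can be more expensive than a relaxed load. The sketch below is not VPP code. It uses C11 <stdatomic.h>, and the names mailbox_t, mailbox_publish, mailbox_try_consume, and mailbox_peek_relaxed are hypothetical. It shows the pairing where acquire is actually needed (reading data published by another thread), next to a plain status poll that can use a relaxed load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-slot mailbox; not a VPP data structure. */
typedef struct {
    uint32_t payload;   /* plain data written before 'ready' is set */
    atomic_bool ready;  /* publication flag */
} mailbox_t;

/* Producer: write the payload, then publish it with a release store so
 * the payload write cannot be reordered after the flag write. */
static void mailbox_publish(mailbox_t *m, uint32_t value)
{
    m->payload = value;
    atomic_store_explicit(&m->ready, true, memory_order_release);
}

/* Consumer: the acquire load pairs with the release store above, so a
 * true flag guarantees the payload read sees the published value. */
static bool mailbox_try_consume(mailbox_t *m, uint32_t *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&m->ready, memory_order_acquire))
        return false;
    *out = m->payload;
    /* Release store: the payload read above completes before the slot
     * is handed back. */
    atomic_store_explicit(&m->ready, false, memory_order_release);
    return true;
}

/* A pure status poll that reads no dependent data may use a relaxed load. */
static bool mailbox_peek_relaxed(mailbox_t *m)
{
    return atomic_load_explicit(&m->ready, memory_order_relaxed);
}
```

Whether a given acquire in a hot loop can be relaxed depends on whether any data read afterwards relies on that ordering; the sketch only shows the distinction, not a recommendation for a specific VPP call site.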

04/30/2019

  • Attendees
    • Sirshak Das
    • Honnappa Nagarahalli
    • Tina
  • General
  • CSIT
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
  • VPP
    • VPP host-stack Hotspots
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
      • Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input.
    • VPP generic distro package building patch - Patch updated; Damjan's follow-up review required.
    • Investigating message queue, understand use case with svm queue, talk the ideas with Florin - ongoing - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • VPP on the MACCHIATObin is showing some unstable performance.
    • Vectorization
      • Vectorization in esp-encrypt, optimize memcpy_le. Upstreamed(https://gerrit.fd.io/r/#/c/18398/). - Lijian
      • ethernet-input causes performance drop on AArch64.
        • Performance dropped after the ethernet-input optimization. The main reason is that, after the refactor, if promiscuous mode is enabled on the NIC, all traffic from the NIC falls into the so-called slow path.
        • A vectorized patch to optimize eth_input_adv_and_flags_x4 is under internal review.
    • TAS patch - internal Review.
    • MAP (Arm proprietary performance analysis tool) with VPP - Tried an internal patch; still failing. Continuing to work on it.
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
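The promiscuous-mode item above can be pictured roughly as follows: when the NIC no longer filters by destination MAC, software has to compare each frame's destination address against the interface address. The sketch below is a hypothetical illustration in portable C, not the eth_input_adv_and_flags_x4 code or the patch under review; dmac_load and dmac_match_x4 are made-up names. It handles four frames per call, the kind of four-at-a-time loop that lends itself to vectorization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load the 6-byte destination MAC at the start of an Ethernet frame into
 * a 64-bit integer. Byte order does not matter here because both sides
 * of the comparison are loaded the same way. */
static uint64_t dmac_load(const uint8_t *frame)
{
    uint64_t v = 0;
    memcpy(&v, frame, 6);
    return v;
}

/* Compare four frames against the interface MAC.
 * Returns a bitmask: bit i is set when frame i is addressed to us. */
static unsigned dmac_match_x4(const uint8_t *frames[4],
                              const uint8_t if_mac[6])
{
    assert(frames != NULL && if_mac != NULL);
    const uint64_t want = dmac_load(if_mac);
    unsigned mask = 0;
    for (unsigned i = 0; i < 4; i++)
        mask |= (unsigned)(dmac_load(frames[i]) == want) << i;
    return mask;
}
```

A caller could keep frames whose bit is set on the fast path and hand the rest to a slower per-packet path; that split is an assumption of this sketch, not a description of the actual node.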

04/23/2019

  • Attendees
    • Sirshak Das
    • Lijian Zhang
    • Juraj Linkeš
    • Vijay
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are well connected. Will prepare the server for VPP device testing. Now is working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
      • Investigate why these three blades have only one numa node - Juraj
  • VPP
    • Investigate session_queue_node_fn/vlib_worker_loop.
      • Decrease or remove ATOMIC_ACQUIRE atomics in foreach_device_and_queue
      • Investigate ((xd->flags & DPDK_DEVICE_FLAG_ADMIN_UP) == 0) in dpdk_device_input
    • Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
    • Investigating message queue, understand use case with svm queue, talk the ideas with Florin - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
    • Vectorization
    • TAS patch will be ready soon (Sirshak)
    • MAP with VPP is ongoing - Sirshak
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
  • Action Items - Last Week
  • Action Items - Next Week

04/16/2019

  • Attendees
    • Sirshak Das
    • Lijian Zhang
    • Juraj Linkeš
    • Vijay
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are well connected. Will prepare the server for VPP device testing. Now is working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
      • Investigate why these three blades have only one numa node - Juraj
  • VPP
    • Investigate session_queue_node_fn/vlib_worker_loop. - https://jira.arm.com/browse/ENTNET-1179 - Done
      • Will create two Jira tickets to track the findings. - Lijian
    • Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
    • Investigating message queue - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
        • Will resume Taishan host-stack setup - Lijian
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
    • Vectorization
    • EPIC for next quarter:
      • ethernet-input - Planned (Lijian). will implement for aarch64 128bits only
      • Message Queue - Planned (Lijian)
      • VPP svm_fifo patch performance optimization on A72 cores – Planned (Sirshak)
      • TAS patch (Sirshak)
      • MAP with VPP - Planned (Sirshak)
      • Roadmap for TCP optimization
        • Timer implementation - (Sirshak) - Indicative
        • perf analysis - Planned (Sirshak)
          • TCP state machine from weak memory model perspective
  • Action Items - Last Week
  • Action Items - Next Week

04/09/2019

  • Attendees
    • Sirshak Das
    • Lijian Zhang
    • Juraj Linkeš
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Honnappa Nagarahalli
  • General
  • CSIT
    • VPP Performance Test
    • List all the blockers on aarch64 in CSIT wiki page - Done - Stan or Juraj
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are well connected. Will prepare the server for VPP device testing. Now is working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
      • Investigate why these three blades have only one numa node - Juraj
  • VPP
  • VPP Hoststack
    • Investigate session_queue_node_fn/vlib_worker_loop. - https://jira.arm.com/browse/ENTNET-1179 - Done
    • Rebase VPP distro package building patch; contact Damjan in slack; Talk with Damjan in vpp meeting - Lijian & Sirshak
    • Investigating message queue - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Perf degradation is fixed. Investigating performance degradation on Bluefield - Sirshak
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK sample apps on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
    • Vectorization
      • Vectorization in esp-encrypt, https://gerrit.fd.io/r/#/c/18398/ - Improvement on ThunderX/OcteonTX/Taishan, but degradation on ThunderX2 - Lijian
      • ethernet-input - will implement for aarch64 128bits only
      • Create vectorization specific EPIC - Lijian
  • Action Items - Last Week
  • Action Items - Next Week
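As a rough illustration of the "128bits only" note for ethernet-input: a vectorized path can be compiled only on AArch64, with a scalar fallback elsewhere. The function below is a hypothetical sketch (count_ethertype_x8 is not a VPP function). It counts how many of eight 16-bit EtherType values equal a given value, using one 128-bit NEON register when __aarch64__ is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Count how many of the eight 16-bit values in 'types' equal 'want'. */
static unsigned count_ethertype_x8(const uint16_t types[8], uint16_t want)
{
    assert(types != NULL);
#if defined(__aarch64__)
    /* One 128-bit register holds all eight lanes. */
    uint16x8_t v = vld1q_u16(types);
    uint16x8_t eq = vceqq_u16(v, vdupq_n_u16(want)); /* 0xFFFF where equal */
    uint16x8_t ones = vshrq_n_u16(eq, 15);           /* 1 where equal */
    return vaddvq_u16(ones);                         /* horizontal sum */
#else
    /* Scalar fallback for other architectures. */
    unsigned n = 0;
    for (size_t i = 0; i < 8; i++)
        n += (types[i] == want);
    return n;
#endif
}
```

Both branches return the same result, so the function can be checked on an x86 build host before measuring the NEON path on Arm hardware.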

04/02/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPU cores to VPP main/VPP worker/iperf3 server, the VPP hoststack on both ThunderX2 and Taishan servers gives better performance than the Linux stack.
    • Investigate session_queue_node_fn/message queue data structure. - Investigating the source code
    • Review https://gerrit.fd.io/r/#/c/18398/ - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed. Perf degradation is fixed.
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • List all the blockers on aarch64 in CSIT wiki page - Stan or Juraj
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Build both binaries and packages with the generic option by default, and provide the Makefile variable NATIVE_OPTIMIZE=Y for end users to build native-optimized images.
      • Prepare an email and a draft patch asking for comments from the community - Lijian
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are well connected. Will prepare the server for VPP device testing. Now is working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • Require one ThunderX2(currently only one thunderX2 in the lab) in FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
      • Investigate why these three blades have only one numa node - Juraj
  • VPP
    • Write description/expectation for the two NEON-related patches - Lijian
    • Investigating performance degradation on CortexA72 - Sirshak
    • Message queue - Sirshak
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - To close it.
    • Vectorization
      • ethernet-input - no progress yet
    • 128B cache line size
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week
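For the 128B cache line item: one practical effect of the configured cache line size is how per-thread data is padded and aligned. The sketch below is hypothetical (CACHE_LINE_BYTES, worker_counter_t, worker_count, and counter_stride are illustrative names, not VPP's macros or types). It uses C11 _Alignas so that each worker's counter occupies its own 128-byte line, which avoids false sharing on cores with 128-byte lines at the cost of extra memory on cores with 64-byte lines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 128 /* assumption: build targets 128-byte lines */
#define N_WORKERS 4

/* Aligning the only member pads the whole struct to one cache line. */
typedef struct {
    _Alignas(CACHE_LINE_BYTES) uint64_t packets;
} worker_counter_t;

_Static_assert(sizeof(worker_counter_t) == CACHE_LINE_BYTES,
               "each counter must fill exactly one cache line");

static worker_counter_t counters[N_WORKERS];

/* Add n packets to one worker's private counter. */
static void worker_count(unsigned worker, uint64_t n)
{
    assert(worker < N_WORKERS);
    counters[worker].packets += n;
}

/* Distance in bytes between two adjacent workers' counters. */
static size_t counter_stride(void)
{
    return (size_t)((const char *)&counters[1] - (const char *)&counters[0]);
}
```

Changing CACHE_LINE_BYTES to 64 halves the memory used per worker, which is one reason a single image built for 128-byte lines can behave differently across the platforms listed above.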

03/26/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Nitin
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPU cores to VPP main/VPP worker/iperf3 server, the VPP hoststack on both ThunderX2 and Taishan servers gives better performance than the Linux stack.
    • Investigate session_queue_node_fn/message queue data structure. - Investigating the source code
    • Review https://gerrit.fd.io/r/#/c/18398/ - Lijian
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed. Perf degradation is fixed.
      • Octeon-Tx Status(Sirshak): Done by Malvika. Running DPDK on it now.
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Build both binaries and packages with the generic option by default, and provide the Makefile variable NATIVE_OPTIMIZE=Y for end users to build native-optimized images.
      • Prepare an email and a draft patch asking for comments from the community - Lijian
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • QSFP+ is available and working now.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: server is set up. Management connection works. Intel NICs are well connected. Will prepare the server for VPP device testing. Now is working on containers for VPP device. Will probably be able to run VPP device tests manually this week.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Management connection thru QSFP+ switch is working now. Moving VPP device work to ThunderX1 blade servers.
      • Will use these four new ThunderX1 servers for CI (Jenkins) to replace the previous three old ThunderX1 servers.
      • These four ThunderX1 blades are not identical. The first one has two NUMA nodes, and the other three blades have one NUMA node.
      • Investigate why these three blades have only one numa node - Juraj
  • VPP
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - To close it.
    • Vectorization
      • ethernet-input - no progress yet
    • 128B cache line size
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

03/19/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPU cores to VPP main/VPP worker/iperf3 server, the VPP hoststack on both ThunderX2 and Taishan servers gives better performance than the Linux stack.
    • vlib_worker_loop and session_queue_node_fn are two major hot-spots. - Just started
    • Enable NEON instruction in Buffer pool free function. Patch is committed.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Upstreamed, but still working on issues, e.g., performance degradation
      • Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Done by Malvika.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
      • Prepare an email and a draft patch asking for comments from the community - Lijian
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
        • Juraj to resend email to Mahamad about the details, including Sirshak and Tina
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj has yet to try them.
        • Confirm with Gorka whether their mcbin can support Docker. If yes, ask them to provide an image with their latest kernel/file system/dtd
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node. Also blocked by QSFP+ issue.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj setup call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another tkt for wiring the ThunderX2.
  • VPP
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - In internal review.
    • Vectorization
    • 128B cache line size
      • VPP image with 128B cache line size crashed on ThunderX2 - Cannot reproduce crash with my setup
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week
    • Commit VPP distro making patch - Lijian
    • Plug a 25G NIC into the Taishan server, and connect the 25G ports to the x86 25G NIC - Lijian
    • Follow Jianlin's suggestion, update U-Boot and kernel, and then sync up with Juraj - Lijian

03/12/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
    • Tina to update the meeting notice.
  • VPP Hoststack
    • After assigning dedicated CPU cores to VPP main/VPP worker/iperf3 server, the VPP hoststack on both ThunderX2 and Taishan servers gives better performance than the Linux stack.
    • vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
    • Enable NEON instruction in Buffer pool free function. Patch is committed.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. All test failures are resolved.
      • Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Switched to Malvika.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
      • Prepare an email and a draft patch asking for comments from the community - Lijian
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - firstly create trending graph from daily data; then create release report(require some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters; Juraj has yet to try them.
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj setup call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another tkt for wiring the ThunderX2.
  • VPP
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - In internal review.
    • Vectorization
    • 128B cache line size
      • VPP image with 128B cache line size crashed on ThunderX2
    • thunderx2 crashing - No update
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week
    • Commit VPP distro making patch - Lijian
    • Plug a 25G NIC into the Taishan server, and connect the 25G ports to the x86 25G NIC - Lijian
    • Follow Jianlin's suggestion, update U-Boot and kernel, and then sync up with Juraj - Lijian

03/05/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPU cores to VPP main/VPP worker/iperf3 server, the VPP hoststack on both ThunderX2 and Taishan servers gives better performance than the Linux stack.
    • vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. All test failures are resolved.
      • Octeon-Tx Status(Sirshak): yet to try steps from gorka for usb ubuntu rootfs installation. - Switched to Malvika.
    • VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate (reason unknown).
  • CSIT
    • VPP Performance Test
    • Package installation error. Status (Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - first create trending graph from daily data, then create release report (requires some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel Migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters, Juraj yet to try.
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj set up call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560 - No progress
      • Investigate with latest VPP code on x86 server - Lijian - Send email to vpp-dev mailing list if there's a problem. Will not put much effort into this.
    • Vectorization
      • ethernet-input
      • buffer pools
    • 128B cache line size
      • Will try this on Taishan server - Slight performance degradation with 128-byte cache line
    • thunderx2 crashing - No update
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

02/26/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPUs to VPP main/VPP worker/iperf3 server, VPP hoststack on both ThunderX2 and Taishan servers gives better performance than the Linux stack.
    • el0_sys hot-spot on Taishan D05 only, no plan to fix it.
    • vlib_worker_loop and session_queue_node_fn are two major hot-spots. - No progress
    • memcpy optimization
      • memcpy patch verification on Taishan by Khem with L3 forwarding use case - Lijian. Status (Khem): No updates.
      • memcpy patch consumes more clocks in OcteonTX2 - updated by Nitin.
      • Stopped working on this patch.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch. Test failure on SCTP, not root-caused yet.
      • Octeon-Tx Status(Sirshak): yet to try steps from Gorka for USB Ubuntu rootfs installation. - Switched to Malvika.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate. (Reason unknown.)
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config, almost done with patch https://gerrit.fd.io/r/#/c/16837/ - Done
      • b. merging CSIT patch. - Closing done
      • c. creating a job. - Everything is ready except the docker image
    • Target: master trending job - first create trending graph from daily data, then create release report (requires some manual work)
    • Add license header/copyright to scripts - Sirshak/Honnappa to confirm with Andy Waffa
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel Migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters, Juraj yet to try.
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj set up call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Buffer Pools per NUMA
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
      • Investigate with latest VPP code on x86 server - Lijian - Send email to vpp-dev mailing list if there's a problem. Will not put much effort into this.
    • Vectorization
      • ethernet-input
      • buffer pools
    • 128B cache line size
      • Will try this on Taishan server - Slight performance degradation with 128-byte cache line
    • Qualcomm no change iperf3
    • thunderx2 crashing - No update
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

02/19/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • After assigning dedicated CPUs to VPP main/VPP worker/iperf3 server, VPP hoststack on both ThunderX2 and Taishan servers gives better performance than the Linux stack.
    • memcpy optimization
      • memcpy patch verification on Taishan by Khem with L3 forwarding use case - Lijian. Status (Khem): No updates.
      • memcpy patch consumes more clocks in OcteonTX2 - updated by Nitin.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch.
      • Octeon-Tx Status(Sirshak): yet to try steps from Gorka for USB Ubuntu rootfs installation.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
    • With VPP running on the Arm side, the x86 iperf3 client observes an unstable performance rate. (Reason unknown.)
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible. https://jira.fd.io/browse/VPP-1566
    • Installed VPP crashed on Taishan server, https://jira.fd.io/browse/VPP-1569
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
    • Target: master trending job
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel Migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters, Juraj yet to try.
      • Confirm whether Jianlin's board has exactly the same pluggable switches as Juraj's boards - Lijian
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj set up call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Buffer Pools per NUMA
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
    • 1GB page taking long time Status: fixed.
      • Investigate with latest VPP code on x86 server
    • Vectorization
      • ethernet-input
      • buffer pools
      • memcpy
    • 128B cache line size
      • Will try this on Taishan server - Lijian
    • Qualcomm no change iperf3
    • thunderx2 crashing - No update
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

02/11/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • memcpy optimization
      • memcpy patch verification on Taishan by Khem with L3 forwarding use case - Lijian. Status (Khem): No updates.
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation Status(Sirshak): no updates.
      • svm_fifo: Status(Sirshak): Working on fixing VPP Path errors from svm_fifo patch.
      • Octeon-Tx Status(Sirshak): yet to try steps from Gorka for USB Ubuntu rootfs installation.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
  • CSIT
    • VPP Performance Test
    • Package Installation error Status(Juraj): interfaces configured on NUMA nodes 2 and 3 are not visible.
    • Estimates from Khem and Stan/Juraj. Status: https://gerrit.fd.io/r/#/c/16850/
      • a. Host Config
      • b. merging CSIT patch.
      • c. creating a job.
    • Target: master trending job
    • VPP Path
    • VPP Device
      • thunderx Status: 1-node topology was rewired because of QSFP+ switch.
      • mcbin: Kernel Migration on mcbin. Status: was able to update U-Boot but not boot the new kernel. Jianlin suggested different boot parameters, Juraj yet to try.
      • thunderx2: Status: Talk to edk about deployment strategy with 1-node.
  • FD.io lab
    • ThunderX1
      • QSFP+ switch for ThunderX1 Status: ONL OS to be installed on QSFP+ switch.
      • Juraj set up call with LF people. Status: Done.
    • ThunderX2
      • Cables: Sent. Juraj to open another ticket for wiring the ThunderX2.
  • VPP
    • Buffer Pools per NUMA
    • Verify effects and make NEON changes Jira: https://jira.fd.io/browse/VPP-1560
    • 1GB page taking long time Status: fixed.
    • Vectorization
      • ethernet-input
      • buffer pools
      • memcpy
    • 128B cache line size
    • Qualcomm no change iperf3
    • thunderx2 crashing
    • Taishan/A72 Status: Khem to try 128B cache line on taishan (performance difference).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week

02/05/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Gorka
    • Fede
    • Honnappa Nagarahalli
  • General
  • VPP Hoststack
    • memcpy optimization
      • Check that the optimized memory copy version is deployed on Taishan and ThunderX2 at runtime - Lijian
      • Send memcpy patch to Khem and Fede for further verification - Lijian. Status: Fede: small improvement on mcbin with iperf3; Khem to try it with L3 forwarding
    • iperf3 performance with Hoststack.
      • ip4_local_inline quad loop under investigation
      • Working on svm_fifo alternate version with front and back pointers synchronized instead of cursize.
    • Verifying per NUMA node buffer pool https://gerrit.fd.io/r/#/c/16638/
      • Sirshak created Jira ID in FD.io Jira: https://jira.fd.io/browse/VPP-1560
      • Hanging of VPP is actually VPP taking a lot of time to allocate 400K chunks for 1GB - Damjan has this in his todo list
      • gcc-8 compilation still fails on ARM.
      • Octeon-Tx failure. Status: unknown
    • Gorka is trying some optimal configs for VCL. Status: no updates.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
    • OcteonTx boots to buildroot with no dhclient hence an impasse. Still not clear how to use USB stick.
  • CSIT
    • VPP Path
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx. Status: no updates.
      • Kernel Migration on mcbin. Status:
      • ThunderX2:
    • VPP Performance Test
      • Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
      • Juraj to come up with a solution for NUMA node anomaly in Taishan.
      • https://gerrit.fd.io/r/#/c/16850/ Status: Juraj has a version all ready to work. Package installation blocker.
      • Package installation error Status: Juraj to investigate logs.
  • FD.io lab
    • ThunderX1 -
      • New QSFP+ switch for ThunderX1 is available now: QSFP+ to be connected to SFP+ switch.
      • Juraj to setup a call with LF folks on.
    • ThunderX2 -
      • Andy still waiting for cables.
      • Juraj to remind Andy of when the cable will be available.
      • Juraj to follow up on ssh connectivity to thunderx2.
  • VPP
    • Vectorization
      • [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
      • [Lijian] Check if setting default cache line size to 128 will degrade throughput on Taishan/Qualcomm/ThunderX2
        • no perf diff in Qualcomm
        • vpp crashes on thunderx2
        • waiting for results on A72 (Taishan)
      • [Sirshak] on ethernet-input node, investigate vectorized buffer index, Damjan's per numa node buffer pool patch. Status: No updates
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
  • Action Items - Next Week
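
Note on the svm_fifo item above (front and back pointers synchronized instead of cursize): this describes a common single-producer/single-consumer pattern in which each side writes only its own index and occupancy is derived from the two indices. A minimal C11 sketch of that general idea follows. It is not the actual svm_fifo patch, and the names spsc_ring_t, spsc_init, spsc_enqueue, spsc_dequeue and RING_SIZE are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical capacity; must be a power of two */

typedef struct {
    _Atomic uint32_t head; /* written only by the consumer */
    _Atomic uint32_t tail; /* written only by the producer */
    uint8_t data[RING_SIZE];
} spsc_ring_t;

static void spsc_init(spsc_ring_t *r) {
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side: occupancy is tail - head, so no shared size counter. */
static bool spsc_enqueue(spsc_ring_t *r, uint8_t byte) {
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == RING_SIZE)
        return false; /* full */
    r->data[tail & (RING_SIZE - 1)] = byte;
    /* Release: the data store becomes visible before the new tail. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side: reads tail with acquire to see the producer's data. */
static bool spsc_dequeue(spsc_ring_t *r, uint8_t *out) {
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (tail == head)
        return false; /* empty */
    *out = r->data[head & (RING_SIZE - 1)];
    /* Release: the slot is handed back only after it has been read. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}
```

The acquire/release pairing matters on weakly ordered CPUs such as AArch64, where a plain shared counter would need stronger synchronization on every update.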

01/29/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Lijian Zhang
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Honnappa Nagarahalli
    • John Ddigilio
  • General
  • VPP Hoststack
    • TaiShan Server with Debian distro crashed on the 'ip probe-neighbor' command when running VPP hoststack with iperf3
    • With 64-byte packets, on ThunderX2, 10G NIC, VPP hoststack bandwidth is about 1/2 of Linux Kernel stack.
    • With 64-byte packets, on Taishan, 10G NIC, VPP hoststack bandwidth is about 2x of Linux Kernel stack.
    • Memory copy patch gives 4% improvement on VPP hoststack on Taishan server.
    • Check that the optimized memory copy version is deployed on Taishan and ThunderX2 at runtime - Lijian
    • Send memcpy patch to Khem and Fede for further verification - Lijian
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo, ip4_local_forward node and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Verifying https://gerrit.fd.io/r/#/c/16638/ - Supposed to give better performance, but VPP hangs with this patch on some Arm machines.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
    • OcteonTX has been received in the Arm lab. Will boot it up first and then start profiling with it.
  • FD.io lab
    • ThunderX1 -
      • New Arista switch for ThunderX1 is available now. Gathering details required by LF lab before sending the switch to CSIT lab. - Juraj - Andy will try to send the switch to CSIT this Thursday.
    • ThunderX2 -
      • Cable type is confirmed. Procurement is in process.
      • Juraj to remind Andy of when the cable will be available.
      • Require access to these servers in FD.io lab. Anton gives the IP to access them.(ADMIN/ADMIN)
  • CSIT
    • VPP Path
      • So far so good.
      • ARM CI results are overwritten by x86 machines. Should be a Jenkins issue. Monitor whether this corner case happens again. - Juraj
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts. Juraj is able to modify and execute the scripts in the container. Things to fix are the scripts for 1-link 1-node topology and interface binding to VPP. Is able to successfully run a traffic test.
      • Kernel Migration on mcbin. Juraj is able to build all the images, but got kernel panic. Try with a more recent U-Boot version. Tried latest U-Boot image, but still has the same issue.
      • Juraj to investigate further work once ThunderX2 is available.
    • VPP Performance Test
  • VPP
    • Vectorization
      • [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
      • [Lijian] Check if setting default cache line size to 128 will degrade throughput on Taishan/Qualcomm/ThunderX2
      • [Sirshak] on ethernet-input node, investigate vectorized buffer index.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
    • [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable both are merged. No test cases remain in the blacklist for AArch64 machines.
    • [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
  • Action Items - Next Week
    • [Sirshak] -

01/22/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Lijian Zhang
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Honnappa Nagarahalli
    • John Ddigilio
  • General
  • VPP Hoststack
    • TaiShan Server with Debian distro crashed on the 'ip probe-neighbor' command when running VPP hoststack with iperf3
    • With 64-byte packets, on ThunderX2, 10G NIC, VPP hoststack bandwidth is about 1/4 of Linux Kernel stack.
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo, ip4_local_forward node and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
    • OcteonTX has been received in the Arm lab. Will boot it up first and then start profiling with it.
  • FD.io lab
    • ThunderX1 -
      • New Arista switch for ThunderX1 is available now. Gathering details required by LF lab before sending the switch to CSIT lab. - Juraj - Andy will try to send the switch to CSIT this Thursday.
    • ThunderX2 -
      • Cable type is confirmed. Procurement is in process.
      • Require access to these servers in FD.io lab.
  • CSIT
    • VPP Path
      • So far so good.
      • ARM CI results are overwritten by x86 machines. Should be a Jenkins issue. Monitor whether this corner case happens again. - Juraj
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts. Juraj is able to modify and execute the scripts in the container. Things to fix are the scripts for 1-link 1-node topology and interface binding to VPP.
      • Kernel Migration on mcbin. Juraj is able to build all the images, but got kernel panic. Try with a more recent U-Boot version.
      • Juraj to investigate further work once ThunderX2 is available.
    • VPP Performance Test
      • Work ongoing on writing scripts for performance jobs.
      • Development of L2 test script is under way.
      • Khem will get L2 working in CI first, then IP4 and other test cases.
      • Stan has started working on performance scripts with Khem. Is able to connect to Taishan machines in CSIT lab.
      • The performance topology in the wiki link is to be updated per the file below.
      • https://github.com/FDio/csit/blob/master/docs/lab/Testbeds_Xeon_Skx_Arm_Atom.md
      • Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
        • Install Ubuntu-18.04 on Huawei Taishan servers first, and then investigate upstreaming performance test framework to enable AArch64
        • Lijian to verify Ubuntu-18.04 on Taishan server.
      • Stan installed latest CSIT scripts on packet generator server (x86 NEON) and Taishan servers in FD.io lab.
      • https://gerrit.fd.io/r/#/c/16850/
      • Some of L2 and L3 test cases passed.
  • VPP
    • Vectorization
      • [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
      • [Sirshak] on ethernet-input node, investigate vectorized buffer index.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
    • [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable both are merged. No test cases remain in the blacklist for AArch64 machines.
    • [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
  • Action Items - Next Week
    • [Sirshak] - To update patch list in VPP/Aarch64 wiki

01/15/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Lijian Zhang
    • Stanislav Chlebec
    • Khemendra Kumar
    • Tina Tsou
    • Andy Wang
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Honnappa Nagarahalli
    • John Ddigilio
  • General
  • VPP Hoststack
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
    • OcteonTX has been received in the Arm lab. Will boot it up first and then start profiling with it.
  • FD.io lab
    • ThunderX2 -
      • New Arista switch is available now. Gathering details required by LF lab before sending the switch to CSIT lab. - Juraj
      • Cable type is confirmed. Procurement is in process.
  • CSIT
    • VPP Path
      • IP4 reassembly and GBP failures are fixed. Patches to enable both are merged. No test cases remain in the blacklist for AArch64 machines.
      • We have voting verify on bionic. Upload nexus disabled but merge job working. - Juraj created LF ticket for nexus upload. Both master merge job and verifying job are working fine.
      • ARM CI results are overwritten by x86 machines. Should be a Jenkins issue. Monitor whether this corner case happens again. - Juraj
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
      • Kernel Migration on mcbin. Juraj is able to build all the images.
    • VPP Performance Test
      • Work ongoing on writing scripts for performance jobs.
      • Development of L2 test script is under way.
      • Khem will get L2 working in CI first, then IP4 and other test cases.
      • Stan has started working on performance scripts with Khem. Is able to connect to Taishan machines in CSIT lab.
      • The performance topology in the wiki link is to be updated per the file below.
      • https://github.com/FDio/csit/blob/master/docs/lab/Testbeds_Xeon_Skx_Arm_Atom.md
      • Stan and Khem to come up with a summary of current status and an estimate of at least upstreaming basic L2/L3 performance suites.
  • VPP
    • Vectorization
      • [Lijian] Macro benchmarking on ThunderX2/Centriq(4%)/Taishan D05(10%) is done, data is updated into Jira. Code is in internal review.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
    • [Juraj] IP4 reassembly and GBP failures are fixed. Patches to enable both are merged. No test cases remain in the blacklist for AArch64 machines.
    • [Juraj] Kernel Migration on mcbin. Juraj is able to build all the images.
  • Action Items - Next Week
    • [Sirshak] - To update patch list in VPP/Aarch64 wiki

01/08/2019

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Lijian Zhang
    • Stanislav Chlebec
    • Khemendra Kumar
  • General
  • VPP Hoststack
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working and checking the test cases.
    • [Lijian] Working on IP4 reassembly and GBP failures. - fixed. Juraj has upstreamed patches to enable these two tests.
    • [Sirshak] Kernel Migration mcbin. Juraj is working on it based on Jianlin's suggestion.
    • [Andy] Getting a new Arista switch next year.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy - Macro benchmarking is done and data is updated to Jira.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC. - In internal review.
  • CSIT
    • VPP Path
  • VPP Path Failures
      • We have voting verify on bionic. Upload nexus disabled but merge job working. - Juraj created LF ticket for nexus upload. Both merge job and verifying job are working fine.
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
      • thunderx2: Juraj working with LF to get this resolved.
      • mcbin: Juraj can contact Jianlin if needed.
    • VPP Performance Test
      • Work ongoing on writing scripts for performance jobs.
      • Development of L2 test script is under way.
      • Khem will get L2 working in CI first, then IP4 and other test cases.
      • Stan is starting work on VPP performance test. Khem to send email to Stan on VPP performance testing.
  • FD.io lab
    • New Arista switch to be procured next year.
    • ThunderX2 - Racked. Andy is trying to buy cables compatible with Intel XL710. Juraj to confirm info required by lab people before sending out the cables.
  • Action Items - Next Week

12/18/2018

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Tina Tsou
    • Stanislav Chlebec
    • Avinash
    • Khemendra
  • General
  • VPP Hoststack
    • iperf3 performance with Hoststack.
      • Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one fd.io setup for everyone debugging VPP hoststack.
    • Gorka is trying some optimal configs for VCL. - No Updates.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. - Bootstrap script common.sh working.
    • [Lijian] Working on IP4 reassembly and GBP failures. - Some preliminary work on GBP, waiting for Neale. Juraj to give Lijian access to investigate on ThunderX.
    • [Sirshak] Kernel Migration mcbin. Status: Jianlin to work with Juraj to get fd.io mcbins up and running. Sirshak to setup a meeting.
    • [Andy] Getting a new Arista switch next year.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy - Still benchmarking and setting it up for internal review.
      • [Lijian] Patch for compiling issue with GCC-8.x is under community review. Status: No updates.
      • [Lijian] Patch for fixing StringTest failure is under community review. Status: Abandoned.
      • [Lijian] Patch for CDP failure is under community review. Status: No updates.
    • Memory Ordering
      • [Sirshak] svm_fifo lockless alternate algorithm for SPSC.
  • CSIT
    • VPP Path
  • VPP Path Failures
    • https://jira.fd.io/browse/VPP-1475 - IP4 random reassembly failure in master, also seen on x86
    • https://jira.fd.io/browse/VPP-1491 - GBP L3/L2 Endpoint Learning failure
      • We have voting verify on bionic. Upload nexus disabled but merge job working. Juraj to create LF ticket for nexus upload.
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx. Basic skeleton of docker topology done. Moving on to deploying the scripts.
      • thunderx2: Sirshak working with LF to get this resolved.
      • mcbin: Sirshak to setup a meeting between Juraj and Jianlin.
    • VPP Performance Test
      • Work ongoing on writing scripts for performance jobs.
      • Development of L2 test script is under way.
      • Khem will get L2 working in CI first, then IP4 and other test cases.
  • FD.io lab
    • New Arista switch to be procured next year.
    • ThunderX2 - Racked. IPMI Static IP configuration missing. Sirshak with LF.
  • Action Items - Next Week

12/11/2018

  • Attendees
    • Sirshak Das
    • Juraj Linkeš
    • Tina Tsou
    • Stanislav Chlebec
  • VPP Hoststack
    • iperf3 performance with Hoststack. - Sirshak has done some preliminary benchmarking, comparing kernel and VPP hoststack performance. Three cases: kernel to kernel; kernel to VPP hoststack; VPP hoststack to VPP hoststack
    • ongoing perf analysis. One patch(https://gerrit.fd.io/r/#/c/16184/) is merged, and the other one is under internal review.
    • Investigating lock-less fifo and memory reordering for VPP hoststack - Sirshak
    • Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info
    • Gorka is trying some optimal configs for VCL.
    • VPP on both sides (iperf3 server and client) gives a boost. (Reason unknown.)
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only L2 CSIT performance suite. Two scripts of L2 performance suites for the CI management repository are done; investigating the same for the CSIT repository; three more scripts to be developed.
    • [Lijian] Working on IP4 reassembly and GBP failures
    • [Sirshak] Kernel Migration mcbin. Status: Sirshak to try inputs from Garcia and Damjan. - no progress so far. - To confirm with Jianling and Joyce.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy - Second priority, no update so far.
      • [Lijian] Patch for compiling issue with GCC-8.x is under community review.
      • [Lijian] Patch for fixing StringTest failure is under community review.
      • [Lijian] Patch for CDP failure is under community review.
    • Memory Ordering
      • [Sirshak] To start work on Arithmetic and Logic relaxed functions.
  • VPP Path failures
  • CSIT
    • VPP Path
      • Actually, everything is ready. The only thing is to get CI patch merged.
    • VPP Device
      • thunderx: 1-node topology on cavium thunderx is in place, but there are errors. Will continue investigation.
      • thunderx2: Racked. Lack of static IP. Sirshak gave Anton a workaround for the missing static IP.
      • mcbin: Kernel issue yet to try suggestion from Garcia and Damjan. To confirm with Jianling and Joyce - Lijian
    • VPP Performance Test
      • Work ongoing on writing scripts for performance jobs.
      • Development of L2 test script is under way. Khem will get L2 working in CI first, then IP4 and other test cases.
  • FD.io lab
    • Arista switch is still not working. Andy and Anton are working out the exact requirements for the switch. Two possible options: Andy to replace the Arista or buy a new one.
    • ThunderX2 - Racked. Lack of IP.
  • Action Items - Next Week
    • [Lijian] to continue to investigate make test failures.
    • [Andy] to work with Anton to resolve Arista problem.

12/04/2018

  • Attendees
    • Sirshak Das
    • Andy Wang
    • Juraj Linkeš
    • Khemendra
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Tina Tsou
  • VPP Hoststack
    • iperf3 performance with Hoststack: Sirshak has done some preliminary benchmarking comparing kernel and VPP hoststack performance. Three cases: kernel to kernel; kernel to VPP hoststack; VPP hoststack to VPP hoststack.
    • Perf analysis is ongoing. Two patches are in progress: one is upstreamed and the other is under internal review. Hotspots are in memory copy, or possibly elsewhere.
    • Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info.
    • Gorka is trying some optimal configs for VCL.
    • Running VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only the L2 CSIT performance suite. Two L2 performance suite scripts for the CI management repository are done; the CSIT repository side is under investigation, and three more scripts remain to be developed.
    • [Lijian] VPP dlmalloc crash issue root-caused and fixed by maintainer. Florin Coras fixed time-out issues.
    • [Sirshak] Kernel migration on mcbin. Status: Sirshak to try inputs from Garcia and Damjan; no progress so far. To confirm with Jianling and Joyce.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy - Second priority, no update so far.
      • [Lijian] Patch for the compilation issue with GCC-8.x is under internal review.
      • [Lijian] Patch for fixing StringTest failure is under internal review.
    • Memory Ordering
      • [Sirshak] To start work on Arithmetic and Logic relaxed functions.
  • CSIT
    • VPP Path
    • VPP Device
      • thunderx: 1-node topology on Cavium ThunderX is in place, but there are errors. Will continue investigation.
      • thunderx2: Racked. Lacks an IP address. To confirm with Anton.
      • mcbin: Kernel issue; the suggestion from Garcia and Damjan has yet to be tried. To confirm with Jianling and Joyce - Lijian
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • Development of the L2 test script is underway. Khem will get L2 working in CI first, then IP4 and other test cases.
  • FD.io lab
    • Arista switch is still not working. Andy and Anton are working out the exact requirements for the switch. Two possible options: Andy to replace the Arista or buy a new one.
    • ThunderX2 - Racked. Lacks an IP address.
  • Action Items - Next Week
    • [Lijian] to continue to investigate make test failures.
    • [Andy] to work with Anton to resolve Arista problem.


11/27/2018

  • Attendees
    • Juraj Linkeš
    • Khemendra
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Tina Tsou
  • VPP Hoststack
    • iperf3 performance with Hoststack: Sirshak has done some preliminary benchmarking comparing kernel and VPP hoststack performance.
    • Perf analysis is ongoing, with two patches in progress. Hotspots are in memory copy, or possibly elsewhere. Will share patches with the community. - Sirshak
    • Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info.
    • Gorka is trying some optimal configs for VCL.
    • Running VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • Alternate test cases.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only the L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs. Some scripts still need to be prepared: first understand how the script works, then add more options.
    • [Lijian] Status on VPP path failures. Status: Still debugging. Still at early study stage.
    • [Sirshak] Kernel migration on mcbin. Status: Sirshak to try inputs from Garcia and Damjan; no progress so far.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy
    • Memory Ordering
      • [Sirshak] To start work on Arithmetic and Logic relaxed functions.
  • CSIT
    • VPP Path
      • 3 failures currently stalling deployment.
      • VPP-1476, VPP-1475, VPP-1478
      • These failures are seen on Debian x86 VM also.
      • Parallelization (n=32) is resulting in failures. These also seem to be caused by the two patches below.
      • VPP-1490, caused by https://gerrit.fd.io/r/#/c/15106/ and https://gerrit.fd.io/r/#/c/15534/.
      • VPP-1491 and VPP-1497, about parallelization and GBP failures, are filed.
      • Get CSIT/AArch64 to pass with partial test cases - Juraj
    • VPP Device
      • thunderx: Juraj created an LF ticket for wiring the 1-node topology on Cavium ThunderX.
      • thunderx2: to be racked by this Friday.
      • mcbin: Kernel issue; the suggestion from Garcia and Damjan has yet to be tried.
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • The L2 test now works manually. Khem is trying to get it working in CI, followed by IP4 and other test cases.
  • FD.io lab
    • Arista switch is missing a cable. Andy will send the tracking number for the cables.
    • ThunderX2 - to be racked by this Friday.
  • Action Items - Next Week
    • [Lijian] to investigate VPP-1490 issue.
    • [Andy] to send the tracking number for the cables.

11/20/2018

  • Attendees
    • Sirshak Das
    • Andy Wang
    • Juraj Linkeš
    • Khemendra
    • Garcia
    • Manuel
    • Gorka
    • Fede
    • Tina Tsou
  • VPP Hoststack
    • iperf3 performance with Hoststack: Sirshak has done some preliminary benchmarking comparing kernel and VPP hoststack performance.
    • Perf analysis is ongoing, with two patches in progress. Hotspots are in memory copy, or possibly elsewhere. Will share patches with the community. - Sirshak
    • Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info.
    • Gorka is trying some optimal configs for VCL.
    • Running VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • Alternate test cases.
  • Action Items - Last Week
    • [Sachin] to introduce RFC for IPsec offload support in DPDK plugin.
    • [Khem] Deployment of only the L2 CSIT performance suite. Status: Working with Juraj to get scripts ready for jobs. Some scripts still need to be prepared: first understand how the script works, then add more options.
    • [Lijian] Status on VPP path failures. Status: Still debugging. Still at early study stage.
    • [Sirshak] Kernel migration on mcbin. Status: Sirshak to try inputs from Garcia and Damjan; no progress so far.
  • VPP
    • Vectorization
      • [Lijian] working on vectorized memory copy
    • Memory Ordering
      • [Sirshak] To start work on Arithmetic and Logic relaxed functions.
  • CSIT
    • VPP Path
      • 3 failures currently stalling deployment.
      • VPP-1476, VPP-1475, VPP-1478
      • These failures are seen on Debian x86 VM also.
      • Parallelization (n=32) is resulting in failures. These also seem to be caused by the two patches below.
      • VPP-1490, caused by https://gerrit.fd.io/r/#/c/15106/ and https://gerrit.fd.io/r/#/c/15534/.
      • VPP-1491 and VPP-1497, about parallelization and GBP failures, are filed.
      • Get CSIT/AArch64 to pass with partial test cases - Juraj
    • VPP Device
      • thunderx: Juraj created an LF ticket for wiring the 1-node topology on Cavium ThunderX.
      • thunderx2: to be racked by this Friday.
      • mcbin: Kernel issue; the suggestion from Garcia and Damjan has yet to be tried.
    • VPP Performance Test
      • Work is ongoing on writing scripts for Performance Jobs.
      • The L2 test now works manually. Khem is trying to get it working in CI, followed by IP4 and other test cases.
  • FD.io lab
    • Arista switch is missing a cable. Andy will send the tracking number for the cables.
    • ThunderX2 - to be racked by this Friday.
  • Action Items - Next Week
    • [Lijian] to investigate VPP-1490 issue.
    • [Andy] to send the tracking number for the cables.


11/12/2018

  • Attendees
    • Sirshak Das
    • Andy Wang
    • Juraj Linkeš
    • Khemendra
    • Garcia
    • Gorka
  • VPP Hoststack
    • iperf3 performance with Hoststack: Sirshak has done some preliminary benchmarking comparing kernel and VPP hoststack performance.
    • Perf analysis is ongoing, with two patches in progress. Hotspots are in memory copy, or possibly elsewhere. - Sirshak
    • Sirshak is trying to set up one CSIT setup for everyone debugging VPP hoststack. Will share setup info.
    • Gorka is trying some optimal configs for VCL.
    • Running VPP on both sides (iperf3 server and client) gives a boost (reason unknown).
    • Alternate test cases.
    • Khem to get more information on benchmarking DMM. Khem to send the information to

Status Report Ligato/Contiv

Capture LandC.PNG