Difference between revisions of "VPP/AArch64"

Revision as of 01:17, 24 October 2017

Patches

Conditional x86intrin.h inclusion https://gerrit.fd.io/r/#/c/8952/
Null-terminate some formatted strings Merged 10/20 https://gerrit.fd.io/r/#/c/8922/
lb plugin - fix format() type mismatches Merged 10/16 https://gerrit.fd.io/r/#/c/8755/
Use AESNI=y only on x86_64 machines Merged 10/14 https://gerrit.fd.io/r/#/c/8622/
Improved arm64 chip detection Merged 9/11 https://gerrit.fd.io/r/#/c/8372/
Native arm64 build: dpdk/Makefile change Merged 8/31 https://gerrit.fd.io/r/#/c/8228/

Known Issues

GCC 5.3.x ICEs during FP register allocation. Please use GCC 5.4+.

CSIT unit tests

General note:

Some tests are not meant to be run in isolation.

For example, the test_l2bd_arp_term_\d+ series:

  • test_1: create 5 hosts
  • test_2: delete 3 of the hosts created by test_1

...

These tests should be run as a group. For the case above:

# works
make test TEST=*.TestL2bdArpTerm.*
# does not work
make test V=1 TEST=*.TestL2bdArpTerm.test_l2bd_arp_term_01
make test V=1 TEST=*.TestL2bdArpTerm.test_l2bd_arp_term_02
...

make test-debug status: ?

Failed:

  • TestIP6VrfMultiInst.test_ip6_vrf_02
  • TestIPv4FibCrud.test_3_add_new_routes
  • TestIp4VrfMultiInst.test_ip4_vrf_02
  • TestL2bdMultiInst.test_l2bd_inst_02
  • TestL2bdMultiInst.test_l2bd_inst_03
  • TestIPv4FibCrud.test_2_del_routes

make test-all-debug

status: no additional test failures

make test

Failed:

  • TestLB
  • TestJVpp

make test-all status: ?

Misc

Support multiple cache line sizes per architecture. AArch64 is currently hard-coded to 128B. For a native build, inspect the ARMv8 Main ID Register in the Makefile and pass the cache line size as a compiler option, e.g. -DCACHE_LINE_SIZE=128.
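
A minimal sketch of how a build-time cache line size could be consumed on the C side, assuming the Makefile passes -DCACHE_LINE_SIZE=<bytes> as in the example above. The type per_thread_counter_t, the helper round_to_cache_line and the 128-byte fallback are illustrative assumptions, not existing VPP code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the build system passes -DCACHE_LINE_SIZE=<bytes>.
   Fall back to 128, the value AArch64 is currently hard-coded to. */
#ifndef CACHE_LINE_SIZE
#define CACHE_LINE_SIZE 128
#endif

_Static_assert ((CACHE_LINE_SIZE & (CACHE_LINE_SIZE - 1)) == 0,
                "cache line size must be a power of two");

/* Hypothetical per-thread counter, aligned (and therefore padded) so
   that two instances never share a cache line. */
typedef struct
{
  uint64_t packets;
  uint64_t bytes;
} __attribute__ ((aligned (CACHE_LINE_SIZE))) per_thread_counter_t;

/* Round a byte count up to the next multiple of the cache line size. */
static inline uint64_t
round_to_cache_line (uint64_t n_bytes)
{
  return (n_bytes + CACHE_LINE_SIZE - 1) & ~((uint64_t) CACHE_LINE_SIZE - 1);
}
```

Building the same source with -DCACHE_LINE_SIZE=64 would shrink the padding without any source change, which is the point of moving the value into the build.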

Investigate "show cpu" output and Arm CPU feature detection (AES, SHA1, SHA2, CRC32, ATOMICS) via hwcaps in src/vppinfra/cpu.[ch].
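
As a rough illustration of hwcap-based detection, the sketch below decodes the AArch64 Linux AT_HWCAP bits for the features listed above. The bit positions follow the kernel's arm64 uapi hwcap.h; they are redefined locally with a hypothetical SK_ prefix so the decoder also compiles on other hosts. The names arm_features_t, decode_hwcap and read_hwcap are assumptions for this example and are not taken from src/vppinfra/cpu.[ch].

```c
#include <assert.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h> /* getauxval(), AT_HWCAP */
#endif

/* AArch64 Linux HWCAP bits, redefined locally (hypothetical SK_ prefix). */
#define SK_HWCAP_AES     (1UL << 3)
#define SK_HWCAP_SHA1    (1UL << 5)
#define SK_HWCAP_SHA2    (1UL << 6)
#define SK_HWCAP_CRC32   (1UL << 7)
#define SK_HWCAP_ATOMICS (1UL << 8)

typedef struct
{
  int aes, sha1, sha2, crc32, atomics;
} arm_features_t;

/* Pure decoder: turn a raw hwcap word into per-feature flags. */
static arm_features_t
decode_hwcap (unsigned long hwcap)
{
  arm_features_t f;
  f.aes = (hwcap & SK_HWCAP_AES) != 0;
  f.sha1 = (hwcap & SK_HWCAP_SHA1) != 0;
  f.sha2 = (hwcap & SK_HWCAP_SHA2) != 0;
  f.crc32 = (hwcap & SK_HWCAP_CRC32) != 0;
  f.atomics = (hwcap & SK_HWCAP_ATOMICS) != 0;
  return f;
}

/* Read the hwcap word from the auxiliary vector on AArch64 Linux;
   on any other host this sketch simply reports no features. */
static unsigned long
read_hwcap (void)
{
#if defined(__aarch64__) && defined(__linux__)
  return getauxval (AT_HWCAP);
#else
  return 0;
#endif
}
```

Keeping the decoder separate from the auxv read makes the bit handling testable on a build host that is not an Arm machine.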

Review use of Arm architected timer in src/vppinfra/time.[ch]

Use ISB or YIELD in src/vppinfra/smp.h
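
One possible shape for such a spin-wait hint, sketched with GCC/Clang inline assembly. The names cpu_relax and spin_wait are hypothetical and not taken from src/vppinfra/smp.h; YIELD is used here, and swapping in "isb" is the alternative mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hint to the core that we are busy-waiting. */
static inline void
cpu_relax (void)
{
#if defined(__aarch64__)
  __asm__ volatile ("yield" ::: "memory"); /* alternative: "isb" */
#elif defined(__x86_64__) || defined(__i386__)
  __asm__ volatile ("pause" ::: "memory");
#else
  __asm__ volatile ("" ::: "memory"); /* compiler barrier only */
#endif
}

/* Spin until *flag is non-zero or max_spins polls have been made.
   Returns 1 if the flag was observed set, 0 on giving up. */
static inline int
spin_wait (volatile uint32_t *flag, uint32_t max_spins)
{
  uint32_t i;
  for (i = 0; i < max_spins; i++)
    {
      if (__atomic_load_n (flag, __ATOMIC_ACQUIRE))
        return 1;
      cpu_relax ();
    }
  return 0;
}
```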

Use REV in src/vppinfra/byte_order.h
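
A small sketch of byte swapping via compiler builtins, which GCC and Clang typically lower to REV/REV16 on AArch64 without hand-written assembly. The function names are hypothetical and not the ones used in src/vppinfra/byte_order.h.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-reverse helpers built on GCC/Clang builtins. */
static inline uint16_t
byte_swap_u16 (uint16_t x)
{
  return __builtin_bswap16 (x);
}

static inline uint32_t
byte_swap_u32 (uint32_t x)
{
  return __builtin_bswap32 (x);
}

static inline uint64_t
byte_swap_u64 (uint64_t x)
{
  return __builtin_bswap64 (x);
}
```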

Review use of __sync_xxx/__atomic_xxx builtins to ensure correct memory ordering on non-TSO machines.
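
To illustrate the kind of ordering that needs review, here is a minimal release/acquire publish pattern using the __atomic builtins. On a TSO machine plain stores often happen to work in this pattern; on AArch64 the explicit release store and acquire load are what keep the payload write ordered before the flag. The mailbox_t type and both functions are hypothetical examples, not VPP code.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
  uint32_t payload; /* data written by the producer */
  uint32_t ready;   /* set to 1 once payload is valid */
} mailbox_t;

/* Producer: write the payload, then publish it with a release store
   so the payload write cannot be reordered after the flag write. */
static inline void
mailbox_publish (mailbox_t *m, uint32_t value)
{
  m->payload = value;
  __atomic_store_n (&m->ready, 1, __ATOMIC_RELEASE);
}

/* Consumer: the acquire load pairs with the release store above, so
   a reader that sees ready == 1 also sees the payload.
   Returns 1 and fills *out on success, 0 if nothing is published. */
static inline int
mailbox_try_consume (mailbox_t *m, uint32_t *out)
{
  if (!__atomic_load_n (&m->ready, __ATOMIC_ACQUIRE))
    return 0;
  *out = m->payload;
  return 1;
}
```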

Investigate hash performance (CRC32 vs xxhash) e.g. in src/vppinfra/bihash_8_8.h. Dependent on Arm CPU feature detection.

Investigate memcpy performance (src/vppinfra/string.h); both inlined-by-compiler and libc versions.

Investigate current tuning of dual/quad loop optimizations for hiding memory latency, e.g. l2_forward().
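
The dual-loop idea can be sketched as follows: process two items per iteration while prefetching the next two, then finish with a single-item loop. This is a generic illustration with hypothetical names (process_one, process_dual), not the code of l2_forward(); the prefetch distance is the kind of parameter that would need tuning per core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-item work standing in for packet processing. */
static inline uint32_t
process_one (const uint32_t *item)
{
  return *item + 1;
}

/* Dual loop: handle items i and i+1 while prefetching i+2 and i+3 to
   overlap memory latency with useful work. */
static uint64_t
process_dual (uint32_t *const *items, size_t n)
{
  uint64_t sum = 0;
  size_t i = 0;

  while (n - i >= 4)
    {
      __builtin_prefetch (items[i + 2], 0 /* read */, 3 /* keep */);
      __builtin_prefetch (items[i + 3], 0, 3);
      sum += process_one (items[i]);
      sum += process_one (items[i + 1]);
      i += 2;
    }

  /* Single loop for the remaining items. */
  while (i < n)
    {
      sum += process_one (items[i]);
      i++;
    }
  return sum;
}
```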

SIMD

  • CLIB_HAVE_VEC128 also covers 256-bit. Add CLIB_HAVE_VEC256?
  • ixge.[ch] - 128-bit vector types with plain C. Needs to be enabled.
  • mheap.c - implement is_equal
  • hash.h - implement irotate
  • vnet_classify.h - 128-bit vector types with plain C. Needs to be enabled.
  • vhost-user.c - SSE4.2 only. Implement range search using NEON.
  • ip4_mtrie.h - 128-bit vector types with plain C. Needs to be enabled.
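
For the "128-bit vector types with plain C" items, the GCC/Clang vector extensions allow ordinary C operators on vector types, and the compiler selects NEON or SSE instructions as available. The sketch below shows a splat, an equality test and a scalar rotate in that style; the names u32x4_splat, u32x4_is_equal and rotate_left_u32 are hypothetical and may not match the actual is_equal and irotate helpers referenced above.

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit vector of four 32-bit lanes (GCC/Clang vector extension). */
typedef uint32_t u32x4 __attribute__ ((vector_size (16)));

/* Broadcast one scalar into all four lanes. */
static inline u32x4
u32x4_splat (uint32_t x)
{
  u32x4 v = { x, x, x, x };
  return v;
}

/* Return 1 if every lane of a equals the matching lane of b. */
static inline int
u32x4_is_equal (u32x4 a, u32x4 b)
{
  u32x4 d = a ^ b; /* plain C operator applied lane-wise */
  return (d[0] | d[1] | d[2] | d[3]) == 0;
}

/* Rotate left without undefined behaviour when n is 0 or >= 32. */
static inline uint32_t
rotate_left_u32 (uint32_t x, unsigned n)
{
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}
```

Inspecting the generated assembly for such code on AArch64 is one way to check whether the plain-C form is good enough before writing NEON intrinsics by hand.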