VPP Usability Track
From fd.io
WORK-IN-PROGRESS - This is a planning page for addressing VPP usability aspects
Goals
- VPP working out of the box for most/all baseline use cases.
- Profiled by target VPP consumers - use cases, users.
- Use1 - OPNFV/FDS
- Use2 - Programmable Virtual Forwarder
- Use3 - OpenStack NFVI
- Use4 - vSwitch for VMs
VPP Use Case Requirements
Functional requirements of target VPP consumers (work in progress, current snapshot):
- Use1 - OPNFV/FDS - VXLAN+L2BD+vhost, VLAN+L2BD+vhost, BVI, VRF, IPv4, IPv6, SNAT, ACL/classifier
- Use2 - Programmable Virtual Forwarder - VXLAN+L2BD+vhost, IPv4, IPv6, more TBC
- Use3 - OpenStack NFVI - VXLAN+L2BD+vhost
- Use4 - vSwitch for VMs - VLAN+L2BD+vhost, high-density VNF VMs (30VMs, 102vhost)
<turn it into a table..>
VPP User Guide - Outline
Outline of user guide that needs to be produced:
- Installation per Linux distro - Ubuntu, CentOS, RHEL, other - Ed, Sean
- Packaging
- https://wiki.fd.io/view/VPP/Installing_VPP_binaries_from_packages#CentOS_7.2_-_VPP_master_branch_RPMs_.28in_development.29
- sudo yum install vpp-python-api
- Manual install - ??
- Automated install of VPP bench-in-a-box - Ed, Sean
- Target installation sequence:
- 1) Hugepages and CPU isolation
- a) Setup Hugepages and CPU isolation
- b) Reboot
- c) Validate Hugepages and CPU isolation
- 2) Download and build VPP (reportedly built from source rather than installed via apt/yum because Python API packages are not yet available)
- 4) Configure /etc/vpp/startup.conf
- 5) Validating VPP python api is working
- 6) Devstack
- a) Munging the ML2 driver to match the VPP Python API (what needs to be fixed upstream so we don’t have to do this?)
- 7) Configure various neutron and nova bits
- 8) Install TRex
- a) Download DPDK and build it for TRex
- b) Download TRex and build it
- 9) Configure and run TRex
- Required work
- 0) Fix whatever is causing us to have to munge the ML2 driver to match vpp
- 1) Get the VPP Python API packaged so we just have to apt/yum install vpp and vpp-python-api
- 2) Get TRex packaged (together with whatever DPDK munging has to happen) so it can simply be apt/yum installed
- 3) Get puppet/ansible modules for VPP/TRex sufficient to handle their configuration
- Once the required work is done, the user would follow a few simple steps
- 1) apt/yum install puppet/ansible
- 2) curl ${url to openstack_vpp_performance_demo.pp}
- 3) sudo puppet apply openstack_vpp_performance_demo.pp
- 4) Reboot
- 5) Run a script that fires up the performance demo (stacks/munges, neutron/nova bits, starts TRex traffic)
- Environment - MK, FB
- <complete the list of areas and combinations>
- Linux environment fundamentals
- SW Dependencies
- Linux kernel ver
- QEMU ver
- DPDK ver
- HW Dependencies
- x86_64 microarchitectures
- NICs
- Initial configuration - MK, PM, DM for questions
- Startup configuration - startup.conf ...
- Interfaces
- Optimizing VPP performance - MK, PM, PF&DM for questions
- Tuning performance
- VM and vhost-user considerations
- Useful host performance telemetry
- Useful VPP performance telemetry
- Sample use cases - Chris Metz with team will work it
- L2 switching
- with VMs (vhost)
- with tunnels (vxlan, lisp-gpe)
- with security-groups
- IP routed forwarding
- with VMs (vhost)
- with tunnels (vxlan, lisp-gpe)
- with security-filters - iacl, cop-whitelist, cop-blacklist
- <add more>
- <structure differently?>
- Doc Generation and Online Presentation
- Consider readthedocs approach
- Based on sphinx using readthedocs theme
- Used today by dpdk doc
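Step 1c of the target installation sequence above (validating hugepages and CPU isolation) could be scripted roughly as follows. This is a minimal sketch, not an agreed procedure: the function names, the minimum page count and the expected CPU ids are all hypothetical, and it assumes the inputs are the text of /proc/meminfo and /sys/devices/system/cpu/isolated read after the reboot.

```python
import re


def parse_hugepages(meminfo_text):
    """Extract hugepage counters from the text of /proc/meminfo."""
    wanted = ("HugePages_Total", "HugePages_Free", "Hugepagesize")
    values = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"(\w+):\s+(\d+)", line)
        if match and match.group(1) in wanted:
            values[match.group(1)] = int(match.group(2))
    return values


def parse_cpu_list(cpu_list):
    """Expand a kernel CPU list such as '2-5,8' into a set of CPU ids."""
    cpus = set()
    for part in cpu_list.strip().split(","):
        if not part:
            continue  # an empty file means no CPUs are isolated
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def validate(meminfo_text, isolated_text, min_hugepages, expected_cpus):
    """Return a list of human-readable problems; empty means the host looks OK."""
    problems = []
    total = parse_hugepages(meminfo_text).get("HugePages_Total", 0)
    if total < min_hugepages:
        problems.append(f"only {total} hugepages, expected at least {min_hugepages}")
    missing = set(expected_cpus) - parse_cpu_list(isolated_text)
    if missing:
        problems.append(f"CPUs not isolated: {sorted(missing)}")
    return problems
```

To use it on a host, read the two files and pass their contents to validate(); a non-empty result lists what still needs fixing before VPP is started.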
Programmer's Guide - Outline
Programmer's guide - KRB for TOC, OT to help
- VPP API guide
Tester's Guide - Outline
- Running VPP tests
- vpp make tests
- sphinx auto-generated docs in place now in vpp/build-root/test-doc/build/html
- csit virl functional tests
- sphinx auto-generated docs are wip - Tibor Frank
- csit perf tests
- sphinx auto-generated docs are wip - Tibor Frank
- Programming VPP tests
- vpp make tests
- sphinx auto-generated docs in place now in vpp/build-root/test-doc/build/html
- csit virl functional tests
- csit virl tutorial already in place: https://wiki.fd.io/view/CSIT/Documentation
- csit perf tests
- csit perf tutorial already in place: https://wiki.fd.io/view/CSIT/Documentation
VPP vhost usability
- VPP vhost test requirements
- [vpp-dev], [csit-dev] consultation in progress
- VPP vhost VM use cases
- VPP vhost VM negative scenarios
- VPP disconnects
- VM disconnects
- qemu hardening for vhost
- vhost reconnect
- vhost hot-plug
- virtio1.1 re-negotiation
- Monthly DPDK-Virtio Meeting
- Other fixes:
- Improved packet tracing for troubleshooting: commit:116ea4b2a157
VPP Performance Considerations
Initial points to address when optimizing VPP performance on specific compute HW configurations:
- cpu core configuration and vpp thread mappings
- phy interfaces can be placed to thread/core
- vhost interfaces are round-robined - new feature for placement
- do after multi-queue patch by Pierre
- vhost - qsz, cpu jitter, reconnect, interop qemu-virtio
- hard to install, doesn't work all the time, crashes
- dpdk - performance with selection of NICs
- more detailed documentation about baremetal installations
- vs. just Vagrant
- setup recommendations for new DPDK releases, on a per-NIC basis
- dpdk.org is not publishing performance numbers
- csit can't address it - need manual tests and analysis
- vpp self-diagnostics for optimal setup verification
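The first item above, CPU core configuration and VPP thread mappings, can be illustrated with a small sketch. It renders a startup.conf cpu stanza using the main-core and corelist-workers parameters, and shows what round-robin placement of vhost interfaces across workers looks like. The helper names, core numbers and interface names are hypothetical, and the round-robin function only models the behaviour described above; it is not VPP's implementation.

```python
def render_cpu_stanza(main_core, worker_cores):
    """Render a startup.conf 'cpu' stanza pinning the main and worker threads."""
    if main_core in worker_cores:
        raise ValueError("main core must not also be a worker core")
    workers = ",".join(str(core) for core in sorted(worker_cores))
    lines = ["cpu {", f"  main-core {main_core}"]
    if workers:
        lines.append(f"  corelist-workers {workers}")
    lines.append("}")
    return "\n".join(lines)


def assign_round_robin(interfaces, worker_cores):
    """Model round-robin placement: interface i goes to worker i modulo N."""
    return {
        name: worker_cores[index % len(worker_cores)]
        for index, name in enumerate(interfaces)
    }


if __name__ == "__main__":
    print(render_cpu_stanza(1, [2, 3]))
    print(assign_round_robin(["vhost0", "vhost1", "vhost2"], [2, 3]))
```

With many vhost interfaces and few workers, a mapping like this makes it easy to see which cores end up sharing interfaces, which is the motivation for an explicit placement feature.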
Minimal setup installation for vhost+VM connectivity
Minimal setup installation for vhost+VM connectivity:
- TRex
- ansible scripts developed in CSIT
- Consumability of CSIT-perf RF and python libraries
Important Test Areas
List of important areas in which to increase VPP test coverage - MK
- negative tests - add/remove interfaces/routes/MAC entries
- box-full tests NIC-NIC
- box-full tests NIC-VM-NIC
- stress-tests - add/remove VMs
- negative weird setups - mixed L2BD, IRB, IPv4, IPv6 forwarding
- soak-tests
- tap devices - add/remove
- negative stress/weird tests
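The add/remove style of stress test listed above could be driven by a loop like the one below. It is only a sketch: the add and remove callbacks are hypothetical placeholders for whatever actually creates and deletes an interface, route, tap device or VM, and the seeded random generator is there so that a failing sequence can be replayed.

```python
import random


def churn(add, remove, names, rounds, seed=0):
    """Randomly add and remove objects, tracking which ones should exist.

    add/remove are caller-supplied callbacks (hypothetical here); names is a
    list of object names. Returns the set of names expected to be present.
    """
    rng = random.Random(seed)  # fixed seed makes a failing run reproducible
    present = set()
    for _ in range(rounds):
        name = rng.choice(names)
        if name in present:
            remove(name)
            present.discard(name)
        else:
            add(name)
            present.add(name)
    return present


if __name__ == "__main__":
    # Stand-in for the system under test: a plain set.
    state = set()
    expected = churn(state.add, state.remove, ["tap0", "tap1", "tap2"], rounds=1000)
    print("state matches model:", state == expected)
```

After the loop, the returned set can be compared with what the system under test reports, which turns the stress run into a pass/fail check.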
VPP Diagnostics and telemetry
The following VPP diagnostics and telemetry aspects need to be addressed:
- live health and performance metrics
- compute HW
- Linux kernel
- VM guest
- collectd + influxdb + grafana
- VPP interface counters
- VPP vector size
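As an illustration of how interface counters and vector size could feed the collectd + influxdb + grafana stack listed above, the sketch below formats a sample as InfluxDB line protocol and computes an average vector size as vectors divided by calls. The measurement name, tag names and sample values are hypothetical, and how the counters are obtained from VPP is deliberately left out.

```python
def _escape(value):
    """Escape commas, equals signs and spaces as line protocol requires for tags."""
    text = str(value)
    for char in (",", "=", " "):
        text = text.replace(char, "\\" + char)
    return text


def _format_field(value):
    """Integers get the 'i' suffix; everything else is written as a float."""
    if isinstance(value, int):
        return f"{value}i"
    return repr(float(value))


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one line: measurement,tag=value field=value timestamp."""
    tag_part = "".join(
        f",{_escape(key)}={_escape(val)}" for key, val in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_escape(key)}={_format_field(val)}" for key, val in sorted(fields.items())
    )
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


def average_vector_size(vectors, calls):
    """Average number of packets processed per node call."""
    return vectors / calls if calls else 0.0


if __name__ == "__main__":
    print(to_line_protocol(
        "vpp_interface",                      # hypothetical measurement name
        {"ifname": "GigabitEthernet0/8/0"},   # hypothetical tag
        {"rx_packets": 10, "tx_packets": 7},
        1000,
    ))
```

Keeping the formatting separate from the collection step means the same function can be reused whether the values come from the VPP CLI, the Python API or a collectd plugin.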