Difference between revisions of "VPP Usability Track"

From fd.io
Revision as of 23:17, 15 November 2016

WORK-IN-PROGRESS - This is a planning page for addressing VPP usability aspects

Goals

  1. VPP working out of the box for most/all baseline use cases.
  2. Profiled by target VPP consumers - use cases, users.
    • Use1 - OPNFV/FDS
    • Use2 - Programmable Virtual Forwarder
    • Use3 - OpenStack NFVI
    • Use4 - vSwitch for VMs

VPP Use Case Requirements

Functional requirements of target VPP consumers - wip - current snapshot:

  1. Use1 - OPNFV/FDS - VXLAN+L2BD+vhost, VLAN+L2BD+vhost, BVI, VRF, IPv4, IPv6, SNAT, ACL/classifier
  2. Use2 - Programmable Virtual Forwarder - VXLAN+L2BD+vhost, IPv4, IPv6, more TBC
  3. Use3 - OpenStack NFVI - VXLAN+L2BD+vhost
  4. Use4 - vSwitch for VMs - VLAN+L2BD+vhost, high-density VNF VMs (30 VMs, 102 vhost)

<turn it into a table..>

VPP User Guide - Outline

Outline of user guide that needs to be produced:

  1. Installation per Linux distro - Ubuntu, CentOS, RHEL, other - Ed, Sean
    • Packaging
    • Manual install - ??
    • Automated install of VPP bench-in-a-box - Ed, Sean
      • Target installation sequence:
        • 1) Hugepages and CPU isolation
          • a) Setup Hugepages and CPU isolation
          • b) Reboot
          • c) Validate Hugepages and CPU isolation
        • 2) Download and build VPP (reportedly we build from source rather than install via apt/yum because the Python API packages are missing)
        • 4) Configure /etc/vpp/startup.conf
        • 5) Validate that the VPP Python API is working
        • 6) Devstack
          • a) Munging the ML2 driver to match the VPP Python API (what needs to be fixed upstream so we don't have to do this?)
        • 7) Configure various neutron and nova bits
        • 8) Install TRex
          • a) Download DPDK and build it for TRex
          • b) Download TRex and build it
        • 9) Configure and run TRex
      • Required work
        • 0) Fix whatever is causing us to have to munge the ML2 driver to match vpp
        • 1) Get the VPP Python API packaged so we just have to apt/yum install vpp and vpp-python-api
        • 2) Get TRex packaged (together with whatever DPDK munging has to happen) so it can simply be apt/yum installed
        • 3) Get puppet/ansible modules for VPP/TRex sufficient to handle their configuration
      • Once the required work is done, the user would follow a few simple steps
        • 1) apt/yum install puppet/ansible
        • 2) curl ${url to openstack_vpp_performance_demo.pp}
        • 3) sudo puppet apply openstack_vpp_performance_demo.pp
        • 4) Reboot
        • 5) Run a script that fires up the performance demo (stacks/munges, neutron/nova bits, starts TRex traffic)
  2. Environment - MK, FB
    • <complete the list of areas and combinations>
    • Linux environment fundamentals
    • SW Dependencies
      • Linux kernel ver
      • QEMU ver
      • DPDK ver
    • HW Dependencies
      • x86_64 microarchitectures
      • NICs
  3. Initial configuration - MK, PM, DM for questions
    • Startup configuration - startup.conf ...
    • Interfaces
  4. Optimizing VPP performance - MK, PM, PF&DM for questions
    • Tuning performance
    • VM and vhost-user considerations
    • Useful host performance telemetry
    • Useful VPP performance telemetry
  5. Sample use cases - Chris Metz with team will work it
    • L2 switching
      • with VMs (vhost)
      • with tunnels (vxlan, lisp-gpe)
      • with security-groups
    • IP routed forwarding
      • with VMs (vhost)
      • with tunnels (vxlan, lisp-gpe)
      • with security-filters - iacl, cop-whitelist, cop-blacklist
    • <add more>
    • <structure differently?>
  6. Doc Generation and Online Presentation
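
The "Validate Hugepages and CPU isolation" step of the target installation sequence above could be automated roughly as follows. This is a minimal sketch, not part of VPP: the function names, the 1024-page threshold and the CPU list are illustrative assumptions. It parses text in the format of /proc/meminfo and /sys/devices/system/cpu/isolated and reports anything that falls short of what was requested.

```python
import re


def parse_hugepages(meminfo_text):
    """Extract HugePages_* counters and Hugepagesize (kB) from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        m = re.match(r"(HugePages_\w+|Hugepagesize):\s+(\d+)", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '2-5,8' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty file means no isolated CPUs
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def check_host(meminfo_text, isolated_text, min_hugepages, wanted_cpus):
    """Return a list of problems found; an empty list means the host looks OK."""
    problems = []
    total = parse_hugepages(meminfo_text).get("HugePages_Total", 0)
    if total < min_hugepages:
        problems.append(f"hugepages: have {total}, want >= {min_hugepages}")
    missing = sorted(set(wanted_cpus) - set(parse_cpu_list(isolated_text)))
    if missing:
        problems.append(f"CPUs not isolated: {missing}")
    return problems


if __name__ == "__main__":
    # Sample inputs; on a real host, read the two files named above instead.
    sample_meminfo = "HugePages_Total:    1024\nHugepagesize:       2048 kB\n"
    sample_isolated = "1-3\n"
    # Threshold and CPU list are assumed values for illustration only.
    print(check_host(sample_meminfo, sample_isolated, 1024, [1, 2, 3]) or "OK")
```

Keeping the parsing separate from the file access makes the check easy to run against captured output from a remote host, which may help when the same validation is later folded into puppet/ansible modules.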

Programmer's Guide - Outline

Programmer's guide - KRB for TOC, OT to help

  1. VPP API guide

Tester's Guide - Outline

  1. Running VPP tests
    • vpp make tests
      • sphinx auto-generated docs in place now in vpp/build-root/test-doc/build/html
    • csit virl functional tests
      • sphinx auto-generated docs are wip - Tibor Frank
    • csit perf tests
      • sphinx auto-generated docs are wip - Tibor Frank
  2. Programming VPP tests

VPP vhost usability

  1. VPP vhost test requirements
  2. VPP vhost VM use cases
  3. VPP vhost VM negative scenarios
    • VPP disconnects
    • VM disconnects
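  4. qemu hardening for vhost
    • Monthly DPDK-Virtio Meeting
      • https://docs.google.com/document/d/1FoPi81zQmo5y20G0Yntg-eGHX14LnWc63TwR57mCbs4/edit?ts=581f03fb#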
  5. Other fixes:
    • Improved packet tracing for troubleshooting: commit:116ea4b2a157

VPP Performance Considerations

Initial points to be addressed for optimizing VPP performance on specific compute HW configurations:

  1. cpu core configuration and vpp thread mappings
    • physical interfaces can be pinned to a specific thread/core
    • vhost interfaces are assigned round-robin - a new feature is needed for explicit placement
      • do after multi-queue patch by Pierre
  2. vhost - queue size (qsz), cpu jitter, reconnect, interop qemu-virtio
    • hard to install, doesn't work all the time, crashes
  3. dpdk - performance with selection of NICs
    • more detailed documentation about baremetal installations
      • vs. just Vagrant
    • setup recommendations for new dpdk releases - on a per NIC basis
    • dpdk.org is not publishing performance numbers
    • csit can't address it - need manual tests and analysis
  4. vpp self-diagnostics for optimal setup verification
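
To illustrate why item 1 above asks for explicit vhost placement, the following sketch models the behaviour described there: physical interfaces are pinned to a chosen worker, while vhost interfaces are handed out round-robin. It is a toy model under stated assumptions, not VPP code; the function names, interface names and worker numbering are all hypothetical.

```python
from itertools import cycle


def place_interfaces(phy_pinning, vhost_ifaces, workers):
    """Return {interface: worker}.

    phy_pinning  -- explicit {interface: worker} choices for physical ports
    vhost_ifaces -- vhost interfaces, assigned round-robin in creation order
    workers      -- list of available worker thread ids
    """
    placement = {}
    for iface, worker in phy_pinning.items():
        if worker not in workers:
            raise ValueError(f"{iface}: unknown worker {worker}")
        placement[iface] = worker
    next_worker = cycle(workers)
    for iface in vhost_ifaces:
        placement[iface] = next(next_worker)
    return placement


def load_per_worker(placement, workers):
    """Count how many interfaces each worker ends up polling."""
    load = {w: 0 for w in workers}
    for worker in placement.values():
        load[worker] += 1
    return load


if __name__ == "__main__":
    workers = [0, 1]
    # Both physical ports pinned to worker 0 (an assumed operator choice).
    phy = {"eth0": 0, "eth1": 0}
    vhosts = ["vhost0", "vhost1", "vhost2"]
    placement = place_interfaces(phy, vhosts, workers)
    print(load_per_worker(placement, workers))  # {0: 4, 1: 1}
```

In this example round-robin ignores the existing pinning, so worker 0 polls four interfaces and worker 1 only one. Counting interfaces is a crude stand-in for real load, but it shows the kind of imbalance an explicit placement feature would let an operator correct.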

Minimal setup installation for vhost+VM connectivity

Minimal setup installation for vhost+VM connectivity:

  1. TRex
  2. ansible scripts developed in CSIT
  3. Consumability of CSIT-perf RF and python libraries

Important Test Areas

Listing of important areas to increase VPP test coverage - MK

  1. negative tests - add/remove interfaces/routes/MAC entries
  2. box-full tests NIC-NIC
  3. box-full tests NIC-VM-NIC
  4. stress-tests - add/remove VMs
  5. negative weird setups - mixed L2BD, IRB, IPv4, IPv6 forwarding
  6. soak-tests
  7. tap devices - add/remove
  8. negative stress/weird tests

VPP Diagnostics and telemetry

The following VPP diagnostics and telemetry aspects need to be addressed:

  1. live health and performance metrics
    • compute HW
    • Linux kernel
    • VM guest
  2. collectd + influxdb + grafana
    • VPP interface counters
    • VPP vector size
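
As a sketch of how the counters in item 2 could reach InfluxDB, the snippet below formats one sample in InfluxDB line protocol (measurement, comma-separated tags, fields, timestamp in nanoseconds). The measurement name, tag keys, field names and sample values are assumptions made for illustration, and how the counters are read from VPP or handed to collectd is left out. The average vector size is computed as vectors divided by calls, matching the Vectors/Call figure reported by VPP's runtime statistics.

```python
def escape_tag(value):
    """Escape the characters that are special in line-protocol tag values."""
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def to_line(measurement, tags, fields, ts_ns):
    """Format one point; integer fields get the 'i' suffix, floats are written as-is."""
    tag_str = ",".join(f"{k}={escape_tag(v)}" for k, v in sorted(tags.items()))
    field_str = ",".join(
        f"{k}={v}i" if isinstance(v, int) else f"{k}={v}"
        for k, v in sorted(fields.items())
    )
    return f"{measurement},{tag_str} {field_str} {ts_ns}"


def avg_vector_size(vectors, calls):
    """Average packets processed per node invocation; 0.0 when the node never ran."""
    return vectors / calls if calls else 0.0


if __name__ == "__main__":
    # All names and numbers below are made-up sample data.
    line = to_line(
        "vpp_interface",
        {"host": "node1", "ifname": "GigabitEthernet0/8/0"},
        {"rx_packets": 1000, "tx_packets": 900},
        1479251820000000000,
    )
    print(line)
    print(avg_vector_size(2560, 10))
```

A low average vector size generally means the node is handling few packets per invocation, so plotting it next to the interface counters in grafana gives a quick view of how hard each worker is being driven.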