CSIT/vhostuser test scenarios
Preamble
Information listed on this wiki page is a compilation of the email-based consultation that took place on the csit-dev@list.fd.io and vpp-dev@list.fd.io mailing lists between 8-Nov and 22-Nov-2016 in this thread: https://lists.fd.io/pipermail/csit-dev/2016-November/001192.html.
Work in progress and subject to change based on ongoing feedback and interaction within FD.io projects.
Vhostuser Test Cases - Sequential List
Here is an initial compilation of the vhost test cases required to cover the VPP vhost use cases identified as part of this FD.io CSIT consultation. Tests are listed in sequential priority order, starting with baseline cases and evolving to more complex ones that address the functional and performance dimensions of the identified use cases:
Two physical Ethernet port setup for networking - baseline:
- 2p1nic-eth-l2bdbasemaclrn-eth-2vhost-1vm-ndrdisc
- 2p1nic-dot1q-l2bdbasemaclrn-eth-2vhost-1vm-ndrdisc
- 2p1nic-ethip4-ip4base-eth-2vhost-1vm-ndrdisc
- 2p1nic-eth-l2xcbase-eth-2vhost-1vm-ndrdisc
- 2p1nic-dot1q-l2xcbase-eth-2vhost-1vm-ndrdisc
- 2p1nic-ethip4vxlan-l2bdbasemaclrn-eth-2vhost-1vm-ndrdisc
One physical Ethernet port setup for OpenStack networking - baseline:
- 1p1nic-eth-l2bdbasemaclrn-eth-2vhost-1vm-ndrdisc
- 1p1nic-dot1q-l2bdbasemaclrn-eth-2vhost-1vm-ndrdisc
- 1p1nic-ethip4-ip4base-eth-2vhost-1vm-ndrdisc
- 1p1nic-eth-l2xcbase-eth-2vhost-1vm-ndrdisc
- 1p1nic-dot1q-l2xcbase-eth-2vhost-1vm-ndrdisc
- 1p1nic-ethip4vxlan-l2bdbasemaclrn-eth-2vhost-1vm-ndrdisc
Two physical Ethernet port setup for networking - scale:
- 2p1nic-eth-l2bdscalemaclrn200-eth-2vhost-1vm-ndrdisc
- 2p1nic-dot1q-l2bdscalemaclrn200-eth-2vhost-1vm-ndrdisc
- 2p1nic-ethip4vxlan-l2bdscalemaclrn200-eth-2vhost-1vm-ndrdisc
- 2p1nic-eth-l2bdscalemaclrn200-eth-20vhost-10vm-ndrdisc
- 2p1nic-dot1q-l2bdscalemaclrn200-eth-20vhost-10vm-ndrdisc
- 2p1nic-ethip4vxlan-l2bdscalemaclrn200-eth-20vhost-10vm-ndrdisc
<list to be completed>
CSIT test case naming convention used is documented here: https://wiki.fd.io/view/CSIT/csit-perf-tc-naming-change.
Vhostuser Test Topologies
Primary test topology is vSwitch-VPP switching between physical NIC interfaces and virtual vhostuser interfaces.
The following specific vSwitch-VPP use cases have been identified so far. Topologies are encoded in a tree notation with the physical ports and NIC on the left, followed by the outermost frame header, then other stacked headers up to the header processed by vSwitch-VPP, then the VPP forwarding function, then the encapsulation on the vhost interface, the number of vhost interfaces, and the number of VMs (a minimal configuration sketch for the first topology follows the legend below):
- 2p1nic-eth-l2bdbase-eth-2vhost-1vm
- 2p1nic-dot1q-l2bdbase-eth-2vhost-1vm
- 2p1nic-ethip4-ip4base-eth-2vhost-1vm
- 2p1nic-eth-l2xcbase-eth-2vhost-1vm
- 2p1nic-dot1q-l2xcbase-eth-2vhost-1vm
- 2p1nic-ethip4vxlan-l2bdbase-eth-2vhost-1vm
Legend:
- <n>p - number of physical Ethernet ports
- <n>nic - number of NICs
- eth|dot1q|dot1ad
- eth - untagged Ethernet on the physical wire
- dot1q - single VLAN tag on the physical wire
- dot1ad - double VLAN tag on the physical wire
- ip4|ip6
- ip4 - IPv4 header
- ip6 - IPv6 header
- vxlan - VXLAN header
- l2bdbase, l2bdscale - VPP L2 bridge-domain, L2 MAC learning&switching
- base - baseline test with one L2 MAC flow received per interface
- scale - scale tests with many L2 MAC flows received per interface
- distinct L2 MAC flow - distinct (src-mac,dst-mac) tuple in MAC headers
- l2xcbase, l2xcscale - VPP L2 point-to-point crossconnect
- base - baseline test with one L2 MAC flow received per interface
- scale - scale tests with many L2 MAC flows received per interface
- distinct L2 MAC flow - distinct (src-mac,dst-mac) tuple in MAC headers
- ip4base, ip4scale - VPP IPv4 routed forwarding
- base - baseline test with one IPv4 flow received per interface
- scale - scale tests with many IPv4 flows received per interface
- distinct IPv4 flow - distinct (src-ip4,dst-ip4) tuple in IPv4 headers
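To make the notation concrete, below is a minimal sketch of how the first baseline topology (2p1nic-eth-l2bdbase-eth-2vhost-1vm) could be configured on vSwitch-VPP. It is an illustration only, not the CSIT test code: interface names, socket paths and bridge-domain IDs are assumptions, and the exact vhost-user CLI syntax may differ between VPP releases.

  # Create two vhost-user interfaces (VPP as vhost-user server; socket paths are examples)
  vppctl create vhost-user socket /var/run/vpp/sock1.sock server
  vppctl create vhost-user socket /var/run/vpp/sock2.sock server
  vppctl set interface state VirtualEthernet0/0/0 up
  vppctl set interface state VirtualEthernet0/0/1 up

  # Bridge each physical port with one vhost-user interface (two L2 bridge domains)
  vppctl set interface l2 bridge TenGigabitEthernet0/8/0 1
  vppctl set interface l2 bridge VirtualEthernet0/0/0 1
  vppctl set interface l2 bridge TenGigabitEthernet0/8/1 2
  vppctl set interface l2 bridge VirtualEthernet0/0/1 2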
161202 notes from TWS call
Most used topologies and associated use cases:
- 2p1nic-eth-l2xcbase-eth-2vhost-1vm
- uses: baseline VNF,
- 2p1nic-eth-l2bdbase-eth-2vhost-1vm
- uses: baseline VNF,
- 2p1nic-dot1q-l2bdbase-eth-2vhost-1vm
- uses: baseline OpenStack,
- 2p1nic-ethip4vxlan-l2bdbase-eth-2vhost-1vm
- uses: baseline OpenStack,
- 2p1nic-dot1q-l2bdscale-<n>flows-eth-<m>vhost-<o>vm-parallel
- 2p1nic-ethip4vxlan-l2bdscale-<n>flows-eth-<m>vhost-<o>vm-parallel
- ACTION: to agree on the most applicable combinations of <n>, <m>, <o>
- uses: scale OpenStack - scale (flows-5tuple, vms)
- Irene: currently testing with 2, 4k, 8k, 10k, 100k and 1M flows.
- Pierre: should also consider the flow creation rate (learning) and flow duration.
- Maciek: "boxful" scenario - i.e. have a fairly large number of VMs/VNFs - say 10...
- Alec: two different strategies to fill the boxes:
- i) arbitrary high number based on requirements
- ii) optimal placement with good mapping of physical resources
- uses: scale upper-bound capacity of vSwitch
- Pierre: would like to see an upper-bound high-scale test - e.g. 100 vhost interfaces in total across all VMs - to see how badly it scales; try not to oversubscribe physical cores with the qemu vCPU tasks running the PMD drivers in the VMs
- Irene: currently the maximum number of virtio interfaces tested with testpmd in "fwd mac" mode is 4.
- any other topologies to add?
161207 notes from TWS call
- Alec's points from email: https://lists.fd.io/pipermail/csit-dev/2016-November/001259.html
- 2p1nic vs 1p1nic
- OpenStack ML2/VPP/VLAN will use dot1q-l2bdbase-eth-2vhost-1vm
- OpenStack VPP/VxLAN will use ethip4vxlan-l2bdbase-eth-2vhost-1vm
- adding the following baseline topologies
- 1p1nic-dot1q-l2bdbase-eth-2vhost-1vm
- 1p1nic-ethip4vxlan-l2bdbase-eth-2vhost-1vm
- Test topologies of interest for OpenStack
- Irene: what about VM chaining via vSwitch?
- Alec: two popular topologies with OpenStack
- PVVP - same compute node
- p1nic-dot1q-l2bdbase-eth-4vhost-2vm-chain
- p1nic-ethip4vxlan-l2bdbase-eth-4vhost-2vm-chain
- 2p1nic-dot1q-l2bdscale-<n>flows-eth-<m>vhost-<o>vm-chain
- 2p1nic-ethip4vxlan-l2bdscale-<n>flows-eth-<m>vhost-<o>vm-chain
- ACTION: to agree on the most applicable combinations of <n>, <m>, <o>
- PVP-PVP - two compute nodes
- p1nic-dot1q-l2bdbase-eth-2vhost-1vm-chain-2nodes
- p1nic-ethip4vxlan-l2bdbase-eth-2vhost-1vm-2nodes
- OpenStack throughput - packet traffic profiles
- frame sizes (untagged) - 64B, IMIX (for IPv4: 7*64B, 4*570B, 1*1518B; for IPv6: 7*78B, 4*570B, 1*1518B), 1518B
- VM images
- what to run in VM - testpmd, l2fwd, vpp
- not vpp, as it is already the DUT acting as the vSwitch
- l2fwd - issues with vPMD support.
- testpmd - recommended by Irene and the Intel team to be used in the VM for virtio tests
- Irene: recommended settings - 1 vCPU for the main testpmd app and 2 vCPUs for the pmd threads
- issue with scaling this to a large number of VMs - running out of physical and/or logical cores
- Irene: working on optimizing and tuning testpmd to work with 2 vCPUs - 1 vCPU for the main testpmd app and 1 vCPU for the pmd threads. Expecting progress on this next week, will send email to [csit-dev].
- need to standardize on testpmd startup parameters (a hedged example sketch follows at the end of these notes)
- Irene: will send a list of parameters optimized for using vPMD with virtio, and script examples
- VM image building
- starting point being the existing CSIT VM image building: Ubuntu, CentOS
- using DIB (OpenStack Disk Image Builder) or something else?
- team of volunteers (?): Carsten, PMikus, Alec, Thomas, Irene, Maciek
- Thomas F. Herbert feedback from another call
- For baselining the system - Replace VPP with 2x testpmd for xconnecting 2x Phy-to-Vhost
- requires a new DPDK patch for enabling vhost in testpmd - <add-link>
- Irene: will be testing manually soon in Intel labs.
- Irene: happy to help to onboard this into CSIT.
- Thomas: additional feedback from ODL
- asking for a variety of flows - not only homogeneous L2 or L3, but a mixture of L2, IPv4 and IPv6 traffic
- will require VPP to be configured in L2BD + IP4 + IP6 modes - need to have a precise scenario, e.g.
- 2p1nic-eth-l2bd+ip4+ip6base-eth-6vhost-3vm ?
- reference to EANTC Cisco tests - they showed the number of flows can be scaled up to millions with linear performance
- Alec: need to add IPv4 routed forwarding instead of L2BD:
- 1p1nic-dot1q-ip4base-eth-2vhost-1vm
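As a starting point for standardizing the testpmd startup parameters mentioned in the notes above, a hedged example of a testpmd invocation inside the VM is sketched below. It follows the 1 main + 2 pmd vCPU guidance; the core numbers, memory size and descriptor counts are assumptions to be replaced by the parameters Irene will circulate.

  # Example only - exact parameters to be standardized on csit-dev.
  # Assumes a 3 vCPU guest: core 0 for the main testpmd/EAL thread, cores 1-2 for pmd threads.
  testpmd -l 0-2 -n 4 --socket-mem 1024 -- \
      --forward-mode=mac \
      --nb-cores=2 --rxq=1 --txq=1 \
      --burst=64 --rxd=1024 --txd=1024 \
      --auto-start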
VM Workloads
It is OK to continue using testpmd running in the VM, but the testpmd configuration used for the tests must be documented: the testpmd init CLI, any associated VM environment options, and the qemu configuration (vCPU, RAM) - see the hedged qemu sketch after the topology below. The suggestion is to use testpmd in "fwd mac" mode for best results. Having a testpmd image capable of auto-configuring itself on the virtual interfaces at init time would also be good.
- 2p1nic-eth-l2bdbase-eth-2vhost-1vm-testpmd
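To document the qemu side (vCPU, RAM, vhost-user netdevs) for the topology above, a minimal qemu-system-x86_64 invocation sketch is given below. Socket paths, sizes, MAC addresses and the image name are assumptions; the essential parts are the shared hugepage-backed guest memory and one vhost-user netdev per virtio interface.

  # Example only - 3 vCPUs, 4 GB of shared hugepage-backed RAM, 2 vhost-user NICs
  qemu-system-x86_64 -enable-kvm -cpu host -smp 3 -m 4096 \
      -object memory-backend-file,id=mem0,size=4096M,mem-path=/dev/hugepages,share=on \
      -numa node,memdev=mem0 \
      -chardev socket,id=chr1,path=/var/run/vpp/sock1.sock \
      -netdev type=vhost-user,id=net1,chardev=chr1 \
      -device virtio-net-pci,netdev=net1,mac=52:54:00:00:00:01 \
      -chardev socket,id=chr2,path=/var/run/vpp/sock2.sock \
      -netdev type=vhost-user,id=net2,chardev=chr2 \
      -device virtio-net-pci,netdev=net2,mac=52:54:00:00:00:02 \
      -drive file=csit-vm-image.qcow2,if=virtio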
There is also a request to run vhost tests with vSwitch-VPP and VM-VPP, the latter acting as a VNF doing IPv4/IPv6 routed forwarding. The topology may then look like this:
- 2p1nic-eth-l2bdbase-eth-2vhost-1vm-vppip4
161202 notes from TWS call
Environment optimisation for vhost:
- Compute HW
- SMT, or no SMT, ...
- Linux environment
- CPU isolation, pinning
- grub parameters (an example sketch follows this list)
- CFS scheduler settings
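A hedged example of the kind of boot-time isolation and hugepage parameters being discussed; the core lists and hugepage counts are illustrative and need to be matched to the actual testbed topology.

  # Example grub kernel command line additions (illustrative values)
  GRUB_CMDLINE_LINUX="isolcpus=1-17 nohz_full=1-17 rcu_nocbs=1-17 hugepagesz=1G hugepages=16 intel_iommu=on iommu=pt"

  # Example pinning of a qemu vCPU thread to an isolated core (PID is a placeholder)
  taskset -pc 3 <qemu_vcpu_thread_pid>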
VM workloads:
- testpmd
- Alec: must document the exact testpmd configuration CLI for each identified vhost test topology and compute environment
- IRQ-based VMs
- Pierre: functionality and performance should be tested for Linux OS VMs (VNFs, other) and Windows OS VMs (enterprise workloads).
- Alec: OpenStack use cases - VNFs based on Linux relying on the kernel virtio driver - kernel Linux bridge, kernel L3/IPv4 routing
- Irene: iPerf, netPerf as workloads.
VPP configuration
- vhost configuration: single-queue, multi-queue
Multiple VM Tests
For performance tests we should aim for a box-full. Note that the VM is not the DUT, so we need to ensure that at the VPP NDR/PDR rate the VM has access to more CPU resources (and other critical resources) than it needs for that rate, IOW that it is not overloaded. Assuming we use testpmd in the VM, we should allocate 1 vCPU per pmd thread, so a VM with 2 virtio interfaces needs 2 vCPUs for pmd threads and 1 vCPU for the main testpmd thread and QEMU emulator tasks.
There was a suggestion to test a minimum of 10 VMs, i.e. 10 x PVP chains (PVP - Phy-VM-Phy), with 2 networks per chain. Using the topology naming convention described earlier, and still assuming 2 physical ports, this would translate into:
- 2p1nic-dot1q-l2bdbase-eth-20vhost-10vm
Note that the LF FD.io performance testbeds use 2-CPU-socket servers with Xeon E5-2699v3 CPUs, each with 18 cores. Assuming we want to avoid crossing NUMA boundaries to start with, we will struggle to fit 10 x 3 vCPUs (30 vCPUs) into an 18-core socket. For reference scale tests we would also prefer to avoid using SMT (Hyper-Threading). For further discussion.
There was another suggestion that 4 VMs would be a good number to test.
For more details about the current LF FD.io performance testbeds, see this wiki: https://wiki.fd.io/view/CSIT/CSIT_LF_testbed.
Vhost Single-queue and Multi-queue
Vhost single-queue is the baseline. Vhost multi-queue seems to matter only for large VMs that generate a lot of traffic and come close to overloading the worker thread dealing with it. Both need to be tested.
For multi-queue cases, ideally test 1, 2 and 4 queues per vhost interface, combined with different numbers of VPP workers, e.g. 0 workers (single VPP thread), 1 worker, 2 workers, and so on. A hedged configuration sketch follows.
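In the sketch below the qemu netdev requests multiple queue pairs, the guest virtio device enables mq, and the VPP worker count is set in startup.conf. Queue and core counts are illustrative assumptions.

  # qemu side: 2 queue pairs per vhost-user netdev (vectors = 2*queues + 2)
  -netdev type=vhost-user,id=net1,chardev=chr1,queues=2 \
  -device virtio-net-pci,netdev=net1,mq=on,vectors=6

  # VPP side: example startup.conf cpu section with 2 worker threads
  cpu {
    main-core 1
    corelist-workers 2-3
  }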
Test Flows Definitions for L2 and L3 Tests
Tests should cause the equivalent of multiple flows in OVS, using a varied mix of traffic that includes both Layer 2 and Layer 3 traffic. Note that the flows need to be engineered to exercise the actual usage scenario of the vSwitch-VPP forwarding mode, IOW scaled L2 flows for L2BD MAC switching and L3 flows for IPv4/IPv6 routed forwarding.
Many flows are a must. We should define the numbers of flows to test: 10k, 100k, 1M? How many flows are expected on a server running many VNF VMs (e.g. many = 10)? Is there a recommended number of flows preferred in NFV setups?
We should also standardize on a basic, limited set of frame sizes: 64B, IMIX (7*64B, 4*570B, 1*1518B), 1518B. This can be extended if needed to the list defined in RFC 2544.
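For reference, interpreting the 7:4:1 mix as a per-packet ratio, the average IMIX frame size works out to roughly:

  (7*64B + 4*570B + 1*1518B) / 12 = 4246B / 12 ≈ 354B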
Multiple Interfaces
Most deployments will have a limited number of physical interfaces per compute node: one interface, or two bonded interfaces. The number of vhost interfaces is going to be an order of magnitude larger. With the example of 10 VMs and 2 networks per VM, that's 20 vhost interfaces for 1 or 2 physical interfaces. Of course there might be special configs with very different requirements (large oversubscription of VMs, or a larger number of physical interfaces), but a vhost scale profile of 10 x PVP with 20 vhost interfaces and 2 physical interfaces looks like a good starting point.
Equally importantly, we need to test VM restart and vhost interface restart (delete - create). OpenStack integration generates a significant amount of vhost interface delete-recreate churn.
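For the delete-recreate churn described above, a hedged VPP CLI sketch of the operations a single cycle involves; the interface name and socket path are examples, the exact CLI may differ between VPP releases, and the recreated interface may receive a new name.

  # Tear down and recreate one vhost-user interface, as an OpenStack port unplug/plug would
  vppctl delete vhost-user VirtualEthernet0/0/0
  vppctl create vhost-user socket /var/run/vpp/sock1.sock server
  vppctl set interface state VirtualEthernet0/0/0 up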
161202 notes from TWS call
- Alec: Most OpenStack deployments use 1 physical interface for tenant traffic. See the points raised in the thread on csit-dev.
- Alec: Similarly, many OpenStack deployments use L2 link bonding (LAG) of physical interfaces, e.g. 2 interfaces in a LAG.
Primary Overlay Case
Need to cover the OpenStack VXLAN overlay case: the VTEP is in the vswitch, everything below the vswitch is VXLAN traffic, and everything above the VTEP is straight L2 forwarding to the vhost interfaces. This calls for testing this topology:
- 2p1nic-ethip4vxlan-l2bdbase-eth-2vhost-1vm
And the scaled-up 10 x PVP scenario (a hedged VTEP configuration sketch follows):
- 2p1nic-ethip4vxlan-l2bdbase-eth-20vhost-10vm
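A hedged VPP CLI sketch of the VTEP-in-vswitch arrangement for the baseline topology above; addresses, VNI, bridge-domain ID and interface names are illustrative assumptions.

  # Underlay IPv4 address on the physical port (local VTEP address)
  vppctl set interface state TenGigabitEthernet0/8/0 up
  vppctl set interface ip address TenGigabitEthernet0/8/0 192.168.1.1/24

  # VXLAN tunnel towards the remote VTEP, bridged with the vhost-user interface
  vppctl create vxlan tunnel src 192.168.1.1 dst 192.168.1.2 vni 100
  vppctl set interface l2 bridge vxlan_tunnel0 10
  vppctl set interface l2 bridge VirtualEthernet0/0/0 10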
Additional Overlay and Underlay Cases
There is a request for MPLS-over-Ethernet, but the required VPP forwarding mode is unclear.