Difference between revisions of "VPP/Using VPP In A Multi-thread Model"

From fd.io

Revision as of 18:38, 24 April 2017

Modes

VPP can work in the following modes (modes 3 and 4, which used io threads, are deprecated as of VPP 16.09):

  • single-thread
  • multi-thread with worker threads

Single-thread

In single-thread mode, one main thread handles both packet processing and management functions (command-line interface (CLI), API, stats). This is the default setup and needs no special startup configuration.


Multi-thread with Worker Threads

In this mode, the main thread handles management functions (debug CLI, API, stats collection), and one or more worker threads handle packet processing from input to output.

Each worker thread polls the input queues on a subset of interfaces.

With RSS (Receive Side Scaling) enabled, multiple threads can service one physical interface: the RSS function on the NIC distributes traffic between different queues, which are serviced by different worker threads.
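The idea behind RSS can be sketched in a few lines of Python. This is only an illustration: real NICs typically use a Toeplitz hash computed in hardware, whereas this sketch substitutes CRC32, and the function name `rss_queue` and the sample flows are hypothetical.

```python
import zlib

def rss_queue(src_ip, dst_ip, src_port, dst_port, num_queues):
    """Pick an RX queue for a flow.

    Illustrative stand-in only: CRC32 replaces the hash a real NIC would use.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % num_queues

# Eight hypothetical flows spread over two RX queues of one interface.
flows = [("10.0.0.1", "10.0.0.2", 1000 + i, 80) for i in range(8)]
queues = [rss_queue(*flow, num_queues=2) for flow in flows]
print(queues)
```

Because the queue depends only on the flow's addresses and ports, all packets of one flow land on the same queue and are therefore handled by the same worker thread.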

Thread placement

Thread placement is defined in the startup config under the cpu { ... } section.

The VPP platform can place threads automatically or manually. Automatic placement works in the following way:

  • if "skip-cores X" is defined, the first X cores will not be used
  • if "main-core X" is defined, the VPP main thread will be placed on core X; otherwise the first available core will be used
  • if "workers N" is defined, VPP will allocate the first N available cores and run worker threads on them (this implies multi-thread mode with worker threads)
  • if "corelist-workers A,B1-Bn,C1-Cn" is defined, VPP will assign those CPU cores to worker threads (this implies multi-thread mode with worker threads)
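The automatic placement rules can be modelled roughly as follows. This is a simplified sketch of the rules as listed here, not VPP's actual implementation; the function `auto_place` and its parameters are hypothetical, and real placement also depends on the CPU topology and the VPP version.

```python
def auto_place(available_cores, skip_cores=0, main_core=None, workers=0):
    """Simplified model of automatic thread placement (assumption, not VPP code)."""
    # "skip-cores X": the first X cores are not used.
    usable = sorted(available_cores)[skip_cores:]
    # "main-core X": use core X, otherwise the first available core.
    if main_core is None:
        main_core = usable[0]
    # "workers N": take the first N cores that are still free.
    remaining = [core for core in usable if core != main_core]
    return {"main": main_core, "workers": remaining[:workers]}

# Hypothetical 8-core machine with "skip-cores 1" and "workers 3".
placement = auto_place(range(8), skip_cores=1, workers=3)
print(placement)
```

Under these assumptions the main thread lands on core 1 and the workers on cores 2, 3 and 4.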

Users can see the active thread placement by using the VPP debug CLI command show threads:

vpd# show threads
ID     Name                Type        LWP     lcore  Core   Socket State
0      vpe_main                        59723   2      2      0      wait
1      vpe_wk_0            workers     59755   4      4      0      running
2      vpe_wk_1            workers     59756   5      5      0      running
3      vpe_wk_2            workers     59757   6      0      1      running
4      vpe_wk_3            workers     59758   7      1      1      running
5                          stats       59775
vpd#
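A deployment script might want to check the placement programmatically. The following sketch parses the sample output above by splitting on whitespace; it assumes the column layout shown here, and the helper name `parse_show_threads` is hypothetical. The output format may differ between VPP versions.

```python
SAMPLE = """\
ID     Name                Type        LWP     lcore  Core   Socket State
0      vpe_main                        59723   2      2      0      wait
1      vpe_wk_0            workers     59755   4      4      0      running
2      vpe_wk_1            workers     59756   5      5      0      running
3      vpe_wk_2            workers     59757   6      0      1      running
4      vpe_wk_3            workers     59758   7      1      1      running
5                          stats       59775
"""

def parse_show_threads(text):
    """Heuristic parser for the table above (assumes this exact column layout)."""
    threads = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) == 7:
            fields.insert(2, "")  # the main thread row has an empty Type column
        if len(fields) != 8:
            continue  # e.g. the stats row, which has no placement columns
        tid, name, ttype, lwp, lcore, core, socket, state = fields
        threads.append({
            "id": int(tid), "name": name, "type": ttype, "lwp": int(lwp),
            "lcore": int(lcore), "core": int(core), "socket": int(socket),
            "state": state,
        })
    return threads

threads = parse_show_threads(SAMPLE)
worker_lcores = [t["lcore"] for t in threads if t["type"] == "workers"]
print(worker_lcores)
```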

The sample output above shows the main thread running on lcore 2 (core 2 on CPU socket 0) and the worker threads running on lcores 4-7, split between sockets 0 and 1.


Sample Configurations

By default, at start-up VPP uses configuration values from:

/etc/vpp/startup.conf

The following sections describe some of the additional changes that can be made to this file. This file is initially populated from the files located in the following directory:

<VPP_Install_Dir>/vpp/vpp/conf/


Manual Placement

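Manual placement places the main thread on core 1, io threads on cores 3 and 19, and workers on cores 4, 5, 20 and 21. Note that io threads, and with them the corelist-io statement, are deprecated as of VPP 16.09.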


cpu {
  main-core 1
  corelist-io  3,19
  corelist-workers  4-5,20-21
}
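The core list syntax (single cores and ranges separated by commas) expands as sketched below. This is an illustrative parser written for this page, not VPP's own; the name `parse_corelist` is hypothetical.

```python
def parse_corelist(spec):
    """Expand a core list such as "4-5,20-21" into individual core numbers."""
    cores = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # A range "first-last" is inclusive on both ends.
            first, last = (int(x) for x in part.split("-", 1))
            cores.extend(range(first, last + 1))
        else:
            cores.append(int(part))
    return cores

print(parse_corelist("4-5,20-21"))
print(parse_corelist("3,19"))
```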

Auto placement

Auto placement is likely to place the main thread on core 1 and workers on cores 2,3,4.

cpu {
  skip-cores 1
  workers 3
}

Buffer Memory Allocation

The VPP platform is NUMA aware. It can allocate memory for buffers on different CPU sockets (NUMA nodes). The amount of memory allocated can be defined in the startup config for each CPU socket by using the socket-mem A[[,B],C] statement inside the dpdk { ... } section.

For example:

dpdk {
  socket-mem 1024,1024
}

The above configuration allocates 1GB of memory on NUMA#0 and 1GB on NUMA#1. Each worker thread uses buffers which are local to itself.

Buffer memory is allocated from hugepages. VPP prefers 1G pages if they are available; otherwise, 2MB pages are used.
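As a rough sizing aid, the number of hugepages needed per socket for a given socket-mem value is simple arithmetic. The sketch below covers only the socket-mem amount itself and assumes no additional overhead; the helper name `hugepages_needed` is hypothetical.

```python
def hugepages_needed(socket_mem_mb, page_size_mb):
    """Number of hugepages needed to back socket_mem_mb (rounded up)."""
    return -(-socket_mem_mb // page_size_mb)  # ceiling division

# "socket-mem 1024,1024": 1024 MB on each of two NUMA nodes.
socket_mem = [int(mb) for mb in "1024,1024".split(",")]
pages_2mb = [hugepages_needed(mb, 2) for mb in socket_mem]
pages_1g = [hugepages_needed(mb, 1024) for mb in socket_mem]
print(pages_2mb, pages_1g)
```

With 2MB pages this works out to 512 pages per socket; with 1G pages, one page per socket.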

VPP takes care of mounting/unmounting hugepages file-system automatically so there is no need to do that manually.

NOTE: If you are running the latest VPP release, there is no need to specify socket-mem manually. VPP discovers all NUMA nodes and allocates 512M on each by default. socket-mem is only needed if a larger number of mbufs is required (the default is 32768 per socket and can be changed with the num-mbufs startup config command).


Interface Placement in Multi-thread Setup

On startup, the VPP platform assigns interfaces (or interface, queue pairs if RSS is used) to different worker threads in a round-robin fashion.

The following example shows debug CLI commands to show and change interface placement:

vpd# sh dpdk interface placement
Thread 1 (vpp_wk_0 at lcore 5):
 TenGigabitEthernet2/0/0 queue 0
 TenGigabitEthernet2/0/1 queue 0
Thread 2 (vpp_wk_1 at lcore 6):
 TenGigabitEthernet2/0/0 queue 1
 TenGigabitEthernet2/0/1 queue 1
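The round-robin assignment that produces a layout like the one above can be sketched as follows. This is a simplified model, not VPP's implementation; the function `assign_round_robin` is hypothetical, and worker thread IDs are assumed to start at 1 because thread 0 is the main thread.

```python
from itertools import cycle

def assign_round_robin(interfaces, queues_per_interface, num_workers):
    """Hand out (interface, queue) pairs to worker threads in turn."""
    placement = {thread: [] for thread in range(1, num_workers + 1)}
    threads = cycle(list(placement))
    for ifname in interfaces:
        for queue in range(queues_per_interface):
            placement[next(threads)].append((ifname, queue))
    return placement

placement = assign_round_robin(
    ["TenGigabitEthernet2/0/0", "TenGigabitEthernet2/0/1"],
    queues_per_interface=2,
    num_workers=2,
)
for thread, pairs in placement.items():
    print(thread, pairs)
```

With two interfaces, two queues each and two workers, thread 1 receives queue 0 of both interfaces and thread 2 receives queue 1 of both, matching the sample output.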


The following example moves the processing of TenGigabitEthernet2/0/1 queue 1 to the first worker thread:

vpd# set dpdk interface placement TenGigabitEthernet2/0/1 queue 1 thread 1

DBGvpd# sh dpdk interface placement
Thread 1 (vpp_wk_0 at lcore 5):
 TenGigabitEthernet2/0/0 queue 0
 TenGigabitEthernet2/0/1 queue 0
 TenGigabitEthernet2/0/1 queue 1
Thread 2 (vpp_wk_1 at lcore 6):
 TenGigabitEthernet2/0/0 queue 1


NOTE: Interface placement currently works only for threading mode 2 (multi-thread with worker threads).