CSIT/Documentation

From fd.io
Revision as of 06:51, 14 December 2017 by Pmikus (Talk | contribs)



WORK IN PROGRESS

CSIT components

The FD.io CSIT system is developed using two main building blocks: Robot Framework and Python.

Robot Framework

Robot Framework (RF) is a test automation tool that allows one to select and run a set of test suites and test cases on a selected target topology (nodes), and that provides output logs in a readable and parseable form. RF uses column-based formatting for its files. For clarity and to avoid problems, the CSIT team has selected the "pipe and space separated" format (rather than the tab-separated format). RF is case insensitive, but we strive to be consistent in how we use upper/lower case in RF files. The real value of RF is its human readability - anyone who spends a little time reading RF source files can understand what is going on, and even non-programmers can understand the essence of what a given test case is doing.

There are only two types of RF files stored in CSIT: resource (a.k.a. library) files, and test suites (any RF file containing a test case). Resource files hold all Keywords (RF's name for functions/methods) that are generic and can be re-used. For example, NIC manipulation is re-used in nearly all test suites, hence it is placed in a resource file for common reuse as an RF library.

All tests in RF format are stored in tests/suites/ sub-directories. RF interprets every directory there as a test suite, and every file containing test cases is also taken as a test suite.

Python

Python needs no introduction by itself. In CSIT, we use Python (the latest release of the 2.7.x series) to perform tasks that are unsuitable for the RF format (e.g. lower-level code, or anything that starts to feel more like coding - conditionals, loops etc.). Since RF is written in Python, it integrates with Python extensions easily. In any RF file one can import .py scripts directly and reference classes and functions from the imported Python module. The RF documentation describes how to create a Python library and import it into an RF file.
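As a minimal sketch of what such an importable module might contain (the class name ExampleUtil and its method are hypothetical and not part of CSIT), a plain Python class is enough; RF exposes the public methods of an imported library class as keywords:

```python
class ExampleUtil(object):
    """Hypothetical keyword library; not an actual CSIT module."""

    @staticmethod
    def count_up_interfaces(interfaces):
        """Return how many interfaces in the given dict are in 'up' state.

        :param interfaces: Mapping of interface name to state string.
        :type interfaces: dict
        :returns: Number of interfaces whose state is 'up'.
        :rtype: int
        """
        return sum(1 for state in interfaces.values() if state == 'up')


print(ExampleUtil.count_up_interfaces({'eth0': 'up', 'eth1': 'down'}))
```

Once imported with a Library setting, such a method would be callable from an RF file as Count Up Interfaces.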

Python code is also used in the CSIT framework for traffic scripts. In a typical test, one has to verify whether the network dataplane of VPP works according to the configured functionality, and in CSIT we do that by sending crafted packets and validating them on receipt. We use the Scapy Python tool for this purpose. Scapy allows one to easily create, dissect, manipulate and pretty-print packets. Here is an example of what a CSIT Scapy Python traffic script looks like.

These scripts are re-used as much as possible, so that duplicated code is minimized. Traffic scripts are run on the TG (traffic generator) node in the test topology using an SSH command that passes variables to the script as command line arguments. There is a utility method in a CSIT Python module that generates command lines with arguments set based on the passed variables. This utility method is re-used because many of the parameters are common across the scripts, such as the MAC and IP addresses to use for traffic and which interfaces to use as Tx and Rx.
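The idea behind such a utility method can be sketched as follows. This is only an illustration under stated assumptions: the function name build_traffic_script_cmd, the script name send_icmp.py and the argument names are made up here and are not the actual CSIT API.

```python
try:
    from shlex import quote  # Python 3
except ImportError:
    from pipes import quote  # Python 2.7


def build_traffic_script_cmd(script, **params):
    """Build a shell command line for a traffic script (hypothetical helper).

    Each keyword argument becomes a ``--name value`` pair. Arguments are
    sorted by name so that the resulting command line is deterministic.
    """
    parts = ['python', quote(script)]
    for name in sorted(params):
        parts.append('--{0}'.format(name))
        # Quote values so that spaces or shell metacharacters survive SSH.
        parts.append(quote(str(params[name])))
    return ' '.join(parts)


# Script name and argument names below are illustrative only.
cmd = build_traffic_script_cmd(
    'send_icmp.py',
    tx_if='eth1', rx_if='eth2',
    src_mac='02:00:00:00:00:01', dst_mac='02:00:00:00:00:02')
print(cmd)
```

Keeping the command line generation in one place means every traffic script receives its common parameters in the same way.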

All Python code submitted to CSIT should conform to the PEP-8 style. In fact, there is a Jenkins job hooked to Gerrit that watches CSIT project changes and runs Pylint against all resources/*/*.py files. No submitted patch shall increase the number of Pylint violations reported by such a job (e.g. the csit-validate-pylint job); rather, the focus should be on lowering the number of warnings. In some cases this is impossible, so exceptions might apply, but in general one should work their hardest toward having clean Python code.

Starting with CSIT Robot Framework tests

pybot

RF provides the Python executable pybot, which is currently the main entry point to all CSIT tests. This documentation provides usage examples applicable to the execution of CSIT test suites, and as such should not be considered a complete reference of pybot usage options. Such information is left for self-study - use pybot --help to see all available options, or see the Robot Framework User Guide for details.

Log levels

All CSIT Python libraries use RF logging sub-system for logging. To specify logging verbosity level, add -L LEVEL as your command line argument. E.g. pybot -L TRACE ... to log everything. Logging information is then stored in output.xml and log.html output files. RF supported log levels are listed here from RF user guide:

  • FAIL - Used when a keyword fails. Can be used only by Robot Framework itself.
  • WARN - Used to display warnings. They are also shown in the console and in the Test Execution Errors section in log files, but they do not affect the test case status.
  • INFO - The default level for normal messages. By default, messages below this level are not shown in the log file.
  • DEBUG - Used for debugging purposes. Useful, for example, for logging what libraries are doing internally. When a keyword fails, a traceback showing where in the code the failure occurred is logged using this level automatically.
  • TRACE - More detailed debugging level. The keyword arguments and return values are automatically logged using this level.
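
The effect of the -L threshold can be modelled with a few lines of Python. This is a simplified sketch that only covers the levels listed above and is not how RF implements logging internally; the function name is_logged is an assumption made for illustration.

```python
# Levels from the list above, ordered from most to least verbose.
LEVELS = ['TRACE', 'DEBUG', 'INFO', 'WARN', 'FAIL']


def is_logged(message_level, threshold='INFO'):
    """Return True if a message at message_level passes the threshold."""
    return (LEVELS.index(message_level.upper())
            >= LEVELS.index(threshold.upper()))


print(is_logged('DEBUG'))            # default threshold is INFO
print(is_logged('DEBUG', 'TRACE'))   # as with: pybot -L TRACE ...
```

With the default INFO threshold, DEBUG and TRACE messages are dropped, which is why -L TRACE is used when everything should be logged.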

Test Suites / Test Cases

In CSIT one directory is used for all tests, conveniently named tests. All tests are grouped by suites - every directory is understood as a suite in RF, and the same applies to every file (CSIT uses exclusively .robot file suffix). This information can be leveraged by starting pybot -s suite_name to execute only test cases from that particular suite (wildcards are supported, just be aware of shell globbing). Here's more detailed explanation from RF user guide.

Selection of a concrete test case is possible too, just use -t name_of_testcase pybot option. You can combine -s and -t parameters.

Tags

To help with the selection of test cases for per-patch runs, CSIT has developed a tagging scheme documented in the tag_documentation.rst file. At minimum, each CSIT test case must provide a tag from two groups:

  • environment tag
    • defines on which environment this test case can be run
    • e.g. VM_ENV, HW_ENV
  • topology tag
    • defines what topology this test requires
    • e.g. 3_NODE_SINGLE_LINK_TOPO

Therefore, to run all test cases that can be run in a VM environment (VIRL, VMware, VirtualBox, KVM, etc.) and that require at least one link between DUT nodes, one can type: pybot --include VM_ENVAND3_NODE_SINGLE_LINK_TOPO. Note that the AND operator in RF tag patterns must be written in upper case. For usage guidelines of RF tag patterns see the RF user guide.
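To illustrate what an AND tag pattern selects, here is a deliberately simplified model in Python. It handles only the AND operator and compares tags case-insensitively; the function name matches_include is hypothetical and RF's real tag matching supports more operators and wildcards.

```python
def matches_include(test_tags, pattern):
    """Return True if the test carries every tag named in an AND pattern.

    Simplified model: the pattern is split on the literal string 'AND',
    so tag names that themselves contain 'AND' are not supported here.
    """
    wanted = [tag.lower() for tag in pattern.split('AND')]
    have = set(tag.lower() for tag in test_tags)
    return all(tag in have for tag in wanted)


vm_test = ['VM_ENV', 'HW_ENV', '3_NODE_SINGLE_LINK_TOPO']
hw_only_test = ['HW_ENV', '3_NODE_SINGLE_LINK_TOPO']
pattern = 'VM_ENVAND3_NODE_SINGLE_LINK_TOPO'
print(matches_include(vm_test, pattern))
print(matches_include(hw_only_test, pattern))
```

A test case is selected only when it carries both the environment tag and the topology tag.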

Topology

Every CSIT test case runs on an actual network topology, either virtual or physical. A topology is a collection of nodes with settings (like IP addresses of nodes, login credentials, and most importantly NIC and link information). Currently there is only one type of topology being used in CSIT - a three-node topology with one or two links between each pair of nodes. See the topology diagram below.

        +---------+                      +---------+
        |         <---------------------->         |
        |   DUT   |                      |   DUT   |
        |         <---------------------->         |
        +--^---^--+                      +--^----^-+
           |   |                            |    |
           |   |                            |    |
           |   |         +---------+        |    |
           |   +--------->         <--------+    |
           |             |   TG    |             |
           +------------->         <-------------+
                         +---------+

During tests all nodes are reachable through the MGMT network, connected to every node via dedicated NICs and links (not shown above for clarity). Each connection drawn on the diagram is potentially going to be used by test cases. It is critical that there is no traffic on these links other than what the traffic generator scripts or DUT VPPs send out. Other protocol packets (e.g. LLDP, DHCP generated by NICs or surrounding systems) are often present on such links, and they will cause test case failures.

Topology information is stored in YAML format in topologies/* sub-directories. To help in early syntax error detection, we provide YAML schemas in resources/topology_schemas sub-directory. Contents of your topology file have to represent your physical or virtual lab setup. Don't forget to pay special attention to login information, IP addresses and interfaces details.

Topology information is loaded from YAML file, processed and provided by CSIT Python library as a global variable named nodes during CSIT start. You might have spotted occurrences of it in .robot files as ${nodes['TG']} - this represents reference to dictionary value of parsed node named TG from topology file. This is then used with other CSIT Python libraries (e.g. SSH, DUT setup, and so on).
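
As a rough illustration of how the nodes dictionary is used from Python, consider the sketch below. The dictionary is a hand-written, heavily trimmed stand-in for a parsed topology file; the field names type and host, the addresses, and the helper nodes_of_type are assumptions made for this example only.

```python
# Hypothetical stand-in for the parsed topology file.
nodes = {
    'TG': {'type': 'TG', 'host': '10.0.0.3'},
    'DUT1': {'type': 'DUT', 'host': '10.0.0.1'},
    'DUT2': {'type': 'DUT', 'host': '10.0.0.2'},
}


def nodes_of_type(all_nodes, node_type):
    """Return the names of nodes with the given type, sorted by name."""
    return sorted(
        name for name, node in all_nodes.items()
        if node['type'] == node_type)


# The Python equivalent of ${nodes['TG']} in a .robot file.
print(nodes['TG']['host'])
print(nodes_of_type(nodes, 'DUT'))
```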

To specify your concrete topology file, pass -v TOPOLOGY_PATH:topologies/enabled/topology.yaml parameter as your pybot command line argument, just replace the topologies/enabled/topology.yaml with path to your own topology definition. Complete example of how to start bridge-domain tests with custom topology file in ~/my_topo.yaml looks like this: pybot -L TRACE -v TOPOLOGY_PATH:~/my_topo.yaml --include vm_envAND3_node_single_link_topo -s "bridge domain"
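
When pybot is started from a script rather than typed by hand, the command line above can be assembled programmatically. The helper below is a sketch, not part of CSIT; its name and parameters are assumptions, and it only uses the pybot options already shown in this document (-L, -v, --include, -s).

```python
def build_pybot_args(topology_path, include=None, suite=None,
                     log_level='INFO', tests_dir='tests'):
    """Return the pybot argument list for the given selection criteria."""
    args = ['pybot', '-L', log_level,
            '-v', 'TOPOLOGY_PATH:{0}'.format(topology_path)]
    if include:
        # Tags are joined with the RF tag pattern operator AND.
        args += ['--include', 'AND'.join(include)]
    if suite:
        args += ['-s', suite]
    # Directory from which RF starts searching for suites.
    args.append(tests_dir)
    return args


args = build_pybot_args(
    '~/my_topo.yaml',
    include=['vm_env', '3_node_single_link_topo'],
    suite='bridge domain',
    log_level='TRACE')
print(args)
```

Passing the arguments as a list (for example to subprocess.call) avoids shell quoting problems with suite names that contain spaces.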

Test execution walk-through

Once pybot is started, RF starts looking for test suites to execute. All test suites are stored in the tests/ sub-directory. RF looks recursively for files in this directory, and naturally it comes to tests/suites/__init__.robot. This file is loaded before any other suite - it acts as the initialization file for all suites in the given directory. CSIT uses this file to initialize the test run-time environment before any test cases are run. The setup currently consists of:

  • Setup Framework - prepares each node in topology for use by the framework for testing:
    • executed only and exactly once, before any test case execution;
    • uploads the whole CSIT directory (CSIT top level dir with contents) to every ${node}:/tmp/openvpp-testing directory for use during test;
    • makes sure all dependencies are installed on Nodes if needed, and prepares Python for execution in Virtualenv;
    • this last step is done in parallel for each topology Node, which saves a significant amount of time.
  • Setup All DUTs - prepares all DUTs for test execution:
    • executed in most cases(??) / potentially(??) before any test case;
    • restarts VPP instances, makes sure VPPs are up, performs common setup before test execution.
    • what about initializing TG(??).
  • Update All Interface Data On All Nodes:
    • downloads VPP interfaces data from each DUT, and stores sw_if_index of interfaces into the topology information file(??) for later use (VPP API commands take sw_if_index as parameter instead of NIC names).
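
The parallel per-node preparation mentioned under Setup Framework can be sketched with a thread pool. This is an illustration only: setup_node is a placeholder that does no real work, the host field and node data are assumptions, and the actual CSIT implementation may differ.

```python
from multiprocessing.pool import ThreadPool


def setup_node(node):
    """Placeholder for per-node setup (upload, dependency install)."""
    # A real implementation would talk to the node over SSH here.
    return '{0}: ready'.format(node['host'])


def setup_all_nodes(nodes):
    """Run setup_node for every node concurrently and collect results."""
    pool = ThreadPool(len(nodes))
    try:
        # map() blocks until every node has finished its setup.
        return pool.map(setup_node, list(nodes.values()))
    finally:
        pool.close()
        pool.join()


# Hypothetical node data for the example.
example_nodes = {
    'TG': {'host': '10.0.0.3'},
    'DUT1': {'host': '10.0.0.1'},
    'DUT2': {'host': '10.0.0.2'},
}
results = setup_all_nodes(example_nodes)
print(sorted(results))
```

Because the per-node work is dominated by waiting on the network, running it concurrently takes roughly as long as the slowest node instead of the sum of all nodes.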

After executing above listed procedures, RF/pybot loads test suites and test cases based on the selection criteria given in the command line. From there onwards it is just a plain old RF test cases execution per RF documentation. Typically there are few RF keywords (pre-defined in RF and CSIT library(??)) listed in suite setup and in test case setup that are executed before the actual test case, and from there on it's just the defined test case execution and evaluation.

Jenkins

The whole purpose of CSIT is to be continuous. To achieve that, all possible functional test cases have to be executed every time a patch is introduced to VPP. This is done by executing CSIT tests against all patches submitted to the vpp project in Gerrit. If any test fails unexpectedly, the Gerrit review receives a -Verified vote from the Jenkins user, rendering the review not submittable (by normal means). This approach helps to discover problems early, before VPP code is submitted to the VPP master branch. Such jobs (which are triggered by a change in a Gerrit review) are in general called verify jobs, and can be spotted by "verify" in their name (e.g. csit-verify, vpp-verify etc.). Analogously, there are merge jobs, which are triggered after a review in Gerrit is submitted to the master/parent branch. These jobs are typically used for publishing artefacts to Nexus and such (e.g. VPP .deb packages).

CSIT has to have its own Jenkins jobs that help validate incoming tests/code in all ways: code functionality (all tests are executed) and code cleanliness.


All CSIT jobs with their description are listed here.

CSIT Jobs

The role of these jobs is to verify that CSIT code is still working. For every review in CSIT Gerrit for functional tests, all tests are executed to verify that the patch did not cause any collateral damage. Furthermore, if a new test case/suite is added, it is executed too. This way we know whether the implemented test cases work as they should. We use a "golden" VPP version from Nexus that has previously been validated by CSIT to work fine - this eliminates multiple variables in the test (i.e. only the CSIT change varies, while the VPP version stays constant between runs).

CSIT requires PEP-8 code style, and to help code reviews, a Pylint job has been created: csit-validate-pylint. This job's task is to detect formatting violations and report them.

VPP Jobs

VPP jobs created by CSIT have the sole purpose of validating VPP review submissions (patchsets, if you will). This is achieved by executing a verified version of CSIT code against a build of VPP (built from the git parent version plus the patch from the given Gerrit review). The verified CSIT code version is identified by a git tag (a branch currently, but that's not important) that points to a known 100%-passing CSIT version. Therefore if a VPP test fails, it is due to the VPP code change rather than a potential problem in CSIT.

The VPP verification job vpp-csit-verify-virl builds VPP, checks out csit-verified and uses the built .deb packages to test VPP. That's how a VPP diff gets tested.


Starting RobotFramework

All examples below expect a cloned CSIT repository and a created virtual environment, which can be achieved like this:

username@vpp64:~/vpp/fd.io/csit$ virtualenv env
New python executable in env/bin/python
Installing setuptools, pip, wheel...done.
username@vpp64:~/vpp/fd.io/csit$ source env/bin/activate
(env)username@vpp64:~/vpp/fd.io/csit$ pip install -r requirements.txt
Collecting robotframework==2.9.2 (from -r requirements.txt (line 1))
/home/username/vpp/fd.io/csit/env/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
Collecting paramiko==1.16.0 (from -r requirements.txt (line 2))
  Using cached paramiko-1.16.0-py2.py3-none-any.whl
Collecting scp==0.10.2 (from -r requirements.txt (line 3))
  Using cached scp-0.10.2-py2.py3-none-any.whl
Collecting ipaddress==1.0.16 (from -r requirements.txt (line 4))
  Using cached ipaddress-1.0.16-py27-none-any.whl
Collecting interruptingcow==0.6 (from -r requirements.txt (line 5))
Collecting PyYAML==3.11 (from -r requirements.txt (line 6))
Collecting pykwalify==1.5.0 (from -r requirements.txt (line 7))
Collecting scapy==2.3.1 (from -r requirements.txt (line 8))
Collecting enum34==1.1.2 (from -r requirements.txt (line 9))
Collecting requests==2.9.1 (from -r requirements.txt (line 10))
  Downloading requests-2.9.1-py2.py3-none-any.whl (501kB)
    100% |████████████████████████████████| 503kB 582kB/s
Collecting ecdsa>=0.11 (from paramiko==1.16.0->-r requirements.txt (line 2))
  Using cached ecdsa-0.13-py2.py3-none-any.whl
Collecting pycrypto!=2.4,>=2.1 (from paramiko==1.16.0->-r requirements.txt (line 2))
Collecting docopt==0.6.2 (from pykwalify==1.5.0->-r requirements.txt (line 7))
Collecting python-dateutil==2.4.2 (from pykwalify==1.5.0->-r requirements.txt (line 7))
  Using cached python_dateutil-2.4.2-py2.py3-none-any.whl
Collecting six>=1.5 (from python-dateutil==2.4.2->pykwalify==1.5.0->-r requirements.txt (line 7))
  Using cached six-1.10.0-py2.py3-none-any.whl
Installing collected packages: robotframework, ecdsa, pycrypto, paramiko, scp, ipaddress, interruptingcow, PyYAML, docopt, six, python-dateutil, pykwalify, scapy, enum34, requests
Successfully installed PyYAML-3.11 docopt-0.6.2 ecdsa-0.13 enum34-1.1.2 interruptingcow-0.6 ipaddress-1.0.16 paramiko-1.16.0 pycrypto-2.6.1 pykwalify-1.5.0 python-dateutil-2.4.2 requests-2.9.1 robotframework-2.9.2 scapy-2.3.1 scp-0.10.2 six-1.10.0
/home/username/vpp/fd.io/csit/env/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
You are using pip version 7.1.2, however version 8.1.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
(env)username@vpp64:~/vpp/fd.io/csit$ export PYTHONPATH=.

Start Specific Test Suite

pybot -L TRACE -v TOPOLOGY_PATH:topologies/available/my_topo.yaml -s ipv4 tests

  • pybot - RobotFramework executable
  • -L TRACE - set the debugging level to TRACE
  • -v TOPOLOGY_PATH:topologies/available/my_topo.yaml - load the topology file topologies/available/my_topo.yaml
  • -s ipv4 - start all test suites that match name ipv4
  • tests - path to start tests from. In our case tests is the directory containing subdirs with test suites. RF reads these sub-directories recursively in search of test cases.

Start Specific Test Case

pybot -L TRACE -v TOPOLOGY_PATH:topologies/available/my_topo.yaml -t "VPP replies to ICMPv4 echo request" tests/

  • pybot - RobotFramework executable
  • -L TRACE - set the debugging level to TRACE
  • -v TOPOLOGY_PATH:topologies/available/my_topo.yaml - load the topology file topologies/available/my_topo.yaml
  • -t "VPP replies to ICMPv4 echo request" - start the test case that matches the name VPP replies to ICMPv4 echo request
  • tests - path to start tests from. In our case tests is the directory containing subdirs with test suites. RF reads these sub-directories recursively in search of test cases.

Typical problems

vpp-csit-verify job failed, now what?

how to read console logs

reading robotframework log

timeout problems

Writing a test case

This section describes the steps one has to take to add a new, fully functional test. A new test will most probably require touching multiple files in different locations in the CSIT code base, e.g. a Python library, a VAT template and a RobotFramework suite/test. Usually the developer has an idea of what they want to test - they may come with a bunch of scripts they used during unit testing of their feature, or they might just have a list of newly added VPP API functions. This guide walks through the process from nothing to a fully working test case. For this purpose, a bridge domain test case is chosen as an example.

Test case design

First, we have to specify what the given test is proving. In our case, we want to test if VPP's bridge domain (with enabled learning) forwards packets as expected. From VPP's perspective, we want to:

  • create a bridge domain
  • add two interfaces to it
  • set interfaces state to up

Then we will send a packet to one interface and expect it to appear unchanged on the other interface. Since the bridge domain is configured as learning and the destination MAC is unknown to VPP, the packet is flooded and has to appear on the other interface.
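
The expected behaviour can be illustrated with a toy model of a learning bridge. This is a conceptual sketch in plain Python, not VPP code; the class name LearningBridge and the port and MAC values are made up for the example.

```python
class LearningBridge(object):
    """Toy model of an L2 bridge domain with MAC learning enabled."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # learned MAC address -> port

    def forward(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        # Learn which port the source MAC lives behind.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress.
            return [p for p in self.ports if p != in_port]
        # Known destination: forward only there, never back out the ingress.
        return [out_port] if out_port != in_port else []


bd = LearningBridge(['if1', 'if2'])
# The destination MAC has not been learned yet, so the frame is flooded.
print(bd.forward('if1', '02:00:00:00:00:01', '02:00:00:00:00:02'))
```

With only two interfaces in the bridge domain, flooding an unknown destination and forwarding a learned one both deliver the frame to the other interface, which is exactly what the test checks.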

To configure this, the following VPP API calls have to be used:

bridge_domain_add_del
sw_interface_set_l2_bridge
sw_interface_set_flags

Lastly, one has to realize they work in a given topology. The most used topology currently is the circular topology with one link between nodes - this is the easiest one to set up. In this topology each node has two interfaces. The TG is connected to both DUTs, and the DUTs are interconnected with a cable too. Therefore, to test a packet flowing through VPP's bridge domain, we have to apply the bridge domain configuration on both DUTs.

Test suite creation

In this example, we are going to create a new test suite, because we know (let's suppose) that no bridge domain tests are present. We have to create a new directory in the tests/suites directory. Let's call it bridge_domain. RobotFramework will take the name of the directory as the test suite name. We will then create a new file in this subdirectory, which will contain test cases for the bridge_domain suite. For now let's call it test.robot (this could be something more descriptive like "untagged" to mark tests that are not using tagged packets).

cd $CSIT_ROOT
mkdir tests/suites/bridge_domain
touch tests/suites/bridge_domain/test.robot
${EDITOR} tests/suites/bridge_domain/test.robot

Every RobotFramework test suite file will in the end have at least two sections: Settings and Test Cases.

Settings section

This section sets properties for the whole test suite, like imports of libraries, imports of other robot framework files, tags settings, and setup/teardown phase hooks settings. Add these lines to your newly created test.robot file:

*** Settings ***
| Resource | resources/libraries/robot/default.robot

We've created a new (still empty) test suite, so far containing only the Settings section. In the Settings section we added a directive to make RobotFramework import another RobotFramework file. default.robot is a CSIT-created file with default imports that recur in multiple test suites. In the default.robot file there is a Variables directive, which causes the topology information to be loaded for later use. Therefore in our test cases we will have ${nodes} populated properly - a reference to the topology nodes information from the topology YAML configuration passed to the pybot command.

We know that we are going to send crafted packets using our traffic scripts, therefore we have to prepare the TG node for this. The reason is that functional tests could get intermixed with perf tests on the same topology, and perf tests could take NICs away from the Linux kernel and use a special driver for specialized packet generation. To get around this, one should make sure the NICs are visible in Linux by using a RobotFramework keyword from default.robot:

| Setup all TGs before traffic script
| | [Documentation] | Prepare all TGs before traffic scripts execution
| | All TGs Set Interface Default Driver | ${nodes}

We have to call this only once, because in our suite we want to use traffic scripts only. To make RobotFramework call this KW only once, we can add the Suite Setup directive to the Settings section:

*** Settings ***
| Resource | resources/libraries/robot/default.robot
| Suite Setup | Setup all TGs before traffic script

Furthermore we want to make sure VPP is running on DUTs, and if not start it up (VPP might have crashed during previous test). To do this before every test case, we can add following line to Settings section:

*** Settings ***
| Resource | resources/libraries/robot/default.robot
| Suite Setup | Setup all TGs before traffic script
| Test Setup | Setup all DUTs before test

Test Cases section

Now we define the test case. A test case starts with its name in the first column. One can write whatever they want here, since it is just taken by RobotFramework as plain text and used as such. In CSIT we strive to have self-explanatory test case names. The goal is to keep the test name short, yet detailed and self-explanatory, or at least to attempt to do so. The approach we've taken is to name test cases as if one were explaining the feature to others. We claim that VPP can forward packets through a bridge domain, and we are doing it in a circular topology. More documentation can be added later in the [Documentation] section of the test case definition, but one should be able to tell what the test case does just from its name. So here goes an attempt to name the test case:

*** Settings ***
| Resource | resources/libraries/robot/default.robot
| Suite Setup | Setup all TGs before traffic script
| Test Setup | Setup all DUTs before test
*** Test Cases ***
| Vpp forwards packets via L2 bridge domain in circular topology

When RobotFramework executes the instructions in the test case we are writing, the Test Setup phase has already been executed for us. Therefore VPPs on DUTs are in the up state, and we can fire off the VPP commands we want to test. The first thing to realize is that we need a reference to the DUT we want to execute these commands on. For this purpose the utility library NodePath has been created. It allows one to create a virtual "packet" (data) path through the topology nodes and their NICs. To use the library (any library really) in this test suite, we have to import it in the Settings section:

*** Settings ***
| Resource | resources/libraries/robot/default.robot
| Library | resources.libraries.python.NodePath
| Suite Setup | Setup all TGs before traffic script
| Test Setup | Setup all DUTs before test
*** Test Cases ***
| Vpp forwards packets via L2 bridge domain in circular topology


We know we plan to go through the topology in a circular manner, so our packets are going to go out of TG to DUT1, then to DUT2, and back to TG. We already have the ${nodes} object available to us in RobotFramework files through the default.robot import. Now we have to select NICs on these nodes. To refrain from using concrete port names/numbers in RobotFramework, we can do the following:

*** Settings ***
| Resource | resources/libraries/robot/default.robot
| Library | resources.libraries.python.NodePath
| Suite Setup | Setup all TGs before traffic script
| Test Setup | Setup all DUTs before test
*** Test Cases ***
| Vpp forwards packets via L2 bridge domain in circular topology
| | Append Nodes | ${nodes['TG']} | ${nodes['DUT1']} | ${nodes['DUT2']}
| | ...          | ${nodes['TG']}
| | Compute Path

Note that we didn't have to write NodePath.Append Nodes. That's because RobotFramework resolves non-conflicting names automatically. One can confirm that in log.html output, where RobotFramework will prepend name of library before the function name.

This will create node traversal path using the NodePath library.

*** Settings ***
| Resource | resources/libraries/robot/default.robot
| Library | resources.libraries.python.NodePath
| Suite Setup | Setup all TGs before traffic script
| Test Setup | Setup all DUTs before test
*** Test Cases ***
| Vpp forwards packets via L2 bridge domain in circular topology
| | Append Nodes | ${nodes['TG']} | ${nodes['DUT1']} | ${nodes['DUT2']}
| | ...          | ${nodes['TG']}
| | Compute Path
| | ${tg_if1} | ${tg}= | Next Interface                                          
| | ${dut1_if1} | ${dut1}= | Next Interface
| | ${dut1_if2} | ${dut1}= | Next Interface
| | ${dut2_if1} | ${dut2}= | Next Interface
| | ${dut2_if2} | ${dut2}= | Next Interface
| | ${tg_if2} | ${tg}= | Next Interface

Now we have variables filled with interface data. For example tg_if1 contains interface information about that interface on that given node (TG). We will use these variables later in the test, and pass them to other functions and RobotFramework keywords as arguments.
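
To give an intuition for what happens behind Append Nodes, Compute Path and Next Interface, here is a simplified stand-in written for this guide. It is not the CSIT NodePath implementation: the class SimplePath, the interfaces/link data layout and the port and link names are all assumptions chosen for illustration.

```python
class SimplePath(object):
    """Simplified stand-in for NodePath; not the CSIT implementation."""

    def __init__(self):
        self._nodes = []
        self._path = []

    def append_nodes(self, *nodes):
        """Add nodes to the path in traversal order."""
        self._nodes.extend(nodes)

    def compute_path(self):
        """For each pair of neighbouring nodes, pick a link they share."""
        self._path = []
        for node1, node2 in zip(self._nodes, self._nodes[1:]):
            # Map link name -> interface name on the second node.
            links2 = dict(
                (data['link'], name)
                for name, data in node2['interfaces'].items())
            for if1 in sorted(node1['interfaces']):
                link = node1['interfaces'][if1]['link']
                if link in links2:
                    self._path.append((if1, node1))          # egress side
                    self._path.append((links2[link], node2)) # ingress side
                    break
            else:
                raise RuntimeError('No link between neighbouring nodes')

    def next_interface(self):
        """Return the next (interface, node) pair on the path."""
        return self._path.pop(0)


# Hypothetical circular topology: TG - DUT1 - DUT2 - TG.
tg = {'interfaces': {'port1': {'link': 'link1'}, 'port2': {'link': 'link3'}}}
dut1 = {'interfaces': {'port1': {'link': 'link1'}, 'port2': {'link': 'link2'}}}
dut2 = {'interfaces': {'port1': {'link': 'link2'}, 'port2': {'link': 'link3'}}}

path = SimplePath()
path.append_nodes(tg, dut1, dut2, tg)
path.compute_path()
tg_if1, first_node = path.next_interface()
print(tg_if1)
```

Each link on the path yields two (interface, node) pairs, which is why the test case above calls Next Interface six times for three links.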

Now we have to create the bridge domain, add interfaces to it, set them up, and do this for both DUTs. So we know we are going to repeat some work. Why not create a RobotFramework function, also known as a Keyword, for it? This operation is going to be reused a number of times, so let's create a common RobotFramework resource file for it. The common RobotFramework libraries sit in the resources/libraries/robot directory. Let's add a bridge_domain.robot file there:

$ ${EDITOR} $CSIT_ROOT/resources/libraries/robot/bridge_domain.robot

A RobotFramework resource file differs from a RobotFramework suite file in that it does not contain a Test Cases section; it has a Keywords section instead.

Let's add the KW in the newly opened resource file:

*** Keywords ***
| Vpp l2bd forwarding setup
| | [Documentation] | Setup bridge domain between 2 interfaces on VPP node
| | [Arguments] | ${node} | ${if1} | ${if2} | ${learn}=${TRUE}
| | Set Interface State | ${node} | ${if1} | up
| | Set Interface State | ${node} | ${if2} | up
| | Vpp Add L2 Bridge Domain | ${node} | ${1} | ${if1} | ${if2} | ${learn}
| | All Vpp Interfaces Ready Wait | ${nodes}

We've named the KW as Vpp l2bd forwarding setup, and it takes 4 arguments: node to create the bridge domain on, two interfaces to add to that bridge domain, and learning option setting (which we set to default to TRUE). In the new Keyword code we actually do what we've written above: set interfaces up, add them to a bridge domain, and then wait for those interfaces to come up. Let's go over them one by one.

  • Set Interface State

Set Interface State is located in InterfaceUtil.py:

<snip>
   @staticmethod                                                                
   def set_interface_state(node, interface, state):                             
       """Set interface state on a node.                                        
                                                                                
       Function can be used for DUTs as well as for TGs.                        
                                                                                
       :param node: node where the interface is                                 
       :param interface: interface name or sw_if_index                          
       :param state: one of 'up' or 'down'                                      
       :type node: dict                                                         
       :type interface: str or int                                              
       :type state: str                                                         
       :return: nothing                                                         
       """                                                                      
<snip>

RobotFramework allows calling functions implemented in Python directly; you just have to know how the name of the function is translated to RobotFramework formatting. Every Python underscore is translated to a space in RobotFramework, and case does not matter. Hence set_interface_state is called Set Interface State in RobotFramework. Parameters are delimited by pipes.
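
The naming rule described above can be expressed as two small helper functions. They are written for this guide as an illustration of the rule and are not part of RF or CSIT.

```python
def keyword_to_function_name(keyword):
    """Translate an RF keyword name to a lower-case Python function name."""
    return keyword.strip().lower().replace(' ', '_')


def names_match(keyword, function_name):
    """Compare names as described above: ignore case, and treat spaces
    and underscores as equivalent."""
    def normalize(name):
        return name.lower().replace('_', '').replace(' ', '')
    return normalize(keyword) == normalize(function_name)


print(keyword_to_function_name('Set Interface State'))
print(names_match('Vpp Add L2 Bridge Domain', 'vpp_add_l2_bridge_domain'))
```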

  • Vpp Add L2 Bridge Domain | ${node} | ${1} | ${if1} | ${if2} | ${learn}

This KW still has to be implemented, but for now let's have a look at what parameters we are going to pass to it. We know that we are going to create a BD, and we know that it will exist on a VPP, which is on a DUT, which is represented by ${node}. Further parameters come out of the necessity to pass variables to the VPP API to create the BD:

bridge_domain_add_del bd_id {bd_id} flood 1 uu-flood 1 forward 1 learn {learn} arp-term 0

We do not care about the flood, uu-flood, forward and arp-term parameters, because those are going to be mostly static in our tests. So for now we only need to be able to change the bridge domain ID and the learn setting. Once the BD is created, the two interfaces ${if1} and ${if2} have to be added to it. For that we use the following VPP API:

sw_interface_set_l2_bridge sw_if_index {sw_if_id} bd_id {bd_id} shg 0 enable

We use a template approach to execute VAT scripts with parameters on a DUT. For each VPP API call (or series of calls, if they are logically bound together) we have a VAT (VPP API Test) template file. These are stored in ${CSIT}/resources/templates/vat/. For our purposes we are going to create a file called l2_bridge_domain.vat at that location (${CSIT}/resources/templates/vat/l2_bridge_domain.vat) and enter:

bridge_domain_add_del bd_id {bd_id} flood 1 uu-flood 1 forward 1 learn {learn} arp-term 0
sw_interface_set_l2_bridge sw_if_index {sw_if_id1} bd_id {bd_id} shg 0  enable
sw_interface_set_l2_bridge sw_if_index {sw_if_id2} bd_id {bd_id} shg 0  enable

One can't really "call" a VAT template file as such; first, Python code has to be created that takes the template file, replaces the placeholders with parameter values, and only then sends it to the given DUT/VPP. The proper location for this particular function is ${CSIT}/resources/libraries/python/L2Util.py, because it is related to L2 utilities. In that file there is a Python class with the same name as the module, L2Util, and we should add the function there. But how should we name the function so that it matches our RobotFramework reference above? As mentioned previously, when RF looks for a function, it searches its imports; when it looks for the function in Python, it replaces all spaces with underscores "_". It is also important to note that although RF looks up the function in a case-insensitive way, we've settled on lower case function names with underscore delimiters.

Our new function in the L2Util class of the L2Util.py module will look like this (don't forget to always write your documentation, we like it):

@staticmethod                                                                
def vpp_add_l2_bridge_domain(node, bd_id, port_1, port_2, learn=True):       
    """Add L2 bridge domain with 2 interfaces to the VPP node.               

    :param node: Node to add L2BD on.                                        
    :param bd_id: Bridge domain ID.                                          
    :param port_1: First interface name added to L2BD.                       
    :param port_2: Second interface name added to L2BD.                      
    :param learn: Enable/disable MAC learn.                                  
    :type node: dict                                                         
    :type bd_id: int                                                         
    :type port_1: str                                                        
    :type port_2: str                                                        
    :type learn: bool                                                        
    """                                                                      
    sw_if_index1 = Topology.get_interface_sw_index(node, port_1)             
    sw_if_index2 = Topology.get_interface_sw_index(node, port_2)             
    VatExecutor.cmd_from_template(node,                                      
                                  'l2_bridge_domain.vat',                    
                                  sw_if_id1=sw_if_index1,                    
                                  sw_if_id2=sw_if_index2,                    
                                  bd_id=bd_id,                               
                                  learn=int(learn))                          

Topology.get_interface_sw_index(node, port_1) is a function in the Topology Python module that gets VPP's sw_if_index for an interface. We need that for the above VPP API command. VatExecutor is a module that facilitates execution of VPP API calls using the vpp_api_test binary. cmd_from_template connects to the node, reads the contents of resources/templates/vat/l2_bridge_domain.vat, replaces occurrences of {sw_if_id1} with the value of the local variable sw_if_index1 (the same applies to sw_if_id2, bd_id and learn), and sends the result to vpp_api_test for execution. In effect, this code means: take the template, here are the values to use in the template, and execute it on the node (DUT).
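
The placeholder substitution step can be sketched in a few lines of plain Python. This is only an illustration under the assumption that the template uses {name} placeholders compatible with str.format; render_vat_template is a hypothetical helper, the template text is inlined instead of being read from the file, and the real VatExecutor.cmd_from_template may work differently (it also handles the connection to the node, which is omitted here).

```python
# Template text as it would appear in l2_bridge_domain.vat (inlined here).
VAT_TEMPLATE = (
    'bridge_domain_add_del bd_id {bd_id} flood 1 uu-flood 1 forward 1 '
    'learn {learn} arp-term 0\n'
    'sw_interface_set_l2_bridge sw_if_index {sw_if_id1} bd_id {bd_id} '
    'shg 0 enable\n'
    'sw_interface_set_l2_bridge sw_if_index {sw_if_id2} bd_id {bd_id} '
    'shg 0 enable\n'
)


def render_vat_template(template, **params):
    """Replace {placeholders} in a VAT template with parameter values."""
    # str.format raises KeyError if a placeholder has no matching parameter.
    return template.format(**params)


# The sw_if_index values 5 and 6 are made up for the example.
script = render_vat_template(VAT_TEMPLATE, bd_id=1, learn=int(True),
                             sw_if_id1=5, sw_if_id2=6)
print(script)
```

A missing parameter fails loudly with KeyError instead of sending a half-filled command to VPP, which is one reason to prefer named placeholders over positional ones.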

At this point, on DUT1 we have: VPP started, interface states set to up, a bridge domain created, and two interfaces added to the bridge domain.

  • All Vpp Interfaces Ready Wait | ${nodes}

What is still left to do is to wait for the interfaces to come up. That is done in the Vpp l2bd forwarding setup keyword by calling All Vpp Interfaces Ready Wait | ${nodes}. This is a very versatile keyword/function that polls VPP for interface stats and loops until the operational status of every interface is the same as its admin status. Using this function is far superior to having a simple "sleep" there.
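
To show why polling beats a fixed sleep, here is a minimal generic sketch of a poll-until-ready loop. wait_until and interfaces_ready are hypothetical names for this example; the actual keyword queries VPP interface state, which is replaced here by a stand-in that becomes ready on the third poll.

```python
import time


def wait_until(condition, timeout=30.0, interval=1.0,
               sleep=time.sleep, clock=time.time):
    """Poll condition() until it returns True or the timeout expires.

    :param condition: Callable without arguments returning bool.
    :param timeout: Maximum time to wait, in seconds.
    :param interval: Delay between polls, in seconds.
    :returns: True if the condition was met, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Stand-in for "are all interfaces up?": reports ready on the third poll.
polls = []


def interfaces_ready():
    polls.append(1)
    return len(polls) >= 3


print(wait_until(interfaces_ready, timeout=5.0, interval=0.01))
```

The loop returns as soon as the condition holds, so a fast DUT does not pay for a worst-case sleep, and a slow one gets a clear timeout result instead of a test failing later for an unrelated-looking reason.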

  • Set up DUTs using the new keyword
*** Settings ***
| Resource | resources/libraries/robot/default.robot
| Library | resources.libraries.python.NodePath
| Suite Setup | Setup all TGs before traffic script
| Test Setup | Setup all DUTs before test
*** Test Cases ***
| Vpp forwards packets via L2 bridge domain in circular topology
| | Append Nodes | ${nodes['TG']} | ${nodes['DUT1']} | ${nodes['DUT2']}
| | ...          | ${nodes['TG']}
| | Compute Path
| | ${tg_if1} | ${tg}= | Next Interface                                          
| | ${dut1_if1} | ${dut1}= | Next Interface
| | ${dut1_if2} | ${dut1}= | Next Interface
| | ${dut2_if1} | ${dut2}= | Next Interface
| | ${dut2_if2} | ${dut2}= | Next Interface
| | ${tg_if2} | ${tg}= | Next Interface
| | Vpp l2bd forwarding setup | ${nodes['DUT1']} | ${dut1_if1} | ${dut1_if2}
| | Vpp l2bd forwarding setup | ${nodes['DUT2']} | ${dut2_if1} | ${dut2_if2}

  • Validate VPP configuration by sending traffic

Now that the VPPs are configured and all interfaces are in the expected operational state, we should send packets out of one TG interface and expect those packets to appear on the second TG interface. Let's create a Robot Framework keyword to send L2 traffic and expect it to arrive on the other interface.

*** Settings ***
| Resource | resources/libraries/robot/default.robot
| Library | resources.libraries.python.NodePath
| Suite Setup | Setup all TGs before traffic script
| Test Setup | Setup all DUTs before test
*** Test Cases ***
| Vpp forwards packets via L2 bridge domain in circular topology
| | Append Nodes | ${nodes['TG']} | ${nodes['DUT1']} | ${nodes['DUT2']}
| | ...          | ${nodes['TG']}
| | Compute Path
| | ${tg_if1} | ${tg}= | Next Interface                                          
| | ${dut1_if1} | ${dut1}= | Next Interface
| | ${dut1_if2} | ${dut1}= | Next Interface
| | ${dut2_if1} | ${dut2}= | Next Interface
| | ${dut2_if2} | ${dut2}= | Next Interface
| | ${tg_if2} | ${tg}= | Next Interface
| | Vpp l2bd forwarding setup | ${nodes['DUT1']} | ${dut1_if1} | ${dut1_if2}
| | Vpp l2bd forwarding setup | ${nodes['DUT2']} | ${dut2_if1} | ${dut2_if2}
| | Send and receive ICMPv4 bidirectionally | ${nodes['TG']} | ${tg_if1} | ${tg_if2}

The last keyword is implemented in resources/libraries/robot/l2_traffic.robot, and it wraps two calls to Send and receive ICMP Packet to send a packet through the topology in both directions. Send and receive ICMP Packet takes the TG node, two of the TG's interfaces and src/dest IP addresses as parameters. These are then used in the traffic script, which is executed by a call to Run Traffic Script On Node:

| | ${src_mac}= | Get Interface Mac | ${tg_node} | ${src_int}                    
| | ${dst_mac}= | Get Interface Mac | ${tg_node} | ${dst_int}                    
| | ${args}= | Traffic Script Gen Arg | ${dst_int} | ${src_int} | ${src_mac}     
| |          | ...                    | ${dst_mac} | ${src_ip} | ${dst_ip}       
| | Run Traffic Script On Node | send_ip_icmp.py | ${tg_node} | ${args}

As one can notice, the above line refers to the send_ip_icmp.py script. We call that a traffic script; its purpose is to be executed on the TG node and to validate a certain VPP network configuration. In the send_ip_icmp.py case the script is expected to generate a packet that leaves one interface and comes back on the other one unchanged. The script is located in resources/traffic_scripts/send_ip_icmp.py. One can do whatever they want in these traffic scripts, but usually you'll end up using CSIT's tools to help you parse arguments, generate a packet, send it out, wait for it and validate the result.

args = TrafficScriptArg(['src_mac', 'dst_mac', 'src_ip', 'dst_ip'])          
src_mac = args.get_arg('src_mac')                                            
dst_mac = args.get_arg('dst_mac')                                            
src_ip = args.get_arg('src_ip')                                              
dst_ip = args.get_arg('dst_ip')                                              
tx_if = args.get_arg('tx_if')                                                
rx_if = args.get_arg('rx_if')                                                

rxq = RxQueue(rx_if)                                                         
txq = TxQueue(tx_if)                                                         

At this point we have two "queues" (just a naming convention) that allow reading packets from and writing packets to the given interfaces. In our case those two interfaces are src_int and dst_int from the keyword above. Now, let's create the raw packet that we will send out:

pkt_raw = (Ether(src=src_mac, dst=dst_mac) /                             
           IP(src=src_ip, dst=dst_ip) /                                  
           ICMP())                                                       

The above code uses scapy to generate an Ethernet packet with an IP header and ICMP payload. See http://www.secdev.org/projects/scapy/doc/usage.html for more information on how to use scapy to generate the packets you need.

Once we have the packet generated, we have to send it out:

txq.send(pkt_raw)                                                            

The next step is to wait for the packet to arrive on the other interface:

ether = rxq.recv(2)
# Check whether received packet contains layers Ether, IP and ICMP           
if ether is None:                                                            
    raise RuntimeError('ICMP echo Rx timeout')                               

if not ether.haslayer('IP'):                                            
    raise RuntimeError('Not an IP packet received {0}'                       
                       .format(ether.__repr__()))                            

if not ether.haslayer('ICMP'):                                          
    raise RuntimeError('Not an ICMP packet received {0}'                     
                       .format(ether.__repr__()))                            


And that's it. We have created RobotFramework suite, test case, keyword, python code and traffic script and we have them all tied together. The last part is how to execute the test - to test a particular test case, you can pass -t parameter to pybot:

pybot ... -t "Vpp forwards packets via L2 bridge domain in circular topology" tests

Or you can use wildcards:

pybot ... -t "Vpp forwards packets via L2 bridge *" tests
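
The effect of such a wildcard can be approximated in Python with the standard fnmatch module. This is only a rough model of the selection: select_tests is a hypothetical helper, the second test name is made up for the example, and Robot Framework's own matching has additional normalization rules that are not reproduced here.

```python
from fnmatch import fnmatchcase


def select_tests(names, pattern):
    """Return test names matching a glob pattern, ignoring case."""
    return [name for name in names
            if fnmatchcase(name.lower(), pattern.lower())]


TEST_NAMES = [
    'Vpp forwards packets via L2 bridge domain in circular topology',
    # Hypothetical second test case, present only to show a non-match.
    'Vpp forwards packets via L2 xconnect in circular topology',
]

for name in select_tests(TEST_NAMES, 'Vpp forwards packets via L2 bridge *'):
    print(name)
```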

CSIT Code Structure

CSIT project consists of the following:

  • RobotFramework tests, resources, and libraries.
  • bash scripts – tools, and anything system-related (copying files, installing SW on nodes, ...).
  • Python libraries
    • the brains of the execution.
    • for different functionality there is a different module, i.e.
      • vpp
        • ipv4 utils.
        • ipv6 utils.
        • xconnect.
        • bdomain.
        • VAT (vpp_api_test) helpers.
        • Config generator.
      • ssh.
      • topology.
      • packet verifier – packet generator and validator.
      • v4/v6 ip network and host address generator.
  • vpp_api_test templates.

Each RF testsuite/case has TAGs associated with it that describe what environment it can be run on (HW/VM) or what topology it requires. RobotFramework is executed with a parameter that links to the topology description file; we call it the topology for simplicity. This file is parsed into the variable “nodes”, which is later used in test cases and libraries.
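
As a rough illustration of how the parsed “nodes” variable can be used from a Python library, the sketch below filters nodes by type. The dictionary is a hypothetical, heavily trimmed stand-in for a parsed topology file (the field names and addresses are assumptions for this example, not the full CSIT schema), and nodes_of_type is not an existing CSIT function.

```python
# Hypothetical stand-in for the parsed topology file.
nodes = {
    'TG': {'type': 'TG', 'host': '192.0.2.1'},
    'DUT1': {'type': 'DUT', 'host': '192.0.2.2'},
    'DUT2': {'type': 'DUT', 'host': '192.0.2.3'},
}


def nodes_of_type(all_nodes, node_type):
    """Return the sorted names of nodes whose 'type' equals node_type."""
    return sorted(name for name, node in all_nodes.items()
                  if node['type'] == node_type)


print(nodes_of_type(nodes, 'DUT'))
```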

In general, test cases are written in readable English, so that even non-coders can understand them. These top level test cases should stay the same; in other words the test case text should not represent “how” the test is done, but “what” the test case does.

Libraries to handle VPP functionality are written in Python and are separated on per-feature basis: v4, v6, interface (admin up, state status and so on), xconnect and bdomain. More modules are going to be implemented when needed.

Performance tests are executed using packet traffic generators external to the servers running VPP code. Python APIs are used to control the traffic generators. Linux Foundation hosts physical infrastructure dedicated to FD.io, consisting of three 3-compute-node performance testbeds (compute node = x86_64 multi-core server). Two of the compute nodes run VPP code, one runs a software traffic generator. Currently CSIT performance tests are executed using TRex.

CSIT Test Code Guidelines

WORK IN PROGRESS

Here are some guidelines for writing reliable, maintainable, reusable and readable Robot Framework (RF) test code. CSIT uses Robot Framework version 2.9.2 (user guide).

RobotFramework test case files and resource files

  • General
    • RobotFramework test case files and resource files use special extension .robot
    • Usage of the pipe and space separated file format is strongly recommended. Tabs are invisible characters, which makes them error prone.
    • Files should be encoded in ASCII. Non-ASCII characters are allowed but they must be encoded in UTF8 (the default Robot source file encoding).
    • Line length is limited to 80 characters.
    • The licence (/csit/docs/licence.rst) must be included at the beginning of each file.
    • Copy-pasting code is an unwanted practice; any code that could be re-used has to be put into an RF keyword (KW) or Python library instead of being copy-pasted.
  • Test cases
    • Test cases are written in Behavior-driven style – i.e. in readable English so that even non-technical project stakeholders can understand it:
  *** Test Cases ***
  | VPP can encapsulate L2 in VXLAN over IPv4 over Dot1Q
  | | Given Path for VXLAN testing is set
  | | ...   | ${nodes['TG']} | ${nodes['DUT1']} | ${nodes['DUT2']}
  | | And   Interfaces in path are up
  | | And   Vlan interfaces for VXLAN are created | ${VLAN}
  | |       ...                                   | ${dut1} | ${dut1s_to_dut2}
  | |       ...                                   | ${dut2} | ${dut2s_to_dut1}
  | | And   IP addresses are set on interfaces
  | |       ...         | ${dut1} | ${dut1s_vlan_name} | ${dut1s_vlan_index}
  | |       ...         | ${dut2} | ${dut2s_vlan_name} | ${dut2s_vlan_index}
  | | ${dut1s_vxlan}= | When Create VXLAN interface     | ${dut1} | ${VNI}
  | |                 | ...  | ${dut1s_ip_address} | ${dut2s_ip_address}
  | |                   And  Interfaces are added to BD | ${dut1} | ${BID}
  | |                   ...  | ${dut1s_to_tg} | ${dut1s_vxlan}
  | | ${dut2s_vxlan}= | And  Create VXLAN interface     | ${dut2} | ${VNI}
  | |                 | ...  | ${dut2s_ip_address} | ${dut1s_ip_address}
  | |                   And  Interfaces are added to BD | ${dut2} | ${BID}
  | |                   ...  | ${dut2s_to_tg} | ${dut2s_vxlan}
  | | Then Send and receive ICMPv4 bidirectionally
  | | ... | ${tg} | ${tgs_to_dut1} | ${tgs_to_dut2}
    • Every test case should contain short documentation. (example will be added) This documentation will be used by testdoc tool - Robot Framework's built-in tool for generating high level documentation based on test cases.
    • Do not use hard-coded constants. It is recommended to use the variable table (***Variables***) to define test case specific values. Use the assignment sign = after the variable name to make assigning variables slightly more explicit:
  *** Variables ***
  | ${VNI}= | 23
    • Common test case specific settings of the test environment should be done in the Test Setup part of the Setting table (***Settings***).
    • Post-test cleaning and processing actions should be done in Test Teardown part of the Setting table (e.g. download statistics from VPP nodes). This part is executed even if the test case has failed. On the other hand it is possible to disable the tear-down from command line, thus leaving the system in “broken” state for investigation.
    • Every TC must be correctly tagged. List of defined tags is in /csit/docs/tag_documentation.rst file.
    • User high-level keywords specific for the particular test case can be implemented in the keyword table of the test case to enable readability and code-reuse.
  • Resource files
    • Used to implement higher-level keywords that are used in test cases or other higher-level keywords.
    • Every keyword must contain Documentation where the purpose and arguments of the KW are described.
    • The best practice is that the KW usage example is the part of the Documentation. It is recommended to use pipe and space separated format for the example.
    • Keyword name should describe what the keyword does, specifically and in a reasonable length (“short sentence”).


Python library files

  • General
    • Used to implement low-level keywords that are used in resource files (to create higher-level keywords) or in test cases.
    • Higher-level keywords can be implemented in python library file too, especially in the case that their implementation in resource file would be too difficult or impossible, e.g. nested FOR loops or branching.
    • Every keyword, Python module, class, method and enum has to contain a documentation string with a short description, the input parameters used and possible return value(s).
    • The best practice is that the KW usage example is part of the Documentation. It should contain two parts – a RobotFramework example and a Python example. It is recommended to use the pipe and space separated format for the RobotFramework example.
    • KW usage examples can be grouped and used in the class documentation string to provide better overview of the usage and relationships between KWs.
    • Keyword name should describe what the keyword does, specifically and in a reasonable length (“short sentence”).
    • The licence (/csit/docs/licence.rst) must be included at the beginning of each file.
  • Coding
    • It is recommended to use some standard development tool (e.g. PyCharm Community Edition) and follow PEP-8 recommendations.
    • All python code (not only RF libraries) must adhere to PEP-8 standard. This is enforced by CSIT Jenkins verify job.
    • Indentation – do not use tab for indents! Indent is defined as four spaces.
    • Line length – limited to 80 characters.
    • Imports - use the full pathname location of the module, e.g. from resources.libraries.python.topology import Topology. Imports should be grouped in the following order: 1. standard library imports, 2. related third party imports, 3. local application/library specific imports. You should put a blank line between each group of imports.
    • Blank lines - Two blank lines between top-level definitions, one blank line between method definitions.
    • Do not use global variables inside library files.
    • Constants – avoid using hard-coded constants (e.g. numbers, paths without any description). Use configuration file(s), like /csit/resources/libraries/python/constants.py, with appropriate comments.
    • Logging – log at the lowest possible level of implementation (debugging purposes). Use same style for similar events. Keep logging as verbose as necessary.
    • Exceptions – use the most appropriate exception rather than the general one ("Exception") if possible. Create your own exception if necessary and implement logging at debug level in it.
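
A short sketch of a library file that follows the guidelines above (documented module, class and method, a named constant, a specific exception with debug logging). Everything in it is hypothetical and written only for illustration: the module, the BridgeDomainUtil class, the exception and the value of MAX_BRIDGE_DOMAIN_ID are assumptions, not existing CSIT code or a limit taken from VPP. The licence header is omitted for brevity.

```python
"""Hypothetical library module illustrating the coding guidelines."""

import logging

# Assumed upper limit used only in this sketch; a real value would live in a
# commented configuration file such as constants.py.
MAX_BRIDGE_DOMAIN_ID = 1000


class BridgeDomainIdError(ValueError):
    """Raised when a bridge domain ID is outside the supported range."""

    def __init__(self, bd_id):
        message = 'Invalid bridge domain ID: {0}'.format(bd_id)
        logging.debug(message)
        super(BridgeDomainIdError, self).__init__(message)


class BridgeDomainUtil(object):
    """Bridge domain helper keywords (illustrative only)."""

    @staticmethod
    def validate_bridge_domain_id(bd_id):
        """Check that the bridge domain ID is within the allowed range.

        :param bd_id: Bridge domain ID.
        :type bd_id: int
        :returns: The validated bridge domain ID.
        :rtype: int
        :raises BridgeDomainIdError: If the ID is out of range.
        """
        if not 0 < bd_id <= MAX_BRIDGE_DOMAIN_ID:
            raise BridgeDomainIdError(bd_id)
        return bd_id


print(BridgeDomainUtil.validate_bridge_domain_id(1))
```

From a RobotFramework file the static method would be reachable as Validate Bridge Domain Id | ${bd_id} once the library is imported.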


Performance testing

CSIT performance testing follows the same approach and uses the same code structure as CSIT functional testing. The main difference is that performance testing is currently executed on dedicated bare-metal physical compute test beds.

There are three physical testbeds available for CSIT performance testing. Reservation system script is used to prevent executing more than one running instance of CSIT test suite per testbed. The only testbed topology available today is 3-node-single-link-topo.

Traffic generator

The core of execution is the initialization of the traffic generator. The traffic generator script is optimized for high performance testing to enable control of multiple streams with high throughput. All current performance test cases create two symmetric packet streams for bi-directional throughput (and shortly also latency) testing. A Python API is used to control the traffic generator: to start it, create the necessary streams based on input parameters, start the traffic and read the measured data. Low level traffic generator control scripts are encapsulated in appropriate Robot Framework (RF) keywords. All the performance related RF keywords are located in a dedicated robot library and can be used for creating new test suites/cases by adding it to the settings section:

| Resource | resources/libraries/robot/performance.robot

The keyword to use for traffic generator initialization is located in the performance.robot library:

| Initialize traffic generator | ${tg} | ${tg_if1} | ${tg_if2}
| ...                          | ${dut1} | ${dut1_if1} | ${dut1_if2}
| ...                          | ${dut2} | ${dut2_if1} | ${dut2_if2}
| ...                          | ${topology_type}

It is also part of suite setup:

| 3-node Performance Suite Setup

Or suite setup with NIC topology filtering:

| 3-node Performance Suite Setup with DUT's NIC model

(Note: Currently TRex is the only traffic generator used in CSIT performance testing. More traffic generators will be available soon. DropRateSearch.py provides a TG-independent implementation of the throughput search algorithms for finding the Non Drop Rate (NDR) and Partial Drop Rate (PDR).)

Performance testing DUTs

Each DUT in computed path is initialized per test case with startup and running configuration. Test case name should be defined in behavioral style. Example of test case defined in long_bridge_domain.robot suite:

| Find NDR by using RFC2544 linear search and 64B frames through bridge domain in 3-node topology
| | [Documentation]
| | ... | Find throughput with non drop rate for 64B frames by using
| | ... | linear search starting at 4.1Mpps, stepping down with step of 0.1Mpps
| | [Tags] | 1_THREAD_NOHTT_RSS_1 | SINGLE_THREAD
| | ${framesize}= | Set Variable | 64
| | ${start_rate}= | Set Variable | 4100000
| | ${step_rate}= | Set Variable | 100000
| | ${min_rate}= | Set Variable | 100000
| | ${max_rate}= | Set Variable | 14880952
| | Given Setup '1' worker threads and rss '1' without HTT on all DUTs
| | And   L2 bridge domain initialized in a 3-node circular topology
| | Then Find NDR using linear search and pps | ${framesize} | ${start_rate}
| | ...                                       | ${step_rate} | 3-node-bridge
| | ...                                       | ${min_rate} | ${max_rate}

In this example the name indicates that the test case will search for NDR (Non Drop Rate) throughput for 64B frames following RFC2544 by using the linear search algorithm in a 3-node topology. As we are testing the performance of the DUT, it is crucial to set up the startup configuration first to best suit our needs. That can be done by calling the proper keyword from the default.robot library:

| | Given Setup '1' worker threads and rss '1' without HTT on all DUTs

This will set up the DUTs and apply VPP startup configuration specific to every DUT in computed path. The startup configuration includes PCI interfaces information that is read from the topology file. Note that setting up startup configuration may require restart of DUT -- not SUT.

After starting VPP with the startup configuration we need to initialize the running configuration by calling keyword(s) designed for that (custom keywords may need to be written).

| | And   L2 bridge domain initialized in a 3-node circular topology

Currently available keywords in performance.robot library are 'L2 bridge domain', 'IPv4 forwarding', 'IPv6 forwarding', 'L2 xconnect' and 'VLAN dot1q'.

Throughput measurement initialization and start are implemented as keyword:

| | Then Find NDR using linear search and pps | ${framesize} | ${start_rate}
| | ...                                       | ${step_rate} | 3-node-bridge
| | ...                                       | ${min_rate} | ${max_rate}

So far we have implemented Linear Search, Binary Search and Combined Search (Linear followed by Binary for refined search). Both NDR (Non Drop Rate) and PDR (Partial Drop Rate) rates can be defined as search criteria. Results reporting is implemented as part of search keywords definition. All of them are defined in performance.robot library.
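
The three search strategies can be sketched as follows. This is a simplified illustration, not the code from DropRateSearch.py: the function names are hypothetical, has_loss stands for "run one traffic trial at this rate and report whether loss exceeded the criterion", and the fake DUT limit of 3250000 pps is made up so that the example is reproducible.

```python
def linear_search(has_loss, start_rate, step_rate, min_rate):
    """Step down from start_rate until a trial passes without loss.

    :returns: Highest tested rate without loss, or None if none was found.
    """
    rate = start_rate
    while rate >= min_rate:
        if not has_loss(rate):
            return rate
        rate -= step_rate
    return None


def binary_search(has_loss, min_rate, max_rate, precision):
    """Halve the interval [min_rate, max_rate] until it is <= precision."""
    if has_loss(min_rate):
        return None
    if not has_loss(max_rate):
        return max_rate
    low, high = min_rate, max_rate
    while high - low > precision:
        middle = (low + high) // 2
        if has_loss(middle):
            high = middle
        else:
            low = middle
    return low


def combined_search(has_loss, start_rate, step_rate, min_rate, precision):
    """Coarse linear search followed by a binary search for refinement."""
    coarse = linear_search(has_loss, start_rate, step_rate, min_rate)
    if coarse is None:
        return None
    upper = min(coarse + step_rate, start_rate)
    return binary_search(has_loss, coarse, upper, precision)


def fake_dut_has_loss(rate):
    """Stand-in for a real trial: the fake DUT drops above 3250000 pps."""
    return rate > 3250000


print(linear_search(fake_dut_has_loss, 4100000, 100000, 100000))
print(combined_search(fake_dut_has_loss, 4100000, 100000, 100000, 10000))
```

With the rates from the example test case, the linear search alone can only answer in multiples of the step, while the combined search spends a few extra trials to narrow the result down to the requested precision.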

All keywords are parametrized and it is easy to control the tests by setting the variables accordingly.

Performance testing description

There are two main types of performance tests.

  • Short – run traffic based on topology and setup and FAIL if there was packet loss. Duration of traffic run is set to 10 seconds. Only single traffic run per test case is fired. Tag for this type: PERFTEST_SHORT.
  • Long – use one of the available algorithms to search for PDR/NDR based on RFC2544. Each traffic trial is currently set to 60 seconds. Tag for this type: PERFTEST_LONG.

Each long type will have its PDR (Partial Drop Rate) or NDR (Non Drop Rate) version. PDR loss acceptance and type are set as parameters for each suite. Loss acceptance type should be either 'Frames' lost or 'Percentage' of interface line rate.
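
The pass/fail decision for a single trial can be sketched as below. loss_accepted is a hypothetical helper, and as a simplification it computes the 'Percentage' criterion relative to the number of frames sent in the trial; the actual CSIT implementation may define the percentage differently. NDR corresponds to an acceptance of zero.

```python
def loss_accepted(sent, received, acceptance=0.0, acceptance_type='Frames'):
    """Decide whether the loss seen in one trial is within the criterion.

    :param sent: Number of frames transmitted by the TG.
    :param received: Number of frames received back by the TG.
    :param acceptance: Allowed loss (0 for NDR, above 0 for PDR).
    :param acceptance_type: 'Frames' or 'Percentage'.
    :returns: True if the trial passes.
    """
    if sent <= 0:
        raise ValueError('No frames were sent')
    lost = sent - received
    if acceptance_type == 'Frames':
        return lost <= acceptance
    if acceptance_type == 'Percentage':
        # Simplification: percentage of frames sent in this trial.
        return 100.0 * lost / sent <= acceptance
    raise ValueError('Unknown acceptance type: {0}'.format(acceptance_type))


print(loss_accepted(1000, 1000))
print(loss_accepted(1000, 995, 0.5, 'Percentage'))
```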

Performance testing Test Teardown

During test teardown phase we are _currently_ showing statistics captured from DUTs.

Performance testing Suite Teardown

During suite teardown phase we are stopping traffic generator. (Note: Actual teardown phase depends and varies based on used Traffic generator)

Performance testing tags

We designed a custom TAG scheme to describe performance testing and the setup itself. Full documentation is in the tag_documentation.rst file. Each performance test or suite must contain at least the PERFTEST tag. It should also contain tags for specific characteristics of the startup configuration (e.g. SINGLE_THREAD) or the testing methodology (e.g. NDR, PERFTEST_LONG).

Performance testing name conventions

All performance suites must conform to the naming convention and must be located inside the performance suite directory:

[Short|Long]_[Test_Type]_[NIC_Type]

Example of use:

performance/Short_Xconnect_Intel-X520-DA2.robot
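
A naming check along these lines could be sketched with a regular expression. The pattern below is an assumption derived from the convention and the single example above (it assumes the NIC type contains no underscores, while the test type may); parse_suite_name is a hypothetical helper and not part of CSIT.

```python
import os
import re

# (Short|Long)_<Test_Type>_<NIC_Type>.robot; NIC type assumed to have no '_'.
SUITE_NAME_RE = re.compile(r'^(Short|Long)_(.+)_([^_]+)\.robot$')


def parse_suite_name(path):
    """Split a suite file name into (duration, test_type, nic_type).

    :returns: Tuple of three strings, or None if the name does not conform.
    """
    match = SUITE_NAME_RE.match(os.path.basename(path))
    if match is None:
        return None
    return match.groups()


print(parse_suite_name('performance/Short_Xconnect_Intel-X520-DA2.robot'))
```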

Performance testing report readability

Performance testing report readability is subject to further improvement based on feedback. We are reporting total packet Bandwidth [Gbps] and Throughput (pps) per stream, and also the total aggregate from the traffic generator's perspective. If any packet loss is observed, we report this information as well.
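
The relation between throughput in pps and bandwidth in Gbps can be sketched as follows. The sketch assumes bandwidth is counted at Layer 1, i.e. each Ethernet frame carries 20 extra bytes on the wire (preamble, start-of-frame delimiter and inter-frame gap); whether the CSIT report uses exactly this definition is not stated here, and both helper names are hypothetical.

```python
# Per-frame wire overhead: 7 B preamble + 1 B SFD + 12 B inter-frame gap.
ETHERNET_L1_OVERHEAD_BYTES = 20


def bandwidth_gbps(rate_pps, frame_size):
    """Convert a packet rate to Layer 1 bandwidth in Gbps."""
    bits_per_frame = (frame_size + ETHERNET_L1_OVERHEAD_BYTES) * 8
    return rate_pps * bits_per_frame / 1e9


def line_rate_pps(link_bps, frame_size):
    """Return the maximum whole number of frames per second on a link."""
    bits_per_frame = (frame_size + ETHERNET_L1_OVERHEAD_BYTES) * 8
    return int(link_bps // bits_per_frame)


# 64B frames on a 10 Gbps link.
print(line_rate_pps(10 * 10**9, 64))
print(bandwidth_gbps(14880952, 64))
```

Under this assumption, the 64B line rate of a 10 Gbps link comes out as 14880952 pps, the same number used as ${max_rate} in the example test case earlier.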

Full 'pybot' log can be downloaded and examined for more details. Log contains all information available (depends on verbose level parameter -L). Each keyword has its own section with all the output like startup configuration, running telemetry and variables.