VPP/Per-feature Notes

This page contains a collection of Notes loosely organized by VPP feature.

VLAN

VLAN And Tags

If the interface is in L3 routing mode, packets with VLAN tags (one or two) must match a sub-interface or they will be dropped; this applies to single-tagged packets as well as QinQ packets.
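
For L3 operation the sub-interface must therefore be created with exact-match. A hypothetical sketch (the interface name and address are illustrative assumptions, not from the original notes):

 create sub-interfaces GigabitEthernet0/8/0 10 dot1q 10 exact-match
 set interface state GigabitEthernet0/8/0.10 up
 set interface ip address GigabitEthernet0/8/0.10 192.168.10.1/24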

Once the packet is received, the VLAN tags are not removed in the L3 input processing path, so it is possible to perform classification on the VLAN tags to check CoS fields, etc. When the packet is routed out an output interface, however, the L2 header is replaced with the correct header according to the adjacency information of the output interface; it is only at this output-path L2 header rewrite that the VLAN tags appear to be removed.

If an interface is put into L2 mode and associated with a bridge domain, however, an exact VLAN match is not necessary. Sub-interfaces can still be set up, and the ethernet-input node will perform a best-match sub-interface lookup. For example, if you have a sub-interface with VLAN 10 and another sub-interface with QinQ of outer VLAN 10 and inner VLAN 20 on the same interface, then (see the configuration sketch after this list):

  • Any packet with QinQ matching 10/20 will be matched to that QinQ sub-interface.
  • Any packet with outer VLAN 10 (other than 10/20), with or without inner VLAN tags, will be matched to the VLAN 10 sub-interface.
  • Any packet with no VLAN tag or outer VLAN tag other than 10 will be matched to the main interface.
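
A hypothetical CLI sketch of this setup (the interface name GigabitEthernet0/8/0 and bridge-domain ID 1 are assumptions):

 create sub-interfaces GigabitEthernet0/8/0 10 dot1q 10
 create sub-interfaces GigabitEthernet0/8/0 100 dot1q 10 inner-dot1q 20
 set interface l2 bridge GigabitEthernet0/8/0 1
 set interface l2 bridge GigabitEthernet0/8/0.10 1
 set interface l2 bridge GigabitEthernet0/8/0.100 1

Since neither sub-interface is created with exact-match, the ethernet-input node applies the best-match lookup described above.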

When a (sub-)interface is in L2 mode, you can also add a tag-rewrite operation to push/pop/replace VLAN tags as needed to make packet forwarding in the bridge domain (BD) work properly when the BD has (sub-)interfaces that receive packets with different VLAN tags.
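
Continuing the sketch above, a hypothetical tag-rewrite that pops the tags on ingress (VPP applies the rewrite on input and reverses it on output):

 set interface l2 tag-rewrite GigabitEthernet0/8/0.10 pop 1
 set interface l2 tag-rewrite GigabitEthernet0/8/0.100 pop 2

With these operations, packets from both sub-interfaces enter the bridge domain untagged and so can be forwarded consistently out any member (sub-)interface.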


VRF, Table Ids and Routing

(From email by Neale Ranns, 27-Aug-2017.)

A VRF is a virtualization of a router’s *IP* routing and forwarding. VRFs are typically identified by a name (and, again typically, named to refer to the VPN customer they represent). IP packets in VRF RED must be kept separate from IP packets in VRF BLUE. By ‘IP’ in this context we mean IPv4 and IPv6, unicast and multicast (known as sub-address families, or SAFIs). To provide this separation we therefore need 4 ‘tables’ per VRF, one for each SAFI. A ‘table’ in this context is the well-known longest-prefix match (LPM) DB. Tables are known by a unique per-AFI ID (note per-AFI, not per-SAFI, so IPv4 unicast and multicast share the same table ID). It is the client’s responsibility to assign unique table IDs to the tables within all of its VRFs. The client is free to choose the table ID from the full u32 range. So, bottom line: in the context of IP forwarding, a table (and its associated ID) refers to an instance of an LPM DB.

Despite code comments and variable naming, VPP does not maintain the concept of a VRF, i.e. it does not maintain a grouping of ‘tables’. At the client interface, VPP deals only with table IDs – i.e. the identifier the client provided for a given LPM DB. All APIs that claim to accept a VRF index should be renamed to accept an IP table ID. As with all things VPP, the data-structure that represents the LPM DB is allocated from a memory pool, and so it has an associated pool index – this is the FIB index. There is thus a one-to-one mapping between the externally visible, client-assigned ‘table ID’ and the internal-use-only ‘FIB index’. Both are a u32; neither is strictly typed…
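
The table-ID to FIB-index mapping is visible from the CLI; a hedged sketch, assuming a table with ID 100 has been created (see below):

 show ip fib table 100

The output header reports both the client-assigned table ID and the internal FIB index (the exact format varies by version).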

With regards to the creation of tables, I’m currently working on the API you discovered – ip_table_add_del. With this API the client instructs VPP to add/delete a ‘table ID’ (as discussed above). The VPP FIB has the concept of ownership or ‘sourcing’ of its resources. Sources can be external (i.e. the CLI or the API) or internal (e.g. LISP and DHCP). FIB resources are only completely freed when there are no more sources referencing them. My intention with the table add/delete API is that the client can add the table and then insert routes and bind interfaces. If the client then deletes the table, its routes will be purged. The table itself will then be deleted iff it held the last reference. With the introduction of this API, VPP will insist that it has been called to create the table before any routes or interfaces refer to it.

The current behavior is that tables can be created either by setting an interface into that table or by setting the ‘create_vrf_if_needed’ flag in a route add. There is no means to delete them, hence my new API work.
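
Assuming the ip_table_add_del work described above has landed, a hypothetical lifecycle from the CLI (table ID 100, the interface name, and the route are illustrative assumptions):

 ip table add 100
 set interface ip table GigabitEthernet0/8/0 100
 ip route add 10.0.0.0/24 table 100 via 192.168.1.2 GigabitEthernet0/8/0
 ip table del 100

Deleting the table purges the routes the client added to it; the table itself disappears only once the last source releases its reference.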

IP Addressing For Tunnels

In the world of tunnels we typically talk about the underlay and the overlay. The underlay contains the addresses that form the tunnel’s source and destination (the addresses one sees in the outer IP header) – i.e. the addresses in ‘create gtpu tunnel …’. The overlay contains the address configured on the tunnel interface, which is used for routing via the tunnel – i.e. the address in ‘set int ip addr gtpu_tunnel0 …’.

The tunnel’s source address and interface address should not be the same. If they were, then were the tunnel to go down (say, because a keep-alive mechanism failed), the tunnel’s interface address would be removed from the FIB; the tunnel’s source address would then no longer be reachable, so the tunnel could never receive more packets and consequently could never come back up.

Instead, one chooses the tunnel’s source address to be an address configured on another interface in the underlay. This could be a physical interface, usually the interface over which the tunnel destination is reachable, or a loopback. The downside of using a physical interface is that if it goes down, the tunnel is again unreachable, despite there perhaps being an alternate path from the peer to VPP. The benefit of using a loopback is that loopbacks never go down. So, to configure the underlay, do:

 loop create
 set int state loop0 up
 set int ip addr loop0 1.1.1.1/32

Note the use of a /32 as the loopback’s interface address. This is possible since one cannot connect peers to a loopback, hence the network comprises only one device.

Next, create some tunnels using the loopback’s interface address as the tunnel source:

 create gtpu tunnel src 1.1.1.1 dst 10.6.6.6 teid 1111 decap-next ip4
 create gtpu tunnel src 1.1.1.1 dst 10.6.6.6 teid 1112 decap-next ip4
 create gtpu tunnel src 1.1.1.1 dst 10.6.6.6 teid 1113 decap-next ip4

Now for the overlay addressing. Here we have choices. Firstly, we can assign each tunnel its own overlay address:

 set int ip addr gtpu_tunnel0 1.1.1.2/31
 set int ip addr gtpu_tunnel1 1.1.1.4/31
 set int ip addr gtpu_tunnel2 1.1.1.6/31

Note the use of a /31. GTPU tunnels are point-to-point, so we only need two addresses: one for us, one for the peer.

Or, secondly, we can use the same address for each of the tunnels if we make them unnumbered:

 loop create
 set int state loop1 up
 set int ip addr loop1 1.1.1.2/31
 set int unnumbered gtpu_tunnel0 use loop1
 set int unnumbered gtpu_tunnel1 use loop1
 set int unnumbered gtpu_tunnel2 use loop1
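
With either scheme, overlay prefixes are then routed via the tunnel interfaces; a hypothetical example (the prefix is an assumption):

 ip route add 10.8.8.0/24 via gtpu_tunnel0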


SPAN

SPAN without the L2 keyword places the packet mirror on an interface’s (main interfaces only) input and/or output, irrespective of whether the interface is in L2 or L3 mode.

SPAN with the L2 keyword performs packet mirroring on the L2 input and/or output of an interface or sub-interface that is in a bridge domain or cross-connect. One thing to note with L2 SPAN is that the output/destination interface used for the mirrored packets must also be in L2 mode. Using an L2 interface for output allows the user to configure a VLAN rewrite operation on that interface for the mirrored packets.
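
A hedged sketch of both forms, using assumed interface names:

 set interface span GigabitEthernet0/8/0 destination GigabitEthernet0/8/1 both
 set interface span GigabitEthernet0/8/0 l2 destination GigabitEthernet0/8/1 both

The first command mirrors at the device level; the second mirrors at L2 input/output and requires the destination interface to be in L2 mode.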

If SPAN is configured on an interface both with and without L2, packets will be mirrored twice when the interface is set to L2 bridging or xconnect: the packet is replicated once on input at the interface and again on L2 input to a BD/xconnect. If the interface is in L3 mode, L2 SPAN has no effect.