VPP Security Groups
Features are tracked in VPP-427 as they are developed.
- Support classifiers/filters on any interface type (bridged / routed)
- Filter on IP-addresses with address mask or prefix length (IPv4 and IPv6)
- Filter on source and destination TCP/UDP port ranges
- Filter on source and destination L2 MAC addresses
- Support IPv6 with extension headers present
- Support fragmented packets and unknown transport layer headers
- Combinations of the above filters (e.g. MAC + IP)
- Filters on ingress and egress interfaces
- Stateful firewall. No application layer filtering.
Work list
Task | Owner | Priority | Status | Description |
API definition | Ole | 0 | WIP | |
Ingress/Egress support for classifier | | 0 | | |
Support for L2/L3 interfaces | | 0 | | |
"Established" behaviour | | 1 | | |
Stateful firewall | | 1 | | |
Port ip_tables_firewall.py from Neutron as unit test | | 1 | | |
/*
 * Access List Rule entry
 *
 * Future considerations:
 *   u32 proto_flags;
 *   u8 traffic_class;
 *   u32 flow_label;
 *   u32 extension_header_present;
 *   u8 port_range_operator
 */
typeonly define acl_rule {
  u8 is_permit;
  u8 is_ipv6;
  u8 src_ip_addr[16];
  u8 src_ip_prefix_len;
  u8 dst_ip_addr[16];
  u8 dst_ip_prefix_len;
  u8 proto;
  u16 src_port;
  u16 dst_port;
};

define acl_add {
  u32 client_index;
  u32 context;
  u32 count;
  vl_api_acl_rule_t r[count];
};

define acl_add_reply {
  u32 context;
  u32 acl_index;
  i32 retval;
};

define acl_del {
  u32 client_index;
  u32 context;
  u32 acl_index;
};

define acl_del_reply {
  u32 context;
  i32 retval;
};

define acl_interface_add_del {
  u32 client_index;
  u32 context;
  u8 is_add;
  u8 is_input;
  u32 sw_if_index;
  u32 acl_index;
};

define acl_interface_add_del_reply {
  u32 context;
  i32 retval;
};

define acl_dump {
  u32 client_index;
  u32 context;
  u32 sw_if_index; /* ~0 for all interfaces */
};

define acl_details {
  u32 context;
  u32 sw_if_index;
  u32 acl_index;
  u32 count;
  vl_api_acl_rule_t r[count];
};

/**
 * Add or delete MAC / IP ingress filter.
 * These rules restrict the MAC addresses that can send traffic.
 * If the ip_address is all-zero, any IP address is allowed and only the
 * MAC address is used for the ingress filtering.
 * There can be many MAC addresses on a given interface, a given MAC address
 * may have multiple addresses associated with it (by means of separate
 * ingress rules), and different MAC addresses can also have the same addresses.
 */
define ip_apr_macip_add_del_ingress {
  u32 client_index;
  u32 context;
  u32 sw_if_index;
  u8 is_add;
  u8 is_ipv6;
  u8 mac_address[6];
};

/**
 * @param context - sender context, to match reply w/response
 * @param retval - return code for the request
 */
define ip_apr_macip_add_del_ingress_reply {
  u32 context;
  i32 retval;
};
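To make the acl_rule layout concrete, here is a minimal Python sketch that packs one rule into bytes. It is an illustration only: it assumes the fields are laid out in declaration order without padding, that multi-byte integers are in network byte order, and that an IPv4 address occupies the first 4 bytes of the 16-byte address field. The helper name pack_acl_rule is hypothetical and not part of any VPP binding.

```python
import ipaddress
import struct

# Assumed wire layout: declaration order, no padding, network byte order.
# is_permit, is_ipv6, src_ip_addr[16], src_ip_prefix_len,
# dst_ip_addr[16], dst_ip_prefix_len, proto, src_port, dst_port
ACL_RULE_FORMAT = "!BB16sB16sBBHH"


def pack_acl_rule(is_permit, src_prefix, dst_prefix, proto, src_port, dst_port):
    """Pack one acl_rule; prefixes are strings such as '192.0.2.0/24'."""
    src = ipaddress.ip_network(src_prefix)
    dst = ipaddress.ip_network(dst_prefix)
    if src.version != dst.version:
        raise ValueError("source and destination must use the same IP version")
    # '16s' right-pads short values with zero bytes, so an IPv4 address
    # ends up in the first 4 bytes of the field (an assumption of this sketch).
    return struct.pack(
        ACL_RULE_FORMAT,
        1 if is_permit else 0,
        1 if src.version == 6 else 0,
        src.network_address.packed,
        src.prefixlen,
        dst.network_address.packed,
        dst.prefixlen,
        proto,
        src_port,
        dst_port,
    )


# Permit TCP (protocol 6) from anywhere to 192.0.2.0/24, destination port 22.
rule = pack_acl_rule(True, "0.0.0.0/0", "192.0.2.0/24", 6, 0, 22)
print(len(rule), rule.hex())
```

Under these assumptions one rule occupies 41 bytes, and acl_add would carry count such entries back to back.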
Design and prototyping
The stateful design is being prototyped in https://github.com/vpp-dev/vpp-lua-plugin/blob/master/samples/polua.lua
The goal of this prototype was to minimize the changes to the main forwarding path and to leave room for exploring optimizations later.
Another primary design criterion is to avoid, as far as possible, creating a separate forwarding path.
The main idea with the stateful design is to use the L2 classifier for storing the sessions.
For this, we create two chained tables per interface per direction: TCP/UDP then ICMP, and hook them into the processing path of the packet.
If the session is not in the table, a policy check is needed. The miss_next index of the ICMP table is therefore set to one of the nodes that perform the policy checks; there are four of them, one per combination of {ingress/egress, ip4/ip6}.
Each of the nodes is very simple: it checks the policy and, if the policy permits the packet, adds the session and recirculates the packet back into the lookup. The packet then hits the session and is processed by the "fast path".
For the purposes of this document we refer to the policy check path as the "slow path" and to the path using the established state as the "fast path", although this is a bit of a misnomer: the "slow path" does not actually pass packets through the box; it merely sets up the fast path and recirculates the packets to hit it.
Besides adding the forward flow, if there is a policy in the reverse direction, the slow path also sets up the mirror flow in the tables of the opposite direction, which avoids a policy check for the return packets of the flow. The only ICMP types considered to have "return" packets are echo/echo-reply.
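The slow path / fast path interplay described above can be sketched as a toy model in Python. This is not the prototype's code: plain sets stand in for the classifier tables, the Flow and Interface names are hypothetical, and the special handling of ICMP echo/echo-reply is left out.

```python
from collections import namedtuple

Flow = namedtuple("Flow", "src dst proto sport dport")

OPPOSITE = {"ingress": "egress", "egress": "ingress"}


def mirror(flow):
    """Return the flow as seen by the return packets."""
    return Flow(flow.dst, flow.src, flow.proto, flow.dport, flow.sport)


class Interface:
    def __init__(self, policies):
        # policies: direction -> callable(flow) -> bool; a missing key
        # means no policy is applied in that direction.
        self.policies = policies
        self.sessions = {"ingress": set(), "egress": set()}
        self.slow_path_hits = 0

    def process(self, direction, flow):
        """Return True if the packet is forwarded, False if it is dropped."""
        policy = self.policies.get(direction)
        if policy is None:
            return True                      # no policy, no lookup
        if flow in self.sessions[direction]:
            return True                      # fast path: session hit
        # Slow path: check the policy and set up the sessions.
        self.slow_path_hits += 1
        if not policy(flow):
            return False
        self.sessions[direction].add(flow)
        if OPPOSITE[direction] in self.policies:
            self.sessions[OPPOSITE[direction]].add(mirror(flow))
        # Recirculate: the packet now hits the session that was just added.
        return self.process(direction, flow)


def allow_ssh(flow):
    return flow.proto == 6 and flow.dport == 22


def deny_all(flow):
    return False


intf = Interface({"ingress": allow_ssh, "egress": deny_all})
syn = Flow("10.0.0.1", "10.0.0.2", 6, 40000, 22)
print(intf.process("ingress", syn))          # slow path, then fast path
print(intf.process("egress", mirror(syn)))   # return packet hits the mirror session
```

In this model the return packet is forwarded even though the egress policy denies everything, because the mirror session was installed when the forward flow was admitted.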
When ingress packet processing is done, VPP forwards the packet as usual, and a similar check against the flow table is then done on egress in l2-output-lookup, if a policy is applied there. Again, a missing session results in a redirect to a "slow path" node, which inserts a session and a return session and recirculates the packet.
This highlights a particularity: if the policy in one direction is anything other than "permit everything" and has some deny rules, then for proper functioning a "permit everything" policy must be applied in the opposite direction on the same interface, so that the return packets do not hit the policy lookup. However, the implementation can easily hide this from the user, so it is probably not a big problem.
There are, however, some more distinct shortcomings:
1) Not very frugal with memory. With policies applied, each connection consumes 4 session slots. How bad is it?
2) No TCP state tracking or UDP timeout tracking.
3) No cleanup at all of the classifier tables; only additions are performed. This MUST be taken care of and is TBD. Note that it is intentionally separate from (2), because it also covers scenarios such as plain high resource utilization.
4) No support for IPv6 EH or IPv4 fragments. This is a general issue with using the "simple" bitmask/match type of classifier, and so far the solution is TBD.
set interface input acl intfc <int> [ip4-table <index>] [ip6-table <index>] [l2-table <index>] [del]
show inacl type [ip4|ip6|l2]
classify table [miss-next|l2-miss_next|acl-miss-next <next_index>] mask <mask-value> buckets <nn> [skip <n>] [match <n>] [del]
show classify tables [index <nn>]
classify session [hit-next|l2-hit-next|acl-hit-next <next_index>|policer-hit-next <policer_name>] table-index <nn> match [hex] [l2] [l3 ip4] [opaque-index <index>]
test classify [src <ip>] [sessions <nn>] [buckets <nn>] [table <nn>] [del]
set ip classify intfc <int> table-index <index>
set interface ip6 table <intfc> <table-id>
set interface l2 input classify intfc <interface-name> [ip4-table <n>] [ip6-table <n>] [other-table <n>]
set interface l2 output classify intfc <interface-name> [ip4-table <n>] [ip6-table <n>] [other-table <n>]
set ip source-and-port-range-check
show ip source-and-port-range-check vrf <nn> <ip-addr> <port>
YANG model
Open Issues
- Security Group use case specific API. Done in VPP or control plane plugin?
Existing functionality
The existing functionality provides matching through the classifier (https://wiki.fd.io/view/VPP/Introduction_To_N-tuple_Classifiers).
As the above document explains, the classifier is a series of chained tables; each table has its own mask, which is the same for all entries in that table.
This has been tested to work in the L2 bridged case (test case: http://stdio.be/vpp/t/aytest-bridge-tap-py.txt).
Consider the following example policy:
nova secgroup-create test-secgroup test
nova secgroup-add-rule test-secgroup icmp -1 -1
nova secgroup-add-rule test-secgroup tcp 22 22
Assuming we match with offset 0 (from the beginning of the packet), the mask for the first rule (ICMP) looks like this:
000000000000 000000000000 0000 00  00 0000 0000 0000 00  FF 0000 00000000 00000000 00 00 0000 0000
eth dst      eth src      et   ihl t  len  id   fo   ttl pr cs   ip4src   ip4dst   t  c  cs   id
+-------- L2 -----------------+----------- L3 IPv4 -------------------------------+--- L4 ICMP ---+
For the TCP matching on port 22 it will look as follows:
000000000000 000000000000 0000 00  00 0000 0000 0000 00  FF 0000 00000000 00000000 0000 FFFF 00000000 00000000 0000 0000 0000 0000
eth dst      eth src      et   ihl t  len  id   fo   ttl pr cs   ip4src   ip4dst   sp   dp   seq      ack      fl   win  cs   urg
+-------- L2 -----------------+----------- L3 IPv4 -------------------------------+-------- L4 TCP -------------------------------+
(The mask length needs to be rounded up to the nearest sensible 16-byte boundary, since the classifier matches in units of 16-byte vectors.)
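As an illustration of how the two IPv4 masks above could be built programmatically, here is a small Python sketch. The offsets assume an untagged Ethernet frame and an IPv4 header without options; the helper name build_mask is hypothetical.

```python
# Byte offsets from the start of the Ethernet frame.
ETH_LEN = 14                                      # untagged Ethernet header
IP4_HEADER_LEN = 20                               # assumes no IPv4 options
IP4_PROTO_OFFSET = ETH_LEN + 9                    # IPv4 protocol field
TCP_DPORT_OFFSET = ETH_LEN + IP4_HEADER_LEN + 2   # TCP destination port
VECTOR_SIZE = 16                                  # classifier vector size


def build_mask(fields):
    """Build a mask from (offset, length) byte ranges, padded to 16 bytes."""
    end = max(offset + length for offset, length in fields)
    padded = -(-end // VECTOR_SIZE) * VECTOR_SIZE  # round up to a full vector
    mask = bytearray(padded)
    for offset, length in fields:
        mask[offset:offset + length] = b"\xff" * length
    return bytes(mask)


ip4_proto_mask = build_mask([(IP4_PROTO_OFFSET, 1)])
ip4_proto_tcp_dport_mask = build_mask(
    [(IP4_PROTO_OFFSET, 1), (TCP_DPORT_OFFSET, 2)]
)

# The number of 16-byte vectors is what match_n_vectors would carry.
print(len(ip4_proto_mask) // VECTOR_SIZE, ip4_proto_mask.hex())
print(len(ip4_proto_tcp_dport_mask) // VECTOR_SIZE, ip4_proto_tcp_dport_mask.hex())
```

With these assumptions the protocol-only mask needs two vectors (32 bytes) and the protocol plus destination port mask needs three (48 bytes).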
For IPv6, assuming no extension headers, it looks similar, with the IPv6 header as the L3 header:
000000000000 000000000000 0000 0 00 00000 0000 FF 00 00000000000000000000000000000000 00000000000000000000000000000000 00 00 0000 0000
eth dst      eth src      et   v TC fll   len  nh hl ipv6 src                         ipv6 dst                         t  c  cs   id
+-------- L2 -----------------+----------- L3 IPv6 -------------------------------------------------------------------+--- L4 ICMP ---+
For the TCP matching on port 22 it will look as follows:
000000000000 000000000000 0000 0 00 00000 0000 FF 00 00000000000000000000000000000000 00000000000000000000000000000000 0000 FFFF 00000000 00000000 0000 0000 0000 0000
eth dst      eth src      et   v TC fll   len  nh hl ipv6 src                         ipv6 dst                         sp   dp   seq      ack      fl   win  cs   urg
+-------- L2 -----------------+----------- L3 IPv6 -------------------------------------------------------------------+-------- L4 TCP -------------------------------+
Then using these masks one would create 4 tables, by using the API call:
classify_add_del_table(is_add=1, skip_n_vectors=0, mask=<MMMM>, match_n_vectors=<NNNN>, nbuckets=32, memory_size=20000, next_table_index=-1, miss_next_index=-1)
Let's call these tables "IPv4PROTO", "IPv4PROTO_TCPDPORT", "IPv6PROTO", "IPv6PROTO_TCPDPORT".
One would set "IPv4PROTO" as the "next_table_index" of the "IPv4PROTO_TCPDPORT" table, and "IPv6PROTO" as the "next_table_index" of the "IPv6PROTO_TCPDPORT" table.
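The effect of chaining the tables this way can be illustrated with a small Python simulation. It models only the lookup order (mask the packet, look for a session, fall through to the next table on a miss), not the actual classifier implementation; the Table class and the helpers are hypothetical, and the packets are minimal IPv4 frames without options.

```python
def make_mask(length, *ranges):
    """Return a mask of `length` bytes with 0xFF over each (offset, size) range."""
    mask = bytearray(length)
    for offset, size in ranges:
        mask[offset:offset + size] = b"\xff" * size
    return bytes(mask)


def frame(proto, dport=0):
    """Minimal untagged Ethernet + IPv4 frame with only the matched fields set."""
    pkt = bytearray(48)
    pkt[12:14] = b"\x08\x00"                 # ethertype: IPv4
    pkt[14] = 0x45                           # version 4, ihl 5
    pkt[23] = proto                          # IPv4 protocol
    pkt[36:38] = dport.to_bytes(2, "big")    # L4 destination port
    return bytes(pkt)


class Table:
    def __init__(self, name, mask, next_table=None):
        self.name = name
        self.mask = mask
        self.next_table = next_table         # stands in for next_table_index
        self.sessions = set()

    def key(self, packet):
        data = packet[:len(self.mask)].ljust(len(self.mask), b"\x00")
        return bytes(p & m for p, m in zip(data, self.mask))

    def add_session(self, match):
        self.sessions.add(self.key(match))


def lookup(table, packet):
    """Return the name of the first table in the chain that matches, else None."""
    while table is not None:
        if table.key(packet) in table.sessions:
            return table.name
        table = table.next_table
    return None


ipv4_proto = Table("IPv4PROTO", make_mask(32, (23, 1)))
ipv4_proto_tcp_dport = Table(
    "IPv4PROTO_TCPDPORT", make_mask(48, (23, 1), (36, 2)), next_table=ipv4_proto
)
ipv4_proto.add_session(frame(1))                 # ICMP
ipv4_proto_tcp_dport.add_session(frame(6, 22))   # TCP destination port 22

for pkt in (frame(6, 22), frame(1), frame(6, 80)):
    print(lookup(ipv4_proto_tcp_dport, pkt))
```

The lookup starts at the most specific table, so an ICMP packet misses "IPv4PROTO_TCPDPORT" and is then matched by "IPv4PROTO", while TCP to any other port misses both.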
Then one needs to populate the tables with the correct matches for "ICMP" and "tcp dst port 22". That can be done using the API call:
classify_add_del_session(is_add=1, table_index=<XXXX>, match=<bytes-to-match>, hit_next_index=-1)
The "<bytes-to-match>" value above is the match, one or several vectors long, corresponding to the packet contents with the desired values.
WARNING: if "skip" is nonzero in the table configuration, the match is still the entire bitstring, without skipping any leading bytes!
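A short Python sketch of how the match bytes for the two IPv4 sessions could be assembled, and of what the warning means for their length. The offsets assume an untagged Ethernet frame and an IPv4 header without options; build_match is a hypothetical helper, and the skip/match values at the end describe a hypothetical table variant, not the tables created above.

```python
VECTOR_SIZE = 16    # bytes per classifier vector
PROTO_OFFSET = 23   # Ethernet (14) + offset of the IPv4 protocol field (9)
DPORT_OFFSET = 36   # Ethernet (14) + IPv4 without options (20) + 2


def build_match(n_vectors, values):
    """values: byte offset from the start of the packet -> bytes to match."""
    match = bytearray(n_vectors * VECTOR_SIZE)
    for offset, data in values.items():
        match[offset:offset + len(data)] = data
    return bytes(match)


# ICMP is IP protocol 1; TCP is protocol 6, destination port 22.
icmp_match = build_match(2, {PROTO_OFFSET: bytes([1])})
ssh_match = build_match(
    3, {PROTO_OFFSET: bytes([6]), DPORT_OFFSET: (22).to_bytes(2, "big")}
)

# Hypothetical variant of the TCP table that skips the first (all-zero)
# vector: per the warning, the match still starts at packet offset 0 and
# therefore still spans skip + match vectors.
skip_n_vectors, match_n_vectors = 1, 2
assert len(ssh_match) == (skip_n_vectors + match_n_vectors) * VECTOR_SIZE

print(icmp_match.hex())
print(ssh_match.hex())
```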
Then one would apply IPv4PROTO_TCPDPORT and IPv6PROTO_TCPDPORT as the L2 output classify tables.
The CLI for that is: set interface l2 output classify intfc <name> ip[46]-table <tableid>
The API call for this is:
classify_set_interface_l2_tables(sw_if_index=<INTFC>, ip4_table_index=<IPv4PROTO_TCPDPORT>, ip6_table_index=<IPv6PROTO_TCPDPORT>, other_table_index=-1, is_input=0)
This creates a unidirectional policy, which is sufficient if the policy in the other direction is "permit all". If it is not, mirror table entries need to be created using the same logic.
The full script showing this process in detail using the Python API is at http://stdio.be/vpp/t/classifier_script_simple_policy.txt
The Java API is located in $ROOT/vpp-api/java.