
An excursion in VPP

In my last blog post I wrote about Snabb, a blazingly fast userspace networking framework. This post stands alone, but I see networking from a Snabb mindset, so there will be some comparisons between it and VPP, another userspace networking framework and the subject of this post. I recommend folks read the Snabb post first.

The What and Why of VPP

VPP is another framework that helps you write userspace networking applications using a kernel bypass, similar to Snabb. It came out of Cisco and was open-sourced a few years ago. Since then it’s made a name for itself, becoming quite popular for writing fast networking applications. As part of my work at Igalia I spent a few months with my colleague Asumu investigating it and learning how to write plugins for it. We wanted to see how it compares to Snabb, both so that we can learn from it in the Snabb world and so that we can develop with it as well.

The outlook of VPP is quite different from Snabb. The first thing is that it’s all in C, quite a difference when you’ve spent the last two years writing Lua. C is a pretty standard language for these sorts of things, but it’s a lot less flexible than more modern languages (such as Lua), and I sometimes question whether its current popularity is still deserved. When you start VPP for the first time it configures itself as a router. Any additional functionality you want to provide is done via plugins that you write and compile out of tree, and then load into VPP. The plugins then hook themselves into a graph of nodes, usually somewhere in the IP stack. Another difference, and in my opinion one of the most compelling arguments for VPP, is that you can use DPDK, a layer written by Intel (one of the larger network card makers) which comes with a lot of drivers. DPDK adds a node at the start of the graph called dpdk-input which feeds the rest of the graph with packets. As you might imagine with a full router, it also has its own routing table populated by performing ARP or NDP requests (to get the address of hops ahead). It also provides ICMP facilities to, for example, respond to ping requests.

The first thing to do is install it from the packages they provide. Once you’ve done that you can start it with your init system or directly with the vpp command. You then access it via the command line VPP provides. This is already somewhat of a divergence from Snabb, where you compile the tree with the programs and apps you want to use present, and then run the main Snabb binary from the command line directly. The VPP shell is actually rather neat: it lets you query and configure most aspects of VPP. To give you a taste of it, this is configuring the network card and enabling the IPFIX plugin which comes with VPP:

vpp# set interface ip address TenGigabitEthernet2/0/0
vpp# set interface state TenGigabitEthernet2/0/0 up
vpp# flowprobe params record l3 active 120 passive 300
vpp# flowprobe feature add-del TenGigabitEthernet2/0/0 ip4
vpp# show int
              Name               Idx       State          Counter          Count
TenGigabitEthernet2/0/0           1         up
local0                            0        down      drops                          1

Let’s get cracking

I’m not intending this to be a tutorial, more for it to give you a taste of what it’s like working with VPP. Despite this, I hope you do find it useful if you’re hoping to hack on VPP yourself. Apologies if the blog seems a bit code heavy.

Whilst above I told you how the outlook was different, I want to reiterate. Snabb, for me, is like Lego. It has a bunch of building blocks (apps: ARP, NDP, firewalls, ICMP responders, etc.) that you put together in order to make exactly what you want, out of the bits you want. You only have to use the blocks you decide you want. To use some code from my previous blog post:

module(..., package.seeall)

local pcap = require("apps.pcap.pcap")
local Intel82599 = require("apps.intel_mp.intel_mp").Intel82599

function run(parameters)
   -- Lua arrays are indexed from 1 not 0.
   local pci_address = parameters[1]
   local output_pcap = parameters[2]

   -- Configure the "apps" we're going to use in the graph.
   -- The "config" object is injected in when we're loaded by snabb.
   local c = config.new()
   config.app(c, "nic", Intel82599, {pciaddr = pci_address})
   config.app(c, "pcap", pcap.PcapWriter, output_pcap)

   -- Link up the apps into a graph.
   config.link(c, "nic.output -> pcap.input")
   -- Tell the snabb engine our configuration we've just made.
   engine.configure(c)
   -- Lets start the apps for 10 seconds!
   engine.main({duration=10, report = {showlinks=true}})
end

This doesn’t have an IP stack, it grabs a bunch of packets from the Intel network card and dumps them into this Pcap (Packet capture) file. If you want more functionality like ARP or ICMP responding, you add those apps. I think both are valid approaches, though personally I have a strong preference for the way Snabb works. If you do need a fully setup router then of course, it’s much easier with VPP, you just start it and you’re ready to go. But a lot of the time, I don’t.

Getting to know VPP is, I think, one of its biggest drawbacks. There are the parts I’ve mentioned above about it being a complex router, which can complicate things, and it being in C, which feels terse to someone who’s not worked much in it. There are also those parts where I feel that the information needed to get off the ground is a bit hard to come by, something many projects struggle with, VPP being no exception. There is some information spread between the wiki, documentation, YouTube channel and dev mailing list. Those are useful to look at and go through.

To get started writing your plugin there are a few parts you’ll need:

  • A graph node which takes in packets, does something and passes them on to the next node.
  • An interface to your plugin on the VPP command line so you can enable and disable it.
  • Functionality to trace packets which come through (more on this later).

We started by using the sample plugin as a base. I think it works well, it has all the above and it’s pretty basic so it wasn’t too difficult to get up to speed with. Let’s look at some of the landmarks of the VPP plugin developer landscape:

The node

VLIB_REGISTER_NODE (sample_node) = {
  .function = sample_node_fn,
  .name = "sample",
  .vector_size = sizeof (u32),
  .format_trace = format_sample_trace,
  .n_errors = ARRAY_LEN(sample_error_strings),
  .error_strings = sample_error_strings,

  .n_next_nodes = SAMPLE_N_NEXT,

  /* edit / add dispositions here */
  .next_nodes = {
        [SAMPLE_NEXT_INTERFACE_OUTPUT] = "interface-output",
  },
};

This macro registers your node and provides some metadata for it such as the function name, the trace function (again, more on this later), the next node and some error stuff. The other thing with the node is obviously the node function itself. Here is a shortened version of it with lots of comments to explain what’s going on:

always_inline uword
sample_node_fn (vlib_main_t * vm,
                vlib_node_runtime_t * node,
                vlib_frame_t * frame,
                u8 is_ipv6)
{
  u32 n_left_from, * from, * to_next;
  sample_next_t next_index;
  /* Grab a pointer to the first vector in the frame */
  from = vlib_frame_vector_args (frame);
  n_left_from = frame->n_vectors;
  next_index = node->cached_next_index;
  /* Iterate over the vectors */
  while (n_left_from > 0)
    {
      u32 n_left_to_next;
      vlib_get_next_frame (vm, node, next_index,
                           to_next, n_left_to_next);
      /* There are usually two loops which look almost identical: one takes
       * two packets at a time (for additional speed) and the other loop does
       * the *exact* same thing just for a single packet. There are also
       * apparently some plugins which define loops for 4 packets at once too.
       * The advice given is to write (or read) the single packet loop first
       * and then worry about the multiple packet loops later. I've
       * removed the body of the 2 packet loop to shorten the code, it just
       * does what the single one does, you're not missing much.
       */
      while (n_left_from >= 4 && n_left_to_next >= 2)
        {
          /* ... body removed: handles two packets per iteration ... */
        }
      while (n_left_from > 0 && n_left_to_next > 0)
        {
          u32 bi0;
          vlib_buffer_t * b0;
          u32 next0 = SAMPLE_NEXT_INTERFACE_OUTPUT;
          u32 sw_if_index0;
          u8 tmp0[6];
          ethernet_header_t *en0;

          /* speculatively enqueue b0 to the current next frame */
          bi0 = from[0];
          to_next[0] = bi0;
          from += 1;
          to_next += 1;
          n_left_from -= 1;
          n_left_to_next -= 1;
          /* Get the reference to the buffer */
          b0 = vlib_get_buffer (vm, bi0);
          en0 = vlib_buffer_get_current (b0); /* This is the ethernet header */
          /* This is where you do whatever you'd like to with your packet */
          /* ... */
          /* Get the software index for the hardware */
          sw_if_index0 = vnet_buffer(b0)->sw_if_index[VLIB_RX];
          /* Send pkt back out the RX interface */
          vnet_buffer(b0)->sw_if_index[VLIB_TX] = sw_if_index0;
          /* Do we want to trace (used for debugging) */
          if (PREDICT_FALSE((node->flags & VLIB_NODE_FLAG_TRACE)
                         && (b0->flags & VLIB_BUFFER_IS_TRACED)))
            {
              sample_trace_t *t =
                vlib_add_trace (vm, node, b0, sizeof (*t));
            }
          /* verify speculative enqueue, maybe switch current next frame */
          vlib_validate_buffer_enqueue_x1 (vm, node, next_index,
                                           to_next, n_left_to_next,
                                           bi0, next0);
        }
      vlib_put_next_frame (vm, node, next_index, n_left_to_next);
    }
  return frame->n_vectors;
}

I’ve tried to pare down the code to the important things as much as I can. I personally am not a huge fan of the code duplication which occurs due to the multiple loops for different amounts of packets in the vector, I think it makes the code a bit messy. It definitely goes against DRY (Don’t Repeat Yourself). I’ve not seen any statistics on the improvements nor had time to look into it myself yet, but I’ll take it as true that it’s worth it and go with it :-). The code definitely has more boilerplate than Lua. I think that’s the nature of C unfortunately.
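To make the dual/single loop shape concrete outside of VPP, here’s a minimal standalone sketch in plain C. It assumes the per-packet work can be pulled out into one inline helper, which is one way of keeping the duplication down. Everything prefixed toy_ is a name I’ve made up for illustration and is not part of VPP, and the "packet" is just a bare Ethernet header whose MAC addresses get swapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an Ethernet header (VPP has its own type). */
typedef struct
{
  uint8_t dst_address[6];
  uint8_t src_address[6];
  uint16_t type;
} toy_ethernet_header_t;

/* The per-packet work, written once: swap source and destination MACs. */
static inline void
toy_macswap (toy_ethernet_header_t *en)
{
  uint8_t tmp[6];
  memcpy (tmp, en->src_address, sizeof (tmp));
  memcpy (en->src_address, en->dst_address, sizeof (tmp));
  memcpy (en->dst_address, tmp, sizeof (tmp));
}

/* Process n headers in place.
 * Returns how many of them were handled by the dual loop. */
size_t
toy_process_burst (toy_ethernet_header_t *pkts, size_t n)
{
  size_t n_left = n, n_dual = 0;

  assert (pkts != NULL || n == 0);

  /* Dual loop: two packets per iteration. */
  while (n_left >= 2)
    {
      toy_macswap (&pkts[0]);
      toy_macswap (&pkts[1]);
      pkts += 2;
      n_left -= 2;
      n_dual += 2;
    }

  /* Single loop: mops up whatever is left (at most one packet here). */
  while (n_left > 0)
    {
      toy_macswap (&pkts[0]);
      pkts += 1;
      n_left -= 1;
    }

  return n_dual;
}
```

With three packets, the first two go through the dual loop and the last one falls through to the single loop. Whether splitting the loops like this actually buys any speed depends on the compiler and the hardware, so treat it as a picture of the structure rather than a benchmark.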

Finally you need to hook the node into the graph. This is done with yet another macro: you choose the node and the arc in the graph and it’ll put the node into the graph for you:

VNET_FEATURE_INIT (sample_node, static) = {
  .arc_name = "ip4-unicast",
  .node_name = "sample",
  .runs_before = VNET_FEATURES ("ip4-lookup"),
};

In this case we’re having it run before ip4-lookup.
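To picture what a "runs before" constraint is asking for, here’s a minimal standalone sketch in plain C. It is not how VPP implements feature arcs: VPP works out the final order from all the registered constraints itself, and runs_before only promises "somewhere before", not "immediately before". The toy_arc_t type, the toy_arc_insert_before function and the "some-earlier-feature" entry in the usage note are all hypothetical names for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_ARC_MAX 8

/* Hypothetical model of an arc: an ordered list of node names. */
typedef struct
{
  const char *nodes[TOY_ARC_MAX];
  size_t n;
} toy_arc_t;

/* Insert `name` directly in front of `before`.
 * Returns 0 on success, -1 if the arc is full or `before` isn't on it. */
int
toy_arc_insert_before (toy_arc_t *arc, const char *name, const char *before)
{
  size_t i, pos;

  assert (arc != NULL && name != NULL && before != NULL);

  if (arc->n >= TOY_ARC_MAX)
    return -1;

  /* Find the node we have to run before. */
  for (pos = 0; pos < arc->n; pos++)
    if (strcmp (arc->nodes[pos], before) == 0)
      break;
  if (pos == arc->n)
    return -1;

  /* Shift everything from that position on down by one slot. */
  for (i = arc->n; i > pos; i--)
    arc->nodes[i] = arc->nodes[i - 1];

  arc->nodes[pos] = name;
  arc->n += 1;
  return 0;
}
```

Starting from an arc holding "some-earlier-feature" followed by "ip4-lookup", inserting "sample" before "ip4-lookup" leaves "sample" in the middle, so packets on that arc would reach it before the lookup.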

Tracing

Tracing is really useful when debugging, both as an operator and as a developer. You can enable a trace on a specific node for a certain number of packets. The trace shows you which nodes the packets go through, and usually each node provides extra information in the trace. Here’s an example of such a trace:

Packet 1

00:01:23:899371: dpdk-input
  TenGigabitEthernet2/0/0 rx queue 0
  buffer 0x1d3be2: current data 14, length 46, free-list 0, clone-count 0, totlen-nifb 0, trace 0x0
                   l4-cksum-computed l4-cksum-correct l2-hdr-offset 0 l3-hdr-offset 14 
  PKT MBUF: port 0, nb_segs 1, pkt_len 60
    buf_len 2176, data_len 60, ol_flags 0x88, data_off 128, phys_addr 0x5b8ef900
    packet_type 0x211 l2_len 0 l3_len 0 outer_l2_len 0 outer_l3_len 0
    Packet Offload Flags
      PKT_RX_L4_CKSUM_BAD (0x0008) L4 cksum of RX pkt. is not OK
      PKT_RX_IP_CKSUM_GOOD (0x0080) IP cksum of RX pkt. is valid
    Packet Types
      RTE_PTYPE_L2_ETHER (0x0001) Ethernet packet
      RTE_PTYPE_L3_IPV4 (0x0010) IPv4 packet without extension headers
      RTE_PTYPE_L4_UDP (0x0200) UDP packet
  IP4: 02:02:02:02:02:02 -> 90:e2:ba:a9:84:1c
  UDP: ->
    tos 0x00, ttl 15, length 46, checksum 0xd8fa
    fragment id 0x0000
  UDP: 12345 -> 6144
    length 26, checksum 0x0000
00:01:23:899404: ip4-input-no-checksum
  UDP: ->
    tos 0x00, ttl 15, length 46, checksum 0xd8fa
    fragment id 0x0000
  UDP: 12345 -> 6144
    length 26, checksum 0x0000
00:01:23:899430: ip4-lookup
  fib 0 dpo-idx 13 flow hash: 0x00000000
  UDP: ->
    tos 0x00, ttl 15, length 46, checksum 0xd8fa
    fragment id 0x0000
  UDP: 12345 -> 6144
    length 26, checksum 0x0000
00:01:23:899434: ip4-load-balance
  fib 0 dpo-idx 13 flow hash: 0x00000000
  UDP: ->
    tos 0x00, ttl 15, length 46, checksum 0xd8fa
    fragment id 0x0000
  UDP: 12345 -> 6144
    length 26, checksum 0x0000
00:01:23:899437: ip4-arp
    UDP: ->
      tos 0x00, ttl 15, length 46, checksum 0xd8fa
      fragment id 0x0000
    UDP: 12345 -> 6144
      length 26, checksum 0x0000
00:01:23:899441: error-drop
  ip4-arp: address overflow drops

The format is a timestamp with the node name, followed by some extra information the plugin chooses to display. It lets me easily see that this packet came in through dpdk-input and went through a few ip4 nodes until ip4-arp (which presumably sent out an ARP packet), and then it gets black-holed (because VPP doesn’t know where to send it). This is invaluable information when you want to see what’s going on, and I can only imagine it’s great for operators too when debugging their own setup / config.

VPP has a useful pair of functions called format and unformat. They work a bit like printf and scanf, however they can also take any struct or datatype with a format function and display (or parse) it. This means that to display, for example, an ip4 address it’s just a matter of calling format with the provided format_ip4_address. There are a whole slew of them which come with VPP and it’s trivial to write your own for your own data structures. The other thing to note when writing these tracing functions is that you need to remember to provide the data in your node to the trace function. It’s formatted after the fact so as not to hinder performance. The struct we’re going to give to the trace looks like this:

typedef struct {
  u32 next_index;
  u32 sw_if_index;
  u8 new_src_mac[6];
  u8 new_dst_mac[6];
} sample_trace_t;

The call to actually trace is:

sample_trace_t *t = vlib_add_trace (vm, node, b0, sizeof (*t));
/* These variables are defined in the node block posted above */
t->sw_if_index = sw_if_index0;
t->next_index = next0;
clib_memcpy (t->new_src_mac, en0->src_address,
             sizeof (t->new_src_mac));
clib_memcpy (t->new_dst_mac, en0->dst_address,
             sizeof (t->new_dst_mac));

Finally the thing we’ve been waiting for, the trace itself:

/* VPP actually comes with a format_mac_address, this is here to show you
 * what a format function looks like
 */
static u8 *
format_mac_address (u8 * s, va_list * args)
{
  u8 *a = va_arg (*args, u8 *);
  return format (s, "%02x:%02x:%02x:%02x:%02x:%02x",
         a[0], a[1], a[2], a[3], a[4], a[5]);
}

static u8 * format_sample_trace (u8 * s, va_list * args)
{
  CLIB_UNUSED (vlib_main_t * vm) = va_arg (*args, vlib_main_t *);
  CLIB_UNUSED (vlib_node_t * node) = va_arg (*args, vlib_node_t *);
  sample_trace_t * t = va_arg (*args, sample_trace_t *);
  s = format (s, "SAMPLE: sw_if_index %d, next index %d\n",
              t->sw_if_index, t->next_index);
  s = format (s, "  new src %U -> new dst %U",
              format_mac_address, t->new_src_mac,
              format_mac_address, t->new_dst_mac);

  return s;
}

Not bad, I don’t think. Whilst Snabb does make it very easy and has a lot of the same offerings to format and parse things, having for example ipv4:ntop, I appreciate how easy VPP makes doing things like this in C. Format and unformat are quite nice to work with. I love the tracing! Snabb doesn’t have a comparable tracing mechanism; it displays a link report but nothing like the trace.
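If you’re wondering how a %U-style directive can work at all, here’s a minimal standalone sketch in plain C. It is only my guess at the general idea and not VPP’s implementation: VPP’s format grows a vector and supports far more directives, whereas this toy writes into a fixed buffer and only understands %d and %U. The names toy_format, toy_format_fn and toy_format_mac are made up for illustration. The trick is that the formatter pulls a function pointer out of the argument list and then hands that function the same argument list, so it can pull out its own arguments.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* A user format function: consumes its own arguments from *args, writes into
 * out (at most cap bytes) and returns the snprintf-style length. */
typedef int (*toy_format_fn) (char *out, size_t cap, va_list *args);

/* Advance the write offset by `written`, never running past the buffer. */
static size_t
toy_advance (size_t n, size_t cap, int written)
{
  if (written < 0)
    return n;
  n += (size_t) written;
  return n < cap ? n : cap - 1;
}

/* Tiny formatter: understands %d and %U, copies everything else verbatim.
 * Returns the length of the resulting string. */
int
toy_format (char *out, size_t cap, const char *fmt, ...)
{
  va_list ap;
  size_t n = 0;
  const char *p;

  assert (out != NULL && fmt != NULL && cap > 0);

  va_start (ap, fmt);
  for (p = fmt; *p != '\0' && n + 1 < cap; p++)
    {
      if (p[0] == '%' && p[1] == 'U')
        {
          /* Next argument is the format function; it takes what it needs. */
          toy_format_fn fn = va_arg (ap, toy_format_fn);
          n = toy_advance (n, cap, fn (out + n, cap - n, &ap));
          p++;
        }
      else if (p[0] == '%' && p[1] == 'd')
        {
          n = toy_advance (n, cap,
                           snprintf (out + n, cap - n, "%d", va_arg (ap, int)));
          p++;
        }
      else
        out[n++] = *p;
    }
  va_end (ap);
  out[n] = '\0';
  return (int) n;
}

/* Example user format function: expects a pointer to 6 bytes. */
int
toy_format_mac (char *out, size_t cap, va_list *args)
{
  const uint8_t *a = va_arg (*args, const uint8_t *);
  return snprintf (out, cap, "%02x:%02x:%02x:%02x:%02x:%02x",
                   a[0], a[1], a[2], a[3], a[4], a[5]);
}
```

Calling toy_format (buf, sizeof (buf), "if %d src %U", 1, toy_format_mac, mac) with a large enough buffer mirrors the shape of the format call in format_sample_trace above: the function comes first, then the data it should consume.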

CLI Interface

We have our node, we can trace packets going through it and display some useful info specific to the plugin. We now want our plugin to work. To do that we need to tell VPP that we’re open for business and for it to send packets our way. Snabb doesn’t have a CLI similar to VPP’s, partially because it works differently and there is less need for it. However I think the CLI comes into its own when you want to display lots of information and easily configure things on the fly. VPP also has a command you can use from the shell to execute commands, vppctl, allowing for scripting.

In VPP you use a macro to define a command, similar to the node definition above. This example provides the command name “sample macswap” in our case, a bit of help and the function itself. Here’s what it could look like:

static clib_error_t *
macswap_enable_disable_command_fn (vlib_main_t * vm,
                                   unformat_input_t * input,
                                   vlib_cli_command_t * cmd)
{
  sample_main_t * sm = &sample_main;
  u32 sw_if_index = ~0;
  int enable_disable = 1;

  /* Parse the command */
  while (unformat_check_input (input) != UNFORMAT_END_OF_INPUT) {
    if (unformat (input, "disable"))
      enable_disable = 0;
    else if (unformat (input, "%U", unformat_vnet_sw_interface,
                       sm->vnet_main, &sw_if_index))
      ;
    else
      break; /* Stop on anything we don't recognise */
  }

  /* Display an error if we weren't provided with the interface name */
  if (sw_if_index == ~0)
    return clib_error_return (0, "Please specify an interface...");
  /* This is what you call out to, in order to enable or disable the plugin.
   * The arc name has to match the one given to VNET_FEATURE_INIT. */
  vnet_feature_enable_disable ("ip4-unicast", "sample",
                               sw_if_index, enable_disable, 0, 0);
  return 0;
}

VLIB_CLI_COMMAND (sr_content_command, static) = {
    .path = "sample macswap",
    .short_help =
    "sample macswap <interface-name> [disable]",
    .function = macswap_enable_disable_command_fn,
};

It’s quite simple. You see if the user is enabling or disabling the plugin (by checking for the presence of “disable” in their command), grab the interface name and then tell VPP to enable or disable your plugin on said interface.
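The parsing logic is easy to play with outside of VPP too. Here’s a minimal standalone sketch in plain C, using sscanf in place of unformat and a plain string in place of a real interface lookup. The toy_macswap_args_t struct and toy_parse_macswap function are hypothetical names for illustration, and unlike the real command this toy keeps the interface name as text rather than resolving it to a sw_if_index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result of parsing "<interface-name> [disable]". */
typedef struct
{
  char ifname[64];
  bool enable;
} toy_macswap_args_t;

/* Parse the arguments that follow "sample macswap".
 * Returns 0 on success, -1 if no interface name was given. */
int
toy_parse_macswap (const char *input, toy_macswap_args_t *out)
{
  char word[64];
  int used = 0;

  assert (input != NULL && out != NULL);

  out->ifname[0] = '\0';
  out->enable = true;

  /* Pull off one whitespace-separated word at a time. */
  while (sscanf (input, " %63s%n", word, &used) == 1)
    {
      if (strcmp (word, "disable") == 0)
        out->enable = false;
      else
        strcpy (out->ifname, word); /* word is at most 63 chars + NUL */
      input += used;
    }

  return out->ifname[0] != '\0' ? 0 : -1;
}
```

Feeding it "TenGigabitEthernet2/0/0 disable" gives back the interface name with enable set to false, and feeding it just "disable" fails, which is the same "Please specify an interface..." case as in the VPP function above.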

Conclusion

VPP is powerful and fast. It shares a lot of similarities with Snabb, but the two take very different approaches. I feel like the big differences are largely personal preferences: do you prefer C or Lua? Do you need an IP router? Are your network cards supported by Snabb, or do you need to build drivers for them or use DPDK?

Overall, during my excursion in VPP, I’ve enjoyed seeing the other side, as it were. I do, however, think I prefer working with Snabb. I find it faster to get up to speed with because things are simpler for simple projects. Lua also lets me have the power of C through its FFI, without subjecting me to the shortcomings of C. I look forward, however, to hacking on both and maybe even seeing some of the ideas from the VPP world come into the Snabb world, and the other way round too!

Published in Igalia, Networking