PoC: compiling to eBPF from Rust

I have been playing with eBPF (extended Berkeley Packet Filter), a neat feature present in recent Linux versions that evolved from the much older BPF packet filters. It is a virtual machine running in the kernel: you send it code from userland, and that code can filter packets or trace parts of the kernel.

What makes eBPF really nice is how the kernel handles it. You send a program in bytecode format to the kernel, which first checks it, verifying, for example, that there are no loops, which guarantees that the program will terminate. It then applies JIT compilation, making the resulting code quite fast. Even better, that code can be loaded and unloaded at any time through a syscall, and you can set up shared data structures between the eBPF program and your own to gather data efficiently.
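To make the termination argument concrete, here is a toy check in Rust. It is only a sketch under strong assumptions: the `Op` enum and `forward_only` function are made up for this post, and the rule (reject every backward jump) is stricter and far simpler than what the kernel verifier really does. The idea it illustrates is the same, though: if the program counter can only move forward, the program has to reach its end.

```rust
/// A heavily simplified instruction: only what matters for control flow.
/// (Hypothetical type for illustration, not a kernel structure.)
#[derive(Clone, Copy, Debug)]
pub enum Op {
    /// Any non-branching instruction; execution falls through to the next one.
    Other,
    /// A jump, conditional or not, to `pc + 1 + off`.
    Jump(i16),
    /// Leave the program.
    Exit,
}

/// Returns true if every jump goes strictly forward and stays in bounds,
/// and the program ends with `Exit`. Under those conditions the program
/// counter increases at every step, so execution must terminate.
pub fn forward_only(prog: &[Op]) -> bool {
    for (pc, op) in prog.iter().enumerate() {
        if let Op::Jump(off) = *op {
            let target = pc as i64 + 1 + off as i64;
            // A negative offset targets this instruction or an earlier one.
            if off < 0 || target >= prog.len() as i64 {
                return false;
            }
        }
    }
    // Falling off the end of the program is not allowed either.
    matches!(prog.last(), Some(Op::Exit))
}
```

With this check, `[Jump(1), Other, Exit]` is accepted, while `[Other, Jump(-2), Exit]` is rejected because its jump goes back to the first instruction.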

As an example, you can use eBPF (and the XDP - eXpress Data Path - feature) to write very efficient firewalls, or employ BCC (BPF Compiler Collection) to trace a process’s IO events.

I’m looking at how we could use that to trace applications on our infrastructure at Clever Cloud. There are a few things we should know about the tooling first.

At the beginning, people wrote their program using the bytecode directly:

/* Compare IPv4 with one word instruction (32bit)*/
struct bpf_insn insn[] = {
  /* If skb->protocol != ETH_P_IP, skip this whole block. The offset will be set later. */
  BPF_JMP_IMM(BPF_JNE, BPF_REG_7, htobe16(protocol), 0),
  /*
   * Call into BPF_FUNC_skb_load_bytes to load the dst/src IP address
   *
   * R1: Pointer to the skb
   * R2: Data offset
   * R3: Destination buffer on the stack (r10 - 4)
   * R4: Number of bytes to read (4)
   */
  BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
  BPF_MOV32_IMM(BPF_REG_2, addr_offset),
  BPF_MOV64_REG(BPF_REG_3, BPF_REG_10),
  BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, -addr_size),
  BPF_MOV32_IMM(BPF_REG_4, addr_size),
  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_skb_load_bytes),
  /*
   * Call into BPF_FUNC_map_lookup_elem to see if the address matches any entry in the
   * LPM trie map. For this to work, the prefixlen field of 'struct bpf_lpm_trie_key'
   * has to be set to the maximum possible value.
   *
   * On success, the looked up value is stored in R0. For this application, the actual
   * value doesn't matter, however; we just set the bit in @verdict in R8 if we found any
   * matching value.
   */
  BPF_LD_MAP_FD(BPF_REG_1, map_fd),
  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -addr_size - sizeof(uint32_t)),
  BPF_ST_MEM(BPF_W, BPF_REG_2, 0, addr_size * 8),
  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
  BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1),
  BPF_ALU32_IMM(BPF_OR, BPF_REG_8, verdict),
};
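Each of those macros expands to one fixed-size, eight-byte instruction. As a rough sketch of what sits underneath, here is the encoding written out in Rust. The field layout follows the kernel's `struct bpf_insn` and the opcode constants come from the kernel's uapi headers, but the `Insn` type and the `mov64_imm` and `exit_insn` helpers are names I made up for illustration, and the byte order assumes a little-endian host.

```rust
/// One eBPF instruction, mirroring the kernel's `struct bpf_insn` (8 bytes).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Insn {
    pub code: u8,    // opcode: operation | source | instruction class
    pub dst_reg: u8, // destination register (r0 to r10)
    pub src_reg: u8, // source register (r0 to r10)
    pub off: i16,    // signed offset, used by jumps and memory accesses
    pub imm: i32,    // signed immediate operand
}

impl Insn {
    /// Serialize the instruction as a little-endian host would store it.
    pub fn encode(&self) -> [u8; 8] {
        let mut out = [0u8; 8];
        out[0] = self.code;
        // Both registers share one byte: dst in the low nibble, src in the high one.
        out[1] = ((self.src_reg & 0x0f) << 4) | (self.dst_reg & 0x0f);
        out[2..4].copy_from_slice(&self.off.to_le_bytes());
        out[4..8].copy_from_slice(&self.imm.to_le_bytes());
        out
    }
}

// Opcode building blocks, as defined in the kernel's uapi headers.
const BPF_ALU64: u8 = 0x07; // instruction class: 64-bit arithmetic
const BPF_JMP: u8 = 0x05; // instruction class: jumps, calls and exit
const BPF_MOV: u8 = 0xb0; // operation: move
const BPF_EXIT: u8 = 0x90; // operation: return from the program
const BPF_K: u8 = 0x00; // source operand is the immediate

/// `dst = imm` on a 64-bit register (hypothetical helper).
pub fn mov64_imm(dst: u8, imm: i32) -> Insn {
    Insn { code: BPF_ALU64 | BPF_MOV | BPF_K, dst_reg: dst, src_reg: 0, off: 0, imm }
}

/// Return from the program with the value in r0 (hypothetical helper).
pub fn exit_insn() -> Insn {
    Insn { code: BPF_JMP | BPF_EXIT, dst_reg: 0, src_reg: 0, off: 0, imm: 0 }
}
```

The smallest useful program, "return 0", is then `mov64_imm(0, 0)` followed by `exit_insn()`: sixteen bytes in total, starting with the opcodes `0xb7` and `0x95`.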

This is a bit raw, and somewhat complex to write, so people worked on C to eBPF compilers, and the feature landed in LLVM: we can use clang to write eBPF programs! It generates the bytecode, which can then be loaded through the bpf() syscall.

This is still a bit complex, since the eBPF program might need access to some internal data structures of the kernel, and those change depending on kernel versions and configuration options. And we still need to set up the shared data structures with the userland program that will gather data.

That’s why the BCC project provides an easy-to-use interface to compile and load eBPF programs. They made it so simple that you can write a Python script to compile, load and interact with your program:

from bcc import BPF
BPF(text='int kprobe__sys_clone(void *ctx) { bpf_trace_printk("Hello, World!\\n"); return 0; }').trace_print()

They provide a lot of useful examples and a nice tutorial to get started writing eBPF tracers.

Unfortunately, those tools make a tradeoff that’s slightly annoying for me: they require installing BCC, which in turn requires Python, LLVM and the complete Linux sources, on the target machines. It might be possible to precompile the programs, but that does not look like a common use case with BCC.

So, maybe there’s a nice way to precompile those programs, store them as bytecode, then load them with a small agent that does not need LLVM and the kernel sources to work? It turns out it is possible, thanks to the gobpf project, which split its ELF loading code from the BCC part a year ago.
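A precompiled program of this kind is an ordinary ELF object file whose machine type is eBPF. As a small illustration of the first thing such an agent could do, here is a sanity check on the ELF header, sketched in Rust. This is not gobpf's loader or its API: the `ElfCheck` enum and the `check_bpf_object` function are hypothetical names, and a real loader still has to parse sections, create maps and apply relocations. The only facts it relies on are the ELF magic bytes, the byte-order flag at offset 5, the `e_machine` field at offset 18 and the `EM_BPF` value (247).

```rust
/// `e_machine` value assigned to eBPF object files.
const EM_BPF: u16 = 247;

/// Outcome of the header check (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
pub enum ElfCheck {
    /// Too short, wrong magic bytes, or unknown byte-order flag.
    NotElf,
    /// A valid ELF header, but built for another machine type.
    WrongMachine(u16),
    /// An ELF object targeting eBPF.
    Bpf,
}

/// Look at the start of an ELF header and decide whether the file
/// claims to contain eBPF code. Nothing beyond the header is parsed.
pub fn check_bpf_object(bytes: &[u8]) -> ElfCheck {
    // e_ident is 16 bytes, then e_type (2 bytes), then e_machine (2 bytes).
    if bytes.len() < 20 || !bytes.starts_with(b"\x7fELF") {
        return ElfCheck::NotElf;
    }
    let raw = [bytes[18], bytes[19]];
    // Byte 5 of e_ident (EI_DATA) gives the byte order of the file.
    let machine = match bytes[5] {
        1 => u16::from_le_bytes(raw),
        2 => u16::from_be_bytes(raw),
        _ => return ElfCheck::NotElf,
    };
    if machine == EM_BPF {
        ElfCheck::Bpf
    } else {
        ElfCheck::WrongMachine(machine)
    }
}
```

An agent could read the file with `std::fs::read`, pass the bytes to `check_bpf_object`, and refuse anything that does not come back as `ElfCheck::Bpf` before going further.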

And, now, you’ll see where I am going with this. Being one of those annoying Rust developers who want to rewrite everything in their favorite language, I thought “hey, maybe I can Rust that thing too!”

Since it is possible to compile to eBPF bytecode from C, it is also possible to compile LLVM IR (the intermediate representation LLVM generates from the source before compiling it to the target CPU’s assembly) to eBPF. Look for “LLVM IR debugging” in this link for an example. I know I can compile Rust to that LLVM IR, so everything should work out, as long as Rust’s LLVM version is the same as the system’s version.

So I created a small Rust project,