Advancing Toward the Mirage

I left off last time telling you about getting Mirage to not work. I’m still working hard to get this blog – yes, this one you’re reading now – up and running as a unikernel on EC2.

It became clear to me last week that I needed to fork my own instance of the mirage-tcpip repository and compile my kernels with it, if I were to make any progress in debugging the DHCP problems I was having. A few naive attempts to monkey with the version of mirage-tcpip downloaded by opam weren’t successful, so I set about figuring out how actual OCaml developers develop in OCaml with opam.

First stop: the opam documentation on doing tricky things. This is a little short of a step-by-step “do this, dorp” guide, unfortunately; here’s what I ended up doing, and it sorta seems to work.

git clone https://github.com/mirage/mirage-tcpip #get code
opam switch install system-mirage-tcpip --alias-of system
opam install mirage
opam remove tcpip
git clone git://github.com/ocaml/opam-repository.git #probably not necessary?
opam remote add local `pwd`/opam-repository #probably also unneeded?
opam pin tcpip ~/mirage-tcpip

The end result of all of this looks about right:

$ opam info tcpip
             package: tcpip
             version: 1.1.1
              pinned: true
        upstream-url: /home/random_dorp/mirage-tcpip
       upstream-kind: local
             depends: ocamlfind & cstruct >= 1.0.1 & mirage-types >= 1.1.1 & mirage-unix >= 1.1.0 & mirage-console-unix & mirage-clock-unix >= 1.0.0 & mirage-net-unix >= 1.1.0 & ipaddr >= 2.2.0
   installed-version: tcpip.1.1.1 [system]
  available-versions: 1.1.0, pinned
         description: Userlevel TCP/IP stack

Unfortunately, I can’t actually make install in /home/random_dorp/mirage-tcpip. I make clean and make with no problems, but still can’t make install:

$ make install
ocaml setup.ml -install 
ocamlfind: Package tcpip is already installed
 - (file /home/random_dorp/.opam/system/lib/tcpip/META already exists)

Why’s it looking in /home/random_dorp/.opam/system/? It should be looking in system-mirage-tcpip. I double-check that my environment looks right with env:

$ env|grep system
CAML_LD_LIBRARY_PATH=/home/random_dorp/.opam/system-mirage-tcpip/lib/stublibs:/usr/lib/ocaml/stublibs
MANPATH=:/home/random_dorp/.opam/system-mirage-tcpip/man
PERL5LIB=/home/random_dorp/.opam/system-mirage-tcpip/lib/perl5:
OCAML_TOPLEVEL_PATH=/home/random_dorp/.opam/system-mirage-tcpip/lib/toplevel
PATH=/home/random_dorp/.opam/system-mirage-tcpip/bin:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/games:/home/random_dorp/.rvm/bin:/home/random_dorp/.cabal/bin

I don’t have any references to /home/random_dorp/.opam/system/. But looking around in the mirage-tcpip directory reveals something helpful:

$ grep system *
grep: channel: Is a directory
grep: dhcp: Is a directory
grep: lib: Is a directory
grep: packages: Is a directory
README.md:system that supports TCP/IPv4, ARPv4, DHCPv4 and UDPv4.
setup.data:ocamlfind="/home/random_dorp/.opam/system/bin/ocamlfind"
setup.data:system="linux"
setup.data:pkg_io_page="/home/random_dorp/.opam/system/lib/io-page"
setup.data:pkg_mirage_types="/home/random_dorp/.opam/system/lib/mirage-types"
setup.data:pkg_ipaddr="/home/random_dorp/.opam/system/lib/ipaddr"
setup.data:pkg_cstruct="/home/random_dorp/.opam/system/lib/cstruct"
setup.data:pkg_cstruct_syntax="/home/random_dorp/.opam/system/lib/cstruct"
setup.data:pkg_lwt_syntax="/home/random_dorp/.opam/system/lib/lwt"
setup.data:pkg_lwt="/home/random_dorp/.opam/system/lib/lwt"
setup.data:pkg_mirage_net_unix="/home/random_dorp/.opam/system/lib/mirage-net-unix"
setup.data:pkg_lwt_unix="/home/random_dorp/.opam/system/lib/lwt"
setup.data:pkg_ipaddr_unix="/home/random_dorp/.opam/system/lib/ipaddr"
setup.data:pkg_cstruct_lwt="/home/random_dorp/.opam/system/lib/cstruct"
setup.data:pkg_io_page_unix="/home/random_dorp/.opam/system/lib/io-page"
setup.data:pkg_mirage_unix="/home/random_dorp/.opam/system/lib/mirage-unix"
setup.data:pkg_mirage_clock_unix="/home/random_dorp/.opam/system/lib/mirage-clock-unix"
setup.data:pkg_mirage_console_unix="/home/random_dorp/.opam/system/lib/mirage-console-unix"
setup.data:pkg_mirage_types_lwt="/home/random_dorp/.opam/system/lib/mirage-types"
setup.ml:  let system         = c "system"

This setup.data file looks both programmatically generated and outdated. I nuke it from orbit, just to be sure, and re-make clean and make. Sure enough, I get a bunch of reconfiguring. And better yet, make install actually installs! Now to see whether we can build a unikernel. (Remember unikernels?)

$ rake mirage
## Beginning unikernel build ...
MIRAGE      Using the scanned config file: config.ml
MIRAGE      Compiling and dynlinking /home/random_dorp/octopress/_mirage/config.ml
MIRAGE      + Executing: rm -rf /home/random_dorp/octopress/_mirage/_build/config.*
MIRAGE      + Executing: cd /home/random_dorp/octopress/_mirage && ocamlbuild -use-ocamlfind -tags annot,bin_annot -pkg mirage config.cmxs
www         CONFIGURE: /home/random_dorp/octopress/_mirage/config.ml
www         1 job [Dispatch.Main]
www         Installing OPAM packages.
www         + Executing: opam install --yes crunch cstruct lwt mirage-clock-xen mirage-console-xen mirage-http mirage-net-xen mirage-types mirage-xen tcpip
www          The following actions will be performed:
www           - install tcpip.pinned [required by mirage-http]
www           - install mirage-http.1.1.0
www          2 to install | 0 to reinstall | 0 to upgrade | 0 to downgrade | 0 to remove
www          
www          =-=-= Installing tcpip.pinned =-=-=
www          tcpip.pinned Synchronizing with /home/random_dorp/mirage-tcpip
www          Building tcpip.pinned:
www            make
www            make install
www          Removing tcpip.pinned.
www            ocamlfind remove tcpip
www          
www          1.0.0).
www          [NOTE] Package lwt is already installed (current version is 2.4.4).
www          [NOTE] Package cstruct is already installed (current version is 1.1.0).
www          [NOTE] Package crunch is already installed (current version is 1.3.0).
www          [ERROR] The compilation of tcpip.pinned failed.
www          [ERROR] Due to some errors while processing tcpip.pinned, the following actions will NOT proceed:
www           - install mirage-http.1.1.0

Nope. OK, let’s see if we can’t just fix whatever problem is causing our install to be so confused about whether tcpip.pinned is installed or not installed.

$ opam install mirage-http
The following actions will be performed:
 - install tcpip.pinned [required by mirage-http]
 - install mirage-http.1.1.0
2 to install | 0 to reinstall | 0 to upgrade | 0 to downgrade | 0 to remove
Do you want to continue ? [Y/n] y

=-=-= Installing tcpip.pinned =-=-=
tcpip.pinned Synchronizing with /home/random_dorp/mirage-tcpip
Building tcpip.pinned:
  make
  make install
Installing tcpip.pinned.

=-=-= Installing mirage-http.1.1.0 =-=-=
Building mirage-http.1.1.0:
  make
  make install
Installing mirage-http.1.1.0.

…o…kay… trying to build the unikernel no longer appears to try to install tcpip.pinned over itself, so… yay? And it looks like the kernel builds, so… double-yay! Now, to start hacking.

I noticed that the console output in the DHCP discovery process seemed to be dumping some hex directly to the console - namely, the bytes that are supposed to represent the client’s MAC address in the DHCP request. I added a little code to attempt to parse this MAC out and represent it as a string, and try to run it on the EC2 instance.

  let chaddr = match (Macaddr.of_bytes (copy_dhcp_chaddr buf)) with (* try parsing from the wire *)
    | Some mac -> (Macaddr.to_string mac)  (* if it was successful, render as a string *)
    | None -> "00:00:00:00:00:00" (* otherwise, use this placeholder *)

I make a couple of changes to mirage-tcpip, rebuild and reinstall with opam reinstall tcpip, rebuild my unikernel with rake mirage, boot up the kernel with rake unikernel, and discover that yes, my changes are integrated! Hooray!

Sending DHCP broadcast len 552
DHCP: input ciaddr 0.0.0.0 yiaddr 172.31.38.10 siaddr 0.0.0.0 giaddr 0.0.0.0 chaddr 00:00:00:00:00:00 sname  file 

DHCP: offer received: 172.31.38.10

Sending DHCP broadcast len 552
 sg:true gso_tcpv4:true rx_copy:true rx_flip:false smart_poll:false
RX exn Invalid_argument("String.sub")

…but the represented MAC address is 00:00:00:00:00:00, the value I asked it to display if the parse failed. Hm. The structure representing the entire received DHCP packet is a Cstruct, which I discover provides a hexdump function that I can use to dig into what the received packet looks like, so I give that a shot next. The output shows something I could’ve realized from the definition of the DHCP cstruct - the chaddr field is 16 bytes wide, whereas MAC addresses are only 6, which means that what I’m passing to Macaddr.of_bytes is more than twice as wide as it needs to be. I also have a look at the DHCP RFC and sure enough, what’s in there isn’t in any way guaranteed to be a MAC address - it’s only required to be unique within the subnet. In this case, a more proper thing to do with this input is render it with a generic hex-to-string function.
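To make the width mismatch concrete, here's a minimal sketch, using only the standard library, of pulling a 6-byte address off the front of a 16-byte chaddr field. The name mac_of_chaddr is made up for illustration and isn't part of mirage-tcpip or ocaml-ipaddr. It also assumes the hardware address is Ethernet, which, as noted above, isn't guaranteed.

```ocaml
(* Hypothetical helper, not part of mirage-tcpip or ocaml-ipaddr: render the
   first 6 bytes of a chaddr field as a colon-separated MAC address.
   Assumes an Ethernet hardware address; the remaining bytes are ignored. *)
let mac_of_chaddr (chaddr : string) : string option =
  if String.length chaddr < 6 then None
  else
    let byte i = Printf.sprintf "%02x" (Char.code chaddr.[i]) in
    Some (String.concat ":" (List.map byte [0; 1; 2; 3; 4; 5]))

let () =
  (* 6 address bytes followed by 10 bytes of zero padding, 16 in total *)
  let chaddr = "\x00\x16\x3e\x6f\x43\xf4" ^ String.make 10 '\000' in
  match mac_of_chaddr chaddr with
  | Some mac -> print_endline mac
  | None -> print_endline "chaddr too short"
```

Handing all 16 bytes to a parser that expects exactly 6 is the likely reason the None branch fired and the placeholder got printed.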

There’s already some code for this in the MAC address module of ocaml-ipaddr, which mirage-tcpip imports.

let chri x i = Char.code x.[i]

let to_string ?(sep=':') x =
  Printf.sprintf "%02x%c%02x%c%02x%c%02x%c%02x%c%02x"
    (chri x 0) sep
    (chri x 1) sep
    (chri x 2) sep
    (chri x 3) sep
    (chri x 4) sep
    (chri x 5)

This code just pulls each byte out of the string, converts it with Char.code, and prints it in hex into a string. We need a version of this that works with a 16-byte, not just a 6-byte, string. I could just copy out (chri x 6) sep, (chri x 7) sep, etc., but shit, I’m at Hacker School! I should figure out a non-hack solution.

Let’s try what I’d do in Haskell:

  let of_byte x = 
    Printf.sprintf "%02x" (Char.code x) in
  let chaddr_to_string x = 
    (* make a list of bytestrings, then concatenate with : to get printable version *)
    String.concat ":" (List.map of_byte x)

This fails because strings aren’t lists of characters in OCaml. What’s more, this pattern is actively discouraged. I could use the Jane Street version of String, which provides a to_list function, but that module isn’t already opened in mirage-tcpip. Adding it starts to feel like I’m going overboard for a small function, so instead I look for different ways to accomplish this.

There does exist String.map, but it expects a function that maps characters to other characters (the type signature is (char -> char) -> string -> string). A function that’s attempting to render raw hex bytes into the standard human-readable representation of those bytes can’t do this, because each character will be represented by a two-character pair. That’s why we wanted List.map above, which is more general (('a -> 'b) -> 'a list -> 'b list is the type signature), and will allow us to map with any function that takes one argument and returns one value.
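For comparison, here's a rough sketch of what the list-based version would look like if the list of characters is built by hand with the standard library. The helper names chars_of_string and hex_with_colons are invented for this example, and List.init needs OCaml 4.06 or later, so treat this as an illustration of the types involved rather than something that would have dropped into mirage-tcpip at the time.

```ocaml
(* Render one byte as two lowercase hex digits: char -> string,
   which is exactly the shape String.map can't accept. *)
let of_byte (c : char) : string = Printf.sprintf "%02x" (Char.code c)

(* Hypothetical helper: a string is not a char list in OCaml,
   so build the list explicitly. Requires List.init (OCaml >= 4.06). *)
let chars_of_string (s : string) : char list =
  List.init (String.length s) (fun i -> s.[i])

(* Hypothetical helper: List.map happily takes a char -> string function,
   and String.concat joins the pieces with a separator. *)
let hex_with_colons (s : string) : string =
  String.concat ":" (List.map of_byte (chars_of_string s))

let () = print_endline (hex_with_colons "\x00\x16\x3e")
```

This type-checks because List.map only cares that the function's input type matches the list's element type; the output type is free to be string.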

The efficiency argument against this approach isn’t incorrect. Creating a list out of the string is a full traversal through the string - each element must be examined and operated on. Mapping each element of the created list to a human-readable string is another traversal through each item in the list. Concatenating all of the strings has a time cost that might vary depending on how strings are implemented, but it’s not going to be less than the number of strings which need to be concatenated. Moreover, we have to allocate memory for the temporary single-byte strings, which we’re just going to throw away after our concat operation.

What operations do we get for free on the members of a Cstruct? I know we can copy_, get_, and set_ any member of the thing, and we can hexdump_ the whole Cstruct to standard out. Is there a way to hexdump_ the thing to a string, rather than standard out, so we can send it to Console.log_s? There exists hexdump_{structname}_to_buffer, but this is defined only for an entire Cstruct, not for an individual element; it looks like this is a bust, too.

All this laziness is starting to look too much like work! Let’s just do it ourselves:

  let of_byte x = 
    Printf.sprintf "%02x" (Char.code x) in
  let chaddr_to_string x = 
    let dst_buffer = (String.make 32 '\000') (*start blank*) in
    for i = 0 to 15 do (* a very bad idea if we didn't know x was always 16 bytes *)
      let thischar = of_byte x.[i] in
        String.set dst_buffer (i*2) (String.get thischar 0); 
        String.set dst_buffer ((i*2)+1) (String.get thischar 1)
    done;
    dst_buffer
  in  
  let chaddr = chaddr_to_string (get_dhcp_chaddr buf) in 

(I changed the argument to chaddr_to_string from copy_dhcp_chaddr buf to get_dhcp_chaddr buf because chaddr_to_string creates a new string and returns it; there’s no need to operate on a copy of the data, rather than reading the data directly.)
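A note for anyone trying this on a newer compiler: in recent OCaml releases strings are immutable, so the String.set calls above won't compile as written. Here's one possible sketch of the same idea using Buffer from the standard library instead. It treats the chaddr as a plain string and isn't the code that went into mirage-tcpip.

```ocaml
(* Hypothetical rewrite of chaddr_to_string for compilers where strings are
   immutable: accumulate the hex digits in a Buffer rather than writing into
   a preallocated string. Works for input of any length, not just 16 bytes. *)
let chaddr_to_string (x : string) : string =
  let buf = Buffer.create (2 * String.length x) in
  String.iter
    (fun c -> Buffer.add_string buf (Printf.sprintf "%02x" (Char.code c)))
    x;
  Buffer.contents buf

let () =
  (* 6 address bytes followed by 10 bytes of zero padding, 16 in total *)
  let chaddr = "\x00\x16\x3e\x6f\x43\xf4" ^ String.make 10 '\000' in
  print_endline (chaddr_to_string chaddr)
```

This still allocates a tiny string per byte inside Printf.sprintf, so it's no faster than the loop above; what it buys is dropping the hardcoded 16 and the in-place mutation.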

Let’s see it in action:

DHCP: input ciaddr 0.0.0.0 yiaddr 192.168.2.50 siaddr 192.168.2.1 giaddr 0.0.0.0 chaddr 00163e6f43f400000000000000000000 sname  file 

Ah yes, much better.

…but I still don’t have a solution to my actual problem (remember that I had an actual problem?), which is not unique to me. Stay tuned for our next episode, where I use every tool at my disposal and learn a couple of new ones to discover the thrilling solution that will get this unikernel up and running on an EC2 t1.micro instance.

(Spoiler: it’s one line of code.)