Advancing Toward the Mirage
I left off last time telling you about getting Mirage to not work. I’m still working hard to get this blog – yes, this one you’re reading now – up and running as a unikernel on EC2.
It became clear to me last week that I needed to fork my own instance of the mirage-tcpip repository and compile my kernels with it, if I were to make any progress in debugging the DHCP problems I was having. A few naive attempts to monkey with the version of mirage-tcpip downloaded by opam weren’t successful, so I set about figuring out how actual OCaml developers develop in OCaml with opam.
First stop: the opam documentation on doing tricky things. This is a little short of a step-by-step “do this, dorp” guide, unfortunately; here’s what I ended up doing, and it sorta seems to work.
git clone https://github.com/mirage/mirage-tcpip #get code
opam switch install system-mirage-tcpip --alias-of system
opam install mirage
opam remove tcpip
git clone git://github.com/ocaml/opam-repository.git #probably not necessary?
opam remote add local `pwd`/opam-repository #probably also unneeded?
opam pin tcpip ~/mirage-tcpip
The end result of all of this looks about right:
$ opam info tcpip
package: tcpip
version: 1.1.1
pinned: true
upstream-url: /home/random_dorp/mirage-tcpip
upstream-kind: local
depends: ocamlfind & cstruct >= 1.0.1 & mirage-types >= 1.1.1 & mirage-unix >= 1.1.0 & mirage-console-unix & mirage-clock-unix >= 1.0.0 & mirage-net-unix >= 1.1.0 & ipaddr >= 2.2.0
installed-version: tcpip.1.1.1 [system]
available-versions: 1.1.0, pinned
description: Userlevel TCP/IP stack
Unfortunately, I can’t actually make install in /home/random_dorp/mirage-tcpip. I make clean and make with no problems, but still can’t make install:
$ make install
ocaml setup.ml -install
ocamlfind: Package tcpip is already installed
- (file /home/random_dorp/.opam/system/lib/tcpip/META already exists)
Why’s it looking in /home/random_dorp/.opam/system/? It should be looking in system-mirage-tcpip. I double-check that my environment looks right with env:
$ env|grep system
CAML_LD_LIBRARY_PATH=/home/random_dorp/.opam/system-mirage-tcpip/lib/stublibs:/usr/lib/ocaml/stublibs
MANPATH=:/home/random_dorp/.opam/system-mirage-tcpip/man
PERL5LIB=/home/random_dorp/.opam/system-mirage-tcpip/lib/perl5:
OCAML_TOPLEVEL_PATH=/home/random_dorp/.opam/system-mirage-tcpip/lib/toplevel
PATH=/home/random_dorp/.opam/system-mirage-tcpip/bin:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/games:/home/random_dorp/.rvm/bin:/home/random_dorp/.cabal/bin
I don’t have any references to /home/random_dorp/.opam/system/. But looking around in the mirage-tcpip directory reveals something helpful:
$ grep system *
grep: channel: Is a directory
grep: dhcp: Is a directory
grep: lib: Is a directory
grep: packages: Is a directory
README.md:system that supports TCP/IPv4, ARPv4, DHCPv4 and UDPv4.
setup.data:ocamlfind="/home/random_dorp/.opam/system/bin/ocamlfind"
setup.data:system="linux"
setup.data:pkg_io_page="/home/random_dorp/.opam/system/lib/io-page"
setup.data:pkg_mirage_types="/home/random_dorp/.opam/system/lib/mirage-types"
setup.data:pkg_ipaddr="/home/random_dorp/.opam/system/lib/ipaddr"
setup.data:pkg_cstruct="/home/random_dorp/.opam/system/lib/cstruct"
setup.data:pkg_cstruct_syntax="/home/random_dorp/.opam/system/lib/cstruct"
setup.data:pkg_lwt_syntax="/home/random_dorp/.opam/system/lib/lwt"
setup.data:pkg_lwt="/home/random_dorp/.opam/system/lib/lwt"
setup.data:pkg_mirage_net_unix="/home/random_dorp/.opam/system/lib/mirage-net-unix"
setup.data:pkg_lwt_unix="/home/random_dorp/.opam/system/lib/lwt"
setup.data:pkg_ipaddr_unix="/home/random_dorp/.opam/system/lib/ipaddr"
setup.data:pkg_cstruct_lwt="/home/random_dorp/.opam/system/lib/cstruct"
setup.data:pkg_io_page_unix="/home/random_dorp/.opam/system/lib/io-page"
setup.data:pkg_mirage_unix="/home/random_dorp/.opam/system/lib/mirage-unix"
setup.data:pkg_mirage_clock_unix="/home/random_dorp/.opam/system/lib/mirage-clock-unix"
setup.data:pkg_mirage_console_unix="/home/random_dorp/.opam/system/lib/mirage-console-unix"
setup.data:pkg_mirage_types_lwt="/home/random_dorp/.opam/system/lib/mirage-types"
setup.ml: let system = c "system"
This setup.data file looks both programmatically generated and outdated. I nuke it from orbit, just to be sure, and re-make clean and make. Sure enough, I get a bunch of reconfiguring. And better yet, make install actually installs! Now to see whether we can build a unikernel. (Remember unikernels?)
$ rake mirage
## Beginning unikernel build ...
MIRAGE Using the scanned config file: config.ml
MIRAGE Compiling and dynlinking /home/random_dorp/octopress/_mirage/config.ml
MIRAGE + Executing: rm -rf /home/random_dorp/octopress/_mirage/_build/config.*
MIRAGE + Executing: cd /home/random_dorp/octopress/_mirage && ocamlbuild -use-ocamlfind -tags annot,bin_annot -pkg mirage config.cmxs
www CONFIGURE: /home/random_dorp/octopress/_mirage/config.ml
www 1 job [Dispatch.Main]
www Installing OPAM packages.
www + Executing: opam install --yes crunch cstruct lwt mirage-clock-xen mirage-console-xen mirage-http mirage-net-xen mirage-types mirage-xen tcpip
www The following actions will be performed:
www - install tcpip.pinned [required by mirage-http]
www - install mirage-http.1.1.0
www 2 to install | 0 to reinstall | 0 to upgrade | 0 to downgrade | 0 to remove
www
www =-=-= Installing tcpip.pinned =-=-=
www tcpip.pinned Synchronizing with /home/random_dorp/mirage-tcpip
www Building tcpip.pinned:
www make
www make install
www Removing tcpip.pinned.
www ocamlfind remove tcpip
www
www 1.0.0).
www [NOTE] Package lwt is already installed (current version is 2.4.4).
www [NOTE] Package cstruct is already installed (current version is 1.1.0).
www [NOTE] Package crunch is already installed (current version is 1.3.0).
www [ERROR] The compilation of tcpip.pinned failed.
www [ERROR] Due to some errors while processing tcpip.pinned, the following actions will NOT proceed:
www - install mirage-http.1.1.0
Nope. OK, let’s see if we can’t just fix whatever problem is causing our install to be so confused about whether tcpip.pinned is installed or not installed.
$ opam install mirage-http
The following actions will be performed:
- install tcpip.pinned [required by mirage-http]
- install mirage-http.1.1.0
2 to install | 0 to reinstall | 0 to upgrade | 0 to downgrade | 0 to remove
Do you want to continue ? [Y/n] y
=-=-= Installing tcpip.pinned =-=-=
tcpip.pinned Synchronizing with /home/random_dorp/mirage-tcpip
Building tcpip.pinned:
make
make install
Installing tcpip.pinned.
=-=-= Installing mirage-http.1.1.0 =-=-=
Building mirage-http.1.1.0:
make
make install
Installing mirage-http.1.1.0.
…o…kay… trying to build the unikernel no longer appears to try to install tcpip.pinned over itself, so… yay? And it looks like the kernel builds, so… double-yay! Now, to start hacking.
I noticed that the console output in the DHCP discovery process seemed to be dumping some hex directly to the console - namely, the bytes that are supposed to represent the client’s MAC address in the DHCP request. I added a little code to attempt to parse this MAC out and represent it as a string, and tried to run it on the EC2 instance.
let chaddr = match (Macaddr.of_bytes (copy_dhcp_chaddr buf)) with (* try parsing from the wire *)
  | Some mac -> (Macaddr.to_string mac) (* if it was successful, render as a string *)
  | None -> "00:00:00:00:00:00" (* otherwise, use this placeholder *)
I make a couple of changes to mirage-tcpip, rebuild and reinstall with opam reinstall tcpip, rebuild my unikernel with rake mirage, boot up the kernel with rake unikernel, and discover that yes, my changes are integrated! Hooray!
Sending DHCP broadcast len 552
DHCP: input ciaddr 0.0.0.0 yiaddr 172.31.38.10 siaddr 0.0.0.0 giaddr 0.0.0.0 chaddr 00:00:00:00:00:00 sname file
DHCP: offer received: 172.31.38.10
Sending DHCP broadcast len 552
sg:true gso_tcpv4:true rx_copy:true rx_flip:false smart_poll:false
RX exn Invalid_argument("String.sub")
…but the represented MAC address is 00:00:00:00:00:00, the value I asked it to display if the parse failed. Hm. The structure representing the entire received DHCP packet is a Cstruct, which I discover provides a hexdump function that I can use to dig into what the received packet looks like, so I give that a shot next. The output shows something I could’ve realized from the definition of the DHCP cstruct - the chaddr field is 16 bytes wide, whereas MAC addresses are only 6, which means that what I’m passing to Macaddr.of_bytes is more than twice as wide as it needs to be. I also have a look at the DHCP RFC and sure enough, what’s in there isn’t in any way guaranteed to be a MAC address - it’s only required to be unique within the subnet. In this case, a more appropriate thing to do with this input is render it with a generic hex-to-string function.
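To make the size mismatch concrete, here's a minimal sketch. mac_of_bytes is a hypothetical stand-in that accepts exactly 6 bytes; I'm assuming the real Macaddr.of_bytes is similarly strict about length, which would explain the placeholder showing up. The address bytes are just sample data.

```ocaml
(* Hypothetical stand-in for a strict MAC parser: accepts exactly 6 bytes.
   This is an assumption about the behaviour, not ocaml-ipaddr's actual code. *)
let mac_of_bytes (s : string) : string option =
  if String.length s <> 6 then None
  else
    Some
      (Printf.sprintf "%02x:%02x:%02x:%02x:%02x:%02x"
         (Char.code s.[0]) (Char.code s.[1]) (Char.code s.[2])
         (Char.code s.[3]) (Char.code s.[4]) (Char.code s.[5]))

(* A 16-byte chaddr field: 6 sample address bytes plus 10 bytes of zero padding. *)
let chaddr = "\x00\x16\x3e\x6f\x43\xf4" ^ String.make 10 '\000'

(* Same shape as the match in the post: fall back to a placeholder on failure. *)
let render (s : string) : string =
  match mac_of_bytes s with
  | Some mac -> mac
  | None -> "00:00:00:00:00:00"

let () =
  (* The full 16-byte field is rejected, so the placeholder is printed. *)
  print_endline (render chaddr);
  (* Only the first 6 bytes are accepted by the strict parser. *)
  print_endline (render (String.sub chaddr 0 6))
```

Slicing off the first 6 bytes makes the parse succeed, but it bakes in the assumption that the field holds a MAC address at all, which is why a generic hex rendering is the safer choice here.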
There’s already some code for this in the MAC address module of ocaml-ipaddr, which mirage-tcpip imports.
let chri x i = Char.code x.[i]
let to_string ?(sep=':') x =
  Printf.sprintf "%02x%c%02x%c%02x%c%02x%c%02x%c%02x"
    (chri x 0) sep
    (chri x 1) sep
    (chri x 2) sep
    (chri x 3) sep
    (chri x 4) sep
    (chri x 5)
This code just pulls each byte out of the string, converts it with Char.code, and prints it in hex into a string. We need a version of this that works with a 16-byte, not just a 6-byte, string. I could just copy out (chri x 6) sep, (chri x 7) sep, etc., but shit, I’m at Hacker School! I should figure out a non-hack solution.
Let’s try what I’d do in Haskell:
let of_byte x =
Printf.sprintf "%02x" (Char.code x) in
let chaddr_to_string x =
(* make a list of bytestrings, then concatenate with : to get printable version *)
String.concat ":" (List.map of_byte x)
This fails because strings aren’t lists of characters in OCaml. What’s more, this pattern is actively discouraged. I could use the Jane Street version of String, which provides a to_list function, but that module isn’t already opened in mirage-tcpip. Adding it starts to feel like I’m going overboard for a small function, so instead I look for different ways to accomplish this.
There does exist String.map, but it expects a function that maps characters to other characters (the type signature is (char -> char) -> string -> string). A function that renders raw bytes into their standard human-readable hex representation can’t do this, because each byte becomes a two-character pair. That’s why we wanted List.map above, which is more general (('a -> 'b) -> 'a list -> 'b list is the type signature), and will allow us to map with any function that takes one argument and returns one value.
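For the curious, the List.map idea can be made to typecheck with nothing but the standard library, by exploding the string into a char list first. This is only a sketch, and it assumes an OCaml recent enough to have List.init (4.06 or later), which wasn't an option when I was poking at this.

```ocaml
(* Render one byte as two lowercase hex digits. *)
let of_byte (c : char) : string = Printf.sprintf "%02x" (Char.code c)

(* Explode the string into a char list, map each char to its hex pair,
   then join the pairs with ":". *)
let chaddr_to_string (s : string) : string =
  let chars = List.init (String.length s) (String.get s) in
  String.concat ":" (List.map of_byte chars)

let () = print_endline (chaddr_to_string "\x00\x16\x3e")
```

It works, but it builds a temporary list and a temporary string per byte along the way.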
The efficiency argument against this approach isn’t incorrect. Creating a list out of the string is a full traversal through the string - each element must be examined and operated on. Mapping each element of the created list to a human-readable string is another traversal through each item in the list. Concatenating all of the strings has a time cost that might vary depending on how strings are implemented, but it’s not going to be less than the number of strings which need to be concatenated. Moreover, we have to allocate memory for the temporary single-byte strings, which we’re just going to throw away after our concat operation.
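One way to sidestep most of those costs, sketched here with the standard library's Buffer module rather than anything in mirage-tcpip, is to walk the string once and write the hex digits straight into a growable buffer. The function name is just carried over from the attempt above.

```ocaml
(* Single pass over the input: no intermediate list, no per-byte strings. *)
let chaddr_to_string (s : string) : string =
  (* Two hex digits per byte plus a separator; the size is only a hint. *)
  let buf = Buffer.create (3 * String.length s) in
  String.iteri
    (fun i c ->
      if i > 0 then Buffer.add_char buf ':';
      (* bprintf formats directly into the buffer. *)
      Printf.bprintf buf "%02x" (Char.code c))
    s;
  Buffer.contents buf

let () = print_endline (chaddr_to_string "\x00\x16\x3e")
```

The only string allocated for the result is the final one returned by Buffer.contents.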
What operations do we get for free on the members of a Cstruct? I know we can copy_, get_, and set_ any member of the thing, and we can hexdump_ the whole Cstruct to standard out. Is there a way to hexdump_ the thing to a string, rather than standard out, so we can send it to Console.log_s? There exists hexdump_{structname}_to_buffer, but this is defined only for an entire Cstruct, not for an individual element; it looks like this is a bust, too.
All this laziness is starting to look too much like work! Let’s just do it ourselves:
let of_byte x =
  Printf.sprintf "%02x" (Char.code x) in
let chaddr_to_string x =
  let dst_buffer = (String.make 32 '\000') (*start blank*) in
  for i = 0 to 15 do (* a very bad idea if we didn't know x was always 16 bytes *)
    let thischar = of_byte x.[i] in
    String.set dst_buffer (i*2) (String.get thischar 0);
    String.set dst_buffer ((i*2)+1) (String.get thischar 1)
  done;
  dst_buffer
in
let chaddr = chaddr_to_string (get_dhcp_chaddr buf) in
(I changed the argument to chaddr_to_string from copy_dhcp_chaddr buf to get_dhcp_chaddr buf because chaddr_to_string creates a new string and returns it; there’s no need to operate on a copy of the data, rather than reading the data directly.)
Let’s see it in action:
DHCP: input ciaddr 0.0.0.0 yiaddr 192.168.2.50 siaddr 192.168.2.1 giaddr 0.0.0.0 chaddr 00163e6f43f400000000000000000000 sname file
Ah yes, much better.
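A note for anyone trying this on a newer compiler: the loop above mutates a string in place, which recent OCaml releases don't allow because strings are immutable there. Here's a rough port using Bytes for the mutable buffer. It's a sketch rather than what's in mirage-tcpip; it takes the length from the input instead of hard-coding 16, and the sample input is the chaddr from the log line above.

```ocaml
(* Same idea as the loop above, but writing into a mutable Bytes buffer. *)
let chaddr_to_string (x : string) : string =
  let hex = "0123456789abcdef" in
  let len = String.length x in
  let dst = Bytes.create (2 * len) in
  for i = 0 to len - 1 do
    let b = Char.code x.[i] in
    Bytes.set dst (2 * i) hex.[b lsr 4];          (* high nibble *)
    Bytes.set dst (2 * i + 1) hex.[b land 0x0f]   (* low nibble *)
  done;
  Bytes.to_string dst

let () =
  (* 6 address bytes followed by 10 bytes of zero padding, 16 in total. *)
  let chaddr = "\x00\x16\x3e\x6f\x43\xf4" ^ String.make 10 '\000' in
  print_endline (chaddr_to_string chaddr)
```

Looking up each nibble in a digit table also avoids the temporary two-character string per byte.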
…but I still don’t have a solution to my actual problem (remember that I had an actual problem?), which is not unique to me. Stay tuned for our next episode, where I use every tool at my disposal and learn a couple of new ones to discover the thrilling solution that will get this unikernel up and running on an EC2 t1.micro instance.
(Spoiler: it’s one line of code.)