Mirage

Parsers Optional

Friends, I have spoken to you of TCP and of fuzzing. Next I will speak to you of both, but today, I will speak to you of TCP options. If you’re here for the pwnage, sit tight; it’s coming.

What Even Is TCP Anyway

Here’s the lazy way of explaining it: TCP is the abstraction layer that allows you to pretend that network communication works in a logical, orderly, reliable fashion when you’re writing an application. Reading data and having it always be in the order it was sent? TCP. Being able to know whether a connection is open or closed? TCP. Knowing the difference between data coming from two separate processes on the same remote host? TCP. (There are other ways to get these guarantees, but the vast majority of Internet traffic that needs them gets them via TCP.)

On a less abstract level, TCP is a header (one of several!) that your operating system slaps on your network traffic before shipping it over the wire, on the way to its final destination. For damn near all the information on TCP you can shake a stick at, you can consult RFC 793 directly. The header summary, most relevant for our exploration, is reproduced below:

0                   1                   2                   3   
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgment Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |           |U|A|P|R|S|F|                               |
| Offset| Reserved  |R|C|S|S|Y|I|            Window             |
|       |           |G|K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |         Urgent Pointer        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             data                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Everything here is a fixed-length field except for Options, Padding, and data, all of which are optional. Data is up to the application, when it’s present (and is also frequently referred to as payload). When you loaded this web page, TCP packets were sent from my server at somerandomidiot.com to your computer, and the contents of the data field were these very words that you’re reading right now. TCP is data-agnostic; it only cares that your payload arrives intact, not what’s in it.

Options, on the other hand, are very much TCP’s concern.

The Minnesota Goodbye

Looking into some of the results from last week’s fuzzing session, I noticed something interesting:

$ tcpdump -r experimenting_with_pathoc.pcap 'src host 192.168.2.24 and tcp[13] & 1 != 0'
reading from file experimenting_with_pathoc.pcap, link-type EN10MB (Ethernet)
$

Let’s translate that into human.

  • tcpdump -r experimenting_with_pathoc.pcap: use tcpdump to read an existing packet trace named experimenting_with_pathoc.pcap.
  • src host 192.168.2.24: show me only packets that were sent by 192.168.2.24, which is the IP address of a running unikernel that’s serving web pages on port 80.
  • and tcp[13] & 1 != 0: of the packets sent by 192.168.2.24, show me only those where the least significant bit of the 13th byte of the TCP header was not zero. The 13th byte of the TCP header is designated for flags relevant to how the packet should be processed by the TCP state machine, and the least significant bit corresponds to the FIN flag, used to initiate graceful connection closures.

All together, “show me all the packets sent by 192.168.2.24 which initiated a graceful connection closure.” tcpdump helpfully shows us… all zero such packets in the trace.

This isn’t necessarily wrong for a webserver implementing HTTP/1.1, which defaults to persistent connections:

8.1.2 Overall Operation

A significant difference between HTTP/1.1 and earlier versions of HTTP is that persistent connections are the default behavior of any HTTP connection. That is, unless otherwise indicated, the client SHOULD assume that the server will maintain a persistent connection, even after error responses from the server. – RFC 2616

So let’s make something that will try to initiate a connection closure.

Throwing Some Fuzzy Dice

I mentioned a while ago that the Mirage project agreed to have me on board, through the OPW internship project, for the summer. We started on Monday, and I’ve already had a lot of fun!

Officially, my job for the summer is to help shore up the network stack in Mirage, in part by running the current code through its paces, and in part through implementing some new functionality. This first week, I continued some work I started at the end of Hacker School - figuring out how to fuzz some strange (and not-so-strange) corners, and how to wrangle the data I got out of doing so.

Fuzz What Now?

Let’s step back. Way, way, way back.

If you’re a computer program, and you have some data that you care about, your data is likely in some kind of structure reflecting an underlying order to that data. Objects are a common way to organize this stuff; dictionaries, hashmaps, lists, arrays, trees, the list goes on. That’s all well and good when your program is running, keeping all this stuff in memory. But it happens depressingly often that you need to dump this stuff to permanent storage, or express it in some way to some other program or another computer, or represent it on the screen because something awful has happened, and you can’t just say “memory address 0x52413abd, memory address 0x52413cda, memory address 0x52413ea2” - these things are meaningless outside the context of the current run of that program.

So we have serialization, the high-level concept for the jillion different ways to take that data and put it in a string, or a binary data format, so something else can read that string and reassemble the structure of the data. That’s deserialization, which implies parsing; parsing is a pretty big deal.

When the data you’re attempting to assemble into a structure is as you expect it and everything is correct, parsing’s no problem. But it frequently happens that everything is not as you expect it, for any number of reasons - the programmer who made the program that made the message made a mistake; the programmer who made the program that reads the message made a mistake; the programs reading and writing the message are using different versions of the specification in the first place; the specification wasn’t specific about whether the third byte’s range from 0 to 5 was inclusive or exclusive and each programmer made a different decision; both programs agree, but the message was corrupted in transit; the message was corrupted in transit, and one program has implemented a different corruption recovery algorithm than the other… I’ll stop now, but I could keep going for a long time.

There are a lot of bad messages out there. It’s hard to make your parser do the right thing when it receives an arbitrary bad message. It can be hard to even know that your parser does the wrong thing when it receives an arbitrary bad message - if you thought of a certain kind of bad message to use in testing, of course you fixed your code to deal with it; you thought of it! But there are almost certainly loads more bad messages out there than the ones you thought of - both by chance, and by design.

If humans can’t make enough bad messages, maybe computers can. Randomly generating a whole mess of bad messages, sending them to your program, and seeing what happens is called fuzz testing, and it’s awesome.

Verb Your Own Noun

This blog has been running on a Mirage OS unikernel hosted on Amazon EC2 since April 3rd:

$ ec2-get-console-output --region the-best-region i-0123abcd
2014-04-03T16:42:58+0000
Xen Minimal OS!

In that time, I’ve done some stuff:

I figured it was time to tell you about some of it, but first I did some other stuff:

  • upgraded some packages on my build machine
  • broke the build on my blog
  • learned about how Mirage makefiles are generated by trying to get mine working again

You’d rather hear about all of that, right?

Arriving At the Mirage

When last we left our hero, I was strugging valiantly to get a Mirage unikernel version of this blog running on Amazon EC2. All unikernels built and shipped off to EC2 would begin booting, but never become pingable or reachable on TCP port 80. ec2-get-console-output on any instance running a Mirage unikernel would show the beginning stages of a DHCP transaction, then the disappointing RX exn Invalid_argument("String.sub"), then… silence.

When all you had for many years was a hammer, stuff is still going to look an awful lot like nails to you, even if it’s pretty distinctly screw-shaped. I wanted to take a packet trace of this transaction pretty badly. I could do three things that were almost like this:

  • get a packet trace of another machine getting a DHCP lease on EC2
  • get a packet trace of a unikernel getting a DHCP lease on my local Xen server
  • print out an awful lot of diagnostic data from the EC2 unikernel and read it from the console

Trying to draw some conclusions from the first option above led me down the wrong path for about a day or so. I did manage to cause the DHCP client to fail on my local Xen server by sending a DHCP reply packet with no server-identifier set, using scapy and some hackery to cause the xid to always match:

Advancing Toward the Mirage

I left off last time telling you about getting Mirage to not work. I’m still working hard to get this blog – yes, this one you’re reading now – up and running as a unikernel on EC2.

It became clear to me last week that I needed to fork my own instance of the mirage-tcpip repository and compile my kernels with it, if I were to make any progress in debugging the DHCP problems I was having. A few naive attempts to monkey with version of mirage-tcpip downloaded by opam weren’t successful, so I set about to figure out how actual OCaml developers develop in OCaml with opam.

First stop: the opam documentation on doing tricky things. This is a little short of a step-by-step “do this, dorp” guide, unfortunately; here’s what I end up doing, and it sorta seems to work.

It's a mirage! (Or, how to shave a yak.)

A week or so ago, I heard about the Mirage project, a library OS project that makes tiny virtual machines running on top of Xen to run a given application, and do nothing else. I was intrigued, and started working through the excellent intro documentation, and got to the point where I wanted to replace my ho-hum statically-compiled blog hosted from Ubuntu LTS with a unikernel that would serve my static site and do nothing else.

There are excellent instructions on doing this with a Jekyll site on Amir Chaudhry’s blog. Octopress, which I use to generate this site, is built on top of Jekyll, and I only had a few extra goodies to throw in before I was able to make a unikernel that would run my blog with a few rake invocations. After getting the first unikernel up and running via Xen on my laptop, I entertained myself by throwing a few nmap commands at it; I was particularly curious to see whether my unikernel knew what to do with UDP packets:

sudo nmap -sO 192.168.2.13

Starting Nmap 6.40 ( http://nmap.org ) at 2014-03-14 23:26 EDT
Nmap scan report for 192.168.2.13
Host is up (0.00037s latency).
Not shown: 254 open|filtered protocols
PROTOCOL STATE SERVICE
1        open  icmp
6        open  tcp
MAC Address: 00:16:3E:53:E0:1B (Xensource)

Nmap done: 1 IP address (1 host up) scanned in 17.72 seconds

Hee hee hee.