How to Set the Evil Bit

Our mission: fuzzing TCP options from scapy.

Our target: the echo service from mirage-tcpip/examples/services.ml.

Outcome: a revision on a widely-used OCaml dependency, gleeful murder and resurrection of several EC2 instances, something to brag to my mom about, a look at a case worse than failure, and great justice.

Okay, Let’s Get To It

scapy has facilities for programmatically generating TCP options from an option-kind number and some data, but there isn’t a nicely provided function for doing so randomly, so I had to write my own.

Let’s have a look at how our automated fuzzer will generate a random TCP option to send to the hapless unikernel.

optionNumber = RandNum(0, 255)
optionLength = RandNum(0, 60 - 20 - 2)
if(optionLength == 0):
	optionValue = ''
else:
	# make an optionLength-long string of random bytes
	optionValue = RandBin(size=optionLength)._fix()
return(optionNumber, optionValue)

The results of this function - a tuple consisting of a random option kind and some random data - are used as an argument in the connection-initiating packet construction. scapy calculates the option-length automatically when building the packet to send to the destination host.

I wrote some surrounding code to attempt a full TCP conversation with the echo server with random options attached to TCP messages. I left it to run repeatedly against the echo server, running as a host instance in Xen, for a few hours while I worked on a Secret Project. When I came back, I was surprised to see that it had uncovered a most exciting result indeed! Here is what became of my test echo instance:

mm.c: Cannot handle page request order 8!
Fatal error: out of memory.
exit: 2
kernel.c: Do_exit called!
$ sudo xm list # show running virtual machines
Name                       ID   Mem VCPUs      State   Time(s)
Domain-0                    0 15837     8     r-----  44361.6
$ 

In other words, I found some unintended behavior.

A locomotive with its nose off the edge of a cliff.

Unintended behavior

No echo instance running at all! It had been completely crashed by something the fuzzer ran against it. I used some rough analysis tools to figure out which conversations resulted in unfinished conversations, and was soon able to narrow it down to a single packet with an options field that looked like this one:

+-----+-----+-----+-----+
|     |     |     |     |
| 001 | 002 | 000 | 000 |
|     |     |     |     |
+-----+-----+-----+-----+

Get Your Fins, We’re Goin’ Divin’

Armed with the magic four-byte sequence that crashed my unikernel, I was ready to go sourcediving in the Mirage TCP/IP stack.

I happen to know (read: I used find mirage-tcpip -name *.ml|xargs grep options on my local mirage-tcpip repository to discover) that the options-parsing code for TCP sits in mirage-tcpip/tcp/options.ml, in a function called unmarshal.

When looking at unmarshal, I was in the unenviable position of knowing what it must be trying to do, knowing that it must mostly succeed, knowing that it was failing under some circumstance for some reason, and knowing that I was missing some important information I needed in order to understand it. Here are some useful things to know when trying to understand unmarshal, followed by how I went about understanding it:

  • In this context, you can imagine a Cstruct as a byte array with placeholder and a known overall length - in other words, a bunch of data in an ordered list, plus a bookmark pointer. (In other contexts, Cstructs can give you very convenient autogenerated code, but variable-length specifications like this aren’t so lucky.)
  • iter and fold are Cstruct.iter and Cstruct.fold, not the functions included in Pervasives, and they have differing type signatures.
let unmarshal buf = 
  let open Cstruct in
  let i = iter 
    (fun buf ->
      match get_uint8 buf 0 with
      |0 -> None   (* EOF *)
      |1 -> Some 1 (* NOP *)
      |n -> Some (get_uint8 buf 1)
    )
    (fun buf ->
      match get_uint8 buf 0 with
      |0 -> assert false
      |1 -> Noop
      |2 -> MSS (BE.get_uint16 buf 2)
      |3 -> Window_size_shift (get_uint8 buf 2)
      |4 -> SACK_ok
      (* I've elided processing for an option not relevant to the post here *)
      |8 -> Timestamp ((BE.get_uint32 buf 2), (BE.get_uint32 buf 6))
      |n -> Unknown (n, (copy buf 2 (len buf - 2)))
    ) buf in
  fold (fun a b -> b :: a) i []

Let’s have a look at the Cstruct.iter and Cstruct.fold documentation.

type 'a iter = unit -> 'a option
(** Type of an iterator. *)

val iter: (t -> int option) -> (t -> 'a) -> t -> 'a iter                        
(** [iter lenf of_cstr cstr] is an iterator over [cstr] that returns            
    elements of size [lenf cstr] and type [of_cstr cstr]. *)                    
                                                                                
val fold: ('b -> 'a -> 'b) -> 'a iter -> 'b -> 'b                               
(** [fold f iter acc] is [(f iterN accN ... (f iter acc)...)]. *)  

The type 'a iter is something that takes no input (because it advances an internal placeholder) and produces a 'a option.

So iter, in our unmarshal function, takes a Cstruct.t -> int option, a Cstruct.t -> Option, and a Cstruct, and returns an Option iter - in other words, a function that takes no output and produces an Option option.

fold, in turn, takes that function and applies it to (fun a b -> b :: a), a basic list-appending function that also, in this case, is reversing the order of elements in the list. (Because this is combined with the stacklike effects of fold, this has the end result of making the resulting list of options mirror the input in its order.)

After some brow-furrowing, I came to understand the end result of this code to be “use the first function to figure out how many bytes to consider in the second function, then run the second function on that chunk of Cstruct and stick the results on the end of the list.”

Why the Fireworks?

Consider our problem input to this function, a 4-byte buffer of 001, 002, 000, 000. In order to make this explanation clearer, I’m going to name the two functions used in constructing our Option iter above, and reproduce them here:

let option_length_of_cstr buf = 
      match get_uint8 buf 0 with
      |0 -> None   (* EOF *)
      |1 -> Some 1 (* NOP *)
      |n -> Some (get_uint8 buf 1)
let option_of_cstr buf =
      match get_uint8 buf 0 with
      |0 -> assert false
      |1 -> Noop
      |2 -> MSS (BE.get_uint16 buf 2)
      |3 -> Window_size_shift (get_uint8 buf 2)
      |4 -> SACK_ok
      (* I've elided processing for an option not relevant to the post here *)
      |8 -> Timestamp ((BE.get_uint32 buf 2), (BE.get_uint32 buf 6))
      |n -> Unknown (n, (copy buf 2 (len buf - 2)))

When we call option_length_of_cstr on 001, we get Some 1 from the pattern match. option_of_cstr on 001 is similarly unproblematic - it’s a simple match to Noop. The result from fold’s first call to iter will be Noop, so fold will put that in the accumulator (which begins as [], and after the first run, contains [Noop]).

When we call option_length_of_cstr on 002 000 000, we get a different result. 002 matches neither 0 nor 1, so the function reads the next byte to get the length, and the result is Some 0, because 0 is the first byte after 002. (That’s already a problem, because returning Some 0 from the length function will cause the iterator never to advance, and the program to be stuck in an infinite loop while trying to parse the buffer.)

Also suboptimally, when we try to call option_of_cstr on 002 000 000, we’ll try to construct a 16-bit number out of the two bytes starting with our second 000. But there’s not a second byte there that we should be able to see, and if we get a number out of this request, it means we’ve accessed memory that we shouldn’t have been able to. Cstruct.BE.get_uint16 doesn’t stop us from accessing this memory, and constructs an MSS object of some size between 0 and 255, depending on the contents of whatever happens to be in memory after our buffer.

Let’s suppose this byte contains the value 127, so the first invocation by fold will result in a list containing [Noop; MSS 127]. Each subsequent call to iter by fold will result in another MSS 127 in the list, until the list grows so large it exhausts the available memory for our virtual machine, and results in a hard crash of the host VM.

A train car is on its side, with a crane hoisting it back into place

:(

There are a few things we should do to fix this:

  • fix Cstruct to not allow get_uint16 buf (and friends) to access memory outside the bounds of buf
  • fix unmarshal to check for nonsensible option lengths like 0

Anil did the former, and I did the (much less critical, given the former) latter.

That’s Pretty Bad. How Could It Be Worse?

Imagine if the option we’d gotten looked like this, instead:

+-----+-----+-----+-----+
|     |     |     |     |
| 001 | 002 | 003 | 000 |
|     |     |     |     |
+-----+-----+-----+-----+

Instead of infinitely recursing in the iter function, this will nicely return an MSS consisting of 000 in the high byte and whatever memory is sitting past the end of this buffer in the low bit. The TCP stack will consider this to be the maximum segment size that the remote host on the other end will accept, and won’t send any packets larger than this. A crafty attacker can observe the maximum size of received segments to figure out what was in this byte of memory. This kind of bug is called a data leak, and it has potentially disastrous implications, especially for critical infrastructure like TLS servers.

Without soapboxing too much, I think it’s better for developers to catch this kind of problem early. Bugs like this one, where it’s obvious that someone is supposed to making sure that something bad doesn’t happen, are difficult to see by reading the source but are made more obvious by automatic testing. “Oh no, something went wrong!” is a pretty good indication that something could, in fact, go wrong.

If you’re saying, “Okay, but this particular leak gets you one byte of data at a time, and in a way that’s hard to recover programmatically; the actual severity of this doesn’t seem very high,” consider that Cstruct is widely used outside the context of the TCP stack, and that any code of this pattern:

let leak_data flow buf = 
  let reasonable_looking_number = value_past_end_of_buf in
  let offset = reasonable_looking_number in
  let data = Cstruct.BE.get_uint32 buf offset in
  S.TCPV4.write flow data

is a data leakage that’s exploitable by an attacker.

Luckily for us, this bug is fixable in Cstruct, and folks developing on top of this infrastructure need only update to a version with this bug patched, rather than rewriting code in many places (as they might have to do in another language.

…Wait, Why Did We Make That Packet In The First Place?

I expected my fuzzer to only send options malformed in these ways:

  • length syntactically correct, but violates the spec
  • length correct, but content is invalid

because scapy automatically handles setting the length field of TCP options. To recap, the exciting behavior above was triggered by sending a TCP option that looked like this:

+-----+-----+-----+-----+
|     |     |     |     |
| 001 | 002 | 000 | 000 |
|     |     |     |     |
+-----+-----+-----+-----+

which looks an awful lot like an option with an improperly calculated length to me. It took me some staring and frowning to figure out that this packet is produced when the randomly-generated option-kind is 001 and the randomly-generated amount of data to send is 000, meaning our scapy packet gets an options array consisting of [(1, '')].

Here’s the relevant code path for assembling TCP options in scapy (extracted from a larger function):

# onum is the option kind, oval is the option data
# they are passed to this function as (onum, oval), e.g. (02, '\xff\xab')
# opt is the byte string that will be included in the overall field of TCP options
# e.g. (\x02\x04\xff\xab).
for oname, oval in x:
   opt=""
   onum = oname
   if type(oval) is not str:
       warning("option [%i] is not string."%onum)
       continue
   opt += chr(onum)+chr(2+len(oval))+oval

return opt+"\x00"*(3-((len(opt)+3)%4))

In the assembly function, scapy doesn’t differentiate between option-kind 001 (or 000, for that matter) and other option-kinds, and gives the option a length and data field. The length is 002 and so the option needs padding to be properly aligned, so scapy appends two 000 bytes to the end. That’s how we get our Packet of Death.

What Next?

It’d be nice to be able to generate random packets that test all of the error conditions I laid out in my last post, for all option-kinds:

  • option length is greater than the data available
  • option length is less than the data available
  • length syntactically correct, but violates the spec
  • length correct, but content is invalid
  • options are valid, but not properly included in the overall options field
  • option list is valid, but not properly padded

Also, the IP header has an options field of its own, which could use some similar investigation.