Parsers Optional
Friends, I have spoken to you of TCP and of fuzzing. Next I will speak to you of both, but today, I will speak to you of TCP options. If you’re here for the pwnage, sit tight; it’s coming.
What Even Is TCP Anyway
Here’s the lazy way of explaining it: TCP is the abstraction layer that allows you to pretend that network communication works in a logical, orderly, reliable fashion when you’re writing an application. Reading data and having it always be in the order it was sent? TCP. Being able to know whether a connection is open or closed? TCP. Knowing the difference between data coming from two separate processes on the same remote host? TCP. (There are other ways to get these guarantees, but the vast majority of Internet traffic that needs them gets them via TCP.)
On a less abstract level, TCP is a header (one of several!) that your operating system slaps on your network traffic before shipping it over the wire, on the way to its final destination. For damn near all the information on TCP you can shake a stick at, you can consult RFC 793 directly. The header summary, most relevant for our exploration, is reproduced below:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgment Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |           |U|A|P|R|S|F|                               |
| Offset| Reserved  |R|C|S|S|Y|I|            Window             |
|       |           |G|K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |         Urgent Pointer        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             data                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Everything here is a fixed-length field except for Options, Padding, and data, all of which are optional. Data is up to the application, when it’s present (and is also frequently referred to as payload). When you loaded this web page, TCP packets were sent from my server at somerandomidiot.com to your computer, and the contents of the data field were these very words that you’re reading right now. TCP is data-agnostic; it only cares that your payload arrives intact, not what’s in it.
Options, on the other hand, are very much TCP’s concern.
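To make the layout concrete, here is a minimal Python sketch that unpacks the fixed 20-byte part of the header with the standard struct module and uses Data Offset to split options from data. The name parse_fixed_header and the dictionary keys are mine, not anything from the RFC, and the sample segment is hand-built for illustration (its checksum is left at 0 and never verified).

```python
import struct

FIXED_HEADER_LEN = 20  # bytes of header before Options begin

def parse_fixed_header(segment: bytes) -> dict:
    """Unpack the fixed-length TCP header fields (hypothetical helper)."""
    if len(segment) < FIXED_HEADER_LEN:
        raise ValueError("segment shorter than the fixed TCP header")
    src, dst, seq, ack, off_flags, window, checksum, urgent = struct.unpack(
        "!HHIIHHHH", segment[:FIXED_HEADER_LEN])
    data_offset = off_flags >> 12      # header length, in 32-bit words
    header_len = data_offset * 4       # header length, in bytes
    if header_len < FIXED_HEADER_LEN or header_len > len(segment):
        raise ValueError("Data Offset does not fit this segment")
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "flags": off_flags & 0x3F,     # URG, ACK, PSH, RST, SYN, FIN bits
        "window": window, "checksum": checksum, "urgent": urgent,
        "options": segment[FIXED_HEADER_LEN:header_len],
        "data": segment[header_len:],
    }

# Hand-built SYN: Data Offset of 6 words means 4 bytes of options follow.
sample = struct.pack("!HHIIHHHH", 12345, 80, 1, 0, (6 << 12) | 0x02, 8192, 0, 0)
sample += bytes([2, 4, 5, 180]) + b"hi"
fields = parse_fixed_header(sample)
```

Note that the only thing telling the parser where options stop and data starts is Data Offset; nothing in the options themselves marks that boundary.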
TCP : US Constitution :: Options : Amendments
Options serve a few different purposes, and some can serve more than one:
- Communicate a desire to change the behavior of some aspect of TCP from the RFC 793 spec to some other spec
- Send, or request the receipt of, information that is not required for proper operation of the connection, but may be useful
- Send tuning parameters that allow both sides to agree on a more performant connection
Providing for options is likely the reason TCP, published in September 1981, is still widely used. A general specification for the format of options is provided in RFC 793, so any TCP implementation will at least be able to recognize and ignore options that it doesn’t know how to implement.
What Options Look Like
+---------------+---------------+----------------+-------------------...---------------+
|               |               |                |                     |               |
|  option kind  | option length | option byte 0  | option byte 1 ...   | option byte n |
|               |               |                |                     |               |
+---------------+---------------+----------------+-------------------...---------------+
Each box represents one 8-bit byte.
“Option kind” is a unique code number assigned to each option. There’s a handy chart of them maintained by the IANA, but all that’s really important to know for our purposes is that option kinds 0 and 1 are laid out in RFC 793 and structurally different from all other option kinds.
“Option length” means the overall length of the option, not the length of the option data. An option with 6 bytes of data will have a length of 8: 6 bytes of option data, plus one byte for the option kind and one byte for the option length. (All numbers are in decimal, both in this paragraph and in the rest of this document.)
A packet can have multiple TCP options specified; to send multiple options, simply order them one-after-another in the TCP header.
The chunk of the TCP packet that holds the options must also have a total length (in bytes) that is evenly divisible by 4, because the Data Offset field counts the header length in 32-bit words.
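Here is a small sketch of those rules in Python, covering the kind/length/data layout, the single-byte kinds 0 and 1, and padding to a multiple of 4. The function names are made up for this post. The 40-byte cap follows from Data Offset being a 4-bit count of 32-bit words (60 bytes at most, minus the 20 fixed ones).

```python
END_OF_OPTIONS = 0  # RFC 793: a single byte, no length field
NO_OPERATION = 1    # RFC 793: a single byte, no length field

def encode_option(kind: int, data: bytes = b"") -> bytes:
    """Encode one option as kind, length, data (hypothetical helper)."""
    if kind in (END_OF_OPTIONS, NO_OPERATION):
        if data:
            raise ValueError("kinds 0 and 1 carry no data")
        return bytes([kind])
    length = 2 + len(data)  # length counts the kind and length bytes too
    if length > 255:
        raise ValueError("option too long for a one-byte length field")
    return bytes([kind, length]) + data

def encode_option_list(options) -> bytes:
    """Concatenate options, then terminate and zero-pad to a multiple of 4."""
    raw = b"".join(encode_option(kind, data) for kind, data in options)
    if len(raw) % 4:
        raw += bytes([END_OF_OPTIONS])
        raw += bytes(-len(raw) % 4)  # zero padding up to the next 4-byte word
    if len(raw) > 40:
        raise ValueError("options cannot exceed 40 bytes")
    return raw

six_data_bytes = encode_option(16, bytes(6))  # total length field is 8
region = encode_option_list([(2, b"\x05\xb4"), (16, b"\x01\x00\x00")])
```

The second call produces 9 bytes of options, one end-of-options byte, and 2 bytes of padding, for 12 bytes in all.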
Seems simple enough, right? Let’s think of a bunch of ways that it might go wrong, because that’s what I call fun!
Option length is greater than the actual length of the option
Imagine receiving a packet with a TCP option like this:
+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |     |
| 016 | 008 | 127 | 067 | 014 | 023 |
|     |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+
The length is said to be 8 bytes, but there are only 6 bytes of information here. A parser that trusts the length will attempt to read the next two bytes in memory and take them to be the concluding two bytes of the option data.
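The defensive version of this is a bounds check before every read. The sketch below is my own illustration, not code from any real TCP stack: parse_options walks an options region and refuses to follow a length that points past the end of the buffer.

```python
def parse_options(region: bytes) -> list:
    """Walk a TCP options region without ever reading past its end."""
    options, i = [], 0
    while i < len(region):
        kind = region[i]
        if kind == 0:   # end-of-options: stop parsing
            break
        if kind == 1:   # no-operation: one byte, no length field
            i += 1
            continue
        if i + 1 >= len(region):
            raise ValueError(f"option {kind} has no length byte")
        length = region[i + 1]
        if length < 2:
            raise ValueError(f"option {kind} claims impossible length {length}")
        if i + length > len(region):
            raise ValueError(
                f"option {kind} claims {length} bytes, "
                f"only {len(region) - i} remain")
        options.append((kind, region[i + 2:i + length]))
        i += length
    return options

overrun_error = None
try:
    parse_options(bytes([16, 8, 127, 67, 14, 23]))  # the option pictured above
except ValueError as err:
    overrun_error = str(err)
```

In Python an overrun would only produce a short slice, but the same unchecked logic in C reads whatever happens to sit after the packet buffer.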
Option length is less than the actual length of the option
+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |     |
| 016 | 002 | 127 | 067 | 014 | 023 |
|     |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+
Here, the actual length of the option is 6, but the claimed length is only 2. The parser will take the third and fourth bytes (127 and 067, respectively) and assume they’re the option kind and option length of a second option in the list of options. (Note that attempting to parse this spurious second option results in behavior like the previous example, where the claimed length was greater than the number of bytes supplied.)
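To watch that resynchronization happen, here is a deliberately trusting walker (naive_walk is a made-up name). It believes every length byte it sees and records what it thinks each option is. Python slices truncate quietly, so the overrun shows up as a short data field rather than as stray memory.

```python
def naive_walk(region: bytes) -> list:
    """Trust every length byte; return (kind, claimed_length, data) tuples."""
    seen, i = [], 0
    while i < len(region):
        kind = region[i]
        if kind == 0:   # end-of-options
            break
        if kind == 1:   # no-operation
            i += 1
            continue
        if i + 1 >= len(region):
            seen.append((kind, None, b""))
            break
        length = region[i + 1]
        # No bounds check: the slice simply comes back shorter than claimed.
        seen.append((kind, length, bytes(region[i + 2:i + length])))
        i += max(length, 2)  # always advance, even on lengths of 0 or 1
    return seen

misparsed = naive_walk(bytes([16, 2, 127, 67, 14, 23]))
# misparsed == [(16, 2, b''), (127, 67, b'\x0e\x17')]
```

The walker reports an empty option 16, followed by an "option 127" that claims 67 bytes and actually has 2 bytes of data.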
Length is syntactically correct, but violates the spec
+-----+-----+-----+-----+-----+
|     |     |     |     |     |
| 002 | 005 | 001 | 000 | 000 |
|     |     |     |     |     |
+-----+-----+-----+-----+-----+
Option 2 is defined in the TCP specification with a fixed length of 4 bytes: 1 byte for option kind, 1 for option length (which is always 4), and 2 for the maximum segment size (MSS for short), in bytes, that the host is capable of receiving. If option 2 were defined as variable-length, the packet above would be a correct way of specifying an MSS of 65,536 bytes, but in fact this packet violates the spec.
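A parser can catch this by keeping a table of the option kinds it knows to be fixed-length. The sketch below lists only the MSS option discussed here. FIXED_LENGTHS and check_fixed_length are names I invented, and it assumes the option has already been sliced out of the list.

```python
FIXED_LENGTHS = {2: 4}  # option kind -> required total length (MSS only)

def check_fixed_length(option: bytes) -> None:
    """Raise if a known fixed-length option claims a different length."""
    kind, length = option[0], option[1]
    expected = FIXED_LENGTHS.get(kind)
    if expected is not None and length != expected:
        raise ValueError(
            f"option {kind} must have length {expected}, got {length}")

bad_mss = bytes([2, 5, 1, 0, 0])  # the option pictured above

# What the three data bytes would mean if variable length were allowed:
as_variable_length = int.from_bytes(bad_mss[2:], "big")  # 65536

try:
    check_fixed_length(bad_mss)
    rejected = False
except ValueError:
    rejected = True
```

Kinds missing from the table pass through unchecked, which is the right default for options the parser has never heard of.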
Length is correct, but content is invalid
+-----+-----+-----+-----+
|     |     |     |     |
| 002 | 004 | 000 | 000 |
|     |     |     |     |
+-----+-----+-----+-----+
Here’s our old friend, the MSS option, again. This time, we have a packet with the correct length (4 overall; 2 bytes of data). Our 2 bytes of data are 000 and 000, which means that the maximum segment size for the host sending the packet is 0. This is a completely nonsensical maximum segment size, and to take it seriously would mean forgoing any communication with that host (since it can’t actually process any segments of nontrivial size).
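Checking content is a separate step from checking structure. This sketch decodes the MSS value and rejects zero. Rejecting is just one policy I picked for illustration; ignoring the option or clamping the value to some floor are equally plausible, and I make no claim about what any particular stack does.

```python
import struct

def parse_mss(option: bytes) -> int:
    """Decode a 4-byte MSS option, refusing a nonsensical value of 0."""
    if len(option) != 4 or option[0] != 2 or option[1] != 4:
        raise ValueError("not a well-formed MSS option")
    (mss,) = struct.unpack("!H", option[2:4])  # 16-bit, network byte order
    if mss == 0:
        raise ValueError("an MSS of 0 would forbid sending any data")
    return mss

typical = parse_mss(bytes([2, 4, 5, 180]))  # 5 * 256 + 180 = 1460

try:
    parse_mss(bytes([2, 4, 0, 0]))          # the option pictured above
    zero_rejected = False
except ValueError:
    zero_rejected = True
```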
Options are themselves valid, but not properly included in the overall options list
+-----+-----+-----+-----+-----+-------...
|     |     |     |     |     |  TCP   |
| 016 | 005 | 001 | 000 | 000 | payload|
|     |     |     |     |     |  data  |
+-----+-----+-----+-----+-----+-------...
Above I’ve attempted to depict a TCP packet where there is only one TCP option. It’s a well-formed option of type 16, length 5, with 3 bytes of option data. However, it’s not followed by an “end-of-options” terminator (a single byte set to 0), which is required for any list of options whose total length in bytes isn’t divisible by 4, so a naive parser may attempt to read the TCP payload data as more TCP options.
Options lists that are correct, but not properly included in the overall packet
+-----+-----+-----+-----+-----+-----+--------+--------+-------...
|     |     |     |     |     |     |  TCP   |  TCP   |  TCP   |
| 016 | 005 | 001 | 000 | 000 | 000 | payload| payload| payload|
|     |     |     |     |     |     | data 0 | data 1 | data 2 |
+-----+-----+-----+-----+-----+-----+--------+--------+-------...
                        ^                             ^
                    boundary 1                    boundary 2
This packet has two TCP options: a 5-byte option 16 and a 1-byte option 0. The overall list of options has a length of 6 bytes and is immediately followed by TCP payload data. Unfortunately, 6 is not evenly divisible by 4, so the beginning of the TCP payload data is not expressible as a multiple of 4 bytes; the header can only claim that it falls at boundary 1 or boundary 2, marked above. Both of these boundaries are incorrect: the payload will either have two extra 0 bytes at the beginning, or be missing its first two bytes (TCP payload data 0 and data 1 above).
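The arithmetic behind the two boundaries is short enough to sketch. Given an options length, the helper below (expressible_boundaries, a name of my own) returns the nearest payload start offsets that a Data Offset value can describe. The bytes after the fixed header are a stand-in for the packet pictured above, with b"PAYLOAD" playing the part of the data.

```python
FIXED_HEADER_LEN = 20  # bytes of TCP header before the options

def expressible_boundaries(options_len: int) -> tuple:
    """Nearest 4-byte-aligned offsets at or around options_len."""
    lower = options_len - options_len % 4
    upper = lower + 4 if options_len % 4 else lower
    return lower, upper

after_fixed_header = bytes([16, 5, 1, 0, 0, 0]) + b"PAYLOAD"
boundary_1, boundary_2 = expressible_boundaries(6)  # (4, 8)

# The Data Offset values (in 32-bit words) that would claim each boundary:
offset_1 = (FIXED_HEADER_LEN + boundary_1) // 4  # 6
offset_2 = (FIXED_HEADER_LEN + boundary_2) // 4  # 7

payload_1 = after_fixed_header[boundary_1:]  # b"\x00\x00PAYLOAD"
payload_2 = after_fixed_header[boundary_2:]  # b"YLOAD"
```

Neither slice equals b"PAYLOAD". The sender's only way out is to pad the options to 8 bytes before appending the data.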
Cool Story, Bro
So, if we know it’s possible to handle all these things incorrectly, why not fire a bunch of options that are messed-up like this at a TCP stack and see whether it handles them incorrectly? The short answer is, “because there’s not a nice Scapy module for fuzzing that.” Luckily for us all, it’s possible (although less straightforward than anticipated) to write one, and moreover it’s almost immediately gratifying - stay tuned to watch an entire host VM crash to a grinding halt on the receipt of a single packet.