Crowbar is a tool that combines afl-persistent’s instrumentation with quickcheck-like property-based testing. afl-fuzz is a great tool for detecting crashes, but Crowbar helps us go a step farther and automatically discover inputs which cause our program to no longer have the properties we expect it to have.
For reasons that don’t need exploring at this juncture, I first thought to apply Crowbar to charrua-client, a library which implements the DHCP state machine from a client perspective. Code snippets below are taken from the dhcp directory in my somerandompacket repository on GitHub.
Where to begin?
charrua-client attempts to express a simple API:
1 2 3 4 5 6 7 8
which can then be used by a specific implementation that sends packets and manages timeouts. With Crowbar, we can randomize the parameters to these functions, and make sure that the properties we expect still hold.
Generators and How to Make Them
In order to do much with the code above, we’ll need to get a
Dhcp_client.t, and the only way to do that is via
Dhcp_client.create, which needs a
Macaddr.t (six bytes corresponding to a MAC address) and a
Dhcp_wire.option_code list (a list of DHCP options to request in the initial message). To start with, we can write a Crowbar generator for a
Macaddr.t, a generator for
Dhcp_wire.option_codes, and use those as arguments to
1 2 3 4 5 6 7 8 9 10
opt_code generators, we can then use
Dhcp_client.create to get a client and a suggested first message. We can check to see whether the message asks for the option codes as we expect:
1 2 3 4 5 6
The generators for
Dhcp_wire.pkt are fairly complicated because nearly all possible option types are handled — if you’re interested, you can see generators.ml in the source repository for the code discussed here.
Useful Properties to Test
To construct more tests for charrua-client in Crowbar, it’s necessary to think about the properties a client should have. Here are some simpler ones:
- clients generate DHCPDISCOVER packets that ask for the options specified in
create(we made a test fort this above)
- clients which have not received input don’t have a lease
- clients which have received only one input frame don’t have a lease
- clients only change their state in response to messages where the xid matches the one the client expects
A troublesome feature of code which includes a state machine is some dependency on state. Since the client’s reaction to various inputs is not purely expressed by the code and the input alone, but is also a function of the current state (and therefore the set of previously-sent input which was accepted), it’s helpful for us to include tests which get the client into various states. Then we can quickly discover new paths, rather than having to wait for the fuzzer to discover sets of input which result in them from the initial state. (When using
afl-fuzz, this is commonly done by including examples of input which advance the state machine in the
in/ directory, so that
afl-fuzz’s mutation of the input begins from an “interesting” point.)
t in our
Dhcp_client module is abstract, and no interface to the current state of the client is exposed. A user of the
Dhcp_client module can only find out whether a client has a lease or doesn’t, and what to do in response to input. It’s tempting to expose the state directly for easier testing (e.g., “after calling
input with a
DHCPOFFER packet, the state is
REQUESTING”), but that information isn’t necessary to expose in common use — we want our library to make the right decisions, not to punt them to the user. Instead, we can use “generates a new response or notes a new valid lease” as a proxy for “changed state”.
We also want to make an assertion about the client: that given a server that responds “reasonably” to client requests, the client’s state also advances as we’d expect it to. In more concrete terms, we want to assert that usually, the following conversation happens:
1 2 3 4
and that when the client hears
DHCPACK, it notes that it now has a valid lease and exposes this information to any caller.
In other words, the test we want to write is:
- the client and a server negotiate a lease in four messages
In order to write such a test, we need to have some “reasonable” server. Luckily, the
Dhcp_server module in
charrua-core.server exposes the configuration record, so we can automatically generate one in Crowbar:
1 2 3 4 5 6 7 8 9 10 11
As a precondition for the test we’d actually like to write, it’s sensible to write a more basic one first:
- the client and server generate messages which are mutually intelligible
which looks like this:
1 2 3 4 5 6 7 8 9 10 11 12 13
and once this passes, we can do something more complicated:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
Bonus Testing: charrua-core
charrua-client is, at its heart, a fairly simple wrapper around
charrua-core. In particular,
charrua-client mostly exposes interfaces which deal with input and output in terms of
buffer (which is a
charrua-client itself immediately calls
Dhcp_wire.pkt_of_buf on the
Cstruct.t supplied to
input, and similarly calls
Dhcp_wire.pkt_to_buf on a constructed
pkt just before returning a
Response. Several of the tests above hinted at inconsistent behavior in
pkt_to_buf, so perhaps some direct testing is warranted.
Dhcp_wire.pkt is exposed in
Dhcp_wire’s public interface, so we can write a generator for
Dhcp_wire.pkts. (The generator for
pkts is quite a bit more involved than any of the generators above, as it also includes reference to a generator for
Dhcp_wire.dhcp_options, a type with over 100 variants.)
A common test for sets of functions like
pkt_of_buf is to see whether the
pkt is the same after being serialized and then deserialized. Hopefully, a round-trip through
pkt_of_buf won’t gain or lose any information.
1 2 3 4 5 6 7 8 9 10 11
The full test suite includes a few more tests which are less visually interesting (or too long to reproduce here!).
charrua-client’s API wasn’t sufficient for the user to make the correct decision about what to do given arbitrary inputs (diff)
charrua-coreparsed some options which contained lists incorrectly: (#49, #54, #55)
charrua-corelooked for the
flagsfield of the DHCP header at the wrong offset: (#51)
charrua-coredidn’t have parity between the DHCP options it was possible to emit and those it understood (#50, #52)
charrua-corehad a few typos in its option code processing, meaning some options couldn’t be expressed and some would be expressed counter to intentions (#53, #48)
charrua-corevaliantly attempted to serialize empty lists in some cases where the spec disallows it (#47, #48)
After my experiences with charrua-core’s predecessor, I was very impressed by how hard I had to try to find bugs in
charrua-core. I was additionally impressed by how easy it was to fix said bugs once I had discovered them — generally it took only minutes or tens of minutes to discover and isolate the problem, and the fix was fairly obvious. I would caution against concluding that
charrua-core is a low-quality library based on these findings; it is very hard to write software that doesn’t have any bugs. If you think you’ve done so, I recommend seeing whether Crowbar agrees!
Thanks to Stephen Dolan for Crowbar, Christiano Haesbaert and Gina Maini for charrua-core, and to Thomas Leonard and Hannes Mehnert for comments on this post.