As a result of great encouragement from colleagues and friends, I gave a few talks in September. Persistent Networking with Irmin and MirageOS, which I gave at the OCaml Workshop, is a talk on sticking a persistent database into various levels of the network stack. (It includes demonstrations from What a Distributed, Version-Controlled ARP Cache Gets You, as well as an Irmin-ified NAT device that I haven’t yet written up here.)
Most instructions on how to get started with OCaml packages now advise the user to get started with opam, which is excellent advice. Getting up and running with opam is pretty easy, but I wasn’t sure where to go from there when I wanted to modify other people’s packages and use the modifications in my environment. I wish I’d realized that the documentation for making packages has a lot of applicable advice for that use case, as well as for the apparent target (making your own packages from scratch).
In 2014, I spent 12 weeks at the Recurse Center, formerly (and at the time) known as Hacker School. After finishing up my time there in May of that year, a lot of people asked me reasonable questions like:
- How was the Recurse Center?
- Was attending RC worth your time?
- What did you learn at the Recurse Center?
My response to these questions was “I don’t know yet! It’s too early to say.” Now that more than a year has passed, I think I might have some idea of where to start.
git (and its distributed version control system friends, like darcs) have some great properties. Not only do you get a full history of changes on objects stored in them, you can get comments on changes, as well as branching and merging, which lets you do intermediate changes without messing up state for other entities which want to work with the repository.
That’s all pretty cool. I actually want that for some of my data structures, come to think of it. Say, for example, a boring ol’ key-value store which can be updated from a few different threads – in this case, a cache that stores values it gets from the network and the querying/timeout code around it. It would be nice if each thread could make a new branch, make its changes, then merge them into the primary branch once it’s done.
It turns out you can totally do that with Irmin, “the database that never forgets”! I did (and am still doing) a bit of work on sticking a modified version of the MirageOS address resolution protocol code’s data structures into Irmin:
    $ git log --all --decorate --oneline --graph
    * 68216f3 (HEAD, primary, expire_1429688434.645130) Arp.tick: updating to age out old entries
    * ec10c9a entry added: 192.168.3.1 -> 02:50:2a:16:6d:01
    * 6446cef entry added: 10.20.254.2 -> 02:50:2a:16:6d:01
    * 81cfa43 entry added: 10.50.20.22 -> 02:50:2a:16:6d:01
    *   4e1e1c7 Arp.tick: merge expiry branch
    |\
    | * cd787a0 (expire_1429688374.601896) Arp.tick: updating to age out old entries
    * | 8df2ef7 entry added: 10.23.10.1 -> 02:50:2a:16:6d:01
    |/
    * 8d11bba Arp.create: Initial empty cache
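To make the branch-then-merge idea concrete, here is a toy sketch in plain OCaml. It is not Irmin's API and not the MirageOS ARP code: `Cache`, `merge3`, and the 10.0.0.1 entry are names and values made up for illustration, and the "ours wins on conflict" policy is just one assumption among several reasonable ones.

```ocaml
(* Toy model of a three-way merge on a key-value cache.
   NOT Irmin's API; every name here is hypothetical. *)
module Cache = Map.Make (String)

(* Merge two snapshots that both descend from [base].
   For each key: if [ours] left it as it was in [base], take whatever
   [theirs] has (including a deletion); otherwise keep [ours].
   When both sides changed a key, [ours] wins (an arbitrary policy). *)
let merge3 ~base ~ours ~theirs =
  Cache.merge
    (fun key o t ->
      let b = Cache.find_opt key base in
      if o = b then t else o)
    ours theirs

let () =
  (* Common ancestor: a cache holding one (made-up) entry. *)
  let base = Cache.add "10.0.0.1" "02:50:2a:16:6d:01" Cache.empty in
  (* The expiry branch ages out the old entry... *)
  let expiry = Cache.remove "10.0.0.1" base in
  (* ...while the primary branch learns a new one in the meantime. *)
  let primary = Cache.add "10.23.10.1" "02:50:2a:16:6d:01" base in
  let merged = merge3 ~base ~ours:primary ~theirs:expiry in
  assert (not (Cache.mem "10.0.0.1" merged));
  assert (Cache.find "10.23.10.1" merged = "02:50:2a:16:6d:01")
```

Both changes survive the merge: the aged-out entry is gone and the newly learned one is present, which is roughly the shape of the merge commit in the log above. A real store would also keep the history and commit messages, which this sketch ignores entirely.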
When last we spoke, I left you with a teaser about writing your own NAT implementation.
iptables (and friends like pf, to be a little less partisan and outdated) provide the interfaces to the kernel modules that implement NAT in many widely-used routers. If we wanted to implement our own in a traditional OS, we’d have to either take a big dive into kernel programming or find a way to manipulate packets at the Ethernet layer in userspace.
But if all we need to do is NAT traffic, why not just build something that only knows how to NAT traffic? I’ve looked at building networked applications on top of (and with) the full network stack provided by the MirageOS library OS a lot, but we can also build lower-level applications with fundamentally the same programming tactics and tools we use to write, for example, DNS resolvers.
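For a sense of how little "something that only knows how to NAT" needs at its core, here is a rough sketch of a port-translation table in plain OCaml. It is an illustration under my own assumptions, not code from any MirageOS library: the `endpoint` and `table` types, the function names, the starting port, and the example addresses are all hypothetical, and it ignores protocols, timeouts, and port exhaustion.

```ocaml
(* Hypothetical NAT translation table; not taken from any MirageOS library. *)
type endpoint = { ip : string; port : int }

type table = {
  outbound : (endpoint, int) Hashtbl.t; (* internal endpoint -> external port *)
  inbound : (int, endpoint) Hashtbl.t;  (* external port -> internal endpoint *)
  mutable next_port : int;              (* next external port to hand out *)
}

let create ?(first_port = 40000) () =
  { outbound = Hashtbl.create 16;
    inbound = Hashtbl.create 16;
    next_port = first_port }

(* Rewrite the source of an outgoing packet, allocating an external
   port the first time a given internal endpoint is seen. *)
let translate_out t ~external_ip src =
  let port =
    match Hashtbl.find_opt t.outbound src with
    | Some p -> p
    | None ->
      let p = t.next_port in
      t.next_port <- p + 1;
      Hashtbl.replace t.outbound src p;
      Hashtbl.replace t.inbound p src;
      p
  in
  { ip = external_ip; port }

(* Find the internal destination for a reply arriving on [port], if any. *)
let translate_in t port = Hashtbl.find_opt t.inbound port

let () =
  let t = create () in
  let host = { ip = "10.0.0.2"; port = 1234 } in
  let out = translate_out t ~external_ip:"203.0.113.1" host in
  (* A reply to the translated port maps back to the original host. *)
  assert (translate_in t out.port = Some host)
```

Everything else a NAT device does (parsing headers, rewriting checksums, forwarding frames) sits around a lookup like this one.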
Building A Typical Stack From Scratch
Let’s have a look at the ethif-v4 example in the mirage-skeleton example repository. This example unikernel shows how to build a network stack “by hand” from a bunch of different functors, starting from a physical device (provided by config.ml at build time, representing either a Xen backend if you configure with mirage configure --xen or a Unix tuntap backend if you build with mirage configure --unix). I’ve reproduced the network setup bits from the most recent version as of now and annotated them a bit: