Some Random Idiot

Let's Play Network Address Translation: The Home Game

When last we spoke, I left you with a teaser about writing your own NAT implementation. iptables (and friends nftables and pf, to be a little less partisan and outdated) provide the interfaces to the kernel modules that implement NAT in many widely-used routers. If we wanted to implement our own in a traditional OS, we’d have to either take a big dive into kernel programming or find a way to manipulate packets at the Ethernet layer in userspace.

But if all we need to do is NAT traffic, why not just build something that only knows how to NAT traffic? I’ve spent a lot of time building networked applications on top of (and with) the full network stack provided by the MirageOS library OS, but we can also build lower-level applications with fundamentally the same programming tactics and tools we use to write, for example, DNS resolvers.

Building A Typical Stack From Scratch

Let’s have a look at the ethif-v4 example in the mirage-skeleton example repository. This example unikernel shows how to build a network stack “by hand” from a bunch of different functors, starting from a physical device (provided by config.ml at build time, representing either a Xen backend if you configure with mirage configure --xen or a Unix tuntap backend if you build with mirage configure --unix). I’ve reproduced the network setup bits from the most recent version as of now and annotated them a bit:

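(* Lwt provides >>=, return, and fail; V1_LWT provides the CONSOLE and
   NETWORK signatures used below *)
open Lwt
open V1_LWT

module Main (C: CONSOLE) (N: NETWORK) (Clock : V1.CLOCK) = struct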

  (* N, a module of type NETWORK (defined in module V1_LWT 
     from mirage-types), is the building point for the 
     rest of our stack.  Modules E, I, U, and T provide 
     functions like [write], which take a record of the type 
     matching the module (e.g., E.write needs an E.t argument)
     along with some information to write and generate a
     reasonable set of headers of the appropriate layer before 
     calling a lower-level [write] function. 
     *)

  module E = Ethif.Make(N)
  module I = Ipv4.Make(E)
  module U = Udp.Make(I)
  (* Ethernet, Ipv4, and UDP don't need outside timers or randomness, 
     just an underlying implementation to listen from and write to,
     but TCP does *)
  module T = Tcp.Flow.Make(I)(OS.Time)(Clock)(Random)
  (* DHCP also needs timers and randomness *)
  module D = Dhcp_clientv4.Make(C)(OS.Time)(Random)(U)

  let or_error c name fn t =
    fn t
    >>= function
    | `Error e -> fail (Failure ("Error starting " ^ name))
    | `Ok t -> return t

  let start c net _ = (* net is of type N.t *)
    or_error c "Ethif" E.connect net
    >>= fun e ->
    (* e is of type E.t, on which we can call 
    ethernet-level listen and write *)

    or_error c "Ipv4" I.connect e
    >>= fun i ->
    (* we can manually set IP options here for interface i, 
       in addition to overwriting them (potentially) with 
       DHCP below *)
    I.set_ip i (Ipaddr.V4.of_string_exn "10.0.0.2")
    >>= fun () ->
    I.set_ip_netmask i (Ipaddr.V4.of_string_exn "255.255.255.0")
    >>= fun () ->
    I.set_ip_gateways i [Ipaddr.V4.of_string_exn "10.0.0.1"]
    >>= fun () ->
    or_error c "UDPv4" U.connect i
    >>= fun udp ->
    let dhcp, offers = D.create c (N.mac net) udp in
    or_error c "TCPv4" T.connect i
    >>= fun tcp ->
    (* main body of code continues... *)

The code doesn’t do much once it’s built the stack (it just prints lines to the console when various types of traffic are received), so I’ve elided that portion from the reproduction here. If we wanted to work with an E.t (a type representing the Ethernet-layer communications on that interface), an I.t (the IP layer), or even the raw physical device passed to the start function as net, we could do that just as we can work with tcp or udp.

Working with Multiple Network Interfaces

Working with two interfaces rather than one is fairly similar. A nice minimal example, working right down on the netif layer, is the netif-forward example unikernel, also in mirage-skeleton. The config.ml for this unikernel defines two interfaces, and unikernel.ml provides a module Main functorized over two modules of type NETWORK – there’s no expectation that these are necessarily the same type of physical interface, just that they both know how to satisfy the basic operations required of a network device.

Instead of building something on top of the provided netifs, netif-forward (as of the latest revision) works with them directly — it takes packets from the first interface (n1, of type N1.t), queues them, and then sends them out the second interface (n2, of type N2.t) as quickly as it can.

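open Lwt
open V1_LWT
open Printf

(* packet counters used by update_packet_count and forward_thread below *)
let packets_in = ref 0l
let packets_waiting = ref 0l

module Main (C: CONSOLE)(N1: NETWORK)(N2: NETWORK) = struct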

  let (in_queue, in_push) = Lwt_stream.create ()
  let (out_queue, out_push) = Lwt_stream.create ()

  let listen nf =
    let hw_addr =  Macaddr.to_string (N1.mac nf) in
    let _ = printf "listening on the interface with mac address '%s' \n%!" hw_addr in
    N1.listen nf (fun frame -> return (in_push (Some frame)))

  let update_packet_count () =
    let _ = packets_in := Int32.succ !packets_in in
    let _ = packets_waiting := Int32.succ !packets_waiting in
    if (Int32.logand !packets_in 0xfl) = 0l then
        let _ = printf "packets (in = %ld) (not forwarded = %ld)" !packets_in !packets_waiting in
        print_endline ""

  let start console n1 n2 =

    let forward_thread nf =
      while_lwt true do
        lwt _ = Lwt_stream.next in_queue >>= fun frame -> return (out_push (Some frame)) in
        return (update_packet_count ())
      done
      <?> (
      while_lwt true do
        lwt frame = Lwt_stream.next out_queue in
          let _ = packets_waiting := Int32.pred !packets_waiting in
          N2.write nf frame
      done
      )
    in
    (listen n1) <?> (forward_thread n2)
    >> return (print_endline "terminated.")

end

Building a NAT Library and Unikernel

For our NAT implementation, we need to be able to:

  • make reference to the publicly-routable IP address on the Internet-facing interface
  • generate new and unique port numbers to use to disambiguate traffic from different hosts on the private network side
  • keep a table mapping private-network connections to their public-network analogs
  • add new entries to the table based on new connection attempts
  • alter Ethernet, IP, TCP, and UDP headers of incoming and outgoing packets:
    • replace IP addresses and ports according to table entries
    • recalculate checksums on IP and transport layers after making other mutations
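
The checksum step in the last item is the most mechanical of the lot, so here is a minimal sketch of it using only the OCaml standard library (4.08 or later for the Bytes accessors), not the code from the NAT library itself. It computes the Internet checksum described in RFC 1071 over a bare 20-byte IPv4 header. The helper name rewrite_ipv4_source and the sample addresses are my own assumptions for illustration.

```ocaml
(* Ones' complement sum of the 16-bit big-endian words in [buf] (RFC 1071). *)
let ones_complement_sum (buf : Bytes.t) : int =
  let len = Bytes.length buf in
  let sum = ref 0 in
  let i = ref 0 in
  while !i + 1 < len do
    sum := !sum + Bytes.get_uint16_be buf !i;
    i := !i + 2
  done;
  (* an odd trailing byte is treated as the high byte of a final word *)
  if !i < len then sum := !sum + (Bytes.get_uint8 buf !i lsl 8);
  (* fold any carries back into the low 16 bits *)
  while !sum lsr 16 <> 0 do
    sum := (!sum land 0xffff) + (!sum lsr 16)
  done;
  !sum

let checksum (buf : Bytes.t) : int = lnot (ones_complement_sum buf) land 0xffff

(* Hypothetical helper: overwrite the source address (bytes 12-15) of a
   20-byte IPv4 header, then zero and recompute the checksum (bytes 10-11). *)
let rewrite_ipv4_source (hdr : Bytes.t) (addr : int32) : unit =
  Bytes.set_int32_be hdr 12 addr;
  Bytes.set_uint16_be hdr 10 0;
  Bytes.set_uint16_be hdr 10 (checksum hdr)

let () =
  (* a header from 192.168.0.1 to 192.168.0.199 with the checksum field zeroed *)
  let hdr = Bytes.of_string
      "\x45\x00\x00\x73\x00\x00\x40\x00\x40\x11\x00\x00\xc0\xa8\x00\x01\xc0\xa8\x00\xc7" in
  rewrite_ipv4_source hdr 0x01020304l;  (* 1.2.3.4, an arbitrary address *)
  (* a header carrying a correct checksum sums to 0xffff, so this yields 0 *)
  assert (checksum hdr = 0)
```

The same sum is used for TCP and UDP, but there it also covers a pseudo-header containing the IP addresses, which is why the transport-layer checksum has to be redone even when only an address changes.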

Since there’s nothing privileged about any of the data structures we’re using, or the memory we’re accessing, it’s relatively straightforward to pull the packet-transformation and inspection code out into a simple library that does the following:

  • decomposes incoming packets into either a tuple of the relevant layers or None
  • pulls relevant information for NAT decision-making (Ethernet layer ethertype, IP-layer source and destination address and protocol, transport-layer port numbers) out of packet layers
  • given an existing NAT table and an incoming packet, either rewrites the packet according to the rules in the table or returns None
  • given an existing NAT table and an incoming packet, along with an IP address and port number, creates a new NAT table rule for the packet using the IP address and port number provided
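
To make the last two items a bit more concrete, here is a rough sketch of the lookup-or-None and add-a-rule operations in plain OCaml. It is not the library's actual interface: the packet record, the endpoint type, and the names translate and add_entry are assumptions made for illustration, and the parsing of real packets into layers is skipped entirely.

```ocaml
type proto = Tcp | Udp
type endpoint = string * int   (* IP address (as text) and port *)

(* Hypothetical, simplified view of a packet: only the fields NAT looks at. *)
type packet = { proto : proto; src : endpoint; dst : endpoint }

(* Key: protocol, source, and destination as seen on arrival.
   Value: the source and destination to write into the packet instead. *)
type table = (proto * endpoint * endpoint, endpoint * endpoint) Hashtbl.t

(* Rewrite the packet according to the table, or return None if no rule matches. *)
let translate (t : table) (p : packet) : packet option =
  match Hashtbl.find_opt t (p.proto, p.src, p.dst) with
  | Some (src, dst) -> Some { p with src; dst }
  | None -> None

(* Add rules for a new outgoing flow, using the public endpoint provided. *)
let add_entry (t : table) (p : packet) ~(public : endpoint) : unit =
  (* outgoing: the private source is replaced by the public endpoint *)
  Hashtbl.replace t (p.proto, p.src, p.dst) (public, p.dst);
  (* reply: addressed to the public endpoint, redirected to the private host *)
  Hashtbl.replace t (p.proto, p.dst, public) (p.dst, p.src)

let () =
  let t : table = Hashtbl.create 16 in
  let request = { proto = Udp; src = ("192.168.1.5", 5000); dst = ("8.8.8.8", 53) } in
  assert (translate t request = None);
  add_entry t request ~public:("203.0.113.7", 40000);
  let reply = { proto = Udp; src = ("8.8.8.8", 53); dst = ("203.0.113.7", 40000) } in
  assert (translate t request = Some { request with src = ("203.0.113.7", 40000) });
  assert (translate t reply = Some { reply with dst = ("192.168.1.5", 5000) })
```

Storing both directions under separate keys means a single lookup handles traffic arriving on either interface, at the cost of two table entries per flow.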

Along with a library that provides basic CRUD operations on the table itself, this is enough to get Internet browsing working through a NATting unikernel with not much code at all. If you’d like to try it out, here are some instructions on setting up a Xen machine to NAT via mirage-nat. The instructions given are for a CubieBoard2 or CubieTruck, but any machine running Xen with multiple network interfaces (or even virtual bridges, if you wish to NAT nonphysical devices) can run the NATting unikernel.

Some Comments on Limitations of the Implementation

This is not enough to have stable or even reasonably secure Internet browsing through a NATting unikernel, largely because there’s no nice facility for table entries to be removed. This has two important consequences:

  • the NAT table will grow until it consumes all available memory and the NAT device crashes. This mimics the behavior of many commercial implementations (memory exhaustion due to NAT table size is a common reason you need to restart your home router), but in this case that isn’t a feature.
  • the NAT table will allow servers which previously replied to requests to send new traffic to the host which made the original request. In other words, if a client made an unencrypted HTTP request to the-toast.net, downloaded a webpage, and then closed the connection three days ago, the NAT device has no way of knowing that the-toast.net shouldn’t be sending responses now. This is particularly bad in the case of UDP, which has fewer protocol-level safeguards against state-violating traffic.

There’s nothing about the MirageOS architecture that imposes these limitations; code which maintains state and times it out is already implemented in MirageOS. Ideally, we’d want to make use of the state machine logic for TCP connections already included in the mirage-tcpip library, so we could continue to use the power of our library OS architecture to avoid duplicating this code. We’d be stuck writing our own UDP “connection” expiry logic no matter what, since UDP is a connectionless protocol, although we could provide that as a library as well. Perhaps a firewalling unikernel might be able to use this code in the future?
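
For a sense of what that UDP expiry logic might look like, here is a small sketch in plain OCaml. Nothing like it exists in the implementation described here. The names touch and expire, the string flow keys, and the 60-second timeout are all assumptions, and a real unikernel would take the current time from its clock module rather than from an argument.

```ocaml
(* Hypothetical: record that traffic for [key] was seen at time [now] (seconds). *)
let touch (tbl : ('k, float) Hashtbl.t) (key : 'k) ~(now : float) : unit =
  Hashtbl.replace tbl key now

(* Remove every entry idle for longer than [timeout] seconds and return how
   many were removed. Stale keys are collected first, since removing entries
   while iterating over a Hashtbl is not safe. *)
let expire (tbl : ('k, float) Hashtbl.t) ~(now : float) ~(timeout : float) : int =
  let stale =
    Hashtbl.fold
      (fun key last_seen acc ->
         if now -. last_seen > timeout then key :: acc else acc)
      tbl []
  in
  List.iter (Hashtbl.remove tbl) stale;
  List.length stale

let () =
  let tbl = Hashtbl.create 8 in
  touch tbl "192.168.1.5:5000 -> 8.8.8.8:53" ~now:0.0;
  touch tbl "192.168.1.6:6000 -> 8.8.4.4:53" ~now:100.0;
  (* at t = 130s with a 60s timeout, only the first flow has gone stale *)
  assert (expire tbl ~now:130.0 ~timeout:60.0 = 1);
  assert (Hashtbl.length tbl = 1)
```

Calling touch whenever a packet matches an entry and expire on a periodic timer would address both consequences above: the table stops growing without bound, and long-idle remote hosts lose their way back in.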

Acknowledgements

Some of the research leading to these results has received funding from the European Union’s Seventh Framework Programme FP7/2007-2013 under the UCN project, grant agreement no 611001.