Thursday, 1 February 2018

DHCP, RPF verify, FHRP and ECMP - the final solution

A while back, I wrote about a problem with DHCP relaying, when using an FHRP (with multiple routers) and ECMP, whilst also using RPF source verification.

My solution at the time was to use an access control list to do the RPF source verification, with some exceptions to allow the DHCP packets to be relayed across the client subnet, from the "other" router.  This was because the RPF source verification feature (the "ip verify unicast source ..." command) caused problems on the Supervisor 720s we had at the time, when used with an exception.

Since then, we've upgraded our backbone network to Catalyst 6807-XLs, all with Supervisor 6Ts and IOS 15.4.  These updated routers resolve the issue with the "ip verify unicast source ..." command, when used with an "exception" ACL (which allows packets matching the ACL to violate the RPF source verification check).  On the Supervisor 720s, this combination punted the majority of packets up to the CPU and caused high load through the IP Input process (only the "exception" packets were handled on the ASIC).  The new Supervisor 6Ts (and the Supervisor 2Ts, which we've never used) do all of this in the ASICs and avoid the CPU issue.

As part of the upgrade, we redid all of the router configurations from scratch, rather than just adapting the existing configuration files with a bit of editing plus search and replace.  This gave me an opportunity to revisit this.

Refresher — how RPF source verification works


Just as a reminder, RPF source verification is enabled by adding a command under the SVI definition, e.g.:

interface Vlan789
 description BOTOLPHS
 ip address 172.27.89.253 255.255.255.0
 standby version 2
 standby 81 ip 172.27.89.254
 ip helper-address 172.31.1.1
 ip verify unicast source reachable-via rx

This stops any packets being admitted via this interface, unless the route back to their source address goes out via that interface.  This will include directly-connected routes and static routes via the interface.

However, this breaks DHCP, when relayed through the "other" router (e.g. one also serving VLAN 789, on 172.27.89.252 and running HSRP to protect address 172.27.89.254).  The reply will come from the DHCP server (172.31.1.1), via the other router and across VLAN 789, to 172.27.89.253 (the address this router relayed the packet to the DHCP server from).  It arrives on this router's VLAN 789 SVI with a source address of 172.31.1.1, which isn't reached via that interface.
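For reference, the SVI on the "other" router might look something like the sketch below.  Only the address 172.27.89.252 and the shared HSRP address 172.27.89.254 come from the description above; the description, HSRP group number and helper address are assumed to mirror this router's configuration.

```
! Sketch of the "other" router's SVI (assumed to mirror this router's)
interface Vlan789
 description BOTOLPHS
 ip address 172.27.89.252 255.255.255.0
 standby version 2
 standby 81 ip 172.27.89.254
 ip helper-address 172.31.1.1
 ip verify unicast source reachable-via rx
```

With ECMP in the backbone, a reply from 172.31.1.1 addressed to 172.27.89.253 can just as easily be routed to this router, which then forwards it out of VLAN 789 to its neighbour.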

To work around problems like this, the "ip verify unicast source ..." command allows you to specify an access list to exempt packets from this check, so we can use it to allow the DHCP packets.  Unfortunately, it must be a NUMBERED access list, so you're limited in the number you can create (100-199 and 2000-2699 for extended access lists), plus it's difficult to pick sensible numbers for them, given the constraints on these in IOS.

The number scalability problem


That the ACL is limited to being a numbered one creates a bit of a problem: what we really wanted to do was create an ACL per interface, to permit the packets for DHCP on that interface, e.g. for the above:

ip access-list extended 2089
 permit udp host 172.31.1.1 eq bootps 172.27.89.0 0.0.0.255 eq bootps

(Note that DHCP packets going from the server to relay agents go from port 67 ["bootps" — BOOTP server] to the same port; packets going from relay agents or servers to clients go from 67 to 68 ["bootpc" — BOOTP client].)

The problem with this is that we'd need one access list per interface and we only have relatively few numbers to play with.  (In practice, there are probably enough numbers per router, but managing the set of them would be difficult, as we need to know which numbers are free, since we can't just pick names/numbers based on the VLAN ID, due to the range limitations.)

The solution


The workaround we came up with was to combine the RPF source verification feature with a regular inbound access list.  We start by creating one numbered access list per set of DHCP servers and, instead of limiting the destination address to this VLAN, put "any":

ip access-list extended 2031
 permit udp host 172.31.1.1 eq bootps any eq bootps

This exemption would obviously leave the SVI open to a host spoofing replies from the DHCP server to other router interfaces (on this router, or elsewhere on the network).  To protect against this, we also use an inbound ACL:

ip access-list extended VL789-IN-ACL4
 permit udp host 172.31.1.1 eq bootps 172.27.89.0 0.0.0.255 eq bootps
 deny udp host 172.31.1.1 eq bootps any eq bootps
 ...

... the access list needs to:
  • permit the relayed packets for the primary IPv4 range on this VLAN (it doesn't need to cope with secondary ranges, since those will always be relayed via the primary address on the router), then
  • deny all other packets from the DHCP server's address

We can then apply both access lists to the interface:

interface Vlan789
 ip address 172.27.89.253 255.255.255.0
 ...
 ip helper-address 172.31.1.1
 ...
 ip verify unicast source reachable-via rx 2031
 ip access-group VL789-IN-ACL4 in
 ...

Because the RPF access list (2031) doesn't specifically mention the addresses used on this VLAN, it can be reused across multiple VLANs/subnets.  As such, we only need one access list per set of DHCP servers, which is manageable.
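As an illustration of that reuse, a second SVI on the same router could look something like the sketch below.  The VLAN number (790), its subnet (172.27.90.0/24), the router address and the ACL name VL790-IN-ACL4 are all made up for the example; only access list 2031 and the DHCP server address are carried over from above.

```
! Hypothetical second VLAN: only the named inbound ACL is per-VLAN
ip access-list extended VL790-IN-ACL4
 permit udp host 172.31.1.1 eq bootps 172.27.90.0 0.0.0.255 eq bootps
 deny udp host 172.31.1.1 eq bootps any eq bootps
 ...
!
interface Vlan790
 ip address 172.27.90.253 255.255.255.0
 ip helper-address 172.31.1.1
 ! Same numbered RPF exception list as Vlan789
 ip verify unicast source reachable-via rx 2031
 ip access-group VL790-IN-ACL4 in
```

The numbered list stays generic, while the named list (which can sensibly be named after the VLAN) carries the subnet-specific restriction.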

[In fact, in our case, we actually limit the per-VLAN inbound access list to just accepting packets to this particular router's primary IPv4 address on the VLAN, since the configuration build mechanism we have knows what that is.  As such, the access list on each router is slightly different, for the same VLAN.  However, this is probably unnecessary — it's just something we can do easily.]
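A sketch of what that tighter variant might look like on the router holding 172.27.89.253 (the exact form our build mechanism generates may differ; this just swaps the subnet for the router's own address):

```
! Tighter variant: accept relayed replies to this router's own address only
ip access-list extended VL789-IN-ACL4
 permit udp host 172.31.1.1 eq bootps host 172.27.89.253 eq bootps
 deny udp host 172.31.1.1 eq bootps any eq bootps
 ...
```

On the other router, the permit line would instead name that router's address (172.27.89.252 in the earlier example).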

NX-OS strikes again


The Cisco Nexus equipment we have (which runs NX-OS 7.3) also has the RPF source verification command but DOESN'T have the exception access list feature.  This means we can't use it, where DHCP relaying is in use; instead, we need to use regular access lists, as we did on the Supervisor 720s.

[If vPC (Virtual Port-Channel, Cisco's version of MLAG) is in use, NX-OS is supposed to process traffic addressed to the addresses of the partner bridge, for VLANs which are members of vPC.  However, this didn't seem to work for DHCP relaying: either it only applies to traffic arriving via the vPC member port (which Cisco said it didn't, when dealing with them about something else), or DHCP is an exception.  I haven't had time to look at this, yet.]