OTV AEDs Are Like Highlanders
July 14, 2014
While prepping for CCIE Data Center and playing around with a lab environment, I ran into a problem I’d like to share.
I was building a basic OTV topology: three VDCs running OTV, each connecting to a core VDC that provides the multicast core (which is a lot easier than it sounds). I was running it in a lab environment we have at Firefly, but instead of following our normal lab guide I was making it up as I went along, both to save some time and to make sure I can stand up OTV without a lab guide.
Each VDC would set up an adjacency with the other two, with the core VDC providing unicast and multicast connectivity. That part was pretty easy to set up (even the multicast part, which had previously freaked me the shit out). Each VDC would be its own site, so no redundant AEDs.
On each OTV VDC, I set up the following as per my pre-OTV checklist:
- Bi-directional IPv4 unicast connectivity to each join interface (I used a single OSPF area)
- MTU of 9216 end-to-end (easy since OTV requires M line cards, and it’s just an MTU command on the interface)
- An OTV site VLAN which requires:
  - That the VLAN is configured on the VDC
  - That the VLAN is active on a physical port that is up
- Multicast configuration
  - IP pim sparse-mode configuration on every interface, end-to-end
  - IP igmp version 3 on every interface end-to-end
  - Rendezvous point (RP) configured on the loopback address of the core VDC (I used the bidir tag)
So I got all that configured and then configured the OTV setup. Very basic:
feature otv
otv site-vlan 10
interface Overlay1
  otv join-interface Ethernet1/2
  otv control-group 239.1.1.1
  otv data-group 232.1.1.0/28
  otv extend-vlan 100
  no shutdown
otv site-identifier 0000.0000.0002
ip pim rp-address 10.11.200.1 group-list 224.0.0.0/4
ip pim ssm range 232.0.0.0/8
The only difference between the three OTV VDC configurations was the site-identifier and the join interface. Everything else was identical, pretty easy configuration. But… it didn’t work. Shit. Time for some show commands:
N7K-11-vdc-2# show otv adjacency
Overlay Adjacency database
Overlay-Interface Overlay1 :
Hostname        System-ID       Dest Addr     Up Time   State
VDC-3           18ef.63e9.5d43  10.11.3.2     01:36:52  UP
vdc-4           18ef.63e9.5d44  10.11.101.2   01:41:57  UP
vdc-2#
OK, so the adjacencies are built. I’ve at least got IPv4 unicast and multicast going on. How about “show otv”?
N7K-11-vdc-2# show otv
OTV Overlay Information
Site Identifier 0000.0000.0002
Overlay interface Overlay1
 VPN name            : Overlay1
 VPN state           : UP
 Extended vlans      : 100 (Total:1)
 Control group       : 239.1.1.1
 Data group range(s) : 232.1.1.0/28
 Join interface(s)   : Eth1/2 (10.11.2.2)
 Site vlan           : 11 (up)
 AED-Capable         : No (Site-ID mismatch)
 Capability          : Multicast-Reachable
N7K-11-vdc-2#
Site-ID mismatch? What the shit? They’re supposed to mismatch. I try another command:
N7K-11-vdc-2# show otv site
Dual Adjacency State Description
    Full     - Both site and overlay adjacency up
    Partial  - Either site/overlay adjacency down
    Down     - Both adjacencies are down (Neighbor is down/unreachable)
    (!)      - Site-ID mismatch detected
Local Edge Device Information:
    Hostname vdc-2
    System-ID 18ef.63e9.5d42
    Site-Identifier 0000.0000.0002
    Site-VLAN 11 State is Up
Site Information for Overlay1:
Local device is not AED-Capable (Site-ID mismatch)
Neighbor Edge Devices in Site: 1
Hostname        System-ID       Adjacency-   Adjacency-  AED-
                                State        Uptime      Capable
--------------------------------------------------------------------------------
VDC-3           18ef.63e9.5d43  Partial (!)  00:17:39    Yes
Now this show command confused me for a while. I was trying to figure out the Site-ID mismatch. I was also wondering why I could see VDC-3 but couldn’t see VDC-4. Then it dawned on me (after an embarrassing amount of time): I’m not supposed to. I’m not supposed to see VDC-3, either. The “show otv site” command only looks at the local site. For my configuration, I shouldn’t see any other VDCs with “show otv site”.
This means that there’s some type of Layer 2 connectivity between the different sites. VDC-3 and VDC-4 both somehow see each other as Layer 2 adjacent. That shouldn’t happen if they’re supposedly on remote sites. This is a lab environment, so there’s some sort of Layer 2 connectivity for the Site-VLAN that I need to kill.
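If you would rather not eyeball that table, the same check can be scripted. What follows is a minimal Python sketch, not a Cisco tool: it assumes the neighbor rows of “show otv site” look like the one in my output (hostname, system ID, adjacency state, an optional “(!)”, uptime, AED-capable), and the names NEIGHBOR_RE, site_neighbors and unexpected_neighbors are made up for this example.

```python
import re

# Assumed row layout, based on the single neighbor row in the output above:
#   VDC-3  18ef.63e9.5d43  Partial (!)  00:17:39  Yes
NEIGHBOR_RE = re.compile(
    r"^(?P<hostname>\S+)\s+"
    r"(?P<system_id>[0-9a-fA-F]{4}\.[0-9a-fA-F]{4}\.[0-9a-fA-F]{4})\s+"
    r"(?P<state>Full|Partial|Down)\s+"
    r"(?P<mismatch>\(!\)\s+)?"
    r"(?P<uptime>\S+)\s+"
    r"(?P<aed_capable>Yes|No)\s*$"
)

SAMPLE = """\
Local Edge Device Information:
    Hostname vdc-2
    System-ID 18ef.63e9.5d42
Neighbor Edge Devices in Site: 1
Hostname        System-ID       Adjacency-   Adjacency-  AED-
                                State        Uptime      Capable
VDC-3           18ef.63e9.5d43  Partial (!)  00:17:39    Yes
"""


def site_neighbors(text):
    """Return one dict per neighbor row found in 'show otv site' output."""
    neighbors = []
    for line in text.splitlines():
        match = NEIGHBOR_RE.match(line.strip())
        if match:
            row = match.groupdict()
            # The optional "(!)" marker means a Site-ID mismatch was detected.
            row["mismatch"] = row["mismatch"] is not None
            neighbors.append(row)
    return neighbors


def unexpected_neighbors(text, expected=()):
    """Hostnames seen in the local site that are not in the expected set."""
    return [n["hostname"] for n in site_neighbors(text)
            if n["hostname"] not in expected]


if __name__ == "__main__":
    # With one edge device per site, any neighbor at all is a red flag.
    print(unexpected_neighbors(SAMPLE))
```

With a single edge device per site, the expected set is empty, so any hostname this returns points at a Layer 2 leak on the site VLAN.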
OTV edge devices are like Highlanders: if there’s Layer 2 adjacency, they sense each other.
“I could sense you by your VLAN”
It probably happened on the interface that I assigned the site-VLAN to as an access port. A VLAN will not show “active” unless you have an active physical link (interface VLANs don’t count).
So I went through and re-configured the site VLAN. Instead of VLAN 10 (which was probably active on the other ends of those interfaces somehow) I created new VLANs, and used a unique VLAN for each VDC. The site-VLANs do not need to be identical between sites. I put the VLAN on a physical link that was up, and voila.
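To confirm a change like this, the line to watch in “show otv” is AED-Capable. Here is a rough Python sketch of that check, assuming the “key : value” layout from my output earlier; parse_show_otv and aed_problem are names invented for the example, and the sample text is the failing output from before the fix, trimmed to a few lines.

```python
import re

# Trimmed copy of the failing "show otv" output from earlier in the post.
SAMPLE = """\
 VPN state           : UP
 Extended vlans      : 100 (Total:1)
 Site vlan           : 11 (up)
 AED-Capable         : No (Site-ID mismatch)
"""


def parse_show_otv(text):
    """Collect every 'key : value' line into a dict (assumed layout)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def aed_problem(fields):
    """Return the reason the device is not AED-capable, or None if it is."""
    value = fields.get("AED-Capable", "")
    if not value.startswith("No"):
        return None
    reason = re.search(r"\((.+)\)", value)
    return reason.group(1) if reason else "no reason given"


if __name__ == "__main__":
    print(aed_problem(parse_show_otv(SAMPLE)))
```

Run against the failing output it reports the Site-ID mismatch; once the device is AED-capable it should return None.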
In the real world, you probably won’t run into this. However, if there are other Layer 2 interconnects in your data center (perhaps dark fiber), or you’re transitioning from one DCI to another, you may hit it.
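If you are in one of those situations, a quick audit of which sites share a site VLAN shows where a stray Layer 2 path could bite. This is only a sketch in Python: sharing a site VLAN between sites is not wrong by itself (it only hurts when the VLAN actually leaks between them), and apart from vdc-2’s site identifier every value in the example data is hypothetical.

```python
from collections import defaultdict

# (hostname, site-identifier, site VLAN). Only vdc-2's site identifier comes
# from the post; the other identifiers and the "after" VLANs are hypothetical.
BEFORE = [
    ("vdc-2", "0000.0000.0002", 10),
    ("vdc-3", "0000.0000.0003", 10),
    ("vdc-4", "0000.0000.0004", 10),
]
AFTER = [
    ("vdc-2", "0000.0000.0002", 11),
    ("vdc-3", "0000.0000.0003", 12),
    ("vdc-4", "0000.0000.0004", 13),
]


def shared_site_vlans(devices):
    """Map each site VLAN used by more than one site to those site IDs."""
    sites_by_vlan = defaultdict(set)
    for _hostname, site_id, vlan in devices:
        sites_by_vlan[vlan].add(site_id)
    return {vlan: sorted(sites)
            for vlan, sites in sites_by_vlan.items()
            if len(sites) > 1}


if __name__ == "__main__":
    print(shared_site_vlans(BEFORE))  # VLAN 10 is shared by three sites
    print(shared_site_vlans(AFTER))   # nothing shared
```

Any VLAN this flags is a candidate for the same fix: give each site its own site VLAN, or find and cut the Layer 2 path between them.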