VxLAN was defined in 2014 by RFC 7348 and has been used as a component in several SDN (software-defined networking) solutions from various hardware and software vendors. However, most datacenter architects discounted a native VxLAN design because of its challenges and limitations. Now might be the time to reconsider.
Datacenter Network Design Standards Have Changed Significantly
Over the last eight years, there has been a significant change in datacenter network design standards. With the proliferation of virtualized and containerized applications, datacenter networks need to support more east-west traffic with minimal latency and apply policy to those east-west flows. We no longer design datacenter networks with access and aggregation tiers and rely on Spanning Tree Protocol for loop prevention; instead, we use a Clos architecture, commonly known as a leaf / spine topology.
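To make the east-west benefit concrete, here is a minimal sketch of a leaf / spine fabric as an adjacency map. It assumes a simple two-tier design in which every leaf connects to every spine; the function names and the fabric size are hypothetical. It shows that any two leaves are exactly two hops apart, with one equal-cost path per spine.

```python
def build_leaf_spine(num_leaves: int, num_spines: int) -> dict:
    """Return an adjacency map where every leaf connects to every spine."""
    leaves = [f"leaf{i}" for i in range(1, num_leaves + 1)]
    spines = [f"spine{i}" for i in range(1, num_spines + 1)]
    adjacency = {node: set() for node in leaves + spines}
    for leaf in leaves:
        for spine in spines:
            adjacency[leaf].add(spine)
            adjacency[spine].add(leaf)
    return adjacency


def leaf_to_leaf_paths(adjacency: dict, src: str, dst: str) -> list:
    """List every two-hop path (src -> spine -> dst) between two leaves."""
    shared_spines = sorted(adjacency[src] & adjacency[dst])
    return [(src, spine, dst) for spine in shared_spines]


# Hypothetical fabric: 4 leaves, 2 spines.
fabric = build_leaf_spine(num_leaves=4, num_spines=2)
paths = leaf_to_leaf_paths(fabric, "leaf1", "leaf3")
for path in paths:
    print(" -> ".join(path))
```

Adding a spine adds another equal-cost path between every pair of leaves, which is how this design scales east-west bandwidth without relying on blocked Spanning Tree links.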
Datacenter Topologies Are Transitioning
As datacenters transition from legacy multi-tier topologies to modern leaf / spine topologies, products like Cisco ACI and VMware NSX have used proprietary mechanisms to overcome the limitations associated with VxLAN. These products offer many more capabilities than just extending layer 2 VLANs over layer 3 networks, and many organizations deployed them even if they only needed a subset of those features.
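For readers who want to see what "extending layer 2 over layer 3" looks like on the wire, the sketch below builds the 8-byte VxLAN header described in RFC 7348: one flags byte with the "I" bit set, 24 reserved bits, a 24-bit VxLAN Network Identifier (VNI), and 8 more reserved bits. The helper names and the example VNI are hypothetical; the original Ethernet frame would follow this header inside a UDP datagram.

```python
import struct

VXLAN_UDP_PORT = 4789         # IANA-assigned destination UDP port for VxLAN
VXLAN_FLAG_VNI_VALID = 0x08   # "I" flag: the VNI field is valid


def build_vxlan_header(vni: int) -> bytes:
    """Pack the 8-byte VxLAN header for a given 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits)
    # Word 2: VNI (24 bits) + reserved (8 bits)
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VxLAN header."""
    _, second_word = struct.unpack("!II", header[:8])
    return second_word >> 8


header = build_vxlan_header(10010)  # 10010 is an arbitrary example VNI
print(header.hex(), parse_vni(header))
```

The 24-bit VNI is what allows roughly 16 million segments, compared with the 4094 usable IDs of a 12-bit VLAN tag.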
Improvements and Some New Challenges
By 2015, the development of a VxLAN-based EVPN standard – including the use of multi-protocol BGP (MP-BGP) as the control plane – made the solution scalable. Together, these standards provided a functional but limited basis for an MP-BGP EVPN datacenter design. Many datacenter architects evaluated this option at that stage of its evolution and dismissed it.
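One reason a BGP control plane helps with scale is that each VTEP (VxLAN tunnel endpoint) advertises the MAC addresses it has learned locally, so remote VTEPs can look up a destination instead of flooding to discover it. The toy model below is only an illustration of that idea, not an implementation of BGP or of any vendor's behavior; the class name, method names, and addresses are all hypothetical.

```python
class ToyEvpnControlPlane:
    """Toy stand-in for the table of MAC reachability that EVPN distributes."""

    def __init__(self):
        # (vni, mac) -> IP address of the VTEP that advertised the MAC
        self._routes = {}

    def advertise(self, vni: int, mac: str, vtep_ip: str) -> None:
        """A VTEP announces that a MAC in a given VNI sits behind it."""
        self._routes[(vni, mac.lower())] = vtep_ip

    def withdraw(self, vni: int, mac: str) -> None:
        """Remove the entry, for example when the host moves or ages out."""
        self._routes.pop((vni, mac.lower()), None)

    def lookup(self, vni: int, mac: str):
        """Return the remote VTEP for a MAC, or None if it is unknown."""
        return self._routes.get((vni, mac.lower()))


control_plane = ToyEvpnControlPlane()
control_plane.advertise(10010, "00:50:56:AA:BB:CC", "10.0.0.11")

known = control_plane.lookup(10010, "00:50:56:aa:bb:cc")
unknown = control_plane.lookup(10010, "00:50:56:00:00:01")
print(known, unknown)
```

In a flood-and-learn design, the second lookup would instead trigger a flood across the overlay; with advertised reachability, the fabric already knows which VTEP to send the frame to.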
In 2017, Cisco introduced a solution for multi-site EVPN that included the concept of a Border Gateway, which provides fault isolation and allows for flexible policies between datacenter fabrics. Unfortunately, these much-needed capabilities introduced a significant amount of complexity for the average Cisco command line jockey. The number of lines in a single switch configuration grew by 10x or more, and the skills needed to troubleshoot the overlay are new to most network engineers.
Enhancements Fill Gaps and Solve Complexity
Cisco has made several recent enhancements to this solution to fill the capabilities gap and solve the complexity challenges:
- Enhanced Policy Based Redirect (ePBR) supports service chaining between service nodes across multiple datacenter fabrics. It can be combined with Nexus Intelligent Traffic Director (ITD), which provides line-rate load balancing, for multiple levels of service redundancy.
- Improvements to the Data Center Network Manager (DCNM) orchestration tool let you build and manage MP-BGP EVPN fabrics through a GUI, which reduces the complexity of implementing and supporting the solution. Alternatively, you can use other orchestration tools such as Ansible or Terraform.
- Integration with Nexus Dashboard and Nexus Dashboard Orchestrator (NDO). If you manage your datacenter fabric with DCNM, you can integrate it with other DCNM or ACI sites in NDO and monitor all of your datacenters in a single Nexus Dashboard.
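Whichever orchestration tool you choose, the underlying idea is the same: generate repetitive per-switch overlay configuration from a small set of inputs rather than typing it by hand. The sketch below illustrates that with plain Python string templating. The function name and the VLAN and VNI values are hypothetical, and the emitted lines are NX-OS-style commands shown for illustration only; check them against the configuration guide for your platform and release before use.

```python
def render_l2vni_config(vlan_id: int, vni: int) -> str:
    """Render an illustrative NX-OS-style snippet mapping a VLAN to an L2 VNI."""
    lines = [
        f"vlan {vlan_id}",
        f"  vn-segment {vni}",
        "interface nve1",
        f"  member vni {vni}",
        "    ingress-replication protocol bgp",
        "evpn",
        f"  vni {vni} l2",
        "    rd auto",
        "    route-target import auto",
        "    route-target export auto",
    ]
    return "\n".join(lines)


# Hypothetical VLAN-to-VNI plan shared by every leaf in the fabric.
vlan_to_vni = {10: 10010, 20: 10020}

config = "\n".join(
    render_l2vni_config(vlan_id, vni) for vlan_id, vni in vlan_to_vni.items()
)
print(config)
```

Keeping the VLAN-to-VNI plan in one data structure means that adding a segment is a one-line change, and every leaf receives an identical rendering.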
You Have Two Options for Building a Cisco Datacenter Fabric
If you are building a Cisco datacenter fabric on Nexus switching today, there are two options available to you: Cisco ACI, the flagship product with widespread adoption, and a VxLAN MP-BGP EVPN design, which is less widely adopted.
The primary differentiator for ACI is the integrated orchestration platform known as the APIC (Application Policy Infrastructure Controller) and the policy model that it uses to build the datacenter fabric. The policy model was strongly influenced by the Cisco UCS Manager model – which effectively created the first stateless blade server chassis configuration by using service profiles, templates, and resource pools. The ACI model uses a similar structure, but in practice many of the reusable policy objects you create end up being used only once. As a result, the value of the policy model is debatable.
This brings me back to the MP-BGP EVPN option – which is not tied to a policy model and not inextricably linked to the Cisco orchestration and monitoring platform. If you prefer to bring your own orchestration and policy tools, then maybe MP-BGP EVPN is the better choice.
Datacenter Success Hinges on a Complete Vision
The key to a modern datacenter is a holistic and complete vision. Therefore, it’s vital to partner with a consultancy like Core BTS that can see all aspects of what you want to accomplish and can deliver the cloud and security expertise you need to succeed. Contact us to learn more.