Full-Stack Automation at Internet Exchanges
[This is the IIJ 2020 TECH Advent Calendar article for Monday, December 21.]
Even when network operators master the complexity needed to keep all of their services alive, cost and delay grow with that complexity. Building higher-level management layers with automation has been recognized as the most suitable way to simplify operational processes without compromising scalability or performance.
Today, we are introducing the Holistic Internet Exchange (HIX), a full-stack SDN solution that allows Internet Exchange Points (IXPs) to plan, design, test, and configure their networks. By full-stack, we mean everything from a simple-to-use human interface down to a redesigned layer-2 protocol supported by many network vendors.
Internet Exchange Points context
To fully understand the HIX solution, we must first provide the appropriate context. What are Internet Exchange Points? IXPs are interconnection points where Internet operators come together to exchange traffic among themselves for mutual benefit. IXPs are critical infrastructure of today’s Internet, as they are core facilitators for Internet Service Providers and Content Delivery Networks to interconnect.
Today, there are around 800 IXPs worldwide. A vast number of them are small, run fewer than three switches, and are operated by volunteer engineers. Internet traffic is growing exponentially, pushing IXPs to increase their switching capacity while improving reliability. Larger IXPs can afford paid, trained engineers who configure equipment manually or with scripts developed in-house. Only the biggest IXPs (less than 1%) have internally developed a fully automated back office.
IXP Manager is a widely used management platform for IXPs that offers administration portals and end-to-end provisioning, and implements state-of-the-art best practices. Over 145 IXPs currently use the platform, up from around 80 at the beginning of 2020. Through its administration portal, operators can provision their infrastructure, including switches, routers, IP allocation, patch panels, member registrations, and more.
IXP Manager enforces many good practices. IXP operators maintain their operational infrastructure information in its database while complying with common IXP best practices. Because information about the IXP infrastructure is already available within IXP Manager, we make use of its API endpoints. This allows us to use IXP Manager as our database, which keeps all our configuration state up to date.
However, IXP Manager does not include any automation tools to configure network equipment, even though INEX, its main contributor, uses one itself. We filled this gap with HIX.
Figure 1: HIX Full Stack Overview
Presenting the Holistic Internet Exchange
HIX is a full-stack solution, which aims to reduce complexity and cost within IXPs by introducing a platform for managing, planning, and testing an Umbrella switching fabric.
The name HIX comes from the theory of holism, which suggests that various systems are all connected and that everything within the internet exchange should be viewed as a whole. We decided to keep with the holistic theme and chose names from Douglas Adams’ book, Dirk Gently’s Holistic Detective Agency. In particular, we selected names from an illustration featured in the 2016 television series based on it. The illustration symbolically depicts a number of individuals with a connected phenomenon within “Project Blackwing,” and gives them odd and sometimes obscure codenames. We named the components we designed Miru (which means to look or assess in Japanese) and Athos, whose symbol in the illustration is similar to Miru’s.
In Figure 1, we can see that HIX consists of four main sections. We will quickly cover each of them, then focus on Miru and Athos, the new components we designed for HIX. First, the management section: HIX extends IXP Manager, and Miru allows operators to plan their infrastructure through simple drag-and-drop functionality. Miru then uses this plan to generate a configuration file for our control plane.
For the control plane, we use Faucet as our SDN controller with the configuration generated by Miru. Athos will then use this configuration to emulate a network for reachability verification between all members and test the redundancy mechanism between switches. After successful verification, the same configuration can then be deployed in production with no risk. This allows operators to assess and better understand their network.
We use Gauge, which ships with Faucet, together with Grafana to monitor traffic within the network, and then display this information back in IXP Manager. Gauge is a controller that only reads the OpenFlow rule counters, and Grafana is a web dashboard that displays the statistics Gauge collects.
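OpenFlow counters are cumulative, so a dashboard has to turn two successive samples into a rate before it can plot traffic. The following is a minimal Python sketch of that arithmetic only; the function name and the sample values are illustrative assumptions, and in practice the monitoring stack performs this step itself.

```python
def bits_per_second(bytes_then, bytes_now, seconds):
    """Convert two cumulative byte-counter samples into an average bit rate."""
    if seconds <= 0:
        raise ValueError("the sampling interval must be positive")
    if bytes_now < bytes_then:
        # A cumulative counter should never decrease; treat it as a reset.
        raise ValueError("counter went backwards (possible reset)")
    return (bytes_now - bytes_then) * 8 / seconds


# Hypothetical samples taken 10 seconds apart.
print(bits_per_second(1_000_000, 2_250_000, 10))  # 1000000.0
```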
Lastly, the Umbrella switching fabric runs at layer 2 and is made possible by the flexibility of SDN. It introduces two main concepts: first, it removes all broadcast traffic; second, it enforces label switching at the MAC level using the destination MAC address. For further details, please read the published paper.
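To make the label-switching idea more concrete, here is a minimal Python sketch. It assumes, purely for illustration, that each byte of the destination MAC address carries the output port for one hop of the path. The function names and the one-byte-per-hop layout are assumptions of this sketch rather than details taken from HIX, so the published paper remains the reference for the actual encoding.

```python
MAC_LENGTH = 6  # an Ethernet MAC address has six bytes


def encode_path_as_mac(ports):
    """Pack a list of per-hop output ports into a MAC-formatted label.

    Assumption for this sketch: one byte per hop, unused bytes set to zero.
    """
    if len(ports) > MAC_LENGTH:
        raise ValueError("path is longer than the six bytes of a MAC address")
    for port in ports:
        if not 0 <= port <= 255:
            raise ValueError(f"port {port} does not fit in one byte")
    padded = list(ports) + [0] * (MAC_LENGTH - len(ports))
    return ":".join(f"{byte:02x}" for byte in padded)


def output_port_for_hop(mac, hop_index):
    """Read the output port that the switch at position hop_index would use."""
    return int(mac.split(":")[hop_index], 16)


# Hypothetical path: leave the first switch on port 2, the second on port 12.
label = encode_path_as_mac([2, 12])
print(label)                          # 02:0c:00:00:00:00
print(output_port_for_hop(label, 1))  # 12
```

With a scheme of this kind, a switch only has to match on a fixed part of the destination address to choose an output port, which is what makes forwarding without broadcast-based learning possible.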
Figure 2: Demo with new network topology diagrammed and connected in Miru. Switches are dragged onto the diagram and linked together. After all the links are connected, the configuration is generated and then evaluated within Athos.
The first part of HIX we designed is the planning and diagramming aspect (also known as Miru). It uses the data available within IXP Manager and allows operators to plan out and diagram their network.
Traditionally, a diagram of the network is left until the end of a deployment process and is typically neglected over time. Miru instead starts with planning and diagramming the network. It provides an easy “drag and drop” interface built on top of the open-source backend of the popular diagramming software diagrams.net (previously known as draw.io).
In Figure 2, we can see an overview of how IXP Manager integrates Miru. The side menu is populated with switch blocks, each containing all of the switch’s information, including the details of all connected members. Operators can then drag and drop the switches onto the canvas and draw links between them. Miru follows the same best-practice philosophy as IXP Manager; for example, it finds available ports on each switch that have been designated as core ports and uses them to create a link between the switches.
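The core-port selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the field names (`type`, `in_use`) and the "lowest free port first" rule are hypothetical and do not reflect Miru's actual data model.

```python
def first_free_core_port(switch):
    """Return the lowest-numbered free core port of a switch, or None."""
    candidates = [
        port["number"]
        for port in switch["ports"]
        if port["type"] == "core" and not port["in_use"]
    ]
    return min(candidates) if candidates else None


def plan_link(switch_a, switch_b):
    """Pick one free core port on each switch and describe the link."""
    port_a = first_free_core_port(switch_a)
    port_b = first_free_core_port(switch_b)
    if port_a is None or port_b is None:
        raise ValueError("no free core port available on one of the switches")
    return {"a": (switch_a["name"], port_a), "b": (switch_b["name"], port_b)}


# Hypothetical inventory, standing in for data read from IXP Manager.
sw1 = {"name": "sw1", "ports": [
    {"number": 1, "type": "edge", "in_use": True},
    {"number": 24, "type": "core", "in_use": True},
    {"number": 25, "type": "core", "in_use": False},
]}
sw2 = {"name": "sw2", "ports": [
    {"number": 26, "type": "core", "in_use": False},
]}

link = plan_link(sw1, sw2)
print(link)  # {'a': ('sw1', 25), 'b': ('sw2', 26)}
```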
The diagram can now act as a trusted source for all the infrastructure. Miru uses this information to generate a configuration file for our Faucet controller, which can be sent for testing and verification to our second component, Athos.
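As a rough idea of what generating a controller configuration from a diagram involves, the Python sketch below turns a planned topology into a dictionary shaped like a Faucet configuration (`vlans`, `dps`, `interfaces`, and `stack` entries for inter-switch links). The input format, the function name and the VLAN name are assumptions made for this example, and whether HIX uses these particular keys is also an assumption. A deployable configuration needs more settings than shown here. Faucet reads YAML; the sketch prints JSON only to stay within the standard library.

```python
import json


def build_faucet_config(switches, links, vlan_name="peering", vid=100):
    """Build a Faucet-style configuration dictionary from a planned topology."""
    dps = {}
    for name, info in switches.items():
        interfaces = {}
        # Member-facing ports are placed in the peering VLAN.
        for port, member in info["members"].items():
            interfaces[port] = {"name": member, "native_vlan": vlan_name}
        dps[name] = {"dp_id": info["dp_id"], "interfaces": interfaces}
    # Each inter-switch link is declared on both of its ends.
    for (sw_a, port_a), (sw_b, port_b) in links:
        dps[sw_a]["interfaces"][port_a] = {"stack": {"dp": sw_b, "port": port_b}}
        dps[sw_b]["interfaces"][port_b] = {"stack": {"dp": sw_a, "port": port_a}}
    return {"vlans": {vlan_name: {"vid": vid}}, "dps": dps}


# Hypothetical topology: two switches, one member each, one core link.
switches = {
    "sw1": {"dp_id": 1, "members": {1: "member-a"}},
    "sw2": {"dp_id": 2, "members": {1: "member-b"}},
}
links = [(("sw1", 25), ("sw2", 26))]

config = build_faucet_config(switches, links)
print(json.dumps(config, indent=2))
```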
The final part of HIX is Athos, the push-on-green module, which automates network testing and verification. Push-on-green is a method for automatically updating production systems in a secure and controlled manner. Push-on-green processes are intended to maintain software systems in production with minimum effort and downtime.
Based on the file generated by Miru, Athos configures a fully emulated network that includes both the control and data planes. It then goes through each member and checks that its connectivity works over IPv4 and IPv6, with or without VLAN tagging. Finally, failure detection and the redundancy mechanism are tested by cutting links and turning off switches.
In Figure 2, we saw that Athos can be called from within Miru. The goal was to create one place for the operator to plan, test, and evaluate their network. Figure 3 shows the architecture diagram for HIX, where Miru and Athos are designed as separate modules and communicate with each other via IXP Manager. As part of the HIX design, Athos uses the OpenFlow Faucet controller and supports OpenFlow switches as well as Umbrella core switches programmed in P4.
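The shape of such a test campaign can be illustrated without an emulator. The Python sketch below checks, on a plain graph model, whether every pair of members can still reach each other after each single link is cut. It is a simplified stand-in, not Athos itself: Athos exercises a real emulated control and data plane, whereas this sketch only tests graph connectivity, and all names and the topology are hypothetical.

```python
from collections import deque
from itertools import combinations


def reachable(links, start):
    """Return the set of switches reachable from start over the given links."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def failed_pairs(links, member_switch):
    """List member pairs that cannot reach each other over the given links."""
    broken = []
    for (m1, s1), (m2, s2) in combinations(sorted(member_switch.items()), 2):
        if s2 not in reachable(links, s1):
            broken.append((m1, m2))
    return broken


def single_link_failure_report(links, member_switch):
    """For each link, report which member pairs break when that link is cut."""
    report = {}
    for cut in links:
        remaining = [link for link in links if link != cut]
        report[cut] = failed_pairs(remaining, member_switch)
    return report


# Hypothetical triangle of switches with one member on each.
links = [("sw1", "sw2"), ("sw2", "sw3"), ("sw1", "sw3")]
members = {"member-a": "sw1", "member-b": "sw2", "member-c": "sw3"}

report = single_link_failure_report(links, members)
print(all(not broken for broken in report.values()))  # True
```

In this triangle, every single link cut leaves an alternative path, so no member pair is reported as broken. A topology with a single link between two switches would report the affected pairs instead.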
Figure 3: HIX Architecture Diagram
Summary and future work
This blog post is only a brief overview of the HIX project, ahead of the beta release of the Miru and Athos codebases in the upcoming months. By enhancing IXP Manager with the missing pieces through our automation solution, we aim for HIX to help IXP operators of all sizes in the near future.
Finally, although this work focuses primarily on IXPs, it is not limited to them. We look forward to adapting and implementing our solution for other networks, including data centers.