The promise of multilayer restoration as a way to save roughly half the cost of an IP core network is well documented, for example in a highly visible paper authored by Cisco, Telefonica, and Deutsche Telekom that was published in IEEE Communications Magazine, and in an ALU (now Nokia) white paper. The purpose of this post is to provide a simple explanation of how such savings can be achieved.
What It Is and What Value It Provides
Most IP/MPLS networks today rely on the IP/MPLS layer to protect services against failures using methods such as IGP convergence and MPLS Fast Reroute. This makes sense for router-related failures. However, for optical failures, such as fiber cuts, this approach is very inefficient. A single optical failure often impacts several IP links at once, so the IP layer must be vastly overprovisioned to ensure that the routers can find sufficient capacity on the backup routes circumventing the failure. A much more efficient approach is to use optical restoration to deal with such failures.
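To make the point about one optical failure hitting several IP links concrete, here is a minimal Python sketch. The topology, router names, fiber IDs, and function name are all made up for illustration and are not taken from any real network. It inverts an IP-link-to-fiber mapping to show which IP links go down together when a given fiber is cut.

```python
from collections import defaultdict

# Hypothetical mapping: each IP link (router pair) rides over one or more fiber spans.
IP_LINK_FIBERS = {
    ("A", "B"): ["f1"],
    ("A", "C"): ["f1", "f2"],
    ("B", "C"): ["f3"],
    ("B", "D"): ["f3", "f4"],
    ("C", "D"): ["f5"],
}


def links_hit_by_fiber_cut(ip_link_fibers):
    """Return {fiber: [IP links that fail together if that fiber is cut]}."""
    hit = defaultdict(list)
    for link, fibers in ip_link_fibers.items():
        for fiber in fibers:
            hit[fiber].append(link)
    return dict(hit)


impact = links_hit_by_fiber_cut(IP_LINK_FIBERS)
for fiber, links in sorted(impact.items()):
    print(f"cut {fiber}: {len(links)} IP link(s) down -> {links}")
```

In this toy mapping, cutting f1 or f3 takes down two IP links at once. With IP-layer restoration, the surviving links need enough spare capacity to absorb the traffic of both.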
To understand multilayer restoration and the savings it promises, consider the network in Figure 1, which shows routers and the IP links between them, as well as the underlying Optical layer.
To withstand failures using IP restoration mechanisms, extra IP links must be added to the IP/MPLS layer as shown in Figure 2. The number of added IP links is roughly equal to the number of links needed to carry the traffic if no failures are assumed. Note that I’m oversimplifying here. First, I’m just focusing on added capacity to deal with one failure, while in reality all failures must be taken into account. Second, in reality, the number of added IP links can be lower or higher than the links required just for working traffic, depending on how much spare capacity exists in the original links and depending on network topology.
A more efficient approach is to rely on optical restoration to deal with the failure, as shown in Figure 3. This approach no longer requires the red links from Figure 2, since traffic no longer has to be rerouted around the failure in the IP/MPLS layer; the same links that were supporting the traffic originally and have failed are optically restored (note the pale connections in the Optical layer in the figure). So one can go back to provisioning the network just for the working traffic, avoid about half of the interfaces otherwise required, and enable substantial savings on CapEx and OpEx.
The Need for Careful Multilayer Coordination
Although extremely cost-effective, photonic restoration must be carefully managed with the behavior of the IP/MPLS layer and its changing needs in mind. There are several reasons for this.
- Photonic switching is slow and can take up to several minutes in large networks. This means that for a few minutes the network might run congested if the failure coincides with peak traffic. Most operators seem OK with such a transient phenomenon as long as this only impacts best-effort traffic. But the network must be monitored to ensure that business traffic is not impacted.
- The slow reaction might cause some other transient effects that impact high-priority traffic even if none of it was impacted by the failure itself. Such effects can be mitigated if restoration is properly orchestrated.
- Finally, running the network lean also means that some rare failures, such as multiple simultaneous failures, will be harder to cope with in a fully distributed manner and will require careful orchestration.
If done right, the above concerns can be addressed and one can ensure that optical restoration restores traffic as quickly as possible over restoration paths that are still usable by the IP/MPLS layer (for example, without excessive latency). Managing restoration with both the IP/MPLS and optical networks in mind also allows a well-designed multilayer solution to deal efficiently with failures that involve the IP/MPLS layer and cannot be dealt with using optical restoration alone.
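As an illustration of what "usable by the IP/MPLS layer" could mean in practice, here is a minimal Python sketch of a latency check on candidate restoration paths. The class, function, path names, and latency budget are hypothetical and are not part of any specific product or standard. The sketch only shows the idea of filtering optical candidates by an IP-layer constraint.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RestorationPath:
    name: str          # hypothetical label for an optical restoration route
    latency_ms: float  # latency the IP link would see over this route
    available: bool    # whether the optical resources are currently free


def pick_restoration_path(
    candidates: Sequence[RestorationPath], max_latency_ms: float
) -> Optional[RestorationPath]:
    """Pick the lowest-latency available path within the IP-layer latency budget."""
    usable = [
        p for p in candidates if p.available and p.latency_ms <= max_latency_ms
    ]
    return min(usable, key=lambda p: p.latency_ms) if usable else None


candidates = [
    RestorationPath("via-north", latency_ms=48.0, available=True),
    RestorationPath("via-east", latency_ms=22.0, available=True),
    RestorationPath("via-south", latency_ms=15.0, available=False),
]
choice = pick_restoration_path(candidates, max_latency_ms=30.0)
print(choice.name if choice else "no usable optical path; fall back to IP rerouting")
```

When no candidate fits the budget, the function returns None, which is where the IP/MPLS layer would have to handle the failure itself.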
A Closer Look at the Savings
When we consider how multilayer restoration might be deployed in an existing network, the savings turn out to be substantially larger than a static design, which looks at the network at a single point in time, would suggest. Refer to Figure 4, which shows the number of 100-Gbps ports needed in a tier-1 network over a period of 5 years assuming traffic growth of 35% per year. The purple bars represent the interfaces needed when restoration occurs in the IP/MPLS layer, while the light blue bars represent the interfaces needed with multilayer restoration.
When multilayer restoration is first deployed in year 1, many of the existing ports that were deployed to support IP/MPLS layer restoration become superfluous. With multilayer restoration, the network could have been run without them while keeping the same SLAs. These “extra” ports serve as a buffer to accommodate future traffic growth. As traffic grows, they are gradually put to use under multilayer restoration control, as shown in Figure 5.
This allows the network operator to avoid installing any new equipment for the first 3 years after deploying multilayer restoration, as expressed by the red line in Figure 4, which represents the level of overprovisioning existing in year 1. Figure 6 focuses only on the number of ports added to accommodate traffic growth for the same network, with and without multilayer restoration. As the number of ports added is correlated with annual CapEx spent, it can be seen that the savings over a period of 5 years amount to up to 50% of the CapEx spent.
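The three-year deferral can be sanity-checked with a rough Python sketch. It assumes the simplification used earlier in this post, namely that an IP-restoration design carries about twice the ports needed for working traffic, together with the 35% annual growth stated above. The base port count of 100 and the function names are hypothetical, and the sketch does not reproduce the data behind the figures.

```python
import math

BASE_PORTS = 100          # hypothetical ports needed for working traffic in year 1
INSTALLED_PORTS = 2 * BASE_PORTS  # assumption: IP-layer design is ~2x working ports
GROWTH = 0.35             # annual traffic growth, from the text


def ports_needed(base_ports, growth, year):
    """Ports needed for working traffic in a given year (year 1 = deployment)."""
    return math.ceil(base_ports * (1 + growth) ** (year - 1))


def first_year_needing_new_ports(base_ports, installed_ports, growth, horizon=20):
    """First year in which working traffic outgrows the ports already installed."""
    for year in range(1, horizon + 1):
        if ports_needed(base_ports, growth, year) > installed_ports:
            return year
    return None


for year in range(1, 6):
    needed = ports_needed(BASE_PORTS, GROWTH, year)
    print(f"year {year}: {needed} ports needed, {INSTALLED_PORTS} installed")

print("first year needing new ports:",
      first_year_needing_new_ports(BASE_PORTS, INSTALLED_PORTS, GROWTH))
```

Under these assumptions, the installed ports first fall short in year 4, which is consistent with three years of growth absorbed without new equipment. A real network would differ depending on topology and on how much spare capacity the existing links already have.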