We’ve all been the victim of a dropped mobile phone call and know how frustrating it can be. Virtualized networks, however, give network operators powerful tools to detect and recover from network disruptions, or “faults,” that can drop calls for thousands of subscribers simultaneously. The Open Platform for Network Functions Virtualization (OPNFV) project and OpenStack have together developed software features that add resiliency to mobile networks and enable them to recover from network and other outages.
At the recent OpenStack Summit in Barcelona, both groups demonstrated how new technologies in NFV can help minimize network disruptions. During the keynotes, technical leads from the OPNFV Doctor Project and OpenStack Vitrage project conducted a phone call using a 4G mobile system running on top of OpenStack. The mobile call continued without disruption even after a dramatic cutting of network cables. (You can watch the short demo in its entirety below.)
To get the skinny on how the technology works and what it took to pull off such a compelling demo, we sat down with folks involved with OPNFV, OpenStack and the Doctor project, including Ifat Afek (System Architect at Nokia Cloudband), Carlos Goncalves (Software Specialist at NEC), Ryota Mibu (Assistant Manager at NEC), and Ildiko Vancsa (Ecosystem Technical Lead at OpenStack Foundation).
OPNFV: Can you give an overview of the demo you did at OpenStack Summit?
OPNFV/OpenStack demo team: We performed two live mobile calls from stage and both were interrupted. The first call dropped when Mark Collier (COO at OpenStack Foundation) removed two cables from the servers powering the mobile system for the calls. After this failed call, Ryota Mibu enabled the OPNFV Doctor features and the teams made another call. During the second call, Mark cut the network cables with giant scissors, but this time the call continued without disruption.
The demo leverages OpenStack as the base for a 4G mobile system equipped with the functionality to perform a smooth failover in case of faults in the system (in a process called “Fault Management”). OpenStack laid the foundation for the cloud-based mobile platform, and OPNFV, via the Doctor Fault Management project, filled the existing feature gaps and provided system integration. While we successfully showed how OpenStack operates in an NFV/Telecom environment, the demo was also an example of the fruitful collaboration between the OpenStack and OPNFV communities, as development of the new features and additions was driven through Doctor “upstream” into OpenStack.
OPNFV: Can you talk a little more about fault management and why it’s important?
Demo team: There is no system without faults, errors, and failures, even in the cloud. Fault management is the component that allows operations teams to monitor, detect, and isolate faults and to automate recovery from them. With an efficient fault management system, countermeasures can negate the effects of deployment faults, avoiding bad user experiences or violations of service-level agreements (SLAs).
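As a rough illustration of that monitor, detect, isolate, recover cycle, here is a minimal Python sketch. It is not Doctor or OpenStack code: the `Host` class, its fields, and every function name below are hypothetical, and "recovery" is reduced to moving instance names onto a spare host.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical compute host; all fields are illustrative assumptions."""
    name: str
    healthy: bool = True   # what monitoring reports
    enabled: bool = True   # whether new work may be scheduled here
    instances: list = field(default_factory=list)


def detect_faults(hosts):
    """Detect: unhealthy hosts that have not been isolated yet."""
    return [h for h in hosts if not h.healthy and h.enabled]


def isolate(host):
    """Isolate: stop scheduling anything onto the faulty host."""
    host.enabled = False


def recover(failed, hosts):
    """Recover: move instances to the first healthy, enabled host."""
    spare = next((h for h in hosts if h.healthy and h.enabled), None)
    if spare is None:
        return False  # nowhere to fail over to
    spare.instances.extend(failed.instances)
    failed.instances.clear()
    return True


def fault_management_cycle(hosts):
    """Run one detect -> isolate -> recover pass; return recovered host names."""
    recovered = []
    for host in detect_faults(hosts):
        isolate(host)
        if recover(host, hosts):
            recovered.append(host.name)
    return recovered


hosts = [Host("compute-1", instances=["vm-a"]), Host("compute-2")]
hosts[0].healthy = False            # monitoring reports a fault
result = fault_management_cycle(hosts)
```

The faulty host is isolated before recovery starts, so the failover target can never be the host that just failed.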
To put this in perspective, think about the impact on network services during natural disasters or other emergencies. According to a report by NTT DOCOMO, the largest mobile phone operator in Japan, thousands of antennas and other infrastructure equipment went out of service as a result of the magnitude 9.0 earthquake and tsunami in March of 2011. The consequences, as we all know, were devastating. Millions of mobile subscribers were disconnected from the cellular network, unable to make emergency calls or check in with loved ones.
Service continuity of virtualized platforms has to be equally addressed. The features enabled by OPNFV and OpenStack add value toward helping operators quickly recover from small to large-scale faults, ultimately keeping our societies connected in times of need.
OPNFV: How can organizations implement Doctor’s Fault Management solution in their networks?
Demo team: Doctor is not standalone software that can be downloaded and installed directly; instead, the core Doctor framework relies on OpenStack components. Any organization deploying a recent version of OpenStack (Liberty onward) will have the Doctor-prescribed enhancements available out of the box with little to no configuration. In other words, Doctor is now a part of OpenStack.
Extensive documentation covering requirements, use cases, gap analysis, architecture, design decisions, configuration and user guides is available. Head to the OPNFV Colorado 2.0 Doctor documentation page on OPNFV.org for details.
OPNFV: Are there other use cases for Doctor that go beyond telecom? Will it work with other types of networks?
Demo team: Yes, definitely! A number of interesting cloud and enterprise applications can use the framework, for example those with time constraints, such as multimedia and real-time applications (for faster replacement of a video cache associated with peak user times). The OpenStack-powered fault management framework will be useful for anyone operating within contracted SLAs.
Individually developed features can also be used beyond fault management scenarios. For example, event alarms can be leveraged for quicker triggering of administrative actions. Without this feature, events (or “faults”) can only be retrieved by periodically polling data from a database. In fact, before Doctor, the time required to detect and recover from a fault was a few minutes. With Doctor, the time to recovery is less than one second!
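To see why event alarms beat periodic polling on detection time, consider the toy Python sketch below. It is an illustration under stated assumptions, not the OpenStack alarm API: the `EventAlarm` class, the function name, the host name, and the 60-second polling interval are all hypothetical.

```python
import math


def polling_detection_delay(fault_time: float, poll_interval: float) -> float:
    """Seconds between a fault occurring and the next scheduled poll.

    Assumes polls happen at 0, poll_interval, 2 * poll_interval, ...
    """
    next_poll = math.ceil(fault_time / poll_interval) * poll_interval
    return next_poll - fault_time


class EventAlarm:
    """Hypothetical publish/subscribe alarm: subscribers are called
    as soon as a fault is raised, with no polling loop in between."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def raise_fault(self, host: str, fault_time: float):
        for callback in self._subscribers:
            callback(host, fault_time)


# Polling: a fault at t=61s with a 60s poll interval waits for the t=120s poll.
delay = polling_detection_delay(fault_time=61.0, poll_interval=60.0)

# Event alarm: the subscriber is notified in the same call that raises the fault.
received = []
alarm = EventAlarm()
alarm.subscribe(lambda host, t: received.append((host, t)))
alarm.raise_fault("compute-1", 61.0)
```

With polling, the worst-case detection delay approaches the full polling interval no matter how fast recovery is afterwards. With a pushed event, detection adds only the notification latency.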
OPNFV: What’s next for the Doctor project? Are there other cool implementations we can expect to see in 2017?
Demo team: We certainly hope so, but it will be hard to top our Barcelona demo! As a project that is part of a larger community, we see that maintenance and continuous improvement of fault monitoring, notification and handling are needed, and they are planned for in OpenStack. And as integrators, the community needs rich monitoring functions that can be supported by the broader OpenStack/OPNFV ecosystem.
Recently, new open source communities have surfaced that aim to develop higher-layer network function management and orchestration systems. OPNFV has been supportive of these activities, and a plan to integrate them in the platform is on the horizon. That said, we may see Doctor joining additional collaborative efforts at some point.
OPNFV: Most importantly: How did Mark get those giant scissors through airport security?
Demo team: Mark made all of us sign a nondisclosure agreement that prevents us from sharing any details! (It was either that or he would sabotage the demo…)