Tencent: Transforming Networks with SDN


“SDN can really transform the way we do networks,” said Tom Bie, VP of Technology & Operation of Data Center, Networking and Server at Tencent, during his Wednesday keynote address at the OpenDaylight Summit. The Chinese Internet giant knows the problems of massive-scale networks firsthand: it has more than 200 million users for QQ instant messaging, 300 million users of its payment service, and more than 800 million users of its WeChat service. Bie noted that Tencent also operates one of the largest gaming networks in the world, along with video services, audio services, online literature services, news portals, and a range of other digital content services.

Tencent has a three-pronged core communication strategy based on “connecting everything.” It focuses on connecting people to people, people to services, and people to devices (IoT). The foundation is an open platform for partners to connect to public clouds. Here, third parties can run their applications on top of infrastructure designed for the massive scale that Tencent deals with every day. Today, millions of applications run alongside Tencent's own “beachhead” applications. To ensure a steady flow of new and interesting services, Tencent has created an innovation space where startup companies can develop and commercialize new services. Bie noted that there are currently 4 million startups involved with the innovation space.

Working at such massive scale has forced Tencent to look for new solutions and innovations in networking technology. The challenges, Bie noted, fall into six areas: agility and scalability, end-to-end quality of service (QoS), global view, deep insights, automation, and intelligence. The first two are driven by the business perspective: services must always be available and of sufficient quality, and Tencent must be able to scale fast. The next two come from the operational perspective. A key concern here is the need to quickly find a problem anywhere in the network to minimize the impact on services and on the business. A global view of the entire network with real-time deep insights enables a rapid response to network anomalies and failures. Today, the information provided to the controller or management plane is neither fast enough nor good enough to enable that response.

This massive scale requires automation, said Bie. People, he noted, are too slow and too error-prone. Automation must apply throughout the life cycle of a service, from provisioning through operations to decommissioning. Bringing intelligence to the network is also key. With programmable networks, massive amounts of data can be generated and then processed by analytics, and even machine learning, to produce actionable intelligence.

The first SDN use case Bie discussed was the Data Center Interconnect Backbone. Tencent has major datacenters in China and across Asia as well as on other continents. The backbone must support all of Tencent's applications so that users get quality service no matter where they are. This backbone is based on MPLS, MPLS-TE (Traffic Engineering), and MPLS VPNs, and it is currently challenging to manage and operate. By adding controllers based on OpenDaylight (ODL), Tencent realizes global path optimization, fast convergence around failures or congestion, and end-to-end quality of service.
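To make the idea of global path optimization and convergence around failures concrete, here is a minimal Python sketch. It is not Tencent's or ODL's implementation: the site names, link costs, and the shortest_path function are hypothetical, and plain Dijkstra stands in for whatever traffic-engineering algorithm the controller actually runs. It only shows how a controller holding the whole topology can recompute a path once a link is reported down.

```python
import heapq

def shortest_path(links, src, dst, failed=frozenset()):
    """Dijkstra over an undirected weighted topology, ignoring failed links.

    links:  dict mapping (node_a, node_b) -> cost (hypothetical metric)
    failed: set of (node_a, node_b) links currently reported down
    Returns (total_cost, [nodes...]) or None if dst is unreachable.
    """
    graph = {}
    for (a, b), cost in links.items():
        if (a, b) in failed or (b, a) in failed:
            continue  # the controller's global view excludes broken links
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    best = {src: 0}
    heap = [(0, src, [src])]
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node == dst:
            return dist, path
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, cost in graph.get(node, []):
            new_dist = dist + cost
            if new_dist < best.get(nxt, float("inf")):
                best[nxt] = new_dist
                heapq.heappush(heap, (new_dist, nxt, path + [nxt]))
    return None

# Hypothetical three-site backbone; costs are illustrative only.
LINKS = {("dc-a", "dc-b"): 10, ("dc-b", "dc-c"): 10, ("dc-a", "dc-c"): 25}

normal = shortest_path(LINKS, "dc-a", "dc-c")
rerouted = shortest_path(LINKS, "dc-a", "dc-c", failed={("dc-a", "dc-b")})
print(normal)    # (20, ['dc-a', 'dc-b', 'dc-c'])
print(rerouted)  # (25, ['dc-a', 'dc-c'])
```

In a real MPLS-TE backbone the metric would also reflect bandwidth and latency constraints, which this sketch leaves out.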

The second use case Bie discussed was managing the network within a datacenter. Tencent runs VXLAN overlays and uses the fabric controller to control both the overlay and underlay networks. Bie highlighted the capability required to scale out firewalls: here, Tencent uses flow-based load balancing, real-time monitoring, and automatic traffic scheduling to scale out to as many as 24 firewall pairs. The final use case involved Tencent's Internet-facing networks. A key feature Bie noted was the ability of the ODL controller to collect routes from BGP routers, determine the optimal path, and then overwrite the BGP routing tables.
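Flow-based load balancing across firewall pairs can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the flow_key and pick_firewall_pair names, the use of SHA-256, and the choice to sort the endpoints so that both directions of a connection land on the same pair (something stateful firewalls generally need). The talk did not describe how Tencent actually hashes flows.

```python
import hashlib

def flow_key(src_ip, src_port, dst_ip, dst_port, proto):
    """Build a direction-independent key for a flow (hypothetical format)."""
    # Sorting the two endpoints makes A->B and B->A produce the same key.
    low, high = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return f"{proto}|{low[0]}:{low[1]}|{high[0]}:{high[1]}"

def pick_firewall_pair(key, num_pairs):
    """Map a flow key to a firewall pair index in [0, num_pairs)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_pairs

# 24 is the maximum number of firewall pairs mentioned in the talk.
NUM_PAIRS = 24

forward = flow_key("10.0.0.5", 40312, "203.0.113.9", 443, "tcp")
reverse = flow_key("203.0.113.9", 443, "10.0.0.5", 40312, "tcp")

pair = pick_firewall_pair(forward, NUM_PAIRS)
print(pair, pick_firewall_pair(reverse, NUM_PAIRS))  # same index twice
```

A plain modulo remaps most flows whenever the number of pairs changes. A production design would more likely use consistent hashing or controller-installed per-flow rules, combined with the real-time monitoring Bie mentioned, to move traffic gradually.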

Bie concluded by noting that the Internet has always been empowered by what he called an open spirit. He called out the increasing scope and range of open source initiatives around the globe. Lastly, he highlighted the areas where ODL adds value: cluster performance and scale, southbound interfaces for load balancing, software maintenance including the mandatory ISSU (In-Service Software Upgrade, also known as hitless upgrade), and northbound interfaces standardized on YANG modeling.