June 16, 2016

How Verizon Labs Built a 600 Node Bare Metal Mesos Cluster in Two Weeks


Craig Neth
Craig Neth tells how he and his team set up their own Apache Mesos test cluster.

Verizon Labs is building some impressive projects around Apache Mesos and relies on a lot of open source software for functionality: operating systems, networking, provisioning, monitoring, and administration. Open source software is popular at Verizon Labs because it gives them the flexibility and the functionality to do what they want to do, without fighting vendor restrictions.

Apache enterprise software plays a key role, including Mesos, Kafka, Spark, and the Apache HTTP server, alongside a host of other open source software, including Docker, Ansible, CoreOS, DHCPD, Ubuntu Linux, and Fleet.

In his talk at MesosCon North America earlier this month, Larry Rau, Director of Architecture and Infrastructure at Verizon Labs, gives a live demonstration of a large-scale messaging simulation across multiple datacenters, including a failure and automatic failover. You can see it all happening in real time during his keynote.

In the second talk, Craig Neth, Distinguished Member of Technical Staff at Verizon Labs, describes building a 600-node Mesos cluster from bare metal in two weeks. His team didn't really get it all done in two weeks, but the talk is a fascinating peek at some ingenious methods for accelerating the installation and provisioning of the bare hardware, and at some advanced ideas on hardware and rack architectures.

Keynote: Verizon Calls Mesos

Larry Rau, Director of Architecture and Infrastructure, Verizon Labs

Larry Rau, Director of Architecture and Infrastructure at Verizon Labs, gave a live demonstration of a high-volume messaging system built on Mesos. The demo simulated 110 million devices generating over 400,000 messages per second over Verizon's wireless network, managed by multiple data centers. The demo included the failure of one data center, and seamless failover to other data centers.
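To put those demo figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the load is spread evenly across all devices, which the talk does not state, and it treats 400,000 messages per second as a floor, since the demo reported "over" that rate. The function names are illustrative only.

```python
# Figures quoted from the keynote demo.
DEVICES = 110_000_000
MESSAGES_PER_SECOND = 400_000  # reported as "over 400,000", so a lower bound

SECONDS_PER_DAY = 86_400


def seconds_between_messages(devices: int, messages_per_second: int) -> float:
    """Average gap between messages from one device, assuming an even spread."""
    return devices / messages_per_second


def messages_per_day(messages_per_second: int) -> int:
    """Total messages in 24 hours at a constant rate."""
    return messages_per_second * SECONDS_PER_DAY


if __name__ == "__main__":
    gap = seconds_between_messages(DEVICES, MESSAGES_PER_SECOND)
    total = messages_per_day(MESSAGES_PER_SECOND)
    print(f"Each device sends a message about every {gap:.0f} seconds")
    print(f"Sustained for a day, that is {total:,} messages")
```

Under the even-spread assumption, each simulated device reports roughly once every four and a half minutes, and the sustained rate works out to tens of billions of messages per day.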

Verizon's software stack is stuffed with open source software, including CoreOS Linux, the Mesosphere data center operating system, Apache Kafka, which is a high-throughput distributed messaging system, and Apache Spark, for fast big data processing.

Rau explained that they chose Mesos to increase efficiency and flexibility: "We chose Mesos as a platform because we wanted to basically do this. We wanted to run lots of containers. We realised this, we really buy into the, 'We don't need a virtual machine layer, we want to containerize, run microservices,' and we've got to run lots of these different microservices within our cluster.

"This is another key point: We didn't want any more silos," he said. "If I looked across how we built applications and deployed them in the past, they were all silos of machines and applications and put into these data centers. Every time you wanted to bring up a new application, you had to go source hardware, deploy hardware, deploy applications, set up new teams and monitor it. Really we didn't want to do that anymore. We really wanted to go cluster computing, so we have lots of very similar, same types of computers running in a cluster, we run our applications across all these."

How to Stand Up a 600 Node Bare Metal Mesos Cluster in Two Weeks

Craig Neth, Distinguished Member of Technical Staff, Verizon Labs

In this video, Craig Neth tells how he and his team attended MesosCon in Seattle in August 2015 and were excited and inspired to set up their own test cluster. He asked his boss for a couple of racks, and instead was given the go-ahead for a 20-rack test lab. This may sound like being showered with riches, but it also meant being showered with headaches, because part of the deal was using experimental hardware and rack designs, and having it all done by Christmas.

His team had to find a location for their new cluster lab and then had to figure out power and cooling. The compute sleds included "a standard off-the-shelf Intel Taylor Pass motherboard. It's got two CPU sockets...Each one of them has a plug-in 10 gig PCI nic card. We use that for our data plane stuff. We use a couple of the one-gig nics on there, one for management and one for the IPMI network. That's how you get the four servers per 2U." The sleds do not have power supplies, but rather draw DC power from a common bus bar across the backs of the racks. All the interconnects are on the back as well.

The storage sleds are configured differently from the compute sleds. "It's a two-layer system. The top layer has 16 six-terabyte drives, spinning drives. The bottom layer has got another one of those Taylor Pass motherboards and a couple of SSDs down there. They're the exact same motherboards that we run in the compute sleds. The only difference here is on this particular cluster we only have one socket populated."
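The drive counts Neth quotes make the raw capacity of a storage sled easy to work out. The sketch below is only an illustration: it counts the spinning drives alone, ignores the SSDs, and reports raw capacity before any filesystem or replication overhead. The number of storage sleds in the cluster is not given in the talk summary, so the sled count used in the example is purely hypothetical.

```python
# Figures quoted from the talk: 16 six-terabyte spinning drives per storage sled.
DRIVES_PER_STORAGE_SLED = 16
TB_PER_DRIVE = 6


def raw_capacity_tb(sleds: int,
                    drives_per_sled: int = DRIVES_PER_STORAGE_SLED,
                    tb_per_drive: int = TB_PER_DRIVE) -> int:
    """Raw spinning-disk capacity in terabytes for a number of storage sleds."""
    return sleds * drives_per_sled * tb_per_drive


if __name__ == "__main__":
    print(f"One storage sled: {raw_capacity_tb(1)} TB raw")
    # Hypothetical sled count, chosen only to show how the figure scales.
    hypothetical_sleds = 10
    print(f"{hypothetical_sleds} storage sleds: "
          f"{raw_capacity_tb(hypothetical_sleds)} TB raw")
```

A single sled therefore holds 96 TB of raw spinning storage, which helps explain why the team swaps whole sleds rather than individual drives.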

Provisioning all these machines was considerably accelerated by having the vendor do the preliminary work, and Neth is proud that they only had to connect a single serial cable to configure the first node, and then the rest was done automatically.


Maintenance is pull-and-replace, and uses the same auto-provisioning as the initial installation. "Our maintenance model is we don't replace components in any of these things," Neth said. "We replace sleds. If we lose a disc, if we lose some memory, if we lose fans, whatever it is, we call up the vendor, and they overnight us a new sled, and we just pull out the old sled. We get the new sled. We get metadata for the sled so we can provision it and bring it right back up again. Nodes are cattle. They're not pets."

Getting Creative With Mesos

You might also enjoy 4 Unique Ways Uber, Twitter, PayPal, and Hubspot Use Apache Mesos. And, come back for more blogs on ingenious and creative ways to hack Mesos for large-scale tasks.

MesosCon Europe 2016 offers you the chance to learn from and collaborate with the leaders, developers and users of Apache Mesos. Don’t miss your chance to attend! Register by July 15, 2016 to save $100.


Apache, Apache Mesos, and Mesos are either registered trademarks or trademarks of the Apache Software Foundation (ASF) in the United States and/or other countries. MesosCon is run in partnership with the ASF.
