March 6, 2017

What If Mesos Metrics Collection Was a Snap?

Roger Ignazio, tech lead at Mesosphere, introduces the Snap plugin for Apache Mesos at MesosCon Asia 2016. Snap is an open telemetry framework that simplifies the collection, processing, and publishing of system data through a single API. It collects hundreds of metrics from Mesos masters and agents and helps you make sense of this mass of data so that you can monitor your cluster operations.

Ignazio presents Snap in the context of day two operations. "Day two operations is everything that comes after day one," says Ignazio. "So, what does that mean? Everything that happens after you provision a Mesos or a DC/OS cluster falls into day two operations. That's logging, that's debugging, that's metrics collection. Really, anything that you need to do to operate and ensure the health of a cluster."

There are many Mesos metrics APIs that you can potentially use, and Ignazio describes some of them. "The first one is the redirect endpoint. In a Mesos cluster you commonly have three or five or seven masters, for a production highly available deployment. The redirect endpoint returns an HTTP 307 redirect to the leading master, and this is important because you never want to query the non-leading master for its metrics."
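The redirect lookup Ignazio describes can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the `/master/redirect` path, the default master port 5050, and the helper names `resolve_leader` and `find_leading_master` are assumptions made for the example, and the `Location` header may be absolute or scheme-relative depending on the Mesos version.

```python
import http.client
from urllib.parse import urljoin


def resolve_leader(base_url: str, location: str) -> str:
    """Turn the Location header of a 307 response into an absolute URL."""
    # urljoin accepts both absolute values and scheme-relative ones ("//host:port").
    return urljoin(base_url, location)


def find_leading_master(host: str, port: int = 5050) -> str:
    """Ask any master which master currently leads (hypothetical helper)."""
    # http.client does not follow redirects, so the 307 can be inspected directly.
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/master/redirect")  # assumed endpoint path
        resp = conn.getresponse()
        location = resp.getheader("Location")
        if resp.status != 307 or location is None:
            raise RuntimeError(f"expected a 307 redirect, got HTTP {resp.status}")
        return resolve_leader(f"http://{host}:{port}/", location)
    finally:
        conn.close()
```

A collector would call `find_leading_master("master1.example.com")` once per polling cycle (the host name is a placeholder) and send its metrics queries to the URL it returns, so that a non-leading master is never queried.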

"The next is this metrics snapshot endpoint, and that's a summary of the master's metrics. Kind of a high-level and operations view. It's things like how long it's taking to query the Mesos internal registry, and how many messages are being sent back and forth between the frameworks." Other metrics APIs include the state and state summary endpoints, which provide either a high-level or a detailed view of cluster state, and endpoints that report on running containers: container IDs, CPU, memory, and disk usage.
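To make the shape of that snapshot concrete: the endpoint returns a flat JSON object that maps slash-separated metric names to numbers. The sketch below groups such an object by its top-level prefix and checks whether the queried master is the leader. The sample values are invented for illustration, and `group_by_prefix` and `is_leader` are hypothetical helper names, not part of Mesos or Snap.

```python
import json
from collections import defaultdict

# Illustrative payload only; a real snapshot contains many more keys.
SAMPLE_SNAPSHOT = json.loads("""
{
  "master/elected": 1.0,
  "master/uptime_secs": 3600.0,
  "registrar/state_store_ms": 12.5,
  "system/load_1min": 0.4
}
""")


def group_by_prefix(snapshot: dict) -> dict:
    """Group flat 'prefix/name' metric keys into one dict per prefix."""
    grouped = defaultdict(dict)
    for name, value in snapshot.items():
        prefix, _, rest = name.partition("/")
        grouped[prefix][rest] = value
    return dict(grouped)


def is_leader(snapshot: dict) -> bool:
    """A master reports master/elected as 1 while it is the leading master."""
    return snapshot.get("master/elected") == 1


if __name__ == "__main__":
    for prefix, metrics in group_by_prefix(SAMPLE_SNAPSHOT).items():
        print(prefix, metrics)
    print("leader:", is_leader(SAMPLE_SNAPSHOT))
```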

Snap separates the collection of all of this data from publishing it. "You can filter, you can add context, you can type metrics, you can aggregate them, and then ultimately publish them onto your message queue or to a time-series database, and you can visualize them with your tools of choice." The Grafana dashboard is a popular choice for creating a visual representation of your Snap data.
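The separation of collecting, processing, and publishing can be pictured with a small conceptual sketch. This is not Snap's API (Snap is written in Go and is driven by its own plugins and task definitions); every name here, including `Metric`, `collect`, `keep_prefix`, `add_tags`, and `run_pipeline`, is hypothetical, and the metric values are made up.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Metric:
    name: str
    value: float
    tags: dict = field(default_factory=dict)


def collect() -> List[Metric]:
    """Stand-in collector; a real one would query the Mesos endpoints."""
    return [
        Metric("master/uptime_secs", 3600.0),
        Metric("master/elected", 1.0),
        Metric("system/load_1min", 0.4),
    ]


def keep_prefix(prefix: str) -> Callable[[List[Metric]], List[Metric]]:
    """Processor that filters metrics by name prefix."""
    def process(metrics: List[Metric]) -> List[Metric]:
        return [m for m in metrics if m.name.startswith(prefix)]
    return process


def add_tags(**tags: str) -> Callable[[List[Metric]], List[Metric]]:
    """Processor that adds context to every metric as tags."""
    def process(metrics: List[Metric]) -> List[Metric]:
        return [Metric(m.name, m.value, {**m.tags, **tags}) for m in metrics]
    return process


def run_pipeline(collector, processors, publisher) -> None:
    """Collect once, apply each processor in order, then publish the result."""
    metrics = collector()
    for process in processors:
        metrics = process(metrics)
    publisher(metrics)


if __name__ == "__main__":
    published: List[Metric] = []  # stands in for a message queue or database
    run_pipeline(collect, [keep_prefix("master/"), add_tags(cluster="demo")],
                 published.extend)
    for metric in published:
        print(metric)
```

Because each stage only sees a list of metrics, a filter or a publisher can be swapped without touching the collector, which is the property the quote above attributes to Snap.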

Watch the full presentation (below) to learn more about Snap's architecture and to see some examples of how to use it.

Interested in speaking at MesosCon Asia on June 21 - 22? Submit your proposal by March 25, 2017.

Not interested in speaking but want to attend? Linux.com readers can register now with the discount code, LINUXRD5, for 5% off the attendee registration price. Register now to save over $125!