7 Reliability Questions Engineering Managers Need to Ask Their Teams

January 28, 2019

Modern software teams face no shortage of edge cases and variations across the service categories and tiers of their ever-evolving architectures. In the midst of leading a team through day-to-day firefighting, it can be difficult to see the forest for the trees. But as managers, we know our teams face similar trials: defects and regressions, capacity problems, operational debt, and dangerous workloads affect all of us.

And then there is the complexity of scale, something we know about firsthand. The New Relic platform includes more than 300 unique services and petabytes of SSD storage that handle at least 40 million HTTP requests, write 1.5 billion new data points, and process trillions of events … every minute. The platform is maintained by more than 50 agile teams performing multiple production releases a week. To cope with scale like this, engineering teams must be nimble and fast-moving. Their managers must also ensure that their teams adhere to reliability processes that support this kind of complexity and scale.
