Back to Basics: What Sysadmins Must Know about Logging and Monitoring

With the rise of containers and virtual machines, some system administrators have been neglecting their system logs. That’s a mistake.

Even if your containerized applications spin up and down several times an hour, you still need to keep and analyze logs. To find the root cause of a failure or to track down a system attack, you must be able to review what happened, when it happened, and which components of your software and hardware stack were affected. Otherwise, you’ll waste time looking for problems in the wrong place, and time is exactly what you can’t spare in an emergency. Worse still, you may miss hidden issues such as performance problems, security violations, or costly use of system resources.

Without system logs, you’re not administering a system; you’re running a black box and hoping for the best. That’s no way to run servers, whether they are physical, virtual, or containerized.

So, here are some of the basics to keep in mind as you approach server logging in the 21st century. These are all practices that I either use myself or picked up from other sysadmins, including many from the invaluable r/sysadmin community on Reddit.

Read more at HPE