Automating Processes with Chef


[Figure 1: screenshot of a terminal window with a Chef recipe open in vim]

Recently, Chef (along with Puppet and other tools) has been getting plenty of coverage in the areas of DevOps and continuous delivery. Big companies, including HP, have embraced Chef as an important tool in automation. This automation stretches through the entire hardware and software lifecycle, and Chef has become an integral part of it.

Clearly, knowing Chef is vital these days, whether you’re a software developer, an IT administrator, a database administrator, or any mix thereof. But how do you learn Chef? The key to learning Chef is to first understand what it is. The next key is to learn a bit of Ruby. You don’t need to be a Ruby master, though.

What Is Chef?

So what exactly is Chef? Chef is a tool that runs scripts you write to automate processes. What processes? Pretty much anything IT-related. Think about some of the tasks you might do repeatedly. Here’s a random list:

  • Install an operating system on a new computer

  • Upgrade the operating system on a new computer

  • Install software libraries

  • Install Apache

  • Modify an existing Apache configuration

  • Upgrade MySQL

  • Add a user to MySQL

  • Generate SSH keys on a server

  • Upgrade MySQL on all development servers

  • Pull files down on a server from source control

See where this list is going? In a million different directions. No matter what your specialty, there’s a good chance you’ll be able to benefit from automation. Consider my own line of work: I work on a team of software developers, creating a web-based application in Node.js. The application runs on replicated servers. As demand grows, new servers need to be allocated, which is a repetitive process. That’s for the live servers; the development servers involve similar repetition.

Over and over, we find ourselves needing to provision a clean development server for unit testing. In the old days, we software developers would ask the IT team to help us find an available computer to start clean with; now we simply allocate a virtual server on a cloud platform. Once we have a clean server with a fresh operating system installed, we need all our tools: the Node.js runtime and its supporting tools, along with a few other things. Who wants to install this over and over?

Yes, we could just start with an image. But, the reality is that as our software evolves, at least once a week we would need to modify the image. That doesn’t buy us much. Plus, each virtual server has a unique IP address and name, and additional unique parameters such as SSH keys, and so on.

So, we use an automation tool instead. The tool will allocate the virtual server on the cloud platform, which includes installing the operating system; install the Node.js tools and other programs we need (e.g., both MySQL and MongoDB); configure all the software; configure some individual files; and so on. Because it’s a dev server, we need to have Git installed, and it needs some SSH keys so it can do a pull, and the list goes on and on.

Chef and other automation tools take care of this. Using Chef, you can write simple scripts (called recipes in Chef parlance), and then Chef will handle the automation for you. You can use Chef for small, simple tasks or for huge tasks such as automating massive datacenters. To get started, it’s best to learn some small tasks so you can get a handle on the basics. The good people at Chef have put together a pretty decent website to help you learn Chef, and I recommend you start with the tutorial. I worked through the Ubuntu version (there are also Red Hat Enterprise Linux and Windows versions), but I want to provide some additional notes here beyond what the website provides.

One thing to keep in mind is that although you’re using a virtual machine to work through the examples, when doing your own exploring separately, you’ll want to create two virtual machines: One is the machine you’re doing your Chef development on; the other is the machine that you’re going to configure and monitor with Chef. The second machine might be the machine you need all your main software installed on — either your programming tools for your test machine, or Apache for a live website, for example. The first machine does the monitoring and configuring; the second machine might be one of many servers you’re managing. In the case of this tutorial, you’re actually doing both on a single virtual machine, which is why it’s important to remember the big picture and how things would be arranged in a larger environment.

Next, with Chef you can manage resources. The resource might be the second machine, but it really can be any number of things, including something as simple as a file, which is exactly what you manage in the first exercise. Resources can represent anything you might need to manage, whether that is software (e.g., applications including Apache and MySQL), hardware (e.g., networking devices), or virtual hardware.
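
As a rough sketch, a resource declaration in a recipe can be as short as the following. The `file` resource and its `content` property are part of Chef; the path and the recipe file name `hello.rb` are assumptions made for illustration, not taken from the tutorial.

```ruby
# hello.rb (hypothetical recipe file name)
# Declares the desired state of one file resource.
# The path below is an assumption chosen for illustration.
file '/tmp/hello.txt' do
  content 'Hello World'
end
```

Saved as hello.rb, a recipe like this can typically be applied to the local machine with `chef-client --local-mode hello.rb`. Running it a second time should report that nothing needs to change, because the file is already in the declared state.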

Why Automate Simple Tasks?

As you proceed through the tutorial, bear in mind that what you’re doing is automating processes. At one point, you install the Apache web server. While it’s true that you can easily install Apache through a simple shell-based call to a package utility (e.g., apt-get), installing Apache is likely to be a small part of a larger set of configuration. So, Chef includes the ability to install Apache for you.
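
To make that concrete, here is a minimal sketch of what a recipe covering Apache might look like. It is not the tutorial’s exact recipe. The `package`, `service`, and `file` resources are standard Chef resources, but the package and service name `apache2` assumes Ubuntu (Red Hat-style systems generally use `httpd`), and the document root path and page contents are placeholders.

```ruby
# webserver.rb (hypothetical recipe file name)

# Make sure the Apache package is installed (Ubuntu package name assumed).
package 'apache2'

# Make sure the service starts at boot and is running now.
service 'apache2' do
  action [:enable, :start]
end

# Make sure the home page has the contents we want.
# The path and the HTML are placeholders for illustration.
file '/var/www/html/index.html' do
  content '<html><body><h1>Placeholder page</h1></body></html>'
end
```

Each block describes one piece of the larger configuration, which is why installing Apache through Chef pays off even though a single apt-get call would also work: the install, the service state, and the page contents live together in one recipe.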

Also, consider the approach you’re taking here. Remember that although Ruby is a complete, general-purpose programming language, to use Chef you don’t need to be a Ruby master. The idea with Chef is that you declare the state you want your resources in. In the first exercise, for example, you declare what the state of the file should be (it should have a single line of text, “Hello World”). With a traditional programming approach, you might write an app that opens the file, checks whether the contents are what they should be, and then, if the contents are wrong, replaces them with the string Hello World, writes the file out, and finally closes the file.

That’s a procedural approach where you define the procedure that takes place. With Chef, however, you take a declarative approach, where you declare what the contents should be and let Chef handle the details of how to do it. Or, in the case of Apache, you’re not so much saying, “Please install Apache,” as saying, “Make sure Apache is installed.” Even more succinctly, you can specify the state of your server that you desire: “Apache is installed with this setup configuration.”

In other words, you’re not saying to do something and how to do it; you’re simply stating what your system must look like, what state it must be in. And, if the system is not in that state, Chef will make it so.
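
The idea of checking the current state and changing it only when it differs can be sketched in a few lines of plain Ruby. This is an illustration of the concept, not how Chef is implemented; `ensure_file_content` is a hypothetical helper, not a Chef API.

```ruby
# Sketch only: ensure_file_content is a hypothetical helper, not part of Chef.
# It mimics "declare a state, then converge to it" for a single file.
def ensure_file_content(path, desired)
  # Read the current state; a missing file counts as "no content".
  current = File.exist?(path) ? File.read(path) : nil

  # Already in the desired state: do nothing.
  return :up_to_date if current == desired

  # Otherwise bring the file into the desired state.
  File.write(path, desired)
  :updated
end
```

Calling `ensure_file_content('/tmp/hello.txt', 'Hello World')` on a fresh machine writes the file and returns `:updated`; calling it again returns `:up_to_date` and touches nothing. The caller states only what the file should contain, and the helper decides whether any work is needed.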

Figure 1 (above) is a screenshot of my terminal window as I was working through the tutorial. I have the vim editor open, with the recipe typed in. But this recipe doesn’t describe what to do; it describes the state that the server I’m managing needs to be in. Then, when I opened up my browser and connected to the running server, I saw the result (Figure 2), just as the tutorial says I should.

[Figure 2: screenshot of the browser showing the page served by Apache]

This demonstrates that the server is in the state that I asked it to be in: Apache is running, and the index file contains HTML to display a welcome message that begins “Welcome to”. Remember that as you work: your computer is in a state, and if it’s not in the state you want, you need a tool to get it back to the correct state. That’s where Chef comes in.

But Wait, There’s More

Here, I’ve looked at how to automate some processes with Chef, but there’s much more to what Chef can do. One aspect is managing existing systems. It can manage both Linux and Windows computers (which Chef calls nodes). It can manage applications as well as servers. And, as I mentioned earlier, it’s powerful enough to manage an entire datacenter. Learning Chef starts with learning the basics of what it is, and how to write some recipes. Want more? Take a look at the Learn Chef website and soon you’ll have this tool mastered.


So, who should learn configuration management tools such as Chef? These days, pretty much anyone who works in a computer field: certainly IT people who manage networks and datacenters, but software developers, too. Today, software needs to be very aware of its own infrastructure, which means software developers need to write code that’s infrastructure-aware. Chef can help.