January 5, 2004

CLI magic: everything's a file

Author: Joe Barr

This is the third and final file-related article in the Linux CLI for noobies series. But before we remove our GUI water-wings and dive into the CLI end of the pool to solve the final file mystery, we'll do a quick recap of the series to date to make sure everyone is up to snuff. I also want to take this opportunity to give a shout-out to our friends at Microsoft, who are underlining the importance of getting out of the GUI box occasionally and feeling the power of the CLI by deciding to "invent" a CLI for Leghorn. Or is that Foghorn? Whatever, it is the name Microsoft is using for its new vaporware OS. It's scheduled to launch in three years, so if the history of MS operating system launches is any guide, it should be here by 2009. And what a compliment that MS calls it "monad." Redmond touting the benefits of a CLI? Amazing. Who knows, maybe their lead designers have been among the noobies learning from this series all along.


The first step in the CLI for noobies series was "alias cat and pipe meet grep." In it, we took a look at a few simple commands we could use to make handy CLI tools. We also learned how we could rename those tools with the alias command, and where to keep them.

The topic for week two was "file this." It started the current mini-series on files. Episode one concentrated on commands and tools useful in dealing with files: finding them, renaming them, and so on.

Last week we continued our file exploration. In "man for hier," we covered the basics of the Linux file system hierarchy and learned a little bit about what sorts of files live where, as well as why that matters to us.

Forward through the fog

Looking at the world through the eyes of the Linux kernel, everything is a file. Devices like modems, monitors, CD-ROMs, hard drives, even keyboards and printers are files. You (what you type on the keyboard, at least) are a file, the data that appears on your screen in response to what you've typed is a file, and even any error messages that might result from what you've typed are a file. In fact, those last three files form something of a trinity that deserves special attention: standard input, standard output, and standard error.
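To make "everything is a file" a little less mystical, here is a minimal sketch that pokes at two device files found on a typical Linux system. It assumes /dev/null and /dev/zero exist and that your head command accepts the -c option. The variable names are made up for illustration.

```shell
# Device nodes live under /dev and can be read or written like ordinary files.
# /dev/null is the bit bucket: anything written to it is thrown away.
echo "this text vanishes" > /dev/null

# /dev/zero hands out an endless supply of zero bytes; read four and count them.
zero_count=$(head -c 4 /dev/zero | wc -c | tr -d ' ')
echo "bytes read from /dev/zero: $zero_count"

# "test -c" (written here as [ -c ... ]) asks whether a path is a character device.
if [ -c /dev/null ]; then
    null_kind="character device"
else
    null_kind="something else"
fi
echo "/dev/null is a $null_kind"
```

The same commands you would use on a text file (echo, head, wc) work on the devices, which is the whole point.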

It's like this, C

Linux is written (mostly) in the C language, just like Unix. In C, most programs are able to read from the keyboard and write to the console monitor thanks to a set of standard I/O definitions for three standard streams. Streams, naturally, are a kind of file. Console programs written in C use those three files (stdin, stdout, and stderr) without a second thought. Unless you specify them differently, stdin is your keyboard and stdout is your monitor. The stderr stream might also be your monitor. That's why, grasshoppa, when you sit at a Linux console for very long, you start to feel as if you are one with the force.

Do you remember structured analysis and design? Probably not, but let's take a look at a typical dataflow diagram from the Stone Age of IT anyway, just for the sake of illustration. The circle in the chart below represents a process: a program running on Linux. The curved lines outside the circle represent data in motion. That's another way of saying streams. Or files.

The curve with the arrow pointing into the circle is stdin, the standard input stream. The curved line with the arrowhead pointing away from the circle is stdout. The stdout stream contains the output from the program. The stderr stream (not shown) contains any errors that might have been generated.
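If you want to see that stdout and stderr really are two separate streams, a sketch like the following can help. The function name report is hypothetical, and the example leans on the shell's file descriptor numbers (1 for stdout, 2 for stderr), which the article has not formally introduced.

```shell
# report is a made-up helper that writes one line to each output stream.
report() {
    echo "normal output"        # goes to stdout (file descriptor 1)
    echo "error message" >&2    # goes to stderr (file descriptor 2)
}

# $( ... ) captures stdout only, so here stderr is simply discarded.
captured_out=$(report 2>/dev/null)

# To capture stderr instead: point fd 2 at the capture first,
# then send fd 1 to /dev/null. The order of the two redirections matters.
captured_err=$(report 2>&1 >/dev/null)

echo "stdout carried: $captured_out"
echo "stderr carried: $captured_err"
```

Both streams usually land on the same monitor, which is why they are easy to mistake for one.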

Now let's do away with abstractions like dataflow diagrams and look at something concrete: one of the first commands we learned. Like this one:

cat phones.txt | grep -i steve

That command combines two processes, not just one. That gives us one more thing to talk about. In your mind's eye, label the circle in the chart above as the program cat. As a command line program, cat expects its instructions to be passed to it as arguments on the command line (which, as we'll see, are not the same thing as stdin). Those arguments contain important things for cat to know. Like the name of the file(s) it is supposed to work with, for example. That would be the "phones.txt" in our example.

If that were the end of the command, cat would assume that the console is the stdout and print the contents of phones.txt there. But that's too easy. In this example, we have linked the stdout from cat to become the stdin of grep. That's what the pipe operator does. It pipes output from one process to another process as its input.
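Here is the pipe in action as a small, self-contained sketch. The contents of phones.txt are invented for the example, and the file is created in a scratch directory so nothing of yours gets touched.

```shell
# Build a small, made-up phones.txt in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' \
    'Steve Jones 555-0101' \
    'Maria Lopez 555-0102' \
    'steve smith 555-0103' > "$workdir/phones.txt"

# cat writes the file to its stdout; the pipe hands that to grep as its stdin.
matches=$(cat "$workdir/phones.txt" | grep -i steve)
echo "$matches"

# Tidy up the scratch directory.
rm -r "$workdir"
```

Because of the -i option, grep ignores case and picks up both the "Steve" and the "steve" lines.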

The "| grep -i steve" portion of our example shows that part of grep's input comes from the pipe and part of it from the command line. The grep command normally gets the names of the file(s) to be searched from the command line, like this:

grep -i steve phones.txt

In our case, the pipe operator provides grep with the data to be searched. But the other arguments still need to be provided on the command line itself, like the "-i" option and the search term ("steve"). Corrected: The original version incorrectly stated "So in this case, stdin is split between two sources: the pipe and command line data." In fact, command line arguments are completely distinct from stdin.
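The difference between command line arguments and stdin is easier to see than to describe. In this rough sketch, show_sources is a hypothetical helper (not a standard command) that reports what arrived by each route, and the phones.txt contents are made up.

```shell
# show_sources is a made-up helper: it prints its arguments, then its stdin.
show_sources() {
    echo "arguments: $*"
    echo "stdin: $(cat)"
}

# The pipe feeds stdin; "-i steve" arrives separately as arguments.
result=$(echo "piped data" | show_sources -i steve)
echo "$result"

# Both ways of running grep from the article find the same lines.
workdir=$(mktemp -d)
printf '%s\n' 'Steve Jones 555-0101' 'Maria Lopez 555-0102' > "$workdir/phones.txt"
from_pipe=$(cat "$workdir/phones.txt" | grep -i steve)   # data via stdin
from_arg=$(grep -i steve "$workdir/phones.txt")          # file named as an argument
rm -r "$workdir"
```

When grep is given no file name, it falls back to reading stdin, which is why the piped version works at all.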

Let me redirect your attention

There are other operators besides the pipe which can be used with files in cool ways. Redirection is one example. Let's slightly change the example above to show how this works.

cat phones.txt | grep -i steve > steve.txt

All we've done is to add "> steve.txt" to the end. That redirects the output from grep that would normally go to stdout (your monitor) to a file named "steve.txt." Of course there are other operators as well. The "<" operator redirects stdin so that the program gets its data from a file instead of the keyboard. Using ">>" instead of just ">" allows you to add the output of a program to the end of a file instead of replacing the file, if it already exists. As usual, there is plenty more where that came from. We're only making the introductions.
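The three redirection operators can be tried out side by side with a sketch like this one. The file contents are invented, and everything happens in a scratch directory so an existing steve.txt of yours is never overwritten.

```shell
# Work in a scratch directory with a made-up phones.txt.
workdir=$(mktemp -d)
printf '%s\n' 'Steve Jones 555-0101' 'Maria Lopez 555-0102' > "$workdir/phones.txt"

# ">" creates steve.txt, or replaces it if it already exists.
grep -i steve "$workdir/phones.txt" > "$workdir/steve.txt"

# ">>" adds to the end of the file instead of replacing it.
grep -i maria "$workdir/phones.txt" >> "$workdir/steve.txt"

# "<" connects the file to wc's stdin, so wc never sees a file name.
line_count=$(wc -l < "$workdir/steve.txt" | tr -d ' ')
echo "steve.txt now holds $line_count lines"

# Keep a copy of the result for inspection, then tidy up.
saved=$(cat "$workdir/steve.txt")
rm -r "$workdir"
```

Swap the ">>" for a second ">" and the file ends up with only the last line written, which is a good way to convince yourself of the difference.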

What's next?

Don't worry, we're just getting started exploring this newly discovered environment. In the future we'll learn more commands, more handy tips, and unlock the secrets of the Linux gurus. We're talking serious geek here, mister. Secrets like compiling your own programs with the magic mantra of "./configure, make, and make install." We'll also look at compiling the kernel and writing your own shell scripts. You can't be stopped now that you've learned all about files.
