In large heterogeneous Unix/Linux environments with several hundred servers, keeping up to date with security patches, which are the number one requirement for strong security, is next to impossible. Pushing out patches across hundreds of servers, with a mix of different operating systems, kernels, and versions, is complex, time-consuming, risky, and very expensive. For most large companies a six-month refresh of security patches is the best you can hope for, and even that is optimistic. This means that your servers will have exploitable security bugs most of the time. The challenge is not to eliminate those bugs but to mitigate the risk they pose.
The vast majority of exploitable bugs require local access for an attacker to be able to use the exploit; he has to be logged onto the server. We tend to think that the main aim of Unix security is to guard root, but for most servers in large companies this doesn't work. A competent attacker with any local login can probably find an exploit to get root; so the real challenge is to prevent attackers from getting a local command line login.
This is a big problem because all of your administrators (and maybe users too) have an account. Any one of them can allow an attacker in. No one can remember 200 passwords, so if an attacker gets a login to one box, he's probably got a login to tens or hundreds.
The first way to mitigate this risk is to compartmentalise the users and administrators. You may be able to prevent some people having command line access. Maybe they can be locked into a menu or launched into a single command (e.g. to change their password). In some cases, a chrooted environment may be suitable for users.
One method is to have each of your administrators responsible for certain groups of servers. If Bob only has access to 30 servers and his password is compromised, at least you've limited the damage that can be done. Unfortunately, this works less well in practice, because it clashes with a key aim of most IT departments to save money by consolidating support personnel. If you have your administrators compartmentalised into five teams, that's five different on-call rotas. Worse than this, you risk splitting up the Unix expertise and limiting communication between teams, which reduces everyone's effectiveness. This is a problem we'll see again: the contention between security and efficiency.
There are ways that compartmentalising can have a chance of working, but they are not very satisfactory. For example, separate teams could perform day-to-day administration on groups of systems, but with only one on-call rota shared between the teams. You set up a special on-call user account which is maintained across the servers, store the passwords securely, and allow the on-call engineer to access them as needed. If a password is accessed, change it the next day. In this way, any administrator can access any server.
Notice that identification and authentication are separated from authorisation in this model. Identification and authentication occur not on the server but when the password is accessed, which means that there needs to be a reasonably strong authorisation method for password access. When people access the server with this protocol, they are merely confirming that they are someone who has gained access to the password, which in itself is not too impressive. Whatever method of holding and accessing passwords is used, it is important that only approved people can access the on-call password and that the identity of the person who does so is correctly recorded.
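The on-call model described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `OnCallVault` class, its method names, and the in-memory storage are all hypothetical, and a real password store would need encryption and proper authentication of the person asking.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class OnCallVault:
    """Hypothetical store for shared on-call passwords (illustration only)."""
    approved: set        # people allowed to check out on-call passwords
    passwords: dict      # server name -> current on-call password
    log: list = field(default_factory=list)          # audit trail of checkouts
    needs_change: set = field(default_factory=set)   # servers awaiting a new password

    def checkout(self, who, server, now=None):
        # Identification and authorisation happen here, not on the server.
        if who not in self.approved:
            raise PermissionError(f"{who} is not approved for on-call access")
        # Record who took which password and when.
        self.log.append((who, server, now or datetime.now()))
        # A password that has been accessed should be changed the next day.
        self.needs_change.add(server)
        return self.passwords[server]

    def rotate(self, server, new_password):
        # Replace the password; the audit trail is deliberately kept.
        self.passwords[server] = new_password
        self.needs_change.discard(server)
```

The point of the design is that the audit log and the list of passwords awaiting a change are both driven by the checkout step, since that is the only place where the engineer's identity is known.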
With this route there's still a basic problem on the efficiency side. The people on call are required to fix problems, at all hours of the night, on systems that they do not normally log onto and so have no familiarity with. In the unlikely event that the servers are either very well documented or have very standardised builds, this might work. More likely, every server has its quirks and having a problem worked on by someone unfamiliar with those could add substantially to the time it takes to fix problems and get the production system working again.
There are other methods of compartmentalising, but these mostly restrict who has access to root commands (e.g. with the wheel group, user roles, or sudo). Because of all those security holes, this is rather like locking the stable door after the horse has bolted. To be fair, it's not pointless - not every attacker will have root exploits up his sleeve - but it's certainly not sufficient.
One other compartmentalisation method which is worth implementing is to ensure that users do not have write access to others' home directories. By default they should not, but it's common for someone to allow world write access for convenience. This allows attackers who break into one account to take control of others too, even before they have root access.
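A check for world-writable home directories is easy to script. The following is a minimal sketch, assuming home directories are listed in a passwd-format file; the function names are hypothetical, and it only looks at the home directory itself, not at the files inside it.

```python
import os
import stat


def world_writable(paths):
    """Return the paths whose permission bits allow write access to 'other'."""
    flagged = []
    for path in paths:
        try:
            mode = os.stat(path).st_mode
        except OSError:
            continue  # skip paths that are missing or cannot be examined
        if mode & stat.S_IWOTH:
            flagged.append(path)
    return flagged


def home_directories(passwd_file="/etc/passwd"):
    """Read home directories (the sixth field) from a passwd-format file."""
    homes = []
    with open(passwd_file) as fh:
        for line in fh:
            fields = line.rstrip("\n").split(":")
            if len(fields) >= 7:
                homes.append(fields[5])
    return homes


if __name__ == "__main__":
    for path in world_writable(home_directories()):
        print("world-writable home directory:", path)
```

Expect some noise from system accounts whose home directory is a shared location; those entries would need to be filtered out by hand or with an exclusion list.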
Whether or not you can compartmentalise your administrators and users, there are other techniques to make it more difficult for attackers to gain command-line user access.
One option is to enforce access by secure shell with public/private key pairs and pass-phrases. This puts a significant barrier in the way of an attacker. Instead of just getting a password, he now has to get a private key and pass-phrase: something I have and something I know. Of course for this to work, password access must be blocked.
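Whether password access really is blocked can be checked from the server configuration. The sketch below assumes an OpenSSH-style `sshd_config`, where `PasswordAuthentication` and `PubkeyAuthentication` both default to `yes` and the first value given for a keyword wins. The function names are hypothetical, and the parser deliberately ignores `Include` files and anything inside `Match` blocks.

```python
def effective_settings(config_text):
    """Map lower-cased keyword -> first value seen before any Match block."""
    settings = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # keyword without a value; not handled in this sketch
        keyword, value = parts[0].lower(), parts[1].strip()
        if keyword == "match":
            break  # conditional blocks are out of scope here
        settings.setdefault(keyword, value)  # keep the first value only
    return settings


def password_logins_blocked(config_text):
    """True if password logins are off and public key logins are on."""
    s = effective_settings(config_text)
    # Assumed defaults when a keyword is absent: both set to "yes".
    return (s.get("passwordauthentication", "yes").lower() == "no"
            and s.get("pubkeyauthentication", "yes").lower() == "yes")
```

Bear in mind that keyboard-interactive authentication can also prompt for a password, so a clean result here is a necessary check rather than a sufficient one.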
In large environments, public/private key access is tough to manage. First there's the problem of key distribution. If there are 20 administrators and 200 servers, that's 4,000 individual keys to push around onto servers. If someone changes his key, the new key has to be pushed out to all the servers again. On the client side, each administrator's private key must be on each client machine the administrator might use to access servers (or on a network directory mounted onto every client machine). This may include home PCs and laptops that are easily stealable.
There are other problems with using public key access. Enforcing good pass-phrases and key aging is difficult. Revoking keys is also difficult. If an administrator leaves, his account should be locked or deleted. But because he had root access, he could have hidden his public key in someone else's account to give himself a back door onto the system, or created a fake user account. To some extent this can be checked for. For example, scripts can flag up accounts with more than one key, but this isn't easy, and is certainly hard work.
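As an illustration of the kind of script mentioned above, here is a minimal sketch that flags accounts holding more than one key. It assumes the usual `authorized_keys` layout of one key per line, with blank lines and lines starting with `#` ignored. The function names are hypothetical, and collecting each account's file contents is left to the caller.

```python
def count_keys(authorized_keys_text):
    """Count key entries: non-blank lines that are not comments."""
    return sum(1 for line in authorized_keys_text.splitlines()
               if line.strip() and not line.lstrip().startswith("#"))


def accounts_with_multiple_keys(key_files):
    """key_files maps account name -> contents of its authorized_keys file."""
    return sorted(user for user, text in key_files.items()
                  if count_keys(text) > 1)
```

A flagged account is not proof of a back door, since some people legitimately hold several keys. It is only a list of places to look, which is part of why this is hard work.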
This is one situation where a public key infrastructure (PKI) is effective. Having a PKI allows keys to be revoked and can also overcome the key distribution problem. It doesn't make people choose strong pass-phrases though; and it is certainly non-trivial to implement.
One last thing to consider with the public key solution is the efficiency angle. A critical requirement for any solution is that administrators and users can get onto any server when they need to. As protocols get more complex, there are more things that can go wrong and prevent access, and a greater chance that the people using a product won't understand how it works and will break it, or compromise security by using it wrongly. For example, there are various file permissions which prevent secure shell working but do not cause telnet a problem. How confident are you that your solution will always allow the right people on while keeping the wrong people off? If you are not too confident, you may need an alternative method to access the servers.
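The file permission problem is one that can be tested for in advance. This sketch assumes the secure shell server refuses key logins when the home directory, the `.ssh` directory, or the `authorized_keys` file is writable by group or others, as OpenSSH does when `StrictModes` is on. The function names are hypothetical, and ownership checks are left out.

```python
import os
import stat


def too_permissive(path):
    """True if the path is writable by group or by others."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))


def ssh_permission_problems(home):
    """List paths under one home directory likely to fail a strict-modes check."""
    candidates = [
        home,
        os.path.join(home, ".ssh"),
        os.path.join(home, ".ssh", "authorized_keys"),
    ]
    # Paths that do not exist are skipped rather than reported.
    return [p for p in candidates if os.path.exists(p) and too_permissive(p)]
```

Running a check like this across all accounts before switching off password access gives some warning of who would otherwise be locked out.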
Two possible server access methods that can be used in emergencies are TCP-wrappered telnet (so people can telnet to a server, but only from a specific other server within the same location, with the plain text password not leaving the computer room) and remote console access, again within a secure computer room.
All of these techniques are helpful in maintaining security. Tomorrow, we'll turn our attention to another weak link -- the people who have command-line access to the systems.
Iain Roberts is a freelance IT consultant with 10 years' experience working in large Unix environments.
I just noticed that Cryptonomicon.Net is running a critique of sorts of this article at http://www.cryptonomicon.net/modules.php?name=News&file=article&sid=720
Thanks to everyone who commented on my article. Here are my responses.
"Sorry why are they mixed".
They are mixed because in large IT environments, that's the situation you get. Systems bought over many years, with different OSs, different versions, different apps. This is the real world of computing with large companies. Nothing wrong with your suggestions for a greenfield site, but that's not what the article is about.
"Why run a telnetd?"
I agree with your conclusion that serial connections are better, but they may not always be possible (e.g. you need to lay down lots of new cables and there may be distance problems). As for saying that anyone who runs telnetd is mad: well, maybe, but again that ignores the reality of working in big companies where telnet has been around for years and change is difficult to push through. Phasing out telnet is not easy - and I speak as someone who has done it and knows just what the problems are. Life isn't as simple as you think sometimes!
"Might have missed the point here..."
Yes, you have, I'm afraid, and so did the Cryptonomicon article. The issue is not the technical solution: that's trivial. The issue is that no large organisation will allow you to push out changes in that way - it's seen as too risky and uncontrolled.
Typically, even if you have the infrastructure to push out patches en masse, you still have to do it in a controlled way with lots of different changes and it still takes a long time to achieve. That's the reality in all but the most homogeneous of large-scale environments. The technology is almost irrelevant. The business requirements, their perception of risk, the bureaucracy and internal politics are what it all comes down to. I'm sure anyone with experience of working in IT in large companies will confirm this.
Liked your other comments about PKI, SSH and telnet.
Cryptonomicon posting
As I've said, I feel that the central criticism of my article, on patching, and the overall negative opinion of my article are both wrong and wholly unjustified. However, the piece does have some good ideas which I did not include in my Newsforge piece, so take a look for those.
Sorry why are they mixed
Posted by: Anonymous Coward on March 18, 2004 10:04 AM
Pick a key distro or distros of Linux and install it across the servers (a maximum of two different base types; any more is too hard, and one is taken up if you are running Windows). For example, Debian-based distros can be moved to a common core without a problem. In some cases even Red Hat and Mandrake can be common-cored to Debian (note that this is harder than picking a Debian-based one). Having a common core setup means one lot of downloads spread across all servers, so it can be proxied, reducing internet usage.
Root access can be made very hard by installing LIDS. Note that standard Linux security is not at its maximum; it only reaches that after a few add-ons.
Basically, getting to root is pointless without LIDS approval because you cannot do anything; in effect, two layers have to be defeated.