In large heterogeneous Unix/Linux environments with several hundred servers,
keeping up to date with security patches, which are the number one requirement
for strong security, is next to impossible. Pushing out
patches across hundreds of servers, with a mix of different operating
systems, kernels, and versions, is complex, time-consuming, risky, and
very expensive. For most large companies a six-month refresh of
security patches is the best you can hope for, and even that is
optimistic. This means that your servers will have
exploitable security bugs most of the time. The challenge is not
to eliminate those bugs but to mitigate the risk they pose.
The vast majority of exploitable bugs require local access: to use the exploit, the attacker has to be logged onto the server. We tend to think that the main aim of Unix security is to guard root, but for most servers in large companies this doesn't work. A competent attacker with any local login can probably find an exploit to get root, so the real challenge is to prevent attackers from getting a local
command line login.
This is a big problem because all of your administrators (and maybe users too) have an account, and any one of them can let an attacker in. No one can remember 200 passwords,
so people reuse the same one across servers; if an attacker gets a login to one box, he's probably got a login
to tens or hundreds.
The first way to mitigate this risk is to compartmentalise the users and
administrators. You may be able to prevent some people having
command line access. Maybe they can be locked into a menu or
launched into a single command (e.g. to change their password). In
some cases, a chrooted environment may be suitable for users.
One method is to have each of your administrators responsible
for certain groups of servers. If Bob only has access to 30 servers and his password is compromised, at least you've limited the damage that can be done. Unfortunately, this works less well in practice, because it clashes with a key aim of most IT departments: to save money by consolidating support personnel. If you have your administrators compartmentalised into five teams, that's five different on-call rotas.
Worse than this, you risk splitting up the Unix expertise and limiting communication between teams, which reduces everyone's effectiveness. This is a problem we'll see again: the contention between security and efficiency.
There are ways to give compartmentalising a chance of working, but they are not very satisfactory. For example, separate teams could perform day-to-day administration on groups of systems, but with only one on-call rota shared between the teams. You set up a special on-call user account which is maintained across the servers, store the passwords securely, and allow the on-call engineer to access them as needed. If a password is accessed, change it the next day. In this way, any administrator can access any server.
Notice that identification and authentication are separated from authorisation in this model. Identification and authentication occur not on the server but when the password is accessed, which means that there needs to be a reasonably strong authentication method for password access. When people access the server with this protocol, they are merely confirming that they are someone who has gained access to the password, which in itself is not too impressive. Whatever method of holding and accessing passwords is used, it is
important that only approved people can access the on-call password and that the identity of the person who does so is correctly recorded.
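As a rough illustration of the checkout rule, here is a minimal in-memory sketch in Python. The class name OnCallVault and its fields are hypothetical, invented for this example; a real password store would need encrypted storage and its own strong authentication, neither of which is shown. The sketch only demonstrates the three points above: approved people only, every access recorded against a name, and every accessed password queued for changing.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class OnCallVault:
    """Toy store for on-call passwords (hypothetical design, not a product)."""
    approved: set        # people allowed to check out a password
    passwords: dict      # server name -> current on-call password
    log: list = field(default_factory=list)            # (who, server, when)
    pending_change: set = field(default_factory=set)   # servers needing a new password

    def checkout(self, who, server):
        # The access decision is made here, not on the server itself.
        if who not in self.approved:
            raise PermissionError(f"{who} is not approved for on-call access")
        password = self.passwords[server]  # KeyError for an unknown server
        # Record the identity of the person taking the password.
        self.log.append((who, server, datetime.now()))
        # An accessed password must be changed the next day.
        self.pending_change.add(server)
        return password

    def rotate(self, server, new_password):
        # Called once the password has been changed on the server.
        self.passwords[server] = new_password
        self.pending_change.discard(server)
```

Anyone reviewing the log can see who took which password and when, and pending_change is the list of passwords to change the next day.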
With this route there's still a basic problem on the efficiency side. The people on call are required to fix problems, at all hours of the night, on systems that they
do not normally log onto and so have no familiarity with. In the unlikely event that the servers are either very well documented or have very standardised builds, this might work. More likely, every server has its quirks, and having a problem worked on by someone unfamiliar with those could add substantially to the time it takes to fix problems and get the production system working again.
There are other methods of compartmentalising, but these mostly restrict who has access to root commands (e.g. with the wheel group, user roles, or sudo). Because of all those security holes, this is rather like locking the stable door after the horse has bolted. To be fair, it's not pointless - not every attacker will have root exploits up his sleeve - but it's certainly not sufficient.
One other compartmentalisation method which is worth implementing is to ensure that users do not have write access to
others' home directories. By default they should not, but it's common for someone to allow world write access for convenience. This allows attackers who break into one account to take control of others too, even before they have root access.
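A check for this is easy to script. The sketch below, in Python, assumes that home directories all sit directly under one parent directory (/home here, which will not be true everywhere) and flags any that are writable by group or world. Group write is included as a cautious choice of my own; on systems where each user has a private group it may be harmless.

```python
import os
import stat


def writable_by_others(path):
    """Return True if group or world write permission is set on path."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))


def risky_home_dirs(home_root="/home"):
    """List home directories under home_root that other users can write to."""
    risky = []
    for entry in sorted(os.listdir(home_root)):
        path = os.path.join(home_root, entry)
        if os.path.isdir(path) and writable_by_others(path):
            risky.append(path)
    return risky


if __name__ == "__main__":
    for path in risky_home_dirs():
        print("writable by others:", path)
```

This only looks at the top-level directory of each account. Files inside it, such as shell start-up files, deserve the same check.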
Whether or not you can compartmentalise your administrators and users, there are other techniques to make it more difficult for attackers to gain command-line user access.
One option is to enforce access by secure shell with public/private key pairs and pass-phrases. This puts a significant barrier in the way of an attacker. Instead of just getting a password, he now has to get a private key and pass-phrase: something I have and something I know. Of course for this to work, password access must be blocked.
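With OpenSSH, blocking password access comes down to the PasswordAuthentication and PubkeyAuthentication keywords in sshd_config. The Python sketch below checks the text of a configuration file for them. It is deliberately simplified and rests on stated assumptions: keywords and values are separated by whitespace, parsing stops at the first Match block, Include files are ignored, and keyboard-interactive authentication (which can also prompt for passwords) is not examined.

```python
def effective_sshd_options(config_text):
    """Return the first value seen for each keyword, lower-cased.

    sshd uses the first value it obtains for a keyword. This sketch
    stops at the first Match block, since those settings are conditional.
    """
    options = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()
        if keyword == "match":
            break
        value = parts[1].strip().lower() if len(parts) > 1 else ""
        options.setdefault(keyword, value)
    return options


def password_logins_blocked(config_text):
    """True if passwords are refused and public keys are accepted."""
    opts = effective_sshd_options(config_text)
    # Both keywords default to "yes" in OpenSSH when not set.
    return (opts.get("passwordauthentication", "yes") == "no"
            and opts.get("pubkeyauthentication", "yes") == "yes")
```

Run across all servers, a check like this shows quickly which boxes still accept a bare password.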
In large environments, public/private key access is tough to manage. First there's the problem of key distribution. If there are 20 administrators and 200 servers, that's 4,000 individual key copies to push out to servers. If someone changes his key, the new key has to be pushed out to all the servers again. On the client side, each administrator's private key must be on each client machine the administrator might use to access servers (or on a network directory mounted onto every client machine). This may include home PCs and laptops that are easily stolen.
There are other problems with using public key access. Enforcing good pass-phrases and
key aging is difficult. Revoking keys is also difficult. If an administrator leaves, his account should be locked or deleted. But because he had root access, he could have hidden his public key in someone else's account to give himself a back door onto the system, or created a fake user account. To some extent this can be checked for. For example,
scripts can flag up accounts with more than one key, but this isn't easy, and is certainly hard work.
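A script of that kind might look like the following Python sketch. It assumes the default key file location, .ssh/authorized_keys under each home directory, and that home directories sit under /home; both are assumptions that vary between sites, and root's own home is not covered. It counts one key per non-blank, non-comment line.

```python
import os


def count_keys(authorized_keys_text):
    """Count key entries: every line that is not blank and not a comment."""
    return sum(
        1 for line in authorized_keys_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    )


def accounts_with_extra_keys(home_root="/home", limit=1):
    """Map account name -> key count for accounts with more than limit keys."""
    flagged = {}
    for user in sorted(os.listdir(home_root)):
        path = os.path.join(home_root, user, ".ssh", "authorized_keys")
        try:
            with open(path, encoding="utf-8", errors="replace") as fh:
                n = count_keys(fh.read())
        except OSError:
            continue  # no key file for this account, or it is unreadable
        if n > limit:
            flagged[user] = n
    return flagged
```

A flagged account is only a prompt for a human to look. An administrator with two legitimate keys will show up just like a planted back door, which is part of why this is hard work.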
This is one situation where a public key infrastructure (PKI) is effective. Having a PKI allows keys to be revoked and can also overcome the key distribution problem. It doesn't make people choose strong pass-phrases, though, and it is certainly non-trivial to implement.
One last thing to consider with the public key solution is the efficiency angle. A critical requirement
for any solution is that administrators and users can get onto any server when they need to. As protocols get
more complex, there are more things that can go wrong and prevent access, and a greater chance that the people using a product won't understand how it works and will break it, or compromise security by using it wrongly. For example, there are various file permissions which prevent secure shell working but do not cause telnet a problem. How confident are you that your solution will always allow the right people on while keeping the wrong people off? If you are not too confident, you may need an alternative method to access the servers.
Two possible server access methods that can be used in emergencies are TCP-wrappered telnet (so people can telnet to a server, but only from a specific other server within the same location, with the plain text password not leaving the computer room) and remote console access, again within a secure computer room.
All of these techniques are helpful in maintaining security. Tomorrow, we'll turn our attention to another weak link -- the people who have command-line access to the systems.
Iain Roberts is a freelance IT consultant with 10 years' experience working in large Unix environments.