'Know Your Enemy': Everything you need to know about honeypots

Honeypots are a relatively new and highly dynamic technology. Because they are so dynamic, it is difficult to define just what they are. Honeypots are unique in that they are not a solution in and of themselves; they do not solve a specific security problem. Instead, they are highly flexible tools with many different information security applications.


“Know Your Enemy (2nd Ed.)” Addison-Wesley, $49.99

This article is excerpted from the recently published book “Know Your Enemy: Learning About Security Threats”.

This contrasts with such technologies as firewalls and intrusion detection systems (IDSs), which are easier to define and understand as they solve specific problems. Firewalls are a prevention technology; they are network or host solutions that keep attackers out. IDSs are a detection technology; their purpose is to detect and alert security professionals about unauthorized or malicious activity. Honeypots are tougher to define because they can be involved in aspects of prevention, detection, information gathering, and much more. For the purpose of this book, we will define a honeypot as follows:

A honeypot is an information system resource whose value lies in unauthorized or illicit use of that resource.

This definition was developed by members of the Honeypot mailing list, a public forum made up of over 5,000 security professionals. The definition was difficult to develop, as honeypots can come in so many different shapes and sizes. As a result, this definition is very broad in scope, as it has to cover many different applications of honeypots. The definition of a honeypot does not indicate how a honeypot works or what its purpose is. Instead, it refers to how a honeypot generates its value. Simply put, honeypots are a technology whose value depends on the bad guys interacting with them. All honeypots work on the same concept: nobody should be using or interacting with them, so any transactions or interactions with a honeypot are by definition unauthorized.

A honeypot has no value as a production-oriented component of an information infrastructure; it performs no real productive service. Any transactions processed, any logins attempted, or any data files accessed on a honeypot are most likely malicious or unauthorized activities. For example, a honeypot system can be deployed on an internal network. This honeypot would have no production value and no one in the organization should be using it. It could appear to be a file server, a web server, or even an employee’s workstation. If someone interacts with that system, they are most likely committing some unauthorized or malicious activity.

In fact, a honeypot does not even have to be a computer. It can be any type of digital entity (often called a honeytoken) that has no production value. For example, a hospital could create a false set of electronic patient records labeled George W. Bush. Because these records are honeypots, nobody should be accessing or interacting with them. These records could then be implanted into a hospital’s patient database as a honeypot component. If any employee or attacker attempted to access these records, this would indicate unauthorized activity because no one should be using these records. If anyone or anything accesses the records, they could also generate an alert. It is the very simplicity of this concept that gives honeypots their tremendous advantages (and disadvantages).
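
The honeytoken idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `HoneytokenMonitor`, the record identifiers, and the alert format are hypothetical, and a real deployment would hook into the database's own audit mechanism rather than an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class HoneytokenMonitor:
    """Flags any access to records that exist only as bait (hypothetical helper)."""
    honeytoken_ids: set
    alerts: list = field(default_factory=list)

    def record_access(self, user: str, record_id: str) -> bool:
        """Return True (and store an alert) if the accessed record is a honeytoken."""
        if record_id in self.honeytoken_ids:
            self.alerts.append(
                f"ALERT: {user} accessed honeytoken record {record_id}"
            )
            return True
        return False


# "patient-0000" is an invented identifier standing in for the false record.
monitor = HoneytokenMonitor(honeytoken_ids={"patient-0000"})
monitor.record_access("nurse_a", "patient-1042")   # ordinary record, no alert
monitor.record_access("intruder", "patient-0000")  # bait record, alert stored
print(monitor.alerts)
```

Because nobody has a legitimate reason to touch the bait record, the check needs no signature or baseline: membership in the honeytoken set is the whole detection rule.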

Advantages and disadvantages

Honeypots collect only small data sets. Honeypots only collect data when someone or something is interacting with them. As a result, honeypots collect very small sets of data, although it is extremely valuable data. Organizations that log thousands of alerts a day may log only a hundred alerts with honeypots. This makes the data honeypots collect much easier to manage and analyze.
Honeypots reduce false positives. One of the greatest challenges of most detection technologies is that they generate false positives or false alerts. It’s similar to the problem of car alarms. To stop cars from being stolen, owners install alarms in them to trigger whenever someone attempts to break in or steal the vehicle. The problem is, these alarms are falsely triggered (a false positive) so often that people simply ignore them. Think about it: what do you do when you are walking in the parking lot and you hear a car alarm? Most likely, nothing. Many detection technologies today face the same problem. The larger the probability that a security technology produces a false positive, the less likely the technology will be useful. Honeypots dramatically reduce false positives simply because almost any activity with honeypots is by definition unauthorized, making honeypots extremely efficient at detecting attacks.
Honeypots can catch false negatives. Another challenge inherent in traditional detection technologies is that they often fail to detect unknown attacks. This is a critical difference between honeypots and traditional computer security technologies that rely upon known signatures or statistical detection. Signature-based security technologies by definition imply that “someone is going to get hurt” before the new attack is discovered and a signature is distributed. Statistical detection also suffers from probabilistic failures: there is some non-zero probability that a new kind of attack is going to go undetected. Honeypots, on the other hand, are designed to identify and capture new attacks against them. Any activity with the honeypot is an anomaly, making new or unseen attacks stand out.
Honeypots capture encrypted activity. Even if an attack is encrypted, honeypots can capture the activity. As more organizations adopt encryption (such as secure shell [SSH], IP Security Protocol [IPsec], and Secure Sockets Layer [SSL]) within their environments, this becomes a major issue. Honeypots can do this because the encrypted probes and attacks interact with the honeypot as an end point, where the activity is decrypted by the honeypot.
Honeypots work with IPv6. Most honeypots work in any IP environment, regardless of the IP protocol, including IPv6. IPv6 is the new Internet Protocol (IP) standard that many organizations, such as the Department of Defense, and many countries, such as Japan, are actively adopting. Many current technologies, such as firewalls and intrusion detection system sensors, are not adapted well for IPv6.
Honeypots are highly flexible. Honeypots are extremely adaptable and can be used in a variety of environments, everything from a social security number embedded into a database to an entire network of computers designed to be broken into. It is the ability to customize honeypots that allows them to do what few other technologies can: gather extensive information, especially against insider threats.
Honeypots require minimal resources. Even on the largest of networks, honeypots require minimal resources. A simple, aging Pentium computer can monitor literally millions of IP addresses, or an OC-12 network.

Like any other technology, honeypots also have disadvantages. They are not designed to replace any technologies. Instead, they add value by working with existing technologies. As a honeynet is nothing more than one type of honeypot, honeynets also share the following disadvantages:

Honeypots have a limited field of view. Honeypots see only what interacts with them. They do not see attacks against or interactions with other systems. While this can be an advantage, it can also be a disadvantage. A honeypot will not tell you that another system has been compromised, unless that compromised system interacts with the honeypot. To address this disadvantage, there are a variety of measures you can take to direct attackers’ activities to honeypots, such as the use of honeytokens, redirection, and so on.
Risk. Any time you deploy a new technology, that technology introduces risk, specifically the risk of an attacker taking over that system and using it as a launching pad for other attacks against internal or external targets. Even IDS solutions that have no IP stack assigned to them can be at risk (sniffers such as Snort and Snoop have been vulnerable to remote attacks). Honeypots are no exception. Different honeypots have different levels of risk, with various ways to mitigate that risk. Of all the different types of honeypots, honeynets have the greatest level of risk.

Types of honeypots

To better understand honeypots, we can divide them into two general categories: low interaction and high interaction. Interaction is the amount of activity a honeypot allows an attacker to have with that honeypot. The more interaction a honeypot allows, the more an attacker can do with the honeypot and the more you can learn. However, the more the attacker can do, the greater the risk. Low-interaction honeypots allow for a limited amount of interaction, whereas high-interaction honeypots allow for an extensive amount of interaction. While these categories are general in nature, they help us better understand the capabilities and limitations of the honeypots we are dealing with.

Low-interaction honeypots

Low-interaction honeypots work primarily by emulating systems and services. Attackers’ activities are confined to what the emulated services allow. For example, the BackOfficer Friendly honeypot shown in Figure 2-1 is an extremely simple honeypot that emulates seven different services. Attackers are very limited in what they can do with the honeypot based on the emulated services. At the most, attackers can connect to the honeypot and issue a few basic commands.

Low-interaction honeypots tend to be easier to deploy as they usually come preconfigured with a variety of options for the administrator. You merely have to point and click, and you instantly have a honeypot with the operating system, services, and behavior you want, as we see in the interface for Specter, shown in Figure 2-2. Specter is a commercial honeypot designed to run on Windows. It can emulate up to 13 different operating systems and monitors 14 different services. User interfaces make deploying honeypots very simple, as you merely have to click on the services you want monitored and how you want the honeypot to behave.

Low-interaction honeypots also have minimal risk, as the emulated services contain the attacker, limiting what they can do. There is no real operating system for the attacker to upload toolkits to, nor are there any services they should be able to actually break into.

However, emulated services are also limited in the amount of information they can capture, as attackers have limits on what they can do. Also, emulated services work best with known behavior or expected attacks. When attackers do something unknown or unexpected, low-interaction honeypots have difficulty understanding the attacker’s actions, responding properly, or capturing the activity. Some examples of low-interaction honeypots include Honeyd, Specter, and KFSensor. To better understand how a low-interaction honeypot works, let’s take a quick look at the Honeyd honeypot.

Low-interaction honeypot example: Honeyd

Honeyd is an open source honeypot that was developed by Niels Provos and was first released in April 2002. As an open source solution, Honeyd is free to use and provides users with full access to its source code. Developed and designed for UNIX, Honeyd has also been ported to Windows. However, the Windows port lacks many of the features the UNIX version has. Honeyd is a low-interaction honeypot: you install the software on a computer, and the software then emulates hundreds of different operating systems and services, as is typical of most low-interaction solutions. By editing the configuration file, you determine which IP addresses Honeyd will monitor, the types of operating systems it will emulate, and the services it will emulate.

For example, you can tell Honeyd to emulate a Linux 2.4.14 kernel system with an emulated File Transfer Protocol (FTP) server listening on port 21. If attackers probe the honeypot, they will believe they are interacting with a Linux system. If attackers connect to the FTP service, they will be deceived into thinking they are interacting with the wu-ftpd service. The emulated script behaves in many of the same ways a real wu-ftpd service would behave, logging all of the attacker’s activities. However, the script is nothing more than a program that expects specific input from the attacker and then returns a predetermined output. If the attacker does something the emulated script is not programmed to react to, the script merely returns an error message.
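
The expect-specific-input, return-predetermined-output behavior described above can be sketched as a small Python function. This is not Honeyd's code, only a rough analogue: the function name `ftp_respond`, the `state` dictionary, and the wording of the fallback `500` reply are assumptions made for illustration.

```python
def ftp_respond(line: str, state: dict) -> str:
    """Return a canned FTP reply for one command line and log the raw input."""
    state.setdefault("log", []).append(line)  # record everything the client sends
    parts = line.strip().split(maxsplit=1)
    command = parts[0].upper() if parts else ""
    argument = parts[1] if len(parts) > 1 else ""

    if command == "USER":
        if argument.upper() == "ANONYMOUS":
            state["auth"] = "ANONYMOUS"
            return "331 Guest login ok, send e-mail as password.\r\n"
        state["auth"] = argument
        return f"331 Password required for {argument}\r\n"
    if command == "SYST":
        return "215 UNIX Type: L8\r\n"
    if command == "QUIT":
        return "221 Goodbye.\r\n"
    # Anything the emulation was not written to handle gets an error reply
    # (the exact text here is an assumption, not taken from wu-ftpd).
    return "500 Command not understood.\r\n"


session = {}
print(repr(ftp_respond("USER anonymous", session)))
print(repr(ftp_respond("XYZZY", session)))
```

The sketch also shows the limitation the text goes on to describe: an input outside the fixed set of branches yields only an error reply, so nothing is learned about what the attacker intended beyond the logged line.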

The following is some of the source code of the emulated wu-ftpd service script that comes with Honeyd.

    QUIT* )
        echo -e "221 Goodbye.\r"
        exit 0;;
    SYST* )
        echo -e "215 UNIX Type: L8\r"
        ;;
    HELP* )
        echo -e "214-The following commands are recognized.\r"
        echo -e "   USER    PORT    STOR    MSAM*   RNTO    NLST    MKD\r"
        echo -e "   PASS    PASV    APPE    MRSQ*   ABOR    SITE    XMKD\r"
        echo -e "   ACCT*   TYPE    MLFL*   MRCP*   DELE    SYST    RMD\r"
        echo -e "   SMNT*   STRU    MAIL*   ALLO    CWD     STAT    XRMD\r"
        echo -e "   REIN*   MODE    MSND*   REST    XCWD    HELP    PWD\r"
        echo -e "   QUIT    RETR    MSOM*   RNFR    LIST    NOOP    XPWD\r"
        echo -e "214 Direct comments to ftp@$domain.\r"
        ;;
    USER* )
        parm1_nocase=`echo $parm1 | gawk '{print toupper($0);}'`
        if [ "$parm1_nocase" == "ANONYMOUS" ]
        then
            echo -e "331 Guest login ok, send e-mail as password.\r"
            AUTH="ANONYMOUS"
        else
            echo -e "331 Password required for $parm1\r"
            AUTH=$parm1
        fi
        ;;


Notice how in the script, Honeyd expects specific input and then has predetermined responses to that input. If the emulated FTP service gets input it does not expect, it returns an error message. Honeyd includes several features not common to many low-interaction honeypots. First, not only does it emulate operating systems by modifying the behavior of emulated services, it also emulates operating systems at the IP stack level. If an attacker uses active fingerprinting methods (such as security scanning tools Nmap or Xprobe), Honeyd responds at the IP stack level as whatever operating system you want. In addition, unlike most low-interaction honeypots, Honeyd can monitor literally millions of IP addresses. Honeyd does this not by monitoring the IP address of the computer it's installed on; instead, it monitors all of the unused IP addresses in your network. When Honeyd identifies a connection attempt to an unused IP, it intercepts that attempt, dynamically assumes the identity of the victim, and then interacts with the attacker. This capability dramatically increases Honeyd's chances of interacting with an attacker.
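
The decision to answer only for unused addresses can be illustrated with Python's standard `ipaddress` module. This is a sketch of the idea, not of how Honeyd implements it: the function name `should_answer`, the hard-coded list of live hosts, and the documentation-range addresses are all assumptions, and a real deployment also needs a way to get traffic for unused addresses delivered to the honeypot host.

```python
import ipaddress


def should_answer(dst: str, network: str, live_hosts: set) -> bool:
    """True if dst lies in the monitored network but no real host owns it."""
    addr = ipaddress.ip_address(dst)
    in_network = addr in ipaddress.ip_network(network)
    owned = addr in {ipaddress.ip_address(h) for h in live_hosts}
    return in_network and not owned


# Hypothetical inventory of production systems on the monitored network.
live = {"192.0.2.1", "192.0.2.10"}

for dst in ("192.0.2.10", "192.0.2.77", "198.51.100.5"):
    print(dst, should_answer(dst, "192.0.2.0/24", live))
```

On a sparsely populated network most addresses are unused, which is why covering them all raises the odds that a scan reaches the honeypot.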


High-interaction honeypots

High-interaction honeypots are very different from low-interaction honeypots as they provide entire operating systems and applications for attackers to interact with. High-interaction honeypots do not emulate; instead, they are real computers with real applications to be broken into. The advantages provided by high-interaction honeypots are tremendous. For one, they are designed to capture extensive amounts of information. Not only can they detect attackers probing a system, they also allow attackers to break into the service and gain access to the operating system. You can then capture the attackers' rootkits as they upload them onto the systems, analyze their keystrokes as they interact with the computer, and monitor their communications as they talk with other attackers. As a result, you can learn attackers' motives, skill levels, organization, and other critical information.


Also, since high-interaction honeypots do not emulate, they are designed to capture new, unknown, or unexpected behavior. Time and time again, high-interaction honeypots have demonstrated the capability to capture new activity, everything from nonstandard IP protocols used for covert command channels to tunneling IPv6 in IPv4 environments to hide communications. However, these tremendous capabilities come at a price. First, high-interaction honeypots pose a high level of risk. Since attackers are provided real operating systems to interact with, these same honeypots can be used to attack or harm other non-honeypot systems. Second, high-interaction honeypots are complex. You don't simply install software and instantly have a honeypot. Instead, you need to build and configure real systems for the attackers to interact with. Also, a great deal of complexity is added as you attempt to minimize the risk of attackers using your honeypots to harm or attack other people.


Two examples of high-interaction honeypots are Symantec's Decoy Server and honeynets. As this entire book is dedicated to honeynets, we will not discuss them in this chapter. However, to give you a better idea of high-interaction honeypots, we will spend a moment discussing Decoy Server.

High-interaction honeypot example: Symantec Decoy Server

Decoy Server is a commercial honeypot sold by Symantec. As a high-interaction honeypot, Decoy Server does not emulate operating systems or services. Instead, it creates real systems and real applications for attackers to interact with. Currently, Decoy Server works only on the Solaris operating system, both SPARC and Intel platforms. Decoy Server is a software program that is installed on an existing Solaris computer. The software then takes the existing host system and creates up to four identical "cages," each cage being a honeypot. Each cage has a separate operating system with its own file system. Attackers interact with the cages just as they would with real operating systems. What attackers don't realize is that their every action and keystroke is being logged and recorded by the honeypot. Figure 2-3 shows a logical diagram of how this technology works.

[Figure 2-3: Cage 1, Cage 2, Cage 3, and Cage 4 running on top of the host operating system]

Low-interaction vs. high-interaction honeypots

Keep in mind when choosing low-interaction or high-interaction honeypots that no one type of honeypot is better than the other. Each type of honeypot has its own unique advantages and disadvantages. Different organizations have different goals and therefore use different honeypots. One common trend is that, in general, commercial organizations (such as banks, manufacturing, or retail stores) prefer low-interaction honeypots as they are low risk, easy to deploy, and simple to maintain. High-interaction honeypots are more common among organizations that need the unique capabilities of high-interaction solutions and can manage the risk, such as military, government, and educational organizations. Table 2-1 compares the advantages and disadvantages of low- and high-interaction honeypots.

Table 2-1: Advantages and disadvantages of low-interaction and high-interaction honeypots

Low-interaction honeypots (emulate operating systems and services)
Easy to install and deploy; usually requires simply installing and configuring software on a computer
Minimal risk, as the emulated services control what attackers can and cannot do
Captures limited amount of information, mainly transactional data and some limited interaction

High-interaction honeypots (no emulation; provide real operating systems and services)
Can capture far more information than can low-interaction honeypots, including new tools, communications, or attacker keystrokes
Can be complex to install or deploy (commercial versions tend to be much simpler)
Increased risk, as attackers are provided real operating systems to interact with


Uses of honeypots

You now know that honeypots are extremely flexible tools that can be used for a variety of purposes. Think of them as tools in your security arsenal; you can use them however they best fit your needs. In general, we can break down a honeypot's value into two broad categories: production and research. Typically, low-interaction honeypots are used for production purposes, whereas high-interaction honeypots are used for research purposes. However, either type of honeypot can be used for either purpose. Once again, neither purpose is better than the other. These categories simply help you identify what you are attempting to achieve with your honeypot. When used for production purposes, honeypots can protect organizations in one of three ways: by preventing attacks, detecting attacks, and responding to attacks. When used for research purposes, honeypots collect information. This information provides different value to different organizations. Some organizations may want to study trends in attacker activity, whereas others may be interested in early warning and prediction or law enforcement. Let's take a more in-depth look at how a honeypot can work for you.

Preventing attacks

Honeypots can help prevent attacks in several ways. For one, honeypots can prevent automated attacks, such as those launched by worms or auto-rooters. These attacks are based on tools that randomly scan entire networks looking for vulnerable systems. If vulnerable systems are found, these automated tools then attack and take over the system (with worms self-replicating, or copying themselves, to the victim). One way that honeypots can help defend against such attacks is by slowing the scanning process, potentially even stopping it. Called "sticky honeypots," these solutions monitor unused IP space. When probed by such scanning activity, the honeypots interact with and slow the attacker. They do this using a variety of Transmission Control Protocol (TCP) tricks, such as advertising a TCP window size of zero, which puts the attacker into a holding pattern. This is excellent for slowing down or preventing the spread of a worm that has penetrated your internal organization. One such example of a sticky honeypot is LaBrea Tarpit. Sticky honeypots are most often low-interaction solutions (you can almost call them "no-interaction solutions," as they slow the attacker down to a crawl).
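
The zero-window trick mentioned above relies on the 16-bit window field in the TCP header, which tells the peer how much data it may send. The following Python sketch only packs and inspects such a header; it does not send packets, leaves the checksum at zero, and is not a description of how any particular sticky honeypot is implemented. The helper names and port numbers are hypothetical.

```python
import struct

TCP_SYN = 0x02
TCP_ACK = 0x10


def build_tcp_header(src_port, dst_port, seq, ack, flags, window):
    """Pack a 20-byte TCP header without options; checksum and urgent are zero."""
    data_offset = 5 << 4  # header length of five 32-bit words in the upper 4 bits
    return struct.pack(
        "!HHIIBBHHH",
        src_port, dst_port, seq, ack,
        data_offset, flags, window,
        0,  # checksum (not computed in this sketch)
        0,  # urgent pointer
    )


def parse_window(header: bytes) -> int:
    """Read the advertised window, which sits at byte offset 14 of the header."""
    return struct.unpack("!H", header[14:16])[0]


# A SYN/ACK reply that accepts the connection but advertises no receive space.
reply = build_tcp_header(80, 40000, seq=1000, ack=5001,
                         flags=TCP_SYN | TCP_ACK, window=0)
print(len(reply), parse_window(reply))
```

A sender that sees a window of zero must wait before transmitting data, which is the holding pattern the text refers to.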


You can also use honeypots to protect your organization from human (that is, non-automated) attacks. The concept is based on deception or deterrence. The idea is to confuse attackers, making them waste their time and resources interacting with honeypots. Meanwhile, your organization is able to detect the attacker's activity and has the time to respond and stop it. This can be taken one step further. If attackers know your organization is using honeypots but they do not know which systems are honeypots and which systems are legitimate computers, they may be so concerned about being caught by honeypots that they decide not to attack your organization. Thus, the honeypot deters attackers. An example of a honeypot designed to do this is Deception Toolkit, a low-interaction honeypot.

Detecting attacks

Another way in which honeypots can protect an organization is through detection. Detection is critical as it identifies a failure or breakdown in prevention. Regardless of how secure an organization is, there will always be failures, if for no other reason than humans are involved in the security process. By detecting attacks, you can quickly react to them, stopping or mitigating the damage they do.


Detection has traditionally proven to be an extremely difficult activity. Technologies such as intrusion detection system sensors and system logs have proven ineffective for several reasons: they generate far too much data and a large percentage of false positives, they are unable to detect new attacks, and they are unable to work in encrypted or IPv6 environments. Honeypots address many of these traditional detection problems, reducing false positives by capturing small data sets of high value, capturing unknown attacks such as new exploits or polymorphic shellcode, and working in encrypted and IPv6 environments. You can learn more about this in the paper "Honeypots: Simple, Cost Effective Detection" (Spitzner 2003). In general, low-interaction honeypots make the best solutions for detection. They are easier to deploy and maintain than high-interaction honeypots and have less risk.

Responding to attacks



Honeypots can also help protect organizations by responding to attacks. Once an organization has detected a failure, how should it respond? This can often be one of the greatest challenges organizations face. There is often little information on who the attackers are, how they got in, or how much damage they have done. In these situations, detailed information on the attacker's activities is critical. There are two problems compounding incident response. First, the very systems compromised often cannot be taken offline to be analyzed. Production systems, such as an organization's mail server, are so critical that even though the system has been hacked, security professionals may not be able to take the system down and do a proper forensic analysis on it. Instead, they are limited to analyzing the live system while still providing production services. This makes it difficult to determine what happened, how much damage the attacker has done, and whether the attacker has broken into other systems.


Another problem is that even if the system is taken offline, there is often so much data pollution that it can be very difficult to determine what the attackers did. By data pollution, I mean that there has been so much activity (users logging in, mail accounts read, files written to databases, and so on) that it can be difficult to determine what is normal day-to-day activity and what are the attacker's actions.


Honeypots can help address both problems as they can quickly and easily be taken offline for a full forensic analysis without impacting day-to-day business operations. Also, because the only activity a honeypot captures is unauthorized or malicious activity, this makes hacked honeypots much easier to analyze than hacked production systems, as any data you retrieve from a honeypot is most likely related to the attacker. The value honeypots provide is thus that they are able to quickly give organizations the in-depth information they need to rapidly and effectively respond to an incident. In general, high-interaction honeypots make the best solution for response purposes. To respond to intruders, you need in-depth knowledge on what they did, how they broke in, and what tools they used. For that type of data you most likely need the capabilities of a high-interaction honeypot.

Using honeypots for research purposes

As noted earlier, honeypots can also be used for research purposes, to gain extensive information on threats, information few other technologies are capable of gathering. One of the greatest problems security professionals face is a lack of information or intelligence on cyber threats. How can your organization defend itself against an enemy when you don't even know who that enemy is? Research honeypots address this problem by collecting information on threats. Organizations can then use this information for a variety of purposes, including analyzing trends, identifying new tools or methods, identifying attackers and their communities, ensuring early warning and prediction, or understanding attackers' motivations.


By now, you should have a better understanding of what honeypots are, how they can be used, how powerful they can be, and what advantages and disadvantages are inherent in their use. From this point on, we will focus only on honeynets, which are nothing more than one type of honeypot. If you want to learn more about other honeypots, consider the book "Honeypots: Tracking Hackers" (Spitzner 2003). This is the first and only book dedicated entirely to honeypot technologies.

Lance Spitzner has had a longtime interest in enterprise security and is particularly passionate about researching honeypot technologies. He is also the author of "Honeypots: Tracking Hackers" (Addison-Wesley). This excerpt is taken from "Know Your Enemy: Learning About Security Threats" (Second edition, Addison-Wesley).
