July 13, 2007

Where are your site visitors? GeoIP knows

Author: Murthy Raju

If you maintain a portal, ecommerce site, or heavily trafficked Web site, you might appreciate the ability to identify the geographical location of your site visitors. Geolocation information can help you localize content, serve relevant local advertisements, offer a download mirror close to visitors, and detect online fraud. Techniques like whois lookup of IP addresses are of some help, but they don't always find accurate locations. A better approach is a database that maps each IP address to a location -- such as MaxMind's GeoIP.

GeoIP is a set of databases that map IP addresses to country, city, and Internet service provider (ISP). Bundled with the data are LGPLed APIs for accessing those databases using C, PHP, Java, and a few other languages. MaxMind has released lite versions of the databases under an Open Data License, and offers databases enhanced with information from additional sources under a commercial licensing model.

GeoIP's two databases are called GeoIP Country and GeoIP City. You can download their free lite versions from MaxMind as either CSV or binary files. CSV files are handy if you have complex requirements and need custom routines to access the data; if not, you're better off with the ready-to-use binary database files for use with MaxMind's APIs. MaxMind also offers a dynamic shared module for Apache, mod_geoip, which enables Apache to query the location databases to identify the location of visitors.
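As a rough illustration of the kind of custom routine the CSV files allow, the following PHP sketch finds the range containing an address by binary search. It assumes the six-column layout of the lite country CSV (start IP, end IP, start number, end number, country code, country name). The function names load_ranges() and lookup_country() are made up for this example, and the two sample rows use reserved documentation address blocks with placeholder country codes, not real data.

```php
<?php
// Sketch only: function names and sample rows are hypothetical.

/**
 * Parse CSV text into [start, end, code, name] rows sorted by start number.
 * Assumes six columns: start IP, end IP, start number, end number, code, name.
 */
function load_ranges(string $csv_text): array
{
    $ranges = [];
    foreach (preg_split('/\R/', trim($csv_text)) as $line) {
        $f = str_getcsv($line, ',', '"', '');
        if (count($f) < 6) {
            continue; // skip malformed rows
        }
        $ranges[] = [(int) $f[2], (int) $f[3], $f[4], $f[5]];
    }
    usort($ranges, fn($a, $b) => $a[0] <=> $b[0]);
    return $ranges;
}

/** Binary-search the sorted ranges; returns a country code or null. */
function lookup_country(array $ranges, string $ip): ?string
{
    // ip2long() returns false for anything that is not a valid IPv4 address.
    // A 64-bit PHP build is assumed, so the result is never negative.
    $n = ip2long($ip);
    if ($n === false) {
        return null;
    }
    $lo = 0;
    $hi = count($ranges) - 1;
    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        [$start, $end, $code] = $ranges[$mid];
        if ($n < $start) {
            $hi = $mid - 1;
        } elseif ($n > $end) {
            $lo = $mid + 1;
        } else {
            return $code;
        }
    }
    return null;
}

// Invented sample rows (documentation address blocks, placeholder codes).
$sample_csv = <<<'CSV'
"192.0.2.0","192.0.2.255","3221225984","3221226239","AA","Example Country A"
"198.51.100.0","198.51.100.255","3325256704","3325256959","ZZ","Example Country Z"
CSV;

$ranges = load_ranges($sample_csv);
echo lookup_country($ranges, "192.0.2.77"), "\n"; // prints "AA"
```

For anything beyond an experiment, the binary database files and MaxMind's APIs remain the simpler route, since they avoid loading and sorting the whole CSV on every request.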

You can install mod_geoip from an RPM file or compile it from source. You need the GeoIP C API installed to compile mod_geoip. The C API has the country database file in binary form bundled with it. To compile the mod_geoip source archive and install the shared module, use the apxs command:

apxs -i -a -L/usr/local/lib -I/usr/local/include -lGeoIP -c mod_geoip.c

After installing the module, add the following lines to the Apache configuration file /etc/httpd/conf/httpd.conf, or to the module-specific configuration file /etc/httpd/conf.d/mod_geoip.conf, to load the module into Apache and enable it:

LoadModule geoip_module modules/mod_geoip.so
<IfModule mod_geoip.c>
  GeoIPEnable On
  GeoIPDBFile /usr/local/share/GeoIP/GeoIPCountry.dat
</IfModule>

Reload the Apache configuration using the command /etc/rc.d/init.d/httpd reload after making the changes.

By default, a mod_geoip-enabled Apache sets two values, GEOIP_COUNTRY_CODE and GEOIP_COUNTRY_NAME, both as environment variables and as Apache notes table entries. You can use these variables or notes table entries in the configuration directives for other modules, such as mod_rewrite. That lets you do things like redirect visitors from Russia to a corresponding page on a Russia-specific site, by adding this code to /etc/httpd/conf/httpd.conf or to /etc/httpd/conf.d/mod_geoip.conf:

RewriteEngine on
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^RU$
RewriteRule ^(.*)$ http://ru.example.com$1 [R,L]

While you can perform simple tasks, such as redirecting, using Apache configuration, you're better off doing more complex handling based on GeoIP inside your Web application code written in PHP or in another language. For instance, you can include the country code of a visitor in the URL of an ad-serving script to serve a country-specific advertisement in your PHP code:

$visitor_country_code = $_SERVER["GEOIP_COUNTRY_CODE"];
$advertisement_url = "/path/to/advertisement_script.php?targetCountryCode=$visitor_country_code";
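The variable may be missing, for example when the module is not loaded, so a slightly more defensive version may be worth considering. The sketch below is one way to do it, not a prescribed pattern: the helper name ad_url_for_country() and the fallback code "US" are illustrative choices and are not part of mod_geoip.

```php
<?php
// Sketch only: ad_url_for_country() and the "US" fallback are hypothetical.

/** Build the ad script URL from a $_SERVER-style array. */
function ad_url_for_country(array $server, string $fallback = "US"): string
{
    $code = strtoupper(trim($server["GEOIP_COUNTRY_CODE"] ?? ""));
    // Accept only two ASCII letters or digits; anything else gets the fallback.
    if (!preg_match('/^[A-Z0-9]{2}$/', $code)) {
        $code = $fallback;
    }
    return "/path/to/advertisement_script.php?targetCountryCode=" . rawurlencode($code);
}

$advertisement_url = ad_url_for_country($_SERVER);
echo $advertisement_url, "\n";
```

Passing the array in as a parameter, rather than reading $_SERVER inside the function, keeps the helper easy to exercise with made-up values.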

PHP libraries

While the above approach of using Apache for location lookups gives good performance, you may find a PHP-only approach much easier to implement if you are on a shared hosting setup. A PHP library for GeoIP, called geoip.inc, makes it easy to access GeoIP databases from within PHP without relying on Apache.

You can install the library simply by placing the file geoip.inc somewhere in PHP's include_path. The library provides a class called GeoIP and a set of functions for using GeoIP objects. For instance, you can place the following code in your PHP script to display the visitor's country name:

include("geoip.inc");
$gi = geoip_open("/usr/share/GeoIP/GeoIP.dat", GEOIP_STANDARD);
$visitor_ip_address = $_SERVER["REMOTE_ADDR"];
$visitor_country_name = geoip_country_name_by_addr($gi, $visitor_ip_address);
geoip_close($gi);
echo $visitor_country_name;

Another option is the GeoIP PECL extension, which adds location lookup capability to PHP itself.

Outthinking ISPs and proxies

ISPs such as AOL that have operations across multiple countries pose difficulties for location software because of the way they route their traffic. Commercial versions of the GeoIP databases promise better identification of geographical location, especially with such ISPs. In its commercial version, GeoIP City attempts a more precise mapping by including data from additional sources in the database. Both the lite and commercial versions of the database provide location-related information such as city, longitude, latitude, region, and postal code.
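The latitude and longitude fields lend themselves to tasks such as picking a download mirror near the visitor. The following PHP sketch assumes the coordinates have already been read from a GeoIP City record. The function names, mirror hostnames, and coordinates are invented for illustration, and the distance is computed with the standard haversine formula on a spherical Earth, which is only an approximation.

```php
<?php
// Sketch only: mirror hostnames and all function names are hypothetical.

/** Great-circle distance in kilometres between two points (haversine). */
function haversine_km(float $lat1, float $lon1, float $lat2, float $lon2): float
{
    $r = 6371.0; // mean Earth radius in km
    $dlat = deg2rad($lat2 - $lat1);
    $dlon = deg2rad($lon2 - $lon1);
    $a = sin($dlat / 2) ** 2
       + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * sin($dlon / 2) ** 2;
    // min() guards against rounding pushing sqrt($a) slightly above 1.
    return 2 * $r * asin(min(1.0, sqrt($a)));
}

/** Return the hostname of the mirror closest to the given coordinates. */
function nearest_mirror(array $mirrors, float $lat, float $lon): ?string
{
    $best = null;
    $best_km = INF;
    foreach ($mirrors as $host => [$mlat, $mlon]) {
        $km = haversine_km($lat, $lon, $mlat, $mlon);
        if ($km < $best_km) {
            $best = $host;
            $best_km = $km;
        }
    }
    return $best;
}

// Invented mirrors with approximate city coordinates (latitude, longitude).
$mirrors = [
    "eu.mirror.example.com" => [52.52, 13.40],  // roughly Berlin
    "us.mirror.example.com" => [40.71, -74.01], // roughly New York
    "in.mirror.example.com" => [12.97, 77.59],  // roughly Bangalore
];

// Visitor coordinates as they might come from a City lookup (roughly Moscow).
echo nearest_mirror($mirrors, 55.75, 37.62), "\n"; // prints "eu.mirror.example.com"
```

Geographic distance is only a stand-in for network distance, so a real mirror selector would probably weigh other signals as well.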

Sometimes, users may route their HTTP requests through an open anonymous proxy server to mask their actual location or to circumvent local access restrictions. That can cause incorrect identification of users' locations. GeoIP proxy detection addresses this issue by identifying that the connection has come from a known public anonymous proxy server. Other related tools from MaxMind are GeoIP Organization, to find the organization that uses a given IP address; GeoIP ISP, to identify which ISP the visitor is coming from; and GeoIP Netspeed, to get an idea of the speed of the visitor's Internet connection.

Location information about your site visitors can be a useful aid in customizing your Web content and refining your traffic analysis. The availability of GeoIP databases and APIs under open source licenses increases the reach of geolocation technology for the Web.

