Understanding LDAP Data
LDAP is a protocol for interacting with data. LDAP views data as a hierarchical collection of objects, each of which has one or more attributes. This view works well in many environments, because most environments store facts (attributes) about people, places, or things (objects). Furthermore, these objects generally fall under some sort of top-level managerial or human resources domain, like "accounting," "legal," or "inventory." But how do we map this theoretical babble to something that LDAP can actually understand? How do we tell LDAP that "Tom" is a person, and how do we tell our systems how to grab Tom's password attribute?
The answer lies in schemas. We touched on schema files in our
last article during the configuration of our OpenLDAP server. Schema files are
imported by the slapd process at startup, and they define which objects are
supported by the LDAP directory. The OpenLDAP tarball you unpacked in our last discussion includes many of the most popular and useful schemas, so you won't have to grep the entire Internet to find what you're looking for. Just cd /etc/openldap/schema and you can peruse some of the schemas available. No matter what problem you're trying to get LDAP to solve, there is probably already a schema
available tailored to the task.
To get a taste of how schemas work, let's have a look at just one object definition. This one is from the standard nis.schema file, which comes with OpenLDAP:
objectclass ( 1.3.6.1.1.1.2.0 NAME 'posixAccount' SUP top AUXILIARY
    DESC 'Abstraction of an account with POSIX attributes'
    MUST ( cn $ uid $ uidNumber $ gidNumber $ homeDirectory )
    MAY ( userPassword $ loginShell $ gecos $ description )
    )
This is the object definition of the posixAccount object. If you include
nis.schema in your slapd.conf file, then you can define objects of this type
to store in your directory, which we'll do in a minute. First let's understand
what this object definition is telling us.
The DESC line is self-explanatory, though it isn't always as helpful as you
might like. The MUST line
consists of a list (separated by dollar signs) of required attributes that every
posixAccount object must have associated with it. The MAY line is a similar list, but these attributes are optional: they are allowed, not required.
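To make the MUST and MAY lists concrete, here is a minimal Python sketch that pulls both lists out of the definition quoted above and reports which required attributes a candidate entry lacks. The helper names (attribute_list, missing_required) are made up for this illustration, and the regular expression only handles the parenthesized, dollar-separated form shown here; it is not a general schema parser.

```python
import re

# The posixAccount definition quoted above, held as one string.
POSIX_ACCOUNT = """
objectclass ( 1.3.6.1.1.1.2.0 NAME 'posixAccount' SUP top AUXILIARY
    DESC 'Abstraction of an account with POSIX attributes'
    MUST ( cn $ uid $ uidNumber $ gidNumber $ homeDirectory )
    MAY ( userPassword $ loginShell $ gecos $ description )
    )
"""


def attribute_list(definition, keyword):
    """Return the attribute names listed after MUST or MAY.

    Hypothetical helper: it only understands the "( a $ b $ c )" form.
    """
    match = re.search(r"\b" + keyword + r"\s+\(([^)]*)\)", definition)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split("$")]


def missing_required(entry, definition):
    """Return the MUST attributes that the entry (a dict) does not carry."""
    return [name for name in attribute_list(definition, "MUST")
            if name not in entry]


if __name__ == "__main__":
    print("required:", attribute_list(POSIX_ACCOUNT, "MUST"))
    print("optional:", attribute_list(POSIX_ACCOUNT, "MAY"))
    # An incomplete entry: three required attributes are absent.
    print("missing: ", missing_required({"cn": "tom", "uid": "tom"},
                                        POSIX_ACCOUNT))
```

The directory server performs this kind of check itself when you add an entry; the sketch is only meant to show what the two lists mean.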
Do not discount the part of this block that says SUP top AUXILIARY. This
is actually a crucial part of the object's definition, which we'll come back to
when we have a better context to put that information in. For now, we know that
if we use nis.schema and define a posixAccount object, that object must
have, for example, a homeDirectory. But how do we know what a homeDirectory
is supposed to look like? Well, we can look at the homeDirectory attribute
definition (also from the nis.schema file), which will give us a clue:
attributetype ( 1.3.6.1.1.1.1.3 NAME 'homeDirectory'
    DESC 'The absolute path to the home directory'
    EQUALITY caseExactIA5Match
    SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 SINGLE-VALUE )
The DESC line tells most *nix administrators all they need to know. A
homeDirectory attribute is in the form of an absolute path to the home
directory of that particular posixAccount.
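The other lines of the definition say a little more. The SYNTAX OID 1.3.6.1.4.1.1466.115.121.1.26 is the IA5 String syntax (essentially ASCII), caseExactIA5Match means values are compared case-sensitively, and SINGLE-VALUE means an entry may hold only one homeDirectory. The "absolute path" part lives only in the DESC text, so a migration script may want to check it on its own. A minimal sketch of such a check follows; check_home_directory is a hypothetical helper, not part of any LDAP library.

```python
def check_home_directory(values):
    """Return a list of problems with an entry's homeDirectory values.

    `values` is the list of values the entry would carry. posixAccount
    requires the attribute and SINGLE-VALUE caps it at one value, so
    exactly one value is expected here.
    """
    problems = []
    if len(values) != 1:
        problems.append("SINGLE-VALUE: expected exactly one value")
    for value in values:
        # IA5 String: plain ASCII characters only.
        if not value.isascii():
            problems.append("not an IA5 (ASCII) string: %r" % value)
        # Taken from the DESC text; the syntax itself does not require it.
        if not value.startswith("/"):
            problems.append("not an absolute path: %r" % value)
    return problems


if __name__ == "__main__":
    print(check_home_directory(["/home/tom"]))
    print(check_home_directory(["home/tom"]))
```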
Blowing the cover
The simple truth, in practice, is that the purpose of the posixAccount object
type is to store information about accounts that is typically found in an
/etc/passwd file, or a NIS passwd map. The two are very similar. If you
work with either of these account storage mechanisms, then most of these
objects and attributes mean exactly what you think they should mean.
For now, you should understand that each entry in an LDAP directory is considered an object. Each object has one or more attributes. The objects and attributes that your directory understands are defined in schema files, which are simple text files created so that admins like us can get real work done with a minimum of hassle.
LDAP data migration: Laying the (hierarchical) foundation
Importing data from files or NIS to LDAP requires that you extract the data and transform it into a format called LDAP Data Interchange Format (LDIF), which can be readily understood by your LDAP directory. LDIF is easy to understand and work with, and there are tools available to automate the transformation. In addition, it's easy enough to use that I generally script my own transformation routines, and I'm not really known for my coding abilities.
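As a rough idea of what such a home-grown transformation routine might look like, here is a Python sketch that turns one /etc/passwd line into an LDIF entry using the posixAccount attributes discussed earlier. Several things here are assumptions made for illustration rather than anything prescribed above: the function name, the ou=People container in the sample DN, and the use of the "account" class from cosine.schema as the entry's structural objectClass. The sketch also assumes plain ASCII values (anything else needs base64 encoding in LDIF) and deliberately leaves the password field out.

```python
def passwd_line_to_ldif(line, base_dn):
    """Convert one /etc/passwd line into an LDIF entry (illustrative sketch).

    base_dn is the container the account should live under. It must
    already exist in the directory.
    """
    # passwd fields: name:password:uid:gid:gecos:home:shell
    name, _password, uid, gid, gecos, home, shell = line.strip().split(":")
    lines = [
        "dn: uid=%s,%s" % (name, base_dn),
        "objectClass: top",
        "objectClass: account",        # structural class, an assumption
        "objectClass: posixAccount",   # auxiliary class from nis.schema
        "cn: %s" % name,
        "uid: %s" % name,
        "uidNumber: %s" % uid,
        "gidNumber: %s" % gid,
        "homeDirectory: %s" % home,
    ]
    # gecos and loginShell are MAY attributes, so emit them only if present.
    if gecos:
        lines.append("gecos: %s" % gecos)
    if shell:
        lines.append("loginShell: %s" % shell)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "tom:x:1001:1001:Tom Example:/home/tom:/bin/bash"
    print(passwd_line_to_ldif(sample, "ou=People,dc=linuxlaboratory,dc=org"))
```

Looping this over every line of the file, with a blank line between entries, would give a file suitable for import.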
The first bit of LDIF we need to write and import into our directory server should define some hierarchy for the rest of our objects to sit under. There is more than one way to structure this, but the most popular method nowadays (at least for new deployments) is the domainComponent model. In most cases, this model maps the parts of your DNS domain (e.g. linuxlaboratory.org) to separate domain components (e.g. dc=linuxlaboratory,dc=org). This new object becomes the top level of your directory tree.
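The mapping from DNS name to domain components is mechanical, as this short Python sketch shows. The function name domain_to_dn is made up for this example.

```python
def domain_to_dn(domain):
    """Map a DNS domain name to a domainComponent-style DN."""
    # Split on dots, ignoring a leading or trailing dot.
    labels = [label for label in domain.strip(".").split(".") if label]
    # Each label becomes one dc= component, in the same left-to-right order.
    return ",".join("dc=" + label for label in labels)


if __name__ == "__main__":
    print(domain_to_dn("linuxlaboratory.org"))
```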
Here's the LDIF for my test directory's top-level object:
dn: dc=linuxlaboratory,dc=org
objectClass: top
objectClass: dcObject
objectClass: organization
o: LinuxLaboratory
dc: linuxlaboratory
description: Your Source for (more) Advanced Linux Knowledge
In the first line of this entry, "dn" stands for Distinguished Name. Every object in your directory, no matter what type of object it is, is uniquely identified by its dn. In fact, your LDAP directory will throw an error if you try to import two objects that have the same dn.
Notice that this object has three objectClass lines. That's because I want to take advantage of an attribute I'm allowed to use with the "organization" object that I'm not allowed to use with the "dcObject" object: namely, the "description" attribute.
It's okay to combine object types to take advantage of different attributes allowed by each one, provided you follow the rules. Those rules harken back to the first line of the objectClass definitions we looked at earlier. Remember when I said not to discount the part that said SUP top AUXILIARY? Here's where that can make or break your directory design.
Aside from "AUXILIARY," an object can also be described as being "STRUCTURAL." There are other types as well, but these are the two most prevalent. In addition, each object definition lists its superior, as noted by the "SUP top" in the earlier posixAccount definition. "top" is the highest-level object, but objects can have other objects as their superiors. For every entry, there can be one and only one "SUP top STRUCTURAL" objectClass used to define it. The rest must be AUXILIARY, or STRUCTURAL objects with a different superior object. In this example, "organization" is the only STRUCTURAL objectClass, and dcObject is AUXILIARY. But take a look at this string of objectClasses, taken from an account entry we'll see a bit later:
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
objectClass: posixAccount
objectClass: inetLocalMailRecipient
objectClass: shadowAccount
The objects organizationalPerson, person, and inetOrgPerson are all STRUCTURAL. The reason this works is that, in short, this string forms a proper "chain of superiors," as I like to call it. The person object's superior is the top-level "top," and it is STRUCTURAL. organizationalPerson is STRUCTURAL, but its SUP is person. inetOrgPerson is also STRUCTURAL, but its SUP is organizationalPerson. The rest of the objectClasses listed are AUXILIARY (which I think of as supplementary). Since I do not have any two STRUCTURAL objectClasses listed with the same SUP object, the chain is never broken.
To be clear, if I went back and added another objectClass to this list which was A) STRUCTURAL, and B) had the same SUP as another already-listed STRUCTURAL object, I would break my design. This constraint was not strictly enforced in earlier versions of OpenLDAP, but later versions, as they strive to conform to the LDAPv3 spec, have begun throwing errors for bad design. In the long run, good design saves more headaches than conforming to good design causes.
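The "chain of superiors" rule can be expressed as a small check. The Python sketch below is only an illustration: the CLASSES table is hand-copied from the kinds and superiors described above (plus "top," which the schema marks as ABSTRACT), and the function names are made up. A real tool would read this information from the schema files instead.

```python
# objectClass name -> (kind, superior), hand-copied for this illustration.
CLASSES = {
    "top": ("ABSTRACT", None),
    "person": ("STRUCTURAL", "top"),
    "organizationalPerson": ("STRUCTURAL", "person"),
    "inetOrgPerson": ("STRUCTURAL", "organizationalPerson"),
    "organization": ("STRUCTURAL", "top"),
    "dcObject": ("AUXILIARY", "top"),
    "posixAccount": ("AUXILIARY", "top"),
    "inetLocalMailRecipient": ("AUXILIARY", "top"),
    "shadowAccount": ("AUXILIARY", "top"),
}


def ancestors(name):
    """Return the superiors of an objectClass, nearest first."""
    chain = []
    sup = CLASSES[name][1]
    while sup is not None:
        chain.append(sup)
        sup = CLASSES[sup][1]
    return chain


def structural_chain_ok(object_classes):
    """True if the STRUCTURAL classes listed form one chain of superiors."""
    structural = [n for n in object_classes if CLASSES[n][0] == "STRUCTURAL"]
    if not structural:
        return False  # every entry needs a structural class
    # The most derived class is the one with the most superiors above it.
    leaf = max(structural, key=lambda n: len(ancestors(n)))
    allowed = set(ancestors(leaf)) | {leaf}
    # Every other structural class must sit on the leaf's own chain.
    return all(n in allowed for n in structural)


if __name__ == "__main__":
    account = ["top", "person", "organizationalPerson", "inetOrgPerson",
               "posixAccount", "inetLocalMailRecipient", "shadowAccount"]
    print(structural_chain_ok(account))
    # Adding a second "SUP top STRUCTURAL" class breaks the chain.
    print(structural_chain_ok(account + ["organization"]))
```

In the second call, "organization" and "person" both have "top" as their superior, which is exactly the situation described above as breaking the design.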
More on page 2...