Linux DevCenter    
 Published on Linux DevCenter (http://www.linuxdevcenter.com/)


Getting Started with LDAP

by Luke A. Kanies
11/08/2001

This article was much more difficult than I expected. I initially began with an in-depth explanation of LDAP as a protocol, but realized that the real goal here is to be able to work with LDAP right now, not after reading 50 pages of abstract explanations.

So with that goal in mind, we're going to start working with LDAP in a semi-real work environment. Specifically, we're going to set up a basic LDAP directory to store Unix user accounts, along with a script to pull those accounts to a Unix system -- that is one of the things for which you can and should use LDAP. This will also be useful to demonstrate that even if your version of Unix can't authenticate directly off LDAP, you can still store your users in LDAP and get all the benefits that come with that.

The goal

As mentioned in my previous article, LDAP was developed as a method of consolidating access, authentication, and authorization (AAA, or Triple-A) information. By itself, this is useful, because you are maintaining all of the information in one place rather than many. However, you could have accomplished the same thing using any old database. What makes LDAP especially suited to store your AAA information is that all LDAP operations take place within the context of the AAA information, rather than forcing the application to supply or interpret the context. Operations fail or succeed with no need for the application to understand the rules involved.

If you attempted to put all of the same AAA information into a database (which would be somewhat difficult because you would have to define all the standards for storage of the information, which LDAP has already done for you), then every one of your applications would have to parse that information and take it into account for each AAA operation. If you use LDAP, however, there are already methods for storage of the AAA information, although they are not yet RFC-defined, and the LDAP server rather than the application applies all of the AAA rules. This not only makes the lives of the application developers easier, it also eliminates the chance of rogue applications or users bypassing the AAA rules to directly access and modify the directory contents, except of course through traditional security compromises.

So, our goal here is to build such an AAA infrastructure, with LDAP at its core. The majority of our authentication information already exists, in the form of user accounts, and for the purposes of this article, we will assume that those user accounts are on Unix machines. Our goal, then, is to put that user account information into LDAP, manage it entirely from LDAP, and then use it as the root of AAA operations in other applications. Once we have the authentication information in place, we must then add the access and authorization information.

This article will deal with the first task, replacing the standard methods of maintaining Unix accounts with methods using LDAP. Later, we will hopefully provide examples of web-based maintenance of our LDAP data and some applications which might take further advantage of this newly centralized data.

The tools

As mentioned in the previous article, nearly every modern language has an LDAP API; as such, there is a near infinite availability of tools. Fortunately, because of the simplicity of the protocol, most of the APIs work quite similarly. To get started immediately, we're going to start with the command-line tools (of which there are multiple versions), because they're straightforward and ubiquitous.

There are only three basic types of LDAP operations, and each basic type has a few subtypes: interrogation (search, compare), updating (add, modify, rename, delete), and binding/control (bind, unbind, abandon). Notice there is no "read" operation; if you want to read an entry, you use a search operation to retrieve it.

To move our Unix accounts to LDAP, we must convert them into a form the LDAP server can understand. Because of the simplicity of conversion from passwd file to LDIF (see below), we will leave the conversion as an exercise for the reader. Instead, we will begin with an already converted user account and add it to LDAP:

$ ldapadd -D "cn=Directory Manager" -h server
password: ********
dn: uid=luke,ou=People,dc=domain,dc=com
objectclass: top
objectclass: posixAccount
uid: luke
cn: Luke A. Kanies
cn: Luke Kanies
cn: Kanies, Luke
uidNumber: 100
gidNumber: 14
homeDirectory: /home/luke
userPassword: {crypt}8SYYCOBH.aIII
gecos: Luke A. Kanies
^D
adding new entry uid=luke,ou=People,dc=domain,dc=com
$

This operation uses two of the three basic types of LDAP operation, an update and a bind, along with an implicit unbind. We must bind as a privileged user in order to modify the directory (this example uses cn=Directory Manager, a special user which all iPlanet Directory Servers have).

We then provide the data for the entry we want to add, which is in LDAP Data Interchange Format (LDIF), the standard text format for LDAP data. Notice that each line has an attribute name, then a colon and a space, and the value of the attribute. It certainly makes sense that LDAP search operations would return this format, so that it could be immediately fed back into an LDAP server, but some tools -- including those shipped with Solaris -- do not print standard LDIF, stupidly requiring that we convert.
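The passwd-to-LDIF conversion left as an exercise above could be sketched roughly as follows. This is only one way to do it, under some assumptions of mine: the function name passwd_to_ldif is made up, the default suffix is the example tree used in this article, cn is taken from the first comma-separated piece of the gecos field, and userPassword is left out because on most systems the hash lives in the shadow file rather than in passwd.

```shell
#!/bin/sh
# Sketch: convert passwd-format lines on stdin to LDIF on stdout.
# $1 (optional) is the branch under which the entries are created.
passwd_to_ldif() {
  awk -F: -v base="${1:-ou=People,dc=domain,dc=com}" '
    /^$/ || /^#/ { next }            # skip blank lines and comments
    {
      split($5, g, ",")              # gecos often holds "Full Name,office,phone"
      cn = (g[1] != "" ? g[1] : $1)  # cn is required, so fall back to the login
      printf "dn: uid=%s,%s\n", $1, base
      print  "objectclass: top"
      print  "objectclass: posixAccount"
      printf "uid: %s\n", $1
      printf "cn: %s\n", cn
      printf "uidNumber: %s\n", $3
      printf "gidNumber: %s\n", $4
      printf "homeDirectory: %s\n", $6
      if ($7 != "") printf "loginShell: %s\n", $7
      if ($5 != "") printf "gecos: %s\n", $5
      print ""                       # a blank line ends each LDIF entry
    }'
}
```

Something like passwd_to_ldif < /etc/passwd > accounts.ldif would then produce a file that can be reviewed by hand before it is fed to ldapadd.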

Components of an LDAP entry

Now that we see what an LDAP entry looks like, let's go through the various pieces to understand it.

Distinguished Names

The first line is what is called the "Distinguished Name", or dn. Because this is how we tell the server what object we are working with, the dn line must always be specified first.

The dn is how an entry is uniquely referred to within an LDAP server, similar to an absolute path name or a fully qualified domain name. Notice that the dn is represented similarly to DNS names, with the most specific information first, as opposed to path names, which have the least specific information first. Contrary to how it looks, the root of this LDAP tree, also called the "naming context," is dc=domain,dc=com, not dc=com.

There are no requirements about what you name the root of your LDAP tree, but there are two standards: either the standard I've followed here, which breaks a domain into its various domain components, or one where an organization is referred to at the top level (for example, o=domain.com). Which one you should follow will be answered differently by everyone you ask, and why you should follow it will also be answered differently. I have chosen to follow the domain component standard because it seems to be more popular these days -- Sun and Microsoft have both begun recommending/requiring it. Pick the one that seems easiest and fits best with how you plan on using the data.
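If you go with the domain component convention, deriving the root of the tree from a DNS domain is mechanical. Here is a minimal sketch; the function name domain_to_base is my own invention, not part of any LDAP toolkit.

```shell
#!/bin/sh
# Sketch: turn a DNS domain into a domain-component naming context.
domain_to_base() {
  # every dot becomes ",dc=" and the first component gets its own "dc="
  printf 'dc=%s\n' "$(printf '%s\n' "$1" | sed 's/\./,dc=/g')"
}
```

Calling domain_to_base domain.com prints the naming context used throughout this article.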

The rest of the dn consists of a branch in the tree, ou=People, and an attribute-value pair uniquely identifying this entry, uid=luke. This attribute-value pair, when separate from the dn, is called the "Relative Distinguished Name," or rdn, and it uniquely identifies this object at this level of the tree.

Because path names and DNS names always refer to the name of the object in question, all they need to specify is that value, but LDAP can use any attribute to create an rdn, and as a result both the attribute and its value must be specified. The dn for this entry could also be cn=Luke A. Kanies,ou=People,dc=domain,dc=com, as long as there is no one else in ou=People with "Luke A. Kanies" as a value of cn.
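To make the pieces of a dn concrete, here is a small sketch that pulls one apart with shell parameter expansion. The function names are hypothetical, and the splitting is deliberately naive: it assumes no value in the dn contains an escaped comma or equals sign, which real dns are allowed to have.

```shell
#!/bin/sh
# Sketch: split a dn into its rdn and the dn of its parent.
dn_rdn()    { printf '%s\n' "${1%%,*}"; }   # everything before the first comma
dn_parent() { printf '%s\n' "${1#*,}"; }    # everything after the first comma

# Split the rdn itself into the attribute name and its value.
rdn_attr()  { r=$(dn_rdn "$1"); printf '%s\n' "${r%%=*}"; }
rdn_value() { r=$(dn_rdn "$1"); printf '%s\n' "${r#*=}"; }
```

For uid=luke,ou=People,dc=domain,dc=com, dn_rdn gives uid=luke and dn_parent gives ou=People,dc=domain,dc=com, which is the same relationship a file name has to its directory, only written in the opposite order.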

We choose uid here, though, because we will always guarantee that it is unique for each entry -- anything else would not function correctly on our Unix systems, and conveniently iPlanet's Directory Server can be set up to disallow duplicate uid values (or any other attribute). Apparently Microsoft's Active Directory is requiring that cn be used to create the rdn, which is annoying because the value of cn is almost guaranteed not to be unique, as it is the user's full name. Thanks!

Object classes

Next in our entry are two objectclass attributes. These attributes define what type of object the entry is. However, the concept of an object in LDAP is extremely simple: It merely defines what attributes an entry must have and what attributes an entry is allowed to have. All object classes inherit requirements from their parent object classes and add their own. The objectclass attribute isn't a special attribute, though -- in all LDAP operations it is treated exactly like other LDAP attributes, but modifying object classes does determine whether the object will be acceptable to the server after the operation.

The above definition of an LDAP object is important, because it is difficult to convince yourself how simple this definition really is. Again, an object merely defines what attributes must or can be stored with an entry. I can create an object which has both the posixAccount object class and a printer object class; this combination may seem quite contradictory to you and me, but to LDAP, the data is just data, and has no meaning on its own. One of the things that makes LDAP great is that the attributes mean something to humans, but it is up to the humans working with the data to retain that meaning by naming object classes and attributes intelligently and then creating objects that actually make sense. This is more difficult than it sounds, and deciding how all of this will be done is the first step to using LDAP. Fortunately, most of the object classes you will need are part of the LDAP specification (although posixAccount is part of a later RFC), so at least initially you won't have to worry too much about that.

Note that you can't just willy-nilly make up object classes and add them to an object; the server must have each object class defined for it, in what is called its schema. If you try to add an entry with an undefined object class, you will get a schema violation and the operation will fail. How you define an object class for the server varies from server to server, but most of them document it quite well.

In this case, we have declared that our entry will be of types top and posixAccount. The top object class merely requires that the objectclass attribute be present, and by definition it is the parent object class for all other LDAP object classes. Given this fact, and the fact that objects always list the object classes they are an instance of, along with all of their parent object classes, every object in an LDAP database will list top as an object class. The posixAccount object class is defined in RFC 2307 by Luke Howard, who has done a significant amount of the work involved in allowing LDAP to store Unix accounts, and provides a means to store all of the information from a passwd file in LDAP.

Attributes

As you may have already guessed, attributes can have multiple values. Whether an attribute supports multiple values is stated in the definition of the attribute, but most do.

As we mentioned earlier, the top object class requires that the objectclass attribute be present, and it is. Thus, the presence of top has no other effect on this entry. The posixAccount object class requires cn, uid, uidNumber, gidNumber, and homeDirectory, and allows userPassword, loginShell, gecos, and description. Because a valid login account should have all of this information except the description, we have included nearly all of it in our entry; the one omission is loginShell, which you will want to add for real accounts.
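Since the server will refuse an entry that lacks a required attribute, it can save a round trip to check the LDIF before sending it. The following is a rough sketch, assuming a single LDIF entry on standard input; missing_posix_attrs is a name I made up, and the list of required attributes is simply the one given above for posixAccount.

```shell
#!/bin/sh
# Sketch: print each attribute posixAccount requires that the LDIF
# entry on stdin does not contain; print nothing if all are present.
missing_posix_attrs() {
  entry=$(cat)
  for attr in cn uid uidNumber gidNumber homeDirectory; do
    # attribute names are case-insensitive; the colon keeps "uid"
    # from matching "uidNumber"
    printf '%s\n' "$entry" | grep -qi "^$attr:" || echo "$attr"
  done
}
```

An empty result does not mean the server will accept the entry, only that this one class of schema violation has been ruled out.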

All of our attribute values are self-explanatory, except for the value of the userPassword. Because most (all?) Unix machines store passwords in crypt format, and because we are going to dump the LDAP entry into the passwd file, we must also store the LDAP password in crypt format. Many LDAP servers support multiple password formats, and we have included an example of how to specify the format. Servers that support multiple formats should allow specification of the default format, and it is recommended that you set the default format to crypt if you plan on using your LDAP server for storing Unix accounts. The only exception to this is if all of your Unix machines can authenticate directly off of the LDAP server and can understand formats other than crypt.
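The format marker is just a prefix in braces on the stored value, so separating it from the hash is simple text processing. A minimal sketch, with function names of my own choosing:

```shell
#!/bin/sh
# Sketch: split a userPassword value such as {crypt}8SYYCOBH.aIII
# into its format marker and the hash that follows it.
password_scheme() {
  # prints nothing if the value carries no {scheme} prefix
  printf '%s\n' "$1" | sed -n 's/^{\([^}]*\)}.*/\1/p'
}
password_hash() {
  # removes a leading {scheme} prefix if there is one
  printf '%s\n' "$1" | sed 's/^{[^}]*}//'
}
```

The hash, not the full attribute value, is what belongs in the second field of a shadow file entry.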

Using the data

Okay, we have the entry in the LDAP server, and we have a basic understanding of the entry itself. Now it's time to start using the data. As mentioned earlier, if you want to read an entry, you have to search for it. So, to verify that our entry is present, we will search for it.

There are a total of eight (yes, eight) different options for every LDAP search, but most of them have reasonable defaults and they mostly make sense. We won't be using very many of these options to start, to keep it simple.

Here's an example search:

$ ldapsearch -h server -b "dc=domain,dc=com" "(uid=luke)"
uid=luke,ou=People,dc=domain,dc=com
objectclass=top
objectclass=posixAccount
uid=luke
cn=Luke A. Kanies
uidnumber=100
gidnumber=14
homedirectory=/home/luke
gecos=Luke A. Kanies
$

This is on a Solaris box; horror of horrors -- notice that the output is not in LDIF format (equal signs are used instead of colons and spaces). Amazing but true. Notice also that the password is not printed; this is for security reasons, in the same way that /etc/shadow is only readable by privileged users.
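Converting this kind of output back to LDIF is not hard. The sketch below rests on assumptions I have not verified against every version of the tool: that the first line of each entry is the bare dn, and that entries are separated by a blank line. The function name native_to_ldif is made up.

```shell
#!/bin/sh
# Sketch: convert "attribute=value" search output on stdin to LDIF.
# Assumes the dn is the first line of each entry and that a blank
# line separates one entry from the next.
native_to_ldif() {
  awk '
    /^$/             { print; at_dn = 1; next }       # next line starts an entry
    NR == 1 || at_dn { print "dn: " $0; at_dn = 0; next }
                     { sub(/=/, ": "); print }        # only the first "=" changes
  '
}
```

If your ldapsearch does not put a blank line between entries, the dn lines would have to be recognized some other way, for example by looking for the naming context.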

In our search, we included two flags and an argument. The first flag is obvious: We specified the server. The second flag specified the base for the search, similar to how one specifies the start point of a search using the find command; we could have specified any branch including the object itself. The argument to the search is called the filter, and it's how we specify the specific entry or entries we want. All entries matching the filter will be returned. Fortunately the filter format is somewhat sophisticated, so you shouldn't have problems performing complex searches to find exactly what you want, but I'm only going to use basic filters to get the job done. For more information, consult the string representation of search filters defined in RFC 2254, or your server's documentation.
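As a small taste of what filters can do, here is a sketch that builds a compound filter from a login name. The ampersand form combines two conditions with a logical AND. Because a few characters (the backslash, asterisk, and parentheses) have special meaning inside a filter, the value is escaped first, using the backslash-plus-hex form the filter syntax defines. The function names ldap_escape and uid_filter are my own.

```shell
#!/bin/sh
# Sketch: escape a value for use inside a search filter, then build
# a filter matching one posixAccount by uid.
ldap_escape() {
  # the backslash must be handled first so later escapes are not doubled
  printf '%s\n' "$1" | sed -e 's/\\/\\5c/g' -e 's/\*/\\2a/g' \
                           -e 's/(/\\28/g'  -e 's/)/\\29/g'
}
uid_filter() {
  printf '(&(objectclass=posixAccount)(uid=%s))\n' "$(ldap_escape "$1")"
}
```

The result of uid_filter luke can be passed to ldapsearch as its filter argument, in quotes so the shell leaves the parentheses and ampersand alone.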

Because we know the dn of the entry, we could have also performed our search thus:

$ ldapsearch -s base -h server -b "uid=luke,ou=People,dc=domain,dc=com" "(objectclass=*)"

Because we know the dn, we can set that dn as the base of our search. If we do that, we can also narrow the scope of the search. By default, an LDAP query searches for objects anywhere in the hierarchy beginning at the base of the search. You can also specify a scope of "one", which only searches the next lower level of the hierarchy, and "base", which searches only the base itself. I have actually noticed some discrepancies here when using iPlanet Directory Server 4.x; usually the base of the search is returned if it matches the filter, but I have seen instances where that is not the case. The above method is the preferred method of retrieval if you know the dn of the entry, because it is faster and requires less work from the server.

Notice also that our filter has changed. We previously searched for a specific value of uid, but because we are using the entry's dn as the base of our search in order to retrieve it directly, we use what is called an existence filter. When an attribute is searched for with a value of *, the LDAP server returns every entry which has any value for that attribute. Because all entries have values for objectclass, using a filter of objectclass=* is the standard method for returning all entries matching a given base and scope. The reason for preferring an existence filter over a more specific filter is that a specific filter causes the LDAP server to work harder, and using an existence filter here provides the same result with less work for the server.
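The two narrower scopes can be wrapped in a pair of small helpers. This is a sketch rather than a finished tool: the function names are made up, the -h, -s, and -b flags are the ones used in the commands above (your ldapsearch may spell them differently), and the LDAPSEARCH variable is a hypothetical knob of mine that lets you substitute echo to see the command line without contacting a server.

```shell
#!/bin/sh
# Sketch: read one entry, or list the entries directly beneath it.
# Set LDAPSEARCH=echo to print the arguments instead of searching.
LDAPSEARCH=${LDAPSEARCH:-ldapsearch}

# $1 = server, $2 = dn of the entry to read
fetch_entry() {
  "$LDAPSEARCH" -h "$1" -s base -b "$2" "(objectclass=*)"
}

# $1 = server, $2 = dn of the branch whose immediate children we want
list_children() {
  "$LDAPSEARCH" -h "$1" -s one -b "$2" "(objectclass=*)"
}
```

For example, list_children server ou=People,dc=domain,dc=com would return the accounts directly under ou=People, but not ou=People itself and nothing further down the tree.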

Creating the Unix account

This is all fine and good: we've got the data in the LDAP server, and we can look at it. But how are we going to turn it into a Unix account?

Well, assuming you have added this entry and all other desired entries to your LDAP server, all it takes is a simple shell script to convert this data to the passwd and shadow files:

#!/usr/bin/bash
#
# shell script to convert from LDAP to passwd/shadow files
IFS='
' # set the internal field separator to a newline, in case
  # attributes have spaces or tabs in them

# start with empty output files so that reruns do not append duplicates
: > newpasswd
: > newshadow

# write the passwd and shadow lines for the entry collected so far,
# then clear the variables so values cannot leak into the next entry
write_entry() {
  # "x" in the password field means the hash lives in the shadow file
  echo "$uid:x:$uidnumber:$gidnumber:$gecos:$homedirectory:$loginshell" \
  >> newpasswd
  # the shadow file wants the bare hash, so strip the {crypt} marker;
  # the password aging fields are left empty
  hash=$(printf '%s\n' "$userpassword" | sed 's/^{[Cc][Rr][Yy][Pp][Tt]}//')
  echo "$uid:$hash:::::::" >> newshadow
  unset uid uidnumber gidnumber gecos homedirectory loginshell userpassword
}

# look for all entries which have the posixAccount objectclass
# we have to authenticate as a privileged user in order to read the
# password
for line in $(ldapsearch -h server -D \
"uid=sysadmin,ou=People,dc=domain,dc=com" -w password \
-b dc=domain,dc=com "(objectclass=posixAccount)"); do

  # a DN line marks the start of a new entry, so write out the
  # previous entry (if there was one) every time we hit a DN line
  if printf '%s\n' "$line" | grep "dc=domain,dc=com" > /dev/null &&
     [ -n "$uid" ]; then
    write_entry
  fi

  # this uses some trickery to cause the shell to interpret
  # ldapsearch output as variable assignments; this may need
  # to be modified depending on the output of the ldapsearch command,
  # but this works with the ldapsearch supplied on Solaris 8
  newline=$(printf '%s\n' "$line" | sed "s/'/'\\\\''/g
    s/=/='/
    s/$/'/")
  eval "$newline"
done
# this puts our last entry into the files, because our
# method of dumping the entries is triggered by the next entry,
# and there is no next entry for the last entry
[ -n "$uid" ] && write_entry

# backup the passwd and shadow files, and put the new ones in place
for file in passwd shadow; do
  cp "$file" "$file.bak"
  cp "new$file" "$file"
done

Is this pretty? Not really. Would I have rather done it in Perl? Yeah, but I don't want to get into the Perl API(s) just yet. Regardless, this script will take your LDAP entries and create valid passwd and shadow files for Solaris. They may require modification for other Unixes. This isn't terribly interesting or useful if you only have a couple of Unix machines, but if you have 200 machines all running different versions of Unix, you can easily make a version of this script for each version of Unix (if necessary), and thus consolidate all of your account information into one data store.
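Because a malformed passwd file can lock everyone out of a machine, it is worth checking the generated file before copying it into place. The following is a minimal sketch of such a check, not something from the script above: check_passwd is a name I made up, and it only verifies the field count and that the uid and gid are numeric.

```shell
#!/bin/sh
# Sketch: sanity-check a passwd-format file ($1) before installing it.
# Prints one message per problem and returns non-zero if any are found.
check_passwd() {
  awk -F: '
    NF != 7          { printf "line %d: expected 7 fields, got %d\n", NR, NF; bad = 1; next }
    $1 == ""         { printf "line %d: empty login name\n", NR; bad = 1 }
    $3 !~ /^[0-9]+$/ { printf "line %d: uid is not numeric\n", NR; bad = 1 }
    $4 !~ /^[0-9]+$/ { printf "line %d: gid is not numeric\n", NR; bad = 1 }
    END              { exit bad + 0 }
  ' "$1"
}
```

With this in place, the final copy could be guarded with something like check_passwd newpasswd && cp newpasswd passwd.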

One thing we've ignored here is password management information. All information related to password aging and management is stored in the shadowAccount object class. It is straightforward to use, and would have only complicated this article, so its use is left to the reader. Another thing we've ignored is security; that is, we are doing a clear-text authentication to the LDAP server and all of the communication is in clear-text. A later article will begin to explain how to incorporate encryption into your directory service. SSL authentication isn't much more difficult than simple authentication.

Conclusion

We've now hopefully added our Unix account information to LDAP, and we've provided a way to pull the information back out and repopulate the passwd and shadow files. Yes, we've ignored the group files, which are also important, and we've assumed that every account is going to be present on every machine. Fortunately, it isn't much of a leap to put the group information into LDAP, and it is relatively easy to only pull certain classes of users to various classes of machines, for example, only adding developer and system administrator accounts to the web servers, and only adding system administrator and DBA accounts to the database servers. In fact, as you begin to use LDAP more, you'll find that like so many other automation/centralization processes, the hardest part is deciding how to make the above classifications. Most companies rely on the system administrator to decide if a user gets an account somewhere, or how often passwords get changed, and even if there are clear rules, those rules are usually obeyed only by system administrators, not the tools they use.
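One simple way to pull only certain classes of users is to let the search filter do the classification. The sketch below assumes, purely for illustration, that a user's class is expressed by primary group, so that each class of machine is configured with a list of gidNumber values; your own classification may well hang on a different attribute. The function name gid_filter is made up.

```shell
#!/bin/sh
# Sketch: build a filter matching posixAccount entries whose primary
# group is any of the gids given as arguments (at least one is needed).
gid_filter() {
  alternatives=''
  for gid in "$@"; do
    alternatives="$alternatives(gidNumber=$gid)"
  done
  # "|" is a logical OR over the alternatives, "&" a logical AND
  printf '(&(objectclass=posixAccount)(|%s))\n' "$alternatives"
}
```

A machine that should only carry accounts from groups 14 and 20 would then run its search with the output of gid_filter 14 20 in place of the plain posixAccount filter.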

To fully take advantage of LDAP, all of those rules must be pushed into LDAP, such that all of these decisions can be made by the automated tools instead of by people. It's a one-time investment toward a future of less work. Fortunately, development and codification of these rules is very good for your business even if you don't plan to use LDAP, and you will most likely find that doing so is a beneficial process in and of itself.

As we progress in this series of articles, we will develop an LDAP architecture for a simple example organization. One of the aspects of that process will be to create these LDAP rules which determine who has access to what services and information, including who has accounts on which machines.

Next time

We've introduced some aspects of how authentication is handled within LDAP and how you can migrate your existing authentication data into LDAP. In the next article in this series, we will explore management of this data through a web interface, along with basic AAA control, which is necessary for acceptable management. If I can get it together in time, we might just be able to incorporate SSL into the mix, too.

Luke A. Kanies is an independent consultant and researcher specializing in Unix automation and configuration management.



Copyright © 2009 O'Reilly Media, Inc.