Teresa Lau

Subscribe to Teresa Lau: eMailAlertsEmail Alerts
Get Teresa Lau: homepageHomepage mobileMobile rssRSS facebookFacebook twitterTwitter linkedinLinkedIn


Related Topics: Apache Web Server Journal, XML Magazine

Apache Web Server: Article

A Three-Way Query

A Three-Way Query

Lightweight Directory Access Protocol (LDAP) is fast becoming a de facto access method for common directory information. XML is now the standard for data exchange on the Web. The relational database, a technology for modeling data in table form, has always been the most popular way to store and retrieve application data.

How do these three technologies come together, other than the fact that they're all buzzwords in the technology world? They're all a way to represent data, which you can easily access using the powerful set of query capabilities they each provide. LDAP data can be queried using LDAP directory search, XML data can be queried using XPath, and relational database data can be queried using SQL.

In this article I describe an Employee Information Application that can use all three methods to store and query data. You'll learn how to construct each kind of query and write Java code to retrieve data, respectively. It's of great value to you as a developer because you're likely to come across systems implemented using one of these technologies and be asked to develop applications on top of it.

The Employee Information Application
Figure 1 shows the user interface of a simple Employee Information Application, organizing employees by last name, first name, phone, and e-mail address.

Each employee can be represented as a Java object (seen below). In the next few sections, I discuss how this data can be represented and queried in various ways.

class Employee
{
private String id;
private String lastName;
private String firstName;
private String phone;
private String e-mail;
}

Relational Database and SQL
The relational database is a common way to model application data. You can model all the employee information described above into one table called Employee. The id will be a unique key used to identify an employee.

create table Employee
{
  id varchar(64)                        not null;
  lastName varchar(64)            null;
  firstName varchar(64)            null;
  phoneNumber varchar(12)     null;
  e-mail varchar(64)                  null
}
create unique index IndxEmp on Employee (id)

To retrieve data from relational database, use SQL. Most developers are familiar with SQL, so I won't explain it here. A standard SQL looks like this:

Select <field1>, <field2>
From <table>
Where <field1> = <value1>
And <field2> = <value2>
And ...

Querying a Relational Database in Java Code
To access the database using Java, you can use JDBC, the standard API for executing SQL statements in any relational database. With JDBC, different drivers can be installed dynamically to access different databases. As a result, your code remains the same irrespective of the kind of database you're using. The only difference is you need to include that particular database driver class in your classpath when you run the program; and inside the code you need to load your specific database driver class at the beginning.

In my implementation I use Sybase for the database and Sybase jConnect as the JDBC driver. To get the code to run I put jConnect.jar in my classpath when I run my program. To log in to the database, I specify the login properties below, which I'll put into a properties file, "db.properties."

user=username
password=password
server=servername
sqlinitstring=use databasename

Now I'll initialize by creating a connection to the database using the code below:

Class.forName("com.sybase.jdbc.SybDriver");
Properties prop = Util.loadProperties ("db.properties");
String URL="jdbc:sybase:Tds:whitehorn:6600";
Connection connection = DriverManager.getConnection(URL, prop);

This code does the following:

  1. Calls Class.forName() to load the database driver (in this case, the Sybase driver)
  2. Calls loadProperties (), a utility function I wrote (see Listing 1) to load the file db.properties to a Properties object
  3. Sets up the URL to connect to the database (the documentation of your JDBC driver should tell you what to put in your URL; it's of the form jdbc: protocol: data source information.)
  4. Passes URL and connection properties to getConnection() to create a database connection
Use this connection to run SQL statements to get all the employees from the table using the code below:

String sql ="select id, lastName, firstName, " +
                  "phone, e-mail from Employee" ;
List employeeList = new ArrayList();
Statement stmt = connection.createStatement();
ResultSet s = stmt.executeQuery(sql);
while (s.next())
{
        String id = s.getString ("id");
        // ..Get other fields e.g. lastName similarly
        Employee employee = new Employee(id);
        // ... Set other fields into employee object similarly
        employeeList.add(employee);
}

This code does the following:

  1. Creates a statement, then executes the SQL to return a ResultSet
  2. Iterates through the ResultSet, and for each row extracts the column value out and sets the appropriate field in the Employee object
  3. Adds the Employee object to the EmployeeList
The full source of the database query code can be found in the DBImpl class in Listing 2.

LDAP and LDAP Directory Search
LDAP is a hierarchical way of storing data. It's commonly used by system administrators to store system and user information. LDAP stores data in a directory structure similar to the Unix file system. The basic unit of information in a directory is an entry. An entry has attributes, each of which has a type and one or more values. Figure 2 shows how employee information may be represented in LDAP.

In Figure 2, the top entry is an organization that has attri- butes dc=whitehorn, dc=com. Note that for the same attribute type dc, you can have more than one value. Under the organization entry you can have children, which are also entries. For example, location newyork and location toronto are both child entries of organization. To identify an entry uniquely, a DN (distinguished name) is used. DN is similar to the concept of a primary key in a database. In this example, the DN of John Doe is represented as:

Uid=100012, ou=Employee,
dc=newyork,dc=whitehorn,dc=com

Inside the John Doe entry you'll find other attributes such as lastName, firstName, phone, and e-mail.

An LDAP search operation is used to search a directory and retrieve individual directory entries. The LDAP search operation has eight parameters. I'll describe only three of them here, which should be sufficient for a simple query. (Refer to the links in the Resources section to learn more about LDAP searches.)

  1. Search Base: This indicates the top of the tree where you want to start the search from. It's expressed in the form of a DN.
  2. Attributes: Similar to the Select clause in the SQL, this is a list of attributes to be returned for each matching entry.
  3. Search Filter: An expression that describes the types of entries to be returned. It's similar to the Where clause in SQL. A filter can be in the following form:
Attr=value                   equal
Attr~=value                 approximate
Attr=*value*               use * to indicate wild card any position
Attr>=value                 greater or equal
Attr<=value                 less or equal
Attr=*                         presence
(&(filter1)(filter2))       and
( | (filter1)(filter2))        or
( ! (filter))                    negation

Querying LDAP In Java Code
In Java you can use Java Naming and Directory Interface s(JNDI) to do a directory search on the LDAP server. JNDI is an API that provides naming and directory functionality to applications written using Java. JNDI is included in the Java 2 SDK, v1.3, and later releases. If you're using earlier Java versions, you'll need to include the JNDI extension package in your classpath.

JNDI is defined to be independent of any specific directory service implementation. To use LDAP as the directory ser-vice implementation, add ldapbp.jar, ldap.jar, and providerutil.jar to your classpath to access the LDAP service provider classes. You can download the LDAP service provider release from http://java.sun.com/products/jndi/ #download.

In my example, the LDAP server I used is Netscape Directory Server, but it really doesn't matter what LDAP server you're using: the code should still be the same.

To access the LDAP server I need to specify information on how to get to it. I put this information in the properties file "ldap.properties" as shown below. This information includes the name of the service class provider (in this case, the Sun LDAP service provider class), the URL to the LDAP server, and the user login and password to the LDAP server.

java.naming.factory.initial=com.sun.jndi.ldap.LdapCtx Factory
java.naming.provider.url=ldap:/localhost:1100/dc=ny,dc =whs,dc=com
java.naming.security.principal=user
java.naming.security.credentials=password

In my initialization code below, I load the properties file into a Properties object and then create an InitialDirConext:

Properties p= Util.getProperties("ldap.properties"):
DirContext ctx = new InitialDirContext(p);

Now that I have an InitialDirContext, I can use it to look up all the employee entries using the code below:

String searchRoot = "ou=Employee" ;
String filter="(uid=*) ";
String[] attrIDs ={"uid", "givenname", "sn", "tele-
                            phonenumber", "mail"};
SearchControls ctls = new SearchControls();
ctls.setReturningAttributes(attrIDs);
List employeeList = new ArrayList ();
NamingEnumeration list =
                ctx.search(searchRoot, filter, new SearchControls());
while (list.hasMore())
{
               SearchResult nc = (SearchResult)list.next();
               Attributes attrs = nc.getAttributes();
               Attribute attrId = (Attribute) attrs.get("uid");
               String id = (String) attrId.get();
               Employee e = new Employee(id);
               // ...similarly get other fields to fill out
               emplyeeList.add(employee);
}

This code does the following:

  1. Specifies the three parameters for directory search: searchRoot, filter, and attributes
  2. Using the initial context, does a search base on these conditions and returns a List of SearchResult
  3. Iterates through the SearchResult, and for each Entry return retrieves its attributes and assembles the Employee object
  4. Adds the Employee object to the employeeList
The complete LDAP search code can be found in LDAPImpl class in Listing 3.

XML and XPath
XML may not be the way to persist data, but with the B2B nature of applications, it's likely that we don't get our information directly from its persistence storage. For example, we could be getting the employee information from another Web service that will send us the information as an XML file, as shown in employees.xml in Listing 4.

XPath is a language used to query XML, commonly used inside an XSLT stylesheet for XML transformation. It describes how to locate specific elements (and attributes, processing instructions, etc.) in a document. Similar to the LDAP search, which has a search base, XPath has the concept of the "context node", which is where you start from. Everything you search is relative to the starting location.

I'll now show you some examples of an XPath query using the XML in Listing 4. The basic syntax of XPath is similar to file system addressing. You can use an absolute path by starting with/or a relative path. For example, /Employees/ Employee/phone returns all the phone elements under Employees/Employee. Specifying * will get you all the elements located in the preceding path, so //Employee /* will return all elements (e.g., lastName, phone) under any Employee element in the document.

Expression in a square bracket can further specify an element. This is similar to a Where clause in a SQL. For example, Employee[starts-with(./lastName/text(),'D')] returns all employees with a lastName beginning in D. You can also specify the position of an element as in //Emp- loyees/Employee[2], which returns the second Employee child of Employees.

Attributes are specified by an @prefix. For example, // @city returns all the city attributes in the document. Attributes can also be used inside the square bracket to specify an element. An example is Addr[@city='newyork'], which returns all the Address elements that have an attribute city=newyork.

You can use a Boolean operation inside the square bracket. For example:

1. Employee[boolean(./lastName) or boolean (./e-mail)]
2. Employee[boolean(./lastName) and boolean (./e- mail)]

The first query represents an employee that has lastName or e-mail attributes, while the second query represents an employee that has both lastName and e-mail attributes. There are lists of functions for Node-Set, Boolean, Number, and String that you can use within an expression (see the Resources section).

You can apply more conditions in sequence by entering square brackets one after another. This is similar to a subquery in SQL. In addition, if the query is sequence-dependent, the result will depend on which condition you specify first. This can be seen in the example below:

1. //Employees/Employee[1][starts-
with(./lastName/text(),'D') ]
2. //Employees/Employee[starts-
with(./lastName/text(),'D') ][1]

Query 1 first selects all the first Employee elements under any Employees element, and then from that list, selects the ones that have a lastName beginning with "D." Query 2 first gets all the Employee elements under employees that have lastName D, and then from that list, selects the first one. The result and number of elements returned from these two queries may be different in this case.

There are many more powerful features of XPath that I haven't described here. Refer to the Resources section to learn more.

Querying XML in Java Code
To query XML using XPath, we can use the JAXP ( Java API for XML Parsing) API. JAXP provides a common interface for creating and using the standard SAX, DOM, and XSLT APIs in Java, regardless of which vendor's implementation is actually being used. JAXP API 1.1. is included in the J2SE 1.4, or you can get it from http://java.sun.com/xml/download.html to be used as an optional package for JDK 1.1.8 and above.

In my example, I use the Xerces XML parser and Xalan XSLT processor, both from Apache (www.apache.org), but you can use any other parsers or XSLT processors that implement the JAXP API. In my classpath I need to include xerces.jar and xalan.jar so that the Apache classes can be found. The following code prepares for an XPath query by creating a DOM parser first. It uses the Xerces implementation of the DocumentBuilderFactory to create a Xerces DOM parser.

DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();
domParser = factory.newDocumentBuilder();

Now that I have a DOM parser, I can use the following code to do an XPath query to get all the Employees elements.

URL cfgURL = Util.createURL("employees.xml");
Document d = domParser.parse(cfgURL.toString());

String XPath = "//Employee
List employeeList = new ArrayList();
NodeList nl = XPathAPI.selectNodeList(d, XPath );
for (int i = 0; i< nl.getLength(); i++)
{
          Node n = nl.item(i);
          // id is an attribute
          Node nm = n.getAttributes().getNamedItem("id");
          Employee emp = new Employee (nm.getNodeValue());

          // Other fields are elements
          NodeList childNodes = n.getChildNodes();
          Node lastname = lookup (childNodes, "lastName");
          // ...do similarly for other fields

emp.setLastName(lastname.getFirstChild().getNodeValue());
         employeeList.add (emp);
}

This code does the following:

  1. Creates a DOM document from the XML file employees.xml (see Listing 4) by parsing the XML file
  2. Calls org.apache.xpath.XPathAPI.selectNodeList (from Xalan API) to run a query on the document using the XPath query statement
  3. Iterates the result (NodeList) to assemble the Employee object; (note that in this XML, id is an attribute of Employee and it's retrieved in a different way than the other fields, e.g., lastName or firstName, which are elements of Employee)
  4. Adds the Employee object to the EmployeeList
The full listing of the XPath search code can be found in the XMLImpl class in Listing 5.

A Mode Complicated Query
To further illustrate the various ways to query the data, I'll now explain how to assemble the respective queries based on the same search condition as shown below:

  1. (Lastname begins with 'D%' OR firstname begins with 'M%') AND
  2. Does not have an e-mail attribute

The SQL query for this search is:

select id, lastName, firstName, phone, e-mail from Employee
where (lastName like 'D%' or firstName like 'M%')
and e-mail = null

Figure 3 shows the result of running this query in a SQL tool.
The LDAP query for this search is:

Search Base:          ou=Employee,dc=newyork, dc=white
                              horn,dc=com
Search Filter:          (& (|(givenname=M*)(sn=D*))
                              (!(mail=*)))
Attributes:              uid, sn, givenName, telephoneNumber, mail

The result of running this query in the query tool of an LDAP browser is shown in Figure 4.
The XPath query for this search is:

//Employee[starts-with(./lastName/text(),'D') or
starts-with(./firstName/text(),'M')]
     [not(boolean(./e-mail/text()))]

Figure 5 shows the result of running this in XPath using a tool called Cooktop XML editor (a free XML editor that can be downloaded at www.xmleverywhere.com/cooktop).

You can easily put the queries we constructed here back into the code we wrote in the last section, and have the results returned programatically in Java.

Common Interface
Depending on where the data is stored or comes from, the implementation for the Employee Search Application will be different. We could apply the Bridge pattern from the book, Design Patterns, to write the client for the Employee Search Application. The Bridge pattern separates a class interface from its implementation, so you can vary or replace the implementation without changing the client code. To do so I define the following interface:

Interface EmployeeQuery
{
      // Does initialization e.g. connect to server
      init();

      // Generate query based on Condition and return
         List of Employee
      List search (Condition c);
}

The client can then call methods in this interface without knowing what the underlying implementation is like. In my test client program, I create a method call runTest that calls init() and then does a search with empty conditions. This returns a list of all employees, printing their first and last names.

public static void runTest(EmployeeQuery eq)
 {
            System.out.println ("Running class " + eq.getClass().toString());
            eq.init();
            List l = eq.search (new Conditions());
            Iterator i = l.iterator();
            while (i.hasNext())
            {
                    Employee e = (Employee) i.next();
                    System.out.println (e.getId() + ":" +
                    e.getFirstName() + " " + e.getLastName() );
             }
}

Then I create the database, XML and LDAP implementations of EmployeeQuery, and call runTest on them, respectively, to test out the implementation.

public static void main(String[] args)
{
        runTest(new DBImpl());
        runTest(new XMLImpl());
        runTest(new LDAPImpl());
}

You can find the complete source code of DBImpl, LDAPImpl, and XMLImpl shown, respectively, in Listing 2, Listing 3, and Listing 5. Because this is only an example, currently all they do is return all the employees in their system. It doesn't use the condition parameter passed in to generate the query. You can do more work to specify conditions from a GUI, which you turn into the respective query to run and return the matching items.

Conclusion
I've shown you three ways of implementing the same client based on how the data is stored. It's unlikely that you'll have to implement an application in all three different ways. The purpose of this article is to show you the different syntaxes and code APIs you can use, so you can easily tackle each kind of query with minimal effort when you need to. I've only touched briefly on SQL, LDAP, and XPath. Each of them is in itself a big subject that you can study in detail when you have time.

A final point to bear in mind when developing your client: it's possible that the underlying implementation may change or that your client may be used with more than one implementation. In that case, your system will benefit if you use an interface when designing your code.

Resources

    Relational databases and SQL:
  1. JDBC API Doc: http://java.sun.com/j2se/1.3/docs/api /java/sql/package-summary.html
    LDAP:
  2. JNDI API Doc: http://java.sun.com/products/jndi/javadoc/index.html
  3. Excerpts from an LDAP book: http://developer.netscape.com/docs/books/macmillan/ldap/ldapbk.html
  4. Sun JNDI tutorial: http://java.sun.com/products/ jndi/tutorial/TOC.html
    XML and XPath:
  5. JAXP API Doc: http://java.sun.com/xml/jaxp-1.1/docs/ api/
  6. Official W3C site: www.w3.org/TR/XPath
  7. Quick Reference Card: www.mulberrytech.com/quickref/index.html
  8. XPath example: www.zvon.org/xxl/XPathTutorial/General/ examples.html

More Stories By Teresa Lau

Teresa Lau has been an independent Java consultant for over four years, with
an emphasis on financial applications. She received her MS in computer
science from the University of Waterloo, and her BS in engineering from the
University of California, Berkeley.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.