ZendCon!

I didn't want to make mention of it on this blog, because I didn't want to jinx myself after having had several instances where I thought I wasn't going to be able to go, then I was, then I wasn't again, etc. But now... I'm at ZendCon in San Francisco! Thanks to some sheer dumb luck and the generosity of several dear friends, I've been able to travel here to attend the conference. I didn't realize until after leaving that I'd forgotten a few things, most notably my camera, but I think things are going to be memorable enough that I won't need photos (though they would have been nice, but I'm sure other people will take plenty). Anyway, off to bed with me... I've got a day of tutorials on writing extensions tomorrow as well as the ZCE exam. Good night!

Supporting Hierarchical Data Sets

If you deal with relational database systems on any semi-regular basis, you've probably had to support hierarchical data sets before. In genealogical terms, each record may have a logical "parent" record and zero or more "child" records. The approach that you've likely taken to support these data sets has been to include a foreign key column in the table in question that references the primary key column in the same table. This approach is called the adjacency list model, where the name is taken from the same concept in graph theory. This model is simple and intuitive, but it has a drawback: to obtain any significant portion of the hierarchy stored in a table can take up to as many queries as there are levels of depth in the hierarchy.

There is an alternative approach, called the nested set model, which finds its basis in set theory. Its development is credited to a man by the name of Joe Celko, who's written a number of books related to SQL and database design. The basis of the approach is this: each record has numerical left and right bounds that represent the boundaries of a set, where everything within that set is a descendant of that record. By employing some simple joins, most of the subsets of data commonly desired in a hierarchical data set can be retrieved with a single query, which is much more efficient in terms of the number of queries required to fulfill an individual request.

If MySQL is your database server of choice, its main web site has an excellent article that outlines what the nested set model is and how to implement it within MySQL. This article on Sitepoint also explains it and its examples use PHP in conjunction with MySQL. There's also another great article on developer.com that gives a visual walk through of the nested set model and its implementation in Oracle for use in generating breadcrumb trail navigation on a web site. Danne Lundqvist also has a blog entry describing his experiences in using PHP and JavaScript (specifically Mootools) to transfer data back and forth between the nested set model and an ordered depth model. These articles explain the concepts well enough that I doubt I'd do much better in trying to reiterate them here.

I found out about the nested set model through Elizabeth Smith, one of my fellow developers on the Forkr project. One of my tasks while working on the project was to adapt information from various resources, including code she'd written to implement a combined adjacency list/nested set model approach on a PostgreSQL database, to a component for Forkr to support hierarchical data sets. Why a combined approach? There are several reasons why that is actually better than each individual approach by itself.

  1. Some of its common queries, mostly those involving only two levels worth of data (i.e. a parent and its children), are more simple and efficient in the adjacency list model than in the nested set model.
  2. Most sites that support hierarchical data sets already use the adjacency list model, so roughly half the work of implementing a combined approach is already done.
  3. The adjacency list model is fairly easy to understand and, when effectively laid side-by-side with the nested set model, makes it somewhat easier to follow when looking at data that uses a combined approach.
  4. When developing an application that uses the nested set model, the logic which controls the bound values for each record can be error-prone early in development. By also maintaining a foreign key column that points back to the parent, the bound values can easily be regenerated at any time.

My work in Forkr is still very much in an alpha stage, but once it's completed I'll likely post another blog entry on this topic that outlines how to implement the nested set model with specific SQL examples. So, keep your eyes peeled for it.

Update: Wow... apparently I unknowingly struck a nerve. :P The earlier comments are correct in that manipulating nested set bounds does indeed require updates on many records of a table. The extra speed in retrieving the hierarchy has to come from somewhere, right? The nested set model is admittedly better for instances where the tree isn't likely to change often or have multiple potential points of modification that can run concurrently. My apologies for not making that clear earlier on.

Why I Still Hate System Administration

So, Chris Cornutt's comment on my last entry regarding LD_LIBRARY_PATH did turn out to be my issue. Once it was properly set, libphp5.so was properly able to see needed libraries and Apache was able to boot successfully.

Some time later, I was able to get the PDO and PDO_INFORMIX extensions from PECL installed. I was able to install the Informix Client SDK, despite post-installation configuration on the client system being a bit tedious to wade through.

As of the end of the day Wednesday, though, I still couldn't successfully get PHP and Informix speaking on the same wavelength. I have a feeling, though, that this is more than likely due to an issue with Informix than with PHP.

I've tried logging in as both root and as the user who owns all of the tables in the database I'm trying to access. However, neither prevents this error from appearing when I attempt to connect to the server from PHP.

Fatal error: Uncaught exception 'PDOException' with message 'SQLSTATE=HY000, SQLDriverConnect: -329 [Informix][Informix ODBC Driver][Informix]Database not found or no system permission.'

I've used the dbaccess utility that comes with Informix on the database server box to confirm that both logins are valid and can see the database I'm trying to connect to. I've already checked from both the server and client boxes to confirm that the hostname being used maps to the correct IP address on the LAN, so the issue is not a host identity crisis.

It seems odd that I would be able to access the database with no issues on the database server, but that may just be a lack of understanding of Informix user permissions on my part. Oh well... I'll be hitting it again tomorrow, so wish me luck and throw in pointers if you're familiar with Informix.

Update: A coworker of mine did eventually find the problem, which had to do with weirdness in the Informix database server configuration. If you didn't know, Informix stores its data in a fashion similar to that of MySQL: each database has its own directory and each table one or more files within that directory. Apparently, because of the server configuration, when specifying a database, the entire path to the database directory had to be provided instead of just the database name, hence the "Database not found" error. Now I get to engage in the exciting task of porting a large C-based CGI application over to PHP. Joy...

Why I Hate System Administration

I got assigned to a new project at work recently. At first, I was delighted, because I knew that the project requirements would allow me to use PHP. My job generally involves Oracle Forms and Reports or PL/SQL data cartridges, so going back to my language of choice seemed like it would be a refreshing change. The client, however, has rather interesting requirements regarding the server environment.

First, they're running on Sun hardware and as such require that they maintain a Sun operating system, namely Solaris 10, only that hardware. Second, their applications are currently running off of an Informix 7 database server.

There is currently only one PHP extension for interfacing with Informix that is actively maintained: PDO_INFORMIX, which is in PECL and is maintained by IBM (who bought out Informix shortly after the client got this database server). The latest release of this extension was made in February 2007 and it relies on PDO 1.0, which is about 3 versions back from the current version of PDO. That's not to mention that you need the Informix Client SDK installed on the system to install PDO_INFORMIX.

The only Solaris builds of PHP are on the Freeware for Solaris web site and the only build that will run on Apache 2.0.59 (what's currently running on the client's production server) is PHP 5.2.1, which has the latest version of PDO built in by default. There's no way to override this with an installation of the separate PECL extension, at least that I know of. The only way to get around it is to recompile PHP from source and disable the default inclusion of PDO.

I spent today trying to figure out how to get PHP to compile from the source tarball offered on that web site so that I could disclude PDO from being built into the interpreter and then install it as a separate PECL extension to allow me to install PDO 1.0 to allow PDO_INFORMIX to be installed and work correctly. No matter what I tried, I got this error message:

ld.so.1: httpd: fatal: relocation error: file /usr/apache/libexec/libphp5.so: symbol sapi_module: referenced symbol not found

I've been told (and I believe it) that libphp5.so is expecting there to be some Apache files somewhere and obviously just isn't finding them, even though I've used the --with-apache2 and --with-apxs2 options when configuring PHP for installation from source and tried installing Apache from source as well.

The last thing I did before leaving was start a fresh installation of Solaris 10 to a VMware disk, which I intend to make a snapshot and backup copy of before I try again. I'm hoping that my initial attempts to install PHP and Apache from their Solaris packages is causing some sort of conflict, though I have to say I have my doubts. I'll continue searching for answers tomorrow. In the meantime, any sagely advice in the comments would be welcome.

PHP Articles and News

Etienne Kneuss has an excellent article explaining late static bindings, a prospective feature for PHP 6 (and possibly 5). If you have any interest in scope or context as it relates to inheritance, I highly recommend you check it out.

Also, while this one is a few months old, I still think it's link-worthy: Ligaya Turmelle wrote an article illustrating the various types of SQL JOIN clauses in a very simple, easy-to-understand manner.

Elizabeth Marie Smith has been working on a new project recently to create a PHP 5-geared PECL extension to wrap the Win32 API as a nicer alternative to Winbinder.

Update: While Stijn Leenknegt isn't the first person to suggest the "function-return-array" idea, I think it's certainly been a popular suggestion and may even make it into PHP 6. It's definitely a useful feature in JavaScript.

Page:  1 2 … 7