Another Year, Another Summary

After seeing several people do the year-end blog post thing, I decided to take a look at my post from last year and follow suit with one for this year.

First, I thought I'd look at the list of goals I set for this year and see how I did.

  • Fail: I ended up leaving my job in Baton Rouge and also withdrawing from BROUG due to other commitments, so I wasn't able to get very far into my term as their VP.
  • Draw: I did read and write a review for one book, though it wasn't in my existing backlog. I managed to read or at least skim through most of The Pragmatic Programmer and blog about it several times. No other reading to speak of, though.
  • Win: I submitted one paper for OSCON that was rejected, then submitted four for ZendCon and got one accepted. I gave one of the others as an uncon session.
  • Win: The aforementioned book review was also released as a PHP Abstract podcast episode. While it hasn't been published yet, I do have another magazine article coming out next month that I wrote this year.
  • Win: Part of the reason I haven't done more in the way of podcasts and articles is that I did get a publishing deal and find myself a technical editor. I'm in the ever-so-slow process of writing the book now. I'm hoping it will be out in the first half of 2009 and will post updates here as they arise.
  • Fail: While I did get initial code and unit tests developed for Zend_Service_RememberTheMilk, the unit tests I did ended up needing to be refactored to use local static files representing expected web service responses. In short, my interest and energy levels were depleted before that portion of the project was completed.
  • Win: I did migrate my blog to Habari. I can't say it's been entirely stable or free of issues (though it's admittedly still pre-1.0), but the ride has been interesting nonetheless. I may look into migrating to something else later depending on how Habari does. WordPress seems to be improving, so maybe I jumped that ship too soon.
  • Fail: No developments on a content management project yet, just ideas floating around. I may get around to it eventually, but for the moment I've got enough other projects keeping me busy.
  • Fail: The local music scene web site project never got rebooted either, again due to lack of time and interest.

So 4 wins, 4 fails, and 1 draw. At least I broke even. I also did accomplish a few things that weren't originally on my list.

  • I had a large hand in rebooting the Acadiana Open Source Group, a local agnostic open source software user group. It's had monthly meetings almost every month since April and I consider it a fair success.
  • I was present for the first public offering of and among the first to take the new Zend Framework certification exam. Though it was rather last minute that I was able to get a slot, happily I was able to pass.
  • I switched jobs twice. The first change was to surgiSYS, LLC. I enjoyed my work there and later made the difficult decision to leave for other opportunities at Blue Parabola, LLC, where I'm currently working and having a blast.
  • I attended php|works and PHP Appalachia both for the first time. Both were immensely fun and informative events and I hope to continue my presence at both next year.
  • I'm now a technical editor for php|architect magazine. So, if you submit an article for the magazine, chances are you might be working with me to polish it up before it goes to print.

And I've got an updated goal list for this year, of course.

  • Finally finish my book and hold a published copy in my hands! (And hopefully get requests to sign a few. Hey, a guy can hope.)
  • Continue this year's success of the Acadiana Open Source Group and become involved in other technology and social media related events in the area, such as the TechSouth conference.
  • Publish another magazine article. PHP podcasts, it seems, are going the way of the dinosaur. The two most well-known, PHP Abstract and P3, haven't published episodes in a few months.
  • Get accepted to speak at another conference. It's looking like php|tek is going to be a bust for me this year. I submitted three papers - web scraping, IRC bots, and a repeat of my Zend Framework web services talk. The conference lineup does look really good, though, so I do hope I get to attend. If I do, I'll probably do a BYOL Phergie hacking session for the uncon. (And hopefully Phergie will be to 2.0 stable by that point.)
  • Get a new non-PHP certification, likely MySQL CMDEV.
  • Finally get around to reading Sara Golemon's Extending and Embedding PHP and K&R's The C Programming Language (a Christmas present from this year) so I can contribute to PHP in some way, be it internals or PECL.

So, I hope your year was as productive as mine, or more so, and wish you all the best in 2009. Happy New Year!

Seven Things - Tagged by Keith Casey

My Blue Parabola colleague Keith Casey decided to pull me into one of those viral tagging games. Since my blog has been building up a little dust lately between my Blue Parabola blog posts and my book, I decided to kill two birds with one stone by obliging him and freshening up my landing page a bit.

So, on to the "seven things you may not know about me" bit.

  • I used to play the flute. I did it all through middle school and then my freshman year of high school. When I was about to enter middle school, I was brought along with a group of new students to a table to sample the various instruments that were available to me and selected the flute because I liked the sound. The time commitment and demanding activities involved in the high school marching band led me to abandon it. I continued to play here and there for church and the like, but haven't picked it up in years. Currently, I can strum a few guitar chords and aspire to one day pick up drums and piano.
  • Web development was my bread and butter through most of college. In 2001, I landed my first job with a small web development group within Gannett tasked with creating a custom CMS for news publications within the state. After being there just under a year, I moved on to work for a web development company for three and a half years and finished off my degree working for an education-focused non-profit for a year before finally graduating at the end of 2006.
  • My bachelor's degree was in computer science. OK, maybe most of you already knew that. My concentration was video game design and development. There, is that better? I started out in information technology, moved on to cognitive science, and then finally finished out in video game development only a few semesters after the concentration first became available. My interests have leaned more toward web development and I haven't really touched game development since I left school.
  • I have a black belt in judo. Between ages 12 and 18, I was actively involved in a local club and took my shodan test shortly after graduating from high school (the minimum age to test was 18). At one point, I held certifications to referee at local tournaments and teach several kata forms. I acted as a coach in my club during my last few years of high school. My mother and brother were also in the club with me and some of the fondest memories I have from my teen years are of times we shared there.
  • While in judo, one of my better known talents was jumping over people. One of the skills practiced in judo is break falls, which are generally used to take falls without injury while practicing throwing techniques. One of its more "flashy" applications for the purposes of demonstration is leaping over a group of people arranged in a line and falling safely on the other side. My records were three standing men of roughly equal height and ten preteen kids sitting cross-legged. I was often referred to as being "half gazelle."
  • I originally met my wife when I was 12. Want to guess how we met? The judo club I just mentioned. At the time that I started, she was 16 and an orange belt. She was in a car accident shortly thereafter and was out for six years. When she finally came back, I was 18 and a black belt. The first words I ever said to her (after listening to her babble for three weeks straight in determination to befriend me) were "Do you ever shut up?" (No, I'm not kidding.)
  • During college, I grew my hair out to shoulder length. My parents always had me cut my hair before it got to be very long. While I was in high school, it was also against school policy for boys to wear their hair longer than a certain length. Once I was in college, I decided to try it just to have the experience. I cut it short again shortly after Hurricane Lili left us without working air conditioning for a week. While I'm glad to have done it, I don't expect to repeat the experience in the future. Related photos are probably buried somewhere and I'm too lazy to go digging for them.

And now on to the fun part where I get to tag seven other people into doing this.

Benchmarking PHP HTTP Clients

If you read my blog semi-regularly, you might remember when I mentioned that my book would be released later on this year. Unfortunately, that project had to be put on hold in favor of a few other projects. Now that those are winding down, however, I'm able to return to working on the book. I'm hoping the manuscript will be completed by the end of March 2009.

One of the interesting bits of research that I've done is benchmarking various mainstream PHP HTTP clients. Of course, we all know that there are lies, damned lies, statistics, and benchmarks, so take these with a grain of salt. They were run on my Sony Vaio, which has an Intel C2D T5550 @ 1.83GHz and 2 GB of RAM and runs Ubuntu 8.10 (Intrepid Ibex) with its standard php5 package. According to Speedtest.net, my Cox Cable connection has a 12,375 kb/s download rate and a 5,998 kb/s upload rate.


// pecl_http (1.6.1)
$response = http_get(
    'http://paste2.org/new-paste',
    array('connecttimeout' => 15)
);
echo 'http ', strlen($response), PHP_EOL;

// streams http wrapper
$response = file_get_contents('http://paste2.org/new-paste');
echo 'streams ', strlen($response), PHP_EOL;

// curl (php5-curl Ubuntu package:
// libcurl/7.18.2 OpenSSL/0.9.8g zlib/1.2.3.3 libidn/1.8)
$ch = curl_init('http://paste2.org/new-paste');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);
echo 'curl ', strlen($response), PHP_EOL;

// PEAR::HTTP_Client (PEAR 1.7.2, HTTP_Client 1.2.1)
$error = error_reporting(E_ALL);
require_once 'HTTP/Client.php';
$client = new HTTP_Client();
$client->get('http://paste2.org/new-paste');
$response = $client->currentResponse();
$response = $response['body'];
echo 'pear ', strlen($response), PHP_EOL;
error_reporting($error);

// Zend_Http_Client (SVN r12780)
require_once 'Zend/Http/Client.php';
$client = new Zend_Http_Client('http://paste2.org/new-paste');
$response = $client->request()->getBody();
echo 'zend ', strlen($response), PHP_EOL;

The Ubuntu packages for Xdebug (php5-xdebug) and KCachegrind produced the following results for this script.

pecl_http 20.08%
streams 19.81%
curl 19.83%
pear 19.73%
zend 19.88%

So the performance of these components is roughly equivalent. One thing that's interesting is that the call tree for PEAR is actually the longest (four calls underneath the one shown in the source here) and at the bottom is a call to gethostbyname, which takes 18.97% of the script's runtime, putting the amount used by the calls above it at 0.76%. This suggests that most of the time taken by the other components is likely spent on the same DNS lookup.
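One way to check that name resolution dominates is to time the lookup on its own and compare it with a full request. The following is a minimal sketch and was not part of the benchmark above: time_call() and compare_dns_to_request() are hypothetical helpers, and closures require PHP 5.3 or later.

```php
<?php
// Hypothetical helper: run a callable and report how long it took in ms.
function time_call($fn)
{
    $start = microtime(true);
    $result = $fn();
    $elapsed = (microtime(true) - $start) * 1000;
    return array('ms' => $elapsed, 'result' => $result);
}

// Hypothetical helper: time a DNS lookup and a full GET separately.
// It performs network I/O, so it is only defined here, not called.
function compare_dns_to_request($host, $path)
{
    $dns = time_call(function () use ($host) {
        return gethostbyname($host);
    });
    $get = time_call(function () use ($host, $path) {
        return file_get_contents('http://' . $host . $path);
    });
    return array('dns_ms' => $dns['ms'], 'request_ms' => $get['ms']);
}
```

Calling compare_dns_to_request('paste2.org', '/new-paste') would return both timings. The operating system's resolver may cache the first lookup, so the second figure can understate the cost of DNS; treat the numbers as indicative only.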

Let's try a slightly more complex request.


$post = array(
    'lang' => 'php',
    'description' => '',
    'code' => 'test',
    'parent' => '0'
);

// pecl_http
$response = http_post_fields(
    'http://paste2.org/new-paste',
    $post,
    null,
    array('connecttimeout' => 15)
);
echo 'http ', strlen($response), PHP_EOL;

// streams http wrapper
$context = stream_context_create(array(
    'http' => array(
        'method' => 'POST',
        'header' => 'Content-Type: application/x-www-form-urlencoded',
        'content' => http_build_query($post)
    )
));
$response = file_get_contents('http://paste2.org/new-paste', false, $context);
echo 'streams ', strlen($response), PHP_EOL;

// curl
$params = array(
    CURLOPT_URL => 'http://www.paste2.org/new-paste',
    CURLOPT_POST => true,
    CURLOPT_HEADER => true,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POSTFIELDS => $post
);
$ch = curl_init();
foreach ($params as $key => $value) {
    curl_setopt($ch, $key, $value);
}
$response = curl_exec($ch);
curl_close($ch);
echo 'curl ', strlen($response), PHP_EOL;

// PEAR::HTTP_Client
$error = error_reporting(E_ALL);
require_once 'HTTP/Client.php';
$client = new HTTP_Client();
$client->post('http://paste2.org/new-paste', $post);
$response = $client->currentResponse();
$response = $response['body'];
echo 'pear ', strlen($response), PHP_EOL;
error_reporting($error);

// Zend_Http_Client
require_once 'Zend/Http/Client.php';
$client = new Zend_Http_Client('http://paste2.org/new-paste');
$client->setParameterPost($post);
$response = $client->request('POST')->getBody();
echo 'zend ', strlen($response), PHP_EOL;

And here are the Xdebug + KCachegrind results for the execution of this script.

pecl_http 12.56%
streams 25.02%
curl 12.69%
pear 24.81%
zend 24.81%

The gethostbyname call in the PEAR call stack again takes up the majority of its runtime, 21.05% in this case. That puts the remainder of the time for PEAR at 3.76%. pecl_http and curl perform roughly the same as each other and take about half the time of the others. Oddly, streams (implemented in C, like pecl_http and curl) takes about as long as the libraries written in PHP.

I have a semi-educated guess as to why this is. PEAR makes two gethostbyname calls to process the request, presumably one for the initial POST and one for a GET that follows because the POST response includes a Location header. Zend appears to make two stream_socket_client calls for the same reason. Streams do not appear to implicitly cache DNS lookups, so the HTTP streams wrapper is most likely in the same situation.
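The redirect half of that guess can be checked with the streams wrapper by turning off redirect following and looking for a Location header in the POST response. This is only a sketch under assumptions: find_location_header() and build_no_redirect_context() are hypothetical helpers, and the follow_location context option requires PHP 5.3.4 or later.

```php
<?php
// Hypothetical helper: return the Location header value from a list of raw
// response header lines (the format of $http_response_header), or null.
function find_location_header(array $headers)
{
    foreach ($headers as $line) {
        if (stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }
    return null;
}

// Hypothetical helper: build a POST context that does not follow redirects.
function build_no_redirect_context(array $post)
{
    return stream_context_create(array(
        'http' => array(
            'method' => 'POST',
            'header' => 'Content-Type: application/x-www-form-urlencoded',
            'content' => http_build_query($post),
            'follow_location' => 0, // do not follow Location redirects
            'max_redirects' => 1    // 1 or less also means no redirects
        )
    ));
}
```

Passing build_no_redirect_context($post) as the third argument to file_get_contents and then handing $http_response_header to find_location_header() should show whether the server answers the POST with a redirect. If it does, profiling the request with redirects disabled would show how much of the streams time comes from the follow-up GET.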

The existence of the CURLOPT_DNS_USE_GLOBAL_CACHE option and the http.request.datashare.dns configuration setting and the fact that both are enabled by default lead me to believe that the curl and pecl_http extensions do cache DNS lookups and thus don't suffer the performance hit of repeating them. Can anyone confirm or deny this?

AWDG November 2008 Meetup Slides are up

While I was in Atlanta this past week for php|works / PyWorks Conference, I volunteered to speak at the November meetup for the Atlanta Web Designers Group. Slides and demo code from my presentation can now be found in the Publications area of this web site. Thanks to the group for their invitation and hospitality and to Ben Ramsey for introducing us.

Natural Ordering in MySQL

I ran into an instance recently where I wanted to implement natural sorting of a result set in MySQL. When you're dealing with numerical strings or strings with a common non-numeric prefix, the common solution of casting the order column to an integer by adding zero to it works fine. However, if neither of the aforementioned conditions is the case, it takes a little more work.

What actually happens when you add zero to a non-numeric column depends on the characters at the beginning of the column value. If the column does not begin with a sequence of one or more numeric characters, then adding zero to that column produces zero. (Ex: "dog" + 0 = 0) If the column does begin with numeric characters, then adding zero to it produces the sequence of numeric characters up to the first non-numeric character in the original value or the end of the value, whichever comes first. (Ex: "12 dogs" + 0 = 12) An example might be the easiest way to illustrate this.

mysql> SELECT name+0<>0, name+0, name
    -> FROM `recommendation`
    -> ORDER BY name+0<>0 DESC, name+0, name;
+-----------+--------+------------------------+
| name+0<>0 | name+0 | name                   |
+-----------+--------+------------------------+
|         1 |      3 | 3 month follow-up      |
|         1 |      6 | 6 month follow-up      |
|         1 |     12 | 12 month follow-up     |
|         0 |      0 | Intervention           |
|         0 |      0 | Observation            |
|         0 |      0 | Specialty Consultation |
+-----------+--------+------------------------+
6 rows in set (0.00 sec)

The first ORDER BY expression checks whether the string begins with numeric characters, then places results for those that do first. If you prefer that numeric results appear after non-numeric results, then you can exclude this expression.

The second ORDER BY expression orders the numeric results by casting them to numbers and ordering by those numbers.

The third expression orders the non-numeric results by the original column value.
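For anyone who needs the same ordering on the PHP side, the three sort keys can be mimicked with usort. This is a rough sketch rather than an exact port: leading_number() and natural_compare() are hypothetical names, the (int) cast only approximates MySQL's string-to-number conversion for non-negative whole numbers, and strcasecmp stands in for a case-insensitive collation.

```php
<?php
// Rough PHP analogue of MySQL's name+0: an (int) cast keeps the leading
// digits of a string and yields 0 when there are none.
function leading_number($value)
{
    return (int) $value;
}

// Hypothetical comparator mirroring
// ORDER BY name+0<>0 DESC, name+0, name
function natural_compare($a, $b)
{
    $numA = leading_number($a);
    $numB = leading_number($b);

    // name+0<>0 DESC: values with a nonzero numeric prefix come first.
    $hasA = ($numA !== 0) ? 1 : 0;
    $hasB = ($numB !== 0) ? 1 : 0;
    if ($hasA !== $hasB) {
        return $hasB - $hasA;
    }

    // name+0: order by the numeric prefix.
    if ($numA !== $numB) {
        return ($numA < $numB) ? -1 : 1;
    }

    // name: fall back to the original string.
    return strcasecmp($a, $b);
}

$names = array(
    'Observation',
    '12 month follow-up',
    'Specialty Consultation',
    '3 month follow-up',
    'Intervention',
    '6 month follow-up'
);
usort($names, 'natural_compare');
print_r($names);
```

With the sample values from the query above, the sorted array comes out in the same order as the MySQL result set.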

And that's all there is to it. Hope this proves helpful to someone.
