According to this post at Jungle Disk, my SSH connections to some servers are stalling (for nearly a minute) because of changes to DNS resolution in OS X Leopard. The root cause is probably in my router, which I need to dig into. In the short term, I added OpenDNS to my list of DNS servers, and that seems to speed things up considerably.
Archive for March, 2008
At Heavy, we have been using a database ORM called Propel for some time now. I’m not a huge fan, to be honest, but it is what it is. One issue that we ran into is that Propel generates about five class files for every table that you model in PHP. With a few hundred tables, there are quite a few class files in our database code. We use Propel in an application layer behind our front-end web servers. Given that some of the controller scripts for the application layer might handle a variety of requests and deal with a large number of tables, you might imagine that a single script could end up requiring 50-200 class files at runtime.
Yuck.
Last fall, we took a close look at how to split up this load and compile only the PHP class files that a request truly needs. We set up an autoload routine and saw dramatic results. Typical CPU usage on each four-processor server dropped from about 175% to around 50%. Much of the CPU seems to have been eaten up by constant recompiling of this huge number of class files. (Note: we also looked at using APC in the past to cache compiled PHP code, but ran into regular segfault issues on Apache when running it, which I believe are a fault of our own legacy configurations.)
Our autoloader looks something like this:
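function __autoload($class_name)
{
    // Only load the file if the class is not defined yet; the second
    // argument (false) stops class_exists() from re-triggering autoload.
    if (!class_exists($class_name, false)) {
        // PEAR-style mapping: underscores in the class name become directories.
        $class_file_path = str_replace('_', '/', $class_name) . '.php';
        require($class_file_path);
    }
}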
There is an assumption here that you’ve got your PHP classes in order: one class per file, using a PEAR-ish naming convention, where the class Input_Validator is stored as Input/Validator.php in your include path.
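On newer PHP versions, `__autoload()` is deprecated (PHP 7.2) and removed (PHP 8.0) in favor of `spl_autoload_register()`. The following is a minimal sketch of the same idea under that API, assuming PHP 5.3.2 or later. The helper name `class_name_to_path()` is made up for illustration and is not part of our codebase.

```php
<?php
// Hypothetical helper: map a PEAR-style class name to a relative file path.
function class_name_to_path($class_name)
{
    return str_replace('_', '/', $class_name) . '.php';
}

// Register the loader; spl_autoload_register() allows several loaders to coexist.
spl_autoload_register(function ($class_name) {
    $path = class_name_to_path($class_name);

    // Returns the absolute path if the file is found on the include path,
    // or false if it is not.
    $resolved = stream_resolve_include_path($path);

    if ($resolved !== false) {
        require $resolved;
    }
});
```

Unlike the original, this version quietly skips classes it cannot find rather than failing on `require`, which lets any other registered loader have a try.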
If your class files are always located in one library directory, you can get some extra points by using an absolute path in your autoloader so that PHP doesn’t have to search your configured include path. For us, we’ve modeled our include path so that we can use custom classes, Zend Framework classes, and PEAR classes together. This ensures that our custom libraries override calls for similarly named classes in Zend or PEAR (which can be bad if you’re not careful!).
- Current working directory (.)
- Custom library path
- Zend Framework path
- PEAR path
- PHP standard path
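As a rough sketch, that search order could be set at runtime like this. Every directory name below is a placeholder and not one of our real paths; the same thing can also be done with the `include_path` setting in php.ini.

```php
<?php
// Placeholder directories, listed in the order PHP should search them.
$search_order = array(
    '.',                      // current working directory
    '/path/to/custom/lib',    // custom library path (hypothetical)
    '/path/to/zend/library',  // Zend Framework path (hypothetical)
    '/path/to/pear',          // PEAR path (hypothetical)
);

// Keep whatever PHP was already configured with as the last resort.
$search_order[] = get_include_path();

set_include_path(implode(PATH_SEPARATOR, $search_order));
```

Because PHP walks the include path from left to right, a class file in the custom library directory wins over a file with the same relative name in Zend or PEAR.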
Since we use this across our code base, we included the autoload function in a script that gets automatically loaded for every PHP script using auto_prepend_file. I can hear the groans already. The truth is that this works very well for our codebase. Your mileage may vary.
What does this gain us?
First of all, we improved the overall performance of our application, and I think pretty dramatically. Figures 1 and 2 (below) show Munin reports of our CPU usage on front-end servers and application servers. The drops in October and November coincide with our code promotions that included class autoloading.
Simplified development is another benefit. We never have to think about whether we’ve already included a class file in a script we’re writing, or whether it was conditionally included somewhere other than the top of the script file. If the class hasn’t been defined, the autoloader takes care of loading the file.
Finally, classes that fall out of use in our codebase automatically fall out of the runtime code. Unless we use a class, its class file is never included in the running script. When managing a lot of legacy code, possibly written by people who no longer work with you, this works great for culling out the old classes.

Figure 1. CPU usage for front-end web server

Figure 2. CPU usage for back-end application server
I’ve read both of these before, but always love returning to them:
For three years, we’ve had a “home phone” number in our apartment. Originally, I needed it because I was working from home for a few months and my old cell phone didn’t even get decent reception up here (30 floors up). Yesterday, I finally cut the cord.
Now we have to figure out what to do with that extra $20 per month. We’ve been thinking hard about subscribing to the Japanese TV channel on TimeWarner. I’m looking forward to watching some of the great soccer games coming out of Japan.
I’ve been looking at PHP sessions again for a new project. At Heavy, we’re very conservative about using sessions at all on the site. So I have been thinking about the performance impact of mindlessly turning on session.auto_start.
Let’s start with the assumption that on many web sites, only a small percentage of web traffic is actually from visitors who are logged in. In other words, many visitors arrive at your site, look at a page or two, maybe search for products or content, and then leave. Why bother setting a session for all of these page views if you’re not storing anything in it?
What happens when a PHP session starts up? A cookie is set for the visitor’s browser with the session identifier, “PHPSESSID”, by default. The session data, if available, is loaded from the session store, which is in a file — unless you’ve moved it to something faster. The data that is loaded is then unserialized into parameters in the $_SESSION global.
Additionally, at startup the PHP session handler rolls the dice to see if it’s time to do garbage collection on expired sessions. Unless you’ve changed PHP’s default settings, there’s a 1% chance on each session start that PHP will have to sort through all of your session files to find any that are ripe for dismissal.
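That 1% comes from two ini settings, `session.gc_probability` (default 1) and `session.gc_divisor` (default 100). Here is a small sketch for checking what a given server is actually configured with. The function name `session_gc_chance()` is made up for this example, and some distributions ship different values than the PHP defaults.

```php
<?php
// Hypothetical helper: chance (0.0 to 1.0) that a session start triggers
// garbage collection, based on the current ini settings.
function session_gc_chance()
{
    $probability = (int) ini_get('session.gc_probability');
    $divisor     = (int) ini_get('session.gc_divisor');

    // Guard against a zero or missing divisor.
    if ($divisor <= 0) {
        return 0.0;
    }

    return $probability / $divisor;
}

printf("GC chance per session start: %.2f%%\n", session_gc_chance() * 100);
```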
This is just a summary. I’ll readily admit I don’t know all of the internals of session management in PHP. I also can’t speak to whether PHP re-saves your visitor’s session on every page view, whether or not the data has been changed. If anyone can answer that, I’d love to know.
Finally, one other thing that I’ve noticed is that once you have a PHP session running, some additional cache-busting HTTP headers seem to be added to the server response for a page view:
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
These headers make it impossible for your browser to cache a page, even if it’s a PHP script that returns virtually static content.
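As far as I can tell, these headers are controlled by the `session.cache_limiter` setting, which defaults to `nocache`. A minimal sketch of overriding it before the session starts might look like the following. The wrapper function name is made up for illustration, and whether `private` is the right choice depends on what your pages contain.

```php
<?php
// Hypothetical wrapper: start a session with a chosen cache limiter.
// Values PHP accepts include 'nocache' (the default), 'private',
// 'private_no_expire', 'public', and '' (send no cache headers at all).
function start_session_with_limiter($limiter = 'private')
{
    // Must be called before session_start() to have any effect.
    session_cache_limiter($limiter);

    return session_start();
}
```

With `private`, the browser may cache the page but shared proxies should not, which is usually a safer middle ground for pages that vary per visitor.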
So tonight I took an extra step after setting up a login class that makes use of sessions for storing a visitor’s username and password credentials. The code checks whether or not a session handler needs to be fired up: session_start() is called only if the HTTP request includes a session cookie.
if (isset($_COOKIE['PHPSESSID'])) {
    session_start();
}
This code could go in one of two places, either in the constructor for a login class, or if you potentially need session data in more places in your code, maybe in a file that gets auto-loaded on every request.
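If the session name has been changed from the default, the hard-coded "PHPSESSID" check will miss the cookie. A slightly more defensive sketch asks PHP for the current name with `session_name()`. The helper `has_session_cookie()` is a made-up name for this example.

```php
<?php
// Hypothetical helper: true if the request carried a non-empty cookie
// with the given session name.
function has_session_cookie(array $cookies, $name)
{
    return isset($cookies[$name]) && $cookies[$name] !== '';
}

// Only start a session for visitors who already have one.
if (has_session_cookie($_COOKIE, session_name())) {
    session_start();
}
```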
Once you get to a sign-in page, the login class would be responsible for firing up the session if it does not yet exist. For me, this is in a method called “authenticate()”:
public function authenticate()
{
    if (!isset($_SESSION)) {
        session_start();
    }
    // Do rest of user validation...
}
Note that we can use isset() to see if $_SESSION exists, which prevents E_NOTICE messages from being fired by session_start() if there is already a session in progress.
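On PHP 5.4 and later, `session_status()` answers the same question more directly than checking `$_SESSION`. A minimal sketch, with a made-up function name:

```php
<?php
// Hypothetical helper: start a session only if none is active yet.
// Returns true when a session is active afterwards.
function ensure_session_started()
{
    if (session_status() === PHP_SESSION_NONE) {
        return session_start();
    }

    // Either already active, or sessions are disabled entirely.
    return session_status() === PHP_SESSION_ACTIVE;
}
```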
With these small changes, I can surf all over my site without having a session started up. The session is only initialized when I log into my site’s account. Furthermore, you could explicitly delete the visitor’s session cookie once they have logged out of your site. While session_destroy() will delete the data within the session file, it doesn’t delete the cookie from your visitor’s browser.
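Here is one way the logout step could look, as a sketch rather than the code I actually run. It assumes PHP 5.4 or later, and the names `expired_cookie_args()` and `log_out()` are made up for this example. The cookie is removed by re-sending it with an expiry time in the past, using the same path and domain it was set with.

```php
<?php
// Hypothetical helper: build setcookie() arguments that expire a cookie.
// $params is expected to look like the array from session_get_cookie_params().
function expired_cookie_args($name, array $params, $now)
{
    return array(
        $name,
        '',             // empty value
        $now - 42000,   // expiry well in the past
        $params['path'],
        $params['domain'],
        $params['secure'],
        $params['httponly'],
    );
}

// Hypothetical logout routine: clear the data, the cookie, and the session.
function log_out()
{
    if (session_status() !== PHP_SESSION_ACTIVE) {
        session_start();
    }

    // Clear the session data held in memory for this request.
    $_SESSION = array();

    // Ask the browser to drop the session cookie.
    if (ini_get('session.use_cookies')) {
        $args = expired_cookie_args(session_name(), session_get_cookie_params(), time());
        call_user_func_array('setcookie', $args);
    }

    // Remove the stored session data on the server.
    session_destroy();
}
```

Once the cookie is gone, the conditional check for a session cookie stops starting sessions for that visitor until they sign in again.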
I’ve been waiting for this news for a while, and friends of mine have, too. Google has opened a Contacts API that allows developers to manage or sync contacts with your Google account. Maybe I can finally ditch Plaxo, which just seems a little weird to me now that they are trying to extend into the social network space with their “Pulse” product. All I want is an address book. If I add a new contact on my phone, I want to see it at home and at work. If I delete one at work, I want it deleted on my phone and at home. And if my wife has access and changes one of my contacts, I want that change to show up for me, too. That seems like it’d be nice.
Additionally, there’s a first step toward syncing your Google Calendar with Microsoft Outlook. It looks like it’s only supported on Windows so far. What I’m still waiting for is good syncing between Google Calendar and iCal or, even better, wireless syncing from the iPhone Calendar app. If Google were to open a sync API for Calendar, I’m sure plenty of application support would be just around the corner.
Found this news on Matt Cutts’ blog.