PHP

Two Munin Graphs for Monitoring Our Code

Posted by Mike Brittain on December 17, 2009
PHP, WWW / Comments Off

We’re using a few custom Munin graphs at CafeMom to monitor our application code running on production servers. I posted two samples of these to the WebOps Visualization group on Flickr. The first graph measures the “uptime” of our code: how many minutes it’s been since our last deployment to prod (capped at one day). The second graph provides a picture of what’s going on in our PHP error logs, highlighting spikes in notices, warnings, and fatals, as well as DB-related errors that we throw ourselves.

When used together, these graphs give us quick feedback on what sort of errors are occurring on our site and whether they are likely to be related to a recent code promotion, or are the effect of some other condition (bad hardware, third-party APIs failing, SEM gone awry, etc.).

I figured someone might find these useful, so I’m posting the code for both Munin plugins.

Code Uptime

When we deploy code to our servers, we add a timestamp file that is used to manage versioning of deployments, which makes things like rolling back super easy. It’s also handy for measuring how long it’s been since the last deployment. All this plugin does is read how long ago that file was modified.

We run multiple applications on the same clusters of servers. I wrote our code deployment process in a manner that allows for independent deployment of each application. For example, one team working on our Facebook apps can safely deploy code without interfering with the deployment schedule another team is using for new features that will be released on the CafeMom web site.

Each of these applications is deployed to a separate directory under a root, let’s say “/var/www”. This explains why the plugin reads a list of files (directories) under APPS_ROOT. Each app has its own reported uptime on the Munin graph.

#!/bin/sh
#
# Munin plugin to monitor the relative uptime of each app
# running on the server.
#

APPS_ROOT="/var/www/"

# Cap the uptime at one day so as to maintain readable scale
MAX_MIN=1440

# Configure list of apps
if [ "$1" = "config" ]; then
 echo 'graph_title Application uptimes'
 echo "graph_args --base 1000 --lower-limit 0 --upper-limit $MAX_MIN"
 echo 'graph_scale no'
 echo 'graph_category Applications'
 echo 'graph_info Monitors when each app was last deployed to the server.'
 echo 'graph_vlabel Minutes since last code push (max 1 day)'

 for file in `ls $APPS_ROOT`; do
  echo "$file.label $file"
 done
 exit 0
fi

# Fetch release times
now_sec=`date +%s`

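for file in `ls $APPS_ROOT`; do
 # Modification time of the timestamp file, in seconds since the epoch
 release_sec=`date +%s -r $APPS_ROOT/$file/prod/release_timestamp`
 diff_sec=$(( $now_sec - $release_sec ))
 diff_min=$(( $diff_sec / 60 ))
 # Cap at MAX_MIN using a POSIX test; the script runs under /bin/sh,
 # where the bash-only (( ... )) construct is not guaranteed to work
 if [ "$diff_min" -gt "$MAX_MIN" ]; then
  diff_min=$MAX_MIN
 fi
 echo "$file.value $diff_min"
done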
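
If you want to check the capping arithmetic without a real deployment, here is a minimal sketch of it in isolation. The function name minutes_since_release and its arguments are my own invention for illustration and are not part of the plugin; it assumes both times are given in seconds since the epoch.

```shell
# Sketch only: minutes_since_release is a hypothetical helper.
# Usage: minutes_since_release NOW_SEC RELEASE_SEC [MAX_MIN]
# Prints the whole minutes elapsed between the two times, capped at MAX_MIN.
minutes_since_release() {
    now_sec=$1
    release_sec=$2
    max_min=${3:-1440}   # default cap: one day, as in the plugin

    diff_min=$(( (now_sec - release_sec) / 60 ))
    if [ "$diff_min" -gt "$max_min" ]; then
        diff_min=$max_min
    fi
    echo "$diff_min"
}

minutes_since_release 7200 0      # two hours after release: prints 120
minutes_since_release 200000 0    # more than a day: prints the cap, 1440
```

Against a real file you could pass `date +%s` as the first argument and `date +%s -r /path/to/release_timestamp` as the second. That relies on the same GNU date behavior the plugin does, where -r reads a file's modification time.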

Error Logs

The second plugin uses grep to search for occurrences of specific error-related strings in our log files. In our case, the graph period was set to “minute” because that gives the best scale for us (thankfully, it’s not in errors per second!).

If you’re wondering about running grep five times against large error files every time Munin polls (every five minutes), note that we rotate our logs frequently, which keeps these calls manageable. If you run this against very large error logs, you may find gaps in your Munin graphs when the poller times out waiting for the plugin to return data points.

Even if you don’t care about PHP logs, you may find this to be a simple example of using Munin to examine any sort of log files that your application is creating.

#!/bin/bash

#
# Collect stats for the contents of PHP error logs. Measures notice,
# warning, fatal, and parse level errors, as well as custom errors
# thrown from our DB connection class.
#

logs="/var/log/httpd/*.error.log"

# CONFIG

if [ "$#" -eq "1" ] && [ "$1" = "config" ]; then
	echo "graph_title Error Logs"
	echo "graph_category applications"
	echo "graph_info Data is pooled from all PHP error logs."
	echo "graph_vlabel entries per minute"
	echo "graph_period minute"
	echo "notice.label PHP Notice"
	echo "notice.type DERIVE"
	echo "notice.min 0"
	echo "notice.draw AREA"
	echo "warning.label PHP Warning"
	echo "warning.type DERIVE"
	echo "warning.min 0"
	echo "warning.draw STACK"
	echo "fatal.label PHP Fatal"
	echo "fatal.type DERIVE"
	echo "fatal.min 0"
	echo "fatal.draw STACK"
	echo "parse.label PHP Parse"
	echo "parse.type DERIVE"
	echo "parse.min 0"
	echo "parse.draw STACK"
	echo "db.label DB Error"
	echo "db.type DERIVE"
	echo "db.min 0"
	echo "db.draw STACK"
	exit
fi

# DATA

# count_matches prints the total number of lines matching a string across
# all logs. grep --count emits one count per log (prefixed with the file
# name when there are several); cut keeps the count and perl sums the list.
# The "+ 0" prints 0 rather than an empty value when there is nothing to sum.

count_matches() {
	grep --count "$1" $logs | cut -d':' -f2 | perl -lne '$x += $_; END { print $x + 0 }'
}

echo "notice.value `count_matches 'PHP Notice'`"
echo "warning.value `count_matches 'PHP Warning'`"
echo "fatal.value `count_matches 'PHP Fatal error'`"
echo "parse.value `count_matches 'PHP Parse error'`"
echo "db.value `count_matches 'DB Error'`"
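
If the five grep passes ever become too slow for your logs, one option is to count every category in a single pass. The following is a rough sketch of that idea using awk, not what we run in production. The function name count_errors is hypothetical and the sample log lines are made up; the patterns are the same strings the plugin searches for.

```shell
# Sketch only: count_errors is a hypothetical helper.
# Reads log lines on stdin and prints one Munin "field.value N" line
# per error category, scanning the input only once.
count_errors() {
    awk '
        /PHP Notice/      { notice++ }
        /PHP Warning/     { warning++ }
        /PHP Fatal error/ { fatal++ }
        /PHP Parse error/ { parse++ }
        /DB Error/        { db++ }
        END {
            # %d prints 0 for categories that never matched
            printf "notice.value %d\n",  notice
            printf "warning.value %d\n", warning
            printf "fatal.value %d\n",   fatal
            printf "parse.value %d\n",   parse
            printf "db.value %d\n",      db
        }
    '
}

# Made-up sample input: two notices and one warning.
printf '%s\n' \
    '[error] PHP Notice:  Undefined index: foo' \
    '[error] PHP Warning:  Division by zero' \
    '[error] PHP Notice:  Undefined variable: bar' | count_errors
```

To feed it real data you would pipe the logs in, for example `cat /var/log/httpd/*.error.log | count_errors`. Like grep --count, it counts matching lines rather than individual occurrences.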

I’m open to comments and suggestions on how we use these, or how they were written.  Spout off below.


Suggestion of the Day: ack

Posted by Mike Brittain on January 21, 2009
PHP / 2 Comments

If you use grep regularly while you’re programming, try ack. It’s easy to set up in your shell account and takes only a short time to get familiar with. Definitely worth the small amount of trouble.


NBC Mobile Runs Zend Framework

Posted by Mike Brittain on May 09, 2008
PHP / Comments Off

And on top of that, it looks like they forgot to turn off display_errors on their production boxes. See photo courtesy of Mr. Smart.

PHP Tag Tokenizer

Posted by Mike Brittain on April 06, 2008
PHP / Comments Off

I found a great tag tokenizer written in PHP. A few quick tests showed that it does exactly what I was looking for: a space-separated tag parser that recognizes multi-word tags when wrapped in quotes (Flickr-style). Very nice.

Worth sharing: PHP Space-Separated Tag Parser.


Autoloading PHP Classes to Reduce CPU Usage

Posted by Mike Brittain on March 27, 2008
PHP / 3 Comments

At Heavy, we have been using a database ORM called Propel for some time now. I’m not a huge fan, to be honest, but it is what it is. One issue that we ran into is that Propel generates about five class files for every table that you model in PHP. With a few hundred tables, there are quite a few class files in our database code. We use Propel in an application layer behind our front-end web servers. Given that some of the controller scripts for the application layer might handle a variety of requests and deal with a large number of tables, you might imagine that a single script could end up requiring 50-200 class files at runtime.

Yuck.

Last fall, we took a close look at how to split up this load and compile only the PHP class files that are truly needed. We set up an autoload routine and saw dramatic results. We went from a typical 175% CPU usage on each 4-processor server down to around 50%. Much of the CPU seems to have been eaten up by constant recompiling of this huge number of class files. (Note: we also looked at using APC in the past to cache compiled PHP code, but ran into regular segfault issues on Apache when running it, which I believe are a fault of our own legacy configurations.)

Our autoloader looks something like this:

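// Called by PHP the first time an undefined class is referenced.
// (PHP 8 removed __autoload(); spl_autoload_register() takes the same callback.)
function __autoload ($class_name)
{
  if (!class_exists($class_name, false)) {
    // PEAR-style mapping: Input_Validator becomes Input/Validator.php,
    // which require() then resolves against the include path.
    $class_file_path = str_replace('_', '/', $class_name) . '.php';
    require($class_file_path);
  }
}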

There is an assumption here that you’ve got your PHP classes in order: one class per file, using a PEAR-ish naming convention where the class Input_Validator is stored as Input/Validator.php in your include path.

If your class files are always located in one library directory, you can get some extra points by using an absolute path in your autoloader so that PHP doesn’t have to search your configured include path. For us, we’ve modeled our include path so that we can use custom classes, Zend Framework classes, and PEAR classes together. This ensures that our custom libraries override calls for similarly named classes in Zend or PEAR (which can be bad if you’re not careful!).

  1. Current working directory (.)
  2. Custom library path
  3. Zend Framework path
  4. PEAR path
  5. PHP standard path

Since we use this across our code base, we included the autoload function in a script that gets automatically loaded for every PHP script using auto_prepend_file. I can hear the groans already. The truth is that this works very well for our codebase. Your mileage may vary.

What does this gain us?

First of all, we improved the overall performance of our application, and I think pretty dramatically. Figures 1 and 2 (below) show Munin reports of our CPU usage on front-end servers and application servers. The drops in October and November coincide with our code promotions that included class autoloading.

Simplified development is another benefit. We never have to think about whether we’ve already included a class file in a script we’re writing, or whether it was conditionally included somewhere other than the top of the script file. If the class hasn’t been defined, the autoloader takes care of loading the file.

Finally, classes that fall out of use in our codebase automatically fall out of the runtime code. Unless we use a class, its class file is never included in the running script. When managing a lot of legacy code, possibly written by people who no longer work with you, this works great for culling out the old classes.

Figure 1. CPU usage for front-end web server

Figure 2. CPU usage for back-end application server


Improve PHP Performance by Limiting Session Cookies

Posted by Mike Brittain on March 10, 2008
PHP / 2 Comments

I’ve been looking at PHP sessions again for a new project. At Heavy, we’re very conservative about using sessions at all on the site. So I have been thinking about the performance impact of mindlessly turning on session.auto_start.

Let’s start with the assumption that on many web sites, only a small percentage of web traffic actually comes from visitors who are logged in. In other words, many visitors arrive at your site, look at a page or two, maybe search for products or content, and then leave. Why bother setting a session for all of these page views if you’re not storing anything in it?

What happens when a PHP session starts up? A cookie is set for the visitor’s browser with the session identifier, “PHPSESSID”, by default. The session data, if available, is loaded from the session store, which is in a file — unless you’ve moved it to something faster. The data that is loaded is then unserialized into parameters in the $_SESSION global.

Additionally, at startup the PHP session handler rolls the dice to see if it’s time to do garbage collection on expired sessions. Unless you’ve changed the default settings for PHP, there’s a 1% chance that PHP will have to sort through all of your available session files to find out if any are ripe for dismissal.

This is just a summary. I’ll readily admit I don’t know all of the internals of session management in PHP. I also can’t speak to whether PHP re-saves your visitor’s session on every page view, whether or not the data has been changed. If anyone can answer that, I’d love to know.

Finally, one other thing that I’ve noticed is that once you have a PHP session running, some additional cache-busting HTTP headers seem to be added to the server response for a page view:

Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache

These headers make it impossible for your browser to cache a page, even if it’s a PHP script that returns virtually static content.
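
One quick way to see whether a page is sending these headers is to filter them out of the response headers. This is a small sketch under my own assumptions: the function name cache_busting_headers is hypothetical, and the sample response below is made up.

```shell
# Sketch only: cache_busting_headers is a hypothetical helper.
# Reads HTTP response headers on stdin and prints only the three
# headers discussed above (header names are matched case-insensitively).
cache_busting_headers() {
    grep -i -E '^(Expires|Cache-Control|Pragma):'
}

# Made-up sample response headers.
printf '%s\n' \
    'HTTP/1.1 200 OK' \
    'Content-Type: text/html' \
    'Expires: Thu, 19 Nov 1981 08:52:00 GMT' \
    'Pragma: no-cache' | cache_busting_headers
```

Against a live page you could pipe in the output of `curl -sI` for the URL you want to inspect. If nothing is printed, none of the three headers were present in the response.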

So tonight I took an extra step after setting up a login class that makes use of sessions for storing a visitor’s username and password credentials. The code checks whether or not a session handler needs to be fired up: session_start() is only called if the HTTP request includes a session cookie.

if (isset($_COOKIE['PHPSESSID'])) {
  session_start();
}

This code could go in one of two places, either in the constructor for a login class, or if you potentially need session data in more places in your code, maybe in a file that gets auto-loaded on every request.

Once you get to a sign-in page, the login class would be responsible for firing up the session if it does not yet exist. For me, this is in a method called “authenticate()”:

public function authenticate ()
{
  if (!isset($_SESSION)) {
    session_start();
  }
  // Do rest of user validation...
}

Note that we can use isset() to see if $_SESSION exists, which prevents E_NOTICE messages from being fired by session_start() if there is already a session in progress.

With these small changes, I can surf all over my site without having a session started up. The session is only initialized when I log into my site’s account. Furthermore, you could add the additional behavior of explicitly deleting the session cookie for the visitor once they have logged out of your site. While session_destroy() will delete data within the session file, it doesn’t delete the cookie from your visitor’s browser.


AJAX Photo Gallery using the SAJAX toolkit

Posted by Mike Brittain on May 30, 2006
Friends, Gadgets, PHP / 2 Comments

Sean’s recent article about using SAJAX was published at IBM today. He constructs a simple photo viewer application using an XML file for meta data about each photo. PHP and the SAJAX (Simple AJAX) toolkit are used to present the images.

In my follow-up article, part 2 in the series, I describe how you can use client-side JavaScript to implement a history cache for the application that mimics the history utility in web browser software. Many AJAX applications suffer from the lack of “undo” functionality, or history navigation, that normal web browsing employs. That’s what I will be addressing. However, the solution will not involve hijacking the back button as other developers have demonstrated. Instead, it shows how to track a history of events using JavaScript within a single browsing session.

PEAR Image_3D Tutorial

Posted by Mike Brittain on April 04, 2006
PHP / Comments Off

Today my article on using PEAR’s Image_3D package was published to IBM’s developerWorks web site. The article explores setting up the package, getting oriented to the 3-D coordinate system, and rendering various objects.

This may, honestly, be the first time I’ve used Calculus since college.

A PHP Reading List, from IBM

Posted by Mike Brittain on March 17, 2006
PHP / Comments Off

IBM’s developerWorks web site has published a reading list for PHP which spans from overview and getting started topics to more advanced issues like security and integration with third-party resources. Much of the list is self-serving — a great number of resources are articles published on developerWorks, and are specific to Cloudscape, Derby, and Apache (IBM products, or IBM-supported products). These are great articles, though it should be recognized that they tend to lean developers in the direction of products backed by IBM.

The rest of the list, however, covers some real gems from other sites. I will give great credit here, as there are quite a lot of sites with PHP content, though not many that have material as well edited as developerWorks. This list filters out a lot of the cruft. Unfortunately, only one article was cited from onLamp’s PHP resources. Along with developerWorks, onLamp is another of my favorite sites to read for development-related material.

Finally, a wrap-up includes a list of blogs, presentations, and authors who are central to the development and evangelism of the PHP language. It’s nice to see many of these names represented.