Cloud Computing

EC2 Hosting Architecture, Two Years Later

Posted by Mike Brittain on July 12, 2010

It’s been nearly two years to the day since I wrote my post about the hosting platform I set up on EC2.  The post still gets plenty of traffic, and this week I was asked whether the information is still valid.  I think there are now better ways of accomplishing what we set out to do back then, and here is a summary.

1. Instead of round-robin DNS for load balancing, you can now use Amazon’s Elastic Load Balancing (ELB) service, which offers HTTP or TCP load balancing.  I found that ELB’s HTTP/1.1 support is somewhat incomplete: “100 Continue” responses were not handled properly for large image uploads (a specific case I was using).

2. I chose Puppet two years ago for configuration management.  Since that time, Opscode has released Chef, which (in my opinion) is a friendlier way to manage your systems, and which we also happen to use at Etsy.

3. Our database layer was built on four instances for MySQL, in a fairly paranoid configuration.  We had strong concerns about instances failing and losing data.  There are a couple of new tools available to help with running MySQL on EC2.  You can use the Elastic Block Store (EBS) for more resilient disk storage, or choose the Relational Database Service (RDS) which is MySQL implemented as a native service in AWS.  Disclaimer: I haven’t deployed production databases using either of these tools.  These are only suggestions for possibilities that look better/easier than the setup we used.

4. When it comes to monitoring tools, Ganglia is terrific.  What I liked about Munin was the ease of writing plug-ins and the layout of similar services on a single page for quick comparisons between machines.  Ganglia’s plugins are also dead simple to write: in the five months I’ve been at Etsy, I’ve written at least 15.  In the three years I used Munin, I probably wrote a total of six.

Additionally, Ganglia has some sweet aggregated graphs for like machines.  One such graph shows a couple hundred web servers as stacked lines.
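Both Munin and Ganglia plugins boil down to a small executable that prints metadata when asked and a metric value otherwise.  As a sketch of how little is involved, here is a minimal Munin-style plugin in Python; the graph title and field name are invented for illustration, and a real plugin would measure something instead of returning a constant:

```python
import sys

def plugin_output(arg=None, value_fn=lambda: 0.0):
    """Return the text a Munin-style plugin prints.

    With arg == "config", emit graph metadata; otherwise emit the
    current value of each field. Names here are illustrative only.
    """
    if arg == "config":
        return "\n".join([
            "graph_title S3 download speed",   # hypothetical graph
            "graph_vlabel KB/s",
            "graph_category s3",
            "speed.label download speed",
        ])
    return "speed.value %.1f" % value_fn()

if __name__ == "__main__":
    arg = sys.argv[1] if len(sys.argv) > 1 else None
    # A real plugin would time a download here; we fake a reading.
    print(plugin_output(arg, value_fn=lambda: 512.0))
```

Munin invokes the script once with `config` and then periodically with no arguments; that two-mode contract is the entire protocol.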

All of the points listed in the “successes” section of that original article should still be considered valid and are worth reading again.  But I’ll highlight the last two, specifically:

Fault tolerance: Over the two years I worked at CafeMom, we ran a number of full-time services on EC2 (fewer than 10 instances).  While a handful had been running for years by the time I left (they had been started before I arrived, actually), other instances failed much sooner.  I can’t stress enough the importance of automated configuration for EC2 instances, given that these things have a tendency to fail while you’re busy working on another deadline.  I believe that in technical circles they refer to this as Murphy’s Law.

Portable hosting: I’m a big believer in commodity services.  The more generic your vendor services, the easier it is to switch them out when your blowout preventer fails.  I’ve mentioned a few services in this article that are specific to Amazon Web Services (ELB, RDS, and EBS).  If you go the route of Elastic Load Balancer or Relational Database Service, you should strongly consider what services you would use if you had to move to another cloud vendor.


Twitter’s Photo Storage (from the outside looking in)

Posted by Mike Brittain on May 24, 2010

I’ve been working on some photo storage and serving problems at Etsy, which is exciting work given the number of photos we store for the items being sold on the site.  This sort of project makes you wonder how other sites are handling their photo storage and serving architectures.

Today I spent a few minutes looking at avatar photos from Twitter from the outside.  This is all from inspection of URLs and HTTP headers, and completely unofficial and unvalidated assumptions.

Two things I found interesting today were (1) the rate of new avatar photos being added to Twitter, and (2) the architecture for storing and serving images.


Avatar photos at Twitter have URLs that look something like the following:

I’m assuming the numeric ID increments linearly with each photo that is uploaded… two images uploaded a few minutes apart showed a relatively small increase between these IDs.  I compared one of these IDs with the ID of an older avatar, along with the “Last-Modified” header that was included with its HTTP response headers:

Last-Modified: Tue, 26 Feb 2008 03:15:46 GMT

Comparing these numbers shows that Twitter is currently ingesting somewhere over two million avatars per day.
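The back-of-the-envelope math is simple.  Here is a sketch with hypothetical ID/timestamp pairs (the real IDs aren’t reproduced here, and the assumption that IDs increment by one per upload is an unverified guess) that lands in the same ballpark:

```python
from datetime import datetime

def avatars_per_day(old_id, old_seen, new_id, new_seen):
    """Estimate upload rate from two (sequential ID, timestamp) pairs.

    Assumes the numeric ID increments by one per upload, which is a
    guess about Twitter's scheme, not a documented fact.
    """
    days = (new_seen - old_seen).total_seconds() / 86400.0
    return (new_id - old_id) / days

# Hypothetical numbers: an older avatar's ID paired with its
# Last-Modified date vs. a freshly uploaded one observed today.
rate = avatars_per_day(
    old_id=40_000_000, old_seen=datetime(2008, 2, 26),
    new_id=1_900_000_000, new_seen=datetime(2010, 5, 24),
)
print(round(rate))  # on the order of a couple million per day
```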

Stock, or library, avatars have different URLs, meaning they are not served or stored the same way as custom avatars.  This is good because you get the caching benefits of reusing the same avatar URL for multiple users.

Storage and Hosting

Running a “host” look up on the hostname of an avatar URL shows a CNAME to Akamai’s cache network:

$ host …
… is an alias for …
… is an alias for …
… has address …
… has address …

If you’re familiar with Akamai’s network, you can dig into response headers that come from their cache servers.  I did a little of that, but the thing I found most interesting is that Akamai plucks avatar images from Amazon’s CloudFront service.

x-amz-id-2: NVloBPkil5u…
x-amz-request-id: 1EAA3DE5516E…
Server: AmazonS3
X-Amz-Cf-Id: 43e9fa481c3dcd79…

It’s not news that Twitter uses S3 for storing their images, but I hadn’t thought about using CloudFront (which is effectively a CDN) as an origin to another CDN.  The benefit here, aside from not pounding the crap out of S3, is that Akamai’s regional cache servers can pull avatars from CloudFront POPs that are relatively close, as opposed to reaching all the way back to a single S3 origin (such as the “US Standard Region”, which I believe has two locations in the US).  CloudFront doesn’t have nearly as many global POPs as Akamai. But using it does speed up image delivery by ensuring that Akamai’s cache servers in Asia are grabbing files from a CloudFront POP in Hong Kong or Singapore, rather than jumping across the Pacific to North America.
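You can do this kind of detective work yourself: the tier that served a response leaves fingerprints in the headers.  A small sketch of the inference, using the header names quoted above (the layering conclusion is speculative, and the sample values are truncated placeholders):

```python
def classify_origin(headers):
    """Guess which tier served a response from its headers.

    X-Amz-Cf-Id is a CloudFront request ID; Server: AmazonS3 and
    x-amz-request-id point at S3 itself. The inference that Akamai
    fronts CloudFront, which fronts S3, is a guess from the outside.
    """
    h = {k.lower(): v for k, v in headers.items()}
    if "x-amz-cf-id" in h:
        return "cloudfront"          # CloudFront stamped this response
    if h.get("server") == "AmazonS3" or "x-amz-request-id" in h:
        return "s3"
    return "unknown"

# The (truncated) headers from the avatar response above:
sample = {
    "x-amz-id-2": "NVloBPkil5u...",
    "x-amz-request-id": "1EAA3DE5516E...",
    "Server": "AmazonS3",
    "X-Amz-Cf-Id": "43e9fa481c3dcd79...",
}
print(classify_origin(sample))  # cloudfront
```

The presence of both S3 and CloudFront headers in one response is what suggests CloudFront is sitting in front of S3 as Akamai’s origin.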

I suspect that Twitter racks up a reasonably large bill with Amazon by storing and serving so many files from S3 and CloudFront.  However, it takes away the burden of owning all of the hardware, bandwidth, and manpower required to serve millions upon millions of images… especially when that is not a core feature of their site.


Web-based Email: One Example of Cloud Apps Replacing Desktop Apps

Posted by Mike Brittain on August 10, 2009

I read this poll on LifeHacker the other day about web-based email vs. desktop email apps.  It reinforced what I believe is the current momentum in web applications these days — that over time, people are going to get more and more comfortable using web apps the way they used to use desktop apps.  It’s been five years since the release of Gmail, which I view as a forerunner in this area, so clearly this isn’t going to be a quick change.

Browser innovations will help users with the perception that web applications are interchangeable with desktop software.  Google Chrome is already working in this direction by reducing the amount of browser UI and allowing the user to focus on the sites and apps they are using.  I won’t argue that Chrome will be a major browser; it may never be.  I do believe that Google’s intention is to continue swaying the way we look at the Internet, and as other browsers follow suit (in some cases), Google’s web applications and sites will all benefit.

As we move forward, many computers, especially in public spaces like libraries and educational computer labs, will use fewer licensed software suites and more subscriptions to web-based applications.  As with Gmail, it doesn’t matter whether I check email on my computer, your computer, a work computer, or a mobile phone — I still have access to the application because modern web browsers provide the baseline of support for these apps.  With stabilization and implementation of features in HTML 5, additional web-based apps will be built and they will continue to look and act like our familiar desktop apps.  Over time, institutions will replace their local file servers, email servers, and parts of their IT staff with outsourced apps that are purchased by subscription and delivered in a browser.

Fragility of the Cloud

Posted by Mike Brittain on June 11, 2009

A lightning strike causes EC2 outages and Om Malik blames the “fragility of the cloud,” rather than recognizing that all tech suffers failures.  I’ll say it again: this could have happened to my own servers, or my own data center, and I would have been much further up the creek than I am with Amazon’s team taking care of it.  Besides, one of the most important lessons I have learned from working with AWS is that servers and services will fail, and should fail gracefully.  It shouldn’t matter whether that service is “in the cloud” or in your data center.


High Traffic Sites on EC2

Posted by Mike Brittain on April 08, 2009

Grig Gheorghiu wrote up a nice article on handling high traffic sites on EC2.  It’s definitely worth a read for some high-level concepts about multi-tier architectures.  It doesn’t go deeply into the details of EC2 (I would have liked to see something about availability zones for MySQL and load balancers).  One thing I really liked was the concept of using multiple load balancers with round-robin DNS pointing at them.  I’ve been considering this as an option and have played around with HAProxy already.  It’s likely a future step for our new image service at CafeMom.
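For the curious, the pattern is straightforward: each A record in the round-robin DNS entry points at a separate box running an identical HAProxy config, and each HAProxy balances across the same web tier.  A minimal config sketch (hostnames, addresses, and the health-check path are all invented):

```
frontend www
    bind *:80
    default_backend webfarm

backend webfarm
    balance roundrobin
    option httpchk GET /health
    server web1 10.0.0.11:80 check
    server web2 10.0.0.12:80 check
```

The nice property is that losing any one balancer only degrades a fraction of new connections, at the cost of DNS TTLs governing how quickly clients route around it.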


Cyberduck Support for Cloud Files and Amazon S3

Posted by Mike Brittain on January 21, 2009

Cyberduck is a nice Mac FTP/SFTP GUI client that I’ve used in the past for moving files around between my desktop and some web servers.  Turns out they’ve added support for moving your files directly to Amazon S3 and Mosso (Rackspace) Cloud Files.  This means that you can use the same tool that you may previously have used for publishing content to your own web server to instead publish content directly to a self-service CDN.  Amazon uses its CloudFront service to distribute files, and Mosso is supposed to be integrated with Limelight Networks for distributing content from the Cloud Files system.

Just wish I had these services available to me three years ago.  They would have saved me some serious cash on bandwidth commits for CDNs for those silly little projects I was working on.


Good Observation on Cloud Architecture with EC2

Posted by Mike Brittain on December 29, 2008

I just read this short article about Soocial’s hosting architecture which runs on Amazon Web Services.  There was one particular line that echoes what I’ve been saying for a while and I think it is worth repeating:

One of the most interesting things is how the architecture isn’t dramatically different than it would be if you were to build an on-premise version.

In my own experience with hosting on EC2, we built our application on a physical dev server that we already had in place, running Linux.  It was easy for us (with just a little forethought) to deploy the application to EC2 and S3, and the developers working on the application needed to know very little about the workings of EC2.


Manage Amazon Web Services on Your iPhone

Posted by Mike Brittain on October 23, 2008

Ylastic is putting a management interface for AWS on the iPhone.  Looks pretty cool.

I am familiar with their name, but don’t have any experience with their product.  I sort of wish these kinds of tools were open source (and some are) so that I could run the management service on my own servers and not hand over my AWS keys.  Like I said, I don’t have experience with their product, so maybe I’m making an unfair assumption.

As I’ve said earlier about AWS, it’s an amazing service, but is very much like a raw material.  It’s like having someone hand you the keys to a datacenter, and you don’t even know how to turn on the lights.  Ylastic fits into the category of management vendors for AWS, and I think that Amazon’s ultimate success will depend on management vendors who extend the web services to the layperson.


Munin Plugin for Testing S3 Speed

Posted by Mike Brittain on September 26, 2008

Matt Spinks put together a Munin plugin for monitoring S3 download speeds, which is now available at Google Code.  I mentioned this in a recent post and wanted to provide an update that the plugin is now published.


Download Speeds from Amazon S3

Posted by Mike Brittain on September 25, 2008

I’ve been planning to post some details about download speeds that I’ve seen from S3, and why you shouldn’t necessarily use S3 as a CDN, yet.  Granted, Amazon recently announced that they will be providing a content delivery service in front of S3, but this post has nothing to do with that service.

Scott posted the presentation he gave at the AWS Start-Up Tour.  It’s worth a read, and is a good summary of our business case for building the EC2/S3 hosting platform that I led when I worked at Heavy.

His post includes a graphic of how we measured S3 delivery speeds throughout the day.  Matt Spinks wrote the Munin plugin that generated that graph, and he told me he was planning to make it available for others to use.  Update: it’s now available at Google Code.

As we measured it, S3 is fairly variable in its delivery speeds.  Unfortunately, we didn’t measure latency to first byte, which would be good to know as well.
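Both numbers are easy to capture in the same pass.  Here is a sketch of the plumbing; it works on any file-like stream, so it’s exercised below against an in-memory buffer rather than assuming a live S3 URL (the timings only become meaningful when `reader` is a real network response, e.g. from `urllib.request.urlopen`):

```python
import io
import time

def measure_stream(reader, chunk_size=64 * 1024):
    """Read a response body; report bytes, time-to-first-byte, KB/s."""
    start = time.monotonic()
    first = reader.read(1)              # first byte marks TTFB
    ttfb = time.monotonic() - start
    total = len(first)
    while True:
        chunk = reader.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    elapsed = time.monotonic() - start
    kbps = (total / 1024.0) / elapsed if elapsed > 0 else float("inf")
    return total, ttfb, kbps

# Exercise the plumbing against an in-memory "download":
nbytes, ttfb, kbps = measure_stream(io.BytesIO(b"x" * 200_000))
print(nbytes)  # 200000
```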

My own impression is that it is not a good idea to host video directly from S3 if you run a medium to large web site.  The forthcoming CDN service will probably help with this.  If you’re a small to medium site, you might be happy hosting video on S3.  Hosting images and other static content (say, CSS and JS files) might also be a good idea if you don’t have a lot of your own server capacity.

For my own use, I’m planning on using S3 to host images and static content for some other sites I run, which use a shared hosting provider for serving PHP.  On a low traffic site, I’d be happy to offload images to S3.  And when the CDN service becomes available, the user experience should be even snappier.

One thing to note if you plan to use S3 for image hosting: look into setting the correct Cache-Control headers on your objects.  You need to do this when putting content onto S3; you can’t modify headers on existing content.  More on this in a future post.
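S3 stores whatever response headers you supply at upload time and replays them on every GET, so the trick is simply building the right headers before the PUT.  A sketch of a helper that does this (the one-month max-age is an arbitrary example; with a client library you would pass this dict as the extra headers on the upload call):

```python
import time
from email.utils import formatdate

def cache_headers(max_age_days=30):
    """Headers to attach when PUTting an object to S3.

    Cache-Control drives modern caches; a far-future Expires in
    RFC 1123 format covers older HTTP/1.0 caches.
    """
    max_age = max_age_days * 86400
    return {
        "Cache-Control": "max-age=%d, public" % max_age,
        "Expires": formatdate(time.time() + max_age, usegmt=True),
    }

headers = cache_headers(30)
print(headers["Cache-Control"])  # max-age=2592000, public
```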
