March 11, 2011

Why I like Amazon AWS (Theoretically)

When I say I like Amazon AWS (the whole suite of Amazon cloud services) "theoretically," I mean that I have not actually used it (yet). Instead, I like the design and the possibilities the service creates.

First, some background. I have been using a virtual computing service for many years. I had a virtual machine with Linode before they were using Xen (it was User-mode Linux back then, the poor man's virtualization: a Linux kernel running as an ordinary userspace process). That is probably seven years or more. I have always been happy with their service and uptime. I moved my personal mail server to a virtual machine and freed up space in my living room at the first chance I got. Having a virtual server is great, because I love concentrating on the OS and applications while letting someone else worry about the hardware.

I have seen a number of performance tests comparing AWS, Linode, and Rackspace virtual machines. Linode often comes out on top, and Amazon often near the bottom. I never understood why anyone would choose the Amazon service over Linode to host a virtual machine. Plus, the virtual machine service, EC2, did not save the server's state when it crashed or was shut down (though that is different now). That made no sense to me. But Amazon's service is really something more than virtual servers. AWS is not a virtual machine service; it is a virtual infrastructure service that contains machines.

It is like this: you can write a program to respond to HTTP and serve up web pages. Or you can use a pre-written program (such as Apache) that is dedicated to responding to HTTP and give that program directions (HTML pages) for serving up a website. If the website is complicated and dynamic, you could create an application that talks to the web server and dynamically creates the HTML pages, or you could use a pre-written program (the PHP interpreter) and just give it directions for dynamically creating the website (a bunch of PHP files). The PHP files can be plopped on any Apache instance and run (you might need to configure Apache, but that is a separate job from creating the website).

In Java, you create a bunch of classes that you can bundle together as a JAR or WAR file and then hand that bundle to a Java virtual machine or a J2EE container to run. The JVM, or JVM plus J2EE container, takes care of instantiating the classes and running the program. If you need more processing power, you can hand that same JAR or WAR file to many Java virtual machines or J2EE containers and have the program run many times without changing it. Complicated arrangements or clusters of Java programs can be created that communicate with each other to balance load. These programs do not hold data; they lose their state when they shut down or crash. The state of the program is stored outside the Java program, in a database or on a file system.
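
To make that statelessness concrete, here is a minimal, entirely hypothetical servlet sketch. The JDBC URL, the credentials, and the "hits" table are made up, and the sketch assumes a row for the page already exists; the point is only that the class keeps nothing in memory, so the same WAR file can be deployed to any number of containers behind a load balancer.

    // Sketch of the "stateless bundle" idea: a servlet that keeps no state of
    // its own and reads/writes everything through an external database.
    import java.io.IOException;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class HitCounterServlet extends HttpServlet {

        // Hypothetical external database; in a real deployment this would
        // come from container configuration, not a hard-coded constant.
        private static final String DB_URL = "jdbc:mysql://db.example.com/site";

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            try (Connection conn = DriverManager.getConnection(DB_URL, "user", "pass")) {
                // All state lives in the database, so any number of copies of
                // this WAR can run side by side and crash or restart freely.
                try (PreparedStatement update = conn.prepareStatement(
                        "UPDATE hits SET count = count + 1 WHERE page = ?")) {
                    update.setString(1, req.getRequestURI());
                    update.executeUpdate();
                }
                try (PreparedStatement query = conn.prepareStatement(
                        "SELECT count FROM hits WHERE page = ?")) {
                    query.setString(1, req.getRequestURI());
                    try (ResultSet rs = query.executeQuery()) {
                        rs.next(); // sketch assumes the row exists
                        resp.getWriter().println("Hits so far: " + rs.getLong(1));
                    }
                }
            } catch (SQLException e) {
                throw new IOException(e);
            }
        }
    }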

Amazon EC2 is like the Java virtual machine. You can create an image of a system made up of a set of applications (shell scripts, Java programs, .NET programs, Apache, Tomcat) plus a configuration of the OS (I like Linux), and then hand that image to Amazon. The important data is stored in another system (an outside database, some NoSQL thing such as SimpleDB, or Amazon's persistent storage, EBS), but the rest of the state of the system is not really important. You don't patch an EC2 instance the way I have patched my Linode boxes or my real live servers, just as you don't patch a running J2EE server with new libraries. You update the images (be they JAR files, WAR files, or AMIs, Amazon Machine Images) and roll them out.
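
Here is a rough sketch of what "handing the image to Amazon" looks like with the AWS SDK for Java, as I understand it. The access keys, AMI ID, and instance type are placeholders, not a working configuration; what matters is the shape of the call: you name an image and ask EC2 to instantiate some number of copies of it, the same way you hand a WAR file to a container.

    // Rough sketch: launch copies of a pre-built image on EC2. The AMI plays
    // the role the WAR file played above.
    import java.util.List;
    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.services.ec2.AmazonEC2;
    import com.amazonaws.services.ec2.AmazonEC2Client;
    import com.amazonaws.services.ec2.model.Instance;
    import com.amazonaws.services.ec2.model.RunInstancesRequest;
    import com.amazonaws.services.ec2.model.RunInstancesResult;

    public class LaunchFromImage {
        public static void main(String[] args) {
            // Placeholder credentials; a real system would not hard-code these.
            AmazonEC2 ec2 = new AmazonEC2Client(
                    new BasicAWSCredentials("MY-ACCESS-KEY", "MY-SECRET-KEY"));

            // "ami-12345678" stands in for whatever image was built and
            // registered ahead of time (OS config + Apache + Tomcat + the app).
            RunInstancesRequest request = new RunInstancesRequest()
                    .withImageId("ami-12345678")
                    .withInstanceType("m1.small")
                    .withMinCount(1)
                    .withMaxCount(3); // ask for up to three identical copies

            RunInstancesResult result = ec2.runInstances(request);
            List<Instance> started = result.getReservation().getInstances();
            for (Instance instance : started) {
                System.out.println("Started instance " + instance.getInstanceId());
            }
        }
    }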

When I realized that, AWS made sense and seemed great. I can imagine building a system of interconnected services that run on different EC2 instances, configured as needed and rolled out to be instantiated on demand.
