Eugene's Blog

I can't believe it's blog!

Google App Engine: the first look

Yesterday Google announced its new offering: Google App Engine. These are the rough notes I took yesterday while studying the new service.

Google didn’t go the same way as Amazon did with AWS. Google offers a form of shared hosting (think “distributed WebFaction”), while Amazon offers a virtualized environment (think “distributed SliceHost”). So basically we are talking about a more high-level approach to web applications, one that is easy even for novices. AWS, on the other hand, is more flexible and more enterprise-y.

In order to scale, you have to virtualize. In order to virtualize, some actions have to be restricted. Google App Engine provides a sandbox at the language level. Only one language is supported at the moment: Python. And guess what? Django is everywhere! Because of the sandbox, only pure Python code and a subset of the Python standard library are supported. You can bundle any pure Python modules with your application. Three third-party libraries are available out of the box: Django 0.96.1, WebOb 0.9, and PyYAML 3.05. But fret not: you can bundle a framework of your choice with your application and run it too, even a custom version of Django. Basically anything that supports WSGI will do. I suspect that other mature Python frameworks will be supported out of the box eventually. Pylons and TG2 come immediately to mind.
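To make "anything that supports WSGI" a bit more concrete, here is a minimal sketch of a plain WSGI application using only the standard library. It is written for a current Python 3 interpreter rather than the Python version App Engine runs, and the names `application` and `call_app` are my own picks for illustration, not anything App Engine requires.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI application: one callable, no framework required."""
    body = b"Hello from a plain WSGI app!\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a valid fake request
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the application reports instead of sending it anywhere.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], dict(captured["headers"]), body


if __name__ == "__main__":
    status, headers, body = call_app(application)
    print(status, body.decode("utf-8"))
```

Any framework that ends up exposing a callable of this shape should, in principle, be something you can bundle and run.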

Google provides its own simplified web framework, called webapp. Apparently it uses Django templates (see this FAQ answer).

Google uses Bigtable as the database backend. Bigtable is not a relational database, so it is not possible to use it with the Django ORM directly. At least not yet. So they provide their own ORM. It looks like … the Django ORM, with models and whatnot. It is possible to move an existing application to the new database with relatively minor edits, provided that you didn’t do any esoteric implementation-dependent hacking in your application. Obviously, with the Django ORM gone, Django Admin is gone too, as are all the model-based applications in django.contrib. Google App Engine provides its own application for administration, but I haven’t tried it yet, so I cannot comment on its quality.

The Datastore API supports Django-like models and queries, as well as the Google Query Language (GQL), which looks like a sane subset of SQL’s SELECT statement adapted for Bigtable.

Everything looks simple, and standard protocols are used where possible. For example, distributed applications need some kind of message delivery system. Amazon does it with Amazon Simple Queue Service. Google uses email messages. Simple yet effective. If you want to connect directly and fetch a file or something, there is the URL Fetch API, which supports only the HTTP and HTTPS ports. Again, simple yet suitable for many things: this is exactly how the internet is glued together.

Another thing you cannot use is the Django authentication application. But Google provides a replacement for it, which implements single sign-on using Google Accounts.

Obviously, writing an application is not enough to host it properly. You should know where you can read local files and how to serve static files. This is usually done with web server-dependent configs. Google App Engine has such a config, written in YAML. Just like in Django, URL paths are matched with regular expressions. You can designate URLs for dynamic processing. You can define which directories host static files (on a per-file basis too), and you can specify how to treat uploaded files. It is possible to specify an expiration time and to set MIME types. Currently there is no way to control compression for static files. All in all, you can do most of the tweaking described in my article on performance tuning.
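The idea of routing URLs by regular expression, with some patterns going to static files and the rest to dynamic code, can be sketched in a few lines of Python. This is only an illustration of the concept, not App Engine's actual config format or implementation: the `ROUTES` table, the `resolve` function, and the paths in it are all hypothetical.

```python
import re

# Hypothetical routing table: (URL regex, kind, target), checked in order.
# "static" entries map a URL onto a file; "script" entries go to dynamic code.
ROUTES = [
    (r"/static/(.*)", "static", r"static/\1"),
    (r"/favicon\.ico", "static", "static/favicon.ico"),
    (r"/.*", "script", "main.py"),
]


def resolve(path):
    """Return (kind, target) for the first pattern matching the whole path."""
    for pattern, kind, target in ROUTES:
        match = re.fullmatch(pattern, path)
        if match:
            # Expand backreferences such as \1 from the matched groups.
            return kind, match.expand(target)
    return None


if __name__ == "__main__":
    for url in ("/static/css/site.css", "/favicon.ico", "/blog/2008/"):
        print(url, "->", resolve(url))
```

Order matters here: the catch-all pattern goes last, so the more specific static rules get the first chance to match.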

I am surprised that they didn’t put popular JavaScript/CSS libraries on their CDN for use in hosted applications. Of course you can always bundle your library of choice with your application, but it doesn’t make much sense for Google to store the same libraries over and over. I hope they will host Dojo, jQuery, and Prototype/Scriptaculous eventually. Yes, Dojo is already being hosted by AOL, but a backup from Google wouldn’t hurt either.

Update 5/27/2008: Apparently Google listens: AJAX libraries API (via Ajaxian). Yay!

In real life, when you use a database, it is not enough to define your models. Most probably you need to set up indexes to speed up frequent queries. Again, there is a YAML-based config for that. It can be generated automatically while you debug your application.

Speaking of debugging: the SDK contains the Dev Web Server, which is very similar to what we use to debug Django locally. Additionally, the SDK can be used to upload your finished application to Google. BTW, there is no waiting list to download it: you can get it and try it now.

Presently this service is free, but there are some limitations: disk storage is up to 500 MB, incoming and outgoing traffic are restricted to 10 GB per day each, and you get up to 2000 emails per day and up to 200 million megacycles of CPU per day. Up to 3 applications can be registered per account. All indications are that if your application stays below these limits, it’ll be free after the trial. The interesting part is the as-yet-unannounced payment structure for more resource-hungry applications. Until then it is impossible to do a business comparison between Amazon’s and Google’s offers, as they are very different technically.

One of the biggest drawbacks of Amazon AWS is the uptime. Amazon promises 99% uptime, which is not enough. Just imagine that your application can be randomly unavailable for 15 minutes a day, or almost 2 hours a week, potentially at traffic peaks, when you want to serve users the most. Will Google App Engine be more reliable? Only time will tell.
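For the curious, the downtime figures above are just arithmetic on the 99% number. A quick sketch (the function name is mine) shows where "15 minutes a day" and "almost 2 hours a week" come from, after rounding up a bit:

```python
def allowed_downtime_minutes(uptime_percent, period_hours):
    """Minutes of downtime permitted over a period at a given uptime level."""
    downtime_fraction = (100.0 - uptime_percent) / 100.0
    return downtime_fraction * period_hours * 60.0


per_day = allowed_downtime_minutes(99.0, 24)
per_week = allowed_downtime_minutes(99.0, 24 * 7)

if __name__ == "__main__":
    print("per day:  %.1f minutes" % per_day)
    print("per week: %.2f hours" % (per_week / 60.0))
```

At 99% that works out to roughly 14.4 minutes a day, or about 1.68 hours a week.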