Design of a modern web application
To improve the overall performance of a web application we have to reduce the load on our servers. This can be done in many ways depending on your business requirements. I will concentrate on two of them:
- Reduce CPU and I/O load, which is the key to improving overall performance. An often overlooked technique: don’t repeat yourself (DRY), and cache anything you can on clients and on servers.
- Distribute computations. Remember that the combined computing power of all client computers is much bigger than the computing power of your server.
Optimization on server side
In his doctoral dissertation Roy Fielding introduced the concept of REST and showed how to use it for performance optimization. Bill Higgins, in his article Ajax and REST, explains in detail the relationship between REST and Ajax. I don’t want to repeat Roy’s and Bill’s arguments, so I will sum them up: REST is good because it provides a natural interface to server-side resources, and Ajax is good because it allows splitting static and dynamic content cleanly.
- The majority of data on a web page is truly dynamic and has to be generated on a server.
- Support for restricted clients, e.g., legacy, or some mobile/embedded platforms.
It is important to recognize such cases, and to know who your clients are, and what they use to access your web site. I assume that you already did your homework on that.
Typically we should move everything that can be done efficiently on a client to the client. The example above demonstrated how to do that. In general almost all HTML rendering can be done on a client. The server is still responsible for database access and security (such as user authentication). Probably the best we can do is to design our server side as a true REST server with an Ajax client on top. The server serves mostly static files (HTML, CSS, JS, images), which constitute the static framework of our web application. The client accesses the server to get static files and, using REST conventions, dynamic content, and then renders the proper HTML.
Obviously we have to take into account who our clients are. If the application will be used by mobile users with restricted browsers, we have to provide a special UI for them as well. But if the majority of our users come with modern browsers (e.g., Firefox 1.5, Firefox 2, Internet Explorer 6, Internet Explorer 7, Opera 9, Safari, and so on), this design is good for us. The idea is to generate the smallest possible amount of dynamic data, moving everything we can to static files, which can be served really fast. If you think about it, this design has two contradictory sets of requirements:
- In order to serve static content:
  - We try to cache the most popular files in memory, since we don’t need the memory for anything else.
  - We tend to keep an HTTP connection alive so we can reuse it for other files.
  - Our primary concern is how to shove bytes from disks to the Internet pipe, and do it fast.
  - Usually we don’t use sessions for static files.
  - Our server processes/threads are small, so we can have a lot of them on the same web server.
- In order to serve dynamic content:
  - We try to keep frequently used database tables and our processing code in memory.
  - We tend to close an HTTP connection as soon as we are done generating a response, and start serving other clients.
  - Our primary concern is algorithms: how to get relevant data from our databases fast, and how to transform the data into what clients need.
  - Usually we use sessions to identify users and enforce security restrictions (e.g., one user cannot request private data of another user).
  - Our server processes/threads tend to be relatively large, and we may have to deal with a distributed server farm.
These considerations are the reason for the popular two-tier architecture, in which thin servers serve static content and special application servers are responsible for dynamic responses. Typically static content is served by specially configured web servers like Apache, or by more lightweight servers like lighttpd or nginx. You can even explore the option of serving some of your static files from a different domain to work around the restriction on the number of connections per domain imposed by browsers.
We minimized our dynamic data generation. Now we have to examine whether what is left is truly dynamic. Real-world examples:
- User data is different for different users, but it is the same for the same user.
- News stories can be updated once every 15 minutes. During these 15 minutes, they are static.
If this is the case for our data, and it takes noticeable I/O and CPU time to generate it, then the right way to go is to implement a server-side cache. The idea is simple: we store a copy of our generated data in a server-side cache, and send clients that copy if we have one and it is still valid. Basically we are moving our data from the category “dynamically generated content” to the category “static files”.
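To make the idea concrete, here is a minimal Python sketch of such a cache. It is an illustration under my own assumptions, not how any particular framework does it: the class name TTLCache, the method get_or_generate, and the injectable clock are all made up for this example, and a real deployment would also have to think about memory limits and multiple processes.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after a fixed time (hypothetical helper)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.time):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_generate(self, key: str, generate: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # we have a copy and it is still valid
        value = generate()  # the expensive I/O and CPU work happens only here
        self._store[key] = (now + self.ttl, value)
        return value
```

For the news example above one would create the cache with a 15-minute lifetime, e.g. TTLCache(15 * 60), and wrap the code that renders the stories in the generate callback.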
Different web frameworks have different ways to do it. For example, Django has a special infrastructure for it, which is exhaustively documented. Even if you use a different web framework, read about Django’s cache framework in the Django documentation: it explains the problem and discusses ways to solve it. You may find something similar in your server-side web framework of choice. Additionally it discusses upstream caches and CDNs, which are always an important topic, especially for corporate networks.
Client-side cache
We carefully designed our application to split its responses into two categories: static files (or quasi-static files, i.e., those that should not be regenerated on every access), and dynamic content. Now we have to let client browsers know that they can reuse static data.
In real life static data is not always static: artwork can be changed, bugs in CSS can be fixed, some legal wording can be modified in an HTML document, and so on. A good compromise is to set an expiration time for our content. Obviously, for truly dynamic data, we have to tell the browser that it should not be cached. There are several HTTP headers we should return from our web server if we want to support expiration: “Expires”, “Last-Modified”, and “Cache-Control”.
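As a rough illustration of what these headers look like, here is a small Python sketch that builds them. The function names expiration_headers and no_cache_headers are mine, chosen for this example only; the sketch just formats header values and assumes the surrounding server code attaches them to the response.

```python
from email.utils import formatdate


def expiration_headers(last_modified: float, now: float, max_age: int) -> dict:
    """Headers telling a browser it may reuse a response for max_age seconds.

    last_modified and now are Unix timestamps; HTTP dates must be in GMT.
    """
    return {
        "Last-Modified": formatdate(last_modified, usegmt=True),
        "Expires": formatdate(now + max_age, usegmt=True),
        "Cache-Control": "max-age=%d" % max_age,
    }


def no_cache_headers() -> dict:
    """Headers for truly dynamic data that should not be cached."""
    return {"Cache-Control": "no-cache, no-store, must-revalidate"}
```

How long max_age should be is a judgment call: the longer it is, the fewer requests reach the server, but the longer a fixed CSS bug or changed artwork takes to reach users.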
OK, we cached the content and it has expired. Now what? Should we retrieve it again? There is a better way, and it is called a “conditional GET”. In a nutshell it allows a browser to request data conditionally: only if it has changed since the “Last-Modified” date, or only if it has changed at all. The latter is achieved by the web server sending a special “ETag” header, which serves as a unique content identifier (e.g., a hash value of the content). In both cases, when the browser detects that content has expired, it asks for it again, automatically adding “If-Modified-Since” and/or “If-None-Match” headers. If the web server determines that the content has not changed, it responds with HTTP 304 (Not Modified) instead of resending potentially large data. Again you can find a good explanation of it in Django’s documentation: CacheMiddleware and ConditionalGetMiddleware. Most probably your web server has something similar, or you can generate the headers manually.
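If you do end up generating the headers manually, the ETag half of the exchange can be sketched in a few lines of Python. This is a simplified illustration, not Django’s implementation: make_etag and conditional_get are hypothetical names, only a single exact “If-None-Match” value is compared, and “If-Modified-Since” is left out.

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    """Derive an ETag from a hash of the content (the value is a quoted string)."""
    return '"%s"' % hashlib.sha256(body).hexdigest()


def conditional_get(request_headers: Dict[str, str],
                    body: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, response headers, payload) for a possibly conditional request."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if request_headers.get("If-None-Match") == etag:
        # The browser's copy is still current: answer 304 with an empty body.
        return 304, headers, b""
    return 200, headers, body
```

Note that this sketch still generates the body in order to hash it, so it saves bandwidth but not server CPU; combining it with a server-side cache covers both.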
Typically there is a special provision for static files. For example, Apache has a special module for it called mod_expires. Below is a snippet of the Apache 2 configuration from my personal site, which is used to serve static content:
Upstream caches and CDNs
In the real world we have to deal with (and exploit to our advantage) upstream caches and content delivery networks. The former are frequently used by corporate users; the latter provide an easy way to distribute load geographically and to outsource serving static content. Basically everything we did in the previous section (Client-side cache) will work well with intermediate caches. There is one more consideration, especially important for caching dynamic content: privacy. For example, we need to indicate that content is different for different users, or prohibit intermediate caching altogether. HTTP defines special mechanisms for that: the “Vary” and “Cache-Control” headers. You can find a good discussion of it in Django’s documentation: Upstream caches, Using Vary headers, and Controlling cache. Most modern web application servers implement similar functionality, so peruse the documentation of your favorite web server.
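To show what these two mechanisms amount to on the wire, here is a small Python sketch. The helpers add_vary and mark_private are names I made up for illustration; they operate on a plain dictionary of response headers and are not part of Django or of any web server.

```python
from typing import Dict, Iterable


def add_vary(headers: Dict[str, str], names: Iterable[str]) -> None:
    """Append request header names to Vary without duplicating existing ones."""
    existing = [v.strip() for v in headers.get("Vary", "").split(",") if v.strip()]
    seen = {v.lower() for v in existing}  # header names are case-insensitive
    for name in names:
        if name.lower() not in seen:
            existing.append(name)
            seen.add(name.lower())
    if existing:
        headers["Vary"] = ", ".join(existing)


def mark_private(headers: Dict[str, str]) -> None:
    """Ask shared (intermediate) caches not to store this response."""
    headers["Cache-Control"] = "private"
```

For a page that differs per logged-in user, one plausible combination is add_vary(headers, ["Cookie"]) when intermediate caching is acceptable, or mark_private(headers) when it is not.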
Data compression is the lowest-hanging fruit of web application optimization. It may dramatically reduce the bandwidth required to access your web site. The trade-off is obvious: it requires some server CPU time to compress data. But this cost can be amortized by the caching techniques discussed above, and by pre-compression of static data, which is supported by many web servers.
In order to support all clients we have to be able to serve the same data in two formats: compressed and uncompressed. Practically all modern browsers support data compression, and they are powerful enough to decompress it instantly. Of all modern browsers Internet Explorer 6 stands out in shame: it doesn’t support compression properly in several cases due to numerous bugs in its implementation. Sigh. So we have to be careful about compression. For example, Apache 2 has a special module for it: mod_deflate. Below is a snippet of the Apache 2 configuration from my personal site, which is used to serve static content:
It gzips only HTML files for old Netscape 4 browsers, and does no gzipping at all for Netscape 4.06-4.08 and (shame!) IE6. IE7 is OK in this respect. All images are exempt from compression. Our response varies by the “User-Agent” header, so we can compress conditionally for different browsers. The latter means that all upstream caches will keep a different version of our files per “User-Agent”. In an ideal world we would want them to keep just two versions of our content, but there is no simple way to express that.
Does it make sense to compress dynamic responses? In some cases it does. Check the documentation of your web application server to see whether it supports this directly. For example, Django uses a special middleware for that: GZipMiddleware.
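If your server does not support it directly, the core of on-the-fly compression is small. The Python sketch below is my own simplified illustration, not GZipMiddleware itself: the function name maybe_gzip and the min_size threshold are assumptions, quality values such as “gzip;q=0” are ignored, and no browser-specific workarounds are included.

```python
import gzip
from typing import Dict, Tuple


def maybe_gzip(request_headers: Dict[str, str], body: bytes,
               min_size: int = 200) -> Tuple[Dict[str, str], bytes]:
    """Compress body only if the client advertises gzip support and it pays off."""
    # The response depends on Accept-Encoding, so upstream caches must be told.
    response_headers = {"Vary": "Accept-Encoding"}
    accepted = [token.split(";")[0].strip().lower()
                for token in request_headers.get("Accept-Encoding", "").split(",")]
    if "gzip" not in accepted or len(body) < min_size:
        return response_headers, body
    compressed = gzip.compress(body)
    if len(compressed) >= len(body):
        return response_headers, body  # compression did not help, send as is
    response_headers["Content-Encoding"] = "gzip"
    return response_headers, compressed
```

Very small responses are sent uncompressed because the gzip overhead can outweigh the savings; the exact threshold is something to measure rather than guess.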
Dynamic data format
It is worth mentioning the format we use to send our dynamic data to a client. In most cases we have three options:
- HTML snippet
HTML snippets are good in some simple cases. Typically we use them when we need to retrieve data dynamically, and insert it directly on a web page without further processing.
XML comes into play when we need to process data on a client, or keep an object around to modify it later, possibly sending it back to a server.
- You have legacy infrastructure, which already serves XML, e.g., SOAP.
- You want to use the power of XSLT to process a server’s response on a client.
Do not overlook the conventional ways to optimize the design and implementation of your web applications. In Django performance tips Jacob gives a nice overview of Django-specific and general techniques of server-side optimization, his views on database servers and hardware, as well as a list of useful references to other web server optimization documents. The Apache Performance Notes document gives a lot of useful hints for Apache too.
Optimization on client side
Now that our work on the server side is done, we are ready to tackle the client. Surprisingly we don’t have a lot to do: Dojo takes care of a lot of things with its sophisticated build system and modular structure.
Update (8/24/2008): all links in this section are updated to reflect the latest Dojo version: 1.1.1.
Using proper Dojo build
The first impression of many novice Dojo users is “Dojo is huge!”. For example, the so-called dojo-0.4.1rc3-minimal.zip is 4.4M. A full checkout of the Dojo SVN repository takes 74M on my disk (with all hidden SVN files included). Actually the majority of files in the SVN checkout are tools, which are not served to end users. Even the majority of files in the minimal build are redundant and will never be served either. All Dojo builds come with a full copy of all Dojo files and a specially customized dojo.js file, which combines some of the frequently used files. See Quick Installation for details.
The model used by Dojo is smart and simple:
- Browser loads dojo.js file, which contains:
- Dojo bootstrap code
- Dojo loader
- (optionally) frequently used modules
- dojo.js activates Dojo, and loads the rest of the modules dynamically, unless they were already loaded by the optional part of dojo.js (see above).
- Dojo provides a number of different builds for typical scenarios.
- Dojo builds minimize the number of downloaded files.
The first order of business is to define what subset of Dojo we actually use. One way to do it is to study web server logs or use tools like Firebug to see what files were requested. Dojo covers a lot of aspects of client-side web development, and typically only a small subset of modules is used by any given web application, and usage may vary from page to page. If we are lucky enough, we can come up with a single set of Dojo modules that is used by practically all pages. In complex cases (e.g., for huge web sites) we may have several different subsets.
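Studying the logs does not have to be done by hand. The Python sketch below counts requests for JavaScript files under a given URL prefix. It rests on assumptions of mine: the log is in the common Apache access log format, and Dojo is served from a hypothetical /dojo/ path; adjust both to your setup. The function name count_dojo_requests is made up for this example.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the request part of an access log line, e.g. "GET /path HTTP/1.1".
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+"')


def count_dojo_requests(log_lines: Iterable[str], prefix: str = "/dojo/") -> Counter:
    """Count requests for .js files under prefix (a hypothetical install path)."""
    counts: Counter = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        path = match.group(1).split("?", 1)[0]  # drop any query string
        if path.startswith(prefix) and path.endswith(".js"):
            counts[path] += 1
    return counts
```

Calling most_common() on the result gives a ranked list of files, which is a reasonable starting point for deciding what belongs in a build.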
If your configuration has a good match with one of the standard builds, your job is done. Otherwise you have to do a custom Dojo build as explained in The Package System and Custom Builds.
One exciting option to explore is a cross-domain Dojo build, which opens the possibility of cross-application caching for Dojo builds. This is the way to distribute Dojo files (and your custom Dojo modules) with CDNs, which can significantly reduce the start-up time of your web application. This feature is still under heavy development. Expect more details in the future.
Optimizing widget initialization
As a convenience feature, during initialization Dojo parses all HTML in a web page, automatically instantiating all Dojo widgets it encounters, even if you don’t have any. While this is very convenient for small applications, it can significantly increase the start-up time of web applications with a lot of HTML in them. If the start-up time is high and you feel that something is not right, use a profiler (like Firebug) to find out where the time goes. If you see that a lot of time is spent in places like dj_load_init(), or modulesLoaded(), or anything else resembling initial loading, it is a sure sign that you should look at widget initialization. In my experience this is a major drain on start-up time.
Update (8/24/2008): “searchIds” (discussed below) was removed from 1.1.1 due to improvements to the Dojo parser. If you feel you still need this functionality, you can implement it yourself (courtesy of Karl Tiedt):
Don’t forget to require the “dojo.parser” module.
Dojo provides several tools to optimize widget initialization. The most important one is the ability to turn the parser off. Just add the following line before including the script pointing to dojo.js:
It tells Dojo to skip parsing widgets completely. What should you do if you do have widgets in the body of your HTML document? In this case the simplest way is to assign an “id” to each of them and to list them in the “searchIds” attribute of “djConfig”. Example:
This example uses three Dojo widgets. Two of them were listed in djConfig.searchIds. The third one was added to the list dynamically in place of its definition. Why? It was done to show two possible techniques:
- List all widgets in the “searchIds” list. This technique has one important benefit: it gives you an immediate overview of what widgets you have in this file so you can judge how “heavy” this page is. The drawback is obvious too: every time you add/remove widgets, you have to update the list, which is easy to forget.
- Add widgets dynamically to the list in the place of their declarations. It is a little bit more verbose, but more foolproof.
- You have to specify id instead of widgetId (which can be assigned too, if you wish).
- Some container widgets (like a dialog widget) delay parsing of their bodies until they are shown. If you define a widget inside a dialog widget, it will not be available until the dialog widget is made visible.
- Some container widgets parse their body as soon as they are parsed themselves.
Update (8/24/2008): “parseWidgets” (discussed below) was removed from 1.1.1 due to improvements to the Dojo parser.
In the latter case, or if you don’t want to use the searchIds mechanism, you can hint your HTML with the “parseWidgets=‘false’” attribute like so:
Of course you can always create a widget dynamically at will. Read all about it in Creating a Widget Programmatically.
User interface considerations
Now that your web application is properly split into static and dynamic parts, and you use a Dojo build that is optimized for your application, bundled, pre-compressed, and served with all provisions for caching, three problems remain to be addressed:
- An initial hit, when a user comes to your URL with an empty (“cold”) cache and has to download all appropriate files first. We already reduced this time by using an appropriate Dojo build, by compressing static text files, and by optimizing Dojo widget parsing. But it may not be enough.
- A normal start-up time, when a user comes with pre-cached files (a “hot” cache), and your application (a “web page”) has to ask for additional dynamic data before being ready.
- A possible period of unresponsiveness, when an application is retrieving data during the normal course of actions. Technically it is very similar to #2, but it is different psychologically because in #2 the user “doesn’t know yet” whether the application is working.
All of these issues should be addressed in the user interface. The general idea is to let users know what is going on. They should never wonder why an application is “not working”. Of course it is very hard to design a good user interface, and it is your responsibility, but I can share some ideas as starting points.
In all cases you should give some visual hints on what is going on at the moment. Otherwise people will feel that your application is “sluggish” or “unresponsive”. If the net is slow, they should “see” it instead of guessing that “it’s stuck somehow”. One good example of addressing the #3 problem is demonstrated by Google Mail and Google Docs & Spreadsheets. Every time one of them accesses its server you can clearly see unobtrusive text on a red background in the top-right corner of the page. The text explains what is going on at the moment, e.g., “Saving…”, “Loading…”, and so on. The same technique can be used for the #2 problem, but it should be completely separated from the “initial hit” problem.
One way to address the “initial hit” is to implement a variation of a splash screen:
- Make sure that the HTML reproduces a non-functional version of your user interface.
- Make sure that the first element of your HTML is an absolutely positioned paragraph, which is superimposed on top of your normal page. Don’t use external CSS rules to do that; position it inline using the “style” attribute like so: “style=‘position: absolute; top: 10px; left: 10px; width: 100px; height: 100px; z-index: 10;’” (of course you should use some sensible numbers). In general it is good to position it approximately in the middle of the page.
- As soon as our code is started we hide the “splash screen” programmatically. dojo.addOnLoad() is a good place to do it along with all required initialization of your web application.
- The “splash screen” can contain text that explains what is going on. Or it can be a single animated GIF file, which shows a “loading…” animation.
A notable variation of this technique: use one HTML fragment for the “loading…” phase, then hide it and unhide the normal user interface.
Make sure that the “splash screen” doesn’t flicker too much for users with a “hot” cache.
IE6 has a really peculiar bug: it repeatedly downloads the same image if it is used in several places. Read more on it here: IE DHTML image caching bug: workaround.
I wish to thank:
- Carla Mott for reviewing this article,
- Django community for their tireless work on perfecting web applications and their attention to detail (let’s face it, the bulk of the optimization work should be done on a server),
- Dojo Toolkit community and Sun Microsystems for their support.
As always let me know about any errors and omissions.