Tuesday, August 5, 2008

Using Apache's mod_cache with CGI or FastCGI

I just thought I would elaborate on the caching part of the last post, since I had a really hard time finding good information about this on the web.

Scenario: You have a web application that produces static or semi-static content, and you'd like to increase performance and/or reduce the computing load.

The first thing you need to do is to mark the produced content as cacheable. This involves setting the "Last-Modified" and "Expires" HTTP response headers. You should also add a "Cache-Control" header with a "max-age" parameter, and optionally also the "public" flag.

In ESXX, here's how you could construct the headers object, which is used in the Response() constructor, assuming last_modified and expires are dates in milliseconds and max_age is in seconds:

var headers = {
  "Last-Modified": new Date(last_modified).toUTCString(),
  "Cache-Control": "max-age=" + max_age + ", public",
  "Expires": new Date(expires).toUTCString()
};
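
If you'd rather not track "expires" separately, one option is to derive it from max_age so the two can never disagree. The following is only a sketch in plain JavaScript: the helper name buildCacheHeaders and its parameters are made up for illustration and are not part of ESXX. It assumes last-modified and "now" are given in milliseconds and max-age in seconds.

```javascript
// Hypothetical helper (not an ESXX API): build cache headers where
// "Expires" is always "now + max-age", so the two values stay in sync.
function buildCacheHeaders(lastModifiedMs, maxAgeSeconds, nowMs) {
  // max-age is in seconds, JavaScript dates are in milliseconds
  var expiresMs = nowMs + maxAgeSeconds * 1000;

  return {
    "Last-Modified": new Date(lastModifiedMs).toUTCString(),
    "Cache-Control": "max-age=" + maxAgeSeconds + ", public",
    "Expires": new Date(expiresMs).toUTCString()
  };
}

// Example: content last changed at noon UTC, cached for one hour from 13:00 UTC
var exampleHeaders = buildCacheHeaders(
  Date.UTC(2008, 7, 5, 12, 0, 0), 3600, Date.UTC(2008, 7, 5, 13, 0, 0));
```

In a real handler you would pass Date.now() as the third argument; it is a parameter here only to make the example repeatable.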


Second, it's a good thing to handle the "If-Modified-Since" request header and return HTTP status 304 (Not Modified) if the client's cached version of the content is still fresh.
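
Here's a minimal sketch of that freshness check in plain JavaScript. The function name isNotModified is made up for illustration, and since the way you read a request header depends on your framework, the header value is simply passed in as a string (or null if the client didn't send one). It assumes last-modified is in milliseconds, as above.

```javascript
// Hypothetical helper (not an ESXX API): decide whether a 304 reply is enough.
// ifModifiedSince: raw "If-Modified-Since" header value, or null/undefined
// lastModifiedMs:  when the content last changed, in milliseconds
function isNotModified(ifModifiedSince, lastModifiedMs) {
  if (!ifModifiedSince) {
    return false; // no conditional header, so send the full response
  }

  var sinceMs = Date.parse(ifModifiedSince);
  if (isNaN(sinceMs)) {
    return false; // unparsable date, so play it safe and send everything
  }

  // HTTP dates only have one-second resolution, so drop the milliseconds
  // from our own timestamp before comparing.
  var lastModifiedWholeSeconds = Math.floor(lastModifiedMs / 1000) * 1000;
  return lastModifiedWholeSeconds <= sinceMs;
}
```

When the function returns true, the handler can answer with status 304 and no body instead of regenerating the page.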

Third, make sure the "Content-Length" response header is sent. This probably involves buffering the produced content on the server before it's sent to the client. In ESXX, you do this by setting the "buffered" property on the Response object to "true", like this:

var result = new Response(200, headers, body, content_type);
result.buffered = true;
return result;


The Cacheability Query is a great tool for checking these headers!

With the content now cacheable (the browsers are already caching your pages locally), it's time to make sure Apache caches the content locally as well, and serves the cached pages directly from disk (or rather, the OS' in-RAM disk cache).

And that's when it becomes tricky. I tried on both CentOS 4 (Apache 2.0.52) and Fedora 9 (2.2.8), and neither would cache anything but static files! And trust me, I tried hard.

The solution I finally settled on was to hide the "real" site behind mod_proxy and then add the caching directives to the proxy server. In my case, I added (yes, those are dashes, not dots!)

127.0.0.1    www-esxx-org


to /etc/hosts and then used an Apache (mod_cache and mod_disk_cache) configuration that looks something like this:

<Virtualhost *:80>
ServerName www.esxx.org

<Location />
ProxyPass http://www-esxx-org/
ProxyPassReverse http://www-esxx-org/
</Location>

CacheEnable disk /
CacheRoot "/var/cache/mod_proxy"
</Virtualhost>

<Virtualhost localhost:80>
ServerName www-esxx-org

# Real site configuration goes here

</Virtualhost>


As you can see, the "real" site is only accessible from localhost. One final note! Be aware of the fact that in the "real" site configuration, all accesses will appear to come from 127.0.0.1, so any "Allow from" directives will be completely useless.
