Thursday, November 19, 2009

90% of all web apps are broken

A couple of weeks ago, µ ran an article stating that nine out of ten web applications are broken from a security standpoint. Half of the vulnerabilities were SQL injections and cross-site scripting problems.

A few days later, hackers broke into a Brazilian power grid operator using, you guessed it, an SQL injection attack.

Seriously, isn't it about time we stopped accepting these kinds of failures? I think it is, and that's one of the reasons I wrote ESXX in the first place:

  • ESXX apps never build XML or HTML output as strings; they always operate on real XML nodes, which ensures that all text nodes and attributes are properly quoted when output or transformed by the stylesheets. That means no cross-site scripting attacks, without effort!
  • In ESXX, it's easier to use prepared SQL statements than it is to build SQL queries by hand, which means that all your SQL parameters will always be properly encoded. And without encoding errors, there can be no SQL injection attacks.
For more information on how ESXX helps you write secure web applications, please read this wiki page.
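
To illustrate the first point outside of ESXX, here is a minimal Java sketch that uses only the JDK's DOM and transformer APIs. It is not ESXX code; the class name NodeOutputSketch and the render helper are made up for this example. The untrusted text is attached to a real element node, and the serializer takes care of the escaping:

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NodeOutputSketch {
    // Hypothetical helper: wraps untrusted text in a <p> element and
    // serializes the resulting node tree to a string.
    static String render(String untrusted) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();

        // The text becomes a text node; it is never spliced into markup by hand.
        Element p = doc.createElement("p");
        p.setTextContent(untrusted);
        doc.appendChild(p);

        // An identity transform serializes the tree and escapes markup characters.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render("<script>alert(1)</script>"));
    }
}
```

Whatever the visitor types, the angle brackets reach the browser as entities rather than as a live script element, which is the property the first bullet above relies on.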

Sunday, June 14, 2009

Using Apache's HttpClient on Google App Engine

If you, like me, have tried to use Google's URL Fetch Java API on the Google App Engine, you've probably been disappointed. Sure, it's a small, clean API, but it's totally feature-less. The most advanced thing it supports seems to be ... well, it can follow redirects automatically. Wow. Cookies? Authentication? Forget it.

In ESXX, I use Apache's HttpClient 4, and it works really well. Wouldn't it be nice if you could use HttpClient on the App Engine? Well, now you can. All it takes is a custom connection manager that converts the final requests and feeds them into the URL Fetch service, and then feeds the responses back into HttpClient.

You can have a look at the implementation here. It's just two classes, one ClientConnectionManager and one ManagedClientConnection class.

PS. ESXX now runs really well on GAE. Timers, the http, https, mailto, jdbc (in-memory H2) and data URI protocols, and HTML parsing: yep, they all work! Only the dns and ldap protocols are non-functional (they will probably never work). Check out trunk from Subversion to try it yourself. Build using ant gae-war.

Updated 2009-12-11: The URIs were double-encoded and query parameters did not work at all. Thanks for pointing this out, Thibaut!

Updated 2010-08-08: The URIs lacked a colon before the optional port number. Thanks for pointing this out, Nello! Also, I changed the license for the two files to LGPLv3.

Saturday, June 6, 2009

ESXX on Google App Engine

I got most of ESXX running on Google App Engine today! How cool is that?

Most of the code is in the subversion repository already. I'll try to finish the port as soon as possible and also add seamless support for Google's HTTP client APIs. Once fully checked in, you too can deploy ESXX + your custom JavaScript apps on GAE.

In the meantime, have a look at http://esxx-demo.appspot.com/.

The friendliest server-side JavaScript, now also on Google's servers. Life's sweet.

ThreadPoolExecutor deadlocks

I'm currently trying to get ESXX running on Google's App Engine. One of the problems is that GAE won't let you create background threads or timers, something which most applications, including ESXX, often do.

My initial plan was to switch from a plain ThreadPoolExecutor and Timers to a ScheduledExecutorService. Once done, I would write a GAE-specific, single-threaded version of that class that tries to mimic the intended behaviour as well as possible.

The problem is that ScheduledThreadPoolExecutor, the standard ScheduledExecutorService implementation, is a fixed-size thread pool using an unbounded work queue, which just doesn't work in ESXX. Consider this simple example:

import java.util.concurrent.*;

class STPEDeadlock {
  public static void main(String[] args) {
    final ExecutorService exe =
      new ScheduledThreadPoolExecutor(1, new ThreadPoolExecutor.CallerRunsPolicy());

    System.out.println(Thread.currentThread() + ": Submitting job 1");
    Future<?> j1 = exe.submit(new Runnable() {
      @Override public void run() {
        System.out.println(Thread.currentThread() + ": Submitting job 2");
        Future<?> j2 = exe.submit(new Runnable() {
          @Override public void run() {
            System.out.println(Thread.currentThread() + ": Running job 2");
          }
        });

        try {
          System.out.println(Thread.currentThread() + ": Waiting for job 2");
          j2.get();
        }
        catch (Throwable t) {
          t.printStackTrace();
        }
      }
    });

    try {
      System.out.println(Thread.currentThread() + ": Waiting for job 1");
      j1.get();
    }
    catch (Throwable t) {
      t.printStackTrace();
    }

    System.out.println(Thread.currentThread() + ": All done");
    exe.shutdown();
  }
}
It creates a fixed-size thread pool, and adds and waits for a job. The job, in turn, adds and waits for another job. The problem is that there are no worker threads left, so the second job is just queued and never executed, resulting in a deadlock.

If you replace the executor with
final ExecutorService exe = new ThreadPoolExecutor(1, 1,
  0L, TimeUnit.MILLISECONDS,
  new SynchronousQueue<Runnable>(),
  new ThreadPoolExecutor.CallerRunsPolicy());
then all is fine again. This is the fixed-size executor configuration I use in ESXX. Unfortunately, there is no way to configure a ScheduledExecutorService to match this behavior.
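
Why this configuration avoids the deadlock can be checked with a small sketch. It assumes Java 8 lambdas (newer than this post), and the class and method names are invented for the illustration. A SynchronousQueue has no capacity, so when the only worker is busy the nested submit is rejected straight away, and CallerRunsPolicy then runs the nested job on the submitting thread instead of queueing it:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CallerRunsSketch {
    // Returns true if the nested job ran on the same thread as the job
    // that submitted it.
    static boolean nestedJobRunsInCaller() throws Exception {
        final ExecutorService exe = new ThreadPoolExecutor(1, 1,
            0L, TimeUnit.MILLISECONDS,
            new SynchronousQueue<Runnable>(),
            new ThreadPoolExecutor.CallerRunsPolicy());
        try {
            Callable<Boolean> outer = () -> {
                final String outerThread = Thread.currentThread().getName();
                Callable<String> inner = () -> Thread.currentThread().getName();
                // The single worker is busy running this job, so the hand-off
                // fails and the rejection handler runs 'inner' right here.
                Future<String> j2 = exe.submit(inner);
                return outerThread.equals(j2.get());
            };
            // The timeout is only a safety net for the sketch.
            return exe.submit(outer).get(5, TimeUnit.SECONDS);
        } finally {
            exe.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Nested job ran in caller: " + nestedJobRunsInCaller());
    }
}
```

With the unbounded queue of the scheduled executor, the same nested submit would always be accepted into the queue, so the rejection handler would never get a chance to run it.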

What were the Sun developers thinking??

Sunday, May 24, 2009

Scaling up with GoGrid

My small GoGrid experiment the other day made me curious. Assume the blog I put online became really popular. How would my deployment cope, and how would I be able to increase capacity?

So I figured I'd script a small benchmark. Assume all visitors land on the blog's front page, and that half of the visitors click on the first post to read its comments. Finally, assume a third of those add a comment of their own. Translated into a bash script that uses Apache ab, this scenario might look something like this:

#!/bin/bash

AB="ab"
BASE=http://216.121.78.66/blog/index.esxx

function bench {
  start=$(date +%s)

  # 600 requests for the front page, 60 at a time
  ${AB} > /dev/null -n 600 -c 60 -e apa ${BASE} &
  pid1=$!

  # 300 requests for the first post, 30 at a time
  ${AB} > /dev/null -n 300 -c 30 -e apa ${BASE}/posts/1.html &
  pid2=$!

  # 100 comment submissions, 10 at a time (also in the background, so pid3 is valid)
  ${AB} > /dev/null -n 100 -c 10 -T application/x-www-form-urlencoded -p ab-blog.post ${BASE}/posts/1.html &
  pid3=$!

  # Poll until all three ab processes have finished
  while [ -n "$(ps | grep -E "^ *(${pid1}|${pid2}|${pid3})")" ]; do
    sleep 0.1
  done

  end=$(date +%s)

  echo "Total time: $(expr ${end} - ${start})"
}

bench
bench
bench
bench
bench

As you can see, I assume 100 concurrent requests.

On my particular configuration (a 1 core, 0.5 GB RAM virtual GoGrid server in San Francisco and an OpenSolaris laptop in Sweden acting as the client), it turns out that the benchmark takes about 19 seconds per iteration, or 19 ms/request.

It's not too bad, but let's see if we can improve. First of all, we should switch from an embedded to a standalone SQL database. By watching 'top' on our single server, it appears that the ESXX Java process is using up all the CPU; however, since the DB is embedded, we don't know what the limiting factor is; it could be the XSLT transformation, the JavaScript code or the SQL queries.

Since the DB can't easily be scaled horizontally, we deploy a 3 core, 4 GB RAM virtual server and start H2 in server mode on that server, using the following command line:

java -cp h2/bin/h2-1.1.107.jar -Dh2.bindAddress=10.102.228.78 org.h2.tools.Server -tcp -tcpAllowOthers

We then modify the following line in /var/www/ajax-blog/blog.js, to make it connect to the external H2 server instead of using H2 in embedded mode:

var blog = new Blog("jdbc:h2:tcp://10.102.228.78/~/Blog;AUTO_RECONNECT=TRUE", "admin", "secret");

Re-running the benchmark reveals that the SQL database sits mostly idle. That's hardly surprising. After all, how much CPU can a few SELECTs from a 100-something row table use? Knowing this, I could have just started the H2 database as a process on the initial web node, but it's always easy to be wise after the event.

Anyway, now we know how to improve the performance. Let's deploy another web front-end, add a load balancer and rerun the benchmark. The result? Almost twice the performance, or 10 ms/request!


And all you need is a web browser, an SSH client and a credit card. The Internet is truly an amazing place.

PS. With four front-ends, we're no longer CPU limited and probably need to tweak the Apache configuration (it's forking like crazy) or ESXX's thread pool settings to obtain further performance improvements.

Friday, May 22, 2009

From localhost to live in 60 minutes using GoGrid

I made an experiment today. The question I wanted to answer was this:
Given a locally developed ESXX application, running on my laptop, how long would it take to go live, assuming you own no servers or Internet connection suitable for such deployment?

For this, I turned to my favourite grid/cloud service, GoGrid. GoGrid is pretty amazing. With just a few clicks in their admin UI, you can create all kinds of servers in their data center, and they will be available online within minutes. Pretty cool stuff, and a perfect match for ESXX, considering the "friendliness" factor!

The app I tested was the Ajax Blog tutorial, which is available as one of the examples in the ESXX distribution.

So I logged in to my GoGrid account and created a "Web/App Server" called "ajax-blog" using the 64-bit CentOS 5.1/Apache 2.2 image. About 15 minutes later, the server was online and I could log in as the root user. After a "yum update" and a reboot, the system was updated to CentOS 5.3 and ready to be configured according to my requirements.

All in all, setting up the "hardware" and the base OS took about two minutes of work and forty minutes of waiting.

First up: software installation. We need to add the ESXX RPM repository, install Java and (unfortunately) we also need to compile mod_fastcgi ourselves. No big deal.

[root@17914_1_20449_109821 ~]# cat > /etc/yum.repos.d/esxx.repo
[esxx]
name=ESXX
baseurl=http://esxx.org/repos/rpm/
enabled=1
gpgcheck=0

[root@17914_1_20449_109821 ~]# yum install esxx java-1.6.0-openjdk httpd-devel.x86_64
...
Complete!
[root@17914_1_20449_109821 ~]# service esxx start
Starting esxx: [ OK ]
[root@17914_1_20449_109821 ~]# wget http://www.fastcgi.com/dist/mod_fastcgi-2.4.6.tar.gz
...
[root@17914_1_20449_109821 ~]# tar xfz mod_fastcgi-2.4.6.tar.gz
[root@17914_1_20449_109821 ~]# cd mod_fastcgi-2.4.6
[root@17914_1_20449_109821 mod_fastcgi-2.4.6]# make -f Makefile.AP2 top_dir=/usr/lib64/httpd/ local-install
...
[root@17914_1_20449_109821 mod_fastcgi-2.4.6]#

Time to install our ESXX app. We'll create a copy of the Ajax Blog tutorial in /var/www/ajax-blog and tweak the settings a bit:

[root@17914_1_20449_109821 ~]# cp -r /usr/share/doc/esxx/examples/ajax-blog /var/www/
[root@17914_1_20449_109821 ~]# chgrp apache /var/www/ajax-blog/
[root@17914_1_20449_109821 ~]# chmod g+w /var/www/ajax-blog/

Edit /var/www/ajax-blog/blog.js and change the line that creates the 'blog' object to include the full path of the SQL database and a new admin password:
esxx.include("src/Blog.js");

XML.ignoreWhitespace = false;
XML.prettyPrinting = false;

var blog = new Blog("jdbc:h2:/var/www/ajax-blog/Blog", "admin", "secret");

Next, create the file /etc/httpd/conf.d/ajax-blog.conf as follows:
LoadModule fastcgi_module modules/mod_fastcgi.so

FastCGIExternalServer /usr/sbin/esxx -host localhost:7654 -pass-header Authorization

# Install handler for all '*.esxx' files
AddType text/x-esxx .esxx
Action text/x-esxx /cgi-bin-esxx
ScriptAlias /cgi-bin-esxx /usr/sbin/esxx

# The Ajax Blog
Alias /blog /var/www/ajax-blog/public
RedirectMatch ^/blog/?$ /blog/index.esxx

Restart httpd, and yeah, that's it. The app is now live and you should be able to visit it at http://YOUR-SERVER-IP/blog/. How's that for friendly deployment?

Thursday, May 21, 2009

ESXX advances into beta

Whoo, long time, no see ... It's been a while since the last blog post, but today, we celebrate the fact that ESXX is no longer considered alpha quality with an all-new look of esxx.org.

I've already blogged about some of the new features in this release, but I just want to mention that there are now two tutorials available in the Wiki. The first one is very basic, while the second one demonstrates how easy it is to create a fully functional, database-backed blog using ESXX, complete with an advanced AJAX administration UI.

It's definitely worth a read!

Sunday, February 22, 2009

JavaScript web applications/browser services, the ESXX way

In ESXX, the difference between a web service (a program that produces XML or JSON intended for other programs) and a web application or browser service (a program that produces HTML documents intended to be viewed by a human using a web browser) is minimal. Basically, it's only the data format (XML/JSON vs HTML) that differs.

It's a good idea to remember this when you build a web application. The response from your request handlers should be raw XML objects that include all information that is unique to this particular web page or form, but nothing more. Layout, common headers and footers, static sidebars … everything that is common to all pages should be added by the XSLT stylesheet and/or external CSS stylesheets.

(Similarly, if you use filters to plug in a JavaScript template engine instead of using ESXX' built-in XSLT engine, your handlers should return plain JavaScript objects, containing the information that is unique for the page in question.)

ESXX provides two tools that are useful for building web applications: stylesheet handlers and request filters. Both can be used when building web services as well. Filters in particular are useful for both kinds of services.

Stylesheet handlers

As mentioned before, stylesheet handlers can be triggered by both HTTP and SOAP handler responses. When such a handler returns XML, the registered stylesheet handlers are searched based on the response's content type and the part of the URI that follows the .esxx file.

The handler specifies an XSLT 2.0 stylesheet that will, just like the application itself, be compiled and cached in memory until it times out or the file is modified. The stylesheet will then be applied to the response on every request. The params property of the ESXX.Response object can be used to set stylesheet parameters (response.params.mode refers to the <xsl:param name="mode"/> XSLT parameter).

If you're used to thinking in Model/View/Controller terms, the matched stylesheet is a collection of views and the name of the root element in the data returned by the request handler (the controller) is the name of the view to apply.

XSLT 2.0 is much more advanced and useful than XSLT 1.0, which is currently implemented by the browsers, but should the need arise, it's possible to call any JavaScript function from an XPath context using javascript: URIs.

For instance, the following JavaScript function, defined by your web application
function MyClass() {}

MyClass.prototype.getCurrentDate = function() {
  let now = new Date();

  return <currentDate>
           <day>{now.getDate()}</day>
           <month>{now.getMonth() + 1}</month>
           <year>{now.getFullYear()}</year>
         </currentDate>;
};

var myObject = new MyClass();
can be used by the XSLT stylesheet like this:
The current year is <xsl:value-of xmlns:my="javascript:myObject" select="my:getCurrentDate()/year"/>.
Global functions can be called by leaving out the object from the URI:
Today is <xsl:value-of xmlns:my="javascript:" select="my:Date()"/>.
It's also possible to call class or instance Java methods:
The current year is <xsl:value-of xmlns:my="java:java.util.Date" select="my:getYear(my:new()) + 1900"/>.

Request filters

Request filters differ from request handlers in that more than one filter may be invoked for a single request. They are defined as follows:
<esxx xmlns="http://esxx.org/1.0/">
<handlers>

</handlers>

<filters>

<filter method={http-method}
uri={path-info}
handler={object-and-method} />
<filter … />

</filters>
</esxx>
For each request, a list of matching filters is built and the first filter in the list is invoked with two parameters: the request object and a function that calls the next filter in the list, or, if the current filter is the last, a function that calls the request handler.

Each filter is supposed to invoke the next function and return its value. This way, the request handler will eventually be invoked and its response will propagate back to the client via the filter handlers. By simply not calling the next function, a filter can abort the request and return its own response.

(The next function optionally takes a single argument, which will be passed as the req parameter to the next filter or handler. If unspecified, the current request object is passed. This way, it's possible to completely replace the request object.)
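
The chaining described above can be sketched in a few lines of Java. This is only an illustration of the idea, not ESXX's actual implementation: the Filter interface, the dispatch method and the use of plain strings as request and response are all assumptions made for this example. Since Java has no optional arguments, next always takes the request explicitly here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

class FilterChainSketch {
    // Hypothetical filter shape: (request, next) -> response.
    // Strings stand in for real request and response objects.
    interface Filter extends BiFunction<String, Function<String, String>, String> {}

    // Invokes the first filter; each 'next' calls the following filter,
    // and the last 'next' calls the request handler.
    static String dispatch(List<Filter> filters, Function<String, String> handler, String req) {
        return invoke(filters, 0, handler, req);
    }

    private static String invoke(List<Filter> filters, int index,
                                 Function<String, String> handler, String req) {
        if (index == filters.size()) {
            return handler.apply(req);
        }
        Function<String, String> next = r -> invoke(filters, index + 1, handler, r);
        return filters.get(index).apply(req, next);
    }

    static String demo(String req) {
        // Post-processes whatever comes back through the chain.
        Filter tagResponse = (r, next) -> next.apply(r) + " [filtered]";
        // Aborts the request by not calling next.
        Filter requireSecret = (r, next) -> r.equals("secret") ? next.apply(r) : "403 Access denied";
        Function<String, String> handler = r -> "handled " + r;
        return dispatch(Arrays.asList(tagResponse, requireSecret), handler, req);
    }

    public static void main(String[] args) {
        System.out.println(demo("secret")); // handled secret [filtered]
        System.out.println(demo("guess"));  // 403 Access denied [filtered]
    }
}
```

Note that the response of the aborting filter still passes back through the filters before it, which matches the description of responses propagating back via the filter handlers.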

Here are a few filter handler examples:
function noOpFilter(req, next) {
  return next();
}

function forbiddenFilter(req, next) {
  // Abort request by not calling next() if cookie not set
  if (req.cookies.secret != "Open Sesame") {
    return [ESXX.Response.FORBIDDEN, {}, "Access denied"];
  }
  else {
    return next();
  }
}

function postFilter(req, next) {
  let res = next();

  // Set XSLT params
  res.params.mode = "silver";

  // Add HTTP response header
  res.headers["Cache-Control"] = "max-age=1800, public";

  return res;
}

Saturday, February 21, 2009

JavaScript request filters

It's never too late to change your mind, and I'm happy to let you know that I checked in support for request filters in the trunk yesterday.

Filters differ from handlers in that more than one filter may be executed for a given request. Before a request is serviced, a matching filter chain is built and all matching filters are then executed in turn as part of the request handling.

Each filter may modify (or even replace) both the request and response object or simply abort the request by not calling the next filter in the filter chain.

In ESXX, a filter is called with two parameters: the request object and a function that calls the next filter in the chain. The filter is supposed to return a response that will be sent to the client (optionally passing through the XSLT engine). That means that a filter may:
  • Abort the request by not calling the next() function at all.
  • Tweak the request by modifying the request object before calling the next() function.
  • Tweak the response by modifying the result of the next() function.
Filters are great for password-protecting parts of a site, setting XSLT stylesheet parameters or modifying responses (like converting XML to JSON or processing views in JavaScript instead of using ESXX' built-in XSLT engine).

Do have a look at an example to see how ESXX filters work.

Sunday, February 15, 2009

JavaScript servlets

Today, I checked in support for running ESXX as a servlet inside a Java EE application server. Why, someone may ask, run an application server within another application server?

Well, for starters, it allows you to use ESXX with your existing infrastructure and tool set. If you're already using an app server such as Glassfish or Tomcat, you can add modules powered by ESXX by rebuilding the war archive. Simply add your JavaScript code files, change the http-root parameter in web.xml, rebuild and deploy the generated esxx.war archive on your app server.

More work is required, but if you're interested in trying it, just check out trunk from the Subversion repository at http://svn.berlios.de/svnroot/repos/esxx/trunk/ and type ant jee-war to build your own esxx.war archive. By default, it serves the ESXX example applications.