Getting Cloudy: Clojure on Google App Engine

Some weeks ago I joined a handful of ThoughtWorkers invited to test the new Google AppEngine’s Java API. Unfortunately I had a project requiring a lot of attention during most of this period but once back on the beach I found some time to play around with it.

Cloudy Skies

Google AppEngine (GAE) is Google’s shot at the cloud-computing segment. Using the service you deploy your system in Google’s infrastructure and are allowed to use BigTable, MapReduce and other tools that the Internet giant developed in the past years.

Google’s take on cloud-computing is to offer a full development platform and not only “somewhere to deploy to”. Using Amazon AWS, for example, you have access to virtual servers but Amazon has very little control over how you develop your application.

I like to think about Amazon and Google’s approaches as:

  • The Cloud as Deployment Strategy: That is Amazon. Your application doesn’t need anything to be cloud’ed, the cloud is just a deployment option.
  • The Cloud as the Development Platform: That is Google. The vendor doesn’t offer only a deployment solution but also influences how you develop your application, supplying you with tools and services but often limiting your options to whatever is officially supported.

Google’s model can be a bad thing when it gets in your way but in GAE’s case it is often just a matter of enforcing security and scalability best practices. Some of them may be really odd for a simple application though.

This model has a big advantage: as Google has control over the environment it can offer some services and optimisations that deployment-only clouds can’t. Amazon is aware of that and just released a MapReduce service.

Cloudy JVM

A terrible limitation in the platform strategy followed by Google is that you can only use the tools they support. For some time Python was the only language supported and that may explain why not so many applications use GAE.

This changes dramatically with the new Java support. We all know Java is probably the most popular language in our industry, but more than that, during the past years more and more languages were created for or ported to the JVM. By supporting Java, Google automatically adds support for Groovy, Ruby, Scala and many other languages, as long as they can work under the restrictions imposed by the platform.

My Experience so Far

As I said before I didn’t have a lot of time to play around with GAE. In the last two days I wrote a simple Clojure application that talks to Twitter and performs some very basic data transformation. The code is available on Github. The app is just a toy and works on the local server, but I’m still adding Twitter authentication to make it work properly on GAE. Feedback welcome.

The first time I tried using Clojure on GAE, weeks back, I had so many class loader issues that I thought about giving up. During the past weeks both GAE and Clojure were updated many times and it seems that the class loading issues were resolved; maybe some Clojure committer was part of the GAE alpha testing program too.

GAE works using a Java Servlets 2.5 environment. My goal was to not use the Java language at all (nor a web framework like Compojure or Ring), so the easiest way to get this running was to export a namespace as a Servlet, something quite common in the community:

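;; gen-class compiles this namespace into a class named BestMateServlet
;; that extends HttpServlet.
(ns BestMateServlet
  (:gen-class
   :extends javax.servlet.http.HttpServlet)
  (:use mate))

;; Writes text to the response body.
(defn- write-to-resp [resp text]
  (.println (.getWriter resp) text))

;; gen-class maps methods to functions prefixed with "-"; the first
;; argument is the servlet instance. process-request comes in via (:use mate).
(defn -doGet [_ req resp]
  (write-to-resp resp (process-request req)))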

And then using this Servlet in the web.xml file:

<servlet>
 <servlet-name>mybestmate</servlet-name>
 <servlet-class>BestMateServlet</servlet-class>
</servlet>
<servlet-mapping>
 <servlet-name>mybestmate</servlet-name>
 <url-pattern>/user/*</url-pattern>
</servlet-mapping>
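
With the /user/* mapping, whatever follows /user in the URL reaches the servlet through HttpServletRequest.getPathInfo. The real process-request lives in the mate namespace and is not shown here, so the following is only a sketch of how the username could be pulled out of that path. The name username-from-path is hypothetical, and clojure.string assumes a newer Clojure than the one used in this post.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper, not part of the original application.
;; Takes the value of (.getPathInfo req), e.g. "/someuser", and
;; returns the first non-empty path segment, or nil if there is none.
(defn username-from-path [path-info]
  (when path-info
    (first (remove str/blank? (str/split path-info #"/")))))

;; Inside process-request it could be used as:
;;   (username-from-path (.getPathInfo req))
```

Keeping this as a plain function of a string means it can be exercised at the REPL without starting the local server.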

That works fine. The second step was to try to get rid of Eclipse. GAE has a fully functional Eclipse plugin but when I am writing anything but Java I use emacs. The GAE documentation points to some nice Ant macros that allow you to start your local dev environment (using a modified Jetty), deploy to production or even download logs. It shouldn’t be that hard to convert some of those to Capistrano.

There are many limitations in the classes you can use. As an example I tried to get the Apache Commons HTTP client in my project but it uses raw sockets, something GAE forbids. My next option was to use Lazy XML but GAE’s SecurityManager will not allow you to use Clojure agents, therefore the call to parse-seq dies with a “java.security.AccessControlException: access denied (java.lang.RuntimePermission modifyThreadGroup)”.

My next try was Duck Streams. It sort of worked but I had problems closing buffers, and I am still not sure if it was my ignorance or something that doesn’t work right in GAE. So I got back to Google’s advice and used the URL class. It was actually pretty easy:

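(import '(java.io BufferedReader InputStreamReader)
        '(java.net URL))

;; Fetches uri and returns the body as a single string.
;; line-seq drops line terminators, so the lines are joined without separators.
(defn- GET-body [uri]
  (with-open
      [reader (BufferedReader. (InputStreamReader. (.openStream (URL. uri))))]
    (apply str (line-seq reader))))

;; twitter-page-url is defined elsewhere in the app.
(defn- twitter-page-for [username]
  (GET-body (twitter-page-url username)))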
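
One detail worth knowing about this approach: line-seq strips the line terminators, so (apply str ...) glues all lines together. If the line breaks matter, a variant along these lines keeps them. This is only a sketch; read-all is a hypothetical name, clojure.string assumes a newer Clojure than the one used in this post, and a StringReader stands in for the network stream so it can be tried at the REPL.

```clojure
(require '[clojure.string :as str])
(import '(java.io BufferedReader StringReader))

;; Hypothetical helper: reads everything from reader and returns it as
;; one string with lines separated by "\n". str/join builds the whole
;; string before with-open closes the reader, so nothing lazy escapes.
(defn read-all [reader]
  (with-open [r (BufferedReader. reader)]
    (str/join "\n" (line-seq r))))

;; Try it without touching the network:
(read-all (StringReader. "first\nsecond"))
```

On GAE the reader would be the same InputStreamReader over (.openStream (URL. uri)) used above.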

Besides those issues I did not have many problems running Clojure on GAE. One may argue that it is a big issue that you can’t use agents inside the server and I have to agree with that. Google’s answer makes sense and if you check the documentation you will see that there are many ways to get high performance without using those.

The problem is that Clojure shines in single JVM concurrency. Lots of libraries rely on agents and the like being available and life is not that sweet when you can’t use those.

Development Feedback Cycle

As I said before you have a nice Eclipse plugin (nice enough for a beta) but you can use ant to do pretty much all you need during development. That helps a lot but the feedback cycle is still a bit too long.

The local environment tries to be very close to production but problems will always arise when you finally deploy. It is probably not a good idea to deploy frequently during development, but I think that in a real project deployment should at least be a step in the Continuous Integration build. Even during local development you have to constantly stop and start the local server.

Conclusions?

I think that the new platform is pretty promising. Google has some advantages over Amazon in the multiple services it makes available. At the same time Amazon is a really strong player.

As pricing strategies for both options are really similar I think that right now the choice is between services provided by Google or flexibility provided by AWS. Amazon is adding services to their platform and Google certainly will evolve their platform in the near future.

I have been using AWS for a while now and I am really satisfied in most aspects. Even with services like S3 you have enough freedom to use them in multiple ways. My experience with GAE was very positive though and I will definitely think about it in my next project.

Update: Coverage by fellow ThoughtWorkers:

John Hume
Paul Hammant
Ola Bini
Sriram Narayan

8 Responses to “Getting Cloudy: Clojure on Google App Engine”


  1. jeo Apr 8th, 2009 at 9:09 pm

    Nice article. No MapReduce available yet though :(

  2. skeptomai Apr 9th, 2009 at 1:22 am

    Nice article, but I believe you have mischaracterized both GAE and AWS. I cannot speak for the builders of GAE, but I was one of the builders of EC2. Here’s my opinion, not the opinion of Amazon, my former employer. EC2, like other Amazon Web Services, was designed as a composeable primitive. It is meant to enable much broader behaviour and to enable the construction of more specific verticals by either Amazon or its ecosystem of customers. Granted, its strength right now is as a deployment platform, but look at the integration already with S3 as blob storage, EBS as a “SAN-like” horizontal store and datasets published by others as snapshots and consumable by all. Now, there is the (vertical) map-reduce service. It was our belief that starting at the bottom and making nearly everything available would vastly increase the possibilities. Contrast that with GAE. First, it was only Python and only one style of application. You could not implement EC2, or other EC2-deployed apps on GAE, but the converse is true: you can implement GAE on EC2. The two platforms will grow toward a middle ground. EC2 will accrue specific capabilities and GAE will broaden support. But, it’s unfair to state that EC2 is meant only to be a deployment strategy. Let me ask this: you are already a user of AWS. Have you already been running Clojure there?

  3. Alejandro "Badugi" Russo Jun 26th, 2009 at 3:27 am

    Cloud computing is only a matter of years away before it’ll be mainstream, google will be one of the leaders but they’ll never get out of beta & it’ll go offline twice a week like gmail :D

Creative Commons License

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.