Archive for the ‘comet’ Category

Screenpush: Publish Your Browser Window Over Comet

Posted on 2008-12-10 in comet, javascript

Back in December of 2007, Simon Willison came up with a very neat Comet demo written with just a few lines of Dojo code. It’s a presentation/slideshow app where a master client page, controlling the slideshow, pushes the URLs of slide images out to subscribed clients over a Comet channel. The idea is that subscribers can view a conference presentation on their laptops, instead of the big screen—or even from another room, or over the internet.

In Simon’s implementation, the slides are simply a series of JPEGs, hosted on the Comet server. I’ve been playing around with HTML Canvas recently, and began to wonder if I could use it for something more sophisticated: is it possible to push arbitrary web content between browsers as a dynamically generated image?

The answer is ‘Yes’. And also ‘No’.

The ‘Yes’ is thanks to two canvas methods that do nearly all of the necessary work. The first, the 2D context’s drawWindow() method, renders an arbitrary part of the browser window into a canvas, which lets you effectively screengrab the browser from JavaScript. The second, canvas.toDataURL(), outputs a representation of the canvas as a PNG or JPEG using the data: URL scheme, which can be wrapped in JSON and pushed to subscribed clients for display.
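The wire format that falls out of those two calls is trivial: toDataURL’s output string goes into a small JSON envelope on the publisher, and comes straight back out on the subscriber. A runnable sketch of just that envelope (the grab itself needs Mozilla’s privileged drawWindow, so a fake data: URL stands in here; the dataUrl field name matches the subscriber snippet later in the post, but the helper functions are illustrative, not Screenpush’s actual code):

```javascript
// Publisher side: wrap the canvas output in a JSON message body.
// In the real extension the string comes from canvas.toDataURL().
function buildPushMessage(dataUrl) {
  return JSON.stringify({ dataUrl: dataUrl });
}

// Subscriber side: unwrap the message and return the image source
// to assign to the <img> element.
function extractImageSrc(json) {
  return JSON.parse(json).dataUrl;
}

var fakeGrab = "data:image/png;base64,iVBORw0KGgo=";
var msg = buildPushMessage(fakeGrab);
console.log(extractImageSrc(msg) === fakeGrab); // true
```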

However, the ‘No’ is due to drawWindow being a non-standard, Mozilla-only addition to the canvas API. Furthermore, security restrictions—designed to prevent bad guys from stealing the content of your screen—mean that the drawWindow function is only available to the browser itself, or to browser add-ons.

So I made a proof-of-concept Firefox extension called ‘Screenpush’. Here’s a demo:

The Screenpush Extension

The Screenpush UI is just an icon in the Firefox status bar (which reflects connection status), plus a popup context menu. There are two ways to publish your screen to subscribing clients. You can manually invoke a screenpush by double-clicking the status bar icon (equivalent to “Push Current Screen” in the popup menu), or you can check the “Auto-Push Screen” menu item. In auto-push mode, Screenpush publishes a grab whenever a new page is loaded or a page’s content changes; whenever you click or type in a page; and whenever you scroll the page, resize the browser or switch tabs. This mode works well for broadcasting an S5 slideshow, or demoing navigation through a website.

When auto-push is active, the “Track Mouse Pointer” option is available. When you enable this setting, Screenpush is triggered by mouse movements, tracking your cursor position with a fake mouse pointer drawn onto the published image (it even has a nice drop shadow in FF3.1). In this mode, Screenpush works like a rudimentary streaming screencast. It will react to mouseover JavaScript menus, CSS hover effects etc., but the downside is the generation of much more Comet traffic, so this mode may not perform too well across the internet.

Extension Settings

The Screenpush extension has a simple options screen (accessed via the pop-up menu, or from the Firefox Add-Ons dialog), where you can configure the Cometd service location and Bayeux channel to use, and select PNG or JPEG output and a JPEG quality level. PNG looks nicest for simple pages, but scales badly for anything detailed or photographic. For instance, Google Maps screens in Satellite view can render to PNGs larger than a megabyte, which makes the Comet push pretty sluggish. Using JPEG, with quality set around 70, seems to be the best compromise between image quality and push responsiveness.
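The size problem is easy to quantify: a data: URL carries base64, which inflates the image by a third, so a one-megabyte PNG becomes roughly 1.3MB on the wire. A rough helper for estimating the decoded payload size of a data: URL (a sketch for illustration, not part of Screenpush):

```javascript
// Estimate the decoded byte size of a base64 data: URL.
// Every 4 base64 characters encode 3 bytes; trailing '='
// padding marks bytes that aren't really there.
function dataUrlBytes(dataUrl) {
  var b64 = dataUrl.slice(dataUrl.indexOf(",") + 1);
  var padding = (b64.match(/=+$/) || [""])[0].length;
  return (b64.length * 3) / 4 - padding;
}

// "AAAA" encodes exactly 3 bytes.
console.log(dataUrlBytes("data:image/png;base64,AAAA")); // 3
```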

There’s also one Screenpush setting that isn’t exposed in the UI. It’s accessible through the about:config screen, under the “extensions.screenpush.pushDelay” key. This value is the lag in milliseconds between an event firing and a screengrab being taken and published, and acts as a throttle on the frequency of screen pushes, and hence on Comet traffic. The lag also allows a few moments for a rapid sequence of events to conclude, such as those resulting from a mouse move, window scroll or a DOM modification cascade. You might want to experiment with this setting to make Screenpush a bit snappier.
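The pushDelay behaviour amounts to a trailing-edge debounce: each event restarts the timer, and the grab only fires once the burst has gone quiet for the configured delay. A minimal sketch of the idea (the function names and delay value are illustrative, not the extension’s actual code):

```javascript
// Debounce a push: every triggering event resets the timer, so a
// rapid cascade of DOM events produces a single screengrab once
// the delay has elapsed with no further activity.
function makePushScheduler(pushFn, delayMs) {
  var timer = null;
  return function schedulePush() {
    if (timer !== null) clearTimeout(timer);
    timer = setTimeout(function () {
      timer = null;
      pushFn();
    }, delayMs);
  };
}

var pushes = 0;
var schedule = makePushScheduler(function () { pushes++; }, 50);
schedule(); schedule(); schedule(); // a burst of events
setTimeout(function () { console.log(pushes); }, 100); // logs 1
```

A shorter delay makes Screenpush feel snappier at the cost of more Comet traffic, which is exactly the trade-off the pushDelay setting exposes.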

The Subscriber Client

The Screenpush subscriber is just a static HTML page and a bit of JavaScript. Enter your Cometd service URL and channel (the default is /screenpush/demo, same as in the Extension), and hit Connect. Once you get the “Connected” message, the page will auto-update as the Screenpush extension publishes images.

The subscriber works in most modern browsers, with some minor caveats (see below). Since Cometd can use JSONP for cross-domain communication, the subscriber page can be hosted anywhere, even loaded straight from the filesystem. The page just needs to have its Dojo libs in the same directory.

As with Simon’s demo, the code for the subscriber is mostly trivially simple: a few lines of JavaScript to receive a data: URL via Comet and set it as an image source.

// Receive a pushed message and display the image it carries
var receive = function(comet) {
  var img = dojo.byId("screenpush-image");
  img.src = comet.data.dataUrl;
};

// 'channel' is the Bayeux channel entered in the connect form
dojox.cometd.subscribe(channel, receive);

A Note on Browser Compatibility

The Screenpush subscriber works very well in Firefox and Chrome, and mostly works in Safari, Camino and Opera. IE7 is a non-starter, as it has no support for the data: URL scheme. I haven’t bothered to try IE8.

Safari and Opera appear to have problems maintaining a long-lived cross-domain Comet connection, eventually leading to disconnection. You can avoid this issue by hosting your Screenpush subscriber page on the same host and port as the Cometd service.

Camino just has a minor issue where it refuses to display certain data: images, meaning that the Screenpush subscriber will intermittently display empty images.

The Cometd Server

There’s no need for any Screenpush-specific server-side code; you just need a compliant Bayeux/Cometd server. I downloaded Jetty7, which runs a Cometd service out of the box at http://<yourhost>:8080/cometd/cometd. You may need to tweak the maximum form post size allowed by your server, especially if using PNG compression (see Extension Settings). If you start to see the Screenpush error icon when you push a large or detailed screen, this will be the problem. I raised the default maximum form size on Jetty from 200KB to 10MB by adding this to etc/jetty.xml:

<Call class="java.lang.System" name="setProperty">
  <Arg>org.mortbay.jetty.Request.maxFormContentSize</Arg>
  <Arg>10000000</Arg>
</Call>

Miscellaneous Technical Detail

The core of the Screenpush extension is little more than the drawWindow/toDataURL calls I mentioned earlier, along with code to push out the generated data URL over Comet. The window content gets rendered to a hidden canvas element embedded in the statusbar next to the Screenpush icon. Here is a simplified version of the actual push code:

// Get a reference to the window object
var win = gBrowser.selectedBrowser.contentWindow;

// Get the element that knows window dimensions and scroll info
var clientInfo = win.document.compatMode == "CSS1Compat" ?
                    win.document.documentElement :
                    win.document.body;

var region = { x: win.scrollX,
               y: win.scrollY,
               width: clientInfo.clientWidth,
               height: clientInfo.clientHeight };

// Size the canvas
canvas.style.maxWidth = region.width + "px";
canvas.style.maxHeight = region.height + "px";
canvas.width = region.width;
canvas.height = region.height;

// Render to the canvas
var ctx = canvas.getContext("2d");
ctx.clearRect(0, 0, region.width, region.height);
ctx.drawWindow(win, region.x, region.y,
                    region.width, region.height, "#000");

// Push out the rendered image
var dataUrl = canvas.toDataURL();
jQuery.comet.publish(cometdChannel, { dataUrl: dataUrl });

I’m using a hacked version of the jQuery Comet plugin to connect to the Comet service and push out image data (I don’t know if using jQuery within an extension is a great idea, but I’m not aware of any standalone Cometd client libraries at present). I had to prevent jQuery from interpreting the Ajax requests made from browser chrome as cross-domain requests and switching to JSONP mode. JSONP is a non-starter inside an extension, as you can’t load external scripts from chrome—it would be a huge security hole.

Tracking Cometd connection state is also slightly tough. The Screenpush subscriber client actually does a better job of this than the extension itself, as the Dojo Comet library it uses supports client-side meta events that you can subscribe to for information about connection attempts. I had to hack my own rough interpretation of the client-side /cometd/meta/ channel into jQuery Comet. This feeds back information about connection attempts to the UI, where problems are reflected by a change to the Screenpush statusbar icon. The main outstanding problem is that since Cometd connections are long-lived, they don’t time out easily, so it can take a long time to learn that there’s a connection issue.

The other significant part of the extension code is event subscription wrangling. Screenpush listens on the DOMAttrModified, DOMNodeInserted and DOMNodeDeleted events, and the problem is that dynamic pages tend to generate great cascades of these events when updating the UI. This also happens with scroll, mousemove, and resize events. Obviously, it would be very inefficient to push out a screengrab every time one of these events fires, so my low-tech solution is the pushDelay timer I described in Extension Settings. To make things more efficient, I also cache the last-published data URL, and use it to see if the screen has actually changed since the last push (with Track Pointer switched on, the composited mouse pointer is itself part of the rendered image, so moving the mouse in this mode will correctly push out a new image with the pointer position updated).
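The dedupe check falls naturally out of the data: URL representation: two identical screens render to identical strings, so a straight comparison against the last published URL is enough to skip a redundant push. A sketch of that guard (function names are illustrative, not the extension’s actual code):

```javascript
// Skip a push when the rendered screen is byte-identical to the
// last one published. Identical pixels produce an identical
// data: URL, so plain string comparison suffices.
function makeDedupingPublisher(publishFn) {
  var lastDataUrl = null;
  return function maybePublish(dataUrl) {
    if (dataUrl === lastDataUrl) return false; // nothing changed
    lastDataUrl = dataUrl;
    publishFn(dataUrl);
    return true;
  };
}

var sent = [];
var publish = makeDedupingPublisher(function (u) { sent.push(u); });
publish("data:image/png;base64,AAAA"); // pushed
publish("data:image/png;base64,AAAA"); // suppressed: unchanged
publish("data:image/png;base64,BBBB"); // pushed
console.log(sent.length); // 2
```

With Track Pointer on, the composited cursor changes the pixels, so the comparison correctly lets pointer-only updates through.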

Try It Out

If you want to try out Screenpush (and report bugs to me), then first of all grab the Screenpush extension XPI. You can download Jetty7 from Codehaus—I used the latest pre-release. The Screenpush subscriber webpage and files are zipped up in screenpush-subscriber.zip. Or you can just browse to the subscriber page hosted on this site, but take note of the cross-domain Comet issue with certain browsers.

And let me know if you find any practical use for Screenpush—I’ll tidy it up and release it properly if anybody wants to use it.

Written by phil on December 10th, 2008


Comet & Java: Threaded Vs Nonblocking I/O

Posted on 2008-11-20 in comet, java

I’ve written a couple of times about Jetty’s Continuations mechanism, most recently in Comet and Reverse Ajax. Continuations are Jetty’s approach to scaling Java servlets to the peculiar profile of Comet traffic. In common with other Java servers’ Comet extensions, Continuations use Java’s asynchronous “NIO” API to circumvent the limitations of the thread-per-request servlet model. However, the advent of the efficient Native POSIX Thread Library (NPTL) on Linux has led to some controversy over the benefits of NIO versus old-fashioned threaded concurrency for serving general web traffic. I decided to run some tests to see whether threads hold the same advantage when handling Comet requests too.

For the impatient, here’s a pretty chart. You can skip to the end to see what it means:

Comet connection rates for NIO versus multi-threaded request handling

Comet and the Servlet Model

Java servlets are designed around a thread-per-request model—each incoming request is allocated its own servlet thread, allowing many concurrent requests to be processed simultaneously by the server. That’s ideal for regular web traffic, where the aim is to deal with multiple incoming requests as quickly as possible: servlet threads concurrently perform work to generate the response for each request, and the resulting high throughput minimizes the number of requests pending at any given time.

Comet traffic, however, is quite different. When a Comet client opens a request to the server, it isn’t retrieving a resource or invoking a service to perform a task. Instead, it’s simply waiting, more or less indefinitely, for some interesting event to occur. That event might be a chat message from another Comet user, or a “new mail” notification from a webmail application; typically, Comet events will occur infrequently, and result in only small amounts of data being sent to the client. A busy Comet server will therefore have a very large number of requests open at any given time—one for each and every client—but most of the time those requests will simply be idling.

As a result, thread-per-request is a poor fit for Comet traffic. Creating threads consumes resources, imposing a limit on the number of clients a Comet app can support. Moreover, the primary benefit of thread-per-request—the ability to do work for many requests simultaneously—is of little relevance to a Comet app where the majority of requests will be merely waiting.

The Role of NIO

Jetty6’s response to the problem of Comet traffic is its Continuations mechanism. Without getting into too much detail, Continuations are a proprietary extension to the servlet mechanism, which allow requests to enter a waiting state without consuming a servlet thread. Jetty does this using Java’s NIO API, the low-level non-blocking select-style I/O mechanism introduced in Java 1.4.

Jetty is not alone in providing a non-blocking mechanism tailored to Comet applications. Tomcat now has its CometProcessor interface for non-blocking request handling, Sun’s Grizzly project offers an NIO-based mechanism, and the soon-to-be-released Jetty7 introduces a second proprietary non-blocking Comet API to supersede Continuations.

This is all pretty confusing, but the upcoming Servlet 3.0 specification (JSR-315) includes “Async and Comet support”, which intends to unify these proprietary mechanisms under a standard Comet-centric non-blocking servlet mechanism. The final form this will take is yet to crystallize, but it seems likely that implementations will use an NIO approach.

There has been some interesting debate, however, about the performance of NIO versus regular multithreaded request handling in a highly concurrent environment, essentially challenging the relevance of NIO for serving web traffic.

NIO Performance vs. NPTL

Here’s a good post by Paul Tyma of Mailinator fame (Mailinator is made from Java and deals with high volumes of both mail and web traffic). Paul basically says that since the Native POSIX Thread Library (NPTL) arrived in Linux 2.6, multithreading is so cheap that it outperforms the select-based NIO alternative. He goes on to quote some impressive benchmarks from Rahul Bhargava, which show multithreading giving at least 25% greater throughput than NIO in a test with 1,700 concurrent connections.

Now, throughput isn’t everything to Comet applications, being of lesser importance than the ceiling on the connected client count. But it’s certainly worth taking a look at NPTL vs. NIO request handling in a Comet-like simulation. The difficult part is writing a Comet client that can efficiently generate thousands of simultaneous requests.

Fortunately enough, last.fm CTO Richard Jones recently wrote a series on scaling Comet applications written with Erlang’s MochiWeb framework. In RJ’s first article, he demonstrates a simple Erlang-based client-server Comet test that scales to 10,000 concurrent requests.

I’m going to re-use RJ’s test setup, but replace his MochiWeb server component with a Java servlet, and compare NIO and multi-threaded performance. Read the original article for background and environmental setup.

The Code

Here’s my test servlet, written with Jetty Continuations (I’m using Continuations due to familiarity, but the results should apply to other NIO-based Comet servlet mechanisms too):

public class CometServlet extends HttpServlet {
  private static int SLEEP_TIME = 10000;

  public void doGet(HttpServletRequest req,
                    HttpServletResponse res)
              throws ServletException, java.io.IOException {

    String reqId = req.getParameter("id");

    Continuation c = ContinuationSupport.
                       getContinuation(req,null);

    if (!c.isPending()) {
      res.setContentType("text/plain");
      res.setHeader("Transfer-Encoding", "chunked");
      res.getWriter().println("Welcome! Your Id: "+reqId);
      res.flushBuffer();

      TimedNotifier.getInstance().addListener(
                                  new EventListener(c));
      c.suspend(0);
    }

    while(true) {
      res.getWriter().println("Chunk "+c.getObject()+
                                        " for Id: "+reqId);
      res.flushBuffer();
      c.reset();
      c.suspend(0);
    }
  }

  private static class EventListener {
    Continuation c;
    int chunkIdx = 0;

    EventListener(Continuation c) {
      this.c = c;
    }

    void onEvent() {
      c.setObject(++chunkIdx);
      c.resume();
    }
  }

  private static class TimedNotifier implements Runnable {
    private static TimedNotifier instance = new TimedNotifier();

    private Collection<EventListener> listeners =
                     new ConcurrentLinkedQueue<EventListener>();

    private static int numChunks=0;

    private TimedNotifier() {
      new Thread(this).start();
    }

    static TimedNotifier getInstance() {
      return instance;
    }

    void addListener(EventListener listener) {
      listeners.add(listener);
    }

    public void run() {
      while(true) {
         try {
           Thread.sleep(SLEEP_TIME);
         } catch (InterruptedException e) {}

         int numListeners = 0;
         for (EventListener listener : listeners) {
           numListeners++;
           listener.onEvent();
         }

        numChunks += numListeners;
        System.out.println("Listeners: "+numListeners+
                                  ", chunks: "+numChunks);
      }
    }
  }
}

The code is pretty straightforward. There’s a TimedNotifier class that runs on its own thread, sending an event to any attached EventListeners every ten seconds. The servlet proper just creates an EventListener for its incoming request, waits for the listener to receive an event, pushes out a few response bytes (a “chunk”), and then waits again. Requests are held open indefinitely.

So what’s happening with the Continuation? When a request’s Continuation object is first obtained, its isPending() method returns false, so the HTTP headers and initial part of the response are sent straight away. The Continuation is then suspended until the request’s EventListener resumes it again. The suspend() and resume() calls themselves are pretty much analogous to the normal Java Object wait()/notify() mechanism—conceptually these methods just pause code execution on a thread, and allow it to be resumed from another thread.

The trick of Continuations is that the servlet thread is freed by the suspend() call, so it can go off and service other requests. When resume() is invoked, Jetty gets a free thread from its threadpool, and uses it to call the servlet’s doGet() method again. This time, however, the isPending state of the Continuation is true, so by making any code prior to the suspend() call conditional on isPending(), execution will effectively pick up where it left off.

The Continuation interface also provides a setObject()/getObject() pair to allow arbitrary context to be passed around easily; my code uses this mechanism to pass the request’s chunk count back to the servlet.

One other notable feature of Continuations is that the implementation falls back to a wrapper over Java’s wait/notify mechanism if Jetty’s NIO features are disabled at runtime. This means I can test NIO and multi-threading using this same piece of code, just by altering my Jetty config.

Here’s the output of the servlet if you connect to it with lynx:

$ lynx -source http://buttercup:8081/comet?id=1
Welcome! Your Id: 1
Chunk 1 for Id: 1
Chunk 2 for Id: 1
Chunk 3 for Id: 1
Chunk 4 for Id: 1
Chunk 5 for Id: 1

I’m going to use RJ’s Erlang floodtest client to open up thousands of simultaneous requests to my Comet servlet, and see how both NIO and threading perform. The only major change I made to Richard’s floodtest setup was to have the Erlang client create only two connections per second, rather than ten, so the system could be observed over a longer period.

Test Results

I used Java6 on Linux 2.6.20 for these tests—allowing Java to employ the epoll mechanism for NIO—on a fairly puny and ancient Linux box, with 1GB of RAM and a single-core Pentium M. Hardly server-class stuff, so it was possible to stress the system with a relatively small number of requests. VM options were a maximum heap size of 256MB and a thread stack size of 64KB (-Xms64m -Xmx256m -Xss64k). Each test’s results were averaged over three runs.

With NIO-backed Continuations, Jetty was able to handle approximately 10,820 connected subscribers before dying with a "java.lang.OutOfMemoryError: Direct buffer memory". This error isn’t caused by the JVM running out of heap space; it occurs when there’s no memory left for the JVM to allocate to NIO buffers (I believe you can override the default ceiling with -XX:MaxDirectMemorySize, but I wanted to use the same basic options for both tests). The Java process had a resident size of about 490MB by this time, though its CPU usage was mostly zero, spiking briefly up to about 20% every ten seconds as new chunks were pushed out. However, overall load average was high, above one. It’s hard to say how much of this was down to Java, or to the Erlang client, since both were running on the same machine.

The multithreaded test fell over at around 6,860 connected subscribers, complaining of a "java.lang.OutOfMemoryError: Java heap space". The entire heap had been exhausted by thread creation. Raising the Java heap size would have helped here but, as mentioned, I wanted a level playing field for the two tests. The JVM’s resident size was 505MB just before failure, but unlike the NIO test, top reported java’s CPU usage to be pegged at 95%. That was possibly due to the garbage collector working flat out to free up some heap. Again, load average was above one by this point.

Connection Rates

Now, let’s take another look at that chart:

Comet connection rates for NIO versus multi-threaded request handling

The data here was derived from the servlet’s console output: sampling every tenth line of output gave me the cumulative total of connected subscribers at each 100 second interval. I calculated the deltas between each interval, and plotted them against time elapsed. The expected outcome would be a straight line at 200 connections per interval, since the Erlang client establishes a new connection every half second.
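Deriving the per-interval rates is a one-liner over the sampled cumulative totals. A sketch of the calculation (the sample numbers below are made up for illustration, not the actual test data):

```javascript
// Convert sampled cumulative connection counts into per-interval
// deltas, i.e. new connections accepted during each 100s window.
function connectionDeltas(cumulative) {
  var deltas = [];
  for (var i = 1; i < cumulative.length; i++) {
    deltas.push(cumulative[i] - cumulative[i - 1]);
  }
  return deltas;
}

// Hypothetical samples: an ideal run adds 200 per interval.
console.log(connectionDeltas([0, 200, 400, 595, 780]));
// [ 200, 200, 195, 185 ]
```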

Both the NIO and threaded tests start off at pretty much the expected rate (give or take a connection here and there; the graph ought to have error bars). Both rates begin to decline around 2,500 connections, with NIO falling noticeably more steeply. In comparison, the threaded line stays close to 200 for most of its run.

This would seem to confirm that overall throughput is lower with NIO vs. NPTL, though I can only guess at the cause. In the NIO/Continuations case, Jetty still has a threadpool that it uses to service the requests. I picked an arbitrary threadpool size of 100 for this test, so at 10,000 client connections each thread would be cycling through 100 requests in sequence to send out message chunks. I’d hazard that this tied up Jetty briefly every few seconds, and prevented new connections from being accepted at the rate they were received.

In their death throes, both tests went a little haywire, probably due to frantic garbage collection and/or swapping. In the NIO case, it’s also possible that the Erlang client was struggling to attempt new connections due to overall system load. I don’t think much can be concluded from those wild spikes, anyway.

In conclusion, then, NPTL does appear to have a performance edge over NIO when it comes to serving large numbers of concurrent Comet clients, but the trade-off is a lower ceiling on the number of clients you can hold connected.

Of course, this is a contrived test, with no network involved, and no actual work happening on the server. In the real world, as always, you should make your own measurements and see which works best for your own Comet app.

Written by phil on November 20th, 2008
