Comet & Java: Threaded Vs Nonblocking I/O

Posted on 2008-11-20 in comet, java

I’ve written a couple of times about Jetty’s Continuations mechanism, most recently in Comet and Reverse Ajax. Continuations are Jetty’s approach to scaling Java servlets to the peculiar traffic profile of Comet. In common with other Java servers’ Comet extensions, Continuations use Java’s non-blocking “NIO” API to circumvent the limitations of the thread-per-request servlet model. However, the advent of the efficient Native POSIX Thread Library (NPTL) on Linux has led to some controversy over the benefits of NIO versus old-fashioned threaded concurrency for serving general web traffic. I decided to run some tests to see how the two approaches compare at handling Comet requests.

For the impatient, here’s a pretty chart. You can skip to the end to see what it means:

Comet connection rates for NIO versus multi-threaded request handling

Comet and the Servlet Model

Java servlets are designed around a thread-per-request model—each incoming request is allocated its own servlet thread, allowing many concurrent requests to be processed simultaneously by the server. That’s ideal for regular web traffic, where the aim is to deal with multiple incoming requests as quickly as possible: servlet threads concurrently perform work to generate the response for each request, and the resulting high throughput minimizes the number of requests pending at any given time.

Comet traffic, however, is quite different. When a Comet client opens a request to the server, it isn’t retrieving a resource or invoking a service to perform a task. Instead, it’s simply waiting, more or less indefinitely, for some interesting event to occur. That event might be a chat message from another Comet user, or a “new mail” notification from a webmail application; typically, Comet events will occur infrequently, and result in only a small amount of data being sent to the client. A busy Comet server will therefore have a very large number of requests open at any given time—one for each and every client—but most of the time those requests will simply be idling.

As a result, thread-per-request is a poor fit for Comet traffic. Every thread created consumes resources, imposing a limit on the number of clients a Comet app can support. Moreover, the primary benefit of thread-per-request—the ability to do work for many requests simultaneously—is of little relevance to a Comet app, where the majority of requests are merely waiting.
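
To make that concrete, here’s a minimal sketch (hypothetical code, not part of the test below) of a naive thread-per-request Comet servlet. The handler parks on a monitor until an event arrives, pinning a servlet thread for the entire life of the connection:

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class BlockingCometServlet extends HttpServlet {
  // Monitor that some event source elsewhere calls notifyAll() on.
  private static final Object EVENTS = new Object();

  public void doGet(HttpServletRequest req, HttpServletResponse res)
              throws ServletException, IOException {
    res.setContentType("text/plain");
    while (true) {
      synchronized (EVENTS) {
        try {
          // The servlet thread idles here, unavailable to any other
          // request, until an event wakes it up.
          EVENTS.wait();
        } catch (InterruptedException e) {
          return;
        }
      }
      res.getWriter().println("event");
      res.flushBuffer();
    }
  }
}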

The Role of NIO

Jetty6’s response to the problem of Comet traffic is its Continuations mechanism. Without getting into too much detail, Continuations are a proprietary extension to the servlet mechanism, which allow requests to enter a waiting state without consuming a servlet thread. Jetty does this using Java’s NIO API, the low-level non-blocking select-style I/O mechanism introduced in Java 1.4.
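
For the unfamiliar, the select-style pattern at the heart of NIO looks roughly like this: one thread multiplexes many idle sockets, and wakes up only when one of them becomes ready. Here’s a bare-bones sketch of that mechanism (illustrative only, not Jetty’s actual internals):

import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class SelectLoop {
  public static void main(String[] args) throws Exception {
    Selector selector = Selector.open();
    ServerSocketChannel server = ServerSocketChannel.open();
    server.socket().bind(new InetSocketAddress(8081));
    server.configureBlocking(false);
    server.register(selector, SelectionKey.OP_ACCEPT);

    while (true) {
      selector.select(); // blocks until at least one channel is ready
      Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
      while (keys.hasNext()) {
        SelectionKey key = keys.next();
        keys.remove();
        if (key.isAcceptable()) {
          // New connection: register it and go back to sleep.
          SocketChannel client = server.accept();
          client.configureBlocking(false);
          client.register(selector, SelectionKey.OP_READ);
        } else if (key.isReadable()) {
          // An idle connection has become readable: hand it off
          // to request-handling code here.
        }
      }
    }
  }
}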

Jetty is not alone in providing a non-blocking mechanism tailored to Comet applications. Tomcat now has its own CometProcessor interface to provide non-blocking request handling, Sun’s Grizzly project has its own NIO-based mechanism, and the soon-to-be-released Jetty7 introduces a second proprietary non-blocking Comet API to supersede Continuations.

This is all pretty confusing, but the upcoming Servlet 3.0 specification (JSR-315) includes “Async and Comet support”, which intends to unify these proprietary mechanisms under a standard Comet-centric non-blocking servlet mechanism. The final form this will take is yet to crystallize, but it seems likely that implementations will use an NIO approach.
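
For a flavour of where this might end up, here’s a hypothetical sketch of a standardised async servlet. The names (startAsync, AsyncContext) are my guesses at the direction the JSR-315 drafts are heading, so take them with a large pinch of salt:

import java.io.IOException;

import javax.servlet.AsyncContext;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class AsyncCometServlet extends HttpServlet {
  public void doGet(HttpServletRequest req, HttpServletResponse res)
              throws ServletException, IOException {
    // Detach the request from the servlet thread; the response
    // stays open after doGet() returns.
    final AsyncContext ctx = req.startAsync(req, res);
    ctx.setTimeout(0); // wait indefinitely for an event

    // Later, an event-dispatch thread would push data and complete:
    //   ctx.getResponse().getWriter().println("event");
    //   ctx.complete();
  }
}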

There has been some interesting debate, however, about the performance of NIO versus regular multithreaded request handling in a highly concurrent environment, essentially challenging the relevance of NIO for serving web traffic.

NIO Performance vs. NPTL

Here’s a good post by Paul Tyma of Mailinator fame (Mailinator is written in Java and handles high volumes of both mail and web traffic). Paul basically says that since NPTL arrived with Linux 2.6, multithreading has become so cheap that it outperforms the select-based NIO alternative. He goes on to quote some impressive benchmarks from Rahul Bhargava, which show multithreading delivering at least 25% greater throughput than NIO in a test with 1,700 concurrent connections.

Now, throughput isn’t everything for a Comet application; it matters less than the ceiling on the number of connected clients. But it’s certainly worth taking a look at NPTL versus NIO request handling in a Comet-like simulation. The difficult part is writing a Comet client that can efficiently generate thousands of simultaneous requests.

Fortunately, last.fm CTO Richard Jones recently wrote a series of articles on scaling Comet applications built with Erlang’s MochiWeb framework. In RJ’s first article, he demonstrates a simple Erlang-based client-server Comet test that scales to 10,000 concurrent requests.

I’m going to re-use RJ’s test setup, but replace his MochiWeb server component with a Java servlet, and compare NIO and multi-threaded performance. Read the original article for background and environmental setup.

The Code

Here’s my test servlet, written with Jetty Continuations (I’m using Continuations due to familiarity, but the results should apply to other NIO-based Comet servlet mechanisms too):

import java.io.IOException;
import java.util.Collection;
import java.util.concurrent.ConcurrentLinkedQueue;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.mortbay.util.ajax.Continuation;
import org.mortbay.util.ajax.ContinuationSupport;

public class CometServlet extends HttpServlet {
  private static final int SLEEP_TIME = 10000;

  public void doGet(HttpServletRequest req,
                    HttpServletResponse res)
              throws ServletException, IOException {

    String reqId = req.getParameter("id");

    Continuation c = ContinuationSupport.
                       getContinuation(req, null);

    // First invocation: send the headers and greeting,
    // then suspend until the notifier fires.
    if (!c.isPending()) {
      res.setContentType("text/plain");
      res.setHeader("Transfer-Encoding", "chunked");
      res.getWriter().println("Welcome! Your Id: "+reqId);
      res.flushBuffer();

      TimedNotifier.getInstance().addListener(
                                  new EventListener(c));
      c.suspend(0);
    }

    // Re-entered here on each resume(): push out a chunk,
    // then suspend again. The request is held open forever.
    while (true) {
      res.getWriter().println("Chunk "+c.getObject()+
                                        " for Id: "+reqId);
      res.flushBuffer();
      c.reset();
      c.suspend(0);
    }
  }

  private static class EventListener {
    Continuation c;
    int chunkIdx = 0;

    EventListener(Continuation c) {
      this.c = c;
    }

    void onEvent() {
      // Pass the chunk count to the servlet and wake it up.
      c.setObject(++chunkIdx);
      c.resume();
    }
  }

  private static class TimedNotifier implements Runnable {
    private static TimedNotifier instance = new TimedNotifier();

    private Collection<EventListener> listeners =
                     new ConcurrentLinkedQueue<EventListener>();

    private static int numChunks = 0;

    private TimedNotifier() {
      new Thread(this).start();
    }

    static TimedNotifier getInstance() {
      return instance;
    }

    void addListener(EventListener listener) {
      listeners.add(listener);
    }

    public void run() {
      // Fire an event at every listener once per SLEEP_TIME.
      while (true) {
        try {
          Thread.sleep(SLEEP_TIME);
        } catch (InterruptedException e) {
          // ignore and keep ticking
        }

        int numListeners = 0;
        for (EventListener listener : listeners) {
          numListeners++;
          listener.onEvent();
        }

        numChunks += numListeners;
        System.out.println("Listeners: "+numListeners+
                                  ", chunks: "+numChunks);
      }
    }
  }
}

The code is pretty straightforward. There’s a TimedNotifier class that runs on its own thread, sending an event to any attached EventListeners every ten seconds. The servlet proper just creates an EventListener for its incoming request, waits for the listener to receive an event, pushes out a few response bytes (a “chunk”), and then waits again. Requests are held open indefinitely.

So what’s happening with the Continuation? When a request’s Continuation object is first obtained, its isPending() method returns false, so the HTTP headers and initial part of the response are sent straight away. The Continuation is then suspended until the request’s EventListener resumes it again. The suspend() and resume() calls themselves are pretty much analogous to the normal Java Object wait()/notify() mechanism—conceptually these methods just pause code execution on a thread, and allow it to be resumed from another thread.

The trick of Continuations is that the servlet thread is freed by the suspend() call, so it can go off and service other requests. When resume() is invoked, Jetty gets a free thread from its threadpool, and uses it to call the servlet’s doGet() method again. This time, however, the isPending state of the Continuation is true, so by making any code prior to the suspend() call conditional on isPending(), execution will effectively pick up where it left off.

The Continuation interface also provides a setObject()/getObject() pair to allow arbitrary context to be passed around easily; my code uses this mechanism to pass the request’s chunk count back to the servlet.

One other notable feature of Continuations is that the implementation falls back to a wrapper over Java’s wait/notify mechanism if Jetty’s NIO features are disabled at runtime. This means I can test NIO and multi-threading using this same piece of code, just by altering my Jetty config.
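
For the curious, here’s roughly how that switch looks with an embedded Jetty 6 server, assuming the org.mortbay embedded API (in a standalone install, the equivalent change goes in the connector section of jetty.xml). The servlet stays the same; only the connector changes:

import org.mortbay.jetty.Connector;
import org.mortbay.jetty.Server;
import org.mortbay.jetty.bio.SocketConnector;
import org.mortbay.jetty.nio.SelectChannelConnector;
import org.mortbay.jetty.servlet.Context;
import org.mortbay.jetty.servlet.ServletHolder;

public class CometServer {
  public static void main(String[] args) throws Exception {
    Server server = new Server();

    // NIO connector: suspended Continuations hold no thread.
    Connector connector = new SelectChannelConnector();
    // Blocking connector: Continuations fall back to wait/notify,
    // holding a servlet thread per suspended request.
    // Connector connector = new SocketConnector();

    connector.setPort(8081);
    server.setConnectors(new Connector[] { connector });

    Context context = new Context(server, "/");
    context.addServlet(new ServletHolder(new CometServlet()), "/comet");

    server.start();
  }
}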

Here’s the output of the servlet if you connect to it with lynx:

$ lynx -source http://buttercup:8081/comet?id=1
Welcome! Your Id: 1
Chunk 1 for Id: 1
Chunk 2 for Id: 1
Chunk 3 for Id: 1
Chunk 4 for Id: 1
Chunk 5 for Id: 1

I’m going to use RJ’s Erlang floodtest client to open up thousands of simultaneous requests to my Comet servlet, and see how both NIO and threading perform. The only major change I made to Richard’s floodtest setup was to have the Erlang client create only two connections per second, rather than ten, so the system could be observed over a longer period.
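
(As an aside, if you’d rather not set up Erlang, a crude Java equivalent of the flood client might look like the sketch below. It’s hypothetical and deliberately naive: its thread-per-connection design means the client itself runs into the very scaling wall this test probes on the server, which is exactly why RJ reached for Erlang.)

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.Socket;

public class FloodClient {
  public static void main(String[] args) throws Exception {
    for (int id = 1; ; id++) {
      final int reqId = id;
      new Thread(new Runnable() {
        public void run() {
          try {
            // Open a Comet request and leave it open, discarding
            // whatever chunks arrive.
            Socket s = new Socket("buttercup", 8081);
            s.getOutputStream().write(
                ("GET /comet?id=" + reqId + " HTTP/1.1\r\n"
                 + "Host: buttercup\r\n\r\n").getBytes("US-ASCII"));
            BufferedReader in = new BufferedReader(
                new InputStreamReader(s.getInputStream()));
            while (in.readLine() != null) {
              // keep draining; the point is to hold the connection open
            }
          } catch (Exception e) {
            System.err.println("Connection " + reqId + " failed: " + e);
          }
        }
      }).start();
      Thread.sleep(500); // two new connections per second, as in the test
    }
  }
}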

Test Results

I used Java6 on Linux 2.6.20 for these tests—allowing Java to employ the epoll mechanism for NIO—on a fairly puny and ancient Linux box, with 1GB of RAM and a single-core Pentium M. Hardly server-class stuff, so it was possible to stress the system with a relatively small number of requests. VM options were a 64MB initial heap, a 256MB maximum heap, and a 64KB thread stack (-Xms64m -Xmx256m -Xss64k). Each test’s results were averaged over three runs.

With NIO-backed Continuations, Jetty was able to handle approximately 10,820 connected subscribers before dying with a "java.lang.OutOfMemoryError: Direct buffer memory". This error isn’t caused by the JVM running out of heap space; it occurs when there’s no memory left for the JVM to allocate to NIO buffers (I believe you can raise the default ceiling with -XX:MaxDirectMemorySize, but I wanted to use the same basic options for both tests). The Java process had a resident size of about 490MB by this time, though its CPU usage was mostly zero, spiking briefly up to about 20% every ten seconds as new chunks were pushed out. However, the overall load average was high, above one. It’s hard to say how much of this was down to Java and how much to the Erlang client, since both were running on the same machine.

The multithreaded test fell over at around 6,860 connected subscribers, complaining of a "java.lang.OutOfMemoryError: Java heap space". The entire heap had been exhausted by thread creation. Raising the Java heap size would have helped here but, as mentioned, I wanted a level playing field for the two tests. The JVM’s resident size was 505MB just before failure, but unlike in the NIO test, top reported the java process’s CPU usage as pegged at 95%. That was possibly due to the garbage collector working flat out to free up some heap. Again, the load average was above one by this point.

Connection Rates

Now, let’s take another look at that chart:

Comet connection rates for NIO versus multi-threaded request handling

The data here was derived from the servlet’s console output: sampling every tenth line of output gave me the cumulative total of connected subscribers at each 100 second interval. I calculated the deltas between each interval, and plotted them against time elapsed. The expected outcome would be a straight line at 200 connections per interval, since the Erlang client establishes a new connection every half second.
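
Deriving those numbers was a one-off scripting job; for reference, here’s a throwaway sketch that crunches a captured console log. The "Listeners: N, chunks: M" format comes from TimedNotifier above, and since the notifier prints one line every ten seconds, every tenth line marks a 100 second interval:

import java.io.BufferedReader;
import java.io.FileReader;

public class ConnectionRates {
  public static void main(String[] args) throws Exception {
    BufferedReader log = new BufferedReader(new FileReader(args[0]));
    int samples = 0, previous = 0;
    for (String line; (line = log.readLine()) != null; ) {
      // Take every tenth "Listeners: N, chunks: M" line.
      if (!line.startsWith("Listeners: ") || ++samples % 10 != 0) continue;
      int listeners = Integer.parseInt(
          line.substring("Listeners: ".length(), line.indexOf(',')));
      // Print elapsed time and new connections in this interval.
      System.out.println((samples / 10) * 100 + "s\t" + (listeners - previous));
      previous = listeners;
    }
  }
}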

Both the NIO and threaded tests start off at pretty much the expected rate (give or take a connection here or there; the graph really ought to have error bars). Both rates begin to decline at around 2,500 connections, the NIO rate noticeably more steeply. In comparison, the threaded line stays close to 200 for most of its run.

This would seem to confirm that overall throughput is lower with NIO than with NPTL threading, though I can only guess at the cause. In the NIO/Continuations case, Jetty still has a threadpool that it uses to service requests. I picked an arbitrary threadpool size of 100 for this test, so at 10,000 client connections each thread would cycle through 100 requests in sequence to send out message chunks. I’d hazard that this tied up Jetty briefly every few seconds, and prevented new connections from being accepted at the rate they were attempted.

In their death throes, both tests went a little haywire, probably due to frantic garbage collection and/or swapping. In the NIO case, it’s also possible that the Erlang client was struggling to make new connections due to overall system load. I don’t think much can be concluded from those wild spikes, anyway.

In conclusion, then, NPTL threading does appear to have a throughput edge over NIO when serving large numbers of concurrent Comet clients, but the trade-off is a lower ceiling on the number of clients that can be supported.

Of course, this is a contrived test, with no network involved, and no actual work happening on the server. In the real world, as always, you should make your own measurements and see which works best for your own Comet app.

Written by phil on November 20th, 2008

Comments

  1. What if you try to reduce the stack memory of the threads?

    From the chart, I can see that plain IO outperforms NIO right up until memory is exhausted.

    Serving 6,860 connected subscribers with just 256MB, and with most of each 64KB stack probably unused, is AMAZING!

    jmarranz

    17 Jan 11 at 2:57 pm

  2. And you’re forgetting this serious problem in a real-world Comet application:

    http://bit.ly/hr7WVd

    jmarranz

    17 Jan 11 at 2:58 pm

  3. Hi,

    I’m using jetty-7.4.1.v20110513 and servlet-api-2.5.

    I tried to use continuations in the following servlet.

    import java.io.*;
    import java.util.*;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    import org.eclipse.jetty.continuation.*;

    public class cs extends HttpServlet {

      public void service(HttpServletRequest req, HttpServletResponse res)
          throws java.io.IOException {

        String reqId = req.getParameter("id");

        Continuation cc = ContinuationSupport.getContinuation(req);

        res.setContentType("text/plain");
        res.getWriter().println("Request: "+reqId+"\tstart:\t"+new Date());
        res.getWriter().flush();

        cc.setTimeout(2000);
        cc.suspend();

        res.getWriter().println("Request: "+reqId+"\tend:\t"+new Date());
        cc.complete();
      }
    }

    I’m expecting a 2 sec delay, but when I run the servlet from the browser I get the following output, without any delay:

    Request: null start: Sat May 21 15:25:02 IST 2011
    Request: null end: Sat May 21 15:25:02 IST 2011

    Please help.
    Thanks,
    ravi.

    ravi

    21 May 11 at 1:20 pm
