New Bamboo Web Development

Bamboo blog. Our thoughts on web technology.

Brain Dump of Real Time Web (RTW) and WebSocket

by Makoto Inoue

Since I started blogging about WebSocket, people have been kind enough to send me more information through comments, tweets, and emails. I am happy to receive all this cool information, and I thought I would dump some of it here. Below is my view of the world of the Real Time Web and WebSockets.

Aghh! They look all tangled up like spaghetti code! I've tried to untangle some of this and break it down into byte sized chunks. I've come up with some common themes and lots of links. Hopefully this post will help you understand what kinds of different technologies are out there.

  • First, the definition of RTW
  • The many ways to achieve RTW
  • The many implementations of WebSockets
  • Are WebSockets the silver bullet for RTW?
  • Summary

First, the definition of RTW

From Wikipedia,

The real-time web is a set of technologies and practices which enable users to receive information as soon as it is published by its authors, rather than requiring that they or their software check a source periodically for updates. It is fundamentally different from real-time computing since there is no knowing when, or if, a response will be received. The information types transmitted this way are often short messages, status updates, news alerts or links to longer documents. The content is often “soft” in that it is based on the social web - people’s opinions, attitudes, thoughts and interests - as opposed to hard news or facts.

So, according to Wikipedia, my “Real Time Activity monitor” is closer to “Real-Time Computing”, because it updates periodically and its content is not “soft”. However, in this article I would like to cover both the “non-periodic update” of the Real Time Web (RTW from here on) and the “periodic, very frequent update” of Real Time Computing, as they share similar problems. The keyword is “Push, Not Pull”. The current web paradigm copes well with pulling infrequent updates, say daily or hourly. However, when updates arrive at very short or unknown intervals, clients end up hitting servers very heavily.
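To put rough numbers on “Push, Not Pull”, here is a minimal back-of-the-envelope sketch in Python. The function names and the figures (1,000 clients, one hour, 5 real updates) are made up for illustration; they are not measurements from the activity monitor.

```python
def polling_requests(poll_interval_s, duration_s, clients):
    """Requests the server receives when every client polls on a fixed interval."""
    return int(duration_s // poll_interval_s) * clients


def push_messages(updates, clients):
    """Messages the server sends when it pushes each update once per client."""
    return updates * clients


# Hypothetical scenario: 1,000 clients, one hour, 5 real updates in that hour.
HOUR = 3600
print(polling_requests(1, HOUR, 1000))   # every client polls once a second
print(polling_requests(60, HOUR, 1000))  # every client polls once a minute
print(push_messages(5, 1000))            # server pushes only when something changed
```

Polling more often lowers latency but multiplies requests that return nothing new, while push traffic grows only with the number of real updates.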

The many ways to achieve RTW

Comet

Comet is an umbrella term, which covers a number of techniques to achieve push-technology on the web.

In web development, Comet is a neologism to describe a web application model in which a long-held HTTP request allows a web server to push data to a browser, without the browser explicitly requesting it. Comet is an umbrella term for multiple techniques for achieving this interaction. All these methods rely on features included by default in browsers, such as JavaScript, rather than on non-default plugins.

The most common approach to Comet, XMLHttpRequest long polling, is fine when the push interval is unknown, but begins to fall down when you have a known, frequent push interval.
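As a rough illustration of the long polling idea, the sketch below simulates it in plain Python with a queue instead of a real HTTP server. `LongPollChannel`, `poll`, and `client_loop` are hypothetical names, and the queue hands each event to a single client only, so treat it as a toy model rather than how any Comet framework actually works.

```python
import queue
import threading


class LongPollChannel:
    """Toy stand-in for a server endpoint that holds requests open."""

    def __init__(self):
        self._events = queue.Queue()

    def publish(self, event):
        """Server side: make an event available to the waiting client."""
        self._events.put(event)

    def poll(self, timeout_s):
        """Block like a long-held request; return None if nothing arrives in time."""
        try:
            return self._events.get(timeout=timeout_s)
        except queue.Empty:
            return None


def client_loop(channel, rounds, timeout_s):
    """Issue `rounds` long-poll requests back to back and collect the events."""
    received = []
    for _ in range(rounds):
        event = channel.poll(timeout_s)  # one request per event (or per timeout)
        if event is not None:
            received.append(event)
    return received


if __name__ == "__main__":
    channel = LongPollChannel()
    # The "server" publishes one update 50 ms after the client starts waiting.
    threading.Timer(0.05, channel.publish, args=("update 1",)).start()
    print(client_loop(channel, rounds=2, timeout_s=0.2))
```

Every delivered event costs one complete request, which is cheap when events are rare but adds up quickly when the server has something to push many times a second.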

In my previous article I demonstrated an activity monitor using web sockets (Code). As a followup to that, Rob Righter cloned that app but used long polling instead. I suggest reading both the websockets code and the long polling code to understand how the techniques differ.

Because Comet has no standard, several applications and frameworks offer a Comet solution so that you can concentrate on writing your business logic rather than implementing Comet from scratch. Jetty is a Java-based HTTP server, and its Jetty Continuations feature was one of the early Comet solutions. APE (Ajax Push Engine) also provides a complete Comet solution which works with various JavaScript libraries.

Interestingly enough, both Jetty and APE now offer WebSocket support, too.

XMPP

XMPP stands for Extensible Messaging and Presence Protocol and, unlike Comet, it is a standards-based protocol.

The main feature is “Presence”, which takes care of authentication and shows whether participants are online or not. You have already seen examples of XMPP in use on the web via GTalk and Google Wave.

The XMPP Standards Foundation has also defined “Bidirectional-streams Over Synchronous HTTP” (BOSH), which can be used to transport XMPP over HTTP. There is a library called strophe that lets you write XMPP clients in JavaScript, or in C for use in a wide variety of languages (thanks to Luis Cipriani for all the info).

There is a screencast which explains the combination of the three (XMPP, BOSH, and strophe).

According to the screencast, BOSH is a fancy name for long polling, so you could use WebSocket instead of BOSH. In fact, Kaazing Gateway already supports it.

I haven’t explored XMPP much yet, but Luis is visiting New Bamboo soon, so hopefully we can write another blog post about this. For now, here is my list of questions about XMPP and its use on the web. If you already know the answers, please feel free to comment.

FlashSocket

Flash has supported socket programming for a long time. web-socket-js is actually a browser-side WebSocket implementation built on top of FlashSocket.

WebSockets

Yes, I have discussed WebSockets in the previous blog entries and I will discuss them more in the following sections.

The many implementations of WebSockets

I think the node.js evented approach goes very well with WebSockets, but does it mean that we should abandon our favourite server side language/framework? Not quite. Just as Jetty and APE already support WebSocket, most concurrency-oriented languages, libraries, and frameworks already have WebSocket support.

Here is how you write an “Echo” example in a number of different languages/libraries/frameworks.

Erlang

    %% Note: this example reverses each string before sending it back.
    %% start/2 is assumed to be defined elsewhere in the module (not shown here).
    start() ->
        F = fun interact/2,
        spawn(fun() -> start(F, 0) end).

    interact(Browser, State) ->
        receive
            {browser, Browser, Str} ->
                Str1 = lists:reverse(Str),
                Browser ! {send, "out ! " ++ Str1},
                interact(Browser, State)
        after 100 ->
            %% No message for 100 ms: send a clock tick instead.
            Browser ! {send, "clock ! tick " ++ integer_to_list(State)},
            interact(Browser, State+1)
        end.

Google Go

    // A trivial example server is:
    package main

    import (
        "io"
        "net/http"

        "golang.org/x/net/websocket"
    )

    // EchoServer echoes back whatever it reads from the websocket.
    func EchoServer(ws *websocket.Conn) {
        io.Copy(ws, ws)
    }

    func main() {
        http.Handle("/echo", websocket.Handler(EchoServer))
        err := http.ListenAndServe(":12345", nil)
        if err != nil {
            panic("ListenAndServe: " + err.Error())
        }
    }

Python pywebsocket

    from mod_pywebsocket import msgutil

    _GOODBYE_MESSAGE = 'Goodbye'

    def web_socket_do_extra_handshake(request):
        pass  # Always accept.

    def web_socket_transfer_data(request):
        # Echo every message back until the client says goodbye.
        while True:
            line = msgutil.receive_message(request)
            msgutil.send_message(request, line)
            if line == _GOODBYE_MESSAGE:
                return

Python Tornado

    import tornado.websocket

    class EchoWebSocket(tornado.websocket.WebSocketHandler):
        def open(self):
            # Register on_message as the callback for the incoming message.
            self.receive_message(self.on_message)

        def on_message(self, message):
            self.write_message(u"You said: " + message)

Ruby EventMachine

    require 'em-websocket'

    EventMachine::WebSocket.start(:host => "0.0.0.0", :port => 8080) do |ws|
      ws.onopen    { ws.send "Hello Client!" }
      ws.onmessage { |msg| ws.send "Pong: #{msg}" }
      ws.onclose   { puts "WebSocket closed" }
    end

Ruby Sunshowers

    # A simple echo server example
    require "sunshowers"
    use Rack::ContentLength
    use Rack::ContentType
    run lambda { |env|
      req = Sunshowers::Request.new(env)
      if req.ws?
        req.ws_handshake!
        ws_io = req.ws_io
        ws_io.each do |record|
          ws_io.write_utf8(record)
          break if record == "Goodbye"
        end
        req.ws_quit! # Rainbows! should handle this quietly
      end
      [404, {}, []]
    }

Node.js (I know of at least 3 different implementations. The code example is for node.websocket.js)

    var Module = this.Module = function(){
    };

    // Once the client sends 'start', send the data back every second.
    Module.prototype.onData = function(data, connection){
      if (data == 'start'){
        this.interval = setInterval(function(){
          connection.send(JSON.stringify(data));
        }, 1000);
      }
    };

    Module.prototype.onDisconnect = function(connection){
      clearInterval(this.interval);
    };

Java Jetty

    import java.io.IOException;
    import java.util.Set;
    import java.util.concurrent.CopyOnWriteArraySet;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    import org.eclipse.jetty.util.log.Log;
    import org.eclipse.jetty.websocket.WebSocket;
    import org.eclipse.jetty.websocket.WebSocketServlet;

    public class WebSocketChatServlet extends WebSocketServlet
    {
        // All currently connected chat members.
        private final Set<ChatWebSocket> _members = new CopyOnWriteArraySet<ChatWebSocket>();

        protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException
        {
            getServletContext().getNamedDispatcher("default").forward(request,response);
        }

        protected WebSocket doWebSocketConnect(HttpServletRequest request, String protocol)
        {
            return new ChatWebSocket();
        }

        class ChatWebSocket implements WebSocket
        {
            Outbound _outbound;

            public void onConnect(Outbound outbound)
            {
                _outbound=outbound;
                _members.add(this);
            }

            public void onMessage(byte frame, byte[] data,int offset, int length)
            {}

            // Broadcast each text message to every connected member.
            public void onMessage(byte frame, String data)
            {
                for (ChatWebSocket member : _members)
                {
                    try
                    {
                        member._outbound.sendMessage(frame,data);
                    }
                    catch(IOException e) {Log.warn(e);}
                }
            }

            public void onDisconnect()
            {
                _members.remove(this);
            }
        }
    }

PHP

  • There are phpwebsocket and phpwebsockets, but neither has an echo example. If you understand PHP, could you read the source code and provide an echo sample?

It’s interesting to see so many ways to do the same thing. Which one did you like? Is your favourite language listed here?

Are WebSockets the silver bullet for RTW?

I have advocated a lot about WebSockets, but let’s do a few reality checks before everybody jumps onto the WebSockets bandwagon.

NOTE: I will skip the adoption rate of WebSocket in browsers, because there is not much I can add to the existing info.

Video and Audio use

This area is still dominated by Flash technologies. Even though WebSockets can send binary data, you need something on the client side to decode it. The best that I can think of is to bridge between JavaScript and Flash, but I guess it’s better to learn Flash in that case.

Mobile use

As Ilya Grigorik mentioned in his blog, WebSockets’ bi-directional push will be easier on the battery and much more efficient in bandwidth consumption. However, mobile networks tend to lose their connection or switch to different networks (eg: from 3G to wifi) frequently, so I am not sure how well WebSockets work in a real mobile scenario. Also, because mobile phones spend most of their time in standby mode, “Push notification” probably has bigger demand, and you may need to access the device’s native features through various libraries.

Any problems with the WebSockets protocol itself?

Greg Wilkins, who implemented the WebSocket feature in Jetty, posted a blog article about problems with the WebSocket protocol and how to improve it (he even suggested an alternative approach).

Among his various arguments, I found the following two points very interesting.

Orderly Close

websocket has no concept of an idle connection, and thus an implementation will either keep connections open forever (DOS risk) or risk closing an in-use connection. Note also that the burden of handling disconnection and message retries falls to the application with websocket. Short of acknowledging every message is a significant overhead and thus not practicable as a solution for all.
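To make the idle connection problem more concrete, here is a minimal sketch of what an application might do itself, since the protocol does not help. `IdleTracker` and its methods are hypothetical names, and timestamps are passed in explicitly to keep the example deterministic; this is one possible application-level workaround, not something taken from Greg's article or from Jetty.

```python
class IdleTracker:
    """Remembers when each connection was last active."""

    def __init__(self, idle_timeout_s):
        self.idle_timeout_s = idle_timeout_s
        self._last_seen = {}

    def touch(self, conn_id, now):
        """Record activity: a data message or an application-level heartbeat."""
        self._last_seen[conn_id] = now

    def expired(self, now):
        """Ids of connections with no activity for longer than the timeout."""
        return sorted(
            conn_id
            for conn_id, seen in self._last_seen.items()
            if now - seen > self.idle_timeout_s
        )

    def drop(self, conn_id):
        """Forget a connection once the application has closed it."""
        self._last_seen.pop(conn_id, None)


tracker = IdleTracker(idle_timeout_s=30)
tracker.touch("a", now=0)
tracker.touch("b", now=0)
tracker.touch("a", now=25)      # "a" sends a heartbeat, "b" stays silent
print(tracker.expired(now=40))  # only "b" has been idle for more than 30 s
```

The catch is the one the quote points at: without heartbeats the server cannot tell a quiet but healthy client from a dead one, and with heartbeats every client pays extra traffic just to stay connected.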

Message Fragmentation

Another issue with HTTP/1.1 pipelining is that the time taken to transmit/receive/process one message in the pipeline can unreasonably delay the handling of subsequent messages. While websocket is not hampered in this regard by request response semantics, it still suffers from the issue that the sending of a large websocket message may unreasonably delay the transmission of other messages.
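The delay caused by a large message can be illustrated with a small sketch. The framing below, a `(message_id, is_last, chunk)` tuple sent round-robin, is a hypothetical scheme invented for this example; it is not what the WebSocket draft defines, only a way to see why splitting messages into fragments lets a small message overtake a big one.

```python
from collections import deque


def fragment(message, size):
    """Split a message into chunks of at most `size` bytes."""
    return [message[i:i + size] for i in range(0, len(message), size)] or [b""]


def interleave(messages, fragment_size):
    """Return (message_id, is_last, chunk) frames, taking turns between messages."""
    pending = deque(
        (msg_id, deque(fragment(data, fragment_size))) for msg_id, data in messages
    )
    frames = []
    while pending:
        msg_id, chunks = pending.popleft()
        chunk = chunks.popleft()
        frames.append((msg_id, not chunks, chunk))  # is_last when no chunks remain
        if chunks:
            pending.append((msg_id, chunks))  # go to the back of the line
    return frames


frames = interleave([("big", b"A" * 10), ("small", b"hi")], fragment_size=4)
for frame in frames:
    print(frame)
```

Without fragmentation the small message would have to wait for all ten bytes of the big one; with it, the small message is complete after the second frame.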

If you are going to bet on WebSockets for 2010, you might want to spend a little bit of time reading the above blog article, the WebSocket protocol draft, or the source code of any library that supports WebSocket.

Summary

After reading yet another long blog post of mine, did you get a clear idea about RTW and WebSocket, or did you get confused?

Here are my takes after this brain dump.

  • There is the Real Time Web and there is Real Time Computing. Both are push-based technologies, but they differ in update frequency and in the content of the data.
  • Comet (aka Ajax long polling), XMPP, FlashSocket, and WebSocket are the currently available technologies for the Real Time Web. Each technology has pros and cons.
  • Many concurrency-oriented languages/libraries/frameworks offer WebSocket solutions. Some Comet-based frameworks now offer WebSocket support as well.
  • WebSocket is NOT the silver bullet for all RTW scenarios.

The world of RTW and WebSockets is changing rapidly, and new examples, blog posts, and libraries appear almost every day. Please let us know about any such new information as you find or create it.