New Bamboo Web Development

Bamboo blog. Our thoughts on web technology.

Living on the edge of the WebSocket protocol

by Makoto Inoue

  • Introduction
  • Getting Info
  • So What’s Changed?
  • What about version incompatibility?
  • Summary

Introduction

Hello. It's been a while since I last blogged about WebSocket. It has been exciting to see so many new frameworks and services appear, including our own Pusher.

Having said that, we have to be aware that the protocol is still actively being changed.

Just a few weeks ago, there was an important announcement on the Chromium blog.

Developers should be aware that starting from WebKit nightly build r59903 and Chrome 6.0.414.0 (r47952), the client will talk to a server using -76 version of the protocol, so it will fail to open WebSocket connections with a WebSocket server based on draft-hixie-thewebsocketprotocol-75. Since -75 version of the protocol is obsoleted and no longer supported in future version of browsers, to support new clients you need to update the server implementation. (Note that Chrome 5 uses -75 version of protocol).

And I find this a very bold statement.

The WebSocket protocol is still actively being changed. Until there is more consensus, we will continue to update our implementation to follow the latest draft of specification, rather than worrying about breaking changes.

This means that you (or the framework maintainer) must be aware of these changes and must act quickly to keep up to date.

Luckily, some people have already analysed these changes and blogged about them.

Martyn and I spent last Friday applying these changes to em-websocket (which we use for Pusher), and they are available as of version 0.1.1. We're already seeing about 2% of clients connecting to pusherapp using the new protocol.

I'd like to share our findings and hope that other framework/library authors will do the same. The last thing I want to hear is "WebSocket keeps changing and most frameworks/libraries/services are broken". That will not increase the general adoption of WebSocket.

I also hope that this is technically interesting for people who want to know what happens under the hood of a WebSocket implementation but don't necessarily want to write a framework. (NOTE: If you are just interested in using WebSocket from various languages, you can use Pusher via its language bindings.)

Getting Info

First things first: you need to find out that a change has happened, so here are a few tips.

Follow Twitter

If you really want to stay ahead of the game, you should either follow @chromiumbuild or search Twitter for the keyword "websocket".
There is a lot of noise, but you will hear about changes very quickly. Twitter was buzzing a week before the update was announced on the Chromium blog.

Use Chrome dev channel

Instead of using the stable version of Chrome, you can use the dev channel version.

Chrome has Stable, Beta and Dev channels, each updated on its own schedule.

The Dev channel is updated every week, and updates are applied automatically as soon as you restart your browser, so be aware of the consequences.

You can also use the WebKit nightly build, but I'm not sure where to find the schedule for Safari updates. Does anyone have an idea?

Reading the draft

Reading the entire draft (over 50 pages) is a pain. However, you can read just sections 1 and 5 (18 pages together) to understand the changes on the server side.

Referring to other implementations

The WebSocket implementations in Google Go and pywebsocket were written by a Google engineer (Fumitoshi Ukai).

I highly recommend reading the pywebsocket source code. It is an Apache module, but standalone.py can run on its own without configuring Apache.

When we were writing the draft 76 patch for our Ruby server (em-websocket), we used "echo_client.py" to hit the Ruby server for testing.
This is much easier than checking with two browsers. pywebsocket is used to test the Chromium browser implementation, so it is very likely to be kept up to date.

    [example]$ ./echo_client.py -p 8080 --draft75
    Send: Hello
    Recv: Hello
    Send: 日本
    Recv: 日本

    [example]$ ./echo_client.py -p 8080
    Send: Hello
    Recv: Hello
    Send: 日本
    Recv: 日本

A detailed tutorial on how to install pywebsocket is here.

Please note that you need to build and install it even if you only use standalone.py. If you don't install it, you will get the following error message.

    [src]$ echo $PYTHONPATH
    /Users/makoto/src/pywebsocket-read-only/src
    [mod_pywebsocket]$ sudo python ./standalone.py -p 9999
    Traceback (most recent call last):
      File "./standalone.py", line 474, in <module>
        _main()
      File "./standalone.py", line 361, in _main
        default=handshake.DEFAULT_WEB_SOCKET_PORT,
    AttributeError: 'module' object has no attribute 'DEFAULT_WEB_SOCKET_PORT'

If you understand Japanese

The html5-developers-jp Google group is very active, and it looks like Google engineers are following it too. I posted some questions before starting to implement the upgrade support and got lots of useful feedback. (Yes, you can also show off your Google Translate skills by posting in Japanese, but don't forget that it produces some funny translations.)

So What's Changed?

This section goes into much more technical detail, so you may wish to skip it.

The initial upgrade request now includes an extra handshake step, so that the connection fails fast if a WebSocket request is received by a non-WebSocket server (especially a normal HTTP server, unless that server explicitly opts in to handling the handshake).

The new draft also adds a better way to handle the closing phase, which I will explain later.

Opening Handshake

From Client to Server

Old

    GET /demo HTTP/1.1
    Upgrade: WebSocket
    Connection: Upgrade
    Host: example.com
    Origin: http://example.com
    WebSocket-Protocol: sample

New

    GET /demo HTTP/1.1
    Host: example.com
    Connection: Upgrade
    Sec-WebSocket-Key2: 12998 5 Y3 1  .P00
    Sec-WebSocket-Protocol: sample
    Upgrade: WebSocket
    Sec-WebSocket-Key1: 4 @1  46546xW%0l 1 5
    Origin: http://example.com

    ^n:ds[4U

Here is the list of changes.

  1. Name("Web Socket" => "WebSocket")
  2. "Sec-" prefix
  3. New fields (Sec-WebSocket-Key1/2)
  4. Some random characters at the body of request/response (^n:ds[4U and 8jKS'y:G*Co,Wxa-)

#2, the "Sec-" prefix, is explained in the Security model section of the draft:

It is similarly intended to fail to establish a connection when data from other protocols, especially HTTP, is sent to a WebSocket server, for example as might happen if an HTML |form| were submitted to a WebSocket server. This is primarily achieved by requiring that the server prove that it read the handshake, which it can only do if the handshake contains the appropriate parts which themselves can only be sent by a WebSocket handshake; in particular, fields starting with |Sec-| cannot be set by an attacker from a Web browser, even when using |XMLHttpRequest|.

NOTE: I am not sure why Sec- cannot be set from a Web browser (unless Sec- is a reserved word for HTTP protocol). If you know any info, please comment.

#3 and #4 are used for the opening handshake, which I will explain in the next section.
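To make the new request concrete, here is a minimal sketch of pulling the header fields and the 8-byte body out of a raw draft 76 upgrade request. The helper name `parse_upgrade_request` and the shape of the returned hash are my own assumptions for illustration, not code from em-websocket.

```ruby
# Hypothetical helper (not from em-websocket): split a raw draft 76
# upgrade request into request line, header fields and 8-byte body.
def parse_upgrade_request(raw)
  # Work on bytes, since the body is not guaranteed to be valid text.
  head, body = raw.b.split("\r\n\r\n", 2)
  lines = head.split("\r\n")
  request_line = lines.shift

  headers = {}
  lines.each do |line|
    name, value = line.split(":", 2)
    next unless value
    # Drop only the single space after the colon: spaces inside the
    # Sec-WebSocket-Key values are counted later, so they must survive.
    headers[name.strip] = value.sub(/\A /, "")
  end

  { request_line: request_line,
    headers: headers,
    key3: body.to_s.byteslice(0, 8) }
end

raw = "GET /demo HTTP/1.1\r\n" \
      "Host: example.com\r\n" \
      "Connection: Upgrade\r\n" \
      "Sec-WebSocket-Key2: 12998 5 Y3 1  .P00\r\n" \
      "Sec-WebSocket-Protocol: sample\r\n" \
      "Upgrade: WebSocket\r\n" \
      "Sec-WebSocket-Key1: 4 @1  46546xW%0l 1 5\r\n" \
      "Origin: http://example.com\r\n" \
      "\r\n" \
      "^n:ds[4U"

request = parse_upgrade_request(raw)
puts request[:headers]["Sec-WebSocket-Key1"]
puts request[:key3]
```

The point to notice is that the key values are kept byte for byte, including the double spaces, because the server has to count them.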

From Server to Client

Old

    HTTP/1.1 101 Web Socket Protocol Handshake
    Upgrade: WebSocket
    Connection: Upgrade
    WebSocket-Origin: http://example.com
    WebSocket-Location: ws://example.com/demo
    WebSocket-Protocol: sample

New

    HTTP/1.1 101 WebSocket Protocol Handshake
    Upgrade: WebSocket
    Connection: Upgrade
    Sec-WebSocket-Origin: http://example.com
    Sec-WebSocket-Location: ws://example.com/demo
    Sec-WebSocket-Protocol: sample

    8jKS'y:G*Co,Wxa-
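As a rough sketch, the new response could be assembled like this. `build_handshake_response` and its keyword arguments are hypothetical names chosen for this example, and the 16-byte `digest` is simply passed in here; how it is computed is covered in the steps that follow.

```ruby
# Hypothetical helper: assemble a draft 76 server handshake response.
# `digest` is the 16-byte answer to the client's challenge.
def build_handshake_response(origin:, location:, digest:, protocol: nil)
  lines = [
    "HTTP/1.1 101 WebSocket Protocol Handshake",
    "Upgrade: WebSocket",
    "Connection: Upgrade",
    "Sec-WebSocket-Origin: #{origin}",
    "Sec-WebSocket-Location: #{location}"
  ]
  # The protocol field is only echoed back when the client asked for one.
  lines << "Sec-WebSocket-Protocol: #{protocol}" if protocol

  # A blank line ends the header fields; the raw digest bytes follow as the body.
  (lines.join("\r\n") + "\r\n\r\n").b + digest.b
end

response = build_handshake_response(
  origin: "http://example.com",
  location: "ws://example.com/demo",
  protocol: "sample",
  digest: "8jKS'y:G*Co,Wxa-"
)
puts response
```

The result is handled as a binary string because the digest is raw bytes and will not always be printable text.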

The opening handshake on the server side is described in 13 steps. (The client-side handshake has 45 steps! This is another reason I use pywebsocket for client testing rather than writing my own client.)

The steps are understandable but quite hard to read, so it's better to read the source code of other implementations.
I referred to the pywebsocket version and the PHP version described by @WebReflection.

Here are the steps.

  1. Extract the digits from Key1 (eg: 4 @1  46546xW%0l 1 5) and concatenate them into one number
  2. Count the number of spaces in Key1
  3. Divide #1 by #2
  4. Encode #3 as a big-endian 32-bit integer
  5. Repeat #1 to #4 for Key2 (eg: 12998 5 Y3 1  .P00)
  6. Concatenate #4, #5, and the body (eg: ^n:ds[4U) of the request
  7. Compute the MD5 digest of the result; the 16-byte digest becomes the body of the response (eg: 8jKS'y:G*Co,Wxa-)

Did you get it? It's probably easier to read the code. Here is the Ruby implementation.

    require 'digest/md5'

    def solve_challenge(first, second, third)
      # Refer to 5.2 4-9 of the draft 76
      # Each key number is packed as a big-endian 32-bit integer ("N")
      sum = [extract_nums(first) / count_spaces(first)].pack("N") +
            [extract_nums(second) / count_spaces(second)].pack("N") +
            third
      Digest::MD5.digest(sum)
    end

    # Concatenate every digit in the key into a single integer
    def extract_nums(string)
      string.scan(/[0-9]/).join.to_i
    end

    # Count the space characters in the key
    def count_spaces(string)
      string.scan(/ /).size
    end
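As a quick check, here is a self-contained sketch that applies the same steps to the sample keys from the request shown earlier. The method names `key_number` and `challenge_response` are my own, and the two guard clauses reflect my reading of the draft (a key with no spaces, or digits that don't divide evenly, should abort the handshake), so treat them as assumptions.

```ruby
require "digest/md5"

# Hypothetical helper: turn one Sec-WebSocket-Key value into its number.
def key_number(key)
  digits = key.scan(/[0-9]/).join.to_i
  spaces = key.count(" ")
  # Assumption based on my reading of the draft: both cases are invalid.
  raise ArgumentError, "key contains no spaces" if spaces.zero?
  raise ArgumentError, "digits not divisible by spaces" unless (digits % spaces).zero?
  digits / spaces
end

# Hypothetical helper: the full challenge answer for key1, key2 and the body.
def challenge_response(key1, key2, key3)
  # "N2" packs both numbers as big-endian 32-bit integers.
  Digest::MD5.digest([key_number(key1), key_number(key2)].pack("N2") + key3)
end

key1 = "4 @1  46546xW%0l 1 5"   # digits 4146546015, 5 spaces
key2 = "12998 5 Y3 1  .P00"     # digits 1299853100, 5 spaces
key3 = "^n:ds[4U"

puts key_number(key1)   # 4146546015 / 5 = 829309203
puts key_number(key2)   # 1299853100 / 5 = 259970620
puts challenge_response(key1, key2, key3)
```

With these inputs the digest should come out as the 16 bytes shown in the sample response body, which makes this a handy regression test when touching the handshake code.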

Closing handshake

In earlier drafts, you could terminate the connection, but there was no proper protocol for the client and server to acknowledge the closing phase. This lack of an "orderly close" was criticised here.

To solve this problem, a closing handshake is now specified: either the client or the server can send a 0xFF frame with length 0x00 to begin the closing handshake.
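A minimal sketch of the closing handshake follows. The `ClosingState` class and the idea of passing in a callable to send bytes are assumptions made for this example and are not how em-websocket is structured. The acknowledgement behaviour (reply with the same frame if you have not sent one yet) is how I read the draft.

```ruby
# The closing frame: a 0xFF byte followed by a 0x00 length byte.
CLOSING_FRAME = "\xFF\x00".b

# Hypothetical wrapper; `sender` is any callable that writes bytes to the peer.
class ClosingState
  def initialize(sender)
    @sender = sender
    @close_sent = false
  end

  # Begin the closing handshake from our side.
  def start_close
    send_closing_frame
  end

  # Feed each received frame in. Returns true once a closing frame has
  # been both sent and received, so the connection can be dropped.
  def receive(frame)
    return false unless frame.b == CLOSING_FRAME
    send_closing_frame # acknowledges, unless we already sent ours
    true
  end

  private

  def send_closing_frame
    return if @close_sent
    @sender.call(CLOSING_FRAME)
    @close_sent = true
  end
end

sent = []
state = ClosingState.new(->(bytes) { sent << bytes })
puts state.receive("\xFF\x00")   # peer started the close, we acknowledge
puts sent.length
```

Tracking whether our own closing frame has gone out keeps the server from sending it twice when both sides start closing at the same time.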

What about version incompatibility?

The biggest disruption in the new draft is its incompatibility with the prior version. Unfortunately, the draft version number is not specified in the HTTP request headers (and probably won't be until the draft stabilises). For now it's easy enough to get by. You can either

  • A: Try draft 76, catch the exception, and fall back to draft 75
  • B: Identify the version by checking for the newly added fields.

Pywebsocket uses A

    class Handshaker(object):
        """This class performs Web Socket handshake."""

        def __init__(self, request, dispatcher, allowDraft75=False, strict=False):
            """Construct an instance.

            Handshaker will add attributes such as ws_resource in performing
            handshake.
            """

            if allowDraft75:
                self._fallbackHandshaker = draft75.Handshaker(
                    request, dispatcher, strict)

        def do_handshake(self):
            try:
                self._handshaker.do_handshake()
            except HandshakeError, e:
                if self._fallbackHandshaker:
                    self._logger.warning('fallback to old protocol')
                    self._fallbackHandshaker.do_handshake()
                    return
                raise e

We (em-websocket) use B

    version = request['Sec-WebSocket-Key1'] ? 76 : 75

    case version
    when 75
      Handler75.new(request, response, debug)
    when 76
      Handler76.new(request, response, debug)
    else
      raise "Must not happen"
    end
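One detail worth guarding against with approach B: HTTP header names are case-insensitive, so a lookup by exact string can miss the field. Below is a small sketch of a more forgiving check. `detect_draft` is a hypothetical helper that takes a plain Hash of header names to values, which is an assumption about how the request is represented.

```ruby
# Hypothetical helper: guess the draft version from the request headers.
# `headers` is assumed to be a Hash of header name => value.
def detect_draft(headers)
  names = headers.keys.map { |name| name.to_s.downcase }
  # Only draft 76 clients send the Sec-WebSocket-Key fields.
  names.include?("sec-websocket-key1") ? 76 : 75
end

draft75_headers = { "Upgrade" => "WebSocket", "Origin" => "http://example.com" }
draft76_headers = draft75_headers.merge(
  "sec-websocket-key1" => "4 @1  46546xW%0l 1 5",
  "Sec-WebSocket-Key2" => "12998 5 Y3 1  .P00"
)

puts detect_draft(draft75_headers)
puts detect_draft(draft76_headers)
```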

Summary

The updated WebSocket protocol changes the opening and closing handshakes. For now we can support both draft 75 and draft 76, though the implementation is a bit hacky. Our changes are tested against the pywebsocket script (which Chromium developers use to test their own implementation) and against Chrome and WebKit development builds, but please let us know if we are missing anything.