Building a GOV.UK exhibit
The GOV.UK website is being shown at the Design Museum as part of the 2013 Designs of the Year awards. GDS asked me to develop an interactive exhibit to highlight GOV.UK’s responsive design by allowing gallery visitors to browse the site on a desktop computer, an iPhone and an iPad simultaneously.
I chose a simple solution based on web technologies: proxy the HTTP requests to the GOV.UK site, inject some JavaScript into every page, and use a WebSocket connection to propagate navigation and scrolling events between browsers on the three devices.
Contents
- Proxying requests
- Injecting JavaScript
- Installing Faye
- Syncing scroll positions
- Syncing URLs
- Forcing internal links
- Handling history
- Restoring state
- Enabling fullscreen
- Further work
- Conclusions
Proxying requests
I began by building a basic HTTP proxy in Ruby with Rack::Proxy. I installed Thin and wrote this config.ru:
require 'rack/proxy'
require 'uri'
GOV_UK_URL = URI.parse('https://www.gov.uk')
class Proxy < Rack::Proxy
  def rewrite_env(env)
    env.
      merge(
        'rack.url_scheme' => GOV_UK_URL.scheme,
        'HTTP_HOST' => GOV_UK_URL.host,
        'SERVER_PORT' => GOV_UK_URL.port
      ).
      reject { |key, _| key == 'HTTP_ACCEPT_ENCODING' }
  end
  def rewrite_response(response)
    status, headers, body = response
    [
      status,
      headers.reject { |key, _| %w(status transfer-encoding).include?(key) },
      body
    ]
  end
end
run Proxy.new
Aside from forwarding every request to https://www.gov.uk/, the proxy makes three other changes:
- Remove the Accept-Encoding header from the request. The browser will probably send Accept-Encoding to indicate that it can cope with gzipped or deflated content, and leaving this header intact would encourage the upstream server to send a compressed message body, making it slightly harder to rewrite the response later.
- Remove the Status header from the response. The GOV.UK server sends Status in every response for reasons that are unclear to me, but the Rack specification says that the HTTP status code must be communicated solely by the status component of the response triple, so this header shouldn’t be passed downstream. (This is arguably a bug in Rack::Proxy.)
- Remove the Transfer-Encoding header from the response. GOV.UK sends chunked responses for pages above a certain size, but these are transparently reassembled by Net::HTTP, so the downstream response will no longer be chunked unless we explicitly re-chunk it. (Again, arguably a Rack::Proxy bug.)
This simple Rack app doesn’t yet modify the content of pages at all, but it already does a decent job of proxying the site without too many problems. Thankfully most of the navigational links on GOV.UK are relative, so the browser often stays at the proxy hostname rather than navigating away to the real site; conversely, almost all of the asset URLs are absolute, so the images and stylesheets get loaded from the real site without burdening the proxy.
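To see the effect of those rewrites in isolation, here is a minimal sketch that applies the same merge and reject steps to plain hashes, with no Rack::Proxy or network involved. The method names, the sample environment and the sample headers (including their lowercase keys) are assumptions made up for illustration.

```ruby
require 'uri'

UPSTREAM = URI.parse('https://www.gov.uk')

# Same steps as rewrite_env: point the request at the upstream site and
# drop Accept-Encoding so the upstream reply is not compressed.
def rewrite_env_sketch(env)
  env.
    merge(
      'rack.url_scheme' => UPSTREAM.scheme,
      'HTTP_HOST' => UPSTREAM.host,
      'SERVER_PORT' => UPSTREAM.port
    ).
    reject { |key, _| key == 'HTTP_ACCEPT_ENCODING' }
end

# Same filter as rewrite_response, applied to a bare headers hash.
def rewrite_headers_sketch(headers)
  headers.reject { |key, _| %w(status transfer-encoding).include?(key) }
end

# Hypothetical request environment as the browser might have produced it.
sample_env = {
  'rack.url_scheme' => 'http',
  'HTTP_HOST' => 'proxy.example.test',
  'SERVER_PORT' => '9292',
  'PATH_INFO' => '/browse',
  'HTTP_ACCEPT_ENCODING' => 'gzip, deflate'
}

# Hypothetical upstream response headers.
sample_headers = {
  'content-type' => 'text/html; charset=utf-8',
  'status' => '200 OK',
  'transfer-encoding' => 'chunked'
}

p rewrite_env_sketch(sample_env)
p rewrite_headers_sketch(sample_headers)
```

The path is left untouched, which is why the proxy can forward any GOV.UK URL without knowing anything about the site's structure.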
Injecting JavaScript
I created an empty file, mirror.js, and used the Rack::Static middleware to serve it from the proxy app:
require 'rack'
require 'rack/proxy'
require 'uri'
GOV_UK_URL = URI.parse('https://www.gov.uk')
MIRROR_JAVASCRIPT_PATH = '/mirror.js'
# ...
use Rack::Static, urls: [MIRROR_JAVASCRIPT_PATH]
run Proxy.new
Next I wrote a Rack middleware to add a <script> tag to the <head> of text/html responses so that mirror.js got loaded on every page:
# ...
class InsertTags < Struct.new(:app)
  def call(env)
    status, headers, body = app.call(env)
    Rack::Response.new(body, status, headers) do |response|
      if media_type(response) == 'text/html'
        content = add_tags(response.body.join)
        response.body = [content]
        # Content-Length is a byte count, so use bytesize rather than length
        response.headers['Content-Length'] = content.bytesize.to_s
      end
    end
  end
  def media_type(response)
    response.content_type.to_s.split(';').first
  end
  def add_tags(content)
    content.sub(%r{(?=</head>)}, script_tags)
  end
  def script_tags
    %Q{<script src="#{MIRROR_JAVASCRIPT_PATH}"></script>}
  end
end
use Rack::Static, urls: [MIRROR_JAVASCRIPT_PATH]
use InsertTags
run Proxy.new
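The zero-width lookahead in add_tags is what places the new tag immediately before </head> without consuming the closing tag itself. A minimal stand-alone sketch of just that substitution follows; add_tags_sketch, MIRROR_PATH and the sample page are names and data made up for illustration.

```ruby
MIRROR_PATH = '/mirror.js' # stands in for MIRROR_JAVASCRIPT_PATH

# (?=</head>) matches the empty string just before "</head>", so sub
# inserts the tag there and leaves the rest of the page untouched.
def add_tags_sketch(html)
  html.sub(%r{(?=</head>)}, %Q{<script src="#{MIRROR_PATH}"></script>})
end

page = '<html><head><title>Example</title></head><body></body></html>'
puts add_tags_sketch(page)

# A document without a </head> passes through unchanged.
puts add_tags_sketch('<p>fragment only</p>')
```

Because sub replaces only the first match, a stray "</head>" later in the body would not receive a second copy of the tag.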
Installing Faye
Using a raw WebSocket connection involves dealing with low-level details like timeouts, buffering and message encoding, as well as writing a WebSocket server to marshal communication between clients. To avoid this administrative overhead I decided to use Faye, a project which provides Ruby and JavaScript implementations of the Bayeux protocol for asynchronous publish/subscribe messaging over HTTP. Faye abstracts away the WebSocket layer and exposes a simple interface for bidirectional real-time communications: any client can publish a message on a named channel, and the Faye server delivers that message to all clients who are subscribed to the same channel.
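The publish/subscribe model is easy to picture with a toy, in-process hub. The sketch below is not Faye's API and involves no networking; ToyHub and its methods are hypothetical names that only illustrate how a message published on a named channel reaches every subscriber of that channel and nobody else.

```ruby
# Toy illustration of channel-based publish/subscribe (not Faye itself).
class ToyHub
  def initialize
    # Each channel name maps to the list of handlers subscribed to it.
    @subscribers = Hash.new { |hash, channel| hash[channel] = [] }
  end

  def subscribe(channel, &handler)
    @subscribers[channel] << handler
  end

  def publish(channel, data)
    @subscribers[channel].each { |handler| handler.call(data) }
  end
end

hub = ToyHub.new
iphone_log = []
ipad_log = []

hub.subscribe('/scroll') { |message| iphone_log << message }
hub.subscribe('/scroll') { |message| ipad_log << message }

hub.publish('/scroll', { 'x' => 0, 'y' => 120 })
hub.publish('/navigate', { 'url' => '/browse' }) # no subscribers, so dropped

p iphone_log
p ipad_log
```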
I installed the faye gem, loaded the Thin WebSocket adapter, and added Faye’s Rack adapter to the middleware stack:
require 'faye'
require 'rack'
require 'rack/proxy'
require 'uri'
# ...
Faye::WebSocket.load_adapter('thin')
use Faye::RackAdapter, mount: '/faye'
use Rack::Static, urls: [MIRROR_JAVASCRIPT_PATH]
use InsertTags
run Proxy.new
This configures Faye to respond to HTTP requests whose paths begin with /faye. Faye can serve the source of its own JavaScript client, so I modified the InsertTags middleware to tell the browsers to load that file on every page too:
# ...
GOV_UK_URL = URI.parse('https://www.gov.uk')
FAYE_JAVASCRIPT_PATH = '/faye/faye-browser-min.js'
MIRROR_JAVASCRIPT_PATH = '/mirror.js'
# ...
class InsertTags < Struct.new(:app)
  # ...
  def script_tags
    [FAYE_JAVASCRIPT_PATH, MIRROR_JAVASCRIPT_PATH].
      map { |src| %Q{<script src="#{src}"></script>} }.join
  end
end
Syncing scroll positions
Now that every browser was loading Faye and my (empty) mirror.js script, I was ready to write some JavaScript to connect them together.
Synchronising their scroll positions was the easiest part. The scroll event fires whenever a window is scrolled, and window.scrollTo can be used to set the current scroll position, so I could just use Faye to broadcast scroll events and recreate them on other devices.
To avoid feedback loops and other asynchronous difficulties, I decided to designate each browser as either sending scroll messages or receiving them, but not both. (Gallery visitors will be interacting with the desktop machine while watching the iPhone and iPad under perspex, so one-way synchronisation is sufficient.) The simplest way to differentiate these roles was to use two hostnames: the controlling browser opens a URL containing the canonical hostname of the proxy server, while each mirroring browser is started at an alias hostname beginning with mirror. The JavaScript can then easily check the current hostname and decide whether to send or receive scroll events.
Here’s what went into mirror.js:
(function () {
  var begin = function (beginControlling, beginMirroring) {
    var faye = new Faye.Client('/faye');
    if (window.location.hostname.indexOf('mirror') === 0) {
      beginMirroring(faye);
    } else {
      beginControlling(faye);
    }
  };
  var beginControlling = function (faye) {
    window.addEventListener('scroll', function () {
      faye.publish('/scroll', { x: window.scrollX, y: window.scrollY });
    });
  };
  var beginMirroring = function (faye) {
    faye.subscribe('/scroll', function (message) {
      window.scrollTo(message.x, message.y);
    });
  };
  begin(beginControlling, beginMirroring);
}());
This is enough to reproduce the scrolling behaviour of the controlling browser in each of the mirroring browsers.
Syncing URLs
The next step was to synchronise the URL of the page being shown by all the browsers. I did this by publishing a message to the /navigate channel every time a click event occurred inside any <a> element in the controlling browser, and setting window.location.href in each mirroring browser when this message was received:
(function () {
  // ...
  var navigateTo = function (url) {
    if (window.location.href !== url) {
      window.location.href = url;
    }
  };
  var beginControlling = function (faye) {
    // ...
    window.addEventListener('click', function (event) {
      var element = event.target;
      while (element) {
        if (element.localName === 'a') {
          event.preventDefault();
          faye.publish('/navigate', { url: element.href });
          navigateTo(element.href);
          break;
        }
        element = element.parentNode;
      }
    });
  };
  var beginMirroring = function (faye) {
    // ...
    faye.subscribe('/navigate', function (message) {
      navigateTo(message.url);
    });
  };
  begin(beginControlling, beginMirroring);
}());
Manually setting window.location.href in the controlling browser (rather than allowing the default click event behaviour) has the desirable side-effect of forcing any awkward links (e.g. target="_blank") to open in the current window.
Forcing internal links
Although this code successfully synchronises the first URL change, it causes the mirroring browsers to navigate away from the mirroring hostname and onto the controlling hostname, preventing any further updates. I fixed this by updating navigateTo to rewrite all URLs to use the browser’s current protocol and host:
(function () {
  // ...
  var navigateTo = function (url) {
    var a = document.createElement('a');
    a.href = url;
    a.protocol = window.location.protocol;
    a.host = window.location.host;
    if (window.location.href !== a.href) {
      window.location.href = a.href;
    }
  };
  // ...
}());
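The same rewrite can be sketched outside the browser with Ruby's URI library, which makes it easy to try on sample URLs. This is an illustration only, not part of the exhibit code: rewrite_to_current_host and the .test hostnames are made up, and the sketch assumes the incoming URL is absolute.

```ruby
require 'uri'

# Keep the scheme, host and port of the page the browser is currently on,
# and take only the path, query and fragment from the requested URL.
def rewrite_to_current_host(url, current_url)
  target = URI.parse(url)
  rewritten = URI.parse(current_url)
  rewritten.path = target.path
  rewritten.query = target.query
  rewritten.fragment = target.fragment
  rewritten.to_s
end

# A GOV.UK link followed on a mirroring browser stays on the mirror host.
puts rewrite_to_current_host(
  'https://www.gov.uk/vat-rates?x=1#rates',
  'http://mirror.example.test:9292/'
)

# An external link is rewritten too, to its own path on the current host.
puts rewrite_to_current_host(
  'https://example.org/about',
  'http://proxy.example.test/'
)
```

The second call shows why an unconditional rewrite is not enough on its own: a link to another site ends up pointing at that site's path on the proxy.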
Because the controlling browser is also using navigateTo, this prevents the user from navigating away from GOV.UK, although the resulting behaviour (clicking on an external link takes you to that link’s path on the current hostname) is unexpected. To avoid this I just completely disabled navigation to any external link:
(function () {
  // ...
  var beginControlling = function (faye) {
    // ...
    window.addEventListener('click', function (event) {
      var element = event.target;
      while (element) {
        if (element.localName === 'a') {
          event.preventDefault();
          if (element.host === window.location.host || element.hostname === 'www.gov.uk') {
            faye.publish('/navigate', { url: element.href });
            navigateTo(element.href);
          }
          break;
        }
        element = element.parentNode;
      }
    });
  };
  // ...
}());
Clicking on links isn’t the only way to navigate around GOV.UK. Some forms on the site (e.g. Pay your council tax) generate an HTTP redirect when submitted, so the Location headers of these responses need to be rewritten to prevent the browser navigating away from the current host:
# ...
class RewriteRedirects < Struct.new(:app)
  def call(env)
    status, headers, body = app.call(env)
    Rack::Response.new(body, status, headers) do |response|
      if response.redirect?
        url = URI.parse(response.location)
        url = url.route_from(GOV_UK_URL) if url.absolute?
        if url.relative?
          response.redirect(url.to_s, response.status)
        else
          response.status = 204
        end
      end
    end
  end
end
Faye::WebSocket.load_adapter('thin')
use Faye::RackAdapter, mount: '/faye'
use Rack::Static, urls: [MIRROR_JAVASCRIPT_PATH]
use RewriteRedirects
use InsertTags
run Proxy.new
The RewriteRedirects middleware turns absolute GOV.UK URLs into relative ones so that the browser stays on the current host. If the redirect URL points at a non-GOV.UK site, the response code is changed to 204 No Content to prevent the browser from navigating anywhere.
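The decision the middleware makes can be tried out on its own. The sketch below is a simplified stand-in, not the middleware itself: it swaps URI#route_from for an explicit host comparison so that the result always begins with a slash, and relative_location plus the sample URLs are made up for illustration. A nil return corresponds to the 204 case.

```ruby
require 'uri'

GOV_UK = URI.parse('https://www.gov.uk')

# Returns a host-relative Location for GOV.UK URLs, passes already-relative
# paths through untouched, and returns nil for any other site.
def relative_location(location)
  url = URI.parse(location)
  return location if url.relative? && url.host.nil?
  return nil unless url.is_a?(URI::HTTP) && url.host == GOV_UK.host

  relative = url.request_uri # path plus query string, "/" if path is empty
  relative += "##{url.fragment}" if url.fragment
  relative
end

p relative_location('https://www.gov.uk/pay-council-tax/result?postcode=E1#next')
p relative_location('/browse')
p relative_location('https://example.org/')
```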
Handling history
The current URL also changes when the user manipulates the browser history, e.g. with the back/forward buttons, but this isn’t caught by the click handler. A catch-all solution is to listen for the pageshow event and republish the /scroll and /navigate messages whenever a new page is shown:
(function () {
  // ...
  var beginControlling = function (faye) {
    // ...
    window.addEventListener('pageshow', function () {
      faye.publish('/scroll', { x: window.scrollX, y: window.scrollY });
      faye.publish('/navigate', { url: window.location.href });
    });
  };
  // ...
}());
(Incidentally, this papers over the race condition which occurs when the click handler is trying to publish the /navigate message before the current page unloads. If the Faye client loses this race, the subsequent pageshow event will bring the mirroring browsers back in sync.)
Unfortunately no event fires at all when GOV.UK’s JavaScript updates the current URL by calling history.pushState (e.g. on the maternity leave calculator), so I had to replace the browser’s pushState implementation with one that publishes a message:
(function () {
  // ...
  var beginControlling = function (faye) {
    // ...
    var realPushState = window.history.pushState;
    window.history.pushState = function (state, title, url) {
      faye.publish('/navigate', { url: url });
      return realPushState.call(window.history, state, title, url);
    };
  };
  // ...
}());
I couldn’t find a GOV.UK page that uses history.replaceState, so I ignored it.
Restoring state
At this stage the implementation was complete enough to deliver, but I wanted to make it more robust by keeping track of the current URL and scroll position on the server so that any new client (e.g. a rebooted iPhone or iPad) could be sent straight to the right page instead of having to wait for a user interaction to trigger an update.
The first step was to write a server-side Faye extension to remember the values that appeared in the most recent /scroll and /navigate messages:
# ...
class StateCache < Struct.new(:x, :y, :url)
  def incoming(message, callback)
    channel, data = message.values_at('channel', 'data')
    case channel
    when '/scroll'
      self.x = data['x']
      self.y = data['y']
    when '/navigate'
      self.url = data['url']
    end
    callback.call(message)
  end
end
Faye::WebSocket.load_adapter('thin')
use Faye::RackAdapter, mount: '/faye', extensions: [StateCache.new(0, 0, '/')]
use Rack::Static, urls: [MIRROR_JAVASCRIPT_PATH]
use RewriteRedirects
use InsertTags
run Proxy.new
The server boots with reasonable defaults for the current URL and scroll position, but to improve accuracy I wrote a client-side extension to republish the actual values whenever the controlling browser reconnects:
(function () {
  // ...
  var beginControlling = function (faye) {
    // ...
    faye.addExtension({
      outgoing: function (message, callback) {
        if (message.channel === '/meta/handshake') {
          faye.publish('/scroll', { x: window.scrollX, y: window.scrollY });
          faye.publish('/navigate', { url: window.location.href });
        }
        callback(message);
      }
    });
  };
  // ...
}());
This automatically freshens the server’s state if it gets restarted for any reason.
The current state could now be sent to clients as soon as they connected. I did this by adding the appropriate values to the ext field of the /meta/subscribe response sent to a mirroring browser when it successfully subscribes to a channel:
# ...
class StateCache < Struct.new(:x, :y, :url)
  def incoming(message, callback)
    # ...
  end
  def outgoing(message, callback)
    channel, successful, subscription =
      message.values_at('channel', 'successful', 'subscription')
    if channel == '/meta/subscribe' && successful
      case subscription
      when '/scroll'
        message['ext'] = { x: x, y: y }
      when '/navigate'
        message['ext'] = { url: url }
      end
    end
    callback.call(message)
  end
end
# ...
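Because the extension is just an object with incoming and outgoing methods, its behaviour can be checked without a running Faye server. The sketch below copies the logic into a stand-alone class and drives it with hand-written message hashes and a pass-through callback; StateCacheSketch, the sample messages and the callback are stand-ins made up for illustration rather than anything Faye would literally send.

```ruby
# Stand-alone copy of the caching logic, exercised with plain hashes.
class StateCacheSketch < Struct.new(:x, :y, :url)
  # Remember the latest scroll position and URL as messages arrive.
  def incoming(message, callback)
    channel, data = message.values_at('channel', 'data')
    case channel
    when '/scroll'
      self.x = data['x']
      self.y = data['y']
    when '/navigate'
      self.url = data['url']
    end
    callback.call(message)
  end

  # Attach the remembered state to successful subscribe responses.
  def outgoing(message, callback)
    channel, successful, subscription =
      message.values_at('channel', 'successful', 'subscription')
    if channel == '/meta/subscribe' && successful
      case subscription
      when '/scroll'
        message['ext'] = { x: x, y: y }
      when '/navigate'
        message['ext'] = { url: url }
      end
    end
    callback.call(message)
  end
end

cache = StateCacheSketch.new(0, 0, '/')
pass_through = ->(message) { message }

# The controlling browser scrolls, then navigates.
cache.incoming({ 'channel' => '/scroll', 'data' => { 'x' => 0, 'y' => 480 } }, pass_through)
cache.incoming({ 'channel' => '/navigate', 'data' => { 'url' => '/browse/benefits' } }, pass_through)

# A freshly booted mirror subscribes and receives the cached URL.
reply = cache.outgoing(
  { 'channel' => '/meta/subscribe', 'successful' => true, 'subscription' => '/navigate' },
  pass_through
)
p reply['ext']
```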
To make use of this data I added another client extension to catch /meta/subscribe messages and update the browser’s scroll position and URL:
(function () {
  // ...
  var beginMirroring = function (faye) {
    // ...
    faye.addExtension({
      incoming: function (message, callback) {
        if (message.channel === '/meta/subscribe') {
          if (message.subscription === '/scroll') {
            window.scrollTo(message.ext.x, message.ext.y);
          } else if (message.subscription === '/navigate') {
            navigateTo(message.ext.url);
          }
        }
        callback(message);
      }
    });
  };
  begin(beginControlling, beginMirroring);
}());
Enabling fullscreen
The final change was to make the proxied site appear fullscreen on the iPhone and iPad by injecting a single Apple-specific <meta> tag into each page:
# ...
class InsertTags < Struct.new(:app)
  # ...
  def add_tags(content)
    content.sub(%r{(?=</head>)}, meta_tags + script_tags)
  end
  def meta_tags
    '<meta name="apple-mobile-web-app-capable" content="yes">'
  end
  # ...
end
# ...
Further work
That’s all I did — the code is on GitHub. There were several more features I’d planned to build but ultimately didn’t need to:
- Add a caching layer: The proxy application is wasteful in its network usage, re-fetching GOV.UK pages on every request. In practice this didn’t introduce enough latency to be problematic — the GOV.UK site is very fast, and the browsing speed was limited by the museum’s internet connection rather than how fast the proxy could produce responses.
- Run a separate Faye process: Faye runs in the single Thin process that’s also running the Rack proxy, which might cause contention under load. In practice there are only three browsers talking to Faye, so this doesn’t cause any detectable slowdown.
- Run multiple Thin processes: The Rack proxy can only serve a single request at once, so throughput could be improved by running several processes and distributing load between them. Again there was no obvious need to do this; the three clients were acceptably responsive when talking to a single Thin process.
A subsequent discussion with Chris Roos made me realise that writing a Chrome extension could’ve made the controlling side easier to implement: I might have been able to listen to events through the Chrome APIs (perhaps chrome.tabs.onUpdated and/or chrome.history.onVisited?) instead of trying to do it all in-page. Ultimately the controlling browser ended up being a generic WebView inside a kiosk application anyway, so a vendor-specific extension wouldn’t have worked, but I’d investigate this option more thoroughly for any similar projects in future.
Conclusions
This project took about two days, including time spent on the initial brief and the physical installation of devices at the Design Museum.
It wouldn’t have been possible to get everything working in such a short time without several advantages:
- The GOV.UK site is well-built and makes proper use of web technologies and conventions. Its judicious use of GET requests, 302 redirects and fragment identifiers makes it straightforward to synchronise views of the site across multiple devices. Even the decision to use relative rather than absolute links made it significantly easier to get a simple tech demo running before I’d done any work on handling external URLs.
- GDS employs talented people and empowers them to move quickly and get things done with a minimum of hassle. Ben Terrett and Alexandra Bobbe were clear in their brief, quick to respond to questions, and constantly available for testing and experimentation; Paul Downey spun up an EC2 instance for this project as fast as I could email him my public key.
- Rack is a powerful abstraction with a rich ecosystem of off-the-shelf components that are easy to plug together. Although Rack::Proxy isn’t perfect, it works well enough to be adapted to this purpose, and saved me the trouble of writing any hairy Net::HTTP code. The middleware concept makes it easy to incrementally improve a bare-bones app until it does everything you need. Most significantly, a Rack app does exactly as much or as little as you tell it to, so there is zero time spent wrestling with a framework.
- Faye is an exceptionally good open source project, consisting of a high-quality implementation backed by high-quality documentation. It was effortless to install and worked first time; when I wanted to do something more advanced, its author James Coglan was immediately responsive to questions on Twitter.
It’s extremely satisfying to work with the wind at your back like this. I enjoyed this project a lot, and in future I’ll be more likely to come back to these technologies (and these people) when I want to make something fast and fun.
The 2013 Designs of the Year exhibition runs from 20th March until 7th July. Good luck, GDS — I hope you win.