Redirector pages: why #hash instead of ?query

This was originally a P.S. in the redirector page post, but I figured it deserved a post of its own.

In other redirector page tutorials found around the net, the query string is used to pass the asset URL, like:

http://example.com/redirector.html?http://example.com/my.pdf

My redirector code, by contrast, puts the asset URL after the hash (#):

http://example.com/redirector.html#http://example.com/my.pdf

You can build a working redirector page using either method, but my use of the hash was a deliberate choice for both performance and usability.
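To make "either method" concrete, here is a minimal sketch of how a redirector might pull the asset URL out of its own address under both conventions. The function name assetUrlFrom is hypothetical, not taken from my redirector code. In a browser you would more likely read location.hash or location.search directly, but the standard URL class is used here so the snippet also runs outside a page.

```typescript
// Hypothetical helper: extract the asset URL from a redirector URL,
// whichever of the two conventions was used to pass it.
function assetUrlFrom(redirectorUrl: string): string | null {
  const url = new URL(redirectorUrl);

  // Hash style: url.hash includes the leading "#", so drop it.
  if (url.hash.length > 1) {
    return url.hash.slice(1);
  }

  // Query style: url.search includes the leading "?", so drop it.
  if (url.search.length > 1) {
    return url.search.slice(1);
  }

  // Neither part is present: nothing to redirect to.
  return null;
}

console.log(assetUrlFrom("http://example.com/redirector.html#http://example.com/my.pdf"));
console.log(assetUrlFrom("http://example.com/redirector.html?http://example.com/my.pdf"));
```

Both calls yield the same asset URL, which is why the choice between the two comes down to side effects rather than feasibility.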

The performance side

It's about caching. Browsers (always) and CDNs and proxies (usually) look at the URL up to, but not including, the hash* to decide whether they have a cached version to use instead of fetching the full page from the origin server.

That is, /redirector.html?http://example.com/my.pdf and /redirector.html?http://example.com/my-other-asset.pdf are separate URLs, separately downloaded. Even if the first one is cached for quick retrieval, the browser will have to fetch the second one from scratch. This is true even though the only real difference between them is the next hop to the asset URL.

In contrast, /redirector.html#http://example.com/my.pdf and /redirector.html#http://example.com/my-other-asset.pdf are the same URL for caching purposes. If a lead has visited either one of these before, their browser will use the cached copy for the other (the same copy of the redirector page, that is, not the same copy of the asset!).
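The difference can be sketched by modeling a cache key as "the URL with the hash removed". This is a simplification: cacheKey is a hypothetical name, and real caches may build their keys differently (some CDNs can be configured to ignore query strings, for instance). It is only meant to show why the two styles behave differently.

```typescript
// Hypothetical model of a cache key: the URL minus its hash.
function cacheKey(rawUrl: string): string {
  const url = new URL(rawUrl);
  url.hash = ""; // the hash is not part of what gets requested
  return url.href;
}

const base = "http://example.com/redirector.html";

// Query style: each asset produces its own key.
const queryKeys = new Set([
  cacheKey(base + "?http://example.com/my.pdf"),
  cacheKey(base + "?http://example.com/my-other-asset.pdf"),
]);

// Hash style: every asset shares the redirector page's key.
const hashKeys = new Set([
  cacheKey(base + "#http://example.com/my.pdf"),
  cacheKey(base + "#http://example.com/my-other-asset.pdf"),
]);

console.log(queryKeys.size); // 2 distinct cache entries
console.log(hashKeys.size);  // 1 shared cache entry
```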

Granted, a stripped-down page like the redirector page should load (the main document HTML, that is) in a half-second or less, even without caching. But wouldn't it be better to save those 500ms, since we're imposing a delay before redirecting? That gives us 500ms more wiggle room while we wait for Munchkin to complete.

Caching the redirector doc is indisputably A Good Thing, since the HTML is the same regardless of what asset they're downloading. (You can use this same approach to reduce server traffic with other projects, by the way.)

The usability side

It's about Marketo. In Marketo-land, the Web Page in most parts of the UI means the protocol + hostname + pathname + hash, but not the query string.

That is, if the URL in the browser is http://example.com/path/to/document.html?queryparam=value&queryparam2=value2#hash_stuff, Marketo considers the page to be http://example.com/path/to/document.html#hash_stuff, with the query string stored separately.
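As an illustration of that split (and only an illustration: this is not Marketo's code, and splitWebPage and its field names are made up for this sketch), the same breakdown can be reproduced with the standard URL class:

```typescript
// Hypothetical shape for the two pieces described above.
interface WebPageSplit {
  webPage: string;     // protocol + hostname + pathname + hash
  queryString: string; // kept separately, without the leading "?"
}

function splitWebPage(rawUrl: string): WebPageSplit {
  const url = new URL(rawUrl);
  return {
    // origin is protocol + hostname (plus the port, if one is given)
    webPage: url.origin + url.pathname + url.hash,
    queryString: url.search.slice(1),
  };
}

const parts = splitWebPage(
  "http://example.com/path/to/document.html?queryparam=value&queryparam2=value2#hash_stuff"
);
console.log(parts.webPage);     // http://example.com/path/to/document.html#hash_stuff
console.log(parts.queryString); // queryparam=value&queryparam2=value2
```

With the asset URL in the hash, it lands in the webPage half of the split; with the query style it would land in queryString instead.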

So if you want to filter or trigger on Web Page without having to manage Query String constraints as well, having the asset URL be in the hash is a good call.

Notes

* Technically, CDNs and proxies never even see the hash, since the browser doesn't send it to the server as part of the request, but we can say it's effectively ignored.