Stop using direct download links (unless you want to lose tracking!)

Update 2017-03-02: This post originally showed code that didn't work in IE (bug in Microsoft's ILocation implementation). I'd fixed the code on the demo page but failed to update it below in the post itself. Make sure to get the latest!

Sorry for the publishing gap, guys! Got some really good drafts ready, though.

Today, let's explore a questionable call made by… well, probably by every single Marketo user, ever: sending direct links to PDFs or other downloadable collateral.

I think most people understand that downloadables, on their own, can't associate web sessions with known leads: PDFs don't run Munchkin, nor do Excel spreadsheets. So if someone clicks from an email straight to your blahblah.pdf, their web session will still be anonymous unless it's been associated via other means.

Yet most people send direct download links anyway, the thought process presumably being: "They filled out a form to trigger my auto-responder campaign, so their web session was already associated."

Not. So. Fast!

What about a lead who submitted the form on their desktop, then checked mail on their phone and clicked the link (or, obviously, vice versa)? That person is going to be anonymous in one of those two places, so analysis of their cross-device click path will be impossible.

Sure, you'll see that they clicked the email in both cases (Clicked Email events are logged by the tracking server before redirecting to the PDF, and they do not require Munchkin) but you won't see any other part of their click path: no past or future web activities to analyze.

And what if the direct link wasn't in an email, but in a social post like Marketo user LH asked about here? Even if that lead filled out a form before on that device, you'll never know they downloaded the asset. And if they fill out a form later, you won't be able to look back and see they downloaded the asset, since there's no web activity recorded.

Overall, pretty sucky, huh? We want all the known lead tracking we can get, and we're clearly throwing some away unnecessarily.

So rather than sending people to the PDF directly, we want to make sure Munchkin gets a chance to run first. This is not primarily to log a Visited Web Page for the doc itself (though that's nice to have, too) but to make sure that past and future web activities in that browser can be associated with the lead.

We do this via a redirector page, which is a single barebones landing page, with no body content necessary. You use the same redirector page for all your assets. The page logs a hit, then redirects, using data passed in the URL to determine the “next hop” for the lead (that is, the next URL is passed inside the first URL).
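To make the "next URL inside the first URL" idea concrete, here's a minimal sketch of pulling the next hop out of a redirector URL. The function name nextHopFromUrl and the example.com URLs are placeholders of mine, not anything Marketo provides. In the browser you'd read the same value from document.location.hash.

```javascript
// Hypothetical helper: returns everything after the first "#" in a URL,
// or an empty string when the URL carries no fragment at all.
function nextHopFromUrl(pageUrl) {
  var hashAt = pageUrl.indexOf('#');
  return hashAt === -1 ? '' : pageUrl.substring(hashAt + 1);
}

// Placeholder URLs, for illustration only
var nextHop = nextHopFromUrl(
  'http://pages.example.com/redirector#http://example.com/redirected.pdf'
);
console.log(nextHop);
```

An empty result is the redirector's cue to show an error rather than redirect anywhere.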

Ideally, the redirect happens immediately after the hit is logged — no sooner and no later — for a superior user experience. But unfortunately, Munchkin doesn't offer a feature that's comparable to Google's hitCallback, so we can't get that level of control.

(Here at TEKNKL, we use a custom Munchkin library called Munchkin Enhanced which adds this feature to Munchkin as well as some other reliability and performance enhancements, but it's not out there for the general public. Incidentally, not using hitCallback is the reason many people's Marketo-to-GA connections are badly broken.)

Without a callback feature, we can still do the next-best thing: insert a short delay before redirecting. We choose a duration during which the lion's share of (though not, to restate, all) Munchkin library loads + hits can be expected to complete. Accordingly, in the code below, the delay is set to 3.5 seconds. That's long enough for the vast majority of cases, but in your little journal of assumptions and risks (you do have one, don't you?) you should note that it's not 100% coverage. It's still a heckuva lot better than 0% coverage!

The recipe

The redirector page doesn't have to be a Marketo LP (it could be on your corp site, so long as that shares the same parent domain) but hey, why not?

Wherever you put it, it works the same way.

Disable the automatic Munchkin at the LP level (obviously, we are using Munchkin, but I like to have more control) and insert this into the <head>:

<script type="text/javascript">
  document.write(unescape("%3Cscript src='//munchkin.marketo.net/munchkin-beta.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script>
  Munchkin.init('111-222-333'); // your Munchkin ID and options, obviously!
</script>
<script>
  (function(redirectTarget){
    var allowedOrigins = [
         'http://pages.example.com',
         'http://www.example.com',
         'http://example.com'
        ], // which domains are allowed for redirection
        redirectMs = 3500, // how long before redirecting  
        progressMs = 500,  // how long between updates of the "progress meter"
        progressChar = '&bull;', // progress character (HTML bullet)
        errNoAsset = 'Asset URL not found.', // message when no asset in hash
        errInvalidAsset = 'Asset URL not allowed.', // when asset not our domain
        progress = setInterval(function(){
          if (redirectTarget) {
            document.body.insertAdjacentHTML('beforeend',progressChar);
          } else {
            clearInterval(progress);
            clearTimeout(redirect);
            document.body.insertAdjacentHTML('beforeend',errNoAsset);          
          }
        }, progressMs),
        redirect = setTimeout(function(){         
            var redirectLoc = document.createElement('a');
            redirectLoc.href = redirectTarget;
            // Some browsers (notably IE) don't expose .origin on anchors, so
            // build it by hand there, omitting the port when it's the default
            var redirectOrigin = redirectLoc.origin ||
                [redirectLoc.protocol,
                  '//',
                  redirectLoc.hostname,
                  ['http:','http:80','https:','https:443']
                    .indexOf(redirectLoc.protocol+redirectLoc.port) != -1
                      ? ''
                      : ':' + redirectLoc.port
                ].join('');

            clearInterval(progress);
            if (allowedOrigins.indexOf(redirectOrigin) != -1) {
              document.location.href = redirectTarget;
            } else {
              document.body.insertAdjacentHTML('beforeend',errInvalidAsset);                   
            }
        }, redirectMs);
    })(document.location.hash.substring(1));
</script>

As you'll see when you try it, I did add a little filigree (the ticking “progress” bullets don't have any technical significance) but it's otherwise totally basic.

Now, to build a download link, bounce it off the redirector page by including the asset URL after the document hash (the # in the URL):

http://pages.example.com/redirector#http://example.com/redirected.pdf
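If you're generating a lot of these links, a tiny helper saves typos. This is just a sketch: buildRedirectorLink is a name I made up, and the URLs are the same placeholders used above.

```javascript
// Hypothetical helper: appends the asset URL to the redirector URL as the
// document hash. Any fragment already on the redirector URL is dropped first.
function buildRedirectorLink(redirectorUrl, assetUrl) {
  var base = String(redirectorUrl).split('#')[0];
  return base + '#' + assetUrl;
}

var downloadLink = buildRedirectorLink(
  'http://pages.example.com/redirector',
  'http://example.com/redirected.pdf'
);
console.log(downloadLink);
```

Remember the asset URL still has to be on one of the allowed origins, or the redirector will refuse it.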

And triggering on downloads in Marketo is as simple as:

Enjoy!

Addendum: why the origin restriction?

Because an unrestricted redirector is an open redirect. You don't want to let anyone use your redirector page to bounce people onto their own site, under cover of a URL starting with http://toteslegitbusiness.com. So it's locked down by default, and you whitelist domains allowed to serve up assets. This is a common oversight!
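Here's the origin check on its own, as a rough sketch you can poke at outside the page. One assumption to flag loudly: this version leans on the standard URL constructor, which IE doesn't have, so it's for modern browsers or Node only and isn't a drop-in replacement for the anchor-element approach in the redirector script. The isAllowedTarget name and the evil.example.net host are made up for illustration.

```javascript
// Same allowlist shape as in the redirector script
var allowedOrigins = [
  'http://pages.example.com',
  'http://www.example.com',
  'http://example.com'
];

// Hypothetical helper: true only when the target is an absolute URL whose
// origin (scheme + host + non-default port) is on the allowlist.
function isAllowedTarget(target, origins) {
  var parsed;
  try {
    parsed = new URL(target); // throws on relative or malformed URLs
  } catch (e) {
    return false;
  }
  return origins.indexOf(parsed.origin) !== -1;
}

console.log(isAllowedTarget('http://example.com/redirected.pdf', allowedOrigins));
console.log(isAllowedTarget('http://evil.example.net/redirected.pdf', allowedOrigins));
```

Note that the match is on the whole origin, not a substring, so a lookalike host such as example.com.evil.example.net doesn't slip through, and an https:// URL is refused unless you list the https origin too.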

Addendum #2: Why use document.location.hash?

More on this in the next post.