Don’t worry, it’s not your fault! The web was simply designed with an undue amount of trust in, well, people.
As the web has evolved, some data points that used to be available across sites have been determined to be security and/or privacy risks, and they’ve been made harder to get to, if not impossible.
Cookies, which went through several iterations of security and visibility lockdowns, get the most publicity in this regard. But much the same has happened with the Referer HTTP header and the corresponding document.referrer property in JavaScript.
What is (or was) the Referrer?
When you click a hyperlinked <a> in a browser, the Referer header (yes, that misspelling is in the original standard, perhaps a harbinger of things to come!) tells the target URL about the source URL of the request.
So when you click to go from the URL
https://www.some.example/?some_query_param=here
to
https://some.other.example
the server running https://some.other.example sees this header:
Referer: https://www.some.example/?some_query_param=here
and the web page at https://some.other.example can read the same value in the JavaScript property document.referrer:
https://www.some.example/?some_query_param=here
Same goes for remote sub-resources fetched by a page, like images, scripts, CSS stylesheets, IFRAMEs, and Ajax requests. The server that hosts those resources sees the Referer, so it knows who was asking for them. The Referer is also sent with other methods of navigation, like posting a form or JS-based changes to document.location.
The only exception used to be when the source page ran over SSL (https://www.some.example/?some_query_param=here) while the target page did not (http://www.example.com). In that case, the Referrer was not sent at all, though the point was to preserve the security of the source page, not so much the privacy of either side. (The idea being that if the source URL was originally unreadable on the wire, it shouldn’t then be exposed as plain text to an eavesdropper.)
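The old behavior described above can be sketched in a few lines. This is only an illustration, not browser source code: the function name legacyReferrer is made up, and treating "secure" as simply "the scheme is https" is a simplification.

```javascript
// Sketch of the legacy default: send the full source URL unless the
// request goes from an HTTPS page to a non-HTTPS target.
// legacyReferrer is a hypothetical name used only for this example.
function legacyReferrer(sourceUrl, targetUrl) {
  const source = new URL(sourceUrl);
  const target = new URL(targetUrl);

  // HTTPS -> HTTP downgrade: nothing is sent at all.
  if (source.protocol === "https:" && target.protocol !== "https:") {
    return "";
  }

  // Otherwise the full source URL is sent. Browsers drop the #fragment
  // and any username/password before sending it.
  source.hash = "";
  source.username = "";
  source.password = "";
  return source.href;
}

console.log(legacyReferrer(
  "https://www.some.example/?some_query_param=here",
  "https://some.other.example"
)); // https://www.some.example/?some_query_param=here

console.log(legacyReferrer(
  "https://www.some.example/?some_query_param=here",
  "http://www.example.com"
) === ""); // true: nothing is exposed on the downgrade
```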
Moving toward a new standard
Funny thing about the Olde Referrer Days (which only officially ended in late 2020, when Chrome 85 came out!) is that with the notable exception of giant search engines, there wasn’t wide concern about the privacy implications of Site B’s owner knowing someone was just looking at mypage.html on Site A.
Yes, Google and Bing and Yahoo made some special (and non-standardized) adjustments to their code so that a target site would only see a simplified Referer/document.referrer like “https://www.google.com” instead of the full search string. This was about keeping user-entered keywords private... well, that and preserving market share by making you go to the search engine’s console to see keyword trends!
Yet the web as a whole didn’t take any such precautions. So if you ran a personal blog, any site you linked to — say, a company you were critiquing — would know someone was on your blog first. (Some wanted the source to be exposed for affiliate marketing purposes, but most people probably didn’t.)
Likewise for corporate websites, where you might link to industry news sites or what-have-you. Given a choice, you wouldn’t want to share the last page your visitors were on. If you had a sufficiently skilled developer, you could do what search engines do, bouncing people off an interstitial page so only your domain (not path and query string) would be shown. But most companies didn’t do that. So in practice, most Referrers were being shared.
Referrer-Policy enters the chat
Things started to change when the W3C introduced the Referrer-Policy header (and a companion <meta> tag that has the same function). Referrer-Policy lets the source site determine exactly what will be shown in the Referer header when connecting to target sites: the full URL including query string, just the origin (“https://www.example.com”), or perhaps nothing at all.
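As a rough sketch of how a source site might opt in, here is one way to emit the header from a Node-style server, plus the equivalent <meta> tag. The helper names setReferrerPolicy and referrerMetaTag are hypothetical; the policy tokens are the eight defined by the Referrer Policy spec.

```javascript
// The policy tokens defined by the Referrer Policy spec.
const REFERRER_POLICIES = new Set([
  "no-referrer",
  "no-referrer-when-downgrade",
  "same-origin",
  "origin",
  "strict-origin",
  "origin-when-cross-origin",
  "strict-origin-when-cross-origin",
  "unsafe-url",
]);

// Hypothetical helper: reject typos so a misspelled policy fails loudly.
function assertKnownPolicy(policy) {
  if (!REFERRER_POLICIES.has(policy)) {
    throw new Error(`Unknown referrer policy: ${policy}`);
  }
}

// Hypothetical helper: set the response header on a Node-style response
// object (anything with a setHeader(name, value) method).
function setReferrerPolicy(res, policy) {
  assertKnownPolicy(policy);
  res.setHeader("Referrer-Policy", policy);
}

// Hypothetical helper: the companion <meta> tag for the page <head>.
function referrerMetaTag(policy) {
  assertKnownPolicy(policy);
  return `<meta name="referrer" content="${policy}">`;
}

// Example wiring with Node's built-in http module (commented out so the
// snippet does not start a server when pasted into a console):
// const http = require("http");
// http.createServer((req, res) => {
//   setReferrerPolicy(res, "origin");
//   res.end("ok");
// }).listen(8080);

console.log(referrerMetaTag("origin"));
// <meta name="referrer" content="origin">
```

"origin" corresponds to the "just the origin" choice above, "unsafe-url" to the full URL, and "no-referrer" to nothing at all.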
This feature has been supported in some form in all browsers released since 2012, which is impressive. Getting your site set up to support IE 11, original Edge 14-18, and Safari 11 is trickier than just focusing on later browsers, though, so in practice it’s more for Chromium Edge/Chrome/Firefox/Safari 12+.
However, prior to late 2020, you still needed to deliberately enable the header or <meta> tag if you didn’t want to reveal the full source URL. If you didn’t do anything special, the default behavior would be up to the browser, and the browsers mostly sent the full URL like the old days (the no-referrer-when-downgrade option, which differs from unsafe-url only in sending nothing on an HTTPS-to-HTTP downgrade).
The first somewhat-harder-core exception was Safari, which starting from 13.3 forcibly sends only “https://www.example.com” to cross-origin subresources (CSS, JS, images, IFRAMEs) even if the site wants to send the full URL. But that didn’t affect marketing attribution efforts, which only deal with the journey between main documents.
But with Chrome 85, the default changed quite drastically. And since a plurality, if not clear majority, of your visitors are using Chrome, you’re gonna notice.
Chrome’s current default since last July is strict-origin-when-cross-origin. This means that when the sites are on different origins (crucial note: https://www.example.com and https://pages.example.com do not have the same origin!), the target site will only see the Referrer “https://www.example.com” unless the source site is specifically configured to reveal more of its own visitor data.
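To make the rule concrete, here is a small sketch of what strict-origin-when-cross-origin yields for a given source and target. The function name is made up and the HTTPS check is simplified. One detail worth knowing: browsers serialize an origin-only Referrer with a trailing slash.

```javascript
// Sketch of strict-origin-when-cross-origin. The function name is
// hypothetical; "secure" is simplified to "scheme is https".
function strictOriginWhenCrossOrigin(sourceUrl, targetUrl) {
  const source = new URL(sourceUrl);
  const target = new URL(targetUrl);

  // Same origin means scheme, host, and port all match: full URL is sent
  // (minus the #fragment and any credentials).
  if (source.origin === target.origin) {
    source.hash = "";
    source.username = "";
    source.password = "";
    return source.href;
  }

  // Cross-origin HTTPS -> HTTP downgrade: nothing is sent.
  if (source.protocol === "https:" && target.protocol !== "https:") {
    return "";
  }

  // Any other cross-origin request: origin only, no path or query string.
  return source.origin + "/";
}

// A different subdomain is a different origin, so the query string is lost.
console.log(strictOriginWhenCrossOrigin(
  "https://www.example.com/landing?utm_source=partner",
  "https://pages.example.com/form"
)); // https://www.example.com/

// Same origin: the full URL survives.
console.log(strictOriginWhenCrossOrigin(
  "https://www.example.com/landing?utm_source=partner",
  "https://www.example.com/form"
)); // https://www.example.com/landing?utm_source=partner
```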
Why it matters for Marketo
The reason this all matters is you may have a Hidden field that Autofills from a Referrer Parameter:
In Marketo-speak, “Referrer Parameter” means a query parameter in the Referrer URL. That is, the document.referrer value is parsed and its name=value&name2=value2 pairs are made available.
But that only works when:
- the Referrer is passed at all and
- the Referrer includes the full source URL
With Chrome’s newer default behavior, you will only see the origin of the source URL, which never has a query string regardless of whether the previous pageview had a query string.
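The effect on a Referrer Parameter lookup can be sketched like this. It is not Marketo’s actual implementation, just an illustration of the same idea; getReferrerParam is a hypothetical name, and in a browser you would pass document.referrer as the first argument.

```javascript
// Hypothetical helper: look up one query parameter in a Referrer value.
// Returns null when there is no Referrer, when it is not a valid URL,
// or when the parameter is not present.
function getReferrerParam(referrer, name) {
  if (!referrer) return null; // the Referrer was not passed at all

  let url;
  try {
    url = new URL(referrer);
  } catch {
    return null; // not a parseable absolute URL
  }
  return url.searchParams.get(name);
}

// Full source URL: the parameter is available.
console.log(getReferrerParam(
  "https://www.some.example/?some_query_param=here",
  "some_query_param"
)); // here

// Origin-only Referrer (the newer default across origins): nothing to parse.
console.log(getReferrerParam(
  "https://www.some.example/",
  "some_query_param"
)); // null

// No Referrer at all.
console.log(getReferrerParam("", "some_query_param")); // null
```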
So a setup like having an ad partner drive UTM-tagged traffic to their site, and then link (not UTM-tagged) from there to your site, will not work unless you coordinate with the partner to make sure they use the Referrer-Policy feature.
When you can’t “replay” an earlier touch across origins using an explicit Referrer-Policy, the data must instead be passed to your site in the URL or — only for origins that happen to be under the same parent private domain — in a cookie.
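Both alternatives can be sketched briefly. The helper names (carryParams, sharedCookie) and the partner.example domain are made up for illustration, and the cookie attributes shown are one reasonable choice, not a requirement.

```javascript
// Hypothetical helper: copy selected query parameters from the current
// page URL onto an outbound link, so the target site gets them in its
// own URL instead of relying on the Referrer.
function carryParams(currentUrl, linkUrl, names) {
  const current = new URL(currentUrl);
  const link = new URL(linkUrl);
  for (const name of names) {
    const value = current.searchParams.get(name);
    // Do not overwrite a value the link already carries.
    if (value !== null && !link.searchParams.has(name)) {
      link.searchParams.set(name, value);
    }
  }
  return link.href;
}

// Hypothetical helper: build a cookie string readable by every subdomain
// of parentDomain (e.g. www.example.com and pages.example.com).
// In a browser: document.cookie = sharedCookie("utm_source", "partner", "example.com");
function sharedCookie(name, value, parentDomain) {
  return `${encodeURIComponent(name)}=${encodeURIComponent(value)}` +
    `; Domain=${parentDomain}; Path=/; Secure; SameSite=Lax`;
}

console.log(carryParams(
  "https://partner.example/landing?utm_source=partner&utm_campaign=spring",
  "https://www.example.com/offer",
  ["utm_source", "utm_campaign"]
)); // https://www.example.com/offer?utm_source=partner&utm_campaign=spring

console.log(sharedCookie("utm_source", "partner", "example.com"));
// utm_source=partner; Domain=example.com; Path=/; Secure; SameSite=Lax
```

The URL approach needs the partner’s cooperation (they decorate the link), while the cookie approach only helps between sites you control under one registrable domain.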