Don’t worry, it’s not your fault! The web was simply designed with an undue amount of trust in, well, people.
As the web has evolved, some data points that used to be available across sites have been determined to be security and/or privacy risks, and they’ve been made harder to get to, if not impossible.
Cookies — which went through several iterations of security and visibility lockdowns — get the most publicity in this regard. But much the same has happened with the
Referer HTTP header and the corresponding
What is (or was) the Referrer?
When you click a hyperlinked <a> in a browser, the Referer header (yes, that misspelling is in the original standard, perhaps a harbinger of things to come!) tells the target URL about the source URL of the request.
So when you click to go from the URL
the server running
https://some.other.example sees this header:
and the web page at
Same goes for remote sub-resources fetched by a page, like images, scripts, CSS stylesheets, IFRAMEs, and Ajax requests. The server that hosts those resources sees the Referer, so it knows which page was asking for them. The Referer is also sent with other methods of navigation, like posting a form or JS-based changes to
The only exception used to be when the source page ran over SSL/TLS (https://www.some.example/?some_query_param=here) while the target page did not (http://www.example.com). In that case, the Referrer was not sent at all. That was meant to preserve the security of the source page, though, not so much the privacy of either side. (The idea being that if the source URL was unreadable on the wire, it shouldn’t then be exposed as plain text to an eavesdropper.)
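To make that old default concrete, here is a minimal sketch of the rule in TypeScript. The function name legacyReferer is made up for illustration, and the HTTPS-to-HTTP check is a simplification of what browsers actually do; the Referrer-Policy spec later gave this behavior the name no-referrer-when-downgrade.

```typescript
// Hypothetical helper (not a browser API): approximates the legacy default,
// which the Referrer-Policy spec calls "no-referrer-when-downgrade".
function legacyReferer(sourceUrl: string, targetUrl: string): string | null {
  const source = new URL(sourceUrl);
  const target = new URL(targetUrl);

  // Secure source, insecure target: send nothing at all.
  if (source.protocol === "https:" && target.protocol === "http:") {
    return null;
  }

  // Otherwise send the full source URL. Fragments and credentials
  // are never included in a Referer value.
  source.hash = "";
  source.username = "";
  source.password = "";
  return source.href;
}

// Downgrade: nothing is sent.
const downgraded = legacyReferer(
  "https://www.some.example/?some_query_param=here",
  "http://www.example.com"
);

// Secure to secure: the full URL, query string included.
const full = legacyReferer(
  "https://www.some.example/?some_query_param=here",
  "https://some.other.example/"
);
```

Here downgraded is null and full is the complete source URL, query string and all.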
Moving toward a new standard
Funny thing about the Olde Referrer Days (which only officially ended in late 2020, when Chrome 85 came out!) is that with the notable exception of giant search engines, there wasn’t wide concern about the privacy implications of Site B’s owner knowing someone was just looking at mypage.html on Site A.
Yes, Google and Bing and Yahoo made some special (and non-standardized) adjustments to their code so that a target site would only see a simplified “https://www.google.com” instead of the full search string. This was about keeping user-entered keywords private... well, that and preserving market share by making you go to the search engine’s console to see keyword trends!
Yet the web as a whole didn’t take any such precautions. So if you ran a personal blog, any site you linked to — say, a company you were critiquing — would know someone was on your blog first. (Some wanted the source to be exposed for affiliate marketing purposes, but most people probably didn’t.)
Likewise for corporate websites, where you might link to industry news sites or what-have-you. Given a choice, you wouldn’t want to share the last page your visitors were on. If you had a sufficiently skilled developer, you could do what search engines do, bouncing people off an interstitial page so only your domain (not path and query string) would be shown. But most companies didn’t do that. So in practice, most Referrers were being shared.
Referrer-Policy enters the chat
Things started to change when the W3C introduced the Referrer-Policy header (and a companion <meta> tag that has the same function). Referrer-Policy lets the source site determine exactly what will be shown in the Referer header when connecting to target sites: the full URL including query string, just the origin (“https://www.example.com”), or perhaps nothing at all.
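As a rough illustration of those three outcomes, the sketch below maps three of the standard policy values (unsafe-url, origin, no-referrer) to what the target would see. The function refererUnder and the sample URL are invented for this example; real browsers support more policy values than the three shown.

```typescript
// Three of the standard Referrer-Policy values, covering the three outcomes:
// full URL, origin only, nothing.
type BasicPolicy = "unsafe-url" | "origin" | "no-referrer";

// Hypothetical helper (not a browser API): what the Referer header would
// contain for a given policy and source page URL.
function refererUnder(policy: BasicPolicy, sourceUrl: string): string | null {
  const source = new URL(sourceUrl);
  if (policy === "unsafe-url") {
    source.hash = ""; // fragments are never sent
    return source.href; // full URL, query string included
  }
  if (policy === "origin") {
    // Browsers send the origin with a trailing slash as the path.
    return source.origin + "/";
  }
  return null; // "no-referrer": the header is omitted entirely
}

const page = "https://www.example.com/mypage.html?utm_source=newsletter";
const asFullUrl = refererUnder("unsafe-url", page);
const asOrigin = refererUnder("origin", page);
const asNothing = refererUnder("no-referrer", page);
```

With this input, asFullUrl is the whole URL, asOrigin is "https://www.example.com/", and asNothing is null. On a real site you would pick one of these values in the Referrer-Policy response header or in a <meta name="referrer"> tag rather than compute anything yourself.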
This feature has been supported in some form in all browsers released since 2012, which is impressive. Getting your site set up to support IE 11, original Edge 14-18, and Safari 11 is trickier than just focusing on later browsers, though, so in practice it’s more for Chromium Edge/Chrome/Firefox/Safari 12+.
However, prior to late 2020, you still needed to deliberately enable the header or <meta> tag if you didn’t want to reveal the full source URL. If you didn’t do anything special, the default behavior would be up to the browser, and the browsers mostly sent the full URL like the old days (the no-referrer-when-downgrade option, which behaves like the unsafe-url option except on HTTPS-to-HTTP downgrades).
The first somewhat-harder-core exception was Safari, which, starting from 13.3, forcibly sends only “https://www.example.com” to cross-origin subresources (CSS, JS, images, IFRAMEs) even if the site wants to send the full URL. But that didn’t affect marketing attribution efforts, which only deal with the journey between main documents.
But with Chrome 85, the default changed quite drastically. And since a plurality, if not clear majority, of your visitors are using Chrome, you’re gonna notice.
Chrome’s current default since last July is strict-origin-when-cross-origin. This means that unless the source site is specifically configured to reveal more of its own visitor data, whenever the sites are on different origins (crucial note:
https://pages.example.com do not have the same origin!) the target site will only see the Referrer
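A minimal sketch of strict-origin-when-cross-origin, assuming a simplified notion of "downgrade" (HTTPS source, HTTP target). The function name defaultReferer and the page paths are hypothetical; the origin comparison uses the standard URL API, where scheme, host, and port must all match.

```typescript
// Hypothetical helper (not a browser API): approximates what a target sees
// under "strict-origin-when-cross-origin".
function defaultReferer(sourceUrl: string, targetUrl: string): string | null {
  const source = new URL(sourceUrl);
  const target = new URL(targetUrl);

  // "strict": nothing is sent on an HTTPS -> HTTP downgrade.
  if (source.protocol === "https:" && target.protocol === "http:") {
    return null;
  }

  // Same origin: the full URL (minus fragment) is still sent.
  if (source.origin === target.origin) {
    source.hash = "";
    return source.href;
  }

  // Cross-origin: only the origin, so path and query string are dropped.
  return source.origin + "/";
}

// A subdomain is a different origin, so the query string is lost.
const crossOrigin = defaultReferer(
  "https://pages.example.com/lp.html?utm_campaign=spring",
  "https://www.example.com/"
);

// Same origin: the query string survives.
const sameOrigin = defaultReferer(
  "https://www.example.com/a.html?utm_campaign=spring",
  "https://www.example.com/b.html"
);
```

Here crossOrigin is "https://pages.example.com/" while sameOrigin keeps the full URL with utm_campaign intact.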
Why it matters for Marketo
The reason this all matters is you may have a Hidden field that Autofills from a Referrer Parameter:
In Marketo-speak, “Referrer Parameter” means a query parameter in the Referrer URL. That is, the document.referrer value is parsed and its name=value&name2=value2 pairs are made available.
But that only works when:
- the Referrer is passed at all and
- the Referrer includes the full source URL
With Chrome’s newer default behavior, you will only see the origin of the source URL, which never has a query string regardless of whether the previous pageview had a query string.
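The effect is easy to see in a small sketch. This is not Marketo’s actual code, just an illustration of parsing a referrer string into name/value pairs with the standard URL API; referrerParams and the partner.example URLs are made up. In a browser you would pass in document.referrer.

```typescript
// Hypothetical helper: parse the query parameters out of a referrer string.
// Returns an empty object when there is no referrer or no query string.
function referrerParams(referrer: string): Record<string, string> {
  const params: Record<string, string> = {};
  if (!referrer) {
    return params; // no Referrer was passed at all
  }
  let url: URL;
  try {
    url = new URL(referrer);
  } catch (_err) {
    return params; // not a parseable absolute URL
  }
  url.searchParams.forEach((value, name) => {
    params[name] = value; // a repeated name keeps its last value
  });
  return params;
}

// Full source URL passed: the UTM values are available.
const fromFullUrl = referrerParams(
  "https://partner.example/landing?utm_source=ad&utm_medium=cpc"
);

// Origin-only Referrer (the newer default across origins): nothing to read.
const fromOriginOnly = referrerParams("https://partner.example/");
```

fromFullUrl contains utm_source and utm_medium, while fromOriginOnly is an empty object, which is why the Hidden field ends up with nothing to autofill.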
So a setup where an ad partner drives UTM-tagged traffic to their own site, which then links (without UTM tags) to your site, will not work unless you coordinate with the partner to make sure they use the Referrer-Policy feature.
When you can’t “replay” an earlier touch across origins using an explicit Referrer-Policy, the data must instead be passed to your site in the URL or — only for origins that happen to be under the same parent private domain — in a cookie.