Hashing or encrypting Marketo data for export (using FlowBoost)

It’s shocking that more users don’t ask about hashing email addresses when sending Marketo data to 3rd parties. “Sending” includes not just exported CSVs, but also links to external sites where the email address is in the query string.

After all, for privacy reasons, some analytics and advertising tools only allow hashed emails! How are people complying with that, I wonder?

(Perhaps by leaving the data out entirely so all results are anonymous? Or by using Excel to generate hashes and reimporting? Ugh on both counts.)

30-second hashing rundown

If you’re unaware, hashing creates a one-way “scrambled” version of a string. One-way means you can’t derive the original value from the hash: it can’t be decoded. But, critically, the same value always results in the same hash.

For example, the email address

sandy@example.com

always results in the SHA-256 hash

554cee57f2f634d933980c8fb57af0998e41b6437e7e45412dc86f700a9ce008

That relationship is guaranteed in any context. You can’t turn the hash back into an email address. But if you have two hashes from different sources, you can see if the hashes are equal — meaning the email addresses were also equal — without knowing the original value(s).
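To make the "same value, same hash" idea concrete, here is a minimal sketch in plain Node.js. It uses Node's built-in crypto module rather than FlowBoost, and the helper name sha256Hex and the second address are made up for illustration.

```javascript
const crypto = require('crypto');

// Hex-encoded SHA-256 digest of a UTF-8 string.
function sha256Hex(value) {
  return crypto.createHash('sha256').update(value, 'utf8').digest('hex');
}

// Two systems hash the same address independently...
const fromSourceA = sha256Hex('sandy@example.com');
const fromSourceB = sha256Hex('sandy@example.com');
// ...and a third hashes a different (hypothetical) address.
const someoneElse = sha256Hex('terry@example.com');

console.log(fromSourceA === fromSourceB); // true
console.log(fromSourceA === someoneElse); // false
console.log(fromSourceA.length);          // 64 (hex characters)
```

Neither side needs to see the other's plain-text address to learn that the first two values match.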

(In essence, that’s how passwords are stored and compared on web apps. And it’s why an app, a properly engineered one at least, lets you reset your password but cannot send you your old one. It doesn’t have the old one; it only has its hash![1])

3rd parties prefer hashes

Using hashes can satisfy partners, number crunchers, end users, and you.

An ad partner might send only to people who have already opted in with them, but they don’t want to give you their whole database, and you don’t want to give them yours. Exchanging hashes lets each side find the overlap while keeping everyone honest.

An analytics app can measure engagement keyed on the hash, without storing PII in the cloud. Yet you can export from the app and re-join to the leads in your own system by matching on the same hash.

Some 3rd-party preference centers use hashes so no PII ever appears in the URL.
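The "re-join" step from the analytics scenario might look roughly like this. It is only a sketch in plain Node.js: the lead records, the exported rows, and the field names (emailHash, pageViews) are all hypothetical, and a real export would come from a file rather than an inline array.

```javascript
const crypto = require('crypto');

// Hex-encoded SHA-256 digest of a UTF-8 string.
function sha256Hex(value) {
  return crypto.createHash('sha256').update(value, 'utf8').digest('hex');
}

// Hypothetical leads from your own system (these still have the plain email).
const leads = [
  { id: 101, email: 'sandy@example.com' },
  { id: 102, email: 'terry@example.com' }
];

// Hypothetical export from the analytics app: hashes only, no PII.
const exportedRows = [
  { emailHash: sha256Hex('sandy@example.com'), pageViews: 12 },
  { emailHash: sha256Hex('unknown@example.net'), pageViews: 3 }
];

// Index leads by the hash of their lowercased email, then look up each row.
function joinByEmailHash(leadList, rows) {
  const leadsByHash = new Map(
    leadList.map(lead => [sha256Hex(lead.email.toLowerCase()), lead])
  );
  return rows
    .filter(row => leadsByHash.has(row.emailHash))
    .map(row => ({ leadId: leadsByHash.get(row.emailHash).id, pageViews: row.pageViews }));
}

const joined = joinByEmailHash(leads, exportedRows);
console.log(joined); // [ { leadId: 101, pageViews: 12 } ]
```

Rows whose hash doesn't match any lead simply drop out, which is the point: the app never had to hold an address for the join to work on your side.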

Anyway, hashes are great.

FlowBoost lets you easily compute hashes because it includes the gold-standard JS library CryptoJS.

Computing SHA-256 hashes

In FlowBoost, CryptoJS is offered as FBUtil.crypto, and every sample you see on the web is easily used in FlowBoost.

To get the SHA-256 hash of the (lowercase) Email Address field:

lcEmail = {{lead.Email Address}}.toLowerCase();
hashedEmail = FBUtil.crypto.SHA256(lcEmail).toString();

Map hashedEmail to a custom String field in the webhook’s Response Mapping and you’ve got something suitable for 3rd-party use.
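Why bother lowercasing first? Because a hash only matches if the input is byte-for-byte identical. The sketch below shows the effect in plain Node.js (built-in crypto, not FlowBoost). The trim() call is my own addition and an assumption; check what normalization your partner actually expects before copying it.

```javascript
const crypto = require('crypto');

// Hash the value exactly as given.
function hashRaw(value) {
  return crypto.createHash('sha256').update(value, 'utf8').digest('hex');
}

// Normalize first (trim whitespace, lowercase), then hash.
// trim() is an assumption here; the original snippet only lowercases.
function hashEmail(rawEmail) {
  return hashRaw(rawEmail.trim().toLowerCase());
}

const typed = '  Sandy@Example.COM ';
const canonical = 'sandy@example.com';

console.log(hashRaw(typed) === hashRaw(canonical));     // false
console.log(hashEmail(typed) === hashEmail(canonical)); // true
```

Both sides have to apply the same normalization, or their hashes for the same person will never line up.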

One step further: Encryption

If you’re extra paranoid you might not even want hashes being sent around. Say I’m running a spear phishing attack against a known address, sandy@example.com. I could be on the lookout for the hash 554cee57f2f634d933980c8... in a trove of stolen data, knowing it represents the same address. (Obviously rare but hey, you never know.)
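To see why an unsalted hash of a known address isn't really a secret, here is a small sketch in plain Node.js. The "leaked" rows and their field names are invented for illustration.

```javascript
const crypto = require('crypto');

// Hex-encoded SHA-256 digest of a UTF-8 string.
function sha256Hex(value) {
  return crypto.createHash('sha256').update(value, 'utf8').digest('hex');
}

// Anyone who already knows the address can compute its hash up front.
const targetHash = sha256Hex('sandy@example.com');

// Hypothetical stolen data: hashes only, no plain-text addresses.
const leakedRows = [
  { emailHash: sha256Hex('terry@example.com'), plan: 'basic' },
  { emailHash: sha256Hex('sandy@example.com'), plan: 'enterprise' }
];

// A plain equality check is enough to pick out the known person's row.
const match = leakedRows.find(row => row.emailHash === targetHash);
console.log(match ? match.plan : 'not found'); // enterprise
```

Nothing was "decoded" here. The hash just worked as a stable identifier, which is exactly the property that makes it useful to partners.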

In this case, you can instead truly encrypt the value (in a decryptable form) using a secret passphrase or key.

Both sides must know the passphrase: you and the service reading the link. But anyone intercepting the data wouldn’t be able to read it or even compare it to a known representation[2] (which makes it different from a hash).

To get the AES-encrypted Base64 representation of the Email Address using the passphrase “secret key 123”:

encryptedEmail = FBUtil.crypto.AES.encrypt({{lead.Email Address}}, 'secret key 123').toString();

Map encryptedEmail back to a field and you’ve got something that would impress even hardcore IT folks:

U2FsdGVkX18jxQ2jaQXvGGe3NbF3UYFjB7UzHuUNgBnwuENIi3tl83jcMl/ynLM7

The exact decrypting code depends on the language used by the 3rd-party service. But if they too were using FlowBoost/CryptoJS it would be:

decryptedEmail = FBUtil.crypto.AES.decrypt(encryptedEmail, 'secret key 123').toString(FBUtil.crypto.enc.Utf8);
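If the other side isn't running CryptoJS, they need to reproduce its format themselves. As far as I know, CryptoJS's passphrase mode follows OpenSSL's salted layout: the bytes "Salted__", an 8-byte random salt, then AES-256-CBC ciphertext, with the key and IV derived from the passphrase via a single round of MD5 (EVP_BytesToKey). Treat that as an assumption to verify against the CryptoJS version you're actually using. Under that assumption, a plain Node.js sketch (function names are my own) would be:

```javascript
const crypto = require('crypto');

// OpenSSL-style EVP_BytesToKey (MD5, one iteration): 32-byte key + 16-byte IV.
function deriveKeyAndIv(passphrase, salt) {
  const pass = Buffer.from(passphrase, 'utf8');
  let derived = Buffer.alloc(0);
  let block = Buffer.alloc(0);
  while (derived.length < 48) {
    block = crypto.createHash('md5').update(Buffer.concat([block, pass, salt])).digest();
    derived = Buffer.concat([derived, block]);
  }
  return { key: derived.subarray(0, 32), iv: derived.subarray(32, 48) };
}

// Produces Base64 of "Salted__" + salt + ciphertext.
function encryptWithPassphrase(plainText, passphrase) {
  const salt = crypto.randomBytes(8);
  const { key, iv } = deriveKeyAndIv(passphrase, salt);
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  const body = Buffer.concat([cipher.update(plainText, 'utf8'), cipher.final()]);
  return Buffer.concat([Buffer.from('Salted__', 'latin1'), salt, body]).toString('base64');
}

// Reverses the layout above and returns the UTF-8 plain text.
function decryptWithPassphrase(base64Text, passphrase) {
  const raw = Buffer.from(base64Text, 'base64');
  if (raw.subarray(0, 8).toString('latin1') !== 'Salted__') {
    throw new Error('Not in the salted OpenSSL format');
  }
  const { key, iv } = deriveKeyAndIv(passphrase, raw.subarray(8, 16));
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, iv);
  return Buffer.concat([decipher.update(raw.subarray(16)), decipher.final()]).toString('utf8');
}

const first = encryptWithPassphrase('sandy@example.com', 'secret key 123');
const second = encryptWithPassphrase('sandy@example.com', 'secret key 123');

console.log(first === second);                                // false (random salt)
console.log(decryptWithPassphrase(first, 'secret key 123'));  // sandy@example.com
console.log(decryptWithPassphrase(second, 'secret key 123')); // sandy@example.com
```

Note the first line of output: because of the random salt, encrypting the same address twice gives two different strings, so an eavesdropper can't match them against a known value the way they could with a hash.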

Simple stuff!

Notes

[1] Not the place to discuss salts and bcrypt and Argon2 and all that.

[2] Yeah, chosen ciphertext, again not the place!