Country-to-ISO-Code translation in FlowBoost (+ notes on text compression)

A Community discussion today brought up an easy use for FlowBoost.

Marketo user MO needed to translate full Country Names (like "Zambia") into their ISO-3166 Alpha-2 equivalent ("ZM"): that's the format required by the Dun & Bradstreet webhook he's using to enrich leads.

Before I could write back — grrr! :) — MO ended up building a 100-step flow to catch his most common countries, so he doesn't need the 'boost for this right now. But (naturally) I'd go with a FlowBoost solution myself, especially when you include all 249 current countries. Plus, this is an opportunity to not only do something cool with FB but to teach you guys some advanced-ish development techniques. (You do want to learn about text compression, right? Right?)

Getting the ISO info

Note that several different formats are all considered “ISO country codes”: Alpha-2 is the 2-letter shortest form ("ZM"); Alpha-3 is a more verbose/abbreviation-like form ("ZMB"); there's a numeric-only version (260); and so on. It's up to a service provider to decide which one(s) to expect. When countries change names, previous codes are flagged as migrated (but not invalid per se) while I believe old country names are considered invalid and unmatchable. (I mention this to explain why storing just the full name is risky, and codes are an important part of data management.)

Anyway, text files that map each official ISO country name to its corresponding ISO code(s) can be downloaded (for money!) directly from the ISO site and (for free) from lots of data clearinghouses, like here and here.

The FlowBoost JS

We know we can fetch all the data, and the mapping tables are pretty small (more on that below) so we can include the whole table in a FlowBoost call if we must.

After pasting the data, you just need a couple of lines of JS. Here's how it's done:

var ISOAlpha2Map = [
  {Name:"Afghanistan",Code:"AF"},
  {Name:"Åland Islands",Code:"AX"},
  {Name:"Albania",Code:"AL"},
  {Name:"Algeria",Code:"DZ"}, 
  … lots more countries clipped out for brevity … ,
  {Name:"Zambia",Code:"ZM"},
  {Name:"Zimbabwe",Code:"ZW"}
];

var countryAlpha2Match = ISOAlpha2Map.reverse().find( 
  (mapEntry) => new RegExp('^' + mapEntry.Name + '(\\s|$)','i').test(country) 
);

ISOAlpha2Map = undefined; 

And sample output, suitable for easy mapping to a Marketo field:

"countryAlpha2Match": {
  "Name": "Zambia",
  "Code": "ZM"
}

The matching “Algorithm”

It's nothing special: a case-insensitive prefix match (remember, JavaScript == is case-sensitive, so a regex can be used for /insensitivity). Bonus points to whoever figures out why reverse() is in there.

Important for performance is setting ISOAlpha2Map = undefined at the end of the code. This means the data table won't be included in the response (undefined values are always skipped), saving lots of bandwidth. We're about to look into how to conserve bandwidth on the request side.

Efficiently sending static data in a webhook request

If you're using FlowBoost Pro, you could store the mapping table in a .json text file in your Marketo instance (or on any webserver), then fetch and cache it on the first webhook call so it doesn't need to be fetched again. That's hyper-efficient: your webhook code itself stays lean and mean.

But if you're on FlowBoost Community or Standard, you can't do an outbound network call (HTTP GET) so you have to include the table in the request. Can you still conserve bytes on the wire? Sure, with the magic of text compression.

Compressing the table

JSON is not the most space-efficient storage format, so the first thing a techie might say is to switch to CSV or another structured format.

True, when JSON is used to store only one level of data, check out how many times the characters { } " , and the words Name and Code are repeated, compared to a CSV representing the exact same data! But CSVs don't support nesting, complex array-object relationships, and so on (even if we don't need those features here) and besides, I like JSON. So let's accept that we're using JSON and JS objects.

The JSON Country-Name-to-Alpha-2 table with extra whitespace removed is 8807 bytes, or about 9K. This is by no means colossal, but consider that the entire rest of the code (the part that searches the table) is only 127 bytes! Quite a disparity. And you really want to keep webhook payloads as small as possible, because you'll be repeating them perhaps tens of thousands of times per day. In offhand tests, I can see the 9K payload approximately doubling FlowBoost response time, from ~100ms to ~200ms. Still fast, sure, but let's see if we can compress the 9K by at least half and so only have ~50ms extra time on the wire.

Gzip…

For everyday text compression the standard is the venerable gzip. We're going to stick with that, since we're not getting crazy. (Newer algorithms like bzip could maybe shave another 5-10% off but our goal is a very conservative 50% reduction, from 9K to 4.5K or less.)

Now, I'm not going to detail every part of how I compressed the data. Like an offstage-prepped dish on a cooking show, you'll just have to accept that I ran the JSON through the gzip command-line tool and it came out to a 2.4K file (2,443 bytes). Great results, right? Yes, but there's a catch: that's a binary file (like the zip files you've surely used) and we can't just paste a binary file into a webhook payload (webhook payloads support text only).

… then Base64

So we have to encode the binary data back into text. This is always going to add some bytes back in. As with compression, if we were obsessively efficient, we'd choose an obscure/cutting-edge text encoding like Base32768 so we could add as few bytes as possible.

Let's not do that and stick with an old standby: Base64. Base64 always adds 33% overhead, so the 2.4K turns into 3.2K of text. And those 3.2K (3,260 bytes) are all right here:

H4sIAAAAAAAAA31aS3LjOBLd1ykY3nR1hLsPMDuJkiiZH9MiJY/dMQuIRJEokYQKJOWWJmY5Z5no/dygLtZJyVPOB6pnU47yewCBzJcfAP706bd/3kWilnd/u5t8KUrRqLYTzd39navzyy8Xd/+6/0H5/u9KNLmzaocfLWP9nbMm1Y7mEQwOEC6k4fDsFeCa0Ew0TiJqzSdJgNXk2hgOzxAudMXRRwvtVQX4CvFOmKyjVTDGk8VQRS+cwRpTYXZ9zqkeUE0hid1wwhoJtURzhQj3Ow4+A9i3nREVjN6MCACnAJ+l2Qn1FVwO7piKUtSC+XqaWLARio2eLhFuikrksi0ZY4YMWkCu+QemgMtKmJ7DLxZcqL5m8NyC1VkyFPdGdudLf0DQ1ODWKbhlWvYQKFOw61RX6qjEvRNXvSHfd0o3onKSTnTS0V/YsEcc1ghl5L2TqKZz5uQ7Gtm3F5klgstg+oTjWlLQhbaU5iwLfQTBTSfI7to3Afgz4v1Rdu9RzkhbIBlxVhVD14iqTrWls2pyRaH8mEn6N5WGfq3N6WPUCrdv+kYqZzY4vBWV4G6NgNhXhQBZTz3EzZ4M4CxEqxlnYXF6Wh2DIQe4ot7pnH/CX1q4NFozBbgh4o3g6nEniB6ks5UmZ9p0t8g41WSzUar1Qf4upRaKf2fy5Zoz1/LQ7yqVsVlh024pmEfTGWKq4ssJLJALxo0QNFQ2KEuMNONCYXB1plvnsy8pKpvi5/HmXBfpla533AXuI+K1Njxz+OgBTUWAjfVG4L3TldKZyVpnhgIt+2E+CtEBYoPRUlrvbyzeRw4Fr7OGGuJCjLjf/0vJIP9pddQU84yFOjSalsYmWeIkUBzcDWJGfP+P4DaAOHdPB8isLkrrLLPylqAghc5kUwuzZ8UcrDD7qna671iUzSDJznStGrDRLLyF31L2DMQwz3oqI+YDnoOW5sXp0DEQxDCnzCyqI45PIBzn33pBqUtRrHm9aiRbsQepeE4pznB4Dv6at52GYj+HejXvSqUPgENdWYhqz1sw53NIC6fQbH/+GLLwcYjRcqzWBVhvob4yHy0eEGswqBcg0IURTcbku1gjKhuSEdkMCo63uEGKdXVqZMt3H9/iJaSoUpqPgqIk21gKQzyx4znamyCI6cUD7XlSmwLgOcKGEjSrZTOES9wwFA9P7ShrU5PJCGBUz0jJjeqtbdTyiRdYBKw+HuQvj0JFVro/8C/EFoMVX29jYZ2sBW+fvRQJ5JxWMtN4HuJWAEVj9JepohagZ6RnJJ3QvJC6lkLxlLOE1S2lMP+Ln0vHFGazoTm7caxZgh6WJE8nkdL5vB2KBeUkV3Wna0/Hom87wUFNTmmYTxpZeOH49A8jQPgu+4ZaHWbLJThjlUkUwgra80v/xcDIArUVbyvQyYpC+/5ilhqLIxuwtgZ8Y9gTYvZKIV7oK0Nn7IS8qV6FSDFCspZzBZpfdaLinSU4/YHOMFBmHkKED/yzDxALD9KAmB9g3Q/a5DAW8qovzmJf4pHahwLqy+bEm0yY3FdG7QTXsg9pwtdUau55BxNLfajkT+1tb/nxjdG3meBXv3+jkGIoBKN/MsXpbG0SIj4Q+mNlN/qtj3HBBMd1Ry7PAGpyIHei4dk9mCLc6q5k3U8AoRGoHV5DBGsL5n4JXhCk3qiTTdtJfoAMVkjqyh4vQgIQZdD/LumQ0RsW/AFEdygy3r6FjxYo86GZuLaxC23oROK89IWmgD3edmvo4wy5KESb8ToUesioxBuTX/hsoyfIH+GLhefqyItzuLVwPndgYR2fOEXQtKWoqnHKDpcWr6P+8VvPCl34hAw6n3fopXA9ZijeKIeWk06669gXXnCt8neVcSfCkShUmbkm4XtnIXNJcSHza01pwXMLSFihrnJ9/KvgDWfIbQQswLXQQsMFUhhZOAldFobPMLcZLXVigiWIMEGG0RksYYLweWjE0EuQJEOq9jWoFKxBPxQ0ctEEYTrqMxDcG8mDYDUlihEcGk1LYlGAlDfHFdU1EhnJtUmvUmD9i2CLEZUnI4qez7FCQiGZBaL5CITvewhz20Yg30ibL7raj47u0cJiXXvucLh3acSNyIst/ptgRTOC1PVY81rxCM6Mxd66g459xCveF8bPFihbCvnhCm101xYnSG2oKWDoBNFDL5zBb3a3GnvIuziN7TSGFBhLLr0YnBaXqlKHA03PzBhDAotVlwlluDEgOunIBE6LA0RN1xdc3jGkppga9U4PlxQsNmMIjycBp5QnMNL6+x99o3gFXsMG17rGzLoGFaz7th1uB9/zHk4EGl2/0S75POD0RAy3pVPK9eX3PypZM2dMgzFxKSs6Hd07kzajCk5fvZwCUnNRnZMLx+2bkn0sWY7n8FXXXa9lI3lU/AoqGpODPoMmwB1TroXK+fx+wj3Qf9mRIlyMR8SKUq68HmCG1Flx68XheMBW0TGdfg4jhn7hekZE8W2tpcEDzHOC4DUXNEw6ifVZ7aS6vq4xJmam+IEzSZHc58qZGAF5PAG5JUMd4mpO0NjSwOA1rleeslJWFd9ugtsdLCqcQFI9ZhxUkKLT2EHz67rEswiDN8l/ktw567uxNxNoAJKK6vgeNu3bOD4PJVAWkuGSVN+4KE6mSKvxmSh5RLjvyvdb5A/O62TMeb8V+SGj629JDfmbor2OluEl4zmSHo5MCVIO8J40R9AoJxDNnocTmsuaeobg8BZTc+dBskuOotoNtwPD5h4uAj9JPhvciyVvwwsIpN/kFQmU2PjwOaKqO1+biw+Ki7nmNNTaS1TcOColUGhSod6G03pshref7HKWti7t02cc8FXZlTZ9QEZzFpfzxaZRQ1t6s9lMYc9pKSybpLClVNXa/EJHM94wpxBjqebX9qmH2F5C+U99hCk8GQgapwzfqFxcvZvqnYDPQDZKqazBuSaNEDZ7fiOQrm10eNG1TBvanGv9cIUaHkVGcZO6yCdp8m3DOWpTYHncgM02++GFlpl7A2H97tuLyOa1Gg4g/GV/foPrUxbMNb8gnN5gXQ8x7LPJX5KohDXaOI99V51o7rE1NmC9jemx59pAKGzOO2krewMq3YqmFx2z5xbajS0VmjOVVFL+5SH30vLeVv8W7LNVsnMifnW6jRCn3PkjV9Ps12dSRvf+H33za/Ir40IdeKbjsLpKatF3PY/7Z2gfnofYM8OfV5SC/wHFHOL0RdY8db3ANl+te/NX8M6rqndi98YE90qJ5x+f/gQOCMe2bCIAAA==

While the easily intimidated would call that gibberish, it's an exact copy of this JSON file, but at 37% of the size.

Passing that to FlowBoost

Now that we have that hunk of Base64 we can pass that to FlowBoost instead. FB has built-in decompression routines so it can unwrap the encoding and gzipped stuff and turn it back into JSON:

var base64gzipped = 'H4sIAAAAAAAAA31aS3LjOBLd1ykY3nR1hLsPMDuJkiiZH9MiJY/… more Base64 data here… /wHFHOL0RdY8db3ANl+te/NX8M6rqndi98YE90qJ5x+f/gQOCMe2bCIAAA==';   

// undo all our compression and serialization stuff
var binaryGzip = new Buffer(base64gzipped,'base64'),
  array = FBUtil.compress.gunzip(binaryGzip),
  jsonString = array.toString(),  
  ISOAlpha2Map = JSON.parse(jsonString);
  
// we have our ISOAlpha2Map again
var countryAlpha2Match = ISOAlpha2Map.find( 
  (mapEntry) => new RegExp('^' + mapEntry.Name + '(\\s|$)','i').test({{lead.Country}}) 
);

ISOAlpha2Map = undefined; 

So there you have it: a journey into ISO codes and a side trip into compression!