Fixing the current Excel/Marketo list import bug (Unicode characters)

Never thought I'd write another Excel macro, let alone blog about it, but duty calls.

(Excel is pretty much my nemesis app: I scrupulously avoid using it so I don't get lost in getting good at it instead of doing my “real” database + programming work. Respect to the experts, though — what's up, LC!)

So today I'll be doing my best impersonation of an Excel power user…

The bug

A curiously downplayed (and as yet unfixed) bug was introduced during a Marketo upgrade sometime in the past couple of weeks. Depending on the scope of your business, it can have a major impact and I think a lot of people aren't realizing they're affected.

Alert user DT pointed out on the Community and champ DS confirmed that Excel's so-called Unicode Text export format can no longer be used for Marketo uploads with non-Latin-1 characters.

Take a simple sheet like this:

ss

Choosing this Excel Save As option, which is the only Save As format that supports Unicode characters, used to work:

ss

But it no longer does. Instead, you'll see this garbling in the Marketo UI:

ss

(And it's not just a cosmetic issue, the import will indeed garble your data.)

Unfortunately — and this is why I think more people are being adversely affected than realize it — the UI garbling only serves as a warning signal if the first line of the import (the one shown in the Sample Value column) has non-Latin-1 data. If the first row happened to be simple ASCII and you shrugged and chose your field mapping, it would look like this, which isn't nearly as disturbing:

ss

But the import would still garble all the rows that aren't previewed.

What is Excel's “Unicode Text” option, anyway?

As I mentioned on the Community thread linked above (you should read it since not everything is repeated here), the Excel option called Unicode Text is specifically tab-delimited UTF-16 LE with BOM (Little Endian encoding with explicit Byte Order Mark).

You don't need to know exactly what that means (though it's fun to learn) but you need to know it's not “The One True Unicode.” Nor is it (as Marketo support erroneously put it) “binary encoding.”

It's just a common text encoding and one of several ways to transport Unicode characters in a text file. Other ways include the extremely common UTF-8 and the rarer UTF-16BE and UTF-32. All of these ways earn the name “Unicode Text” as much as UTF-16LE.

Unfortunately, within the limited realm of Excel's Save As, UTF-16LE is special because it's the only Unicode-capable text format on offer. And it used to work fine with Marketo. But now it doesn't.

What works with Marketo?

Per support communications provided by DS, Marketo claims to work with:

UTF-8, UTF-16, Shift-JIS and EUC-JP

But that's clearly not true, since perfectly formed UTF-16 is what Excel is exporting. In reality, Marketo's UTF-16 support is broken.

I can't vouch for the Shift-JIS and EUC-JP (both Japanese-only encodings) either way, but I can say that UTF-8 still works.

So we need to get our UTF-16LEBOM export files converted to UTF-8BOM. That's what the solution below does.

The solution

Converting from UTF-16 to UTF-8 can be done from the command line (you can use the age-old iconv on *nix and Mac OS boxes, and there's a Windows version, too) but that's probably scary for most of the folks charged with uploading lists into Marketo.

So I wrote a short Excel macro that uses built-in Excel and Windows functions to do the conversion. In a nutshell, it:

  • lets you choose an export filename from the usual dialog
  • saves the file as Unicode Text i.e. UTF-16LE w/BOM
  • rereads the file contents
  • writes the contents as UTF-8 w/BOM to a new file in the same directory

ss

The resulting Marketo-ready UTF-8 export ends with _MarketoReady_utf8:

ss

And the file is then good to go in Marketo:

ss

Get the code

You can download an XLSM file with the macro code from here.