Velocitip: Show a string’s Unicode codepoints

Inevitable question: Why would you need to delve this deeply into strings in Velocity Template Language, of all languages?

Answer: Sometimes you need the internals of the Velocity version of a Marketo field in order to do custom reformatting. The Velocity value isn’t always — spoiler alert! — the same as the value stored in the Marketo database. Until I put this code together, I wasn’t 100% sure there was a (specific, predictable) discrepancy!

Much more detail on that in an upcoming post.

But even if the values in Velocity were exactly the same as those in the Lead Detail UI, it would still be useful to see what hidden characters, curious accents, and other Unicode oddities lurk behind them.

The #stringToHex_v1 Velocimacro (code below) takes a string like this:

Let's break this down.

And displays its codepoints in hex:

L     |0x00004C
e     |0x000065
t     |0x000074
'     |0x000027
s     |0x000073
      |0x000020
b     |0x000062
r     |0x000072
e     |0x000065
a     |0x000061
k     |0x00006B
      |0x000020
t     |0x000074
h     |0x000068
i     |0x000069
s     |0x000073
      |0x000020
d     |0x000064
o     |0x00006F
w     |0x000077
n     |0x00006E
.     |0x00002E

It detects Unicode surrogate pairs to reduce confusion: a character outside the Basic Multilingual Plane gets a single row showing its full codepoint, instead of an extra row for the low surrogate by itself. And everything is wrapped in a <pre> tag to make it suitable for the HTML preview pane, where it actually looks better than in Text preview.

Get the code

## Prints one row per codepoint of $myString: the character, then its codepoint in hex.
#macro( stringToHex_v1 $myString )
#set( $myCharArray = $myString.toCharArray() )
<pre>
#foreach( $char in $myCharArray )
#if( !$char.isLowSurrogate($char) )
#if( !$char.isISOControl($char) )
#set( $printable = $char )
#else
#set( $printable = "" )
#end
#set( $cp = $char.codePointAt($myCharArray,$foreach.index) )
${display.cell($printable,6)}|$display.printf("0x%06X",$cp)
#end
#end
</pre>
#end

And call it like #stringToHex_v1($lead.MktoPersonNotes).
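
If you want to sanity-check the logic outside Marketo, here is a rough plain-Java sketch of the same loop. Velocity is calling java.lang.Character methods under the hood, so isLowSurrogate, isISOControl, and codePointAt behave the same way here. The class and method names (CodepointDump, dump) are made up for illustration, and String.format stands in for $display.cell and $display.printf, so treat it as an approximation of the macro rather than what Marketo runs. It pads the hex to six digits, matching the sample output above.

```java
// Sketch only: CodepointDump and dump() are hypothetical names,
// not part of Velocity, Velocity Tools, or Marketo.
final class CodepointDump {

    // Returns one "label |0xNNNNNN" row per codepoint, following the macro's loop.
    static String dump(String s) {
        char[] chars = s.toCharArray();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < chars.length; i++) {
            char c = chars[i];
            // Low surrogates never get a row of their own.
            if (Character.isLowSurrogate(c)) {
                continue;
            }
            // Control characters get an empty label so they cannot disturb the layout.
            String printable = Character.isISOControl(c) ? "" : String.valueOf(c);
            // Reads two chars when chars[i] starts a valid surrogate pair, otherwise one.
            int cp = Character.codePointAt(chars, i);
            // Left-justify the label in 6 columns, then the codepoint as 6 hex digits.
            out.append(String.format("%-6s|0x%06X\n", printable, cp));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "a" followed by U+1F600, written as its surrogate pair D83D DE00.
        System.out.print(dump("a\uD83D\uDE00"));
    }
}
```

Running main should give two rows, one for U+0061 and one for U+1F600, with no separate row for the low surrogate DE00. As in the macro, the label for a surrogate pair is only the char at that index (the high surrogate), so it is the hex column that tells you what the character really is.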