Splitting delimited strings in a less-smelly way (the “Header String” way)
This JS string-splitting approach is a sure code smell, but I see it all the time:
var partsOfString = stringifiedLeadInfo.split("|");
var firstName = partsOfString[0];
var lastName = partsOfString[1];
var companyName = partsOfString[2];
var phoneNumber = partsOfString[3];
/* ... and so on and so on... */
Presumably stringifiedLeadInfo, when the code was first written, was a string like:
Sandy|Whiteman|FigureOne, Inc.|212-222-2222
But this code is clearly fragile: there's no guarantee that the “magic numbers” 0, 1, 2, and 3 will continue to represent the same data (business-wise) inside the string. If order shifts around at the source, or if a new data point is added in the middle, all these lines may need to change. That leads to bugs.
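To make the failure concrete, here's a minimal sketch. It assumes a hypothetical middle-initial field gets inserted at the source (that field isn't in the original data, it's purely for illustration). Nothing throws; the variables just quietly hold the wrong values.

```javascript
// Assumption for illustration: the source starts sending a middle initial
// in the second position. The index-based code is left unchanged.
var shiftedLeadInfo = "Sandy|Q|Whiteman|FigureOne, Inc.|212-222-2222";
var shiftedParts = shiftedLeadInfo.split("|");

var shiftedFirstName = shiftedParts[0];   // still the first name
var shiftedLastName = shiftedParts[1];    // now holds the middle initial
var shiftedCompanyName = shiftedParts[2]; // now holds the last name
var shiftedPhoneNumber = shiftedParts[3]; // now holds the company name
```

Every assignment after the insertion point has to be renumbered by hand to recover.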
Instead, use what I call a header string. It's nothing more than a sample string containing the variable names in the currently expected order:
var delim = "|",
    stringifiedLeadHeaders = "firstName|lastName|companyName|phoneNumber",
    leadHeaders = stringifiedLeadHeaders.split(delim);

var leadInfo = stringifiedLeadInfo
  .split(delim)
  .reduce(function(acc,next,idx){
    acc[leadHeaders[idx] || "Unknown_Property_" + idx] = next;
    return acc;
  },{});
Now, leadInfo is a simple object:
{
  firstName: "Sandy",
  lastName: "Whiteman",
  companyName: "FigureOne, Inc.",
  phoneNumber: "212-222-2222"
}
And you only need to change the header string if the data starts coming in differently. No other lines need to be added or changed.
(I also made the delimiter a variable, ’cuz that could change too. And if new data points appear in the data before you add them to the header, they're given automatic names like Unknown_Property_5 to help signal the change.)
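If you need this in more than one place, the same idea wraps up into a small function. This is just a sketch: the name parseWithHeaders and the sample email value are mine, not part of any library or of the original data. It also shows the fallback naming in action when a fifth value shows up against a four-field header.

```javascript
// Hypothetical helper: applies the header-string technique to any
// delimited record. Values without a matching header get a fallback
// key built from their zero-based position.
function parseWithHeaders(stringifiedHeaders, stringifiedValues, delim) {
  var headers = stringifiedHeaders.split(delim);
  return stringifiedValues.split(delim).reduce(function (acc, next, idx) {
    acc[headers[idx] || "Unknown_Property_" + idx] = next;
    return acc;
  }, {});
}

var leadWithExtraField = parseWithHeaders(
  "firstName|lastName|companyName|phoneNumber",
  "Sandy|Whiteman|FigureOne, Inc.|212-222-2222|sandy@example.com", // made-up extra value
  "|"
);

// The four known fields keep their names; the unexpected fifth value
// (index 4) lands under leadWithExtraField.Unknown_Property_4.
```

Since the fallback key uses the zero-based index, the first unnamed value after a four-field header is Unknown_Property_4.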
Please use this, or something along these lines (there are other methods with the same effect), in your code. It makes the code less painful to read (scrolling through 25 variable assignments ain’t fun), and because of my curious specialty I spend a lot of time reading other people's stuff. ☺
Do it in Velocity, too
The equivalent can be done in any language. Always better than magic numbers, IMNSHO. Here's the comparable VTL:
#set( $delim = "\|" )
#set( $stringifiedLeadHeaders = "firstName|lastName|companyName|phoneNumber" )
#set( $leadHeaders = $stringifiedLeadHeaders.split($delim) )
#set( $leadHeadersCount = $leadHeaders.size() )
#set( $leadInfo = {} )
#foreach( $next in $stringifiedLeadInfo.split($delim) )
  #if( $foreach.index < $leadHeadersCount )
    #set( $void = $leadInfo.put($leadHeaders[$foreach.index], $next) )
  #else
    #set( $void = $leadInfo.put("Unknown_Property_${foreach.index}", $next ) )
  #end
#end
The main difference here (Velocity's verbosity aside) is that Java's String.split always treats the delimiter as a regular expression, not a simple string. Since the pipe symbol "|" has special meaning in regex-land, I escaped it as "\|" to treat it non-specially. The character class "[|]" would also implicitly escape the pipe.
(JavaScript's split(delim) also supports regexes, but the language can tell the difference between a "string" and a /regex/ so you don't need to escape strings.)
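Here's a quick sketch of that difference on the JavaScript side, using a shortened sample record of my own. The string delimiter is taken literally, an escaped regex behaves the same way, and an unescaped regex pipe (an empty alternation, which matches between every character) shows what the escaping protects you from.

```javascript
var sampleRecord = "Sandy|Whiteman";

// String delimiter: matched literally, no escaping needed.
var splitByString = sampleRecord.split("|");

// Regex delimiter with the pipe escaped: same result as the string form.
var splitByEscapedRegex = sampleRecord.split(/\|/);

// Regex delimiter with a bare pipe: the pattern matches the empty string,
// so the record is chopped into individual characters instead.
var splitByBareRegex = sampleRecord.split(/|/);
```

The first two calls produce two fields. The third produces one element per character, pipe included.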
Better yet, don't give yourself the need to split
It could be argued that all string splitting is smelly, and this improvement is just code cologne.
Indeed, the best string-splitting code is the code you don't have to write, because you store multivalued fields as JSON or some other well-known, self-describing format. Private formats with pipes, semicolons, or commas are to be avoided when possible. We'll never completely get away from them, though, and they’re admittedly efficient storage-wise.
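For comparison, here's roughly what the JSON route looks like, assuming the field is stored as a JSON object with the same property names used earlier (the variable names are mine, for illustration). There's no delimiter, no header string, and no positional bookkeeping.

```javascript
// Assumption: the multivalued field is stored as JSON text instead of a
// pipe-delimited string. Each value carries its own name.
var jsonLeadInfo =
  '{"firstName":"Sandy","lastName":"Whiteman",' +
  '"companyName":"FigureOne, Inc.","phoneNumber":"212-222-2222"}';

var parsedLeadInfo = JSON.parse(jsonLeadInfo);

// Field order no longer matters, and a comma or pipe inside a value
// (like the one in "FigureOne, Inc.") can't be mistaken for a separator.
```

The cost is the extra bytes for the property names, which is the storage tradeoff in question.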