Just look at how bad this code is

Unless you're in a masochistic mood, might as well skip my opening rant and check out the main topic ↓.

Is bad code a teachable moment? I keep thinking it is… it sounds reasonable that you can learn good code by breaking down bad code, like you can do with bad design, ugly architecture, etc.

Might just be an excuse for another rant, though.

It’s code-clobberin’ time (it’s Grimm out there)

Professional developers command high wages and respect based on understanding — sorry, grokking — things mortals don't.

And we get away with murder, professionally speaking (myself included). We get job security through obscurity: You don't grok, we claim, that bugs are inevitable.

If Justin in Customer Service raises his voice on a call, he figures it's his last day; Jessica in Marketing gets raked over the coals for sending to the wrong Smart List. They don't get to say it's the nature of the beast.

Meanwhile, Joe Developer pushed broken public-facing code — broken because it wasn't tested against common real-world input — and his worst fear is he'll have to look up from Reddit and mumble “must be a browser bug” before half-heartedly reworking it (and it'll likely be broken in a new way).

Don't get me wrong: the mantra Bugs Are Inevitable is worth repeating. Even if a laughable amount of bad code is in the wild, sillier still would be expecting that anyone can write and run only bug-free code for more than, like, an hour straight.

But as in an earlier rant, my beef with bad code(rs) isn't about typos, or bugs in implementing complex business logic, and definitely not with syntax errors, or not covering corner cases. It's with a certain type of proudly bad code: code that seems to taunt non-coders because they won't know how wrong it is.

So hard to explain to a newbie, but it's as if the coder said, It'll work for one incredibly narrow and undocumented case: Done! (If they had only added // @todo make this less embarrassing it wouldn't fit the proudly bad moniker anymore.)

What brought this on?

Was resolving a Community thread and saw some so-bad-you-have-to-be-kidding code. Check out this (quotes intended) “URL Parser”:

Let me run down what's wrong with it.

(1) You shouldn't be writing most of this

The first mistake is a URL parser isn't even something you should be trying to write. Use an established library like URI.js. As I said in another post: you aren't going to do a better job than the guy who's spent 6 years on this, squashing bugs (both in response to bug reports and of his own accord), adding new APIs, and finding new corner cases to accommodate. Yes, you might find URI.js outputs something in a different way than you'd hope, but even if you can't get Hr. Rehm to change it, you can write a little helper function without rewriting the whole thing.

I use URI.js not just in large commercial projects (with attribution of course) but even in my little CodePen demos: the little weight it adds to a 10-line snippet is much better than the alternative (i.e. me screwing up yet another attempt to write a mini-parser).

To repeat: use a time-tested library for this task. You save nothing, not one thing, by doing it yourself.

(2) Browsers break up the current URL into correct parts for you

So you decided you could write your own parser. Let's see how you do.

window.location.href.split("&") will split this URL:

https://pages.example.com/path/to/folder.html?q1=value1&q2=value2&q3=value3#item1

into this JS array with 3 string elements:

[
  "https://pages.example.com/path/to/folder.html?q1=value1",
  "q2=value2",
  "q3=value3#item1"
]

Bad start. You can only go downhill from there.

The first query parameter is tacked onto the end of the path. The last query parameter is tacked onto the beginning of the hash.

The code above doesn't even seem aware of these problems.

  • It ignores the first query parameter, without any documentation of this behavior: urlParams.shift() hacks off the hostname, path, and first query parameter from the array, so it's never seen again.
  • It does nothing to further split the last array element on the # character.

So it's obviously not been tested with real-world URLs.

More/as important, there was no reason to base anything on the document.location.href as the document.location.search property is built into every browser and already holds the full query string (and only the query string):

q1=value1&q2=value2&q3=value3

In short, Locations are already parsed at this first level, so the very fact the code thinks it needs to parse the Location is a code smell.

(3) Um, query strings do not have exactly 6 items (???)

I don't just mean query strings in general, I mean query strings on your site.

Ever clicked a link in a Marketo email and seen the added mkt_tok query parameter? Or clicked a URL from a search engine results page? You have to expect additional query parameters to be added to your links in the real world.

The additional params might not have any meaning to you, but they'll be captured and forwarded by analytics libraries, ROI pixels, etc. And that's totally fine. The important thing is that your pages must still work even if there are extra params (if the the URL is missing critical parameters, you may need to throw an error, but extra ones need to be ignored).

Yet the code above does this:

var userInfoArray = urlParams.splice(-6, 6);

That takes the last six elements in the param array as if those are always the “interesting” ones. But if there's a mkt_tok or AdWords ValueTrack param added to the end, that doesn't automatically make it more interesting than the first parameter. Ridiculous!

(4) URL parameters may technically be ordered — but not in the real world

The code goes completely off the cliff in the next step (the last thing I'll rant about today).

It reads those last 6 array elements as if they will always represent the Email, FirstName, LastName, Company, Country, and Phone in that exact order. Instead of checking the query param names, it just reads the values.

Even if you don't have any unforeseen extra fields (which you will) and even if you don't care about the first query param (which you should) and even if there's no document #hash value (which you can't count on)… this is still bad code.

Interestingly, if you're a stickler for standards, query parameters might be considered ordered. To the degree they reflect form fields, even if there aren't literal <input> elements, the order of params in a URL-encoded string is supposed to be the order of fields in a (literal or theoretical) HTML <form> tag. That is to say, it's not random, and switching parameter order changes the semantics of a form post.[1]

But in the real world, the query string FirstName=Sandy&LastName=Whiteman is accepted as having the same meaning as LastName=Whiteman&FirstName=Sandy.[2] So you can't go assuming that the six fields you want will always appear in the same order.

It's like I said in my post on email address case-sensitivity: sometimes something is true in the only standard that matters — but we can't afford to have that standard strictly enforced if we want the web to work.

Anyway, just… don't.

Main thing I'm trying to impart is (1) above: don't try to build things you don't understand. There are established libraries to help you, and even if those libs prove imperfect, you can contact their developer, who's bound to have a huge jump on you technically.


Notes

[1] Meaning if you were trying to reproduce a <form> that could lead to the form post, it would have to look like

<form><input name="FirstName" value="Sandy"><input name="LastName" value="Whiteman"></form>

not

<form><input name="LastName" value="Whiteman"><input name="FirstName" value="Sandy"></form>

but I can't imagine when you'd be doing this.

[2] When arrays are encoded in query parameters like Interest[]=Apples&Interest[]=Oranges then if the parser gives this special meaning (PHP does, other engines & frameworks do not) they'll be turned into an array of strings ["Apples","Oranges"]. Ergo, it must be ordered (no such thing as an unordered array). Whether the order ends up mattering in your back-end code is a different matter, the point is that the query params are interpreted as ordered.

Similarly, if you have two or more of the same query param name Interest=Apples&Interest=Oranges without the [] brackets, then in those frameworks that don't just take the last value (overwriting previous values along the way) the result will also be an array.

But these quirks are way beyond the abilities of the code above.