Explore and filter lead domains using FlowBoost's FBEmail

Of all the frustrations of Marketo's matching and filtering engine, few can rival its inability to accurately match on the domain part of an email. ([Email Address] Contains “@google.ca” is obviously inaccurate because it will match joe@google.cannot-be-trusted.co.uk; see note 1.)
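
To make the false positive concrete, here's a minimal sketch in plain JavaScript. It isn't FlowBoost or Marketo code, and both function names are made up for illustration; it simply compares a substring match with an exact comparison of the domain part.

```javascript
// Hypothetical helpers for illustration; neither is part of FlowBoost or Marketo.

// Mimics a "Contains" filter: true if the needle appears anywhere in the address.
function containsMatch(email, needle) {
  return email.toLowerCase().indexOf(needle.toLowerCase()) !== -1;
}

// Compares the entire domain part (everything after the last @) to the wanted domain.
function domainEquals(email, domain) {
  var at = email.lastIndexOf('@');
  if (at === -1) return false;
  return email.slice(at + 1).toLowerCase() === domain.toLowerCase();
}

var impostor = 'joe@google.cannot-be-trusted.co.uk';
var genuine = 'jane@google.ca';

console.log(containsMatch(impostor, '@google.ca')); // true (a false positive)
console.log(domainEquals(impostor, 'google.ca'));   // false
console.log(domainEquals(genuine, 'google.ca'));    // true
```

An exact comparison fixes this one case, but it still knows nothing about subdomains, public suffixes, or DNS.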

Ends With isn't the only thing you might want to do with email domains, of course. You may want to scan for a match in a list with thousands of names, group domains with the same parent company (even across TLDs), or check to see which service scans inbound mail for a lead.

In FlowBoost Pro v13, we're rolling out the FBEmail helper library and I'd like to introduce its “marquee” function, FBEmail.getDomainInfo.

You could already grab some domain info using standard JS in FlowBoost (see note 2), or by exporting a list to Excel and using its primitive string macros, but your results would be filled with either errors or uncertainties. getDomainInfo is not simply a convenience wrapper around string functions: it offers unequaled parsing ability by being plugged into registrar info and other special-purpose databases, and directly into live DNS results.

Let's take a look at a sample FlowBoost call from Marketo:

return FBEmail.getDomainInfo({{Lead.Email Address}});

And its response:

{
  "emailAddress" : "testme@teknkl.com.au",
  "mailboxName" : "testme",
  "shortestPrivateDomain" : "teknkl.com.au",
  "longestPublicDomain" : "com.au",
  "domainIsWellFormed" : true,
  "domainIsDisposable" : false,
  "domainIsFree" : false,  
  "domainHasMx" : true, 
  "mxProvider" : "mailscanner.com" 
}

Exciting, eh? Here are the fields returned by getDomainInfo:

  • emailAddress - the first argument passed to getDomainInfo. By default, this just echoes back the email address, but by using FBString (see below) you can convert the input value to lowercase on the way out for data normalization.

  • mailboxName - the left-hand-side of the email address. For harvey.wallbanger@oldcompany.com this is harvey.wallbanger. Isolating this value can aid deduping efforts, as that may be the same person as harvey.wallbanger@newcompany.com. Of course, it may be a different human entirely, and for shorter LHSes it's too ambiguous to call. But taken together with other information (and in the absence of first and last name) it can add intelligence to merge decisions.

  • shortestPrivateDomain - the topmost private (i.e. registerable) domain name in the DNS tree, representing the overarching corporate entity. For both jill@microsoft.com and jill@apps.microsoft.com, the shortest private domain (SPD) would be microsoft.com. Likewise, for ken@google.co.uk the SPD is google.co.uk, and for lisa@internal.t.co it's t.co.

Those are the easy examples, though. Computing the SPD correctly using only string parsing is absolutely impossible (don't let anyone tell you otherwise!) so FBEmail uses direct insight into registrar restrictions. For example, for mark@mail.cloudapp.net the shortest private domain is mail.cloudapp.net, not cloudapp.net (cloudapp.net itself is public, despite looking like a typical .net). This phenomenon is explored in my earlier post on Amazon Elastic Beanstalk + Munchkin.

Some might call the SPD the “parent domain,” but since it may actually be the grandparent or great-grandparent, or not a parent at all, we've chosen the more exact name. (Geek note: browser source code calls this value the “domain group,” which is kind of interesting, and is based on its use in indexing cookies.)

  • longestPublicDomain - the counterpart of the SPD: the public suffix that remains once you strip the private labels from the front of the domain. So for jill@apps.google.com this would be com and for ken@google.co.uk it would be co.uk. This can be useful for sorting leads by country (using ccTLDs), though be wary of ccTLDs that sell subdomains for international use, like ly and at (don't expect joe@bit.ly to be living or working in Libya).

  • domainIsWellFormed (true/false) - whether the domain has both public and private part, which is another way of saying “Does it look like an Internet-routable domain?” niti@loco has no public part, while omar@co.uk has no private part, and cameron has neither. So none of those will ever receive your mail.

  • domainIsDisposable (true/false) - whether the domain appears in the Freemail-Disposable list — domains used to generate one-time/throwaway addresses. (You may not want to deny such people access to their first asset, but it won't pay to continue to send stuff to them.)

  • domainIsFree (true/false) - whether the domain appears in the general Freemail list. This is a list of email providers offering free mailboxes. I'm not personally a fan of discarding so-called “non-corporate” email addresses — to borrow from Josh Hill, if someone wants your content and will read your emails, demanding more may just tick them off — but some businesses find it useful to winnow their leads this way, so you may take action if you want!

  • domainHasMx (true/false) - whether there's an MX or A record for the domain, indicating at least that the domain owner claims there's a mailserver waiting for connections. FlowBoost doesn't make a connection to the mailserver itself (this check is strictly at the DNS level, not SMTP), but without a record for the mailserver, the address will not be emailable. Getting early awareness of this can help you clean your database.

  • mxProvider - the domain of the mailserver that initially receives mail for the recipient. For domains using dedicated anti-spam services, this will be the SPD of that service, like messagelabs.com; for those using all-in-one hosted email with anti-spam, it'll be a domain like outlook.com or google.com. (For companies hosting totally in-house it'll be the recipient domain, a subdomain, or a domain alias.)

Knowing the MX provider is invaluable when debugging deliverability. For example, if you see that your clicks are lower when the lead uses Outlook 365 (all other things being equal) you can concentrate on testing with an account there.
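
To show why the shortestPrivateDomain and longestPublicDomain fields above can't come from string slicing alone, here's a rough sketch that contrasts a naive "last two labels" guess with a lookup against a list of public suffixes. Everything in it is an assumption made for illustration: the function names are invented, the suffix list is a tiny hand-picked subset, and none of it is FBEmail's actual implementation.

```javascript
// Tiny illustrative subset of public suffixes. A real list has thousands of
// entries; this one only covers the examples used in this post.
var PUBLIC_SUFFIXES = {
  'com': true, 'net': true, 'ca': true, 'co': true, 'uk': true,
  'co.uk': true, 'au': true, 'com.au': true, 'cloudapp.net': true
};

function isPublicSuffix(name) {
  return Object.prototype.hasOwnProperty.call(PUBLIC_SUFFIXES, name);
}

// Naive approach: assume the last two labels are always the registrable domain.
function naiveSpd(domain) {
  return domain.toLowerCase().split('.').slice(-2).join('.');
}

// Suffix-aware approach: find the longest known public suffix, then keep
// exactly one more label to the left of it.
function splitDomain(domain) {
  var labels = domain.toLowerCase().split('.');
  for (var i = 0; i < labels.length; i++) {
    var candidate = labels.slice(i).join('.');
    if (isPublicSuffix(candidate)) {
      return {
        longestPublicDomain: candidate,
        // null when nothing sits to the left of the suffix (e.g. "co.uk" alone)
        shortestPrivateDomain: i > 0 ? labels.slice(i - 1).join('.') : null
      };
    }
  }
  // No known public part at all (e.g. "loco")
  return { longestPublicDomain: null, shortestPrivateDomain: null };
}

console.log(naiveSpd('mail.cloudapp.net'));                          // cloudapp.net (wrong)
console.log(splitDomain('mail.cloudapp.net').shortestPrivateDomain); // mail.cloudapp.net
console.log(naiveSpd('google.co.uk'));                               // co.uk (wrong)
console.log(splitDomain('google.co.uk').shortestPrivateDomain);      // google.co.uk
```

Even the suffix-aware version is only as good as its list, which is the whole argument for leaning on maintained data rather than a regex.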

Mapping domain aliases

FBEmail lets you pass in an optional object containing domain aliases. For example, if you want domains related to one of your key accounts to be consolidated:

return FBEmail.getDomainInfo({{Lead.Email Address}}, {
   aliases : {
     "acquiredcompany.com,knownsubsidiary.com,obsoleteconame.com" : "bigcompany.com",
     "t.co" : "twitter.com"
   }
});

If you include an alias map, any matches will be translated to the primary domain (that is, the right-hand-side of the map entry): joe@apps.acquiredcompany.com will show the shortestPrivateDomain bigcompany.com.
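
If the comma-separated keys look odd, this sketch shows one way such a map could be interpreted. It is a guess at the mechanics, not FBEmail's code: the function names are hypothetical, and it assumes aliases are compared against the already-computed shortest private domain.

```javascript
// Hypothetical re-creation of alias lookup; FBEmail's internals are not shown in this post.

// Expands comma-separated keys into one lookup entry per alias domain.
function expandAliases(aliases) {
  var lookup = {};
  Object.keys(aliases).forEach(function (key) {
    key.split(',').forEach(function (alias) {
      lookup[alias.trim().toLowerCase()] = aliases[key];
    });
  });
  return lookup;
}

// Returns the primary domain for an SPD, or the SPD itself when no alias matches.
function resolveAlias(spd, lookup) {
  var key = spd.toLowerCase();
  return Object.prototype.hasOwnProperty.call(lookup, key) ? lookup[key] : spd;
}

var lookup = expandAliases({
  'acquiredcompany.com,knownsubsidiary.com,obsoleteconame.com': 'bigcompany.com',
  't.co': 'twitter.com'
});

console.log(resolveAlias('acquiredcompany.com', lookup)); // bigcompany.com
console.log(resolveAlias('t.co', lookup));                // twitter.com
console.log(resolveAlias('unrelated.com', lookup));       // unrelated.com
```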

Additionally “massaging” results using FBString

You can optionally use FlowBoost's FBString helper functions to format output automatically. Do this via the formatValues option:

return FBEmail.getDomainInfo({{Lead.Email Address}}, {
   formatValues : FBString.valuesToLower
});

valuesToLower will change right-hand-sides in the results (i.e. the string values, not the property names!) to lower case. So for JiM@goOgLE.CA you'd get

{
  …
  "shortestPrivateDomain" : "google.ca",
  "emailAddress" : "jim@google.ca"
  …
}

and so on.
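
For the curious, a formatter of this kind can be pictured as a function that walks the result object and touches only its string values. The sketch below is a stand-in I've written for illustration (the name valuesToLowerSketch is made up) and isn't the FBString source.

```javascript
// Hypothetical stand-in for what a formatter like FBString.valuesToLower might do;
// the real helper's implementation is not shown in this post.
function valuesToLowerSketch(result) {
  var out = {};
  Object.keys(result).forEach(function (name) {
    var value = result[name];
    // Only string values are changed; booleans and property names pass through.
    out[name] = typeof value === 'string' ? value.toLowerCase() : value;
  });
  return out;
}

var formatted = valuesToLowerSketch({
  emailAddress: 'JiM@goOgLE.CA',
  shortestPrivateDomain: 'goOgLE.CA',
  domainHasMx: true
});

console.log(formatted.emailAddress);          // jim@google.ca
console.log(formatted.shortestPrivateDomain); // google.ca
console.log(formatted.domainHasMx);           // true
```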

Integrating into Etumos Verify

FBEmail results can be used as a precondition for Etumos Verify or other real-time validation services to conserve resources. For example, if a lead's domainHasMx result is false, there's no need to run further validation at that time.
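
Here's a minimal sketch of such a gate. The field names come from the sample response earlier in this post, but the function name and the rules themselves are my own illustrative assumptions, not Etumos or FlowBoost logic, so tune them to your own policy.

```javascript
// Hypothetical pre-check: decide whether a lead is worth sending to a
// paid or rate-limited validation service.
function shouldRunVerification(info) {
  if (!info.domainIsWellFormed) return false; // not an Internet-routable domain
  if (!info.domainHasMx) return false;        // no mailserver record in DNS
  if (info.domainIsDisposable) return false;  // throwaway address (assumed policy)
  return true;
}

console.log(shouldRunVerification({
  domainIsWellFormed: true,
  domainIsDisposable: false,
  domainHasMx: true
})); // true

console.log(shouldRunVerification({
  domainIsWellFormed: true,
  domainIsDisposable: false,
  domainHasMx: false
})); // false
```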

Changes over time

DNS results can change quickly, particularly negative (missing) ones. Perhaps IT hadn't set up inbound mail for a new domain, but someone at the company filled out a form prematurely... rare but by no means impossible. So, painful as it may seem, you should recheck addresses with bad results perhaps a week after the first check, just to be sure.
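
One way to express that recheck rule, sketched in plain JavaScript. The function name, the seven-day constant, and the idea of storing a last-checked timestamp on the lead are all assumptions for illustration; in practice you might do the waiting in a Marketo campaign instead.

```javascript
var RECHECK_AFTER_DAYS = 7; // "perhaps a week", per the text; adjust to taste
var MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: true when a negative DNS result is old enough to retest.
function isDueForRecheck(lastResult, lastCheckedAt, now) {
  if (lastResult.domainHasMx) return false; // only negative results need a second look
  return now.getTime() - lastCheckedAt.getTime() >= RECHECK_AFTER_DAYS * MS_PER_DAY;
}

var firstCheck = new Date('2024-01-01T00:00:00Z');

console.log(isDueForRecheck({ domainHasMx: false }, firstCheck, new Date('2024-01-05T00:00:00Z'))); // false
console.log(isDueForRecheck({ domainHasMx: false }, firstCheck, new Date('2024-01-08T00:00:00Z'))); // true
```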


Notes

1 If you want a very quick and dirty workaround, you can create a second field that concatenates {{Lead.Email Address}} and a designated EOL delimiter like $. Then you can match [Email Address] Contains “@google.ca$”. But c'mon, use FlowBoost. It's fun, and I'm working my butt off on it!

2 For example, to get the full private @domain in FB Standard (without any smart shortening):

var domain = ( {{Lead.Email Address}}.match(/@(.+)/) || ['N/A'] ).pop();