Posts Tagged ‘Name Validation’

Using DOTS Name Validation 2 for De-Duping Names

De-duping data is one of the most popular uses of our services, and for good reasons: duplicated data can lead to redundancy, extra costs, waste and even lack of customer engagement. In this blog post, we will examine a particularly challenging issue: getting rid of duplicate contact names.

De-duping some data sets can be easier than others. For example, phone numbers and email addresses can be relatively straightforward, as those provide a unique ID to tie back to a customer. We’ve recently published a blog on how to dedupe addresses with the barcode digits field in our Address Validation – US service, so de-duping validated address data should be a cakewalk.

While de-duping with emails, phone numbers or barcode digits can be relatively straightforward, de-duping names in a database might prove to be a bit more difficult. One of the issues that may present itself is that a database might have several different variations of the same name. For example, the name “John” could come in different variations like Johnny, Jonathan or even slightly mistyped as Jon.

Our DOTS Name Validation is an API that can help with this and provide the tools you need to get rid of duplicate name entries in a database. When a name is processed through Name Validation, we check it against millions of different names to find out if it matches a known name. Because of this, we are able to return several other names that may match one of the variants we’ve found in our database.

Let’s take the name “James Smith” as an example. When this is run through our service it will result in the following response:

The service returns various pieces of information that can be used to determine the validity of a name, but one of the fields we want to utilize for removing duplicates is the RelatedNames field highlighted below:

Here we see some common variants of the name “James.” The RelatedNames field returns names that are often associated with, or are variants of, the input name. Combining these with the last name gives you the following five combinations of the name you are attempting to find matches for:

Jaime Smith
James Smith
Jim Smith
Jimmie Smith
Jimmy Smith

With these in hand, you can loop through some of the names in your database beginning with the letter “J” and flag any entries that come up with a match. These matches would likely still need a human eye and some more investigation in order to determine if they were actually duplicates, but this will most certainly help reduce the amount of human time involved in looking through duplicate names.
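As a rough illustration of that loop, here is a minimal Python sketch. It assumes the RelatedNames value has already been read from the service response as a comma-separated string, and that your database rows are available as simple dictionaries; the function names and the sample rows are hypothetical and not part of the service.

```python
def candidate_full_names(first_name, last_name, related_names):
    """Combine the input first name and its related names with the last name."""
    variants = {first_name.strip()}
    variants.update(n.strip() for n in related_names.split(",") if n.strip())
    return sorted(f"{v} {last_name}" for v in variants)


def flag_possible_duplicates(records, candidates):
    """Return records whose full name matches a candidate, ignoring case."""
    wanted = {c.casefold() for c in candidates}
    return [r for r in records if f"{r['first']} {r['last']}".casefold() in wanted]


# RelatedNames value for "James", using the variants listed above.
related = "Jaime, Jim, Jimmie, Jimmy"
candidates = candidate_full_names("James", "Smith", related)

# Hypothetical database rows, for illustration only.
records = [
    {"id": 1, "first": "James", "last": "Smith"},
    {"id": 2, "first": "jimmy", "last": "Smith"},
    {"id": 3, "first": "Janet", "last": "Smith"},
    {"id": 4, "first": "Jim", "last": "Smyth"},
]

# Flagged rows are only candidates for human review, not confirmed duplicates.
flagged = flag_possible_duplicates(records, candidates)
print([r["id"] for r in flagged])
```

Matching is deliberately exact (apart from letter case) on the last name, so “Jim Smyth” is not flagged here.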

If you wanted to make your search for duplicate names even more aggressive you could use the field FirstNameSimilar to look for other potential duplicates. In our example, this field will return the following highlighted names:

This field will return other names that are phonetically similar to the first name submitted on the input. These names might be a bit more far-reaching or different from the intended first name, so if you look for matches in your database with these, there is a chance it might flag names that belong to genuinely different people. Using a human eye to review these results is still a good idea.

The LastNameSimilar field will also return last names that are phonetically similar to the input last name. If you are running into cases where you have a lot of duplicate last names, you could use this field in the same fashion as noted above. This situation may be less common than the first two presented, as there is less of a chance of different variants of a last name than there is with a first name.

With any use case or Service Objects API integration, there are always caveats and special scenarios. If you need some advice or recommendations on how to best integrate our API in your application, please reach out to our Applications Engineering team and we’ll be happy to provide any integration assistance that we can.

What’s in a Name? The Importance of Global Name Validation

A quick online search of the US white pages shows that there are over 1500 real people named George Washington, more than 50 named William Shakespeare, and a surprising number of Steven Spielbergs. There is even a recent tongue-in-cheek commercial featuring a real person named Mac(kenzie) Book touting the benefits of the Microsoft Surface laptop.

When it comes to your contact database, however, fake or incorrect names can cost you real time and money – and can even make you vulnerable to fraud or damage your brand reputation. In this article, we will look at how validating first and last names can give your business the power to make sure you have genuine, accurate contact records for real customers and prospects.

Why the right name is important

There are several reasons why effective contact records start with having the right name:

  • Getting the name – or gender – of someone wrong can leave a bad impression with customers and prospects.
  • Fake names such as “Donald Duck,” often submitted to obtain free information from marketing campaigns, waste your time and increase your costs every time you run a campaign.
  • Bogus or garbage names that cannot be traced to real people can be a flag for suspicious activity, such as fraudulent orders or marketing inquiries.

In addition to having valid names to protect your contact database integrity, learning more about these contact names can also lead to more profitable marketing efforts and customer relationships. For example, knowing a person’s gender, or whether a contact is a business or personal name, can help you target your contact activities more precisely. In this way, name validation gives you the ability to both verify and enhance your contact data assets.

How name validation works

So how does it work? DOTS Name Validation parses names from anywhere in the world into individual data fields, while checking them for accuracy and validity.

This service is easy to use: its input consists of a name string, together with some optional parameters for special cases. In response, it outputs a validated name where possible (or sometimes two names, if your input consisted of a string such as “Steve and Mary Smith”), as well as a wealth of associated data. In addition, Name Validation retains important accented characters often found in global names in the output of the service.

Here are some of the things we check for:

  • Whether this name validates against a proprietary global database of millions of domestic and international names
  • Individual data fields for the name’s prefix (Mr., Ms., etc.), first name, last name, and suffix (such as Jr. or Sr.)
  • The suggested order of the name: for example, changing “Jones John” to “John Jones”
  • Quantitative scores for the likelihood that this name consists of a vulgar, garbage or celebrity name, as well as common dictionary phrases such as “White House”
  • The likely gender of this name, including a “Neutral” response for names common to both genders
  • National origin of common names
  • Whether this name is a business or a person
  • Related names (such as Bill or Billy for William)

Perhaps most importantly, we indicate whether this name appears to be valid, as well as a “best guess” name for those that are unable to be validated. Armed with this information, you can flag suspicious names for removal or further processing and have a greater degree of confidence in names that pass validation.

Good marketing starts with knowing your customer

Aside from better data hygiene and contact quality, there is another very important reason for getting accurate customer name data: it is the key to personalizing your marketing efforts.

A recent survey of over 1000 consumers showed that personalization leads to larger purchases, increased revenue, and fewer returns. More important, it creates greater brand loyalty: forty-four percent of customers surveyed say they are more likely to return for more business after a personalized experience. And with the holiday shopping season nearly upon us, this can be an important competitive advantage for your marketing campaigns and customer outreach.

According to Dale Carnegie, the sweetest sound in any language is the sound of a person’s own name. With the help of automated name validation capabilities, you can leverage accurate name data as a key component of building effective relationships with your customers and prospects.

Lead Validation: The Core Components

How good are your leads? DOTS Lead Validation, Service Objects’ most popular composite service, is designed to measure lead quality – helping our clients reduce fraud, increase conversions and enhance incoming leads, web orders and customer lists.

Lead with Certainty

Lead Validation blends the strengths of our Name, Address, Phone, Email and IP Address validation services to provide authoritative details and return a Certainty score from 0-100. Marketing teams can use the results to assess the quality of incoming leads in real time, sales teams can prioritize their leads based on quality, and companies, in general, can make sure their CRM and other customer databases are kept clean, accurate and as up-to-date as possible.

Lead Validation verifies leads in real time for the United States and Canada. Our DOTS Lead Validation – International service works in a very similar fashion, adding the capabilities of validating global leads to the mix while including the strength of our core services for validating leads from the US and Canada. In this blog, we are focused on the components of Lead Validation and how it helps our clients.

Our Lead Validation service contains six primary components:

  • Name
  • Address
  • Phone
  • Email
  • IP address
  • Business

Each of these components is discussed in greater detail below.

Name component

The name component is built on the strength of our DOTS Name Validation service to validate names, verify accuracy, parse out name components, return gender information and more. It gives you insight into the name, looking for similar names and nicknames to improve matching, and flags questionable things like names that seem to contain vulgar words, match well-known celebrities, or appear to be fabricated garbage names such as random keystrokes. Name Validation also has access to names from all over the world, giving it the ability to handle leads with names that are less common in North America.

Lead Validation compares the name to other data points such as the phone number, email and address to determine how the name connects to the rest of the lead. Red flags found, such as those listed above, factor into the scoring, returning a quality score that indicates the reliability of the given name, both by itself and as part of the lead. Unusual or unknown names are not necessarily failed. Generally, names can be considered good, unknown or bad. However, to get the “bad” designation, we expect to see that the name fails in at least one of the red flag categories mentioned above.

Address component

The address component uses DOTS Address Validation – US, DOTS Address Validation – Canada, and DOTS Address Detective to correct, standardize and validate addresses in the United States and Canada, as follows.

  • Address Validation – US uses our top-of-the-line address validation engine and the USPS dataset to validate the given address, identify it as a business or residence, and determine if it is a mailable location, among other things.
  • Address Validation – Canada parses, cleanses and validates Canadian addresses in English and French.
  • Address Detective uses tools to deal with extremely messy addresses, from address information all appearing on a single line to jumbled inputs such as the street address being assigned to the city or the state being assigned to the postal code. Address Detective also has access to addresses not available in the USPS dataset to help with more challenging address inputs.

Lead Validation uses the results of these services, along with comparisons against other results like phone number and IP address, to build a component score that reflects both the quality of the component and its relation to the lead as a whole. Other Lead Validation specific tests look for things like hotels, prisons, intentionally bad data, post office boxes, CMRAs and more that also influence the component score.

Phone component

The phone component uses primarily DOTS Geophone Plus to gather contact, provider and location data for up to two phone numbers. Other important pieces of information are also collected. Is the number a landline, wireless or VOIP line? Is it a residential or business number? Does it appear to be a Google or Skype number? Is it connected? Can we detect patterns in the number that might signify that it is just randomly typed in numbers?

Lead Validation compares the resulting contact and location data back to the initially given name, business name, address, email, and IP address inputs to determine any connections that can be made between data points. These influence the component score along with the basic question: does it appear to be a good number?

The additional data points collected from the phone number also provide additional insights. Did a business lead provide a residential phone number or a personal email address, or was a wireless line used? Dozens of tests create a component score that reflects the quality of the given phone number and its connection to the lead. If two numbers are given, the analysis is done on both numbers and the better fit for the lead is chosen as the primary one for comparison purposes.

Email component

The email component uses our DOTS Email Validation service to perform a step-by-step process: email correction to fix common mistakes, syntax checks to make sure the address is both syntactically valid for email and conforms to the rules of the given domain, a DNS check to make sure the domain exists, an SMTP check to find the presence of a valid mail server, and various other integrity checks. It tests that an email server is operational and accepting mail, as well as whether a specific mailbox is valid. Other data points collected include whether the email seems to be bogus, vulgar, garbage, disposable, an alias, a spam trap or associated with a bot, whether it is a free or business email, and more.

Lead Validation compares the email to other data points like name, business name, IP address and phone number to see if they can be connected. That, combined with whether the mailbox is good and seems to be connected to the user, along with any red flags found while testing the email, leads to a component score that considers how valid the email address is and its likelihood of being a part of the lead.

IP address component

The IP address component uses primarily the DOTS IP Validation service to identify if the IP address is a good one, its country of origin or more accurate location, whether or not it belongs to a known proxy and if it appears to belong to a mobile device. The service can identify harmful proxies to determine if the lead is attempting to hide their location or check if the IP address has been linked to malicious behavior. Other pieces of information are returned as well to identify the internet service provider (ISP) or link the IP address to a business.

Lead Validation will compare IP address to address, phone, email and business to determine if any positive or negative connections can be made. Lead Validation will assess any high-risk countries along with the consideration of malicious and proxy IP addresses to determine the quality of the IP component and how it fits in with the lead.

Business component

The business component differs from the other components. While the other components work for all leads, the business component is designed to work only with leads designated as business leads. This designation is controlled by TestType (i.e. TestType=business or TestType=businessonly), as users can decide if their leads are business, residential or perhaps a bit of both. This component also does not rely on existing services to gather core data.

Lead Validation performs its own internal tests and checks against our business datasets. Data points found can be compared to business names, addresses, IP addresses, phone numbers, and emails to look for connections. Other checks look for red flags in the given business name such as vulgarities and potential bogus submissions. All of these checks combine to create a business component score that reflects its validity and how it fits in with the lead as a whole.


Each of the components described above returns its own 0-100 certainty score and a quality recommendation (i.e. Accept, Review or Reject). Generally, a high score indicates that the component itself is good, while a low one indicates that it is bad. However, each component also has scoring based on cross-comparisons built in. For example, a given phone number might be perfectly valid, but during the cross-examination phase we find that it seems to belong to a person not indicated by the input lead. This would likely give the phone component a poor score because, while the number is technically a “good” number, it is not good for the lead.
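To show how the per-component results might be used once a lead has been scored, here is a small Python sketch. The dictionary layout, key names and sample values are assumptions made for illustration and do not reflect the actual response format; only the idea of a 0-100 score plus a quality recommendation per component comes from the description above.

```python
def components_needing_attention(components):
    """Return (name, score, quality) for components not accepted, worst score first."""
    flagged = [
        (name, result["score"], result["quality"])
        for name, result in components.items()
        if result["quality"] in ("Review", "Reject")
    ]
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical component results for one lead. The phone number is valid on
# its own but appears to belong to someone else, so its score is low.
lead_components = {
    "name": {"score": 92, "quality": "Accept"},
    "address": {"score": 85, "quality": "Accept"},
    "phone": {"score": 20, "quality": "Reject"},
    "email": {"score": 55, "quality": "Review"},
}

for name, score, quality in components_needing_attention(lead_components):
    print(f"{name}: {score} ({quality})")
```

Listing the weakest components first makes it easier for a reviewer to see why an otherwise plausible lead was held back.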

Hopefully, this gives you a strong overview of our Lead Validation service, as well as some insight into how the components are tested and how they relate to the overall lead. If you would like to learn more about Lead Validation, please visit our product page and developer guide.

DOTS Name Validation 2: What Do The Scores Mean?

What’s in a name? Hopefully, valuable contact data for your business. But some names clearly contain red flags for bad data – and that’s where we come in.

Name Validation is a very effective tool for weeding out garbage, bogus and unreliable names. This service can be used in real time while creating leads, or used to process a large list of names at once. It is a great tool for cutting down on the amount of unreliable data that can be entered into a system.

This article will walk you through the different scores that the DOTS Name Validation 2 service provides, to help you get the most out of this tool. In addition to comparing input names against a massive list of names, we also perform several other checks. These scores can help identify why a particular name was considered invalid, and shed some light on the types of validation Name Validation performs.

Overall scores

One of the first things users will want to look at is the OverallNameScore value. This score represents the service’s overall rating for the given name. It ranges from 0 to 5, with 0 indicating a definitely bad name and 5 indicating a definitely good name. This is usually the first result someone might look at when determining the validity of a name.

We generate this overall score based on several other checks, validations and scores that the service can generate. However, this might not be the last stop a user would make when attempting to determine if a name is valid or not. Based on your use case, you may want to look at one of the other score values our service provides, described below.

Other scores provided

The other score values that the service gives also range from 0 to 5. These values indicate the likelihood that the particular scoring category applies to that name. For example, if a name received a VulgarityScore of 5, then that name would definitely have some type of vulgar word present. Below are the different scoring categories that the service provides.


Vulgarity score

As mentioned above, the VulgarityScore indicates the likelihood that a vulgar word is present in the input name. This score highly affects the overall score, as this is a key item used to sniff out bad or unprofessional name information.


Celebrity score

This rating represents the likelihood that the input name provided is a known celebrity. This field will also work with fictional celebrities, so names like “Mickey Mouse” and “Homer Simpson” will receive high Celebrity scores, as well as real-life celebrities like “Tom Cruise” or “Madonna”.


Bogus score

The BogusScore field will let the user know if a given name is simply a word or phrase that wouldn’t make sense as a name. For example, single words or phrases that aren’t names (such as “Sandwich” or “The Quick Brown Fox”) will receive a high bogus score.


Garbage score

Random keystrokes or inputs that are not valid words will receive a high Garbage score. This would correspond to input like “asdfg” or any other series of random letters, keystrokes and input that doesn’t make a whole lot of sense as a name.


Dictionary scores

Finally, we provide scores that indicate the likelihood that the input text is a dictionary word. These tend to have less weight on the overall score, as there are quite a few legitimate dictionary terms that can be considered last names. For example, “Park” is a relatively common last name, so it will receive a lower dictionary score of 1, while a word like “Fluorescent” would receive a high dictionary score because it is less common as a name.
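To make the scores concrete, here is a minimal Python sketch of how a caller might triage a name once the scores are in hand. OverallNameScore, VulgarityScore and BogusScore are the field names used in this article; CelebrityScore and GarbageScore are assumed spellings, the thresholds are arbitrary examples, and the result dictionary stands in for whatever your integration parses out of the response.

```python
RED_FLAG_FIELDS = ("VulgarityScore", "CelebrityScore", "BogusScore", "GarbageScore")


def triage_name(result, red_flag_threshold=4, overall_threshold=3):
    """Return (decision, reasons) for a dictionary of 0-5 scores.

    Thresholds are illustrative defaults, not recommendations from the service.
    """
    reasons = [
        field for field in RED_FLAG_FIELDS
        if int(result.get(field, 0)) >= red_flag_threshold
    ]
    if reasons:
        return "reject", reasons
    if int(result.get("OverallNameScore", 0)) >= overall_threshold:
        return "accept", []
    return "review", []


# Hypothetical scores for an input such as "asdfg".
garbage_like = {"OverallNameScore": 0, "VulgarityScore": 0, "CelebrityScore": 0,
                "BogusScore": 1, "GarbageScore": 5}
print(triage_name(garbage_like))
```

Dictionary scores are left out of the red-flag list on purpose, since the article notes that they carry less weight and many dictionary words are legitimate last names.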

As with any of our services, there can always be specific use cases that may require some more information about how our services work. Service Objects has a team of customer-focused people standing by to help. If you have any questions about our services, don’t hesitate to reach out to us – we would love to help you get the validated data you need!

Bringing Dead Letters Back to Life

All right, we are finally going to admit it: there are some bad mailing addresses out there that even Service Objects can’t fix.

Of course, we’re talking about cases like illegible handwriting, physical damage, or the kid who addresses a Christmas letter to “Santa Claus, North Pole.” But even for them, there is hope – in the form of a nondescript building on the outskirts of Salt Lake City, Utah, known as the USPS Remote Encoding Center. Images of illegible mailing addresses are sent here online from all over the United States, in a last-ditch effort to get these pieces of mail where they are going.

Behind the walls of this beige, block-long building lies an optometrist’s dream: nearly 1700 employees working 24 hours a day, each scanning a new image every few seconds and matching it to addresses in the USPS database. (The same database we use to verify your contact address data, incidentally.) Most get linked to a verified address and are sent on their merry way; the truly illegible ones are forwarded to the USPS’s Dead Letter facility to be opened, and those letters to Santa get forwarded to a group of volunteers in Alaska to be answered.

According to the Smithsonian, there used to be more than 50 of these facilities all over the US. With time and improving automation, all of them have now been shuttered, with the exception of this lone center in Salt Lake City. To work there, you need to be fast and precise, and you go through more than a full week of training before being put on one of 33 shifts, handling the roughly two percent of mail pieces that the Post Office’s computers cannot read automatically. That’s between five and eleven million pieces of mail per day on most days.

Of course, technology continues to improve, and USPS has become a world leader in optical character recognition for both handwritten and machine-addressed mailing pieces – even 98 percent of hand-addressed envelopes are processed by machine nowadays. In an interview with the New York Times, the center’s operations director acknowledges that computer processing could eventually put them out of business entirely. But for now, human intervention for illegible addresses hasn’t yet gone the way of the elevator operator.

Thankfully, your business correspondence probably isn’t hand-scrawled by your Aunt Mildred. And hopefully Santa Claus doesn’t show up very often in your prospect database (although fake names get entered for free marketing goodies more often than you think, and we can easily catch and fix these). So your chances of ending up on a computer screen in Salt Lake City are pretty slim – which means we can help you ensure clean contact data, and leverage this data for better marketing insight.

So for those of you who can’t spell, failed penmanship when you went to school, or have a habit of leaving your envelopes out too long in the rain, there is still hope. For the rest of you, there is Service Objects.

Thinking Alternatively About Place Names

Here at Service Objects we come across a lot of names, particularly the names of places. We also work with a lot of personal names, but for now I would like to focus on just place names. Whether the name is for a city, town, village, hamlet, district, region, state, prefecture, mining area, national park, theme park or what have you, chances are that the place has one or more alternate spellings and alternate names associated with it.

For a human fluent in English, “North Carolina” and “N. Carolina” will be considered equal, but for a computer they are not. With the use of fuzzy-matching and/or standardization we can work around seemingly trivial issues like this. Now let us suppose that you are working with a set of Japanese data and come across the same name but written in Katakana “ノースカロライナ” or Ukrainian data written in Cyrillic “Північна Кароліна” or even Thai “รัฐนอร์ทแคโรไลนา”. Well, fuzzy-matching and standardization are still our friends; we just have more fuzzy-matching and standardization rules to consider. However, we first need to ensure that we even have the data available to associate a name in a different language.
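To illustrate the idea, here is a small Python sketch that combines a standardization step with an alias table and a fuzzy fallback. It is only a toy under stated assumptions: the alias table is hand-built for this example, the normalization rules are simplistic, and the similarity cutoff is arbitrary. It uses the standard library’s unicodedata and difflib modules rather than anything from our own services.

```python
import difflib
import unicodedata


def normalize(name):
    """Case-fold, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_marks)
    return " ".join(cleaned.casefold().split())


# Hand-built alias table for illustration; a real library would be far larger.
ALIASES = {
    "North Carolina": ["North Carolina", "N. Carolina", "ノースカロライナ", "Північна Кароліна"],
}
LOOKUP = {
    normalize(alias): canonical
    for canonical, aliases in ALIASES.items()
    for alias in aliases
}


def resolve_place(name, cutoff=0.85):
    """Return the canonical place name, or None if nothing is close enough."""
    key = normalize(name)
    if key in LOOKUP:
        return LOOKUP[key]
    close = difflib.get_close_matches(key, list(LOOKUP), n=1, cutoff=cutoff)
    return LOOKUP[close[0]] if close else None


print(resolve_place("n. carolina"))
print(resolve_place("Nort Carolina"))
print(resolve_place("ノースカロライナ"))
```

The exact lookup handles known alternates in any script, while the fuzzy step only catches near-misses within the same script, which is why the alias data has to exist before fuzzy matching can help across languages.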

We’ve been creating a list of place names to help us tackle problems like the ones mentioned above. We currently have a list of over five million unique place names generated from a pool of approximately 11 million names. We are aggregating name data to come up with a more comprehensive list that consists of known alternates, variations in spellings, different languages and the transliterated versions for the different languages.

Here’s a quick look at what we have accomplished, so far:

  • Current list of approximately eight million place names and growing
  • Transliteration and phonetic mappings for various languages
  • Case, accent and kana sensitivity handling
  • Queryable using fuzzy-matching algorithms

We have taken some of what we have learned from our DOTS Address Validation – International service and built upon it in order to improve data beyond the realm of just address validation. When working with Phone, Email, IP, Demographic and Geo-coordinate related data we too often find that location names do not match up. Naturally this is to be expected, since different data vendors will have different standardizations and practices when it comes to naming conventions. Utilizing a comprehensive place name library will allow us to quickly perform various actions, such as cross checking multiple data sources against each other with increased flexibility and match rates.

It may not be immediately apparent how useful a place name library like this is and what kind of avenues it can open up, but expect to see new and exciting developments from us in the coming months!

Name Deduplication Techniques

Duplicate records are the bane of any Database Administrator. They take up unnecessary space and generally do not provide any added value to contact records. A more challenging task for Database Administrators is identifying and merging records which might be duplicates, in particular duplicate names.

Identifying Duplicate Records

There may be variants of a given name which might not be easily identified in a query, but they are invariably linked. A common example might be Joe Smith vs Joseph Smith. Both could be referring to the same person, depending on how the user entered their name.

Name Variants, Finding the Common Name

A particularly useful feature of the Name Validation 2 service is the Related Names output field. This field provides a comma-separated list of first name variants for a provided name. For example, for the given name Joe, related names returned include Joel, Joeseph, Joey, Josef, Joseph, and José.

With this information, it becomes easier to identify names which are related but in a different form. There may be cases, however, where names cannot be identified as related but can be linked from similarity. Some examples include names that are misspelled or alternate names which are not related but similar. These names can still be identified through the Similar Names output fields of the Name Validation 2 service.

Similar Sounding Names

DOTS Name Validation 2 employs sophisticated similar name matching algorithms to match names drawing from a database of international names with up to 1.4 million first names and 2.75 million last names. First and last name similar results are returned in a comma-separated list which can be used to compare against names that already exist in the database.

For example, the given name Robert Smith would return similar first names Rhobert, Róbert, Robertt, Roebert, Roibert, Rubert, Robbert, and similar last names Smyth, Smithe, Smiith, Smiyth. Of the similar names that are found, names are returned in order of most common to least common.
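One way to use the two kinds of output together is to grade how strongly an existing record matches, so that related names are treated as stronger evidence than similar-sounding ones. The Python sketch below assumes both fields arrive as comma-separated strings, as described above. The related names for Robert shown here are made up for the example, and the function names are hypothetical.

```python
def split_names(field):
    """Turn a comma-separated output field into a list of clean names."""
    return [name.strip() for name in field.split(",") if name.strip()]


def match_strength(candidate, input_first, related_field, similar_field):
    """Classify how a stored first name relates to the validated input name."""
    key = candidate.strip().casefold()
    if key == input_first.strip().casefold():
        return "exact"
    if key in {n.casefold() for n in split_names(related_field)}:
        return "related"
    if key in {n.casefold() for n in split_names(similar_field)}:
        return "similar"
    return None


# Similar first names from the example above; related names are hypothetical.
similar_first = "Rhobert, Róbert, Robertt, Roebert, Roibert, Rubert, Robbert"
related_first = "Bob, Rob, Robbie"

for stored in ["ROBERT", "rob", "Rubert", "Richard"]:
    print(stored, match_strength(stored, "Robert", related_first, similar_first))
```

Records graded as “similar” are the most likely to be false positives, so they are good candidates for manual review rather than automatic merging.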

Merge and Promote the Winning Record

Using these results, a query can potentially link similar or related names and identify records which are duplicates. Once duplicate records are identified, the question becomes: which should be promoted as the winning record? This decision depends on business logic. For example, the winner might be the record that contains other vital contact points such as address or phone number, or the choice might be based on entry date. Once a winning record is chosen, a merge process copies contact fields from the identified duplicates to build a complete record.
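As one possible shape for that merge step, here is a short Python sketch. The rule it applies (most complete record wins, with the earliest entry date as a tie-breaker) is only an example of business logic, and the record layout and field names are hypothetical.

```python
from datetime import date

CONTACT_FIELDS = ("address", "phone", "email")


def completeness(record):
    """Count how many contact fields are filled in."""
    return sum(1 for field in CONTACT_FIELDS if record.get(field))


def merge_duplicates(records):
    """Promote the most complete (then oldest) record and fill its gaps from the rest."""
    ranked = sorted(records, key=lambda r: (-completeness(r), r["entered"]))
    winner = dict(ranked[0])  # copy so the input records are left untouched
    for other in ranked[1:]:
        for field in CONTACT_FIELDS:
            if not winner.get(field) and other.get(field):
                winner[field] = other[field]
    return winner


# Hypothetical duplicates identified through related names (Joe vs Joseph).
duplicates = [
    {"id": 10, "name": "Joe Smith", "address": "", "phone": "555-0100",
     "email": "", "entered": date(2019, 5, 1)},
    {"id": 11, "name": "Joseph Smith", "address": "1 Main St", "phone": "",
     "email": "joe@example.com", "entered": date(2021, 2, 3)},
]

merged = merge_duplicates(duplicates)
print(merged["id"], merged["phone"])
```

Filling only empty fields keeps the winner’s own values intact; conflicting non-empty values would still need a rule of their own or a human decision.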


Ridding your database of duplicate contact records can be an arduous task, but with the help of Name Validation 2, it doesn’t have to be. Leveraging the vast quantity of names that Name Validation 2 draws upon yields a top quality solution to identifying duplicates through related and similar names.

For more information about the Name Validation 2 service, or to receive a free trial key, click here.

For developers, our Name Validation 2 documentation can be found here.

Why We Geek Out Over Name Validation

What’s in a name? Everything — especially if you’re trying to connect with customers and prospects. If you’re emailing, mailing, or calling someone and you have her name wrong, you’ve already lost her.

The importance of name validation APIs

Name validation is becoming increasingly important in the modern world where social media and the Internet allow for a faster-than-ever propagation of bad data. For example, as people opt into various offers, it’s not unusual for auto-correct to change their entries, for a typo to occur, or for the person to enter a bogus name. On other occasions, a name that looks fraudulent and is labeled as such, could really be a legal name. This is the case for this man who legally changed his name to Fire Penguin Disco Panda:


Companies wanting to avoid potentially embarrassing situations like putting a bad name on a piece of mail, or removing a perfectly good contact with a name they think is fraudulent, should consider using a service like DOTS Name Validation, an essential ingredient in marketing automation, business databases, CRMs, and the like. Not only does name validation perform helpful changes such as parsing names into individual fields, fixing the order of names, and returning the gender of the individual, but our name validation API also runs a variety of checks to ensure the name isn’t a bogus, celebrity, or vulgar name.

Updated name validation scoring algorithms

We recently pushed a major update to our name validation service, including many international names as well as massive improvements to our scoring algorithms. Our name validation database now has almost 5 million first and last names in it.

Our scoring algorithms are where the service truly shines. Even when we get an obscure name that we are not sure about, we look to our algorithms to separate the unknown from the bad. This is where our team likes to geek out. We enjoy thinking of new ways to combine results to identify complex names.

Here’s where we get geeky

We love to get creative with our name validation service. We spend time poring over lists of celebrities, vulgar names, and any crazy, goofy thing we can think of.

What are some of the things we are interested in? We love unusual names. For example, should we consider the names Anakin and Khaleesi as valid now that people are actually naming their babies after these characters? And you can imagine the fun we’ve had talking about Anita Bath and Warren Peace.

We track a lot of vulgar and goofy-type potential names, but what about alterations to those? For example, we might nail the name Hugh Jass, but what about similar names like Hue Jass, Hugh Jazz, Hou Gass, or Hue G. Azz? What if someone submits the name Bob Ba$$? Could we figure out that the intended name should be Bob Bass?
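As a rough illustration of the Bob Ba$$ case, one simple approach is to map look-alike symbols back to letters before checking a name. The substitution table below is a hypothetical example, not the list our service uses, and the sketch does nothing for sound-alike spellings such as Hue Jass, which would need a phonetic comparison.

```python
# Hypothetical look-alike table; a production list would be much larger.
SUBSTITUTIONS = str.maketrans({"$": "s", "@": "a", "0": "o", "1": "l", "3": "e"})


def normalize_symbols(raw_name):
    """Replace look-alike symbols with letters and collapse extra spaces."""
    return " ".join(raw_name.translate(SUBSTITUTIONS).split())


print(normalize_symbols("Bob Ba$$"))  # prints "Bob Bass"
```

The cleaned-up value can then be looked up like any other name, while the original input is kept so the alteration itself can still be flagged.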

What if a name is submitted that should not be a name like “House on the corner” or perhaps the name of a business instead of a person? These sorts of things can be tricky to identify in an automated system, but our team lives for solving these kinds of problems.

We let our inner geeks out so that we can anticipate and flag bogus, prank, and unusually challenging names. Though our name validation software uses algorithms to score and validate data, they’re powered by both artificial and human intelligence.

Use Name Validation to Get your Customer’s Name Right

Name Validation

Name validation is important in many ways: a name is one of the easiest places for someone to provide fake information on a web form, yet fake names can be very tricky to detect properly. Take this example of a real piece of mail:


More and more of the processes that accept this sort of data are run by computers. Human eyes review the data less often as the entire process, from start to finish, becomes fully automated.

Name validation can easily be dismissed as an unnecessary addition, but the ramifications of mistakes can be far-reaching. Small mistakes can be embarrassing, and larger ones can give a company a big PR black eye if they make their way onto the internet.

What’s going on behind the scenes

At Service Objects, we are always looking for ways to improve all data inputs at the point of entry, and name validation is no exception. We have millions of known first and last names from around the world and algorithms honed over years of work to weed out oddities in names. We are looking for celebrity names, vulgar words, words from a dictionary and things that just plain look like garbage or bogus. We constantly strive to improve our algorithms and take pride in identifying fake names.

In the example above it seems obvious that the name is bad, but can an automated process safely say so? What about a valid name such as Martita Boobier, which contains questionable words? What about something like Letit Boobra, which doesn’t appear vulgar but doesn’t appear to be a valid name either? The goal of DOTS Name Validation is to place these names into the appropriate category, taking the worry out of an automated process mishandling them.

Avoid adding bad names to your CRM in real-time

Consider bad names such as “Trucker Bob”, “Doctor Nick”, “Homer Simpson”, and “Felix the Cat”; entries that don’t appear to be names at all, such as “The Big Bang” or “Service Objects”; and entries that appear to be complete garbage, such as “Asdf Blah”. DOTS Name Validation can properly identify many cases like these that might otherwise slip through the cracks without proper review.

Data Validation In Real Estate

The real estate industry can gain a competitive edge with data validation

Data-based marketing, outreach and lead generation isn’t only for cutting-edge B2B companies anymore. Data runs the world these days and successful businesses in every industry can benefit from using verified, validated data in smart ways.
Working with generic data isn’t enough, either. It can be inaccurate and out of date, making it as useful as no data at all—worse, even, if you’re relying on this information. That’s why smart real estate organizations—from large firms to independent agents—are investing in data validation services.

Data validation verifies that the information you’re working from, whether about a specific lead or regional demographics, is accurate and up to date. Validation can be as simple as verifying correct names, phone numbers and current addresses, or can be as nuanced as geo-targeting, IP address validation and reverse phone lookup discovery. No matter the level of data verification, the results are the same: correct information can help you make better-informed decisions and accurately target your audience.

Clever and industrious people in the real estate industry can benefit from just about every type of data validation; it’s all about keeping an eye on trends and getting the right message to the right people at the right time.

Address validation

This is simple but crucial for real estate agents, who still spend a considerable amount on direct mail marketing. Getting a personalized mailer in the hands of the right person is important. RealTrends found that targeted direct mail pieces had a 2-5 percent response rate, versus the 1 percent rate when real estate agents mailed the piece to everyone without specific targeting.

Running Address Validation before a direct mail send can help ensure that you have the resident’s correct name (“Current Resident” makes the piece seem extra promotional and impersonal) and the correct gender salutation, and that the target actually lives at that address.

[Image: “Or Current Resident”, via Evil Mad Scientist]

Using a data validation service that has access to the USPS National Change-of-Address database can help further refine outreach. If a new family just moved into the address you’re targeting, they’re probably not looking to move again soon, so strike that address off the list for now.

Taking address validation a step further with geocoding validation can help real estate agents get a jump on hot trends and growing neighborhoods. Cross check a list of addresses against a trending neighborhood’s longitude and latitude to make sure the addresses you have really are in the hot spot. People currently in this neighborhood might want to capitalize on the new demand and sell their home at a profit, making them prime contacts for savvy real estate agents. Extend your validation and outreach efforts to the surrounding neighborhoods to get a leg up on the competition.
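The cross-check described above can be sketched as a simple bounding-box test, assuming each address has already been geocoded to a latitude and longitude. The coordinates, addresses and the `BoundingBox` helper are all made up for illustration, and a rectangle is a simplification: real neighborhood boundaries are usually irregular polygons.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    """Rectangular approximation of a neighborhood, in decimal degrees."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat, lon):
        return (self.south <= lat <= self.north
                and self.west <= lon <= self.east)


# Hypothetical neighborhood bounds and already-geocoded addresses.
hot_spot = BoundingBox(south=34.40, west=-119.75, north=34.45, east=-119.68)
geocoded = {
    "101 Example St": (34.42, -119.70),
    "202 Sample Ave": (34.50, -119.70),
}

in_hot_spot = [address for address, (lat, lon) in geocoded.items()
               if hot_spot.contains(lat, lon)]
print(in_hot_spot)  # prints ['101 Example St']
```

Widening the box slightly is one way to extend the same list to the surrounding neighborhoods.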

Reverse phone lookup

Reverse Phone Look-up enables companies to put a name and current address to a phone number. This is particularly useful since many people now move but keep their original cell phone number. This trend makes phone numbers alone a hard way to target people, especially with the declining use of landlines. According to Time, 41 percent of homes were landline-free as of 2014 and 60 percent of adults ages 30-34 exclusively use a mobile phone. With the average age of first-time home buyers currently sitting at 31 and expected to climb to 32-34 in the coming years, this makes reverse look-up validation an invaluable resource for real estate agents.

This type of validation will tell you if the people on your list of phone numbers truly do live in your territory. Plus, it will give you their most current address and name. National real estate companies can use this validated data to send location-specific messaging to everyone on their list, based on the person’s current location.

Demographic validation

A core premise of marketing, no matter what industry, is “know your audience.” Demographic data validation can help real estate agents get an accurate and intimate understanding of the areas they work in. Gut instincts are essentially gambles, whereas using validated data ensures you have reasonably accurate and updated information. By working with US census validated demographic data, real estate agents can change and target their messages based on location.

  • Spanish-language ads can be placed in predominately Hispanic neighborhoods
  • First-time homebuyer messaging can be sent to areas with a high concentration of young adults reaching the pivotal first-time homebuyer age
  • Direct mail pieces discussing downsizing can be targeted to areas with mostly older adults
  • Property opportunities in the up-and-coming business district can be promoted to small business owners in the area

Understanding the population make-up of a particular area can also help influence how you market properties. Areas that are mainly suburban are likely to connect more with family-oriented messages while urban areas probably want to hear more about high-end home features and nearby amenities. By using a combination of demographic validation and geocoding validation, agents can perfectly target each area.

This level of data also provides insight into the average income and spending of nearby households, which is helpful when pricing houses and projecting commissions.

Competitive edge

Many real estate agents work independently and cannot afford to waste time, resources, and money on misguided marketing and outreach efforts. This is where a commitment to clean data and consistent data validation can provide a competitive advantage. Committing to using validated data as a key business tool can help real estate firms accurately focus efforts and spend smartly with better response rates.

Data can be intimidating, but with good data validation the return on investment is well worth it. Look into the different features and options offered to begin cleaning up your data and deciding which level of data-based targeting will work best for you. Go beyond just address validation and get creative if you want to pull ahead of the pack.

Name Validation and the Colorado Rockies

Have you ever misspelled a customer’s name? If so, then you know that the backtracking you need to do in order to remedy the problem can take some time and effort. Misspelling customers’ names, especially in service-oriented businesses, can easily derail your customer service and marketing efforts. Think about how much time and money you’ve already spent in order to learn about your target customers, their preferences, and how best to reach out to them. Then you blemish what should have been a pleasant moment of contact by misspelling their names. 

Consider these two Colorado Rockies blunders of 2014. First, they honored the spectacular batting average of Troy Tulowitzki and gave away 15,000 baseball jerseys that spelled the celebrated shortstop’s last name as Tulowizki. The Rockies did some well-timed damage control by posting an apology on their Facebook page that acknowledged the mistake. Then, barely two weeks later, the same Major League Baseball team introduced a batch of new souvenir cups that had Nolan Arenado’s last name printed as Arendo. Arenado, the Rockies’ third baseman, had won a Gold Glove and was thus referred to on the souvenir cups as Golden Arendo.

Examining these examples from a branding perspective, you might conclude that botching people’s names can make your prospective customers lose confidence in what you have to offer. A lack of attention to detail in something as crucial and downright personal as a name says a lot about your ability to deliver on whatever marketing claims you are making about your product or service. If the slip-up happens only once, people can still be forgiving. However, if you commit the same oversight over and over, it can suggest a lack of attention to detail or poor customer service. Imagine how alienating it can be for an eager shopper (one who is clearly interested in your offerings, because he went through the trouble of signing up through your lead generation form) to see his name misspelled on follow-up communications. In some cases, mucking up your time-honored personalization routine by messing up a customer’s name can even make or break a sale.

This is the era of highly targeted and location-based marketing programs, social media, mobile wallets, smart wearables, advanced analytics, and drones that deliver merchandise at the doorsteps of shoppers. Consumers have developed higher expectations, and they formulate their purchase decisions based on those expectations. Marketers, on the other hand, can now wield a plethora of digital tools to streamline, as well as make more profitable, their day-to-day business operation. 

Getting your customers’ names spelled correctly shouldn’t be one of your issues. In this tedious part of customer service, names can be verified, corrected and flagged automatically by a real-time name validation service. Our name verification API service not only intelligently validates your customers’ names but can also identify the gender associated with the first name, helping you address them properly.

If you equip your business with the right tools to support your branding, marketing and customer service efforts, you’ll reduce the time, energy and money spent on fixing errors.


International Name Validation – Making Sense of Latin Characters

Olá, Grüezi, Cześć, ¡Hola! – Hello! While you may or may not be multilingual, your company likely has customers whose names contain accented Latin characters such as ø, á, ñ, or ü.

A brief history of accented Latin characters

According to Omniglot, the online encyclopedia of writing systems and language, the modern Latin alphabet consists of 52 letters (upper and lower case) as well as various symbols, punctuation marks, and numerals. In addition to the basic Latin alphabet, many languages (such as French, Spanish, Swedish, Italian, Portuguese, and many others) supplement the Latin alphabet by adding accent marks to vowels and some consonants.

Latin accents, such as the tilde (ã), umlaut (ä), slash (ø), acute (á), grave (à), circumflex (â), and cedilla (ç), are used to:

  • Change pronunciation
  • Indicate emphasis in a sentence
  • Indicate what to stress in a word
  • Indicate pitch or intonation
  • Indicate vowel length
  • Visually distinguish homophones

The problem with Latin characters

Accented Latin characters are difficult to handle properly in computer programs and APIs that do not support international characters. For example, a name such as Zoë Smith might be failed as “garbage” in a typical name validation tool simply because the service doesn’t understand the “ë” character. In fact, our very own name validation previously had the same issue. However, we recently rectified this character set issue in our own service with a new feature that adds support for approximately 62 common Latin and international characters.

DOTS Name Validation now accepts the most common accented Latin characters. This Latin character set update allows more Spanish, French, Italian, and German names to be validated and “pass” our name validation service without being kicked back as a “garbage” name.
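To see why a name like Zoë Smith can be rejected, compare an ASCII-only character check with a Unicode-aware one. This is a generic sketch of the problem and is not how DOTS Name Validation is implemented; both function names are invented for the example, and the Unicode check shown accepts any letter rather than a fixed set of about 62 characters.

```python
import re
import unicodedata

# Typical ASCII-only pattern: unaccented letters, spaces, apostrophes, hyphens.
ASCII_ONLY = re.compile(r"^[A-Za-z' -]+$")


def passes_ascii_check(name):
    """Character check that knows nothing about accented letters."""
    return bool(ASCII_ONLY.match(name))


def passes_unicode_check(name):
    """Accept any Unicode letter plus spaces, apostrophes, and hyphens."""
    # NFC composes "e" + combining diaeresis into the single letter "ë".
    normalized = unicodedata.normalize("NFC", name)
    return bool(normalized) and all(
        ch.isalpha() or ch in " '-" for ch in normalized
    )


for name in ["Zoë Smith", "Terje Lundbø", "Carlos Fernández", "Bob Ba$$"]:
    print(name, passes_ascii_check(name), passes_unicode_check(name))
```

The three accented names fail the ASCII-only check and pass the Unicode-aware one, while Bob Ba$$ fails both. A character check like this only decides whether the input can be read at all; it says nothing about whether the name is genuine.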

What does our International Name Validation update mean to you?

Our customers asked us to support accented Latin characters, and we’re excited to deliver!

Since the update, more names in your contact database such as Terje Lundbø, Carlos Fernández, Jason Castañeda, and Fritz Müller are now being processed and verified by DOTS Name Validation successfully. This added feature also supports our composite lead verification services including DOTS Lead Validation and DOTS Order Validation. If your database contains a variety of international names, give our name validation services a try for free and see how it works for you!

Connecting the DOTS: It Starts With a Name

The new video series Connecting the DOTS, featuring Jim Harris of Obsessive-Compulsive Data Quality, explores name validation and its important role in ensuring data quality excellence within your organization.

The most personal of personal data is a person’s name, which is why the most impersonal thing is getting a person’s name wrong. When our names are entered into databases, either by ourselves or by others, we want interfaces that can parse and validate our names and differentiate the authentic from the invalid and fraudulent. Your business is dependent on the quality of your contact data, and when it comes to contact data, it starts with a name.

DOTS Name Validation is a real-time API web service that parses names into individual data fields, fixes the order of names, and returns the gender associated with the first name. With name verification, companies can instantly separate legitimate contacts from bogus ones, stopping fraudulent names at the point of entry. Make sure that your contact records contain correct names by using DOTS Name Validation 2 to verify accuracy.

Jim Harris is a recognized data quality thought leader with over 20 years of enterprise data management experience. Jim is a freelance writer, independent consultant, and Blogger-in-Chief at Obsessive-Compulsive Data Quality, a vendor-neutral blog about data quality and its related disciplines.

What’s in a (First & Last) Name?

Little known fact: William Shakespeare was an accomplished data quality expert before he hit it big as a playwright. In fact, data quality was an underlying theme of many of his plays.

Name validation, for example, played a central role in Romeo and Juliet where, in the family feud of fair Verona, the Capulets and Montagues were sworn enemies. This was bad news for the budding relationship of Juliet Capulet and Romeo Montague. If an unacceptable family name be the cause of such calamity, what could the young lovers do? Perhaps a little strategic data cleansing? “Deny thy father, and refuse thy name,” suggests Juliet to Romeo. “Or, if you will not, be but sworn my love, and I’ll no longer be a Capulet.”

After realizing that parting with their family names would be such sweet sorrow, Juliet ponders their predicament more philosophically. “What is Montague? It is not hand, nor foot, nor arm, nor face, nor any other part belonging to a man.” Furthermore, she semantically argues, “what’s in a name? That which we call a rose by any other name would smell as sweet.”

Shakespeare’s other line of work was much more pragmatic. What’s in a name? Prefix, first name, middle name, last name, and suffix. To parse or not to parse was often not a question since many sources of contact data do not separate names into those individual data fields. What is Montague? Romeo’s last name. Further name verification can determine the gender most commonly associated with the first name, identify possible nicknames and related names (including Bill, Billy, Will, and Willie for the so-called Bard of Avon), and assess the likelihood that a name is fake (even outside of Verona, the name Juliet Capulet-Montague probably merits suspicion).

If the validity of the names Montague and Capulet had not been so fiercely debated, the story of Juliet and her Romeo would not have been a tale of woe. The blogger protests too much, methinks. One thing is certain, however. Name validation is essential to preventing your business and its contact data from becoming star-crossed lovers.

This post comes from guest blogger Jim Harris of Obsessive-Compulsive Data Quality. Harris is a recognized data quality thought leader with over 20 years of enterprise data management experience. Harris is a freelance writer, independent consultant, and Blogger-in-Chief at Obsessive-Compulsive Data Quality, a vendor-neutral blog about data quality and its related disciplines.


When Name Data Goes Terribly Wrong

An Open Letter to Businesses,

Last week we learned of an extremely unfortunate event where a Fortune 500 company unintentionally mislabeled a piece of promotional mail with a very hurtful personal message. If you haven’t seen this article look here.

Big data is a land of opportunity, but unfortunately it is also polluted with typos and poorly entered information. Marketers today rely on data from hundreds of sources to grow their business. These include compiled telephone books, census data, voter lists, and mailing lists rented from other companies. As a general rule of thumb, list managers do a pretty good job removing data errors through address validation, but rarely do they look at the quality of the name and title.

Address validation does a very effective job weeding out and cleaning up the bad addresses. Modern address verification tools can append missing ZIP codes, fix common misspellings and even, in some cases, add a suite or apartment number. Address validation tools are inexpensive and come in various types (batch, real-time, standalone).

Name validation, however, is far more complex. The distribution of name information is much larger than in the past, and the quality is much worse. Social media allows anyone to enter virtually any name without verification. As marketers dig deeper into their big data resources (like social media) they run the risk of embarrassing themselves, tarnishing their brand, and wasting their resources in the meantime.

The reason name validation is so complicated is variants. Take my given name, “Geoffrey”: it is a valid and known first name. However, there are at least a dozen variations of “Geoffrey,” including Jeff, Geoff, Jeffrey, Jeoff, Geoffroy… For weeding out bogus names, traditional ‘known names’ filters are obsolete. To filter out invalid names, we need a list of more than 300 million names, parsed and ranked by frequency of first and last name. Many IT departments think name quality is only about filtering out vulgar words (and they have a good time writing the procedures to do so). While linguists have long debates about how to validate names, we take the middle road and use a stack-ranked collection of three million first names and 10 million last names. Proper name validation is critical for businesses because it’s vital to the customer relationship. Weeding out bogus names like “Miiike Wiiilson,” “Coffee Cup,” and “Slow Motion” is a difficult challenge, and one we are not afraid of. Large datasets provide large insight, with enormous potential for knowledge.

Service Objects has been purifying name information for over a decade. Our goal and mission is to make sure every name in every database is as accurate and up to date as it possibly can be. We truly hope that no one has to suffer or be offended by misaddressed mail again.

Geoffrey W. Grow

Founder & CEO

Geff Goow
PS: Above is an example of an offensive label I received a long time ago. I saved it because I was upset with the gross errors, and I am sure I never bought anything from this company. I’ve saved this label for over 20 years, so you can bet I’m passionate about name validation.

Name Validation – The Most Important Data There Is

DOTS Name Validation 2

“Remember that a person’s name is to that person the sweetest and most important sound in any language.” – Dale Carnegie, “How to Win Friends & Influence People”

In today’s highly connected world, we have more opportunities to connect with people than ever before. But what if those opportunities are overwhelming not only to us but to our prospects as well? Companies are fighting hard to put themselves in front of customers and prospects in an ever-crowded marketing space.

One way to give your company a competitive advantage is to show your customers how important they are to you – not just as a revenue source, but as unique individuals. To accomplish this lofty goal, the first and most important step is to address your customers by their correct name and gender-specific title, whether over the phone, in an email or in a letter. What woman wants to get a letter addressed to Mr. Jane Doe? Do you know any men who want to be called “Miss Christopher Smith?”

A robust name validation service can help ensure that each contact name in your database is spelled correctly and aligns with the correct gender of the name. This will help you personalize your outbound communications, and show your customers that you care to know them at the most basic level.

A second critical step is ensuring that the name is “genuine” — why clutter your contact database with names like Mickey Mouse or Britney Spears? A reputable name validation service clears out bogus, vulgar and celebrity names, as well as garbled keystrokes. Removing disingenuous names allows you to focus on the real prospects that are interested in your products or services. Plus you’ll be reducing any waste caused by creating and preparing materials for bogus contacts, or follow-ups involved.

Often name validation is overlooked in overall data quality. If your company does not solve for this, you run the risk of reduced customer satisfaction. Learn how DOTS Name Validation 2 can be integrated into your existing systems. Our proprietary database of nearly 10 million names will help take your business to the next level.

Shakespeare once wrote, “What’s in a name? That which we call a rose/By any other name would smell as sweet.” The question is, would he have felt the same if he had received correspondence addressed “Dear Ms. Shacke Spear…”?

How Social is Your Name?

The Most Popular Names in Social Networks

What’s the most “social” name in the networks? It may not be what you think, for many reasons, the first being that many popular social networks operate using Roman (Latin) character sets. So a surname like Chang may rank low on the list because it has many Roman equivalents: Chang, Zhang, Chong, Cheung, Cheong, Jang.

Name validators are no better at distinguishing between the variances in pronunciations and spellings in names that have been converted from non-Roman alphabets. So when you’re entering client data, be sure that someone with the name George Li isn’t really George Lee; or Debbie Whang isn’t really Debbie Wang.

I’ve researched the top last names, based on data collected from 140 million social network users worldwide. Here’s a list of the top 50, arranged from the highest frequency to the lowest. How social is your last name? Do you see it in the list?

No. Frequency Surname
1 913,465 smith
2 571,819 johnson
3 512,312 jones
4 503,266 williams
5 471,390 brown
6 386,764 lee
7 360,010 khan
8 355,639 singh
9 343,220 kumar
10 324,972 miller
11 311,576 davis
12 280,747 wilson
13 277,466 taylor
14 263,054 thomas
15 260,203 garcia
16 258,501 anderson
17 245,078 sharma
18 236,778 martin
19 236,338 rodriguez
20 230,068 ali
21 226,143 white
22 225,097 jackson
23 223,170 thompson
24 221,085 moore
25 214,823 ahmed
26 201,310 martinez
27 199,151 lopez
28 188,844 harris
29 187,711 patel
30 185,105 king
31 177,520 walker
32 175,455 hernandez
33 172,994 clark
34 172,959 lewis
35 171,236 robinson
36 163,959 young
37 157,821 gonzalez
38 157,300 hall
39 155,551 wright
40 155,322 scott
41 154,630 perez
42 154,532 green
43 152,797 allen
44 150,361 tan
45 149,749 shah
46 145,495 roberts
47 144,804 adams
48 143,332 nguyen
49 142,322 james
50 141,683 hill

Posted by: Geoff G.