Service Objects’ Blog

Thoughts on Data Quality and Contact Validation

Posts Tagged ‘Name Validation’

Bringing Dead Letters Back to Life

All right, we are finally going to admit it: there are some bad mailing addresses out there that even Service Objects can’t fix.

Of course, we’re talking about cases like illegible handwriting, physical damage, or the kid who addresses a Christmas letter to “Santa Claus, North Pole.” But even for them, there is hope – in the form of a nondescript building on the outskirts of Salt Lake City, Utah, known as the USPS Remote Encoding Center. Images of illegible mailing addresses are sent here online from all over the United States, in a last-ditch effort to get these pieces of mail where they are going.

Behind the walls of this beige, block-long building lies an optometrist’s dream: nearly 1700 employees working 24 hours a day, each scanning a new image every few seconds and matching it to addresses in the USPS database. (The same database we use to verify your contact address data, incidentally.) Most get linked to a verified address and are sent on their merry way; the truly illegible ones are forwarded to the USPS’s Dead Letter facility to be opened, and those letters to Santa get forwarded to a group of volunteers in Alaska to be answered.

According to the Smithsonian, there used to be more than 50 of these facilities all over the US. With time and improving automation, all of them have now been shuttered, with the exception of this lone center in Salt Lake City. To work there, you need to be fast and precise, and you must go through more than a full week of training. Then you get put on one of 33 shifts, handling the roughly two percent of mail pieces that the Post Office’s computers cannot read automatically. That’s between five and eleven million pieces of mail per day on most days.

Of course, technology continues to improve, and USPS has become a world leader in optical character recognition for both handwritten and machine-addressed mailing pieces – even 98 percent of hand-addressed envelopes are processed by machine nowadays. In an interview with the New York Times, the center’s operations director acknowledges that computer processing could eventually put them out of business entirely. But for now, human intervention for illegible addresses hasn’t yet gone the way of the elevator operator.

Thankfully, your business correspondence probably isn’t hand-scrawled by your Aunt Mildred. And hopefully Santa Claus doesn’t show up very often in your prospect database (although fake names get entered for free marketing goodies more often than you think, and we can easily catch and fix these). So your chances of ending up on a computer screen in Salt Lake City are pretty slim – which means we can help you ensure clean contact data, and leverage this data for better marketing insight.

So for those of you who can’t spell, failed penmanship in school, or have a habit of leaving your envelopes out too long in the rain, there is still hope. For the rest of you, there is Service Objects.

Thinking Alternatively About Place Names

Here at Service Objects we come across a lot of names, particularly the names of places. We also work with a lot of personal names, but for now I would like to focus on just place names. Whether the name is for a city, town, village, hamlet, district, region, state, prefecture, mining area, national park, theme park or what have you, chances are that the place has one or more alternate spellings or alternate names associated with it.

For a human fluent in English, “North Carolina” and “N. Carolina” will be considered equal, but for a computer they are not. With the use of fuzzy-matching and/or standardization we can work around seemingly trivial issues like this. Now let us suppose that you are working with a set of Japanese data and come across the same name but written in Katakana “ノースカロライナ” or Ukrainian data written in Cyrillic “Північна Кароліна” or even Thai “รัฐนอร์ทแคโรไลนา”. Well, fuzzy-matching and standardization are still our friends; we just have more fuzzy-matching and standardization rules to consider. However, we first need to ensure that we even have the data available to associate a name in a different language.
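To make the “North Carolina” versus “N. Carolina” case concrete, here is a minimal Python sketch of standardization followed by fuzzy matching. It is only an illustration and not how our services are implemented: the abbreviation table and the function names are assumptions made up for this example, and the similarity score comes from Python's standard difflib module.

```python
import difflib
import re

# Hypothetical abbreviation table for illustration; a real one would be far larger.
ABBREVIATIONS = {"n": "north", "s": "south", "e": "east", "w": "west"}


def standardize(name: str) -> str:
    """Lowercase the name, drop punctuation, and expand known abbreviations."""
    tokens = re.findall(r"\w+", name.casefold())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0 and 1 for two standardized names."""
    return difflib.SequenceMatcher(None, standardize(a), standardize(b)).ratio()


# Standardization alone makes the two spellings identical.
print(standardize("N. Carolina"))
# Fuzzy matching then absorbs small typos that standardization cannot fix.
print(similarity("North Carolina", "North Carolna"))
```

Standardizing first keeps the fuzzy comparison focused on genuine spelling differences rather than on punctuation or abbreviations.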

We’ve been creating a list of place names to help us tackle problems like the ones mentioned above. We currently have a list of over five million unique place names generated from a pool of approximately 11 million names. We are aggregating name data to come up with a more comprehensive list that consists of known alternates, variations in spellings, different languages and the transliterated versions for the different languages.

Here’s a quick look at what we have accomplished, so far:

  • Current list of approximately eight million place names and growing
  • Transliteration and phonetic mappings for various languages
  • Case, accent and kana sensitivity handling
  • Queryable using fuzzy-matching algorithms
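As a rough illustration of what case, accent and kana insensitivity can mean in practice, the Python sketch below builds a comparison key with the standard unicodedata module. The function name `match_key` and the exact folding rules are assumptions for this example, not a description of our library. The key is deliberately lossy (it also removes diacritics from letters where they are significant), so it should only be used for comparing names, never for display.

```python
import unicodedata


def match_key(name: str) -> str:
    """Build a case-, accent-, width- and kana-insensitive comparison key."""
    # NFKD separates accents from base letters and folds half-width forms.
    decomposed = unicodedata.normalize("NFKD", name)
    # Remove only the general combining diacritics (U+0300..U+036F), so that
    # the Japanese voiced-sound marks (U+3099, U+309A) are left in place.
    stripped = "".join(ch for ch in decomposed if not 0x0300 <= ord(ch) <= 0x036F)
    # Recompose whatever is left, for example kana with voiced-sound marks.
    recomposed = unicodedata.normalize("NFKC", stripped)
    # Map hiragana (U+3041..U+3096) onto the matching katakana code points.
    folded = "".join(
        chr(ord(ch) + 0x60) if 0x3041 <= ord(ch) <= 0x3096 else ch
        for ch in recomposed
    )
    return folded.casefold()


# Accent and case differences disappear in the key.
print(match_key("Québec") == match_key("QUEBEC"))
# Hiragana and katakana spellings of the same name share one key.
print(match_key("とうきょう") == match_key("トウキョウ"))
```

Keys like this can be stored alongside the original names, so that lookups compare keys while results still return the name exactly as it was written.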

We have taken some of what we have learned from our DOTS Address Validation – International service and built upon it in order to improve data beyond the realm of just address validation. When working with Phone, Email, IP, Demographic and Geo-coordinate related data we too often find that location names do not match up. Naturally this is to be expected, since different data vendors will have different standardizations and practices when it comes to naming conventions. Utilizing a comprehensive place name library will allow us to quickly perform various actions, such as cross checking multiple data sources against each other with increased flexibility and match rates.

It may not be immediately apparent how useful a place name library like this is and what kind of avenues it can open up, but expect to see new and exciting developments from us in the coming months!

Name Deduplication Techniques

Identifying Duplicate Records

The bane of any Database Administrator is maintaining duplicate records. They take up unnecessary space and generally do not provide any added value to contact records. A more challenging task for Database Administrators is how to identify and merge records which might be duplicates, and in particular, duplicate names.

There may be variants for a given name which might not be easily identified in a query, but they are invariably linked. A common example might be Joe Smith vs Joseph Smith. Both could be referring to the same person depending on how the user may have entered their name.

Name Variants, Finding the Common Name

A particularly useful feature of the Name Validation 2 service is the Related Names output field. This field provides a comma-separated list of first name variants for a provided name. For example, for the given name Joe, the related names returned include Joel, Joeseph, Joey, Josef, Joseph, and José.

With this information, it becomes easier to identify names which are related but in a different form. There may be cases, however, where names cannot be identified as related but can be linked by similarity. Some examples include names that are misspelled or alternate names which are not related but similar. These names can still be identified through the Similar Names output fields of the Name Validation 2 service.

Similar Sounding Names

DOTS Name Validation 2 employs sophisticated similar name matching algorithms to match names drawing from a database of international names with up to 1.4 million first names and 2.75 million last names. First and last name similar results are returned in a comma separated list which can be used to compare against names that already exist in the database.

For example, the name Robert Smith would return the similar first names Rhobert, Róbert, Robertt, Roebert, Roibert, Rubert, and Robbert, and the similar last names Smyth, Smithe, Smiith, and Smiyth. The similar names that are found are returned in order from most common to least common.
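Once you have these comma-separated lists, comparing them against names already in your database is straightforward. The Python sketch below shows one possible approach. It works on plain strings like the examples above and does not call the service; the function and parameter names are assumptions made up for this illustration, not part of the Name Validation 2 API.

```python
def parse_name_list(field: str) -> set:
    """Split a comma-separated name list into a set of case-folded names."""
    return {part.strip().casefold() for part in field.split(",") if part.strip()}


def is_candidate_duplicate(first, last, existing_first, existing_last,
                           related_first="", similar_first="", similar_last=""):
    """Check whether an existing record matches any variant of the new name."""
    first_variants = ({first.casefold()}
                      | parse_name_list(related_first)
                      | parse_name_list(similar_first))
    last_variants = {last.casefold()} | parse_name_list(similar_last)
    return (existing_first.casefold() in first_variants
            and existing_last.casefold() in last_variants)


# Lists copied from the examples in this post.
related = "Joel, Joeseph, Joey, Josef, Joseph, José"
similar_last = "Smyth,Smithe,Smiith,Smiyth"

# Is an existing "Joseph Smyth" a candidate duplicate of a new "Joe Smith"?
print(is_candidate_duplicate("Joe", "Smith", "Joseph", "Smyth",
                             related_first=related, similar_last=similar_last))
```

A match here only marks a pair of records as candidates for review or for further comparison on other fields, since two different people can easily share related names.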

Merge and Promote the Winning Record

Using these results, a query can potentially link similar or related names and identify records which are duplicates. Once duplicate records are identified, the question becomes: which one should be promoted as the winning record? This decision depends on business logic. Perhaps the winner is the record that contains other vital contact points such as an address or phone number, or perhaps it is chosen by entry date. Once a winning record is chosen, a merge process copies contact fields from the identified duplicates into it to build a complete record.
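Here is a small Python sketch of one possible promote-and-merge rule. The rule itself (most contact fields filled in wins, with ties going to the earliest entry date) and the field names are assumptions chosen for illustration; your own business logic may well differ.

```python
from datetime import date

# Hypothetical contact fields; substitute the columns your records actually use.
CONTACT_FIELDS = ("address", "phone", "email")


def completeness(record: dict) -> int:
    """Count how many contact fields are filled in."""
    return sum(1 for field in CONTACT_FIELDS if record.get(field))


def merge_duplicates(records: list) -> dict:
    """Promote the most complete record and fill its gaps from the others."""
    # Most complete first; ties are broken by the earliest entry date.
    ranked = sorted(records, key=lambda r: (-completeness(r), r["entered"]))
    winner = dict(ranked[0])  # copy so the input records are left untouched
    for other in ranked[1:]:
        for field in CONTACT_FIELDS:
            if not winner.get(field) and other.get(field):
                winner[field] = other[field]
    return winner


duplicates = [
    {"name": "Joe Smith", "address": None, "phone": "555-0100",
     "email": None, "entered": date(2015, 3, 1)},
    {"name": "Joseph Smith", "address": "1 Main St", "phone": None,
     "email": "jsmith@example.com", "entered": date(2016, 1, 5)},
]
print(merge_duplicates(duplicates))
```

In this example the second record wins because it has more contact fields, and it picks up the phone number from the first.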

Conclusion

Ridding your database of duplicate contact records can be an arduous task, but with the help of Name Validation 2, it doesn’t have to be. Leveraging the vast quantity of names that Name Validation 2 draws upon yields a top quality solution to identifying duplicates through related and similar names.

For more information about Name Validation 2 service, or to receive a free trial key, click here.

For developers, our Name Validation 2 documentation can be found here.

Why We Geek Out Over Name Validation

What’s in a name? Everything — especially if you’re trying to connect with customers and prospects. If you’re emailing, mailing, or calling someone and you have her name wrong, you’ve already lost her.

The Importance of Name Validation APIs

Name validation is becoming increasingly important in the modern world where social media and the Internet allow for a faster-than-ever propagation of bad data. For example, as people opt into various offers, it’s not unusual for auto-correct to change their entries, for a typo to occur, or for the person to enter a bogus name. On other occasions, a name that looks fraudulent and is labeled as such could really be a legal name. This is the case for this man, who legally changed his name to Fire Penguin Disco Panda:

[Image: Fire Penguin Disco Panda]

Companies wanting to avoid potentially embarrassing situations like putting a bad name on a piece of mail, or removing a perfectly good contact with a name they think is fraudulent, should consider using a service like DOTS Name Validation, an essential ingredient in marketing automation, business databases, CRMs, and the like. Not only does name validation perform helpful changes such as parsing names into individual fields, fixing the order of names, and returning the gender of the individual, but our name validation API also runs a variety of checks to ensure the name isn’t a bogus, celebrity, or vulgar name.

Updated Name Validation Scoring Algorithms

We recently pushed a major update to our name validation service, including many international names as well as massive improvements to our scoring algorithms. Our name validation database now has almost 5 million first and last names in it.

Our scoring algorithms are where the service truly shines. Even when we get an obscure name that we are not sure about, we look to our algorithms to separate the unknown from the bad. This is where our team likes to geek out. We enjoy thinking of new ways to combine results to identify complex names.

Here’s Where We Get Geeky

We love to get creative with our name validation service. We spend time poring over lists of celebrities, vulgar names, and any crazy, goofy thing we can think of.

What are some of the things we are interested in? We love unusual names. For example, should we consider the names Anakin and Khaleesi as valid now that people are actually naming their babies after these characters? And you can imagine the fun we’ve had talking about Anita Bath and Warren Peace.

We track a lot of vulgar and goofy-type potential names, but what about alterations to those? For example, we might nail the name Hugh Jass, but what about similar names like Hue Jass, Hugh Jazz, Hou Gass, or Hue G. Azz? What if someone submits the name Bob Ba$$? Could we figure out that the intended name should be Bob Bass?
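To give a flavor of the problem, here is a toy Python sketch that undoes look-alike symbol substitutions (turning Ba$$ back into Bass) and then checks the result against a short list of known prank names using the standard difflib module. The substitution table, the prank list and the cutoff are all assumptions made up for this example; it is not how our scoring algorithms work. A plain spelling-similarity check like this one will also miss more creative phonetic variants such as Hue G. Azz.

```python
import difflib

# Hypothetical look-alike substitutions, for illustration only.
LOOKALIKES = str.maketrans({"$": "s", "@": "a", "0": "o", "1": "l", "3": "e", "!": "i"})

# Hypothetical watch list built from the examples in this post.
KNOWN_PRANK_NAMES = ["hugh jass", "anita bath", "warren peace"]


def undo_lookalikes(name: str) -> str:
    """Replace symbols and digits that are commonly used to stand in for letters."""
    return name.translate(LOOKALIKES)


def closest_prank_name(name: str, cutoff: float = 0.75):
    """Return the closest watch-list entry, or None if nothing is similar enough."""
    candidate = undo_lookalikes(name).casefold()
    matches = difflib.get_close_matches(candidate, KNOWN_PRANK_NAMES, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(undo_lookalikes("Bob Ba$$"))      # the intended name, with symbols undone
print(closest_prank_name("Hugh Jazz"))  # a near miss of a watch-list entry
print(closest_prank_name("Bob Ba$$"))   # a legitimate name once cleaned up
```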

What if a name is submitted that should not be a name like “House on the corner” or perhaps the name of a business instead of a person? These sorts of things can be tricky to identify in an automated system, but our team lives for solving these kinds of problems.

We let our inner geeks out so that we can anticipate and flag bogus, prank, and unusually challenging names. Though our name validation software uses algorithms to score and validate data, they’re powered by both artificial and human intelligence.

Use Name Validation to Get your Customer’s Name Right

Name Validation

Name validation is important in a lot of ways. A name is one of the easiest places for someone to provide fake information on a web form, but fake names can be very tricky to detect properly. Take this example of a real piece of mail:

[Image: AwkwardLetter]

More and more of the processes that accept this sort of data are run by computers. Human eyes review the data less often as the entire process, from start to finish, becomes fully automated.

Name validation can be easily overlooked as an unnecessary addition, but the ramifications of making mistakes can be far-reaching. Small mistakes can be embarrassing, and larger ones can give a company a big PR black eye if they make their way onto the internet.

What’s going on behind the scenes

At Service Objects, we are always looking for ways to improve all data inputs at the point of entry, and name validation is no exception. We have millions of known first and last names from around the world and algorithms honed over years of work to weed out oddities in names. We are looking for celebrity names, vulgar words, words from a dictionary and things that just plain look like garbage or bogus. We constantly strive to improve our algorithms and take pride in identifying fake names.

In the example above it seems obvious that the name is bad, but is it safe for an automated process to say so? What about a valid name such as Martita Boobier, which contains questionable words? What about something like Letit Boobra, which doesn’t appear vulgar but doesn’t appear to be a valid name either? The goal of DOTS Name Validation is to place these names into the appropriate category, taking the worry out of an automated process mishandling them.

Avoid adding bad names to your CRM in real-time

Bad names come in many forms: names such as “Trucker Bob”, “Doctor Nick”, “Homer Simpson”, and “Felix the Cat”; entries that don’t appear to be names at all, such as “The Big Bang” or “Service Objects”; and entries that appear to be complete garbage, such as “Asdf Blah”. DOTS Name Validation can properly identify many cases that might otherwise slip through the cracks without proper review.

Data Validation In Real Estate

The Real Estate Industry Can Gain a Competitive Edge with Data Validation

Data-based marketing, outreach and lead generation isn’t only for cutting-edge B2B companies anymore. Data runs the world these days and successful businesses in every industry can benefit from using verified, validated data in smart ways.

Working with generic data isn’t enough, either. It can be inaccurate and out of date, making it as useful as no data at all, or even worse if you’re relying on this information. That’s why smart real estate organizations, from large firms to independent agents, are investing in data validation services.

Data validation verifies that the information you’re working from, whether about a specific lead or regional demographics, is accurate and up to date. Validation can be as simple as verifying correct names, phone numbers and current addresses, or can be as nuanced as geo-targeting, IP address validation and reverse phone lookup discovery. No matter the level of data verification, the results are the same: correct information can help you make better-informed decisions and accurately target your audience.

Clever and industrious people in the real estate industry can benefit from just about every type of data validation; it’s all about keeping an eye on trends and getting the right message to the right people at the right time.

Address Validation

This is simple but crucial for real estate agents, who still spend a considerable amount on direct mail marketing. Getting a personalized mailer in the hands of the right person is important. RealTrends found that targeted direct mail pieces had a 2-5 percent response rate, versus the 1 percent rate when real estate agents mailed the piece to everyone without specific targeting.

Address validation before a direct mail send can help ensure that you have the resident’s correct name (“Current Resident” makes the piece seem extra promotional and impersonal) and the correct gender salutation, and that the target actually lives at that address.

[Image: “Or Current Resident”. Image via Evil Mad Scientist]

Using a data validation service that has access to the USPS National Change-of-Address database can help further refine outreach. If a new family just moved into the address you’re targeting, they’re probably not looking to move again soon, so strike that address off the list for now.

Taking address validation a step further with geocoding validation can help real estate agents get a jump on hot trends and growing neighborhoods. Cross check a list of addresses against a trending neighborhood’s longitude and latitude to make sure the addresses you have really are in the hot spot. People currently in this neighborhood might want to capitalize on the new demand and sell their home at a profit, making them prime contacts for savvy real estate agents. Extend your validation and outreach efforts to the surrounding neighborhoods to get a leg up on the competition.
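As a simple illustration of that cross check, the Python sketch below keeps only the geocoded addresses that fall within a given radius of a neighborhood's center point, using the haversine great-circle distance. The coordinates, the radius and the record layout are made-up assumptions for this example, and a real neighborhood boundary is rarely a perfect circle.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def addresses_in_hot_spot(addresses, center, radius_km):
    """Keep the geocoded addresses that lie within radius_km of the center."""
    center_lat, center_lon = center
    return [a for a in addresses
            if haversine_km(a["lat"], a["lon"], center_lat, center_lon) <= radius_km]


# Hypothetical geocoded mailing list and neighborhood center.
mailing_list = [
    {"address": "12 Oak St", "lat": 34.43, "lon": -119.71},
    {"address": "900 Far Rd", "lat": 34.60, "lon": -119.70},
]
print(addresses_in_hot_spot(mailing_list, center=(34.42, -119.70), radius_km=2.0))
```

Widening `radius_km` is an easy way to extend the same check to the surrounding neighborhoods.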

Reverse Phone Look-Up

Reverse Phone Look-up enables companies to put a name and current address to a phone number. This is particularly useful since many people now move but keep their original cell phone number. This trend makes phone numbers alone a hard way to target people, especially with the declining use of landlines. According to Time, 41 percent of homes were landline-free as of 2014 and 60 percent of adults ages 30-34 exclusively use a mobile phone. With the average age of first-time home buyers currently sitting at 31 and expected to climb to 32-34 in the coming years, this makes reverse look-up validation an invaluable resource for real estate agents.

This type of validation will tell you if the people on your list of phone numbers truly do live in your territory. Plus, it will give you their most current address and name. National real estate companies can use this validated data to send location-specific messaging to everyone on their list, based on the person’s current location.

Demographic Validation

A core premise of marketing, no matter what industry, is “know your audience.” Demographic data validation can help real estate agents get an accurate and intimate understanding of the areas they work in. Gut instincts are essentially gambles, whereas using validated data ensures you have reasonably accurate and updated information. By working with US census validated demographic data, real estate agents can change and target their messages based on location.

  • Spanish-language ads can be placed in predominately Hispanic neighborhoods
  • First-time homebuyer messaging can be sent to areas with a high concentration of young adults reaching the pivotal first-time homebuyer age
  • Direct mail pieces discussing downsizing can be targeted to areas with mostly older adults
  • Property opportunities in the up-and-coming business district can be pitched to small business owners in the area

Understanding the population make-up of a particular area can also help influence how you market properties. Areas that are mainly suburban are likely to connect more with family-oriented messages while urban areas probably want to hear more about high-end home features and nearby amenities. By using a combination of demographic validation and geocoding validation, agents can perfectly target each area.

This level of data also provides insight into the average income and spending of nearby households, which is helpful when pricing houses and projecting commissions.

Competitive Edge

Many real estate agents work independently and cannot afford to waste time, resources, and money on misguided marketing and outreach efforts. This is where a commitment to clean data and consistent data validation can provide a competitive advantage. Committing to using validated data as a key business tool can help real estate firms accurately focus efforts and spend smartly with better response rates.

Data can be intimidating, but with good data validation the return on investment is well worth it. Look into the different features and options offered to begin cleaning up your data and deciding which level of data-based targeting will work best for you. Go beyond just address validation and get creative if you want to pull ahead of the pack.

Name Validation and the Colorado Rockies

Have you ever misspelled a customer’s name? If so, then you know that the backtracking you need to do in order to remedy the problem can take some time and effort. Misspelling customers’ names, especially in service-oriented businesses, can easily derail your customer service and marketing efforts. Think about how much time and money you’ve already spent in order to learn about your target customers, their preferences, and how best to reach out to them. Then you blemish what should have been a pleasant moment of contact by misspelling their names. 

Consider these two Colorado Rockies blunders of 2014. First, they honored the spectacular batting average of Troy Tulowitzki by giving away 15,000 baseball jerseys that spelled the celebrated shortstop’s last name as Tulowizki. The Rockies did some well-timed damage control by posting an apology on their Facebook page that acknowledged the mistake. Then, barely two weeks later, the same Major League Baseball team introduced a batch of new souvenir cups that had Nolan Arenado’s last name printed as Arendo. Arenado, the Rockies’ third baseman, had won a Gold Glove and was thus referred to on the souvenir cups as Golden Arendo.

Examining these examples from a branding perspective, you might conclude that botching people’s names can make your prospective customers lose confidence in what you have to offer. The lack of attention to detail in something as crucial and downright personal as a name says a lot about your ability to deliver on whatever marketing claims you are making about your product or service. If the slip-up happens only once, then people can still be forgiving. However, if you commit the same oversight over and over, then it can suggest a lack of attention to detail or customer service. Imagine how alienating it can be for an eager shopper, one who is clearly interested in your offerings because he went through the trouble of filling out your lead generation form, to see his name misspelled on follow-up communications. In some cases, messing up a customer’s name in an otherwise time-honored personalization routine can even cost you a sale.

This is the era of highly targeted and location-based marketing programs, social media, mobile wallets, smart wearables, advanced analytics, and drones that deliver merchandise to shoppers’ doorsteps. Consumers have developed higher expectations, and they formulate their purchase decisions based on those expectations. Marketers, on the other hand, can now wield a plethora of digital tools to streamline their day-to-day business operations and make them more profitable.

Getting your customers’ names spelled correctly shouldn’t be one of your issues. This tedious part of customer service can be handled automatically by a real-time name validation service that verifies, corrects, and flags names. Our name verification API service not only intelligently validates your customers’ names but can also identify the gender associated with the first name, helping you to address them properly.

If you equip your business with the right tools to support your branding, marketing and customer service efforts, you’ll reduce the time, energy and money spent on fixing errors.

International Name Validation – Making Sense of Latin Characters

Olá, Grüezi, Cześć, ¡Hola! – Hello! While you may or may not be multilingual, your company likely has customers whose names contain accented Latin characters such as ø, á, ñ, or ü.

A Brief History of Accented Latin Characters

According to Omniglot, the online encyclopedia of writing systems and language, the modern Latin alphabet consists of 52 letters (upper and lower case) as well as various symbols, punctuation marks, and numerals. In addition to the basic Latin alphabet, many languages (such as French, Spanish, Swedish, Italian, Portuguese, and many others) supplement the Latin alphabet by adding accent marks to vowels and some consonants.

Latin accents, such as the tilde (ã), umlaut (ä), slash (ø), acute (á), grave (à), circumflex (â), and cedilla (ç), are used to: 

  • Change pronunciation 
  • Indicate emphasis in a sentence 
  • Indicate what to stress in a word 
  • Indicate pitch or intonation 
  • Indicate vowel length 
  • Visually distinguish homophones

The Problem with Latin Characters

Accented Latin characters are difficult to handle properly in computer programs and APIs that do not support international characters. For example, a name such as Zoë Smith might be failed as “garbage” in a typical name validation tool simply because the service doesn’t understand the “ë” character. In fact, our very own name validation service previously had the same issue. However, we recently rectified this character set issue with a new feature that adds support for approximately 62 common Latin and international characters.

DOTS Name Validation now accepts the most common accented Latin characters. This Latin character set update allows more Spanish, French, Italian, and German names to be validated and “pass” our name validation service without being kicked back as a “garbage” name.
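For developers curious about what a character-set check can look like, here is a minimal Python sketch that accepts names made up of Latin letters (accented or not) plus a few common punctuation marks. It is an assumption-laden illustration and not the actual rule set inside DOTS Name Validation: the allowed punctuation is our own guess for this example, and the check accepts any letter that Unicode names as Latin rather than a specific list of roughly 62 characters.

```python
import unicodedata

# Hypothetical set of punctuation allowed inside names.
ALLOWED_PUNCTUATION = set(" -'’.")


def uses_only_latin_letters(name: str) -> bool:
    """Return True if the name contains only Latin letters and allowed punctuation."""
    # Compose "e" + combining diaeresis into a single "ë" before checking.
    composed = unicodedata.normalize("NFC", name)
    if not composed.strip():
        return False
    for ch in composed:
        if ch in ALLOWED_PUNCTUATION:
            continue
        # unicodedata.name() returns names such as "LATIN SMALL LETTER E WITH DIAERESIS".
        if not (ch.isalpha() and unicodedata.name(ch, "").startswith("LATIN")):
            return False
    return True


for sample in ["Zoë Smith", "Terje Lundbø", "Carlos Fernández", "Zo3 Sm1th"]:
    print(sample, uses_only_latin_letters(sample))
```

A check like this only says whether the characters are acceptable. Deciding whether the result is a plausible name is a separate step.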

What Does Our International Name Validation Update Mean to You?

Our customers asked us to support accented Latin characters, and we’re excited to deliver! 

Since the update, more names in your contact database such as Terje Lundbø, Carlos Fernández, Jason Castañeda, and Fritz Müller are now being processed and verified by DOTS Name Validation successfully. This added feature also supports our composite lead verification services including DOTS Lead Validation and DOTS Order Validation. If your database contains a variety of international names, give our name validation services a try for free and see how it works for you!

Connecting the DOTS: It Starts With a Name

The new video series Connecting the DOTS, featuring Jim Harris of Obsessive-Compulsive Data Quality, explores name validation and its important role in ensuring data quality excellence within your organization.

The most personal of personal data is a person’s name, which is why the most impersonal thing is getting a person’s name wrong. When our names are entered into databases, either by ourselves or by others, we want interfaces that can parse and validate our names and differentiate the authentic from the invalid and fraudulent. Your business is dependent on the quality of your contact data, and when it comes to contact data, it starts with a name.

DOTS Name Validation is a real-time API web service that parses names into individual data fields, fixes the order of names, and returns the gender associated with the first name. With name verification, companies can instantly separate legitimate contacts from bogus ones, stopping fraudulent names at the point of entry. Make sure that your contact records contain correct names by using DOTS Name Validation 2 to verify accuracy.

Jim Harris is a recognized data quality thought leader with over 20 years of enterprise data management experience. Jim is a freelance writer, independent consultant, and Blogger-in-Chief at Obsessive-Compulsive Data Quality (ocdqblog.com), a vendor-neutral blog about data quality and its related disciplines.

Service Objects is the industry leader in real-time contact validation services.

Service Objects has verified over 2.8 billion contact records for clients from various industries including retail, technology, government, communications, leisure, utilities, and finance. Since 2001, thousands of businesses and developers have used our APIs to validate transactions to reduce fraud, increase conversions, and enhance incoming leads, Web orders, and customer lists.