Service Objects’ Blog

Thoughts on Data Quality and Contact Validation

Posts Tagged ‘Data Cleansing’

What’s Your Data Story?

So many reports focus on spitting out data that they often overlook the importance of being able to quickly digest the information and present a clear action plan. At Service Objects, we want you spending your valuable time acting on the results – not trying to make a report readable and understandable. As a result, we have invested considerable resources into ensuring our Batch Summary reports – the ones we provide you after we run your list – not only look great, but are immediately accessible and actionable. Your account executive will review the results of the report with you and answer any questions you may have, but you will also have a link to the detailed report for your reference and to share with your team members.

So how did we improve the reports? We focused on telling your business’ data story and showing how our services can help improve your data accuracy. We have started with a few services and operations, and in the coming months we will continue to roll out more of them as they are ready. One way we tell the story better is by presenting easy-to-understand charts and data breakdowns, so that you can focus on the parts of your data that interest you most.

The following link provides a sample of our DOTS Address Validation US – 3 batch summary report, and I have detailed the features of the report below.

The summary starts with a brief description of the service and operation followed by a section where we define the main output of the service. In this case, the report is focused on Delivery Point Validation or DPV.

We show how the DPV results break down across the varying DPV notes, corrections and Is Residential data points. So, at a glance, it is easy to decipher the balance between the various DPV values.

Throughout each report, when we see interesting data points, we shine a spotlight on them and add additional custom content to help highlight them.

The report also drills down into the geographic nature of the data, showing how your list of addresses is distributed across each state and the country. The values are plotted on a map to provide a strong visual representation, and hovering over a particular location displays the underlying values.

This location distribution also shows how the DPV values correlate to a location: we overlay a pie chart breakdown of the actual DPV values.

The breakdowns are by county and congressional district, so your analysis can be completed very quickly.

Clicking on the three bars in the top right of any chart or graph will allow you to either save or print that particular chart. These new batch reports will also allow you to view your details from anywhere, on any screen size. There is no need to mess with PDFs or specific file types; you just need an internet connection and a link to the report.

Lastly, we take data security very seriously. The reports are all delivered securely, so no one can see anyone else’s reports, and data is never shared. Our hope is to provide a clearer understanding of your data, making it fast to digest and act on. If you have any questions or would like us to run a sample data set for you, please contact sales@serviceobjects.com.

How to Use DOTS Email Validation 3

The DOTS Email Validation 3 (EV3) service has been designed to be robust enough to accommodate the particular needs of a detail-oriented programmer and simple enough to be used by a marketing assistant who needs to run an email campaign. The service can meet various needs that can essentially be narrowed down to two use cases: form validation, and post-processing jobs such as batches and database hygiene. Before we discuss those two cases, we will first go over the recommended service operation and review some of the important result fields.

Which Operation Should I Use?

The recommended service operation for EV3 is the ValidateEmailAddress method. This operation performs real-time server-to-server email verification. It lets the user specify a timeout value, in milliseconds, for how long it can take to perform real-time server checks. A minimum value of 200 milliseconds is required; however, results depend on the network speed of an email’s host, which may require several seconds to verify. Average mail server response times are approximately 2 to 3 seconds, but some slower mail servers may take 15 seconds or more to verify.

Please note that the above information is also available in the service developer guide.

Understanding the Results

The service returns many results that can be used to meet a programmer’s particular email validation needs, but the easiest way to determine if an email should be accepted or rejected is by looking at either the IsDeliverable value or the Score value.

Score:

For most cases it is recommended to use the Score along with other output values to cater to your particular needs. Here are the possible score values.

Score | Description | Notes
0 | Email is Good | Indicates with high confidence that the email address is deliverable and good. The email address was verified with the host mail server and no malicious warnings were found.
1 | Email is Probably Good | Indicates that the email is deliverable but one or more lesser warnings were found. For example the email may be a potential alias or a role, which are sometimes used as disposable addresses.
2 | Unknown | Indicates that not enough information was available to determine deliverability and integrity. Unknowns most commonly occur for slow mail servers that do not respond to the web service in time. They also occur for catch-all mail servers and greylists.
3 | Email is Probably Bad | Indicates that one or more warnings were found, such as a potential vulgarity or a string of garbage-like characters.
4 | Email is Bad | Indicates with high confidence that the email address is bad and/or undeliverable. Occurs for email addresses that fail critical checks such as syntax validation and DNS verification. Most commonly occurs for email addresses where the actual host mail server verified that the email does not exist. Also occurs for deliverable email addresses that are known spam traps or bots.
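For programmatic use, the score values above can be captured in a small lookup. This is a minimal sketch, not part of the service itself: the `Score` enum and `describe_score` helper are hypothetical names, and the sketch assumes the score may arrive either as an integer or as a numeric string.

```python
from enum import IntEnum


class Score(IntEnum):
    """EV3 score values as listed in the table above."""
    GOOD = 0
    PROBABLY_GOOD = 1
    UNKNOWN = 2
    PROBABLY_BAD = 3
    BAD = 4


# Descriptions copied from the "Description" column of the table.
SCORE_DESCRIPTIONS = {
    Score.GOOD: "Email is Good",
    Score.PROBABLY_GOOD: "Email is Probably Good",
    Score.UNKNOWN: "Unknown",
    Score.PROBABLY_BAD: "Email is Probably Bad",
    Score.BAD: "Email is Bad",
}


def describe_score(raw_score):
    """Return the description for a score given as an int or numeric string.

    Raises ValueError for anything outside 0-4.
    """
    return SCORE_DESCRIPTIONS[Score(int(raw_score))]
```

Calling `describe_score("2")` returns "Unknown", which is the case the following sections spend the most time on.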

IsDeliverable:

The simplest way to use the service is to look at the IsDeliverable field. This field will return true, false or unknown. If your primary concern is to be able to send out email with the lowest possible chance of a hard bounceback then this field alone will suffice. However, this field does not take spamtraps, vulgarities, bots or other factors into consideration. It simply indicates if the service was able to verify the deliverability of an email address with the host mail server. It does not measure the overall integrity of the email address.

If you choose to only look at one result value then it is our recommendation that you use the Score value instead of the IsDeliverable value. The Score evaluates the overall integrity of the email address and not just its deliverability. Either one of these fields can be used in conjunction with other result values to more intelligently evaluate an email address if the need arises. For example, if an email comes back as unknown in either the Score or in IsDeliverable, then we can refer to the following outputs to help us decide if we should accept, reject or retry the email address.

IsSMTPServerGood:

Returns true, false or unknown to indicate if the email’s host mail server was responsive at the time of the check. This is one of the service’s critical checks. If this value comes back false then it will be reflected in the IsDeliverable value and in the score. Refer to this value if the email is unknown. If the value for this field is also unknown then the service most likely did not have enough time to finish verifying the email address with its host mail server. In these cases the service will continue trying to verify the email in a background process even though the request has finished. Chances are high that if you wait one or more hours and check the email again, the service will have been able to finish verifying the email address with the host mail server.

IsCatchAllDomain:

Returns true, false or unknown to indicate if the email’s host mail server is a catch-all. A catch-all mail server will say that an email address is deliverable even if it is not. This is because catch-all mail servers do not reject email addresses during the initial SMTP session. This means that a catch-all mail server cannot be trusted to verify the deliverability of an email address, because it may not reject the email address until after an email message is sent. If an email address is unknown and this value is false, then chances are good that if the email is checked again at a later time the service will have verified its deliverability. If IsCatchAllDomain is true and there are no warnings, then we know that the mail server is good and that the email does not appear to be bad. In general this scenario leads to a 55% chance that the email is deliverable and won’t result in a hard bounce.

IsSMTPMailBoxGood:

Returns true, false or unknown to indicate if the service was able to verify the email address with its host mail server. This value can be treated similarly to the IsDeliverable value. A true value indicates that the email address is deliverable. If the value comes back false then the mail server verified that the email is undeliverable. A false will be accompanied by the warning flag ‘Email is Bad – Subsequent checks halted.’ Some common reasons why this value will return unknown: the mail server is a catch-all, the service ran out of time when communicating with the host mail server, or the host mail server used a defensive tactic such as a greylist.

A complete list of the output fields and values is available in the service developer guide.
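The guidance for unknown results can be sketched as a small triage helper. This is only an illustration under stated assumptions: `triage_unknown` is a hypothetical name, the field values are assumed to arrive as the strings "true", "false" or "unknown", and the "review" outcome for catch-all domains is our reading of the text rather than a rule from the developer guide.

```python
def triage_unknown(is_smtp_server_good, is_catch_all_domain):
    """Suggest a next step for an email whose Score or IsDeliverable is unknown.

    Both arguments are the raw field values: "true", "false" or "unknown".
    Returns "retry_later" or "review".
    """
    smtp = str(is_smtp_server_good).strip().lower()
    catch_all = str(is_catch_all_domain).strip().lower()

    # A catch-all server cannot confirm deliverability, so a retry will not
    # produce a more definite answer. Set the address aside instead.
    if catch_all == "true":
        return "review"

    # Assumption: a failed server check should already be reflected in the
    # score, so reaching this branch is unexpected and worth a manual look.
    if smtp == "false":
        return "review"

    # The service likely ran out of time and keeps verifying in the
    # background, so checking again after an hour or more is worthwhile.
    if smtp == "unknown" or catch_all == "false":
        return "retry_later"

    return "review"
```

For example, `triage_unknown("unknown", "false")` returns "retry_later", while `triage_unknown("true", "true")` returns "review".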

The result fields given above are useful when it comes to sorting, grouping and filtering all of your validated email addresses. This is useful when working on a post-processing email job, which we will discuss later. Next, we will look at some of the descriptive flags that the service will return. These flags can be used programmatically or at a glance to determine the status of an email address.

Warning Codes & Descriptions:

There are many warning flags that the service may return but we will look at some of the more common and critical ones.

DisposableEmail, SpamTrap, KnownSpammer and Bot

An email address may be deliverable but if one or more of these warning flags is returned then it is highly recommended to reject it.

Alias, Bogus and Vulgar

If one of these warning flags is returned then you may want to either reject the email or set it aside for later review, depending on how strict you want to be.

InvalidSyntax, InvalidDomainSpecificSyntax and InvalidDNS

These are warnings for critical checks that failed. If one of these flags appears then it will be immediately followed by the warning flag ‘Email is Bad – Subsequent checks halted.’

Email is Bad – Subsequent checks halted

This warning indicates that the email failed a critical check and is undeliverable. If the flag is not preceded by one of the critical warning flags then it simply means that the email’s host mail server verified that the email address is undeliverable.

A complete list of warning codes and their descriptors is available in the developer guide.
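The warning groups above lend themselves to a simple classifier. The following is a minimal sketch: `classify_warnings` and the `strict` switch are hypothetical, and it assumes the warnings have already been extracted from the response as a list of flag-name strings (the actual response layout is described in the developer guide).

```python
# Flags for which the post highly recommends rejecting the address.
REJECT_FLAGS = frozenset({"DisposableEmail", "SpamTrap", "KnownSpammer", "Bot"})

# Flags for failed critical checks; these emails are undeliverable.
CRITICAL_FLAGS = frozenset(
    {"InvalidSyntax", "InvalidDomainSpecificSyntax", "InvalidDNS"}
)

# Flags where rejecting or reviewing depends on how strict you want to be.
REVIEW_FLAGS = frozenset({"Alias", "Bogus", "Vulgar"})


def classify_warnings(flags, strict=False):
    """Return "reject", "review" or "ok" for a collection of warning flags."""
    found = set(flags)
    if found & (CRITICAL_FLAGS | REJECT_FLAGS):
        return "reject"
    if found & REVIEW_FLAGS:
        return "reject" if strict else "review"
    return "ok"
```

With the default settings, `classify_warnings(["Alias"])` returns "review", and passing `strict=True` turns that into "reject".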

Note Codes & Descriptions:

The note flags will return descriptive information about the email, not all of which will affect the score, but we will focus on the ones that will explain why some email addresses came back as unknown.

GreyListed

The service is good at detecting greylist behavior from mail servers and has procedures in place to avoid them, but not all greylists are avoidable. If the service encounters a greylist then it is temporarily unable to verify the email address with its host mail server. If you encounter a greylist, chances are good that you will get a better response if you try to validate the email again a couple of hours later.

MailServerTemporarilyUnavailable

This flag indicates that the service was able to connect to the email’s host mail server, but that the server was temporarily busy or unavailable and was unable to verify the email for us. If you encounter this flag, then try to validate the email again a few hours later to see if the server has become more responsive.

ServerConnectTimeout

This flag indicates that the service was unable to establish a connection with the host mail server. A possible reason for the connection failure is that the mail server is completely offline or is responding too slowly to answer in time. Some mail servers are configured to respond slowly, taking as long as 60 seconds to respond to a connection. This behavior is uncommon, but it does occur. If an email returns this flag, then try entering a longer timeout to allow the service the time it needs to verify the email.

MailBoxTimeout

This flag indicates that the service was unable to finish verifying the email address with the host mail server in the time allowed. The mail server could be responding very slowly or the timeout time given to the service was too short. If an email returns this flag then try and enter a longer timeout time to allow the service the time it needs to verify the email.

A complete list of note codes and their descriptors is available in the developer guide.
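The four notes above map onto two remedies: wait and retry, or retry with a longer timeout. A rough sketch of that mapping follows. The function name, the doubling of the timeout and the 65,000 ms cap are illustrative assumptions, not behavior of the service, and the notes are assumed to be available as a list of flag-name strings.

```python
# Notes that call for trying again a couple of hours later.
RETRY_LATER_NOTES = frozenset({"GreyListed", "MailServerTemporarilyUnavailable"})

# Notes that call for giving the service more time on the next attempt.
LONGER_TIMEOUT_NOTES = frozenset({"ServerConnectTimeout", "MailBoxTimeout"})


def next_step_for_notes(notes, current_timeout_ms, max_timeout_ms=65000):
    """Return (action, timeout_ms) for the next validation attempt.

    action is "retry_with_timeout", "retry_later" or "none".
    """
    found = set(notes)
    if found & LONGER_TIMEOUT_NOTES:
        # Assumption: double the timeout each time, up to a fixed ceiling.
        return ("retry_with_timeout", min(current_timeout_ms * 2, max_timeout_ms))
    if found & RETRY_LATER_NOTES:
        return ("retry_later", current_timeout_ms)
    return ("none", current_timeout_ms)
```

For instance, an email that came back with MailBoxTimeout after a 3,000 ms attempt would be retried with 6,000 ms under this scheme.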

Use Case 1 – Using ValidateEmailAddress for Form Validation

The ValidateEmailAddress method has four input fields that are all required.

Input Field Name | Description | Notes
EmailAddress | The email address you wish to validate. |
AllowCorrections | Accepts true or false. The service will attempt to correct an email address if set to true. Otherwise the email address will be left unaltered if set to false. | The majority of the email corrections are performed on the domain. The local part of the email address, the portion before the @ symbol, is generally left untouched.
Timeout | Accepts an integer as a string. Timeout time is in milliseconds. Do not include any commas or non-numeric values. | This value specifies how long the service is allowed to wait for all real-time network level checks to finish. Real-time checks consist primarily of DNS and SMTP level verification. A minimum value of 200ms is required. When it comes to form validation it is recommended to use a timeout that is short enough to not keep your user impatiently waiting, but long enough to allow the server-to-server communication to finish. A relatively short timeout of 2 to 4 seconds is generally recommended.
LicenseKey | Your license key to use the service. |
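The four inputs can be assembled and checked before a request is made. The sketch below only builds the parameter dictionary; it deliberately does not call the service, since the endpoint and response format belong to the developer guide. `build_ev3_params` is a hypothetical helper, and the key spelling `AllowCorrections` is an assumption that should be confirmed against the developer guide.

```python
MIN_TIMEOUT_MS = 200  # minimum timeout stated for ValidateEmailAddress


def build_ev3_params(email, license_key, timeout_ms=3000, allow_corrections=False):
    """Assemble the four required ValidateEmailAddress inputs as strings.

    The default of 3000 ms sits inside the 2 to 4 second range suggested
    for form validation.
    """
    if isinstance(timeout_ms, bool) or not isinstance(timeout_ms, int):
        raise TypeError("timeout_ms must be an integer number of milliseconds")
    if timeout_ms < MIN_TIMEOUT_MS:
        raise ValueError(f"timeout_ms must be at least {MIN_TIMEOUT_MS}")
    return {
        "EmailAddress": email.strip(),
        # Assumed key spelling; check the developer guide.
        "AllowCorrections": "true" if allow_corrections else "false",
        # An integer sent as a string: digits only, no commas.
        "Timeout": str(timeout_ms),
        "LicenseKey": license_key,
    }
```

Validating the timeout locally avoids spending a request on a value the service would not accept.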

Accept, Reject or Review & Retry

ACCEPT

Emails with a score of 0, 1 or 2. In general it is recommended to not be too strict when accepting emails in a form because you do not want to potentially lose an end user.  Also, when performing form validation an end user may become agitated if they have to wait more than 5 seconds for the validation process to complete, but some slow mail servers may not be able to respond in that short amount of time.

REJECT

Emails with a score of 3 or 4. If you do not want to be too strict then you can set emails with a score of 3 aside for review, but you should always reject an email that receives a score of 4.

REVIEW & RETRY

Depending on how strict or cautious you want to be, you can choose not to accept emails with a score of 2 right away and instead put them aside for review. If the IsCatchAllDomain field is not true then you can try to validate the email again later. Email addresses that return a score of 3 can also be set aside for review if you do not want to reject all of them outright. An email will commonly be given a score of 3 if a potential vulgarity or string of garbage characters is found.

In form validation the programmer is sometimes allowed some luxuries while others are taken away. For example, a programmer can be given the opportunity to communicate a result back to the end user but is usually restricted to a shorter timeout so that the end user is not kept waiting too long. If you have the ability to communicate back to the end user, then ask the user to check for a typo and try again, or to try a different email address. If you don’t want to accept a role or alias type email address because they are commonly not accepted by mass email marketers, then you can check for that and tell the user to try again with a different email address.
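The accept, reject and review rules for forms can be written down in a few lines. This is a sketch of the policy described in this section, with hypothetical names; the `review_probably_bad` switch models the option of setting a score of 3 aside instead of rejecting it.

```python
def form_decision(score, review_probably_bad=False):
    """Map an EV3 score (0-4) to "accept", "review" or "reject" for a web form."""
    if score in (0, 1, 2):
        # Forms stay lenient: unknowns are accepted so that a slow mail
        # server does not cost you an end user.
        return "accept"
    if score == 3:
        return "review" if review_probably_bad else "reject"
    if score == 4:
        # A score of 4 should always be rejected.
        return "reject"
    raise ValueError(f"unexpected score: {score!r}")
```

On a "reject", the form can ask the user to check for a typo or to supply a different address, as described above.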

Use Case 2 – Using ValidateEmailAddress for Batches, Email Campaigns and Data Hygiene

The ValidateEmailAddress method has four input fields that are all required.

Input Field Name | Description | Notes
EmailAddress | The email address you wish to validate. |
AllowCorrections | Accepts true or false. The service will attempt to correct an email address if set to true. Otherwise the email address will be left unaltered if set to false. | The majority of the email corrections are performed on the domain. The local part of the email address, the portion before the @ symbol, is generally left untouched. Since you are unable to ask a user to re-enter and try again if they make a mistake, you can set this value to true and allow the service to make corrections.
Timeout | Accepts an integer as a string. Timeout time is in milliseconds. Do not include any commas or non-numeric values. | This value specifies how long the service is allowed to wait for all real-time network level checks to finish. Real-time checks consist primarily of DNS and SMTP level verification. A minimum value of 200ms is required. For non-form validation it is recommended to give the service plenty of time to verify an email address with its host mail server. Most mail servers will only take about 2 seconds on average to verify an email address, but for the occasional slow mail server that requires more time it is recommended to set the timeout to 65 seconds. The number of mail servers that require this much time is generally minimal, so the long timeout should not make a big impact on the overall batch job.
LicenseKey | Your license key to use the service. |

Accept, Reject or Review & Retry

ACCEPT

Emails with a score of 0 or 1.

REJECT

Emails with a score of 3 or 4. If you do not want to be too strict then you can set emails with a score of 3 aside for review, but you should always reject an email that receives a score of 4.

REVIEW & RETRY

Emails with a score of 2, unless the IsCatchAllDomain field value is true. An email that gets an unknown score due to a greylist, timeout or temporarily busy server should be checked again a couple of hours later.
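For a batch job, the same rules can be used to sort results into buckets. The sketch below assumes each result has already been parsed into a dictionary with `EmailAddress`, `Score` and `IsCatchAllDomain` keys; that layout and the function names are hypothetical. Sending unknown catch-all addresses to "review" is our assumption, since the text only says they should not be retried.

```python
from collections import defaultdict


def batch_bucket(score, is_catch_all_domain):
    """Return "accept", "reject", "retry" or "review" for one batch result."""
    if score in (0, 1):
        return "accept"
    if score in (3, 4):
        return "reject"
    if score == 2:
        # Retrying helps for greylists, timeouts and busy servers, but not
        # for catch-all domains, which never give a definite answer.
        if str(is_catch_all_domain).strip().lower() == "true":
            return "review"
        return "retry"
    raise ValueError(f"unexpected score: {score!r}")


def sort_batch(results):
    """Group email addresses by bucket, preserving input order per bucket."""
    buckets = defaultdict(list)
    for result in results:
        bucket = batch_bucket(result["Score"], result["IsCatchAllDomain"])
        buckets[bucket].append(result["EmailAddress"])
    return dict(buckets)
```

The "retry" bucket can then be resubmitted a couple of hours later, ideally with the longer timeout recommended for batches.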

If you would like to discuss your particular use case for recommendations and best practices, contact us!

New CRM or ERP? Reduce Your Migration Risk

Birds and data have one thing in common: migration is one of the biggest dangers they face. In the case of our feathered friends, their annual migration subjects them to risks ranging from exhaustion to unfamiliar predators. In the case of your data, moving it to a new CRM or ERP system carries serious risks as well. But with the right steps, you can mitigate these risks, and preserve the asset value of your contact database as it moves to a new system.

In general, there are two key flavors of data migration, each with its own unique challenges:

The Big Bang Approach. This involves conducting data migration within a small, defined processing window during a period when employees are not actively using the system – for example, over a long weekend or holiday break.

This approach sounds appealing for many sites, because it is the quickest way to complete the data migration process. However, its biggest challenge involves data verification and sign-off. Businesses seldom conduct a dry run before going live with migration, so the quality of the migrated data is often compromised.

One particular issue is the interface between a new enterprise system and internal corporate systems. According to TechRepublic, enterprise software vendors still suffer from a lack of standardization across their APIs, with the result that every integration requires at least some custom configuration, leading to concerns about both data integrity and follow-on maintenance.

The Trickle Approach. In this approach, which relies on real-time processes, the old and new data systems run in parallel and data is migrated in phases. Its key advantage is that it requires zero downtime.

The biggest challenge with this approach revolves around what happens when data changes, and how to track and maintain these changes across two systems. When changes occur, they must be re-migrated between the two systems, particularly if both systems are in use. This means that it is imperative for the process to be overseen by an operator from start to finish, around the clock.

Beyond these two strategies, there is the question of metadata-driven migration versus content-driven migration – another major hurdle in the quest to migrate genuine, accurate, and up-to-date data. IT might be more focused on the location of the source and the characteristics of each column, whereas marketing depends upon the accuracy of the content within each field. According to Oracle, this often leads to content that does not match up with its description, and underscores the need for close inter-departmental coordination.

Above all, it is critical that a data validation and verification system be in place before moving forward with or signing off on any data migration process. The common denominator here is that you must conduct data validation and verification BEFORE, DURING, and AFTER the migration process. This is where Service Objects comes into play.

Service Objects offers a complete suite of validation solutions that provide real-time data synchronization and verification, running behind the scenes and keeping your data genuine, accurate, and up to date. These tools include:

One particular capability that is useful for data migration is our Address Detective service, which uses fuzzy logic to fill in the gaps of missing address data in your contact records, validates the result against current USPS data, and returns a confidence score – perfect for cleaning contact records that may have been modified or lost field data during the migration process.

Taking steps to validate all data sources will save your company time and extra money. With Service Objects data validation services, we’ll help you avoid the costs associated with running manual verifications, retesting, and re-migration. And then, like the birds, it will be much easier for you and your data to fly through a major migration effort.

Best Practices for List Processing

List processing is one of the many options Service Objects offers for validating your data. This option is ideal for validating large sets of existing data when you’d rather not set up an API call or would simply prefer us to process the data quickly and securely. There is good reason to have us process your list: we have high standards for security and will treat a file with the utmost care.

As part of our list processing service, we offer PGP encryption for files, SFTP file transfers, and encryption to keep your data private and secure. We also have internal applications that allow us to process large lists of data quickly and easily. We have processed lists ranging from tens of thousands of records to upwards of 15 million records. Simply put, we consider ourselves experts at processing lists, and we’ll help ensure that your data gets the best possible return available from our services.

That said, a few steps can help guarantee that your data is processed efficiently. For the best list processing experience, and the best data available, we recommend following these best practices.

CSV Preparation

Our system processes CSV files. We will convert any file to the CSV format prior to list processing. If you want to deliver a CSV file to us directly, keep the following CSV preparation best practices in mind:

Processing international data – If you have a list of international data that needs to be processed, make sure the file has the right encoding. For example, if the original set of data is in an Excel spreadsheet, converting it to a CSV format can destroy foreign characters that may be in your file. When processing a list of US addresses, this may not be an issue, but if you are processing an international set of addresses through our DOTS Address Validation International service, it could significantly affect your file. One workaround is to save the file as Unicode text through Excel and then set the encoding to UTF-8 with BOM through a text editor. Another option is to send us the Excel file with the foreign characters preserved and we will convert it to CSV with the proper encoding.

Preventing commas from creating unwanted columns – Encapsulating a field containing commas inside quotation marks will prevent any stray commas from offsetting the columns in your CSV file. This ensures that the right data is processed when our applications parse through the CSV file.
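If you are exporting the list yourself, both tips can be handled in code. The following is a minimal sketch using Python's standard `csv` module; the `to_csv_bytes` name and the sample column names are illustrative only. The `utf-8-sig` codec writes UTF-8 with a leading BOM, and `csv.QUOTE_MINIMAL` wraps any field containing a comma in quotation marks.

```python
import csv
import io


def to_csv_bytes(header, rows):
    """Serialize rows to CSV bytes encoded as UTF-8 with a BOM.

    Fields that contain commas (or quotes or line breaks) are quoted
    automatically, so they cannot shift the columns.
    """
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL)
    writer.writerow(header)
    writer.writerows(rows)
    # "utf-8-sig" prepends the byte order mark to the UTF-8 output.
    return buffer.getvalue().encode("utf-8-sig")


# Example: a business name with a comma and non-ASCII characters.
data = to_csv_bytes(["Id", "BusinessName"], [["1", "Café Zürich, GmbH"]])
```

Writing `data` to a file in binary mode gives a CSV that keeps the accented characters intact and keeps the business name in a single column.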

Use Multiple Files for Large Lists

When processing a list with multiple millions of records, breaking the file into multiple files of about 1 million records each helps our system more easily process the list while also allowing for a faster review of the results.

Including a unique ID for each of the records in your list helps when updating your business application with the validated data.
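Splitting a large list and numbering its records can be done in one pass. This sketch is an illustration with hypothetical names; it assigns sequential integer IDs, which is only one possible scheme, and an ID that already exists in your business application would serve better if you have one.

```python
from itertools import islice


def chunk_with_ids(records, chunk_size=1_000_000, start_id=1):
    """Yield lists of (record_id, record) pairs, at most chunk_size per list.

    IDs are sequential across chunks, so each record stays uniquely
    identifiable after the files are processed separately.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    numbered = enumerate(records, start=start_id)
    while True:
        chunk = list(islice(numbered, chunk_size))
        if not chunk:
            return
        yield chunk
```

Each yielded chunk can be written to its own CSV file, with the ID as the first column so that validated results can be matched back to the source records.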

Configure the Inputs for the Service of Choice

Matching your input data to ours can speed up list processing time. For example, some lists parse address line 1 data into separate fields (e.g., 123 N Main St W would have separate columns for 123, N, Main, St, and W). DOTS Address Validation 3 currently has inputs for BusinessName, Address1, Address2, City, State and Zip. While we can certainly manipulate the data as needed, preformatting the data for our validation service can improve both list processing time and the turnaround time for updating your system with freshly validated data.
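As an example of preformatting, separately stored address parts can be joined back into a single Address1 value before the list is sent. This is a small sketch with a hypothetical helper name; it simply skips empty parts and normalizes spacing.

```python
def join_address_line(parts):
    """Join parsed address components into one Address1 string.

    Empty or missing components (None, "", whitespace) are skipped.
    """
    cleaned = (str(part).strip() for part in parts if part is not None)
    return " ".join(part for part in cleaned if part)
```

For the example above, `join_address_line(["123", "N", "Main", "St", "W"])` returns "123 N Main St W".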

These best practices will help ensure a fast and smooth list processing experience. If you have a file you need cleansed, validated or enhanced, feel free to upload it here.

The Letter that Continues to Arrive

Before moving to my current home, making sure I completed a change of address form with the Post Office was at the top of my “to do” list. Although most mail received these days is typically coupons and business advertisements, I looked forward to receiving the first piece of mail with my name and new address on the envelope. What can I say… I appreciate the little things in life.

Well, the first time I checked the mail I found a letter addressed to the prior resident. As I had recently filled out my own change of address form at the post office I understood it would take some time for each other’s information to be updated and anticipated this would continue happening for a bit. As expected, I began receiving mail addressed with my name soon after. However, years later I’m still getting the same letter from one particular storage company for the prior resident.

At first, I tried writing “Not at This Address,” “Moved, Left No Forwarding Address” and “Return to Sender” on the letters. After a couple of months I realized this did not work. The next thing I tried was calling the storage company. I thought the human element of speaking to someone over the phone and explaining the situation would resolve the case of this never-ending letter. This also did not work and actually seemed to make it worse.

As I mentioned previously, the bulk of my mail (like many other people’s) consists of coupons and advertisements addressed to “current resident,” which are seemingly impossible to stop. Along with these, the never-ending letter from this storage center started taking the excitement out of checking my mail. For a few years, checking the mail monthly instead of every few days became the routine. Every month, my mailbox was filled to capacity with mainly junk and, of course, a letter (or two or three) from the storage center. Unfortunately there are some drawbacks to checking your mail so infrequently. I eventually learned that if the mail does not fit in your box it is sent back to the post office, which is how I missed a wedding invitation and a few birthday cards. Needless to say, I went back to checking my mail more frequently and simply continued sending back the storage company letter, hoping they’d eventually run their customer database through a National Change of Address (NCOA) service.

While this situation was obviously annoying, I also wondered how much this letter alone must be costing the storage center. At this point, I estimate receiving about 100 copies of the same letter equating to:

  • $46 in postage alone (each letter has a $0.46 First-Class stamp)
  • 100 wasted envelopes
  • 100 wasted pieces of paper
  • Ink for each letter
  • Wasted time/salary of the person(s) at the storage center responsible for mailing
  • Wasted time for the mail sorter(s)
  • Wasted gas and time of the mail carrier(s)
  • A big headache for me over the last few years
  • Possible frustration for the last tenant who still hasn’t received this letter (I’m assuming it’s a bill which is even worse if they are incurring additional costs all this time)

Ultimately, this also damaged the reputation of the storage company. This mail discrepancy gave me a glimpse into their lack of customer service, organization and concern for our environment. By simply implementing an address validation check in their processes this entire scenario could be avoided. What’s worse is imagining how many other letters they are sending to the wrong address.

After further research, I found out anyone can submit a change of address form at the post office for prior tenants by making a note that they did not provide a forwarding address (the online form requires a forwarding address to submit). I’ll be heading to the post office today to fill one out. If that doesn’t resolve this, I also learned storage centers eventually auction off your items if you don’t pay your bills. Although I don’t want the prior tenant to lose their personal items, I’ll be glad to stop receiving these notices.

Until then…

If your business needs help avoiding the unnecessary costs, resources and headaches associated with outdated customer information, including name, address, phone, email and more, contact us!

Service Objects Lands on CIOReview’s Top 20 Most Promising API Solutions

Service Objects is very proud to have been recently selected as one of CIOReview’s Top 20 Most Promising API Solution Providers for 2016, judged by a distinguished panel comprised of CEOs, CIOs and VPs of IT, including CIOReview’s editorial board.

Now if you are reading this, you probably have one of two reactions: “Wow, that’s cool!” Or perhaps, “What’s an API?”

If it is the latter, allow us to explain. An API, short for an Application Programming Interface, is code that allows our data validation capabilities to be built into your software. Which means that applications ranging from marketing automation packages to CRM systems can reach into our extensive network of contact validation databases and logic, without ever leaving the application.

What this means for these applications is seamless integration, real-time results and better data quality. Their databases have correct, validated addresses. Their leads are scored for quality, so they are mailing to real people instead of “Howdy Doody.” Their orders are scanned for potential fraud, ranging from BIN validation on credit cards to geolocation for IP addresses, so that they know when an order for someone in Utah is originating in Uzbekistan.

What this means for you is that the applications you use are powered by the hundreds of authoritative data sources available through Service Objects – even if you never see it. Of course, we have many other ways to use our products, including real-time validation of lists using our PC-based DataTumbler application, batch FTP processing of lists, and even the ability to quickly look up specific addresses via the Web. But we are proud of our history of providing world-class data validation tools to application developers and systems integrators.

Now, if APIs are old hat to you, this award represents something important to you too: it recognizes our track record within the developer community of providing SaaS tools with superior commercial applicability, data security, uptime and technical support. As a companion article in CIOReview points out, “Service Objects is the only company to combine the freshest USPS address data with exclusive phone and demographic data. Continuous expansion of their authoritative data sets allows Service Objects to validate billions of addresses and phone numbers from around the world, making their information exceptionally accurate and complete.”

There is much more coming in the future, for systems integrators and end users alike. Our CEO Geoff Grow shared with CIOReview that one key focus is “more international data, as many of our clients are doing business outside the United States and Canada … The European and Asian markets are becoming increasingly important places (and) it is important for us to expand our product offerings and our expertise in more regions of the world.” And of course, our product offerings continue to grow and expand for clients in each of the markets we serve.

If you are a developer, we make it easy to put the power of Service Objects’ data validation capabilities in your own applications. Visit our website for complete documentation and sample code, or download a free trial API key for one of our 25 data quality solutions. We know you will see why our peers rank us as one of the best in the industry!

Data Monetization: Leveraging Your Data as an Asset

Everyone knows that Michael Dell built a giant computer business from scratch in a college dorm room. Less well known is how he got started: by selling newspaper subscriptions in his hometown of Houston.

You see, most newspaper salespeople took lists of prospects and started cold-calling them. Most weren’t interested. In his biography, Dell describes using a different strategy: he found out who had recently married or purchased a house from public records – both groups that were much more likely to want new newspaper subscriptions – and pitched to them. He was so successful that he eventually surprised his parents by driving off to college in a new BMW.

This is an example of data monetization – the use of data as a revenue source to improve your bottom line. Dell’s approach is an example of indirect data monetization, where data makes your sales process or other operations more effective. There is also direct data monetization, where you profit directly from the sale of your data, or the intelligence attached to it.

Data monetization has become big business. According to PwC consulting firm Strategy&, the market for commercializing data is projected to grow to US $300 billion annually in the financial services sector alone, while business intelligence analyst Jeff Morris predicts a US $5 billion-plus market for retail data analytics by 2020. Even Michael Dell, clearly remembering his newspaper-selling days, is now predicting that data analytics will be the next trillion-dollar market.

This growth market is clearly being driven by massive growth in data sources themselves, ranging from social media to the Internet of Things (IoT) – there is now income and insight to be gained out of everything from Facebook posts to remote sensing devices. But for most businesses, the first and easiest source of data monetization lies in their contact and CRM data.

Understanding the behaviors and preferences of customers, prospects and stakeholders is the key to indirect data monetization (such as targeted offers and better response rates), and sometimes direct data monetization (such as selling contact lists or analytical insight). In both cases, your success lives or dies on data quality. Here’s why:

  • Bad data makes your insights worthless. For example, if you are analyzing the purchasing behavior of your prospects, and many of them entered false names or contact information to obtain free information, then what “Donald Duck” does may have little bearing on data from qualified purchasers.
  • The reputational cost of inaccurate data goes up substantially when you attempt to monetize it – for example, imagine sending offers of repeat business to new prospects, or vice-versa.
  • As big data gets bigger, the human and financial costs of responding to inaccurate information rise proportionately.
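The first point above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular validation service works: the blocklist, the function names and the record layout (a dictionary with a "name" field) are all assumptions made for the example. The idea is simply to set obviously fake names aside before you analyze your prospects.

```python
# Hypothetical blocklist of obviously fake names. A real system would rely on
# a far larger list or a dedicated name validation service.
PLACEHOLDER_NAMES = {"donald duck", "mickey mouse", "homer simpson", "howdy doody"}


def is_placeholder_name(full_name):
    """Return True if the name matches the blocklist after normalizing
    case and whitespace."""
    normalized = " ".join(full_name.lower().split())
    return normalized in PLACEHOLDER_NAMES


def split_contacts(contacts):
    """Split contact records into (usable, suspect) lists based on the name."""
    usable, suspect = [], []
    for contact in contacts:
        target = suspect if is_placeholder_name(contact["name"]) else usable
        target.append(contact)
    return usable, suspect
```

Running your analysis on the usable list only keeps “Donald Duck” from skewing the picture, while the suspect list can be reviewed or discarded separately.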

Information Builders CIO Rado Kotorov puts it very succinctly: “Data monetization projects can only be successful if the data at hand is cleansed and ready for analysis.” This underscores the importance of using inexpensive, automated data verification and validation tools as part of your system. With the right partner, data monetization can become an important part of both your revenue stream and your brand – as you become known as a business that gives more customers what they want, more often.

How Much Is Bad Contact Data Costing Your Organization?

Collecting visitor and customer data through a variety of channels allows you to quickly grow your contact list. As contact information comes in, your company gains many new opportunities to expand its business. However, receiving high-quality data isn’t necessarily a given. After all, some visitors will enter bad contact data, such as bogus phone numbers, in an attempt to limit telemarketing calls; others may think they’re being funny by entering fake names such as Donald Duck or Homer Simpson; others might accidentally misspell a street name. Even autocorrect may change a legitimate entry, and some visitors may intentionally enter a bad address in an attempt to commit fraud.

Managing the effects of bad contact data is a surprisingly large cost for many organizations, involving a great deal of human intervention as well as diluted contact effectiveness. This is one area where an ounce of prevention is worth a pound of cure – particularly in today’s era of automated contact verification tools. Whatever your touch points are with customers or prospects – including customer data, contest entry forms, phone surveys, lead generation and point of sale interactions – some data will be inaccurate, incomplete, or fraudulent.

Why Data Quality Matters

Do you really need to worry about a few Donald Ducks or Homer Simpsons in your contact database? Yes, you do! Poor data quality translates into wasted time, resources, and money. For example, if you have a mailing list of 10,000 addresses, and 10 percent of those are inaccurate, you will waste 1,000 pieces of mail – plus the cost of product and postage – not to mention the human resources involved in processing incorrect outgoing and returned mail.
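To put a number on the mailing example, here is a rough back-of-the-envelope sketch in Python. The list size and error rate come from the example above; the cost of $1.50 per piece is purely an assumption for illustration, so substitute your own figure for product, printing and postage.

```python
def wasted_mailing_cost(list_size, bad_rate, cost_per_piece):
    """Return (wasted pieces, wasted cost) for a single mailing."""
    wasted_pieces = round(list_size * bad_rate)
    return wasted_pieces, wasted_pieces * cost_per_piece


# 10,000 addresses with 10 percent inaccurate, at an assumed $1.50 per piece.
pieces, cost = wasted_mailing_cost(10_000, 0.10, 1.50)
print(f"{pieces} wasted pieces, ${cost:,.2f} lost per mailing")
# prints "1000 wasted pieces, $1,500.00 lost per mailing"
```

This covers only the direct cost of one mailing. The staff time spent handling returned mail comes on top of it.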

Data quality from a customer service perspective is another big concern. Suppose that a customer orders an expensive product from your website, and accidentally enters their address as “2134 Main Street” instead of the correct address of “1234 Main Street.” First of all, you would ship the package to the wrong address. Not your fault, but it doesn’t matter: delivery will be delayed, the customer will have a poor experience, and you will incur re-shipping costs. You may not even get the original shipment back. It’s a lose-lose situation.

The problem is compounded when it comes to getting these customers in the first place: an estimated 25 percent of marketing contact data is bad. And according to The Data Warehousing Institute, this ocean of bad data costs businesses over US $600 billion per year. At the level of the individual company, this means that about a quarter of your sales and marketing resources are lost to bad prospects, for reasons that range from intentional fake contact data to the natural contact record aging process. In fact, Salesforce.com reports that after just one year, nearly 70% of contact data goes bad in some form as people change jobs, phone numbers and email addresses.

So how can your company avoid the challenges associated with bad data? Start by assessing what areas of your company need better data quality control. Whether you identify a single area or several areas that could benefit from improved data quality, realigning your data quality goals to hit the data trifecta can improve your bottom line, optimize your human capital and even help the environment.

To read more about this topic, download our white paper Hitting the Data Trifecta – Three Secrets of Achieving Data Quality Excellence.

Where Does Bad Data Come From?

We talk a great deal about data quality, validating information, and the impact on our business. Do we ever stop and think about where bad data comes from? It’s not like there is some bad part of town where bad data hangs out as in some B-movie. Bad data doesn’t spontaneously appear as some clouds part. It’s not delivered by some evil version of the stork. Bad data has to come from someplace, but where?

I like to put the sources of bad data into one of three categories: people, processes, and policies. It’s not that any of this happens intentionally. In the course of doing business, we make decisions or perform actions that impact data quality. If we understand the source, we can be better prepared to address the issues. Let’s look at the categories:

The first source of bad data is people. People do enter names like “Mickey Mouse” in a web form to download a piece of information. The resulting lead quality is now very low. If I’m a salesperson, I want to be selling, so I may not be very diligent about entering prospect information into a CRM system. In many instances, people just don’t know the right answer. How many of us know the full 9 digits of our home zip code? Could you properly format an address on a letter to France? How many different versions of a company name could be in the order entry system because the contact center people want to get the order booked? None of this is malicious, but it happens.

The second category, process, is a little more subtle. Two companies combine through a merger or acquisition. Those companies have different ERP systems. Chances are the data in the two systems aren’t consistent, so we now have a data quality problem trying to find the common customer records. Even within a single organization, the people in accounts receivable may be treating data differently than the people in shipping. When a customer moves, the process for updating the customer’s record may not be getting enough attention. The orders and invoices are now going to the wrong place, costing money and lowering customer satisfaction.
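To make the merger scenario concrete, here is a minimal Python sketch of one way to look for common customer records across two systems. It is an illustration under simple assumptions, not a full matching solution: the list of legal suffixes and the function names are hypothetical, and it only catches names that become identical after basic cleanup. Real record matching usually also compares addresses and tolerates misspellings.

```python
import re

# Hypothetical list of legal suffixes to ignore when comparing company names.
LEGAL_SUFFIXES = {"inc", "incorporated", "llc", "ltd", "corp", "corporation", "co"}


def company_key(name):
    """Build a comparison key: lowercase, punctuation removed, suffixes dropped."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(word for word in words if word not in LEGAL_SUFFIXES)


def find_common_customers(system_a, system_b):
    """Return (name_in_a, name_in_b) pairs whose comparison keys match."""
    keys_in_a = {company_key(name): name for name in system_a}
    return [
        (keys_in_a[company_key(name)], name)
        for name in system_b
        if company_key(name) in keys_in_a
    ]
```

With this cleanup, “Acme, Inc.” in one system and “ACME INC” in the other are treated as the same customer, while a name that appears in only one system is left alone.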

The third category, policies, can be external to an organization. Did you know that over 100 different postcode formats exist across the globe? In the US, we don’t even call them postcodes; we call them zip codes. Many countries don’t have postcodes at all. In countries like Japan, the format of the address changes depending on the language in which the address is written. The US includes states as a part of the address; most countries don’t. What happens to our data and our customers if we require a state and US-format zip code on a web form? You get the picture by now.
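One way to avoid forcing a US format on everyone is to check the postcode against the rules of the country the customer selected. The Python sketch below shows the idea under stated assumptions: the function name is hypothetical, the patterns cover only three countries for illustration, and any country without a listed pattern (Hong Kong, for instance, has no postcodes) is accepted as entered rather than rejected.

```python
import re

# Illustrative patterns for a handful of countries only.
POSTCODE_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),            # ZIP or ZIP+4
    "CA": re.compile(r"[A-Z]\d[A-Z] ?\d[A-Z]\d"),   # e.g. K1A 0B1
    "FR": re.compile(r"\d{5}"),                     # e.g. 75008
}


def postcode_is_acceptable(country, postcode):
    """Check a postcode against the selected country's pattern, if we have one."""
    pattern = POSTCODE_PATTERNS.get(country.upper())
    if pattern is None:
        # No rule for this country: accept the entry, including an empty one,
        # rather than turn away a real customer.
        return True
    return pattern.fullmatch(postcode.strip().upper()) is not None
```

With this approach a Canadian customer entering “K1A 0B1” passes, a customer in Hong Kong can leave the field blank, and a US customer still needs a valid 5-digit or ZIP+4 code.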

Rather than bemoan the state of data quality, let’s be aware of the sources. When we build our ERP systems, install our marketing automation systems, and create our websites, we should think about what can go wrong. From that point, we can help the people who use these systems, along with their policies and procedures, cope with the issues. Improving data quality at the source has huge payoffs.

30% of the data in your marketing automation platform is likely incorrect – see how bad your data is with a free scan!

Service Objects is the industry leader in real-time contact validation services.

Service Objects has verified over 2.8 billion contact records for clients from various industries including retail, technology, government, communications, leisure, utilities, and finance. Since 2001, thousands of businesses and developers have used our APIs to validate transactions to reduce fraud, increase conversions, and enhance incoming leads, Web orders, and customer lists.