Where Does Bad Data Come From?
We talk a great deal about data quality, validating information, and the impact on our business. But do we ever stop to think about where bad data comes from? It’s not like there is some bad part of town where bad data hangs out, as in some B-movie. Bad data doesn’t spontaneously appear when the clouds part. It’s not delivered by some evil version of the stork. Bad data has to come from someplace, but where?
I like to put the sources of bad data into one of three categories: people, processes, and policies. It’s not that any of this happens intentionally. In the course of doing business, we make decisions or perform actions that impact data quality. If we understand the source, we can be better prepared to address the issues. Let’s look at the categories:
The first source of bad data is people. People do enter names like “Mickey Mouse” in a web form to download a piece of information, and the resulting lead is close to worthless. If I’m a salesperson, I want to be selling, so I may not be very diligent about entering prospect information into a CRM system. In many instances, people just don’t know. How many of us know the full 9 digits of our home ZIP code? Could you properly format an address on a letter to France? How many different versions of a company name could be in the order entry system because the contact center people want to get the order booked? None of this is malicious, but it happens.
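To make the company-name question concrete, here is a minimal Python sketch of how variants of one name might be grouped together. The function names (company_key, group_variants) and the suffix list are assumptions made up for illustration; they are not taken from any particular CRM or order entry product, and real matching usually needs far more than this.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately short list of legal suffixes to ignore.
LEGAL_SUFFIXES = {"inc", "incorporated", "corp", "corporation",
                  "llc", "ltd", "co", "company"}

def company_key(name):
    """Build a rough comparison key: lowercase, no punctuation, no legal suffix."""
    words = re.sub(r"[^\w\s]", " ", name.lower()).split()
    # Drop trailing suffixes such as "Inc." or "Co., Ltd."
    while words and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)

def group_variants(names):
    """Group the names as entered under their shared comparison key."""
    groups = defaultdict(list)
    for name in names:
        groups[company_key(name)].append(name)
    return dict(groups)
```

Calling group_variants(["Acme, Inc.", "ACME Corp", "Widget LLC"]) puts the two Acme entries under one key and Widget under another. A key like this only catches the easy cases. Abbreviations, misspellings, and trading names would still slip through, so it is a starting point for review rather than an automatic merge.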
The second category, processes, is a little more subtle. Two companies combine through a merger or acquisition. Those companies have different ERP systems. Chances are the data in the two systems aren’t consistent, so we now have a data quality problem trying to find the common customer records. Even within a single organization, the people in accounts receivable may be treating data differently than the people in shipping. When a customer moves, the process for updating the customer’s address may not be getting enough attention. The orders and invoices then go to the wrong place, costing money and lowering customer satisfaction.
The third category is policies, and these can be external to an organization. Did you know that over 100 different postcode formats exist across the globe? In the US, we don’t even call them postcodes; we call them ZIP codes. Some countries don’t have postcodes at all. In countries like Japan, the format of the address changes depending on the language in which the address is written. The US includes states as a part of the address; many countries don’t. What happens to our data and our customers if we require a state and US-format ZIP code on a web form? You get the picture by now.
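One way to avoid forcing a US-format ZIP code on everyone is to check the postcode against the customer’s country. The Python sketch below shows the idea under some loud assumptions: the four patterns are a tiny illustrative sample, not a complete or authoritative list, and the names POSTCODE_PATTERNS and check_postcode are hypothetical. A production system would more likely rely on a maintained address data source.

```python
import re

# Illustrative sample only; real coverage needs a maintained data source.
POSTCODE_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),  # ZIP or ZIP+4
    "FR": re.compile(r"\d{5}"),
    "JP": re.compile(r"\d{3}-\d{4}"),
    "HK": None,  # Hong Kong does not use postcodes
}

def check_postcode(country, postcode):
    """Return True if the postcode is acceptable for the given country code."""
    postcode = (postcode or "").strip()
    if country not in POSTCODE_PATTERNS:
        # Unknown country: accept the value rather than turn the customer away.
        return True
    pattern = POSTCODE_PATTERNS[country]
    if pattern is None:
        # No postcode system: never require the field.
        return True
    return pattern.fullmatch(postcode) is not None
```

With this in place, check_postcode("JP", "100-0001") passes while check_postcode("US", "100-0001") does not. The design choice worth noticing is the fallback: when we don’t know a country’s rules, the sketch accepts what the customer typed instead of rejecting it, because a blocked form tends to produce either a lost customer or a made-up value.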
Rather than bemoan the state of data quality, let’s be aware of the sources. When we build our ERP systems, install our marketing automation systems, and create our websites, let’s think about what can go wrong. From that point, we can design those systems, along with the policies and procedures around them, to help the people who use them cope with these issues. Improving data quality at the source has huge payoffs.