Posts Tagged ‘Address Deduplication’

US Address Validation Use Cases

Why do people use our DOTS Address Validation – US product? There are many reasons to choose our service, and they depend on your business or project. At the highest level, our clients want to make their systems more efficient and reduce waste, fraud, and abuse. The next question becomes: how do we create efficiencies in our clients’ systems?

The primary solutions Service Objects’ real-time APIs offer typically fall into one of these general use case categories: preventing lost deliveries, eliminating waste and fraud, standardizing for compliance, gaining insights into customers and prospects, and optimizing marketing automation. These categories are relevant to the majority of our APIs; however, this blog will focus on Address Validation – US and give examples of how this service can be your solution.

Preventing lost deliveries (and delivering more)

Address Validation – US can help prevent lost deliveries, and in turn allow more deliveries to be completed successfully, by guaranteeing accurate, validated addresses. We do this with our CASS-certified address validation engine, which draws on 20 years of experience. We make sure addresses exist and are complete by fixing addresses and returning codes that give you additional details and visibility into the addresses at hand.

For example, besides just telling you what was fixed, or telling you what address component is missing, we will provide insights into the address and let you know if it was found to be vacant, returning mail, general delivery, rural route or highway contract and many more, which you can find in our developer guides.

With this information, your organization can make smart decisions on how to handle addresses and reduce the costs of delivering, re-delivering, handling, lost materials, materials sent to the wrong addresses and the cost of damaging your reputation.

Eliminating fraud

Fraud can manifest itself in many ways, and it is always important to keep in touch with new ways that this kind of abuse is attempted. There are several ways Address Validation – US will help eliminate fraud.

Some fraud is attempted by creating duplicate orders in an effort to either get free samples or try to get multiple orders of something and in turn try to resell them. The people behind these efforts will try to circumvent processes by entering their data into a system multiple times, each time entering the data slightly differently. These differences may include a misspelling on purpose such as using ‘Summerland’ and ‘Summer Land’ for the city, or entering the wrong street suffix such as ‘ST’ instead of ‘RD,’ or using both in separate instances. The changes or variations are usually small enough to create multiple different addresses that end up being, in fact, the same address.

Sometimes deliveries are made to multiple forms of the same address. Alias street names can look completely different even though they refer to the same street; for example, Highway 28 and Allen Street are the same street. Whatever the address inputs were, if the underlying address is actually the same, we can identify these duplicates in two ways. First, we return a standardized address that eliminates the variation in addresses, and second, we return barcode digits unique to the address that can also be used for de-duplication.
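As a rough sketch of the second approach, suppose each incoming order has already been validated and carries the barcode digits returned for its address. The `Order` type and its field names below are hypothetical and are not the service's response format; the addresses and barcode digits are the Cota Street and Embarcadero Del Norte examples listed later on this page. Repeat orders to the same delivery point could then be flagged like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    # Hypothetical record layout; field names are illustrative only.
    order_id: str
    address_as_entered: str
    barcode_digits: str  # assumed to come from a prior validation step


def flag_repeat_orders(orders):
    """Return the orders whose barcode digits already appeared earlier in the list."""
    seen = set()
    repeats = []
    for order in orders:
        if order.barcode_digits in seen:
            repeats.append(order)
        else:
            seen.add(order.barcode_digits)
    return repeats


orders = [
    Order("A-1", "27 East Cota Street Suite 500, Santa Barbara, CA", "931017602254"),
    Order("A-2", "27 E Cota St #500, Santa Barbara, CA", "931017602254"),
    Order("A-3", "960 Embarcadero Del Norte, Goleta, CA", "931175106601"),
]

repeats = flag_repeat_orders(orders)
print([order.order_id for order in repeats])
```

Here the second order is flagged even though its address string differs from the first. Whether a flagged order is actually fraudulent is a business decision; the sketch only surfaces candidates for review.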

Besides altering addresses to get multiple deliveries, fraud can come into play when someone has stolen purchasing details and needs to receive the delivery of a fraudulent order at another location. In some cases, the perpetrator will use the address of a vacant lot or an address with no delivery and wait for the mail to arrive. With the notes (detailed in our developer guide) that we return in the service, those kinds of details can be identified and flagged for review. In any case, an area found to be problematic or suspicious can be detected and flagged as well.

For instance, if your organization is having trouble with deliveries in certain regions or territories, ensuring the address you have is valid and accurate can help you flag addresses appropriately by getting the correct address, city, county, state, district, building type, delivery type and so on. Leaving these items unvalidated could open the door to fraud. The cost savings in making your process less prone to fraud can come in many forms, such as saving on delivery, handling materials, producing materials and protecting your organization’s reputation.

Standardizing for compliance

Compliance is a huge issue for most businesses nowadays, and it can be costly to be out of compliance. In recent years, lawmakers in various countries including the US have enacted compliance laws with respect to the gathering, handling, and storage of personal information. Not being in compliance with these new regulations can have a huge impact on your organization in terms of financial penalties and reputation damage.

When a request is made to purge personal information, organizations need to be able to identify all of the personal data connected to this individual and be able to purge it with certainty. One way we can help is by making sure the data you have on an individual is valid, accurate and standardized. In this situation, all three of these are equally important.

Take standardization as an example. If the same address is entered into your system with several different variations, then when it comes time to purge the data, some of it may be missed because of differences in the address. These variations can come from typos on forms, a call center technician misinterpreting information conveyed over the phone, or even data digitized through processes such as OCR that scan hard copy documents for processing by computer systems.

Our Address Validation – US product helps solve these problems. First, it performs the address validation so that you can be sure that the address you have is valid and accurate. Second, it standardizes the address so that organizations can rely on consistency, and they can use our barcode digits to identify duplicate addresses even when the original addresses had variations. (Of course, various people can live at a single address, so care has to be taken in properly identifying individuals at an address and making sure the wrong records are not being purged.)

Our address validation service is a huge step in untangling this problem, but we should also point out that we have other validation services. To untangle data points even further, our DOTS GeoPhone Plus service and others can help. The costs associated with being in compliance are minuscule compared to what they can be when dealing with litigation, data handling, and reputation issues.

Gaining insights into customers and prospects

Gaining insights into your customers and prospects from your data and having a more complete picture of your leads can give you many strategic advantages. As an example, imagine being on the phone with an engaged prospect, trying to relate to them by telling them about an experience you had in their home town, just to hear them say “I’m not sure what you’re talking about, I’m not from there.” That’s embarrassing! Getting this wrong can make you look contrived.

We can increase confidence when you are reaching out to contacts with valid, accurate data, but we are also giving you an opportunity to home in and paint a better, more complete picture. Knowing what congressional district they are in can give you insights into potential political leanings. Is their address residential? Is it a rural address? Do they live in an apartment? Is it a military address? We address these kinds of questions and help shape the relationships and outreach organizations have with their contacts. Moreover, delivery strategies can be tailored to be more efficient. For instance, depending on the insights provided, deliveries can be distributed to the appropriate delivery team or person, scheduled for the right times and/or charged the proper amount.

Having a complete picture not only allows you to distribute leads accurately to the right teams but also allows you to create unique territories that match your strengths as a sales team. Leads can be distributed loosely or tightly. The human resource cost in handling leads and gaining insights into your customers and prospects can be tremendous, as are the costs in trying to correct the data manually. Service Objects is here to help you greatly reduce that burden.

Optimizing marketing automation

Just as smart sales territories are important for lead distribution, smart marketing territories are important for marketing campaigns. When distributing marketing materials, it can be important that they are tailored and are sensitive to the target audience and location. It is also important to deliver materials to addresses that exist so that you can reach as many people as possible and make your campaign a success with a minimum of waste. Employing Address Validation – US as part of your solution can help minimize costs such as human resources, corrections, time, and delivery and re-delivery, as well as the implied costs to your reputation.

This is far from an all-inclusive list of use case categories. We have 23 other validation services besides Address Validation – US. Just imagine the possibilities when you pair this service with our other validation and data enrichment services.

Depending on your organization, one or more of these can be part of your solution in gaining maximum efficiency and reducing waste, fraud and abuse.

In Search of the Unique Address

Some things seem simple on the surface, but aren’t so easy in reality – for example, programming your DVR, or building assemble-it-yourself furniture. In the world of contact data quality, we would add one more item to the list: removing duplicate addresses from your database.

Why is this? Because in many cases, the exact same delivery location can be described in multiple ways and formats. Some of them are quirks of geography: for example, a location that can be described as part of different municipality levels, or a rural route location that also has a valid street address. Others are victims of syntax, such as having different ways of listing a suite or office number. Some can be caught by the human eye, but not easily by a computer. Still others would confuse anyone.

This article will look at many of the ways that duplicate addresses can slip by in your database – and some ways you can fix this, with a little automation. Let’s dive in.

Spotting Duplicate Addresses

If you were to look at the following address examples you would be able to easily identify them as being the same.

Example 1A:

27 East Cota Street Suite 500
Santa Barbara, CA 93101

Example 1B:

27 E Cota St #500
Santa Barbara, CA 93101

After all, the only difference between the two is that example 1B is abbreviated and example 1A is not. To a computer, however, the two addresses are distinctly different, and they would need to be standardized before they would look the same.

Using an automated solution, like our DOTS Address Validation products, that standardizes addresses according to USPS or other guidelines is a great solution for these scenarios. Here is how both of these addresses would look after being standardized:

27 E Cota St Ste 500
Santa Barbara, CA 93101-7602

How about this next example: do these addresses look the same to you?

Example 2A:

960 Embarcadero Del Norte
Isla Vista, CA 93117-5106

Example 2B:

960 Embarcadero Del Norte
Santa Barbara, CA 93117-5106

Example 2C:

960 Embarcadero Del Norte
Goleta, CA 93117-5106

Address examples 2A, 2B and 2C are all valid, USPS standardized, and they are all for the same mailing address. However, to a computer, they are still uniquely different. If you were maintaining a list of addresses and trying to remove duplicate addresses or prevent duplicates from being added, then the above examples would likely slip by unnoticed.

In this next example, would you have been able to guess that they are both for the same mailing address?

Example 3A:

RR 1 Box 1465
Bunch, OK 74931-5160

Example 3B:

90455 S 4687 Rd
Bunch, OK 74931-5160

In this case, a rural route address also has a street address equivalent. Here’s another example of duplicate addresses that would be tough to detect.

Example 4A:

10246 Spicewood Rd
Cadet, MO 63630-7211

Example 4B:

RT 2 Box 2730
Cadet, MO 63630-7211

If you were simply reliant on the address string to try and detect duplicates, then there is no way that you would be able to catch those examples. Wouldn’t it be nice if there was some sort of simple ID code or preferably an ID number that you could use to identify addresses instead of the full address? Actually, there is a solution for this in some countries, in the form of a unique address ID (UID).

Where Do IDs Come From?

Unique Address IDs should ideally come from an authoritative source, such as a postal authority or municipality. Authorities such as municipalities are generally responsible for naming streets and addressing buildings, while postal authorities are responsible for delivering mail to these locations. Differing authorities will generally come up with different IDs to fit their specific needs, and it is unlikely that address IDs will be shared by both. For example, municipalities will generally be more concerned with where an address is physically located and its classification type, whereas a postal authority will focus more on mail delivery and carrier routes. Therefore, it is not uncommon for a mailing address to differ drastically from its corresponding physical address.

Going back to address examples 2A, 2B and 2C, we have three duplicate mailing addresses, but of the three, example 2A is the one that best describes the address’s geographic location. This is because the address is geographically located in the unincorporated community of Isla Vista; however, Isla Vista has no post office of its own and mail is likely served by post offices in the neighboring cities of Goleta and Santa Barbara. According to USPS, all three city names are acceptable and can be used equally. This is because USPS has assigned them the same address barcode.

Address Barcodes

Barcodes are often unique address identifiers. For US mailing addresses, USPS provides a Delivery Point Barcode, which comprises the full ZIP+4, the Delivery Point Code, and finally a checksum digit. The barcode digits can be used to identify duplicate mailing addresses. For the previously mentioned address examples, the barcode digits are as follows.

Example 1 Barcode: 931017602254

Example 2 Barcode: 931175106601

Example 3 Barcode: 749315160554

Example 4 Barcode: 636307211461

With the barcode digits, it doesn’t matter how different the various duplicate address strings look, since they all share the same numeric barcode digits. Note that you can obtain these barcode digits as one of the outputs from DOTS Address Validation – US for US addresses.
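To make this concrete, here is a minimal sketch that groups the example address strings above by their barcode digits. The pairing of each string with its barcode is typed in by hand from the examples; in practice the digits would come from a validation step, and the helper name `group_by_barcode` is just an illustration.

```python
from collections import defaultdict

# (address string, barcode digits) pairs taken from Examples 1 through 4 above.
records = [
    ("27 East Cota Street Suite 500, Santa Barbara, CA 93101", "931017602254"),
    ("27 E Cota St #500, Santa Barbara, CA 93101", "931017602254"),
    ("960 Embarcadero Del Norte, Isla Vista, CA 93117-5106", "931175106601"),
    ("960 Embarcadero Del Norte, Santa Barbara, CA 93117-5106", "931175106601"),
    ("960 Embarcadero Del Norte, Goleta, CA 93117-5106", "931175106601"),
    ("RR 1 Box 1465, Bunch, OK 74931-5160", "749315160554"),
    ("90455 S 4687 Rd, Bunch, OK 74931-5160", "749315160554"),
    ("10246 Spicewood Rd, Cadet, MO 63630-7211", "636307211461"),
    ("RT 2 Box 2730, Cadet, MO 63630-7211", "636307211461"),
]


def group_by_barcode(pairs):
    """Map each barcode to the list of address strings that share it."""
    groups = defaultdict(list)
    for address, barcode in pairs:
        groups[barcode].append(address)
    return dict(groups)


groups = group_by_barcode(records)
distinct_strings = len({address for address, _ in records})

print(distinct_strings, "distinct address strings")
print(len(groups), "distinct delivery points")
```

A plain string comparison treats every one of these entries as different, while grouping on the barcode digits collapses them into one group per example.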

USPS is not the only postal authority to offer a unique ID. Australia, for example, has a Delivery Point Identifier (DPID) that can be used as a unique identifier for an Australian address. The DPID is generated and maintained by Australia Post. According to the Australia Post Data Guide, the DPID is defined as follows.

  • The Delivery Point Identifier (DPID) is a randomly generated, unique 8-digit number, which is allocated for every new address added to the source address database. All DPIDs, for complete addresses, fall within the range of 30,000,000 to 99,999,999.
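The definition above implies a simple format check that can be applied before a DPID is stored or compared. The function below is only a sketch of that check, and its name is made up for illustration. It confirms that a value has the right shape and range, not that Australia Post has actually allocated it.

```python
def looks_like_dpid(value: str) -> bool:
    """Check the shape described in the quoted definition:
    exactly 8 digits, within 30,000,000 to 99,999,999."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    return 30_000_000 <= int(value) <= 99_999_999


# Boundary values only; these are not real addresses.
print(looks_like_dpid("30000000"))
print(looks_like_dpid("29999999"))
```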

Unfortunately, not every country has an authority that offers a unique address identifier. Sometimes delivery point data simply isn’t available, and identifiers are only available for some buildings, streets, communities and regions.

IDs do not guarantee uniqueness

Even when an authoritative source offers a unique delivery point identifier, this does not necessarily equate to uniqueness. For example, not all addresses are deliverable, and some communities rely on general delivery services where the recipient is required to pick up their mail at a post office. Since a general delivery address can serve more than one person or household, it would be dangerous to rely on the address ID of one to try and remove duplicate addresses from a database if they are associated with contacts.
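One cautious way to handle this, sketched below, is to de-duplicate contacts on the address ID combined with a normalized name rather than on the address ID alone. The names are invented for illustration, the barcode is the Bunch, OK example from earlier, and the normalization (lowercasing and collapsing whitespace) is a deliberately simple assumption; real name matching usually needs more care.

```python
def contact_key(name: str, barcode_digits: str) -> tuple:
    """Build a de-duplication key from the address ID plus a normalized name."""
    # Lowercase and collapse repeated whitespace so trivial variations match.
    normalized_name = " ".join(name.lower().split())
    return (barcode_digits, normalized_name)


# Invented contacts that all share one address ID.
contacts = [
    ("Pat Example", "749315160554"),
    ("pat  example", "749315160554"),
    ("Sam Sample", "749315160554"),
]

unique_contacts = {contact_key(name, barcode) for name, barcode in contacts}
print(len(unique_contacts))
```

Keying on the barcode alone would merge all three records into one, which would wrongly drop a second person who receives mail at the same address.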

It is also not uncommon for some rural areas to share a mailbox. According to USPS’s General Guidelines and Policies for Rural Delivery,

  • On a rural route, more than one (1) family, but not more than five (5) families may use the same mailbox. A written notice of agreement signed by those who use such a box is filed with the postmaster at the delivery unit.

If more than one household is sharing a rural route mailbox then they will all share the same address barcode ID. So, while the address itself may be unique, it is not truly representative of the number of households behind it. This could prove problematic for businesses looking to use an address ID as a way to limit sales and promotions to a certain number of purchases or entries per household.

In some cases, there will be addresses that represent entities that send and receive large volumes of mail. These entities, such as universities, government agencies and some large corporations, will sometimes be assigned their own unique postal code. In the US these entities will be assigned a “unique ZIP+4” code. The French postal authority, La Poste, assigns CEDEX (Courrier d’Entreprise à Distribution Exceptionnelle) codes. In the UK, these codes are sometimes called “large user” codes and they are managed by Royal Mail. It is not uncommon for these large organizations to have their own mail department. Postal carriers are generally only responsible for delivering mail to these internal mail departments, and the large organization will handle delivery to the recipient.

Making sense of address IDs

Overall, when it comes to address IDs it is important to keep a few things in mind.

  • Make sure the address you are using accurately represents what you need it to in order to meet your business needs. For example, make sure you are not trying to use a mailing address as a physical address or vice versa.
  • If an address ID is available then ensure that it is generated and maintained by the appropriate authority, such as a postal authority for mailing addresses.
  • Depending on your needs, address IDs may not always be an appropriate method to ensure uniqueness.
  • Authorities have full control over the address ID. An address ID may change at any time or become orphaned without warning.

If you’re feeling less sure now about what you need then don’t fret. Sometimes the more you learn about a subject the more confusing that subject becomes. Here at Service Objects, we pride ourselves on helping you find the right tool for the job and are here to help.


Address Deduplication Using USPS Barcodes

When are two addresses actually the same? And when can you remove one of them from your contact database?

The answer isn’t as simple as it sounds. Suppose you have two addresses as follows: 429 East Figueroa Street, Apartment 1, Santa Barbara, California, 93101 versus 429 E Figueroa St Apt 1, Santa Barbara, CA 93101. Or suppose that, for only one of these two addresses, the street address and the apartment number are on separate lines. Simple text or line-by-line comparisons aren’t going to work in this case.

However, the United States Postal Service (USPS) can come to the rescue here, thanks to its standards for delivery barcodes.

What is a barcode?

Barcodes are unique identifiers assigned to each deliverable address by the USPS. A two-digit code between 00 and 99 is assigned to each address and then, when that code is combined with the address’s ZIP+4, a sequence is created to uniquely identify the delivery point. The complete barcode consists of a ZIP+4, a two-digit code identifying the premise, and a checksum digit that allows barcode sorters to verify the correctness of the ZIP, ZIP+4 and delivery point code.

Barcode Example: 931011445011

ZIP+4: 931011445
Delivery Point Code: 01
Checksum Digit: 1
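The layout just described can be sketched in a few lines of code. The type and function names below are illustrative only. The checksum rule used here, that the final digit brings the sum of all twelve digits to a multiple of 10, is the commonly documented USPS check-digit rule rather than something stated in this article, so treat it as an assumption to confirm against USPS documentation.

```python
from typing import NamedTuple


class DeliveryPointBarcode(NamedTuple):
    zip5: str
    plus4: str
    delivery_point: str
    check_digit: str


def parse_barcode(digits: str) -> DeliveryPointBarcode:
    """Split 12 barcode digits into ZIP, +4, delivery point code and checksum."""
    if len(digits) != 12 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly 12 digits")
    return DeliveryPointBarcode(digits[:5], digits[5:9], digits[9:11], digits[11])


def checksum_ok(digits: str) -> bool:
    """Assumed rule: all twelve digits sum to a multiple of 10."""
    return sum(int(d) for d in digits) % 10 == 0


parts = parse_barcode("931011445011")
print(parts)
print(checksum_ok("931011445011"))
```

A check like this only catches transcription errors in the digits; it says nothing about whether the address behind them is deliverable.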

How barcodes help you clean up duplicates

In short, barcodes can be leveraged to help identify duplicate records in your address database. The uniqueness of the barcode helps to solve the age old problem of identifying duplicate data. Let’s go back to the example we mentioned above:

Address A:

429 East Figueroa Street
Apartment 1
Santa Barbara, California, 93101

Address B:

429 E Figueroa St Apt 1
Santa Barbara, CA 93101

On the surface, these addresses seem very similar. They would both be deemed deliverable by the USPS despite their spelling differences. On one hand, you have Address A spelling out “East”, “Street”, “Apartment”, and “California”. On the other hand, Address B abbreviates these same fields. If you were to address an envelope with either of the spellings, it would reach the same destination.

As a human, looking at the two addresses above, it is easy to figure out that these two addresses are really the same delivery point. As a developer, however, figuring out that the two are the same is a nightmare without some sort of unique identifier. You would break these addresses into their component parts – address, address2, city, state, and zip – and then compare each field for Address A versus Address B.

If you came across any field that didn’t match up perfectly, you would assume the addresses were different and handle them accordingly. At this point it is easy to see that this approach is inadequate and would lead to the misidentification of the Address A/B example above. And even if you tried to write a smarter program, you would quickly discover that this is a complex problem involving fuzzy matching, distance algorithms, and various other string comparison algorithms. If only there was a unique identifier that could be assigned to an address…
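The naive field-by-field comparison described above can be sketched as follows. The dictionary keys mirror the component names mentioned in the text, and the helper name `mismatched_fields` is made up for illustration.

```python
address_a = {
    "address": "429 East Figueroa Street",
    "address2": "Apartment 1",
    "city": "Santa Barbara",
    "state": "California",
    "zip": "93101",
}
address_b = {
    "address": "429 E Figueroa St Apt 1",
    "address2": "",
    "city": "Santa Barbara",
    "state": "CA",
    "zip": "93101",
}


def mismatched_fields(a, b):
    """List the fields whose values are not exactly equal."""
    return [field for field in a if a[field] != b.get(field)]


differences = mismatched_fields(address_a, address_b)
print(differences)
```

Three of the five fields differ, so an exact comparison would treat these as two separate addresses even though they describe the same delivery point.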

This is where Service Objects’ DOTS Address Validation products shine. On top of the validation of each input, every deliverable address is matched up with its USPS barcode. With these barcodes in hand, it is easy to compare two addresses without having to worry about spelling or standardization differences.


Mailing address input:

[Images: example of full address input; example of abbreviated address input]

Service Objects’ return with barcode:

[Images: full and abbreviated address returns from Service Objects' address validation tool, with the barcode highlighted]

Detecting duplicate mailing addresses using the address’ USPS barcode is a simple, elegant solution to a complicated problem. If you’d like to try any of our address services, sign up for a free trial key and get your first 500 transactions free.