
Beginner’s Guide to International Phone Exchange

Our DOTS Phone Exchange can validate both domestic and international phone numbers. In this article, we will focus on validating international phone numbers using the operation, GetInternationalExchangeInfo. This operation parses a given international phone number to determine the validity of its telephone exchange. The resulting outputs also include geographic location and carrier information when available. Use of this operation helps determine the overall validity of a phone number and returns a standardized format along with other information about the line-type. Many countries have distinct numbering plans, formats, and rules. This operation verifies that each number conforms to those rules and confirms regional validity. 

To get started, the following GetInternationalExchangeInfo inputs are recommended:

Name | Type | Required? | Description
PhoneNumber | String | Yes | The phone number to be parsed, validated and formatted. The phone number may include extension as well. If no country is provided, the country code should be provided in the phone number.
Country | String | No | The Country format to be validated using the provided phone number. Acceptable Country formats include ISO2 (preferred), ISO3, full country name, variant of full country name, and IP of the phone number’s collection region.
LicenseKey | String | Yes | Your license key to use the service. Sign up for a free license key.

Although the Country input is not required, we highly recommend it. The Country input helps the service validate the phone number according to the rules and format of the given country, making it easier to parse the number and make possible corrections. Without the Country input, the service does its best to find a country that matches the given number; however, some countries share numbering plans and formats, which can lead to ambiguities.
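As a rough illustration of the inputs above, the following Python sketch assembles a request URL for GetInternationalExchangeInfo. Only the three input names come from the table above. The base URL is a placeholder and `build_query` is a hypothetical helper, so take the real endpoint from the developer guide.

```python
from urllib.parse import urlencode

# Placeholder endpoint: an assumption for illustration, not the real service URL.
BASE_URL = "https://example.invalid/GetInternationalExchangeInfo"


def build_query(phone_number, license_key, country=None):
    """Return a request URL carrying the inputs described above."""
    if not phone_number or not license_key:
        raise ValueError("PhoneNumber and LicenseKey are required")
    params = {"PhoneNumber": phone_number, "LicenseKey": license_key}
    # Country is optional but recommended; ISO2 is the preferred format.
    if country:
        params["Country"] = country
    return BASE_URL + "?" + urlencode(params)


url = build_query("+44 20 7946 0958", "YOUR-LICENSE-KEY", country="GB")
print(url)
```

Passing the country separately, rather than relying on the country code embedded in the number, mirrors the recommendation above and avoids the ambiguity of shared numbering plans.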

Outputs from Phone Exchange

The operation output consists of a response container and three complex child objects:

Response Objects Container:

  • InternationalExchangeInfoResponse

Child Objects:

  • InternationalExchangeInfo – contains useful information about the number, such as line type, validity, and geographic location details. The developer guide includes a full list of output fields and descriptions. NOTE: If an Error object is returned, then expect this object to be null.
  • Error – returned if a problem occurs. Specific error types and descriptions are returned and handled accordingly.
  • Debug – primarily used for internal use by Service Objects, but in some special cases, a client request may enable it to help troubleshoot client-specific issues.


Handling the Response Output from Phone Exchange

When calling the service, first check for errors and handle them appropriately. There are four different error types, which are listed below.

Error Type | TypeCode
Authorization | 1
User Input | 2
Service Objects Fatal | 3
Domain Specific | 4

Please refer to the developer guide for a complete list of errors and their descriptions.

Error type 1: Authorization

Authorization errors indicate a problem with the given license key: the key is not authorized to process transactions. License key expiration and exceeding the maximum transaction threshold both trigger Authorization errors. These errors do not resolve on their own; the license key owner must contact Service Objects to reauthorize the key or increase its maximum transaction threshold. The most common Authorization errors occur when first integrating the service or when a user tries using the wrong license key.

Error type 2: User Input

User Input errors occur because of a wrong input, like a missing PhoneNumber or LicenseKey input value.

Error type 3: Service Objects Fatal

These errors occur when the service encounters an unexpected error that causes the application to fail. If you encounter a fatal error, please report it to the Service Objects support team, along with any pertinent details, such as the request date and time, the operation name, and the input values. Fatal errors should be infrequent, so please report them so that the support team can investigate and fix any potential issues.

Error type 4: Domain Specific

These errors show that although the request was processed successfully, a problem occurred. Examples include a phone number entered without a country code, or a value that does not appear to be a valid phone number. Most domain-specific errors for GetInternationalExchangeInfo indicate a problem with the phone number and notify the user that it is likely invalid. Refer to the developer guide for a list of domain-specific errors and be sure to handle each one appropriately.
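The four error types above could be handled with a small dispatch table. The sketch below is a minimal Python illustration that assumes the JSON response has already been parsed into a dictionary with an `Error` key holding a `TypeCode` value. That shape, and the `describe_error` helper, are assumptions made for this example; check the developer guide for the actual field layout.

```python
# TypeCode values as listed in the error table above.
ERROR_ACTIONS = {
    "1": "Authorization: check the license key or contact Service Objects",
    "2": "User Input: fix the request inputs and retry",
    "3": "Service Objects Fatal: report to support with request details",
    "4": "Domain Specific: treat the phone number as likely invalid",
}


def describe_error(response):
    """Return a handling hint, or None when no Error object is present."""
    error = response.get("Error")
    if not error:
        return None
    # TypeCode may arrive as a string or a number depending on the format.
    code = str(error.get("TypeCode", "")).strip()
    return ERROR_ACTIONS.get(code, "Unknown error type: " + code)


sample = {"InternationalExchangeInfo": None, "Error": {"TypeCode": "2"}}
print(describe_error(sample))
```

Checking for the Error object first matches the advice above, since InternationalExchangeInfo is expected to be null whenever an error is returned.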

InternationalExchangeInfo Object

If no errors are returned, then you may start using the InternationalExchangeInfo object to get details about the phone number, like country, locality and line type. We recommend first checking the following two output values before any others in the InternationalExchangeInfo object:

Name | Type | Description
IsValid | Boolean | A boolean response type determining whether the phone number is a valid phone number.
IsValidForRegion | Boolean | A boolean response type determining whether the phone number is a valid phone number for the provided Country.

If the phone number is found to be invalid, you can save yourself time by immediately rejecting it. If the number is valid, then start making use of the other output values as they pertain to your needs. The complete list of outputs can be seen below:

Name | Type | Description
PhoneNumberIn | String | The phone number that was provided as input.
CountryCode | String | The country code of the provided phone number.
FormatNational | String | The provided phone number in a national format.
Extension | String | The parsed extension from the provided phone number.
Locality | String | The locality from where the phone number belongs. The locality format is generally Locale/Region, Region or Country.
LocalityMatchLevel | String | The match level that was determined from the locality that was found. Possible values include (Locale/Region, Region, Country) match.
TimeZone | String | The Time Zone of the validated phone number.
Latitude | String | The latitude of the locality determined from the phone number.
Longitude | String | The longitude of the locality determined from the phone number.
Country | String | The country to which the validated phone number belongs.
CountryISO2 | String | The ISO 2 character country designation for a validated phone number.
CountryISO3 | String | The ISO 3 character country designation for a validated phone number.
FormatInternational | String | The provided phone number in international format.
FormatE164 | String | The provided phone number in E.164 format.
LineType | String | The linetype determined for the phone number. See InternationalExchangeInfo LineType table.
SMSAddress | String | The SMS gateway address for the provided mobile number.
MMSAddress | String | The MMS gateway address for the provided mobile number.
IsValid | Boolean | A boolean response type determining whether the phone number is a valid phone number.
IsValidForRegion | Boolean | A boolean response type determining whether the phone number is a valid phone number for the provided Country.
NoteCodes | String | The corresponding codes which match NoteDescriptions. These values vary from 1 to 6 based on test results. See NoteCode table below.
NoteDescriptions | String | The corresponding descriptions which match NoteCodes. These values vary based on NoteCodes. See NoteCode table below.

Geographic information about a number is useful, as is knowing if the line type is mobile, fixed (landline) line or Voice Over IP (VOIP) line. As always, please refer to the developer guide for the complete list of outputs and descriptions.
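The recommended first checks, IsValid and then IsValidForRegion, might look like the Python sketch below. It assumes the InternationalExchangeInfo object has been parsed into a dictionary and that the two flags may arrive either as booleans or as "true"/"false" strings. Both points are assumptions for illustration, and `accept_number` is a hypothetical helper.

```python
def as_bool(value):
    """Accept real booleans as well as 'true'/'false' strings."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == "true"


def accept_number(info, require_region_match=True):
    """Return True when the number passes the two recommended checks."""
    if not info:
        # A null object is expected whenever an Error was returned.
        return False
    if not as_bool(info.get("IsValid")):
        return False
    if require_region_match and not as_bool(info.get("IsValidForRegion")):
        return False
    return True


info = {"IsValid": "true", "IsValidForRegion": "false"}
print(accept_number(info))
print(accept_number(info, require_region_match=False))
```

Whether a number that is valid but not valid for the supplied Country should be rejected depends on your use case, which is why the region check is left switchable here.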

When in doubt – reach out!

Hopefully, this overview provides a strong understanding of how you can use the operation, GetInternationalExchangeInfo, to validate and standardize international phone numbers. If you are looking to validate and standardize US and Canadian phone numbers, we recommend the GetExchangeInfo operation.

As always, our experienced staff is ready to help, so please do not hesitate to reach out. We would be happy to answer any questions and offer best practices for using our service.


Understanding Addresses in the United Kingdom

For companies who deal with users in the United Kingdom, this reference guide can help you better understand how UK addresses are formatted and what makes an address valid.

The United Kingdom: Three Nations, One Province, 29 Million Addresses

The United Kingdom of Great Britain and Northern Ireland – commonly referred to as Britain, the United Kingdom, or simply the UK – is made up of three nations and one province: England, Scotland, Wales, and Northern Ireland. There are approximately 29 million known deliverable addresses in the UK, with over five thousand addresses being added and removed monthly.

International Country Code

The International Organization for Standardization (ISO) published the ISO 3166 standard, officially known as Codes for the representation of names of countries and their subdivisions.

The ISO 3166 standard consists of three parts:

Part 1: ISO 3166-1, Country Codes – defines codes for the names of countries, dependent territories, and special areas of geographical interest.
Part 2: ISO 3166-2, Country subdivision code – defines codes for the names of primary subdivisions of a country, such as a state or a province.
Part 3: ISO 3166-3, Code for formerly used names of countries – defines codes for country names that have been removed from ISO 3166-1.

 

ISO 3166-1, which defines country codes, contains three sets of country codes:

ISO 3166-1 alpha-2: Defines a country as a two-letter country code, commonly referred to as the ISO, ISO2, or ISO-2.
ISO 3166-1 alpha-3: Defines a country as a three-letter country code, commonly referred to as ISO3 or ISO-3.
ISO 3166-1 numeric: Defines a country as a three-digit numeric code, commonly referred to as numeric-3.

 

ISO 3166-1 Country Codes – United Kingdom

ISO 3166-1 alpha-2 code | ISO 3166-1 alpha-3 code | ISO 3166-1 numeric code
GB | GBR | 826

Note that the alpha-2 code is GB and not UK.

 

ISO 3166-2 Country Codes

In the ISO standard, England, Scotland, Wales, and Northern Ireland are not included in the ISO 3166-1 country list and are instead listed as subdivisions of GB in ISO 3166-2. However, their subdivision category is that of “country,” except for Northern Ireland, which is described as a “province.”

ISO 3166-2 code | Subdivision Name | Subdivision category
GB-ENG | England | Country
GB-SCT | Scotland | Country
GB-WLS | Wales | Country
GB-NIR | Northern Ireland | Province

 

UK vs. GB Country Code – FIPS vs. ISO 3166-1

The United States Federal Government developed the Federal Information Processing Standards (FIPS) for use in computer systems by non-military government agencies and government contractors. Country codes are defined in the FIPS 10-4 standard, where the United Kingdom is listed as UK and not GB. However, although the FIPS 10-4 codes were defined for use in computer systems, many institutes and agencies have dropped the standard in favor of ISO 3166-1, making ISO 3166-1 alpha-2 the global standard.
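Because data sources may carry either the FIPS code or one of the ISO codes, it can help to normalize them before use. The Python sketch below maps the United Kingdom codes mentioned above to the ISO 3166-1 alpha-2 value. The `to_iso_alpha2` helper is hypothetical and deliberately handles only the United Kingdom.

```python
# Codes for the United Kingdom taken from the tables above.
UK_CODES = {"alpha2": "GB", "alpha3": "GBR", "numeric": "826"}


def to_iso_alpha2(code):
    """Map common United Kingdom codes to ISO 3166-1 alpha-2."""
    cleaned = code.strip().upper()
    # "UK" is the FIPS 10-4 code; ISO 3166-1 alpha-2 uses "GB".
    if cleaned in {"UK", "GB", "GBR", "826"}:
        return UK_CODES["alpha2"]
    # Anything else is passed through unchanged in this sketch.
    return cleaned


print(to_iso_alpha2("uk"))
```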

Postal Services

Mail in the United Kingdom is primarily handled by Royal Mail. Royal Mail was established in 1516 by King Henry VIII, and it was government owned for 499 years. There are other mail delivery services available in the market, but many of them use Royal Mail for last-mile delivery.

Address Format

The address format for mail delivery in the United Kingdom is defined by Royal Mail, where an address is made up of four elements. The elements should appear in the following order:

Address Element | Address Example | Element Names
Premise | Royal Mail | [Organization]
Premise | Flat 9, Wheatstone House | [Sub-Building], [Building Name]
Premise | (appears on the thoroughfare line) | [Building Number]*
Thoroughfare | 47* Gorse View | [Dependent Thoroughfare]
Thoroughfare | School Road | [Thoroughfare]
Locality | Southampton | [Double Dependent Locality]
Locality | Knodishall | [Dependent Locality]
Locality | Saxmundham | [Post Town]
Postcode | SW1A 1AA | [Postcode]

*NOTE: Although the building number appears with the thoroughfare, it is part of the Premise element.

Premise Elements

A mailing address premise is made up of the following elements:

Order | Premise Element Name | Description | Example
1 | Organization | The name of the organization, and when necessary the name of the department within the organization, which is registered to the delivery address. | Royal Mail
2 | Sub-Building | This is also known as a sub-premise, such as an apartment, flat, or suite. | Flat 9
3 | Building Name | The name of the building of the business or residence. | Wheatstone House
4 | Building Number | Also known as the premise number or address number, this number identifies the premise on the thoroughfare or dependent thoroughfare. | 47

 

Not all elements are required, but enough must be given to identify a single unambiguous delivery point. Also, note that while the building number is a part of the premise element, it must be applied on the same line as its corresponding thoroughfare.

Thoroughfare Elements

A thoroughfare is made up of the following elements:

Order | Thoroughfare Element Name | Description | Example
1 | Dependent Thoroughfare | Distinguishes a premise when a thoroughfare appears more than once in a post town. | Gorse View
2 | Thoroughfare | This is also known as the street or road. | School Road

 

Royal Mail defines three thoroughfare address possibilities:

  1. Thoroughfare without a dependent thoroughfare – When an address does not include a dependent thoroughfare, the element is to be omitted.
  2. Thoroughfare with a dependent thoroughfare – When an address contains both elements, Royal Mail instructs that the dependent thoroughfare is required and the thoroughfare is optional.
  3. No Thoroughfare – Not all addresses contain a thoroughfare, in which case the thoroughfare element is simply omitted.

Example:

Her Majesty the Queen
Buckingham Palace
London
SW1A 1AA

Locality Elements

The mail address Locality is made up of the following elements:

Order | Locality Element Name | Description | Example
1 | Double Dependent Locality | Distinguishes a premise when an address thoroughfare appears more than once in the same post town and dependent Locality. | Southampton
2 | Dependent Locality | Distinguishes a premise when an address thoroughfare appears more than once in the same post town. | Knodishall
3 | Post Town | Also known as the Locality; however, the post town represents the postal delivery Locality and not necessarily the geographic Locality. | Saxmundham

 

Other aspects of Locality elements you should be aware of:

1. The Post Town is required.
2. The initial letter of the Post Town must always be capitalized.
3. The Post Town may be written in all capital letters (uppercase). It is the only Locality element where this is allowed.

Postcode

The postcode is made up of the following elements:

Order | Element Name | Description | Example
1 | Postcode | Also known as a postal code, this is an alpha-numeric code that is associated with one or more addresses along one or more thoroughfares. | SW1A 1AA

 

The postcode must be written in all capital letters (uppercase) and must be the last address element. Royal Mail recommends that the postcode be listed as a singular element on the last line of the address; however, it may be preceded by either the county or post town on the same address line when separated by a space or on the preceding line.
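To make the element order concrete, here is a minimal Python sketch that assembles address lines in the premise, thoroughfare, locality, postcode sequence described in the sections above. The `format_uk_address` function and its parameter names are hypothetical. It is an illustration of the ordering rules, not Royal Mail's formatting logic, and it performs no validation.

```python
def format_uk_address(organization="", sub_building="", building_name="",
                      building_number="", dependent_thoroughfare="",
                      thoroughfare="", double_dependent_locality="",
                      dependent_locality="", post_town="", postcode=""):
    """Return address lines in premise, thoroughfare, locality, postcode order."""
    if not post_town:
        raise ValueError("the post town is required")
    premise = ", ".join(p for p in (sub_building, building_name) if p)
    streets = [s for s in (dependent_thoroughfare, thoroughfare) if s]
    # The building number goes on the same line as its thoroughfare.
    if building_number and streets:
        streets[0] = f"{building_number} {streets[0]}"
    elif building_number:
        # Assumption: with no thoroughfare, keep the number with the premise.
        premise = ", ".join(p for p in (premise, building_number) if p)
    # The post town may be written in capitals; the postcode must be.
    lines = [organization, premise, *streets, double_dependent_locality,
             dependent_locality, post_town.upper(), postcode.upper()]
    return [line for line in lines if line]


lines = format_uk_address(
    organization="Royal Mail", sub_building="Flat 9",
    building_name="Wheatstone House", building_number="47",
    dependent_thoroughfare="Gorse View", thoroughfare="School Road",
    double_dependent_locality="Southampton", dependent_locality="Knodishall",
    post_town="Saxmundham", postcode="sw1a 1aa")
print("\n".join(lines))
```

Empty elements are simply skipped, which covers the case of an address with no thoroughfare at all.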

Regarding County

Though the geographic county of an address is not required, according to the Royal Mail website regarding the inclusion of county, “you are welcome to do so.” However, the issue of listing geographic counties and postal counties has been the cause of some confusion over the years. Counties were removed from the address elements in the early 2000s, and they are no longer officially supported. This was due, in some part, to the boundaries of postal counties and geographic counties not matching up, so an address in one geographic county would be listed in a postal county of a different name.

Postcode Overview

The postcode is an alphanumeric code of varying length that is composed of two codes called the outward code and inward code. It ranges from six to eight characters in length with a single space to separate the two codes. A postcode may represent a group of addresses on a street or on a part of a street, a group of premises, or a single premise. On rare occasions, it may also represent a group of addresses on more than one street.

Postcode Format

Outward code

1. Postcode area
2. Postcode district

Inward Code

1. Postcode sector
2. Postcode unit

For example, in the postcode “SW1A 1AA” we have the following:

Name | Example
Postcode | SW1A 1AA
Outward code | SW1A
Postcode area | SW
Postcode district | SW1A
Inward code | 1AA
Postcode sector | SW1A 1
Postcode unit | AA

Outward Code

The outward code represents the first half of the postcode that precedes the single space separator. It is made up of the postcode area and postcode district. The length of the outward code is between two and four characters.

Postcode Area 

The postcode area is an alpha code that is one or two characters in length. The code commonly represents a geographical area. For example, “SW” represents London, “AB” is commonly Aberdeenshire, “BS” is often Avon, and so on.

Postcode District 

The postcode district is the postcode area plus an alphanumeric code, essentially making it the outward code.

Inward Code

The inward code represents the second half of the postcode, immediately following the single space separator in the middle. It is three characters in length. The inward code is used to assist in the delivery of mail within a district.

Postcode Sector 

The postcode sector is between four and six characters in length. It begins with the outward code, followed by the single space separator, and ends with the first digit of the inward code.

Postcode Unit 

The postcode unit is an alpha code that is two characters in length. In addition to representing a group of addresses, the postcode unit may also represent a unique premise, an individual organization, or even a subsection/department of an organization. Postcode unit level designation cannot be purchased; it is determined by the amount of mail received by the premises or organization.
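The breakdown above can be expressed as a short parser. This Python sketch splits a postcode into its outward code, area, district, inward code, sector, and unit using a regular expression. The pattern is a simplified structural check built from the lengths given in this guide, not an official validation rule, and `split_postcode` is a hypothetical helper.

```python
import re

# Outward code: 1-2 letters (area) plus 1-2 further characters (2-4 in total).
# Inward code: one digit (the sector digit) plus two letters (the unit).
POSTCODE_RE = re.compile(r"^([A-Z]{1,2})([0-9][0-9A-Z]?) ([0-9])([A-Z]{2})$")


def split_postcode(postcode):
    """Split a postcode into the parts described above, or return None."""
    # Uppercase the input and collapse runs of whitespace to a single space.
    match = POSTCODE_RE.match(" ".join(postcode.upper().split()))
    if not match:
        return None
    area, district_part, sector_digit, unit = match.groups()
    outward = area + district_part
    return {
        "outward": outward,
        "area": area,
        "district": outward,
        "inward": sector_digit + unit,
        "sector": outward + " " + sector_digit,
        "unit": unit,
    }


print(split_postcode("sw1a 1aa"))
```

A structurally well-formed postcode is not necessarily a deliverable one, so a check like this is best treated as a pre-filter ahead of real address validation.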

Special Postcodes

Royal Mail will assign postcodes to some high-profile organizations such as banks and telecoms, as well as non-geographic postcodes for assignment to PO Boxes and direct marketing. It will also assign postcodes to crown dependencies, overseas territories, and HM British Forces.

Crown dependencies

The crown dependencies are three self-governing island territories off the coast of Britain for which the United Kingdom is responsible. However, they are not a part of the United Kingdom or its territories. These islands have adopted the UK postcode format.

Name | Postcode area
Guernsey | GY
Jersey | JE
Isle of Man | IM

Overseas Territories

The United Kingdom has 14 overseas territories. They are commonly referred to as British Overseas Territories or the United Kingdom Overseas Territories. These territories are mostly self-governed, and some have developed their own postal codes, such as Bermuda, the Cayman Islands, the British Virgin Islands, and Montserrat.

British Forces Post Office

The British Forces Post Office (BFPO) and Royal Mail use the non-geographic postcode area “BF” to represent a BFPO address.
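Since these special cases are identified by their postcode area, a simple lookup can flag them. The Python sketch below uses only the four areas named above. The `classify_postcode_area` helper is hypothetical, and the sample postcodes are made-up inputs used purely to exercise it.

```python
# Postcode areas named in the sections above.
SPECIAL_AREAS = {
    "GY": "Crown dependency (Guernsey)",
    "JE": "Crown dependency (Jersey)",
    "IM": "Crown dependency (Isle of Man)",
    "BF": "British Forces Post Office",
}


def classify_postcode_area(postcode):
    """Return a label for the special areas above, else 'Standard or other'."""
    letters = ""
    # The postcode area is the run of letters before the first digit.
    for ch in postcode.strip().upper():
        if not ch.isalpha():
            break
        letters += ch
    return SPECIAL_AREAS.get(letters, "Standard or other")


print(classify_postcode_area("JE2 3AB"))
print(classify_postcode_area("SW1A 1AA"))
```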

Address Validation International: Overcoming Cultural Idiosyncrasies and Postal Format Variables

The above content provides a general overview of addresses in the United Kingdom. Understanding all the ins-and-outs of UK addresses can be a monumental task on its own. In addition, the ever-changing list of addresses, postcodes, and regulatory boundaries involved can make for a very dizzying array of challenges. Fortunately, the DOTS Address Validation International real-time service is capable and robust enough to handle various address formats and cultural idiosyncrasies. As always, our experienced staff is here to help, so please do not hesitate to reach out to us! We would be happy to answer any follow-up questions you may have and make recommendations on how to interpret and use the results from the service.

 

 

 

DOTS Address Validation International (AVI) enables businesses to develop consistent addressing formats for their international addresses.

AVI Address Output: We Speak Your Language

You say tomato, I say tomahto.
You say Rome, I say Roma.
You say Munich, I say München.
Let’s not call the whole thing off.

Have you ever wondered why the country code and abbreviation for Germany is DE, or similarly why it is ES for Spain? Unlike FR and CA, which are France and Canada respectively, DE and ES seem out of place for Germany and Spain. A simple explanation is that DE is short for Deutschland and ES is short for España – which are the names used locally for these countries.

Local names such as Deutschland and España are known as endonyms, and Germany and Spain are English language exonyms. You may be wondering, what are endonyms and exonyms? To put it simply, endonyms are the names of places used by the locals and exonyms are the names used by foreigners. So an endonym is what a country calls itself, and an exonym is the name used by other countries.

(As another example, United States is an endonym for, well, the United States. Meanwhile, exonyms for the United States will depend on the country involved: the French call us the États Unis and the Russians call us Соединенные Штаты.)

The DOTS Address Validation International (AVI) service currently offers three output language options to let the end user choose their preferred language setting and behavior: ENGLISH, BOTH (English and local addresses), and LOCAL_ROMAN. Let’s examine each of these in detail:

ENGLISH – Instructs the service to return the address in English, without any localized text or accents.

BOTH – Instructs the service to return a standardized address in both English and in its localized text (e.g., Cyrillic, Chinese, etc.) and format when applicable.

Here’s an example of a Chinese address in both English and in its local Chinese text.

Address input in English

No. 1514 Changyang Lu
Yangpu Qu, Shanghai Shi

Address output in Simplified Chinese

上海市杨浦区长阳路1514号

 

Here’s an example of a Russian address in both English and Cyrillic.

Address input in English

Kommunarov Ul, 290, 9
Krasnodar
Krasnodarskiii Kraii
350020

Address output in Cyrillic

Коммунаров ул, д. 290, OFFICE 9
КРАСНОДАР
КРАСНОДАРСКИЙ КРАЙ
350020

 

One last example, this time in Greece.

Address input in English

Alkamenous 76
104 40 Athens

Address output in Greek

104 40 Αθηνα
Αλκαμενους 76

 

LOCAL_ROMAN – Instructs the service to return the address in its local spelling using Roman text.

For example, the city of Rome will be returned as Roma, Naples as Napoli, Dublin as Baile Átha Cliath, Naestved as Næstved, and Cologne as Köln. Let’s take a look at some address examples.

Here’s an example of an address in Italy.

Address input in English

Via Villafranca 20
00185 Rome RM

Address output in Italian

Via Villafranca 20
00185 Roma RM

 

Example of an address in Denmark

Address input in English

Kobmagergade 20
4700 Naestved

Address output in Danish

Købmagergade 20
4700 Næstved

 

Example of an address in Germany.

Address input in English

Weisshausstr. 20-30
50939 Cologne

Address output in German

Weißhausstr. 20-30
50939 Köln

 

For some countries, the service can also accept an address in its localized spelling and text and return the address in English. Try entering any of the address examples above into the AVI service using the local language, spelling, and format with the output language set to ENGLISH to see the address validated and standardized into English. When submitting an address in a non-English language, be careful to ensure that the text is properly encoded.

The AVI service cannot correct corrupted characters, so it is important to ensure that anything that holds the address in memory or stores the data can support the character set. Otherwise, you will end up with data corruption, which is not always easy to detect or fix.

For example, in some cases, a character may simply come back as a question mark ‘?’ or a square ‘■’. Take the following address.

Weißhausstr. 20-30
50939 Köln

The fourth character of the first line and the eighth character of the second line will come back corrupted, as follows:

Wei?hausstr. 20-30
50939 K?ln

 

In other cases, the corruption can be quite severe, and you may end up with something like ‘تخت اره ÙŠÚ©’. Not only is it important to ensure that you do not send any corrupted data to the AVI service, but you also want to make sure that you properly handle and store the service response. Otherwise you may end up corrupting an address after it has been validated. (How this happens would make a good topic for another blog, but for now, just make sure to use the Unicode Transformation Format (UTF) on everything that handles the data.)
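Both kinds of corruption described above are easy to reproduce. The following Python sketch forces the German address into ASCII, which turns unsupported characters into question marks, and then decodes UTF-8 bytes with the wrong codec, which yields garbled text. It uses only the standard library, and the choice of Latin-1 as the mismatched codec is just one example of a wrong decoder.

```python
address = "Weißhausstr. 20-30\n50939 Köln"

# Forcing the text into ASCII replaces each unsupported character with "?".
lossy = address.encode("ascii", errors="replace").decode("ascii")
print(lossy)

# Decoding UTF-8 bytes with the wrong codec produces garbled text instead.
mojibake = "Köln".encode("utf-8").decode("latin-1")
print(mojibake)

# Encoding and decoding with UTF-8 on both sides keeps the text intact.
round_trip = address.encode("utf-8").decode("utf-8")
print(round_trip == address)
```

The first kind of damage is irreversible because the original character is discarded, which is why every component that touches the address needs to agree on the encoding.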

Each of these options gives you the flexibility to have a consistent addressing format for your international addresses, depending on your location, your customers, and your mailing conventions. All of them provide an automated, consistent approach to address validation. Whether it is addressing mail to customers in the format of their home countries, translating addresses, or ensuring readability for the sender, DOTS Address Validation International truly speaks your language.

Mail Servers: Where in the world…?

We love data here at Service Objects. We are constantly working to expand and improve on our datasets to further innovate our product lineup. A big part of what makes our Email Validation (EV) service so good is the data that helps drive it. When communicating with a mail server in real-time to verify an email address it helps to know what kind of mail server it is dealing with and if it is trustworthy. Just because an email address is deliverable does not always mean that it is good.  For example, an email may be disposable, vulgar or worse yet, a spamtrap.

Our Email Validation service already keeps track of mail server behavior patterns for millions of domains, which allows us to identify and flag mail servers with malicious activity or servers that have a high association with malicious activity.  In addition to monitoring behavior patterns, we are now focusing on determining the geographic location of the email servers.

What benefits does identifying mail server location offer?

Email can be sent and received from anywhere in the world. Email addresses are not anchored to one physical location, and at a glance, one cannot easily discern an address’s geographic origin. Even email addresses with a country code for a Top Level Domain (TLD) can have a global presence and may have servers located in multiple countries. Fortunately, mail server location data can be derived and aggregated from some of our other datasets. This allows our Email Validation service to better identify potentially malicious mail servers and flag servers from known geographic hot spots.

In addition to helping identify problematic email servers, mail server location data can provide additional insights and benefits. From a marketing and administration perspective, the mail server location data can be used to help identify and organize email addresses for a particular region. The location information can also be used to gain business insights about a company and its location(s). At Service Objects, we are using the additional information to further enhance some of our other services, such as Lead Validation.

Challenges to identifying mail server location information

There are a number of challenges to accurately identifying mail server location information. First, we are identifying the mail server locations of a domain, not attempting to identify where an email message was sent from. This would require more than just a simple email address. However, the location data can be used to help cross-check and verify the legitimacy of an email message. For example, an email message is received, and the headers say that the message was sent from Gmail.com. However, the server IP address in the header does not match any of the known Gmail mail server locations, so chances are the message was spoofed and that it is spam or part of a phishing scam.
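The spoofing cross-check described above amounts to testing whether the server IP address in a message header falls within the networks already recorded for the claimed domain. Here is a minimal Python sketch of that idea. The `KNOWN_MAIL_NETWORKS` table is hypothetical and uses reserved documentation address ranges, not real mail server data, and `sender_ip_is_known` is an illustrative helper rather than part of any Service Objects API.

```python
import ipaddress

# Hypothetical data: networks previously recorded for a domain's mail servers.
# These are reserved documentation ranges, not real mail server locations.
KNOWN_MAIL_NETWORKS = {
    "example.com": ["192.0.2.0/24", "198.51.100.0/24"],
}


def sender_ip_is_known(domain, header_ip):
    """Return True when header_ip falls inside a known network for domain."""
    ip = ipaddress.ip_address(header_ip)
    networks = KNOWN_MAIL_NETWORKS.get(domain.lower(), [])
    return any(ip in ipaddress.ip_network(net) for net in networks)


print(sender_ip_is_known("example.com", "192.0.2.25"))
print(sender_ip_is_known("example.com", "203.0.113.7"))
```

A mismatch is a signal worth investigating rather than proof of spoofing, since the recorded list of servers for a domain may be incomplete.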

Second, trying to identify all of the mail servers for a particular domain is not something that can be done quickly enough for a real-time service where end-users expect sub-second response times. Real-time communication with a mail server can often take several seconds, but trying to identify all the mail servers for a domain from around the world can sometimes take several minutes. For this reason, our DOTS Email Validation service does not include mail server location identification in its suite of real-time checks. Instead, the service relies on background systems that have already collected and identified mail server locations from around the world. This ensures that the service is not bogged down by slow processes and continues to respond normally. While mail server location identification may be too slow for a real-time check, it is a daily process that we perform to ensure our list of locations is up to date. The process is also quick enough that our background processes can routinely check for any new domains that we have not come across before and process them hourly.

Third, if a business has multiple locations, then a typical DNS lookup for a domain will just tell you which mail server(s) to connect to that are closest to your area, and not necessarily tell you about their other mail servers. DNS does this to help ensure that communication is quick and efficient, that way an end-user isn’t trying to communicate with a server on the other side of the country or potentially in a different nation entirely if it doesn’t have to. Part of what makes the location identification process “slow” is that we are looking for mail servers in every major region of the world, and not just in our own local areas.

What’s going on behind the scenes

While our email validation service currently only displays the location(s) of the mail server(s) in the notes of the output when they have been identified, it is doing a lot more with that data behind the scenes. Knowing the IP addresses and locations of the mail servers means that we can perform cross-checks against more data points in other areas. Service Objects is extremely interested in fraud prevention, so we use this data to check for associations with known proxies, VPNs, bot services, and other data points that have ties to malicious activity. The data allows us to check various data-driven blacklists and white hat resources against more than a simple email address and domain. Instead, we can pull back the curtain, so to speak, and dig deeper into the mail server(s) that run behind the scenes, all while continuing to expand our monitoring of server behavior.

With the addition of this new data, we have added new NoteCodes to the output from our DOTS Email Validation 3 service. Below is a list of the note codes that have been added:

Code Description Example
11 Countries: The ISO2 country code for the country where the mail server(s) is located. If mail servers are found in more than one country, then all country ISO2 codes will be represented in a pipe-delimited list. JP
12 Region: The region in the country where the mail server(s) is located. The region is commonly returned as a two-character abbreviation. If mail servers are found in more than one region then the value will be a pipe-delimited list of the regions. OS|TY
13 Localities: The name of the locality where the mail server(s) is located. If mail servers are found in more than one locality then the value will be a pipe-delimited list of all the localities. Osaka|Tokyo
14 PostCodes: The post code of where the mail server(s) is located. If multiple post codes are found, then the value will be a pipe-delimited list. 543-0062|102-0082

 

For more information about terms for international addresses and locations please check out this previous blog post.

For other NotesCodes, the corresponding NotesDescriptions value is a human-readable flag that describes the note code. For the four location codes above, the NotesDescriptions value will instead contain the list of locations found.
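As a rough illustration of how the location notes might be consumed, here is a minimal Python sketch. It assumes the note codes and their descriptions have already been pulled out of the response as two parallel lists of strings; the function name and the dictionary keys are hypothetical and not part of the service.

```python
# Note codes 11-14 carry location values rather than a descriptive flag.
LOCATION_NOTE_CODES = {
    "11": "countries",
    "12": "regions",
    "13": "localities",
    "14": "post_codes",
}


def extract_mail_server_locations(notes_codes, notes_descriptions):
    """Map the location note codes to lists of values.

    Assumes both arguments are parallel lists of strings that were
    already split out of the service response (an assumption of this
    sketch, not a documented response shape).
    """
    locations = {}
    for code, description in zip(notes_codes, notes_descriptions):
        field = LOCATION_NOTE_CODES.get(code.strip())
        if field is None:
            continue  # not a location note; handle it elsewhere
        # Multiple locations arrive as a pipe-delimited list.
        values = [v.strip() for v in description.split("|") if v.strip()]
        locations[field] = values
    return locations


example = extract_mail_server_locations(
    ["11", "12", "13", "14"],
    ["JP", "OS|TY", "Osaka|Tokyo", "543-0062|102-0082"],
)
```

With the example values from the table, `example["localities"]` holds the two locality names as separate list items.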

Get started testing DOTS Email Validation by downloading a real-time API trial key or sending us a sample list to run for you.

How to Use DOTS Email Validation 3

The DOTS Email Validation 3 (EV3) service has been designed to be robust enough to accommodate the particular needs of a detail-oriented programmer and simple enough to be used by a marketing assistant who needs to run an email campaign. The service can meet various needs that can essentially be narrowed down to two use cases: form validation, and post-processing jobs such as batches and database hygiene. Before we discuss those two cases we will first go over the recommended service operation and review some of the important result fields.

Which Operation Should I Use?

The recommended service operation for EV3 is the ValidateEmailAddress method. This operation performs real-time server-to-server email verification. It lets the user specify a timeout value, in milliseconds, for how long it can take to perform real-time server checks. A minimum value of 200 milliseconds is required; however, results are dependent on the network speed of an email’s host, which may require several seconds to verify. Average mail server response times are approximately 2 to 3 seconds, but some slower mail servers may take 15 seconds or more to verify.

Please note that the above information is also available in the service developer guide.

Understanding the Results

The service returns many results that can be used to meet a programmer’s particular email validation needs, but the easiest way to determine if an email should be accepted or rejected is by looking at either the IsDeliverable value or the Score value.

Score:

For most cases it is recommended to use the Score along with other output values to cater to your particular needs. Here are the possible score values.

Score Description Notes
0 Email is Good Indicates with high confidence that the email address is deliverable and good. The email address was verified with the host mail server and no malicious warnings were found.
1 Email is Probably Good Indicates that the email is deliverable but one or more lesser warnings were found. For example the email may be a potential alias or a role, which are sometimes used as disposable addresses.
2 Unknown Indicates that not enough information was available to determine deliverability and integrity. Unknowns most commonly occur for slow mail servers that do not respond to the web service in time. They also occur for catch-all mail servers and greylists.
3 Email is Probably Bad Indicates that one or more warnings were found, such as a potential vulgarity or a string of garbage-like characters.
4 Email is Bad Indicates with high confidence that the email address is bad and/or undeliverable. Occurs for email addresses that fail critical checks such as syntax validation and DNS verification. Most commonly occurs for email addresses where the actual host mail server verified that the email does not exist. Also occurs for deliverable email addresses that are known spam traps or bots.

IsDeliverable:

The simplest way to use the service is to look at the IsDeliverable field. This field will return true, false or unknown. If your primary concern is to be able to send out email with the lowest possible chance of a hard bounceback then this field alone will suffice. However, this field does not take spamtraps, vulgarities, bots or other factors into consideration. It simply indicates if the service was able to verify the deliverability of an email address with the host mail server. It does not measure the overall integrity of the email address.

If you choose to only look at one result value then it is our recommendation that you use the Score value instead of the IsDeliverable value. The Score evaluates the overall integrity of the email address and not just its deliverability. Either one of these fields can be used in conjunction with other result values to more intelligently evaluate an email address if the need arises. For example, if an email comes back as unknown in either the Score or in IsDeliverable, then we can refer to the following outputs to help us decide if we should accept, reject or retry the email address.

IsSMTPServerGood:

Returns true, false or unknown to indicate if the email’s host mail server was responsive at the time of the check. This is one of the service’s critical checks. If this value comes back false then it will be reflected in the IsDeliverable value and in the score. Refer to this value if the email is unknown. If the value for this field is also unknown then the service most likely did not have enough time to finish verifying the email address with its host mail server. In these cases the service will continue to try to verify the email in a background process even though the request has finished. Chances are high that if you wait one or more hours and check the email again, the service will have been able to finish verifying the email address with the host mail server.

IsCatchAllDomain:

Returns true, false or unknown to indicate if the email’s host mail server is a catch-all. A catch-all mail server will say that an email address is deliverable even if it is not. This is because catch-all mail servers do not reject email addresses during the initial SMTP session. This means that a catch-all mail server cannot be trusted to verify the deliverability of an email address, because it may not reject the email address until after an email message is sent. If an email address is unknown and this value is false then chances are good that if the email is checked again at a later time the service will have verified its deliverability. If IsCatchAllDomain is true and there are no warnings, then we know that the mail server is good and that the email does not appear to be bad. In general this scenario leads to a 55% chance that the email is deliverable and won’t result in a hard bounce.

IsSMTPMailBoxGood:

Returns true, false or unknown to indicate if the service was able to verify the email address with its host mail server. This value can be treated similarly to the IsDeliverable value. A true value indicates that the email address is deliverable. If the value comes back false then the mail server verified that the email is undeliverable. A false will be accompanied by the warning flag ‘Email is Bad – Subsequent checks halted’. Some common reasons why this value will return unknown: the mail server is a catch-all, the service ran out of time when communicating with the host mail server, or the host mail server used a defensive tactic such as a greylist.
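The guidance above for handling an unknown result can be sketched as a small decision helper. This is only an illustration in Python: the function name and the returned action labels are hypothetical, and the inputs are assumed to be the true/false/unknown strings described for these fields.

```python
def next_step_for_unknown(is_smtp_server_good, is_catch_all_domain):
    """Suggest what to do with an email whose Score or IsDeliverable is unknown.

    Both arguments are assumed to be the strings "true", "false" or
    "unknown". Returns a (action, reason) tuple; the action labels are
    made up for this sketch.
    """
    server = is_smtp_server_good.strip().lower()
    catch_all = is_catch_all_domain.strip().lower()

    if catch_all == "true":
        # A catch-all will not confirm the mailbox during the SMTP
        # session, so checking again is unlikely to change the result.
        return ("review", "catch-all server cannot confirm the mailbox")
    if server == "unknown":
        # The service probably ran out of time and keeps verifying in
        # the background, so a later check has a good chance of success.
        return ("retry_later", "server check did not finish in time")
    return ("retry_later", "not a catch-all; a later check may verify it")


action, reason = next_step_for_unknown("unknown", "false")
```

Keeping the reason alongside the action makes it easier to log why an address was set aside rather than accepted.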

A complete list of the output fields and values is available in the service developer guide.

The result fields given above are useful when it comes to sorting, grouping and filtering all of your validated email addresses. This is useful when working on a post-processing email job, which we will discuss later. Next, we will look at some of the descriptive flags that the service will return. These flags can be used programmatically or at a glance to determine the status of an email address.

Warning Codes & Descriptions:

There are many warning flags that the service may return but we will look at some of the more common and critical ones.

DisposableEmail, SpamTrap, KnownSpammer and Bot

An email address may be deliverable but if one or more of these warning flags is returned then it is highly recommended to reject it.

Alias, Bogus and Vulgar

If one of these warning flags is returned then you may want to either reject the email or set it aside for later review, depending on how strict you want to be.

InvalidSyntax, InvalidDomainSpecificSyntax and InvalidDNS

These are warnings for critical checks that failed. If one of these flags appears then it will be immediately followed by the warning flag ‘Email is Bad – Subsequent checks halted’.

Email is Bad – Subsequent checks halted

This warning indicates that the email failed a critical check and is undeliverable. If the flag is not preceded by one of the critical warning flags then it simply means that the email’s host mail server verified that the email address is undeliverable.
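The warning groups described above lend themselves to a simple programmatic check. The following Python sketch assumes the warnings arrive as a list of flag strings spelled exactly as in the headings above; the function name and the action labels are hypothetical.

```python
# Flag groups taken from the headings above.
REJECT_FLAGS = {"DisposableEmail", "SpamTrap", "KnownSpammer", "Bot"}
REVIEW_FLAGS = {"Alias", "Bogus", "Vulgar"}
CRITICAL_FLAGS = {"InvalidSyntax", "InvalidDomainSpecificSyntax", "InvalidDNS"}


def triage_warnings(warnings):
    """Return "reject", "review" or "no_action" for a list of warning flags.

    Assumes each warning is a plain string; this is a sketch, not the
    service's own logic.
    """
    flags = {w.strip() for w in warnings}
    # Match the halted flag by its wording so the dash style does not matter.
    halted = any("Subsequent checks halted" in f for f in flags)

    if halted or flags & CRITICAL_FLAGS:
        return "reject"  # a critical check failed or the mailbox is bad
    if flags & REJECT_FLAGS:
        return "reject"  # deliverable or not, these are high risk
    if flags & REVIEW_FLAGS:
        return "review"  # reject or set aside, depending on strictness
    return "no_action"


decision = triage_warnings(["Alias"])
```

A stricter setup could move the review group into the reject branch.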

A complete list of warning codes and their descriptors is available in the dev guide.

Note Codes & Descriptions:

The note flags will return descriptive information about the email, not all of which will affect the score, but we will focus on the ones that will explain why some email addresses came back as unknown.

GreyListed

The service is good at detecting greylist behavior from mail servers and has procedures in place to avoid them, but not all greylists are avoidable. If the service encounters a greylist then it is temporarily unable to verify the email address with its host mail server. If you encounter a greylist then chances are good that if you try to validate the email again a couple of hours later that you will get a better response.

MailServerTemporarilyUnavailable

This flag indicates that the service was able to connect to the email’s host mail server, but that the server was temporarily busy or unavailable and was unable to verify the email for us. If you encounter this flag then try to validate the email again a few hours later to see if the server has become more responsive.

ServerConnectTimeout

This flag indicates that the service was unable to establish a connection with a host mail server. Possible reasons for the connection failure are that the mail server is completely offline or that it is responding too slowly to answer in time. Some mail servers are configured to respond slowly, taking as long as 60 seconds to respond to a connection. This behavior is rare but it does occur. If an email returns this flag then try entering a longer timeout to allow the service the time it needs to verify the email.

MailBoxTimeout

This flag indicates that the service was unable to finish verifying the email address with the host mail server in the time allowed. The mail server could be responding very slowly or the timeout given to the service was too short. If an email returns this flag then try entering a longer timeout to allow the service the time it needs to verify the email.
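The four note flags above suggest two different retry strategies: wait and try again, or try again with more time. Here is a minimal Python sketch of that idea. The doubling of the timeout, the two-hour delay and the 65-second cap are assumptions chosen for illustration, and the function name and returned keys are hypothetical.

```python
RETRY_LATER_FLAGS = {"GreyListed", "MailServerTemporarilyUnavailable"}
LONGER_TIMEOUT_FLAGS = {"ServerConnectTimeout", "MailBoxTimeout"}


def plan_retry(notes, current_timeout_ms, max_timeout_ms=65000):
    """Suggest a retry plan from a list of note flag strings.

    Assumes the notes are plain strings spelled as in the headings
    above. The backoff values are illustrative, not prescribed.
    """
    flags = {n.strip() for n in notes}

    if flags & LONGER_TIMEOUT_FLAGS:
        # The server was too slow for the time allowed: give it more.
        longer = min(current_timeout_ms * 2, max_timeout_ms)
        return {"action": "retry_with_longer_timeout", "timeout_ms": longer}
    if flags & RETRY_LATER_FLAGS:
        # Greylists and busy servers usually clear up after a while.
        return {
            "action": "retry_later",
            "delay_hours": 2,
            "timeout_ms": current_timeout_ms,
        }
    return {"action": "none", "timeout_ms": current_timeout_ms}


plan = plan_retry(["MailBoxTimeout"], 3000)
```

In a form-validation setting the longer-timeout retry would normally be deferred to a background job so the end user is not kept waiting.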

A complete list of note codes and their descriptors is available in the developer guide.

Use Case 1 – Using ValidateEmailAddress for Form Validation

The ValidateEmailAddress method has four input fields that are all required.

Input Field Name Description Notes
EmailAddress The email address you wish to validate.
AllowCorrections Accepts true or false. The service will attempt to correct an email address if set to true. Otherwise the email address will be left unaltered if set to false. The majority of the email corrections are performed on the domain. The local part of the email address, the portion before the @ symbol, is generally left untouched.
Timeout Accepts an integer as a string. Timeout time is in milliseconds. Do not include any commas or non-numeric values. This value specifies how long the service is allowed to wait for all real-time network level checks to finish. Real-time checks consist primarily of DNS and SMTP level verification. A minimum value of 200ms is required. When it comes to form validation it is recommended to use a timeout that is short enough to not keep your user impatiently waiting, but long enough to allow the server-to-server communication time to finish. A relatively short timeout of 2 to 4 seconds is generally recommended.
LicenseKey Your license key to use the service.
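To make the input rules concrete, here is a small Python sketch that assembles the four inputs and enforces the 200 ms minimum. It only builds the values; how they are sent to the service is left out on purpose. The function name is hypothetical, the license key is a placeholder, and the spelling AllowCorrections for the corrections field is an assumption that should be confirmed against the developer guide.

```python
MIN_TIMEOUT_MS = 200  # minimum value stated for the Timeout input


def build_validate_email_inputs(email, allow_corrections, timeout_ms, license_key):
    """Return the four ValidateEmailAddress inputs as strings.

    This is a sketch: it prepares values only and makes no request.
    """
    timeout_ms = int(timeout_ms)
    if timeout_ms < MIN_TIMEOUT_MS:
        raise ValueError(f"Timeout must be at least {MIN_TIMEOUT_MS} ms")
    return {
        "EmailAddress": email.strip(),
        # Field name assumed; verify the spelling in the developer guide.
        "AllowCorrections": "true" if allow_corrections else "false",
        # An integer passed as a string: digits only, no commas.
        "Timeout": str(timeout_ms),
        "LicenseKey": license_key,
    }


# A 3-second timeout sits inside the 2 to 4 second range suggested for forms.
form_inputs = build_validate_email_inputs(
    " user@example.com ", False, 3000, "YOUR-LICENSE-KEY"
)
```

Rejecting a too-small timeout before the call avoids spending a request on input the service would not accept.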

Accept, Reject or Review & Retry

ACCEPT

Emails with a score of 0, 1 or 2. In general it is recommended to not be too strict when accepting emails in a form because you do not want to potentially lose an end user.  Also, when performing form validation an end user may become agitated if they have to wait more than 5 seconds for the validation process to complete, but some slow mail servers may not be able to respond in that short amount of time.

REJECT

Emails with a score of 3 or 4. If you do not want to be too strict then you can accept 3 for review, but you should always reject an email that receives a score of 4.

REVIEW & RETRY

Depending on how strict/cautious you want to be you can choose to not initially accept emails with a score of 2 and instead put them aside to have them reviewed. If the IsCatchAllDomain field is not true then you can try and validate the email again later. Email addresses that return a score of 3 can also be set aside for review if you do not want to initially reject all of them. An email will commonly be given a score of 3 if a potential vulgarity or string of garbage characters is found.
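The accept, reject and review rules for form validation can be sketched as a single function. This Python example assumes the score is available as an integer from 0 to 4 and IsCatchAllDomain as a true/false/unknown string; the function name, the `strict` switch and the action labels are hypothetical.

```python
def triage_form_submission(score, is_catch_all_domain="unknown", strict=False):
    """Decide what to do with an email entered in a form.

    strict=False follows the lenient guidance (accept 0-2, review 3).
    strict=True sets score 2 aside and rejects score 3.
    """
    if score in (0, 1):
        return "accept"
    if score == 2:
        if not strict:
            return "accept"  # avoid losing an end user over an unknown
        if is_catch_all_domain.strip().lower() == "true":
            return "review"  # retrying a catch-all is unlikely to help
        return "review_and_retry"
    if score == 3:
        return "reject" if strict else "review"
    if score == 4:
        return "reject"  # always reject a score of 4
    raise ValueError(f"unexpected score: {score!r}")


lenient = triage_form_submission(2)
cautious = triage_form_submission(2, "false", strict=True)
```

Here `lenient` is "accept" and `cautious` is "review_and_retry", which mirrors the two attitudes toward unknown results described above.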

In form validation the programmer is sometimes allowed some luxuries while others are taken away. For example, a programmer can be given the opportunity to communicate a result back to the end user but is usually restricted to a shorter timeout so that the end user is not kept waiting too long. If you have the ability to communicate back to the end user then ask the user to check for a typo and try again, or to try a different email address. If you don’t want to accept a role or alias type email address because they are commonly not accepted by mass email marketers, then you can check for that and tell the user to try again with a different email address.

Use Case 2 – Using ValidateEmailAddress for Batches, Email Campaigns and Data Hygiene

The ValidateEmailAddress method has four input fields that are all required.

Input Field Name Description Notes
EmailAddress The email address you wish to validate.
AllowCorrections Accepts true or false. The service will attempt to correct an email address if set to true. Otherwise the email address will be left unaltered if set to false. The majority of the email corrections are performed on the domain. The local part of the email address, the portion before the @ symbol, is generally left untouched. Since you are unable to ask a user to re-enter and try again if they make a mistake, you can set this value to true and allow the service to make corrections.
Timeout Accepts an integer as a string. Timeout time is in milliseconds. Do not include any commas or non-numeric values. This value specifies how long the service is allowed to wait for all real-time network level checks to finish. Real-time checks consist primarily of DNS and SMTP level verification. A minimum value of 200ms is required. For non-form validation it is recommended to give the service plenty of time to verify an email address with its host mail server. Most mail servers will only take about 2 seconds on average to verify an email address, but for the occasional slow mail server that requires more time it is recommended to set the timeout time to 65 seconds. The number of mail servers that require this much time is generally minimal, so the long timeout should not make a big impact on the overall batch job.
LicenseKey Your license key to use the service.

Accept, Reject or Review & Retry

ACCEPT

Emails with a score of 0 or 1.

REJECT

Emails with a score of 3 or 4. If you do not want to be too strict then you can accept 3 for review, but you should always reject an email that receives a score of 4.

REVIEW & RETRY

Emails with a score of 2, unless the IsCatchAllDomain field value is true. An email that gets an unknown score due to a greylist, timeout or temporarily busy server should be checked again a couple of hours later.
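For a batch job, the same rules can be applied to a whole list at once. The Python sketch below sorts already-validated records into buckets. The record keys (`email`, `score`, `is_catch_all`) are hypothetical names for values taken from the service output, and sending score 2 catch-all addresses to a review bucket is an assumption, since the guidance above only says they should not be retried.

```python
# The 65-second batch timeout recommended in the input table, as a string.
BATCH_TIMEOUT_MS = "65000"


def sort_batch_results(records):
    """Group validated records into accept, reject, retry and review lists.

    Each record is assumed to be a dict with an integer "score" and an
    optional "is_catch_all" string ("true", "false" or "unknown").
    """
    buckets = {"accept": [], "reject": [], "retry": [], "review": []}
    for record in records:
        score = record["score"]
        email = record["email"]
        if score in (0, 1):
            buckets["accept"].append(email)
        elif score == 2:
            if record.get("is_catch_all", "unknown") == "true":
                buckets["review"].append(email)  # a retry will not help
            else:
                buckets["retry"].append(email)  # check again hours later
        elif score in (3, 4):
            buckets["reject"].append(email)
        else:
            raise ValueError(f"unexpected score for {email}: {score!r}")
    return buckets


sorted_batch = sort_batch_results([
    {"email": "a@example.com", "score": 0},
    {"email": "b@example.com", "score": 2, "is_catch_all": "false"},
    {"email": "c@example.com", "score": 2, "is_catch_all": "true"},
    {"email": "d@example.com", "score": 4},
])
```

The retry bucket can then be fed back through the service a couple of hours later with the long batch timeout.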

If you would like to discuss your particular use case for recommendations and best practices contact us!

Making an (email) list and checking it twice: Best practices for email validation

For most organizations, one of the most critical assets of their marketing operations is their email contact database. Email is still the lingua franca of business: according to the Radicati Group, over a quarter of a trillion email messages are sent every business day, and the number of email users is expected to top 4 billion by 2021 – roughly half of the world’s population. This article will explore current best practices for protecting the ROI and integrity of this asset, by validating its data quality.

The title of this article is not just a cute play on words – and it has nothing to do with Santa. Rather, it describes an important principle for your game plan for email data quality. By implementing a strong two-step email validation process, as we describe here, you will dramatically reduce deliverability problems, fraud and blacklisting from your email marketing and communications efforts.

The main reason we recommend checking emails in two stages revolves around the time these checks take: many checks can be performed live using a real-time API, particularly as email addresses are entered by users, but server validation in particular may require a longer processing time and interfere with user experience. Here are 3 of the most important checks that are part of the email validation process:

• Syntax (FAST): This check determines if an email address has the correct syntax and physical properties of an email address.

• DNS (FAST): We can quickly check the DNS record to ensure the validity of the email domain (MX record) for the email address. (There are some exceptions to this – for example, where the DNS record is with a shoddy or poor registry and the results take longer to come back.)

• Email Server (VARIABLE, and not within the email validation tool’s control): Although this check can take from milliseconds to minutes, it is one of the most important checks you can make – it ensures that you have a deliverable address. This response time is dependent on the email server provider (ESP) and can vary widely: large ESPs like Gmail or MSN normally respond quickly, while corporate or other domains may take longer.

There are many more checks in Service Objects’ Email Validation tool, including areas such as malicious activity, data integrity, and much more – over 50 verification tests in all! We auto-correct addresses for common spelling and syntax errors, flag bogus or vulgar address entries, and calculate an overall quality score you can use to accept or reject the email address. (For a deeper dive, take a look at this article to see many of the features of an advanced EV tool.)

Here are the two stages we recommend for your email validation process:

Stage 1: At point of entry. Here, you validate emails in real-time, as they are captured. This provides the opportunity for the user to correct mistakes in the moment such as typos or data entry errors. Here you can use our EV software to check for issues like syntax, DNS and the email server – however we recommend setting the API configuration settings to no more than a wait of a couple of seconds, for the sake of customer experience. At this stage either the user or validation software has a chance to update bad addresses.

Stage 2: Before sending a campaign. Validate the emails in your database – using the API – after the email has been captured and the user is no longer available in real-time to make corrections. In this stage, you have more flexibility to wait for responses from the ESPs, providing more confidence in your list.

It is estimated that 10-15% of emails entered are not usable, for reasons ranging from data entry errors to fraud, and 30% of email addresses change each year. Together these two steps ensure that you are using clean and up-to-date email data every time – and the benefit to you will be fewer rejected addresses, a better sender reputation, and a greater overall ROI from your email contact data.

Maintaining a Good Email Sender Reputation

What are Honeypot Email Addresses?

A honeypot is a type of spamtrap. It is an email address that is created with the intention of identifying potential spammers. The email address is often hidden from human eyes and is generally only detectable to web crawlers. The address is never used to send out email and it is for the most part hidden, thus it should never receive any legitimate email. This means that any email it receives is unsolicited and is considered to be spam. Consequently, any user who continues to submit email to a honeypot will likely have their email, IP address and domain flagged as spam. It is highly recommended to never send email to a honeypot, otherwise you risk ruining your email sender reputation and you may end up on a blacklist.

Spamtraps typically show up in lists where the email addresses were gathered from web crawlers. In general, these types of lists cannot be trusted and should be avoided as they are often of low quality.

Service Objects participates in and uses several “White Hat” communities and services, some of which are focused on identifying spamtraps. We use these resources to help identify known and active spamtraps. It is common practice for a spamtrap to be hidden from human eyes and only be visible in the page source where a bot would be able to scrape it, but it is important to note that not all emails from a page scrape are honeypot spamtraps. A false positive could unfortunately lead to an unwarranted email rejection. Many legitimate emails are unfortunately exposed on business sites, job profiles, Twitter, business listings and other random pages, so it is not uncommon to see a legitimate email get marked as a potential spamtrap by a competitor.

 

Not all Spamtraps are Honeypots

While the honeypot may be the most commonly known type of spamtrap, it is not the only type around. Some of you may not be old enough to remember, but there was a time when businesses would configure their mail servers to accept any email address, even if the mailbox did not exist, for fear that a message would be lost due to a typo or misspelling. Messages to non-existent email addresses would be delivered to a catch-all box as long as the domain was correctly spelled. However, it did not take long for these mailboxes to become flooded with spam. As a result, some mail server administrators started to use catch-alls as a way to identify potential spammers. A mail server admin could treat the sender of any mail that ended up in this folder as a spammer and block them, the reasoning being that only spammers and no legitimate senders would end up in the catch-all box. This made catch-alls one of the first spamtraps. The reasoning is flawed but still in practice today. Nowadays it is more common for admins to use firewalls that act as catch-alls to try to catch and block spammers.

Some spamtraps are created and hidden in the source code of a website so that only a crawler would pick them up. Others are created from recycled email addresses, or created specifically with the intention of planting them in mailing lists. Regardless of how a spamtrap is created, it is clear that if you have one in your mailing list and you continue to send mail to it, you risk ruining your sender reputation.

Keeping Senders Honest

The reality is that not all honeypot spamtraps can be 100% identified. Doing so would highly diminish their value in keeping legitimate email senders honest.

It is very important that a sender or marketer follows their regional laws and best practices, such as tracking which emails are received, opened or bounced back. For example, some legitimate emails can still result in a hard or permanent bounce back. This may happen when an email is an alias or role that is connected to a group of users. In these cases, the email itself is not rejected but one of the emails within the group is. This brings up another point: role-based email addresses are often not eligible for solicitation, since they are commonly tied to positions and not to any one particular person who would have opted in. That is why the DOTS Email Validation service also has a flag for identifying potential role-based addresses.

Overall, it is up to the sender or marketer to ensure that they keep track of their mailing lists and that they always follow best practices. They should never purchase unqualified lists and they should only be soliciting to users who have opted-in. If an email address is bouncing back with a permanent rejection then they should remove it from the mailing list. If the email address that is being bounced back is not in your mailing list then it is likely connected to a role or group based email that should also be removed.

To stay on top of potential spamtraps marketers should also be keeping track of subscriber engagement. If a subscriber has never been engaged or is no longer engaged but email messages are not bouncing back, then it is possible that the email may be a spamtrap. If an email address was bouncing back before and not anymore, then it may have been recycled as a spamtrap.
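The two engagement signals described above can be expressed as a simple heuristic. The Python sketch below is only an illustration: the `SubscriberHistory` fields, the function name and the `min_sends` threshold are all hypothetical, and a match is a reason to review an address, not proof that it is a spamtrap.

```python
from dataclasses import dataclass


@dataclass
class SubscriberHistory:
    """Hypothetical per-subscriber tracking data kept by the sender."""
    email: str
    sends: int
    opens_or_clicks: int
    bounced_before: bool
    bouncing_now: bool


def spamtrap_suspicions(history, min_sends=5):
    """Return a list of reasons an address deserves a closer look."""
    reasons = []
    if history.bounced_before and not history.bouncing_now:
        # An address that used to bounce and now accepts mail may have
        # been recycled as a spamtrap.
        reasons.append("bounced before, accepting mail now")
    if (
        history.sends >= min_sends
        and history.opens_or_clicks == 0
        and not history.bouncing_now
    ):
        # Mail is accepted but nobody ever engages with it.
        reasons.append("no engagement and no bounces")
    return reasons


quiet = SubscriberHistory("quiet@example.com", 12, 0, False, False)
flags_for_quiet = spamtrap_suspicions(quiet)
```

The threshold keeps brand-new subscribers, who have not yet had a chance to engage, from being flagged.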

Remember that by following the laws and best practices of your region you greatly reduce the risk of ruining your sender reputation, which will help ensure that your marketing campaigns reach as many subscribers as possible.

Thinking Alternatively About Place Names

Here at Service Objects we come across a lot of names, particularly the names of places. We also work with a lot of personal names, but for now I would like to focus on just place names. Whether the name is for a city, town, village, hamlet, district, region, state, prefecture, mining area, national park, theme park or what have you, chances are that the place has one or more alternate spellings and alternate names associated with it.

For a human fluent in English, “North Carolina” and “N. Carolina” will be considered equal, but for a computer they are not. With the use of fuzzy-matching and/or standardization we can work around seemingly trivial issues like this. Now let us suppose that you are working with a set of Japanese data and come across the same name but written in Katakana “ノースカロライナ” or Ukrainian data written in Cyrillic “Північна Кароліна” or even Thai “รัฐนอร์ทแคโรไลนา”. Well, fuzzy-matching and standardization are still our friends; we just have more fuzzy-matching and standardization rules to consider. However, we first need to ensure that we even have the data available to associate a name in a different language.

We’ve been creating a list of place names to help us tackle problems like the ones mentioned above. We currently have a list of over five million unique place names generated from a pool of approximately 11 million names. We are aggregating name data to come up with a more comprehensive list that consists of known alternates, variations in spellings, different languages and the transliterated versions for the different languages.

Here’s a quick look at what we have accomplished, so far:

  • Current list of approximately eight million place names and growing
  • Transliteration and phonetic mappings for various languages
  • Case, accent and kana sensitivity handling
  • Queryable using fuzzy-matching algorithms
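To give a feel for what case and accent handling plus fuzzy matching can look like, here is a minimal Python sketch using only the standard library. It is not how our place name library is implemented. The function names are hypothetical, and the accent stripping shown is only suitable for Latin-script names: applied to kana, Cyrillic or Thai it would remove marks that change the meaning, so those scripts need their own rules.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_place_name(name):
    """Fold case, accents and punctuation for Latin-script names.

    Warning: dropping combining marks is wrong for scripts such as
    kana, Cyrillic and Thai; this sketch does not handle them.
    """
    # NFKC folds compatibility forms; casefold handles case.
    text = unicodedata.normalize("NFKC", name).casefold()
    # Decompose, then drop combining marks so that e-acute becomes e.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace punctuation with spaces, then collapse runs of whitespace.
    cleaned = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch
        for ch in stripped
    )
    return " ".join(cleaned.split())


def place_similarity(a, b):
    """Return a 0..1 similarity ratio between two normalized names."""
    return SequenceMatcher(
        None, normalize_place_name(a), normalize_place_name(b)
    ).ratio()


score = place_similarity("North Carolina", "N. Carolina")
```

A ratio like this only ranks candidates. Deciding that two names refer to the same place still needs a threshold tuned on real data, and cross-language matches such as the Katakana example depend on having the alternate names on file in the first place.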

We have taken some of what we have learned from our DOTS Address Validation – International service and built upon it in order to improve data beyond the realm of just address validation. When working with Phone, Email, IP, Demographic and Geo-coordinate related data we too often find that location names do not match up. Naturally this is to be expected, since different data vendors will have different standardizations and practices when it comes to naming conventions. Utilizing a comprehensive place name library will allow us to quickly perform various actions, such as cross checking multiple data sources against each other with increased flexibility and match rates.

It may not be immediately apparent how useful a place name library like this is and what kind of avenues it can open up, but expect to see new and exciting developments from us in the coming months!

Can Google Maps be Used to Validate Addresses?

In November of 2016, Google started rolling out updates to more clearly distinguish their Geocoding and Places APIs, both of which are a part of the Google Maps API suite. The Places API was introduced in March 2015 as a way for users to search for places in general and not just addresses. Until recently the Geocoding API functioned similarly to Places in that it also accepted incomplete and ambiguous queries to explore locations, but now it is focusing more on returning better geocoding matches for complete and unambiguous postal addresses. Do these changes mean that Google Maps and its Geocoding API can finally be used as an address validation service?

No, it cannot. Now before I explain why, let’s first acknowledge why someone would think Google Maps can be used to validate addresses in the first place. The idea starts with the simple argument that if an address can be found in Google Maps then it must exist. If it exists then it must be valid and therefore deliverable. However, this logic is flawed.

Addressing a Common Problem

One of the biggest problems many users overlook with Google Maps and the Geocoding API is that incomplete and/or ambiguous address queries lead to inaccurate and/or ambiguous results. It is common for users to believe that the address entered was correct and valid simply because Google returns a possible match. These users often ignore that the formatted address in the output may have changed significantly from what they had originally entered. The people over at Google Maps must have realized this too, as the Geocoding API is now more prone to return ‘ZERO_RESULTS’ instead of a potentially inaccurate result. However, not all users are pleased with the recent changes. Some have noted that addresses that once returned matches in the Geocoding API no longer do so.

Has the Geocoding API become stricter? Yes. Does Google Maps finally make use of address data from the actual postal authorities? Not likely.

Geocoding vs Deliverability

Google Maps does not verify if an address is deliverable. The primary purpose of the Geocoding API is to return coordinate information. At its best it can locate an individual residential home or a commercial building. Other times it is an address estimator. However, not all addresses are for single building locations.

Apartment and unit numbers, suites, floors and PO boxes are typical examples of the types of address that the Google Maps Geocoding API was not intended to handle. They now recommend that those types of addresses be passed to the Places API instead, but not because the Places API can validate or verify them. Again, none of the APIs in the Google Maps suite will verify addresses. Rather, it is because information like a unit number is currently superfluous when it comes to their roof-top level geo-coordinates. Google Maps does not need to know if an address is a multi-unit and/or multi-floored building in order to return a set of coordinates.

Take the Service Objects address for example,

27 E Cota St Ste 500
Santa Barbara, CA 93101-7602

The Google Maps Geocoding API returns the following address and coordinates,

"formatted_address" : "27 E Cota St, Santa Barbara, CA 93101, USA"

"location" : { "lat" : 34.41864020000001, "lng" : -119.696178 }

Notice that the formatted address output value has dropped the suite number even though the address is valid. Let’s change the suite number from 500 to a suite number that does not exist, such as 900.

"formatted_address" : "27 E Cota St, Santa Barbara, CA 93101, USA"

"location" : { "lat" : 34.41864020000001, "lng" : -119.696178 }

We get back the exact same response, because they are both the same in the eyes of Google Maps.
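One practical takeaway is to compare what you sent with what came back. The Python sketch below checks whether a unit designator in the input address is missing from the returned formatted_address value. It is a hypothetical helper with a deliberately small list of unit keywords, and it does not validate anything: it only shows that the geocoder's answer says nothing about whether the suite exists.

```python
import re

# A small, illustrative set of unit designators; real data needs more.
UNIT_PATTERN = re.compile(
    r"(?:\b(?:ste|suite|apt|unit|floor)\b\.?|#)\s*([\w-]+)",
    re.IGNORECASE,
)


def unit_was_dropped(input_address, formatted_address):
    """Return True if the input has a unit that the output no longer shows."""
    match = UNIT_PATTERN.search(input_address)
    if match is None:
        return False  # nothing to lose: the input had no unit
    unit = match.group(1).lower()
    output_tokens = re.findall(r"[\w-]+", formatted_address.lower())
    return unit not in output_tokens


returned = "27 E Cota St, Santa Barbara, CA 93101, USA"
real_suite = unit_was_dropped("27 E Cota St Ste 500, Santa Barbara, CA", returned)
fake_suite = unit_was_dropped("27 E Cota St Ste 900, Santa Barbara, CA", returned)
```

Both `real_suite` and `fake_suite` come out True, which is the point: the real suite and the made-up one are indistinguishable in the geocoder's response.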

A similar thing happens if we try the same using the Google Maps web site.

This is the result for when Suite 500 is passed in:

This is the result for when Suite 900 is passed:

Notice that 900 remains in the address.

An unsuspecting user could easily mistake the Suite 900 address for being valid if they were simply relying on the Google Maps website, and it’s mistakes like these that often lead people to believe that an address may exist when it does not.

The Right Tool for the Job

When selecting a dedicated address validation service here are a few critical and rich features you will want to look for:

Even with the recent updates Google Maps is still no alternative for a dedicated address validation service and choosing not to use one could prove to be an expensive mistake.