Posts Tagged ‘International Address Validation’

DOTS Address Validation International (AVI) enables businesses to develop consistent addressing formats for their international addresses.

AVI Address Output: We Speak Your Language

You say tomato, I say tomahto.
You say Rome, I say Roma.
You say Munich, I say München.
Let’s not call the whole thing off.

Have you ever wondered why the country code and abbreviation for Germany is DE, or similarly why it is ES for Spain? Unlike FR and CA, which are France and Canada respectively, DE and ES seem out of place for Germany and Spain. A simple explanation is that DE is short for Deutschland and ES is short for España – which are the names used locally for these countries.

Local names such as Deutschland and España are known as endonyms, while Germany and Spain are English-language exonyms. Put simply, an endonym is the name that locals use for a place, and an exonym is a name used by outsiders: Deutschland is what Germans call their country, and Germany is what English speakers call it.

(As another example, United States is an endonym for, well, the United States. Meanwhile, exonyms for the United States depend on the country involved: the French call us the États-Unis and the Russians call us Соединенные Штаты.)

The DOTS Address Validation International (AVI) service currently offers three output language options to let the end user choose their preferred language setting and behavior: ENGLISH, BOTH (English and local addresses), and LOCAL_ROMAN. Let’s examine each of these in detail:

ENGLISH – Instructs the service to return the address in English, without any localized text or accents.

BOTH – Instructs the service to return a standardized address in both English and its localized text (e.g., Cyrillic or Chinese) and format when applicable.

Here’s an example of a Chinese address in both English and in its local Chinese text.

Address input in English

No. 1514 Changyang Lu
Yangpu Qu, Shanghai Shi

Address output in Simplified Chinese

上海市杨浦区长阳路1514号

Here’s an example of a Russian address in both English and Cyrillic.

Address input in English

Kommunarov Ul, 290, 9
Krasnodar
Krasnodarskiii Kraii
350020

Address output in Cyrillic

Коммунаров ул, д. 290, OFFICE 9
КРАСНОДАР
КРАСНОДАРСКИЙ КРАЙ
350020

One last example, this time in Greece.

Address input in English

Alkamenous 76
104 40 Athens

Address output in Greek

104 40 Αθηνα
Αλκαμενους 76

LOCAL_ROMAN – Instructs the service to return the address in its local spelling using Roman text.

For example, the city of Rome will be returned as Roma, Naples as Napoli, Dublin as Baile Átha Cliath, Naestved as Næstved, and Cologne as Köln. Let’s take a look at some address examples.

Here’s an example of an address in Italy.

Address input in English

Via Villafranca 20
00185 Rome RM

Address output in Italian

Via Villafranca 20
00185 Roma RM

Here’s an example of an address in Denmark.

Address input in English

Kobmagergade 20
4700 Naestved

Address output in Danish

Købmagergade 20
4700 Næstved

Here’s an example of an address in Germany.

Address input in English

Weisshausstr. 20-30
50939 Cologne

Address output in German

Weißhausstr. 20-30
50939 Köln

For some countries, the service can also accept an address in its localized spelling and text and return the address in English. Try entering any of the address examples above into the AVI service using the local language, spelling, and format, with the output language set to ENGLISH, to see the address validated and standardized into English. When submitting an address in a non-English language, make sure that the text is properly encoded.
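As a rough illustration of choosing an output language, the sketch below assembles a query string for a validation request. Only the three option names come from this post; the parameter names (Address1, Country, OutputLanguage) and the helper function are assumptions made for illustration, so check the AVI documentation for the real ones.

```python
from urllib.parse import urlencode

# The three output language options named in this post.
OUTPUT_LANGUAGES = {"ENGLISH", "BOTH", "LOCAL_ROMAN"}


def build_query(address_lines, country, output_language="ENGLISH"):
    """Return a percent-encoded query string for a hypothetical request.

    The parameter names are placeholders, not the service's documented names.
    """
    option = output_language.upper()
    if option not in OUTPUT_LANGUAGES:
        raise ValueError(f"Unsupported output language: {output_language!r}")
    params = {f"Address{i}": line for i, line in enumerate(address_lines, start=1)}
    params["Country"] = country
    params["OutputLanguage"] = option
    # urlencode percent-encodes non-ASCII text as UTF-8 by default.
    return urlencode(params)


query = build_query(["Weißhausstr. 20-30", "50939 Köln"], "Germany", "ENGLISH")
print(query)
```

Rejecting unknown option names before the request is sent keeps a typo such as LOCAL_ROMAIN from silently falling back to some default behavior.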

The AVI service cannot correct corrupted characters, so it is important to ensure that everything that holds the address in memory or stores it can support the character set. Otherwise, you will end up with data corruption, which is not always easy to detect or fix.

For example, in some cases, a character may simply come back as a question mark ‘?’ or a square ‘■’. Take the following address.

Weißhausstr. 20-30
50939 Köln

The fourth character of the first line and the eighth character of the second line will come back corrupted, as follows:

Wei■hausstr. 20-30
50939 K?ln

In other cases, the corruption can be quite severe, and you may end up with something like ‘تخت اره ÙŠÚ©’. Not only is it important to ensure that you do not send any corrupted data to the AVI service, but you also want to make sure that you properly handle and store the service response. Otherwise you may end up corrupting an address after it has been validated. (How this happens would make a good topic for another blog post, but for now, just make sure to use a Unicode Transformation Format (UTF) encoding, such as UTF-8, consistently on everything that handles the data.)
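Both kinds of corruption are easy to reproduce. The following sketch uses only Python's built-in string encoding and is not specific to AVI. It shows a lossy conversion that produces question marks, and a mismatched decoder that produces garbled text.

```python
address = ["Weißhausstr. 20-30", "50939 Köln"]

# Lossy conversion: forcing the text into ASCII replaces each unsupported
# character with '?', and the original character cannot be recovered.
lossy = [line.encode("ascii", errors="replace").decode("ascii") for line in address]
print(lossy)  # ['Wei?hausstr. 20-30', '50939 K?ln']

# Mismatched decoder: UTF-8 bytes read as Latin-1 turn each accented
# character into two unrelated characters.
garbled = "Köln".encode("utf-8").decode("latin-1")
print(garbled)  # KÃ¶ln

# A decoder mismatch can be undone only if no bytes were lost along the way.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)  # Köln

# Keeping UTF-8 at every step preserves the address exactly.
round_trip = [line.encode("utf-8").decode("utf-8") for line in address]
assert round_trip == address
```

The first case is the dangerous one: once a character has been replaced, no amount of re-decoding brings it back, which is why the encoding has to be right before the data is stored.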

Each of these options gives you the flexibility to have a consistent addressing format for your international addresses, depending on your location, your customers, and your mailing conventions. All of them provide an automated, consistent approach to address validation. Whether it is addressing mail to customers in the format of their home countries, translating addresses, or ensuring readability for the sender, DOTS Address Validation International truly speaks your language.

5 Commonly Used Terms and Definitions in International Address Validation Systems

When dividing the countries of the world into regions and sub-regions for the purpose of address validation, it is important to find common ground and to use a set of widely adopted terms and definitions.

In the United States of America (US), we commonly use the terms city, state, and ZIP code when referring to addresses. While that may mostly work for a country like Mexico (MX), it is not appropriate for other countries like Japan (JP), where the country is divided into prefectures instead of states. Not all countries call their sub-region divisions the same thing, and many countries have several levels of sub-divisions. To further complicate the matter, not all sub-division levels are necessarily interchangeable from one country to another. For example, a first-level sub-region in the US is a state, such as California (US-CA), but a first-level sub-region for the United Kingdom of Great Britain and Northern Ireland (GB) is a country, such as England (GB-ENG).

Every country can have its own particular set of terms and definitions; to try to go over them all would be too complicated and inefficient. Instead, let’s go over some commonly used terms that are helpful when talking about international addresses.

Country Code

An alphabetic or numeric code used to represent a country. Various types of country codes exist for different particular uses, but the most commonly used codes come from the ISO 3166 standard. Part one of this standard, ISO 3166-1, consists of the following code formats:

  • ISO 3166-1 alpha-2 – a two-letter country code.
  • ISO 3166-1 alpha-3 – a three-letter country code.
  • ISO 3166-1 numeric – a three-digit country code.
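To make the three formats concrete, here is a small sketch that maps between them for a few of the countries mentioned in these posts. The table is a hand-typed sample rather than a complete ISO 3166-1 list, and the helper name is made up for this example.

```python
# Sample of ISO 3166-1 codes: alpha-2 -> (alpha-3, numeric, English short name).
ISO_3166_1_SAMPLE = {
    "DE": ("DEU", "276", "Germany"),
    "ES": ("ESP", "724", "Spain"),
    "FR": ("FRA", "250", "France"),
    "CA": ("CAN", "124", "Canada"),
    "US": ("USA", "840", "United States"),
    "GB": ("GBR", "826", "United Kingdom"),
    "JP": ("JPN", "392", "Japan"),
}


def to_alpha2(code):
    """Normalize an alpha-2, alpha-3, or numeric code to its alpha-2 form."""
    key = str(code).strip().upper()
    if key in ISO_3166_1_SAMPLE:
        return key
    for alpha2, (alpha3, numeric, _name) in ISO_3166_1_SAMPLE.items():
        # Numeric codes are always three digits, so pad short values first.
        if key == alpha3 or key.zfill(3) == numeric:
            return alpha2
    raise KeyError(f"Unknown country code: {code!r}")


print(to_alpha2("deu"))  # DE
print(to_alpha2(724))    # ES
```

Numeric codes are stored as strings here because some of them start with a zero, which an integer would drop.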

Postal Code

An alphabetic, numeric, or alphanumeric code, sometimes including spaces or punctuation, that is commonly used for the purpose of sorting mail. It is commonly referred to as the postcode. Some country-specific terms include ZIP code (US), PLZ (DE, AT, and CH), PIN code (IN), and CAP (IT).
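Because formats differ so much between countries, a per-country pattern table is one common way to do a first sanity check. The sketch below is deliberately simplified and covers only a handful of countries. A code that matches its pattern may still not exist, so this is no substitute for validation against postal data.

```python
import re

# Simplified format checks only; a matching code is not necessarily a real one.
POSTAL_CODE_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),  # ZIP code, optionally ZIP+4
    "DE": re.compile(r"\d{5}"),           # PLZ
    "AT": re.compile(r"\d{4}"),           # PLZ
    "CH": re.compile(r"\d{4}"),           # PLZ
    "IN": re.compile(r"\d{3} ?\d{3}"),    # PIN code, sometimes written with a space
    "IT": re.compile(r"\d{5}"),           # CAP
    "GR": re.compile(r"\d{3} ?\d{2}"),    # often written with a space, e.g. 104 40
}


def looks_like_postal_code(country_code, postal_code):
    """Return True if postal_code has the expected shape for the country."""
    pattern = POSTAL_CODE_PATTERNS.get(country_code.upper())
    if pattern is None:
        raise KeyError(f"No pattern recorded for {country_code!r}")
    return pattern.fullmatch(postal_code.strip()) is not None


print(looks_like_postal_code("DE", "50939"))   # True
print(looks_like_postal_code("GR", "104 40"))  # True
print(looks_like_postal_code("US", "5093"))    # False
```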

Administrative Areas

The regions into which a country is divided. Each region typically has a defined boundary with an administration that performs some level of government functions. These areas are commonly expected to manage themselves with a certain level of autonomy. Various administrative levels exist, ranging from “first-level” to “fifth-level”. The higher the level number, the lower its rank in the administrative hierarchy. For example, the US is made up of states (first-level), which are divided into counties (second-level) that consist of municipalities (third-level). For comparison, the United Kingdom (GB) is made up of the four countries England, Scotland, Wales, and Northern Ireland (first-level). These countries are made up of counties, districts, and shires (second-level), which in turn are made up of cities and towns (third-level) and small villages and parishes (fourth-level). Other common terms for an administrative area are administrative division, administrative region, administrative unit, administrative entity, and subdivision.

Locality

In general, a locality is a particular place or location. More specifically, a locality should be defined as a distinct population cluster. Localities are commonly recognized as cities, towns, and villages, but they may also include other areas such as fishing hamlets, mining camps, ranches, farms, and market towns. Localities are often lower-level administrative areas, and they may consist of sub-localities, which are segments of a single locality. Sub-localities should not be mistaken for the lowest-level administrative area of a country, nor for separate localities.

Thoroughfare

In general, a thoroughfare is a transportation route between one location and another. On land, it is more commonly referred to as a type of road or route that is typically used by motorized vehicles, such as a street, avenue or highway.
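One way to see how these five terms fit together is as the fields of a country-neutral address record. The sketch below is an illustrative data structure only. The class and field names are assumptions made for this example and do not reflect the AVI response format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InternationalAddress:
    country_code: str                  # ISO 3166-1 alpha-2, e.g. "DE"
    postal_code: Optional[str] = None
    # Ordered from first-level downward, e.g. ["England"] for a GB address.
    administrative_areas: List[str] = field(default_factory=list)
    locality: Optional[str] = None
    sub_locality: Optional[str] = None
    thoroughfare: Optional[str] = None

    def administrative_area(self, level):
        """Return the area at a 1-based level, or None if it is not recorded."""
        if 1 <= level <= len(self.administrative_areas):
            return self.administrative_areas[level - 1]
        return None


german = InternationalAddress(
    country_code="DE",
    postal_code="50939",
    locality="Köln",
    thoroughfare="Weißhausstr.",
)
british = InternationalAddress(country_code="GB", administrative_areas=["England"])

print(german.locality)                 # Köln
print(british.administrative_area(1))  # England
print(british.administrative_area(2))  # None
```

Storing administrative areas as an ordered list, rather than as fixed "state" and "county" fields, avoids baking one country's terminology into the record.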