Posts Tagged ‘Email Validation False Negative’

The Trouble with Numeric and Fake-looking Chinese Email Addresses

If you were to encounter an email address that was made up of just numbers, what would be your first reaction? You might suspect that it was a fake or disposable email address. But in some countries, such as China, this isn’t necessarily the case. In this blog post, we will take a deeper dive into when to be cautious about email addresses from China.

Obviously fake email addresses… right?

For example, let’s randomly type in some numbers.

  • 6843619
  • 1684154646514
  • 735416442
  • 94633252361

If we were to use these numbers as an email address with a company domain like or even a free email provider like, to create something like Most likely, you would dismiss it as being garbage, fake or just simply bad. However, what if we instead used one of the following domains?


And created something like Now you might be thinking, “That’s even worse! Even the domains are all numbers now. Those are obviously fake email addresses. I’m absolutely positive.”

“Positive I tell you!”

OK, fine. I would agree. It looks fake to me too.

Now, what if we instead applied those numbers to the domain,  to get this, Would you still think it was an ‘obviously fake email address’?

Maybe not so ‘obviously fake email address’

In China, all-numeric email addresses are very common. If you made your way to this blog article, then chances are you have encountered one or more numeric email addresses that turned out to be genuine when you may not have expected them to be. For example, the domains noted above,,, and, are not fake. They are real domains with valid Mail Exchange (MX) records that point to real mail servers for handling real email communication.
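One practical consequence is that a syntax filter should not treat an all-numeric mailbox or domain label as invalid. The following is a minimal Python sketch of that idea, not a full RFC 5322 parser. The function names and the `example.com`-style addresses are illustrative assumptions and are not the domains discussed above.

```python
import re

# Deliberately simple pattern: dot-separated atoms for the local part and
# dot-separated labels for the domain. Digits are allowed wherever letters
# are, so all-numeric mailboxes and all-numeric labels pass.
_LOCAL = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
_EMAIL_RE = re.compile(_LOCAL + "@" + _LABEL + r"(?:\." + _LABEL + ")+")


def looks_syntactically_valid(address: str) -> bool:
    """Return True if the whole string matches the simplified pattern."""
    return _EMAIL_RE.fullmatch(address) is not None


def is_all_numeric_local_part(address: str) -> bool:
    """Return True if everything before the first '@' is ASCII digits."""
    local, _, _ = address.partition("@")
    return local.isascii() and local.isdigit()


for candidate in ("6843619@example.com", "94633252361@123.example", "no-at-sign"):
    print(candidate,
          looks_syntactically_valid(candidate),
          is_all_numeric_local_part(candidate))
```

A numeric local part is reported separately from validity here, so it can be logged as a signal without being used as grounds for rejection.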

You might be more familiar with the domain, particularly if you work in international business and/or marketing.

QQ, which is owned by the Chinese tech giant Tencent, is a messaging application similar to Skype. In China and parts of Asia, is like what, or are to the US in terms of providing email, messaging and communication services. In fact, in 2014, QQ was recognized by Guinness World Records for having the most simultaneous users on an instant messaging platform with more than 200 million simultaneous users and over 800 million Monthly Active Users (MAU).

All of these QQ users have a email address, and all QQ accounts have a numeric email address.

But why numbers for an email address?

Numbers aren’t that hard to memorize. Most people have several phone numbers memorized, maybe a bank account or two, or perhaps the combination to a lock at their local gym. However, there is something impersonal and dissociative about numbers. A random number like 845796833 doesn’t tell you much compared with, say, Support@, ILuvKittens@, ImBatman@ or just a plain old name. So, what’s so different about China that makes numbered email addresses so popular?

Well, there is an interesting article from The New Republic that tries to shed some light on the subject. It brings up an interesting notion that suggests that numbers, when used as homonyms for the Chinese language, can be used to more quickly and easily spell out Chinese words. One example from the article is where the numbers 5 and 1 in Chinese sound like the words “I” “want”, which helps explain why a job-hunting web site would choose for their domain. In Chinese, 5-1-Job would mean “I want Job”. Cute.

The meaning behind numbered emails can go beyond simple homonyms, however. The article calls it a “numbered-based slang,” and here is one example that I think helps explain the idea. Quoting the article:

“The Internet company NetEase uses the web address—a throwback to the days of dial-up when Chinese Internet users had to enter 163 to get online.”

They go on to state that 163 is not a homonym for anything, but is instead a throwback reference. A similar example would be the search engine website. is a throwback to when people in the US would dial 4-1-1 for information (as opposed to now where most people simply ‘google’ to search for information).

More Than Just Numbers

Slang in any language can be very complicated, and staying well-informed on the subject matter to understand its meaning is not easy. Technical slang takes this complexity to a whole new level. Take for example this surprisingly common password, “ji32k7au4a83”. One would think that this seemingly complicated password would be quite rare if not unique; however, it turns out it’s not. As the article in the link points out, the password “ji32k7au4a83” can be translated to mean “my password” in English.

This is how it breaks down:

ji3 -> 我 -> M

2k7 -> 的 -> Y

au4 -> 密 -> PASS

a83 -> 碼 -> WORD

The article details how a major Chinese transliteration system can be creatively used to map English to Chinese to Unicode and vice-versa. This process can be used to come up with some very complicated looking email addresses and not just passwords.
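To make the breakdown concrete, here is a small Python sketch that converts those keystrokes into phonetic symbols. It assumes the transliteration system in question is Zhuyin (Bopomofo) typed on the standard Zhuyin keyboard layout, which the text above does not name explicitly. The table is partial and lists only the keys this one password needs, and the function name is illustrative.

```python
# Partial key-to-symbol table, assuming the standard Zhuyin (Bopomofo)
# keyboard layout. Only the keys used by "ji32k7au4a83" are included.
ZHUYIN_KEYS = {
    "j": "ㄨ", "i": "ㄛ",   # w + o
    "2": "ㄉ", "k": "ㄜ",   # d + e
    "a": "ㄇ", "u": "ㄧ",   # m + i
    "8": "ㄚ",              # a
    # Tone marks are typed with number keys.
    "3": "ˇ",  # third tone
    "4": "ˋ",  # fourth tone
    "7": "˙",  # neutral tone
}


def keys_to_zhuyin(keystrokes: str) -> str:
    """Translate Latin keystrokes to Zhuyin symbols (KeyError if unmapped)."""
    return "".join(ZHUYIN_KEYS[ch] for ch in keystrokes.lower())


# Each chunk is one syllable: its sound symbols followed by a tone key.
for chunk in ("ji3", "2k7", "au4", "a83"):
    print(chunk, "->", keys_to_zhuyin(chunk))
```

Read this way, the four syllables spell out wǒ de mì mǎ, which is what a Chinese speaker gets by typing "my password" phonetically while the keyboard is still in Latin mode.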

It would not be a stretch to say that the process bears some resemblance to 1337 Speak (Leet Speak). Take the previously mentioned “ImBatman” email example. One leet interpretation of it would be “1mb47m4n”. The result appears similarly nonsensical and complicated, wouldn’t you say? However, the problem with verifying Chinese email addresses goes beyond superficial, fake-looking mailboxes and domains.

Disposable email addresses are easier to create

Let’s circle back to the widely popular QQ application, and the all-numeric email addresses. When a user registers for a QQ account they are given a QQ ID number, and this number is also their QQ email address. This ID number can be bound to another email address, so instead of giving someone your actual email, you just give them your QQ number. It’s a nice feature. Unfortunately, it is easy for users to create disposable accounts with QQ and bind them to their real email address. These disposable accounts are commonly used by bots, often created for or by Chinese vendors trying to push their products via spam.

This can lead to some false-negatives when validating email addresses. It is not uncommon to receive a business email address with a domain and for it to end up going bad. The domain and some of their IP addresses tend to accumulate bad sender reputations due to the large amounts of spam abuse, as mentioned above. Spam and abuse are not just a problem for, unfortunately, malicious internet activity is very common in China and Chinese service providers struggle with the problem.

Countries with malicious networks or spam saturation: Use Caution

If you were to search for the countries with the worst spam or malicious networks, you would likely find the following result.

Countries with the worst spam/malicious networks

  1. United States
  2. China
  3. Russia

SPAMHAUS lists the worst spam enabling countries and Country IP Blocks (CIPB) lists countries with the most malicious networks, and both lists come back with the same top three countries in the same order. On both lists, the US is the worst offending country of all. Surprised?

CIPB also re-orders their top ten list by the number of malicious networks as a percentage of the total number of networks for the given country. Here is their re-organized list.

Countries with the most infected networks*

  1. Brazil 89%
  2. Turkey 54%
  3. Romania 39%
  4. China 32%
  5. Russia 11%
  6. United Kingdom 11%
  7. Japan 10%
  8. Ukraine 9%
  9. Germany 6%
  10. United States 6%

*Results are based on CIPB’s current top 10 countries with the most malicious networks.

Another CIPB top ten list places China as the current world leader in malicious internet activity. Brazil and Russia take second and third place respectively. The US is not on the list.

SPAMHAUS’ list of the 10 Worst Botnet Countries

  1. India
  2. China
  3. Vietnam
  4. Iran
  5. Thailand
  6. Brazil
  7. Indonesia
  8. Pakistan
  9. Algeria
  10. Russia

Overall, the real issue with trying to verify email addresses from China is not that they look complicated and fake, but that the country is a hotbed for malicious activity. Just because an email address is deliverable doesn’t mean that it is good or safe. In some cases, it would not be surprising to see one out of three email addresses from China turn out to be a bot and/or disposable.

How Email Validation can help

So how can you differentiate between, say, a legitimate alphanumeric email address that looks suspicious versus a spambot? Our DOTS Email Validation product can help you navigate some of the challenges and complexities of email data quality, particularly for contact or marketing with international addresses.

Our Email Validation service tests emails at multiple distinct levels.

  • First, of course, we check for basic syntax errors and common domain typos, and perform a DNS or domain name check to make sure the domain exists and has a valid MX record.
  • We also perform a comprehensive SMTP check by communicating directly with the target mail server to determine three key pieces of information: is the server working, will it accept any address, and will it accept a specific address.
  • Finally, we perform multiple integrity checks to see if the email address is associated with problematic addresses and services such as spam traps, known disposable address providers and blacklisted servers.

Together, these checks ultimately determine whether the email address is a real, functioning email address.
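The layered approach can be sketched as an ordered list of checks that stops at the first failure. This is only an illustration of the idea in Python, not how the DOTS Email Validation service is implemented. Every name below is hypothetical, and the DNS, SMTP and reputation lookups are replaced with in-memory stand-ins so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Check = Callable[[str], bool]


@dataclass
class ValidationResult:
    address: str
    passed: List[str] = field(default_factory=list)
    failed_at: Optional[str] = None  # name of the first failing check, if any

    @property
    def ok(self) -> bool:
        return self.failed_at is None


def run_checks(address: str, checks: List[Tuple[str, Check]]) -> ValidationResult:
    """Run the checks in order and stop at the first one that fails."""
    result = ValidationResult(address)
    for name, check in checks:
        if not check(address):
            result.failed_at = name
            break
        result.passed.append(name)
    return result


# Stand-in data (assumptions). Real checks would query DNS, open an SMTP
# session with the mail server, and consult reputation lists.
FAKE_MX = {"example.com": ["mx.example.com"], "throwaway.example": ["mx.throwaway.example"]}
FAKE_MAILBOXES = {"735416442@example.com", "bot@throwaway.example"}
FAKE_DISPOSABLE_DOMAINS = {"throwaway.example"}


def domain_of(address: str) -> str:
    return address.rpartition("@")[2].lower()


def has_basic_syntax(address: str) -> bool:
    local, at, domain = address.rpartition("@")
    return bool(local and at and "." in domain)


def domain_has_mx(address: str) -> bool:
    return bool(FAKE_MX.get(domain_of(address)))


def smtp_accepts(address: str) -> bool:
    return address.lower() in FAKE_MAILBOXES


def passes_integrity_lists(address: str) -> bool:
    return domain_of(address) not in FAKE_DISPOSABLE_DOMAINS


CHECKS = [
    ("syntax", has_basic_syntax),
    ("dns_mx", domain_has_mx),
    ("smtp", smtp_accepts),
    ("integrity", passes_integrity_lists),
]

print(run_checks("735416442@example.com", CHECKS))
print(run_checks("bot@throwaway.example", CHECKS))
```

Note that the second address clears the syntax, DNS and SMTP stages and only fails the integrity stage, which mirrors the point that a deliverable address is not necessarily a good one.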

Circling back to the Chinese email addresses we discussed earlier: our Email Validation service can validate these with no problem, but clients often get confused when these addresses receive a low score. We verify that they are deliverable, but give them a low score when they show signs of problems such as being bots or malicious. It is then up to you to decide whether to take the risk of using these email addresses. So in closing, understanding that numerical or nonsensical-looking emails from other countries are often OK is a good first step, and automated validation can help you make an informed decision on whether to use them.


Tackling False Negatives in Email Validation

What is a false negative? In email validation, the term is used when an address is incorrectly identified as being invalid or undeliverable – in other words, it is flagged as being bad when it is actually good. In some cases, false negatives may result in lost leads or unwanted rejections. This blog article looks at how they happen, and what we can do about it.

What causes false negatives in email addresses?

The DOTS Email Validation service offers real-time validation and verification of email addresses. Email verification is handled by directly communicating with an email address’ host mail server(s) via Simple Mail Transfer Protocol (SMTP). The protocol is quite old: the original Request for Comments (RFC) for it was published in 1982 and its most recent definition, RFC 5321, was last updated in October of 2008.

A Request for Comments, or RFC, is an official document by which the Internet Engineering Task Force (IETF) publishes standards, protocols, best practices, or other information relevant to the operation of the Internet. With enough interest, an RFC may evolve into an internet standard.

The RFC provides rules and guidelines for SMTP communication and behavior; however, some mail transfer agents handle compliance in their own way, while others simply operate out of compliance. Additionally, some mail and network administrators will modify mail transfer agent behavior in an effort to battle large volumes of spam/junk mail and malicious behavior.

Moreover, an admin can configure their servers to lie or behave defensively, misusing SMTP codes and/or pairing codes with misleading messages. They can configure firewalls and/or install sophisticated software filters to protect their servers from unwanted exposure, which can result in communication behavior akin to a conversation with Dr. Jekyll and Mr. Hyde. In short, there are many scenarios in which false negatives may occur as well as many opportunities for new ones to arise. This is why we are so vigilant in our monitoring of mail server behavior, and why we take false negative reports so seriously.

Understanding temporary versus permanent rejects

False negatives are commonly broken into two categories: temporary failure rejects (also known as soft bounce backs) and permanent failure rejects (known as hard bounce backs). Temporary rejects are commonly used to graylist incoming email; these account for most of the false negative mail server behavior that we see. Conversely, using a permanent reject code to induce a false negative is much rarer and more severe, as doing so can lead to unwanted side effects if not properly implemented by a mail server administrator.
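In SMTP terms, the two categories correspond to the first digit of the server's reply code: RFC 5321 defines 4yz replies as transient negative completions and 5yz replies as permanent negative completions. A minimal Python sketch of that first-pass classification follows. The function name and the category labels are illustrative, and the code alone is a starting point rather than a verdict, since servers do not always use these codes honestly.

```python
def classify_smtp_reply(code: int) -> str:
    """Classify an SMTP reply code by its first digit (RFC 5321, section 4.2.1)."""
    if 200 <= code < 300:
        return "accept"            # positive completion, e.g. 250
    if 400 <= code < 500:
        return "temporary_reject"  # transient failure (soft bounce), e.g. 450, 451
    if 500 <= code < 600:
        return "permanent_reject"  # permanent failure (hard bounce), e.g. 550
    return "other"                 # 3yz intermediate replies or invalid codes


for code in (250, 451, 550):
    print(code, classify_smtp_reply(code))
```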

DOTS Email Validation is extremely good at identifying and handling behavior that would result in temporary rejects. Permanent reject false negatives are primarily seen when a mail server implements and uses a blacklist. Other cases stem from rare edge cases, which can in turn produce even rarer edge cases of their own.

Our Email Validation service handles a variety of blacklist and graylist techniques. New blacklist and graylist techniques are generally rare. However, they are not to be taken lightly. These techniques used by mail servers often leave a lasting, but minimal, effect and we frequently audit the Email Validation system for evidence of unhandled blacklists and graylists. If an unconventional blacklist/graylist technique pops up on the radar, we work quickly to identify the specific behavior. Once identified, we are able to update our data to handle future occurrences when communicating with mail servers.
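A common way to cope with graylisting in general is to treat a temporary reject as "ask again later" rather than as a final answer. The Python sketch below illustrates that generic retry pattern under stated assumptions. It does not describe how the Email Validation service itself handles graylists. The `probe` callable, the delay schedule and the function name are all hypothetical.

```python
import time
from typing import Callable, Sequence


def probe_with_retry(
    probe: Callable[[], int],
    delays: Sequence[float] = (60, 300, 900),
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call probe() and retry after each delay while the reply is a 4yz code.

    probe is assumed to return the SMTP reply code for a recipient check.
    Returns the last code seen, which may still be a temporary reject if
    the server never changes its answer.
    """
    code = probe()
    for delay in delays:
        if not 400 <= code < 500:
            break  # accepted or permanently rejected: nothing to wait for
        sleep(delay)
        code = probe()
    return code


# Usage with a fake server that graylists the first two attempts.
replies = iter([451, 451, 250])
waited = []
final = probe_with_retry(lambda: next(replies), sleep=waited.append)
print(final, waited)
```

Passing `sleep` in as a parameter keeps the sketch testable without real waiting. In a real-time validation setting the delays would have to be far shorter, or the retry deferred to a background job.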

Common permanent reject scenarios

Permanent rejects are commonly used to help identify undeliverable email addresses. Not all SMTP ‘accept codes’ mean that an email address is deliverable, and conversely not all ‘reject codes’ mean that it is undeliverable. When it comes to email validation and verification there are many gray areas to consider and handle.

Based on client feedback, the Email Validation engines have been tuned to err on the side of caution when handling certain unclear permanent reject behavior from mail servers. The Email Validation service will lean more towards returning an UNKNOWN for the IsDeliverable output value when the SMTP session contains potentially contradictory data. Our clients have expressed that they would rather see an email left as UNKNOWN than risk it being a possible false negative. This is why Email Validation has a comprehensive output, containing over twenty warning and note flags to help the user better understand why the Email Validation service scored an email the way it did.
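The "err on the side of caution" idea can be illustrated with a three-valued decision. The Python sketch below is a simplified illustration only: the hint phrases, the `accepts_any_address` flag and the function name are assumptions made for the example, and the real service's rules and flags are not documented here.

```python
from enum import Enum


class IsDeliverable(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


# Hypothetical phrases suggesting a reject is about policy or server state
# rather than about the mailbox itself.
AMBIGUOUS_HINTS = ("blocked", "blacklist", "policy", "spam", "try again", "temporar")


def decide(code: int, message: str, accepts_any_address: bool = False) -> IsDeliverable:
    """Map one SMTP reply to a cautious deliverability verdict."""
    text = message.lower()
    if 200 <= code < 300:
        # An accept from a server that accepts every address proves nothing.
        return IsDeliverable.UNKNOWN if accepts_any_address else IsDeliverable.TRUE
    if 500 <= code < 600:
        # A permanent reject whose text contradicts "mailbox is bad" stays unknown.
        if any(hint in text for hint in AMBIGUOUS_HINTS):
            return IsDeliverable.UNKNOWN
        return IsDeliverable.FALSE
    # Temporary rejects and anything unexpected are not a final answer.
    return IsDeliverable.UNKNOWN


print(decide(550, "5.7.1 Message blocked by policy"))
print(decide(550, "5.1.1 User unknown"))
```

Keyword matching like this is brittle, which is one reason a real system would pair the verdict with warning and note flags rather than rely on a single value.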

Other permanent reject examples are due to scenarios related to (but not limited to) disabled & suspended email accounts, unreachable domain group errors, and various network and storage related errors. Here’s a brief description for each:

  • Disabled and suspended accounts – An email address or domain may be disabled or suspended by the mail host for a variety of reasons. Some examples include delinquency, abuse, exceeding a quota, high traffic, misconfiguration, and migration. These addresses will often return a permanent reject code, but their status can change at any time due to user intervention.
  • Unreachable domain groups – Mail servers can sometimes encounter internal errors when trying to find and/or connect to a domain group and will report back a false negative. This is likely caused by misconfiguration, ambiguity and/or migration.
  • Network and storage related errors – Mail servers and DNS can be configured poorly at times, to the point where they become unreachable or unresponsive.

Even though the above scenarios will often lead to permanent rejects, they can change at any time due to user/admin intervention, or sometimes simply waiting for a change to finish propagating.

In some cases, a mail server may handle the above-mentioned errors poorly and return a wrong or misleading response. For example, the mail server returns a permanent reject code with the description “The e-mail account does not exist. Check the e-mail address, or contact the recipient directly to find out the correct address,” even though the address does exist and the mail server simply could not find it at the time. In this case, nothing in the SMTP description indicates that the server encountered an internal error, so the address appears to be bad when it is not.

Service Objects persistently works to improve the Email Validation service to better identify and handle potential false negatives. As mentioned previously, some scenarios cannot always be accurately identified, and new scenarios can always arise, but we will continue to update the service to minimize false negatives as much as possible. If you have a question about false negatives or a scenario to troubleshoot, contact our team to discuss it further.