
The Trouble with Numeric and Fake-looking Chinese Email Addresses

If you encountered an email address made up of nothing but numbers, what would your first reaction be? You might suspect that it was a fake or disposable email address. But in some countries, such as China, this isn’t necessarily the case. In this blog post, we will take a deeper dive into when to be cautious about email addresses from China.

Obviously fake email addresses… right?

For example, let’s randomly type in some numbers.

  • 6843619
  • 1684154646514
  • 735416442
  • 94633252361

Suppose we used these numbers as the mailbox of an email address at a company domain like @serviceobjects.com, or even a free email provider like @gmail.com, to create something like 6843619@serviceobjects.com. Most likely, you would dismiss it as garbage, fake or simply bad. However, what if we instead used one of the following domains?

  • 126.com
  • 139.com
  • 163.com

Suppose we created something like 6843619@126.com. Now you might be thinking, “That’s even worse! Even the domains are all numbers now. Those are obviously fake email addresses. I’m absolutely positive.”

“Positive I tell you!”

OK, fine. I would agree. It looks fake to me too.

Now, what if we instead applied those numbers to the domain qq.com to get 6843619@qq.com? Would you still think it was an ‘obviously fake email address’?

Maybe not so ‘obviously fake’ after all

In China, all-numeric email addresses are very common. If you made your way to this blog article, then chances are you have encountered one or more numeric email addresses that turned out to be genuine when you may not have expected them to be. For example, the domains noted above, 126.com, 139.com, and 163.com, are not fake. They are real domains with valid Mail Exchange (MX) records that point to real mail servers for handling real email communication.

You might be more familiar with the domain qq.com, particularly if you work in international business and/or marketing.

QQ, which is owned by the Chinese tech giant Tencent, is a messaging application similar to Skype. In China and parts of Asia, qq.com is like what gmail.com, yahoo.com or outlook.com are to the US in terms of providing email, messaging and communication services. In fact, in 2014, QQ was recognized by Guinness World Records for having the most simultaneous users on an instant messaging platform with more than 200 million simultaneous users and over 800 million Monthly Active Users (MAU).

All of these QQ users have a qq.com email address, and all QQ accounts have a numeric email address.
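To see why a “looks fake” rule misfires here, consider a minimal sketch. It is a hypothetical heuristic written for this post, not part of any product, and the function names are assumptions. The first rule flags any all-numeric mailbox or domain label; the second exempts the providers named above, where numeric addresses are normal.

```python
# Providers named in this post where numeric mailboxes are normal.
NUMERIC_FRIENDLY_DOMAINS = {"126.com", "139.com", "163.com", "qq.com"}


def split_address(address: str) -> tuple[str, str]:
    """Split an address into (local part, lowercased domain)."""
    local, _, domain = address.rpartition("@")
    return local, domain.lower()


def naive_looks_fake(address: str) -> bool:
    """Naive rule: treat any all-numeric mailbox or domain label as fake."""
    local, domain = split_address(address)
    return local.isdigit() or domain.split(".")[0].isdigit()


def looks_fake(address: str) -> bool:
    """Same rule, but skip domains where numeric addresses are common."""
    _, domain = split_address(address)
    if domain in NUMERIC_FRIENDLY_DOMAINS:
        return False
    return naive_looks_fake(address)


print(naive_looks_fake("6843619@qq.com"))  # True: a genuine address is rejected
print(looks_fake("6843619@qq.com"))        # False
```

An allowlist like this only removes one source of false alarms. It says nothing about whether a given address is actually safe to use.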

But why numbers for an email address?

Numbers aren’t that hard to memorize. Most people have several phone numbers memorized, maybe a bank account or two, or perhaps the combination for a lock at their local gym. However, there is something impersonal and dissociative about numbers. A random number like 845796833 doesn’t tell you nearly as much as, say, Support@ or ILuvKittens@ or ImBatman@, or just a plain old name. So, what’s so different about China that makes numbered email addresses so popular?

Well, there is an interesting article from The New Republic that tries to shed some light on the subject. It suggests that numbers, when used as homonyms in the Chinese language, can be used to spell out Chinese words more quickly and easily. One example from the article is that the numbers 5 and 1 in Chinese sound like the words “I” and “want”, which helps explain why a job-hunting website would choose 51Job.com for its domain. In Chinese, 5-1-Job would mean “I want Job”. Cute.

The meaning behind numbered emails can go beyond simple homonyms, however. The article calls it a “numbered-based slang,” and here is one example that I think helps explain the idea. Quoting the article:

“The Internet company NetEase uses the web address 163.com—a throwback to the days of dial-up when Chinese Internet users had to enter 163 to get online.”

They go on to state that 163 is not a homonym for anything, but is instead a throwback reference. A similar example would be the 411.com search engine website. 411.com is a throwback to when people in the US would dial 4-1-1 for information (as opposed to now where most people simply ‘google’ to search for information).

More Than Just Numbers

Slang in any language can be very complicated, and staying informed enough to understand its meaning is not easy. Technical slang takes this complexity to a whole new level. Take, for example, this surprisingly common password: “ji32k7au4a83”. One would think that this seemingly complicated password would be quite rare, if not unique; however, it turns out it’s not. As one article on the subject points out, the password “ji32k7au4a83” can be translated to mean “my password” in English.

This is how it breaks down:

ji3 -> 我 -> M

2k7 -> 的 -> Y

au4 -> 密 -> PASS

a83 -> 碼 -> WORD

The article details how a major Chinese transliteration system can be creatively used to map English to Chinese to Unicode and vice versa. This process can be used to come up with some very complicated-looking email addresses, and not just passwords.
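The breakdown above can be sketched in a few lines of Python. This is an illustration only: the table holds just the four key sequences listed in this post, whereas a real input method maps far more keys, and the `decode` helper is a hypothetical name.

```python
# Key sequence -> character, exactly as listed in the breakdown above.
KEYS_TO_CHARACTER = {
    "ji3": "我",
    "2k7": "的",
    "au4": "密",
    "a83": "碼",
}


def decode(typed: str) -> str:
    """Replace known key sequences with their characters, left to right."""
    result = []
    position = 0
    while position < len(typed):
        for keys, character in KEYS_TO_CHARACTER.items():
            if typed.startswith(keys, position):
                result.append(character)
                position += len(keys)
                break
        else:
            # Unknown key: keep it unchanged so nothing is silently lost.
            result.append(typed[position])
            position += 1
    return "".join(result)


print(decode("ji32k7au4a83"))  # 我的密碼
```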

It would not be a stretch to say that the process bears some resemblance to 1337 Speak (Leet Speak). Take the previously mentioned “ImBatman” email example. One leet interpretation of it would be “1mb47m4n”. The result appears similarly nonsensical and complicated, wouldn’t you say? However, the problem with verifying Chinese email addresses goes beyond superficial, fake-looking mailboxes and domains.
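The leet interpretation mentioned above can be reproduced with a small substitution table. This is a sketch that assumes one particular mapping (i to 1, a to 4, t to 7), since leet speak has no single standard. The `to_leet` name is hypothetical.

```python
# Illustrative substitutions only; leet speak has no single standard mapping.
LEET_TABLE = str.maketrans({"i": "1", "a": "4", "t": "7"})


def to_leet(text: str) -> str:
    """Lowercase the text and swap selected letters for look-alike digits."""
    return text.lower().translate(LEET_TABLE)


print(to_leet("ImBatman"))  # 1mb47m4n
```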

Disposable email addresses are easier to create

Let’s circle back to the widely popular QQ application, and the all-numeric qq.com email addresses. When a user registers for a QQ account they are given a QQ ID number, and this number is also their QQ email address. This ID number can be bound to another email address, so instead of giving someone your actual email, you just give them your QQ number. It’s a nice feature. Unfortunately, it is easy for users to create disposable accounts with QQ and bind them to their real email address. These disposable accounts are commonly used by bots, often created for or by Chinese vendors trying to push their products via spam.

This can lead to some false negatives when validating email addresses. It is not uncommon to receive a business email address with a qq.com domain and for it to end up going bad. The qq.com domain and some of its IP addresses tend to accumulate bad sender reputations due to the large amounts of spam abuse mentioned above. Unfortunately, spam and abuse are not just a problem for qq.com. Malicious internet activity is very common in China, and Chinese service providers struggle with the problem.

Countries with malicious networks or spam saturation: Use Caution

If you were to search for the countries with the worst spam or malicious networks, you would likely find the following result.

Countries with the worst spam/malicious networks

  1. United States
  2. China
  3. Russia

SPAMHAUS lists the worst spam-enabling countries, and Country IP Blocks (CIPB) lists the countries with the most malicious networks. Both lists come back with the same top three countries in the same order. On both lists, the US is the worst offending country of all. Surprised?

CIPB also re-orders their top ten list by the number of malicious networks as a percentage of the total number of networks for the given country. Here is their re-organized list.

Countries with the most infected networks*

  1. Brazil 89%
  2. Turkey 54%
  3. Romania 39%
  4. China 32%
  5. Russia 11%
  6. United Kingdom 11%
  7. Japan 10%
  8. Ukraine 9%
  9. Germany 6%
  10. United States 6%

*Results are based on CIPB’s current top 10 countries with the most malicious networks.

Another CIPB top ten list places China as the current world leader in malicious internet activity. Brazil and Russia take second and third place respectively. The US is not on the list.

SPAMHAUS’ list of the 10 Worst Botnet Countries

  1. India
  2. China
  3. Vietnam
  4. Iran
  5. Thailand
  6. Brazil
  7. Indonesia
  8. Pakistan
  9. Algeria
  10. Russia

Overall, the real issue with trying to verify email addresses from China is not that they look complicated and fake, but that the country is a hotbed for malicious activity. Just because an email address is deliverable doesn’t mean that it is good or safe. In some cases, it would not be surprising to see one out of three email addresses from China turn out to be a bot and/or disposable.

How Email Validation can help

So how can you differentiate between, say, a legitimate alphanumeric email address that looks suspicious versus a spambot? Our DOTS Email Validation product can help you navigate some of the challenges and complexities of email data quality, particularly for contact or marketing with international addresses.

Our Email Validation service tests emails at multiple distinct levels.

  • First, of course, we check for basic syntax errors and common domain typos, and perform a DNS (domain name) check to make sure the domain exists and has a valid MX record.
  • We also perform a comprehensive SMTP check by communicating directly with the target mail server to determine three key pieces of information: is the server working, will it accept any address, and will it accept a specific address.
  • Finally, we perform multiple integrity checks to see if the email address is associated with problematic addresses and services such as spam traps, known disposable address providers and blacklisted servers.

Together, these checks determine whether the email address is a real, functioning email address.
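As a rough illustration of how the three levels in the list above fit together, here is a minimal sketch. It is not the DOTS Email Validation implementation. The `Result` type, the sample typo and disposable tables, and the `has_mx` and `accepts_mailbox` callables are all assumptions made for this example, and the network checks are passed in as functions so the sketch runs without contacting any server.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List

# Deliberately simple pattern; the full address grammar is more permissive.
SYNTAX = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical sample data for illustration only.
DOMAIN_TYPOS = {"gmial.com": "gmail.com", "yaho.com": "yahoo.com"}
KNOWN_DISPOSABLE = {"mailinator.com"}


@dataclass
class Result:
    address: str
    deliverable: bool = False
    warnings: List[str] = field(default_factory=list)


def validate(
    address: str,
    has_mx: Callable[[str], bool],
    accepts_mailbox: Callable[[str], bool],
) -> Result:
    """Run the checks in order, stopping when delivery is impossible."""
    result = Result(address)

    # Level 1: syntax, common domain typos, then the DNS/MX lookup.
    if not SYNTAX.match(address):
        result.warnings.append("bad syntax")
        return result
    domain = address.rsplit("@", 1)[1].lower()
    if domain in DOMAIN_TYPOS:
        result.warnings.append(f"did you mean {DOMAIN_TYPOS[domain]}?")
        return result
    if not has_mx(domain):
        result.warnings.append("no MX record")
        return result

    # Level 2: ask the mail server whether it accepts this mailbox.
    result.deliverable = accepts_mailbox(address)

    # Level 3: integrity findings are recorded separately from deliverability.
    if domain in KNOWN_DISPOSABLE:
        result.warnings.append("known disposable provider")
    return result


def always_true(_value: str) -> bool:
    """Stand-in for a real DNS or SMTP check (assumption for this demo)."""
    return True


print(validate("6843619@qq.com", always_true, always_true))
```

Keeping `deliverable` and `warnings` as separate fields reflects the point made earlier: an address can be deliverable and still be risky.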

Circling back to the Chinese email addresses we discussed earlier: our Email Validation service can validate these with no problem, but clients often get confused when these emails get a low score. We verify that they are deliverable, but give them a low score when they show problems such as being tied to bots or malicious activity. It is then up to you to decide whether you want to take the risk of using these email addresses or not. So in closing, understanding that numerical or nonsensical-looking emails from other countries are often OK is a good first step, but automated validation can help you make an informed decision on whether to use them.


Identifying Disposable Email Addresses: A Better Approach

Disposable email addresses – also known as burner emails, throwaway emails, temporary emails or fake emails – are commonly touted as a useful tool for keeping one’s personal or business email address private and clean of spam. They should not be confused with alias email addresses, which generally forward to a primary email address and are therefore more likely to be read. There are different types of disposable email addresses, and they can work in a variety of ways.

In general, a user will submit a disposable email address instead of their real one, which in theory should help keep one’s own email protected from spam without their primary email and/or private data being exposed. (Note that we say “should”: there are some unscrupulous disposable email providers out there, so as with all things concerning the internet, users must be careful.)

Disposable email addresses may sound great for end users, but they can be problematic for legitimate businesses and marketers. One could easily argue that disposables are successfully doing their job when they prevent a marketer from emailing an end user, but this also means that businesses are forced to adapt their marketing strategies. One such strategy: trying to identify these disposable email addresses up front, to have a more accurate view of your email marketing assets.

A simple (but flawed) strategy: email lists

Disposable email addresses are commonly identified by static lists. There are many online communities that pool together their own lists of known disposable domains and email addresses. However, static lists are a poor long-term solution, as they can quickly become stagnant. Some communities do their best to keep their lists up to date, but there are still many potential problems with this strategy:

  • Lists often lack standardization, which can lead to implementation issues. There are many disposable services available worldwide, and some community driven lists and solutions are dedicated to just a single disposable service.
  • These lists frequently contain legitimate records for domains and addresses that are not disposable.
  • In order for a disposable to make it onto a list, it first needs to be reported. By the time that happens, and the data makes its way into a solution, the list may already be partially outdated. Moreover, disposables frequently change and not all disposables are reported.
  • Using a list strategy requires constant vigilance. It’s trouble enough staying up to date on just one disposable service, but trying to stay on top of multiple others as well as new ones as they pop up is often a losing battle.

Lists of disposable email addresses are a reactive solution at best. Worse, they only scratch the surface of the problem. Disposables are constantly changing, with new ones appearing and old ones disappearing all the time. It is impractical to rely on a simple list strategy to try to identify a disposable successfully.
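For reference, the list strategy described above usually boils down to a set lookup like the following sketch. The list contents and the `is_listed_disposable` name are assumptions for illustration: mailinator.com is a well-known disposable provider, and the `.test` domains are made-up placeholders.

```python
# Snapshot of a hypothetical community list; entries are illustrative.
STATIC_DISPOSABLE_DOMAINS = frozenset({"mailinator.com", "example-temp.test"})


def is_listed_disposable(address: str, listed=STATIC_DISPOSABLE_DOMAINS) -> bool:
    """Return True only if the address's domain is on the static list."""
    domain = address.rpartition("@")[2].lower()
    return domain in listed


print(is_listed_disposable("someone@mailinator.com"))       # True
print(is_listed_disposable("someone@brand-new-temp.test"))  # False: not yet reported
```

The second call shows the weakness: a disposable domain that nobody has reported yet passes the check, and it keeps passing until someone updates the list.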

A better approach: organic data aggregation

At Service Objects we like to look beyond simple lists. Instead of looking at one list to perform a simple, straightforward disposable lookup, we take advantage of our wealth of data and our years of experience to not only dig deeper, but to also cast a wider net. Our email validation service doesn’t just look at lists; it looks at the whole picture as well as the nitty-gritty.

We observe various behavior patterns to better identify specific activities and ties to these activities, not just for disposables but for a variety of email types – malicious or otherwise. This allows us to assign values to these activities and even compare them against other activities. Using complex algorithms along with machine learning we can intelligently determine if a value is directly or indirectly related to a particular issue, such as being a disposable address.

As sophisticated as this solution is, note that we won’t always be able to successfully identify a disposable address. Sometimes all the variables don’t match up just right, and sometimes there just isn’t enough data. However, the service will still often be able to identify such an email address as being malicious or potentially malicious, in which case you would likely want to reject the email address anyway.

The sophisticated solution

Disposable email addresses are a real headache for businesses and marketers. As with most things regarding email addresses, they are a much more complicated problem than one would normally think, and one that requires more than a simple list as a solution. They call for a sophisticated solution.

Our DOTS Email Address Validation service keeps tabs on millions of domains. It monitors various behavior patterns and leverages multiple sets of data. As domains and data continue to grow, so does the service – becoming smarter and better. The service can adapt to the constantly changing disposables, making it better suited to identify them as they pop up. Not because it’s trying to keep up with them, but because it’s anticipating them.