Service Objects’ Blog

Thoughts on Data Quality and Contact Validation


Posts Tagged ‘Fraud Prevention’

Baseball and Data Quality: America’s National Pastimes

By the time October rolls around, the top Major League baseball teams in the country are locked in combat, in the playoffs and then the World Series. And as teams take the field and managers sit in the dugout, everyone has one thing on their mind.

Data.

Honestly, I am not just using a cheap sports analogy here. Many people don’t realize that before my current career in data quality, I was a young pitcher with a 90+ MPH fastball. I eventually made it as far as the Triple-A level of the Pittsburgh Pirates organization. So I know a little bit about the game and how data plays into it. We really ARE thinking about data, almost every moment of the game.

One batter may have a history of struggling to hit a curve ball. Another has a good track record against left-handed pitching. Still another one tends to pull balls to the left when they are low in the strike zone. All of this has been captured as data. Have you noticed that position players shift their location for every new batter that comes to the plate? They are responding to data.

Long before there were even computers, baseball statisticians tracked everything about what happens in a game. Today, with real-time access to stats, and the ability to use data analytics tools against what is now a considerable pool of big data, baseball has become one of the world’s most data-driven sports. The game’s top managers are distinguished for what is on their laptops and tablets nowadays, every bit as much as for who is on their rosters.

And then there are the people watching the game who help pay for all of this – remember, baseball is fundamentally in the entertainment business. They are all about the data too.

A recent interview article with the CIO of the 2016 World Champion Chicago Cubs underscored how a successful baseball franchise leverages fan data at several levels: for example, tracking fan preferences for an optimal game experience, analyzing crowd flow to optimize the placement of concessions and restrooms, and preparing for a rush of merchandise orders in the wake of winning the World Series (although, as a lifelong Cubs fan, I realize that they’ve only had to do that once so far since 1908). For any major league team, every moment of the in-game experience – from how many hot dogs to prepare to the “walk up” music the organist plays when someone comes up to bat – is choreographed on the back of customer data.

Baseball has truly become a metaphor for how data has become one of the most valuable business assets for any organization – and for a competitive environment where data quality is now more important than ever. I couldn’t afford to pitch with bad data on opposing players, and you can’t afford to pursue bad marketing leads, ship products to wrong customer addresses, or accept fraudulent orders. Not if your competitors are paying closer attention to data quality than you are.

So, pun intended, here’s my pitch: look into the ROI of automating your own data quality, in areas such as marketing leads, contact data verification, fraud prevention, compliance, and more. Or better yet, leverage our demographic and contact enhancement databases for better and more profitable customer analytics. By engineering the best data quality tools right into your applications and processes, you can take your business results to a new level and knock it out of the park.

Service Objects New BIN Validation Operation Helps Retailers Fight Fraud

Here at Service Objects, we strive to improve our services to best meet our customers’ needs. Sometimes that means adding features and upgrades, tweaking an existing service or operation, leveraging new datasets, or adding an entirely new service. We take pride in being able to respond quickly and effectively to our customers’ feedback and requests.

In response to client feedback, we have developed an upgraded operation for our DOTS BIN Validation service. It is called ValidateBIN_V2, and it represents the latest and greatest that our BIN Validation service has to offer.

The DOTS BIN Validation service helps determine whether a given BIN (the first 6 digits of a credit card number) is valid, which is a crucial step in fighting fraud. BIN validation also helps merchants determine whether a card number belongs to a debit card, credit card, gift card, or prepaid card. Likewise, the BIN identifies the card’s country of origin, giving you insight into the validity of the transaction.

This new BIN operation upgrade builds on the previous operation, providing even further information about a BIN.

By design, and to ensure that we’re giving our customers quality information, the V1 BIN operation returns information about a BIN only if bank information can be found about it.

The ValidateBIN_V2 operation provides the same information as the V1 operation, but also functions slightly differently and provides additional information:

  • Instead of failing a BIN or providing an error response, ValidateBIN_V2 displays any information about a BIN that we can find.
  • The V2 operation upgrade will return a “Status” field indicating “OK” for BINs we were able to find or “Not Found” for BINs that we weren’t able to find or that don’t exist.
  • The V2 operation will return the same card type, sub type, bank, and country information that the old operation returned.

We’ve also added a few new fields to the new BIN operation that make it more helpful to the end user:

  • Warnings: This field returns warning codes and accompanying descriptions. The current service will only return warnings if the bank information, card type or country information is missing for a BIN.
  • Notes: This field contains additional information. Based on the way we have set up the Warnings and Notes fields in our API, we can easily add new warnings and notes as we continue to improve our services. These fields allow us to return useful information about a BIN without affecting the current output structure of the API.
  • Information Components: This field is set up in a way that allows us to future-proof the ValidateBIN_V2 operation. If we need to add new fields, the Information Components field allows us to do so easily without altering the existing structure of the API.
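A caller might act on the V2 behavior described above roughly as follows. This is only a minimal sketch: the dictionary keys (`Status`, `Warnings`, `CardType`, `Country`) and the helper name `summarize_bin_result` are assumptions made for illustration, not the documented response format of ValidateBIN_V2.

```python
def summarize_bin_result(result: dict) -> dict:
    """Reduce a hypothetical ValidateBIN_V2-style result to a simple decision.

    The key names used here are illustrative assumptions, not the real schema.
    """
    status = result.get("Status", "Not Found")
    warnings = result.get("Warnings") or []
    found = status == "OK"
    return {
        "found": found,
        # A BIN that was found but carries warnings (for example, missing
        # bank or country information) may deserve a closer look.
        "needs_review": found and len(warnings) > 0,
        "card_type": result.get("CardType"),
        "country": result.get("Country"),
        "warning_count": len(warnings),
    }
```

Because a V2-style response reports “Not Found” in a status field instead of raising an error, the caller can branch on the status rather than wrapping every lookup in error handling.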

If you are interested in testing our BIN Validation API, sign up for a free trial key today!

Is Your Shopping Cart Feeling Abandoned? Data Quality Can Help

Dating experts will tell you that people have more problems committing than ever before. And nowhere is this more evident than in your online shopping cart. According to Barilliance, a vendor of online shopping cart optimization tools, over three-quarters of online shopping carts were abandoned in 2016, with specific figures ranging from 73% on desktops to over 85% on mobile phones.

Cart abandonment sounds like a term straight out of family therapy, but in reality it provides an important window on consumer behavior. Some factors for bailing out on a purchase may be unavoidable – for example, customers may window-shop on their phones to purchase something later, or become reluctant to purchase when they see high shipping charges or additional fees. But other factors are within your control, and these often revolve around data quality issues.

Here are some of the big ones:

Too much data entry. Your customer sees 20 ‘required’ fields to complete at checkout and abandons the cart because of too much ‘form friction’. For greater conversion, reduce friction wherever possible to promote a fast and accurate checkout process. Autocomplete tools can help, and are generally considered accurate because they are based on the individual’s own contact information. Address suggestors, on the other hand, should be used with caution: they can present the user with multiple address matches close to their own, which significantly increases the risk of accidentally selecting an incorrect but real address. That in turn can cause confusion when credit card authorization fails due to a mismatched address, further increasing cart abandonment. Regardless of the tool, address validation should always take place after the customer uses autocomplete and/or an address suggestor, to reduce the risk that a wrong – but valid and deliverable – address gets used.

Computer literacy. Often your richest target markets struggle the most with ordering things online – and too often, throw up their hands if there are too many hardships to placing an order. This means that cart recovery often revolves around being able to reach out to a customer and help them complete the order.

By using phone validation and email validation tools, you can help ensure correct contact data is captured in the event that you need to call or email customers about incomplete orders, and hopefully convert some of these into completed ones. These contacts are generally very effective: for example, Business Insider cites figures from marketing automation firm Listrak showing that 40% of follow-up cart recovery emails are opened if sent within three hours.

Payment information. When people pay by credit card online, they are usually entering 16-20 digits, and typos and bad information can quickly kill valid orders. A Luhn check, a simple checksum formula that can run in real time to distinguish valid numbers from mistyped or otherwise incorrect ones, can help ensure that the credit card number entered at least meets the basic criteria. For numbers that pass the Luhn check, you can also validate the bank identification number (BIN) to confirm that the card was legitimately issued by a financial institution, even before trying to process the actual charge. This provides the opportunity to engage the customer at the time of entry and allow for corrections. As a bonus, BIN validation also helps screen out fraudulent payment information before you process the order and/or ship.
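For the curious, here is roughly what a Luhn check looks like in code, along with pulling out the first six digits as the BIN. It is a minimal sketch in Python; the function names `luhn_valid` and `extract_bin` are made up for this example, and passing the check only means the number is well formed, not that the card exists.

```python
def luhn_valid(card_number: str) -> bool:
    """Return True if the number passes the Luhn checksum."""
    cleaned = card_number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not cleaned.isdigit():
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, ch in enumerate(reversed(cleaned)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return total % 10 == 0


def extract_bin(card_number: str) -> str:
    """Return the first six digits, which this article refers to as the BIN."""
    digits = "".join(ch for ch in card_number if ch.isdigit())
    return digits[:6]
```

Running the checksum in the browser or at the API edge lets you prompt the customer to fix a typo right away, and only numbers that pass need to go on to a BIN lookup.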

Keep It Simple. The design, layout and even language used for your cart make a difference too. Kissmetrics notes that buyers can be turned off by faux pas ranging from bad design to forced account creation to an overly complicated process. A simple, clean step-by-step guide can provide confidence for your shopper and increase your conversion rates as well. When there is an error, do not overlook the power of strong and informative error messaging. For example, if email validation returns a specific error, let the customer know the precise nature of the error and provide suggestions on how to fix it. A generic ‘error’ message is not enough.

Finally, there is one kind of cart that always should be left behind: people who are trying to place fraudulent orders. You can use bundled tools such as lead and order validation to perform real-time multi-point contact validation on US, Canadian and International leads, comparing data such as name, company, address, phone, email and device against hundreds of authoritative data sources. The results provide both an individual quality score for each data point and a composite quality score (0-100), to ensure that you are working with genuine and accurate leads.

Online order entry truly is a bit like dating. We can’t make everyone fall in love with us, or guarantee that they will make it all the way to the altar. But with the right kinds of tools, including building in data quality safeguards at the API level, we can boost our chances of success substantially. And that is something every online merchant can be in love with.

The Importance of Data Quality for International Ecommerce

In today’s era of online ecommerce, international sales represent a huge potential market for US vendors. According to research firm eMarketer, international sales represent three-quarters of a nearly US $2 trillion retail ecommerce market, nearly half of which comes from China alone. And much of this vast market is only a click away.

On the other hand, cross-border sales remain one of the greatest risks for fraud, with a rate that was more than twice that of domestic fraud through 2012, and despite recent improvements in data quality technology this rate is still 28% higher as of 2015. And one digital commerce site notes that while retailers are making progress at managing fraudulent transaction rates, they are doing so at the expense of turning away good customers – people who, in turn, may never patronize these sites again.

So how do you exploit a rich and growing potential market while mitigating your risk for fraud? The answer might surprise you. While nearly everyone preaches the importance of a fraud protection strategy for ecommerce, and suggestions abound in areas that range from credit card verification to IP geolocation, the head of ecommerce at industry giant LexisNexis points to one area above all: address verification.

In a recent interview with Multichannel Merchant, LexisNexis ecommerce chief Aaron Press points out that the biggest problem with international addresses is a lack of addressing standards between countries. “Postal codes have different formats, where you put the number, how the street is formatted. Normalizing all of that down to a set of parameters that can be published on an API is a huge challenge.”

This means that you need robust capabilities in any third-party solution that you choose to help verify international addresses. Some of the key things to look for include:

  • How many countries does the vendor support address formats for, and does this list include all of the countries where you do business?
  • Can the application handle multiple or nested municipality formats? For example, a customer may list the same location in Brazil correctly as Rio, Rio de Janeiro, Município do Rio de Janeiro – or even the sub-municipality of Guanabara Bay.
  • Will the application handle different spellings or translations for common areas? In the address above, for example, the country may be spelled as Brazil or Brasil. Likewise, the United Kingdom may also be referred to as England, British Isles, Karalyste, Birtaniya, United Kingdom of Great Britain and Northern Ireland, or even 英国 (Chinese for the United Kingdom, literally “England Kingdom”).
  • Can these capabilities be implemented as an API within your ordering application? Or can the vendor process addresses externally through batch processing?
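To give a feel for the spelling and translation problem raised in the checklist above, here is a minimal sketch of alias-based country normalization. The alias table is a tiny hand-built sample using the names mentioned in this post, and `normalize_country` is a hypothetical helper; a real address verification service would rely on far larger reference data. Note that the sketch follows the loose usage described above by mapping “England” and “British Isles” to the United Kingdom’s ISO 3166-1 code.

```python
from typing import Optional

# Illustrative sample only: ISO 3166-1 alpha-2 codes mapped to a few aliases.
COUNTRY_ALIASES = {
    "BR": ["Brazil", "Brasil"],
    "GB": [
        "United Kingdom",
        "England",
        "British Isles",
        "Karalyste",
        "Birtaniya",
        "United Kingdom of Great Britain and Northern Ireland",
        "英国",
    ],
}

# Build a case-insensitive lookup from alias to country code.
_LOOKUP = {
    alias.casefold(): code
    for code, aliases in COUNTRY_ALIASES.items()
    for alias in aliases
}


def normalize_country(name: str) -> Optional[str]:
    """Map a free-text country name to a country code, or None if unknown."""
    key = " ".join(name.split()).casefold()  # collapse whitespace, ignore case
    return _LOOKUP.get(key)
```

Returning `None` for unknown names, rather than guessing, lets the order flow ask the customer to pick a country instead of shipping to the wrong one.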

In general, cross-border fraud prevention requires a multi-pronged effort involving all of the potential stress points in an international transaction, including international address verification, email validation, credit card BIN validation, IP address verification – even name validation, so you can flag orders addressed to Vladimir Putin or Homer Simpson. These are clearly capabilities that you outsource to a vendor, unless you happen to be sitting on hundreds of millions of global addresses and their country-specific formats. The good news is that in an era of inexpensive cloud-based applications, strong fraud protection is easily implemented nowadays as part of your normal order processing strategy.

How secure is your ‘Data at Rest’?

In a world where millions of customer and contact records are commonly stolen, how do you keep your data safe?  First, lock the door to your office.  Now you’re good, right?  Oh wait, you are still connected to the internet. Disconnect from the internet.  Now you’re good, right?  What if someone sneaks into the office and accesses your computer?  Unplug your computer completely.  You know what, while you are at it, pack your computer into some plain boxes to disguise it.   Oh wait, this is crazy, not very practical and only somewhat secure.

The point is, as we try to determine what kind of security we need, we also have to find a balance between functionality and security.  A lot of this depends on the type of data we are trying to protect.  Is it financial, healthcare, government related, or is it personal, like pictures from the last family camping trip.  All of these will have different requirements and many of them are our clients’ requirements. As a company dealing with such diverse clientele, Service Objects needs to be ready to handle data and keep it as secure as possible, in all the different states that digital data can exist.

So what are the states that digital data can exist in?  There are a number of states, and understanding them is an important part of determining a data security strategy.  For the most part, data exists in three states: Data in Motion/Transit, Data at Rest/Endpoint, and Data in Use. They are defined as follows:

Data in Motion/transit

“…meaning it moves through the network to the outside world via email, instant messaging, peer-to-peer (P2P), FTP, or other communication mechanisms.” – http://csrc.nist.gov/groups/SNS/rbac/documents/data-loss.pdf

Data at Rest/Endpoint

“data at rest, meaning it resides in files systems, distributed desktops and large centralized data stores, databases, or other storage centers” – http://csrc.nist.gov/groups/SNS/rbac/documents/data-loss.pdf

“data at the endpoint, meaning it resides at network endpoints such as laptops, USB devices, external drives, CD/DVDs, archived tapes, MP3 players, iPhones, or other highly mobile devices” – http://csrc.nist.gov/groups/SNS/rbac/documents/data-loss.pdf

Data in Use

“Data in use is an information technology term referring to active data which is stored in a non-persistent digital state typically in computer random access memory (RAM), CPU caches, or CPU registers. Data in use is used as a complement to the terms data in transit and data at rest which together define the three states of digital data.” – https://en.wikipedia.org/wiki/Data_in_use

How Service Objects balances functionality and security with respect to our clients’ data, which is at rest in our automated batch processing, is the focus of this discussion.  Our automated batch process consists of this basic flow:

  • Our client transfers a file to a file structure in our systems using our secure ftp. [This is an example of Data in Motion/Transit]
  • The file waits momentarily before an automated process picks it up. [This is an example of Data at Rest]
  • Once our system detects a new file, it does the following: [The data is now in the state of Data in Use]
    • It opens and processes the file.
    • The results are written into an output file and saved to our secure ftp location.
  • Input and output files remain in the secure ftp location until client retrieves them. [Data at Rest]
  • Client retrieves the output file. [Data in Motion/Transit]
    • Client can immediately choose to delete all, some or no files as per their needs.
  • Five days after processing, if any files still exist, the automated system encrypts the files (minimum 256-bit encryption) and moves them off of the secure ftp to another secure location. Any non-encrypted version is no longer present.  [Data at Rest and Data in Motion/Transit]
    • This delay gives clients time to retrieve the results.
  • 30 days after processing, the encrypted version is completely purged.
    • This provides a last chance, in the event of an error or emergency, to retrieve the data.
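The five-day and 30-day rules in the flow above boil down to a decision based on how long ago a file was processed. The sketch below illustrates that decision only; the function name, the action labels and the default thresholds as parameters are assumptions for this example and do not describe our actual implementation.

```python
from datetime import datetime, timedelta

# Default grace periods taken from the flow above.
ENCRYPT_AFTER = timedelta(days=5)
PURGE_AFTER = timedelta(days=30)


def retention_action(
    processed_at: datetime,
    now: datetime,
    encrypt_after: timedelta = ENCRYPT_AFTER,
    purge_after: timedelta = PURGE_AFTER,
) -> str:
    """Decide what should happen to a processed file of a given age."""
    age = now - processed_at
    if age >= purge_after:
        return "purge"             # encrypted copy is removed for good
    if age >= encrypt_after:
        return "encrypt_and_move"  # plain copy leaves the secure ftp
    return "keep"                  # still inside the retrieval window
```

Passing the two thresholds as parameters keeps the rule easy to adjust when a client needs a shorter or longer grace period.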

We encrypt files five days after processing, but what is the strategy for keeping the files secure prior to the five-day expiration?  First off, we determined that the five- and 30-day rules were the best balance between functionality and security. But we also added flexibility to this.

If clients always picked up their files right when they were completed, we really wouldn’t need to think too much about security as the files sat on the secure ftp.  But this is real life: people get busy, have long weekends, go on vacation, or simply forget. Whatever the reason, Service Objects couldn’t immediately encrypt and move the data.  If we did, clients would become frustrated trying to coordinate the retrieval of their data.  So we built in the five- and 30-day rules, but we also added the ability to change these grace periods and customize them to our clients’ needs.  This doesn’t prevent anyone from purging their data sooner than any predefined thresholds, and in fact we encourage it.

When we are setting up the automated batch process for a client, we look at the type of data coming in, and if appropriate, we suggest to the client that they may want to send the file to us encrypted. For many companies this is standard practice.  Whenever we see any data that could be deemed sensitive, we let our client know.

When it is established that files need to be encrypted at rest, we use industry standard encryption/decryption methods.  When a file comes in and processing begins, the data is now in use, so the file is decrypted.  After processing, any decrypted file is purged and what remains is the encrypted version of the input and output files.

Not all clients are concerned about or require this level of security, but Service Objects treats all data the same, with the utmost care and the highest levels of security reasonable.  We simply take no chances and always encourage strong data security.

Launching a New Ecommerce Site? Don’t Forget Data Quality Tools

Online commerce is huge nowadays – to the tune of over $400 billion a year in the United States alone in 2017, at a growth rate up to three times that of retail in general. Barriers to entry are lower than ever, ecommerce platforms have become simpler to use and less expensive than ever, and the convenience of ecommerce has grown to encompass businesses of every size. Above all, purchasing goods online has become ubiquitous among today’s consumers.

Whether you are looking to launch a simple shopping cart using platforms like WordPress’ WooCommerce, Shopify or Magento, or an enterprise solution like Microsoft’s Commerce Server or IBM’s WebSphere Commerce, it can still be a minefield for the uninitiated. Here are some of the risks that every online seller takes every day:

Fraud. Filling orders from fraudulent sources costs you both revenue and time – and according to Javelin Research, identity fraud alone totals over $18 billion per year in the US. And the bad guys particularly love to target novice sellers.

Fulfillment. Every online order starts a chain of activities – from billing to shipment – that depend on the quality of your contact data. Credit card processing often requires accurate address data, and one misdirected shipment can wipe out the profit margin of many other sales – not to mention the reputational damage it can do.

Marketing. According to the Harvard Business Review, the cost of acquiring a new customer ranges from 5 to 25 times the cost of selling to an existing customer. This means that your contact database is the key to follow-on sales, brand awareness and long-term profitability. Which also means that bad contact data – and the rate at which this contact data decays – cuts straight to your bottom line.

Tax issues. Did you know that tax rates can vary from one side of a street to the other? Or that some states have passed or are considering an “internet tax” on out-of-state sellers? Tax compliance, and avoiding the penalties that come with incorrect sales tax rates, is a fact of life for any online business.

The common denominator between each of these issues? Data quality. And thankfully, these problems can all be mitigated inexpensively nowadays, with tools that fit right in with your current contact management strategy. Some of the solutions available today from Service Objects include:

  • A suite of tools for fraud prevention, including address, email and telephone verification, lead validation that scores prospects on a scale of 0-100, credit card validation, and IP address validation – so you know when an order for a customer in Utah is placed from Uzbekistan.
  • Shipping address validation tools that verify addresses against up-to-date real-time data from the USPS and Canada Post, to make sure your products go to the right place every time.
  • Email verification capabilities that perform over 50 tests, including auto-correcting common domain errors and yielding an overall quality score – improving your marketing effectiveness AND preventing your mail servers from being blacklisted.
  • Real-time tax rate assessment that validates your addresses, and then provides accurate sales and use tax rates at any jurisdictional level.
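As one small illustration of the checks listed above, here is a sketch of what auto-correcting common domain errors in an email address could look like. The correction table is a tiny made-up sample and `suggest_email_fix` is a hypothetical helper; it does not represent the 50-plus tests performed by the actual email verification service.

```python
from typing import Optional

# Illustrative sample of common domain typos and their likely corrections.
COMMON_DOMAIN_FIXES = {
    "gmial.com": "gmail.com",
    "gmai.com": "gmail.com",
    "hotmial.com": "hotmail.com",
    "yahooo.com": "yahoo.com",
}


def suggest_email_fix(email: str) -> Optional[str]:
    """Return a corrected address if the domain is a known typo, else None."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        return None  # not shaped like an email address at all
    fixed = COMMON_DOMAIN_FIXES.get(domain.lower())
    return f"{local}@{fixed}" if fixed else None
```

A suggestion like this is best shown to the customer as “Did you mean…?” rather than applied silently, since the person typing is the only one who knows the right address.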

Each of these capabilities is available in several convenient formats, ranging from APIs for your applications to batch processing of contact lists. Whichever form you choose, automated data tools can quickly make the most common problems of online commerce a thing of the past.

The Path to Data Quality Excellence

“In the era of big data and software as a service, we are witnessing a major industry transformation. In order to stay competitive, businesses have reduced the time it takes to deploy a new application from months to minutes.” – Geoff Grow, Founder and CEO, Service Objects

The big data revolution has ushered in a major change in the way we develop software, with applications webified and big data tools woven in. Until recently, data quality tools that ensure data is genuine have not kept pace. As a result, developers have had little choice but to leave data validation out of their applications.

In this video, Geoff will show you why data validation is critical to reducing waste, identifying fraud, and maximizing operation efficiency – and how on-demand tools are the best way to ensure that this data is genuine, accurate, and up-to-date. If you develop applications with IP connectivity, watch this video and discover what 2,400 other organizations have learned about building data quality right into their software.

Looking Beyond Simple Blacklists to Identify Malicious IP Addresses

Using a blacklist to block malicious users and bots that would cause you aggravation and harm is one of the oldest and most common methods around (according to Wikipedia, the first DNS-based blacklist was introduced in 1997).

There are various types of blacklists available. Blacklists exist for IP addresses, domains, email addresses and user names. The majority of the time these lists concentrate on identifying known spammers. Other lists serve a more specific purpose, such as IP lists that help identify known proxies, Tor nodes and VPNs, email lists of known honeypots, or lists of disposable domains.

There are many different types of malicious activity that occur on the internet and there are various types of lists out there to help identify and prevent it; however, there are also various problems with lists.

The problem with Lists:

Before a list can identify a malicious activity, that activity must first occur and then be reported and propagated. It is not uncommon for the malicious activity to have stopped by the time it has been reported and propagated. Not all malicious activities are reported, either, and if you encounter one before it is reported then you won’t be able to preemptively act on it.

IPs, Domains, Email Addresses and Usernames are dynamic and disposable. If a malicious user/bot gets blocked then they can easily switch to a different IP, domain etc.

Some lists offer warnings that blocking an IP address could affect thousands of users who depend on it in order to obtain crucial information that they would otherwise not have access to. So block responsibly.

Aggregating data to more effectively identify malicious activity:

Instead of looking at one list to perform a simple straightforward lookup, we can take advantage of multiple datasets to uncover patterns and relationships between seemingly disparate values. A simple example would be, relating user names to email addresses, email addresses to domains and domains to IP addresses, which allows us to view the activity of one value and compare it to behavior of other values. Using complex algorithms with machine learning to process large samples of data we can intelligently discern if a value is directly or indirectly related to a malicious activity.
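To make the idea of relating values concrete, here is a deliberately simple sketch that links user names, email addresses, domains and IPs in a graph and measures how many hops separate a value from something already known to be malicious. It is a plain breadth-first search, not the machine learning approach mentioned above, and the function name, the `type:value` labels and the sample data are all invented for illustration.

```python
from collections import defaultdict, deque
from typing import Iterable, Optional, Set, Tuple


def related_to_malicious(
    links: Iterable[Tuple[str, str]],
    known_bad: Set[str],
    start: str,
    max_hops: int = 2,
) -> Optional[int]:
    """Return the hop distance from start to the nearest known-bad value.

    0 means start itself is known bad, a positive number means it is
    indirectly related, and None means nothing was found within max_hops.
    """
    graph = defaultdict(set)
    for a, b in links:  # relationships are treated as undirected
        graph[a].add(b)
        graph[b].add(a)

    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if node in known_bad:
            return hops
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None
```

A direct hit and a relationship a few hops away are naturally treated differently, which is the intuition behind separating “malicious” from “potentially malicious”.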

How Service Objects keeps it simple for the user:

The DOTS IP Address Validation service currently has two flags to help its users deal with malicious IPs: ‘MaliciousIP’ and ‘PotentiallyMaliciousIP’. The ‘MaliciousIP’ flag indicates that the IP address recently displayed malicious activity and should be treated as such. The ‘PotentiallyMaliciousIP’ flag indicates that the IP address recently displayed one or more strong relationships to a malicious activity and that it has a high likelihood of being malicious. Both flags should be treated as warnings, with the ‘MaliciousIP’ flag being scrutinized more severely.

The warning signs of online fraud are out there, but you need a means of discovering them. Our IP Validation service encompasses many of the identification strategies necessary to make split-second decisions on would-be attackers before any harm is done.
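In an application, the two flags might be turned into an action along these lines. The flag names come from the service as described above, but the dictionary shape, the function name `ip_risk_action` and the action labels are assumptions for the sake of a short example.

```python
def ip_risk_action(flags: dict) -> str:
    """Map the two IP flags to a suggested handling level."""
    if flags.get("MaliciousIP"):
        # Recently displayed malicious activity: scrutinize most severely.
        return "high_priority_review"
    if flags.get("PotentiallyMaliciousIP"):
        # Strongly related to malicious activity: still a warning.
        return "review"
    return "allow"
```

Checking ‘MaliciousIP’ first means that an address carrying both flags always gets the stricter treatment.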

Fighting Fraud with Big Data

Fraud comes in many forms, whether through misrepresentation, concealment or intent to deceive. Traditional methods of identifying and fighting fraud have relied on data analysis to detect anomalies which signal that a fraud event has taken place. Detecting anomalies falls into two categories: known and unknown.

Known Fraud Schemes

Known fraud schemes can be easy to identify. They have been committed in the past and thus recognizably fit a pattern. Common known fraud schemes over the web include purchase fraud, internet marketing, and retail fraud. Methods to identify patterns for these types of fraud include tracking user activity, location, and behavior. One example for tracking location might be through IP, determining whether a user is concealing their identity, or is executing a transaction from a high-risk international location. A correlation can be made based on location if it is determined to be High Risk. Another case for location tracking is a physical address. In the past, fraudsters have used unoccupied addresses to accept delivered goods purchased through online and retail stores. Identifying an unoccupied address through DOTS Address Validation DPV notes provides real-time notification of vacant addresses which can be considered a red flag.

Identifying the Unknown

Unknown fraud schemes, on the other hand, are much more difficult to identify. They do not fall into known patterns, which makes detection more challenging. This is starting to change with the paradigm shift from reactive to proactive fraud detection made possible through Big Data technologies. With Big Data, the viewpoint becomes much larger: every individual event is analyzed, rather than sampling random events in an attempt to identify an anomaly.

So What is Big Data?

Big Data is generally defined as datasets that are too large or too complex for traditional data processing applications to handle. Big Data can be described by the following characteristics: Volume, Variety, Velocity, Variability, and Veracity.

Volume: The quantity of generated and stored data.

Variety: The type and nature of the data.

Velocity: The speed at which data is generated and processed.

Variability: Inconsistency of the data set.

Veracity: The quality of the captured data, which can vary.

Tackling Big Data

With the advent of distributed computing tools such as Hadoop, wrangling these datasets into manageable workloads has become a reality. Spreading the workload across a cluster of nodes provides the throughput and storage space necessary to process such large datasets within an acceptable timeframe. Cloud hosting providers such as Amazon provide an affordable means to provision an already configured cluster, perform data processing tasks, and immediately shut down, reducing infrastructure costs and leveraging the vast hardware resources available through Amazon’s network.

Service Objects Joins the Fight

More recently, Service Objects has been employing Big Data techniques to mine through datasets in the hundreds of terabytes range, collecting information and analyzing results to improve fraud detection in our various services. This ambitious project will provide an industry leading advantage in the sheer amount of data collected, validating identity, location and a host of attributes for businesses. Stay tuned for more updates about this exciting project.

Service Objects is the industry leader in real-time contact validation services.

Service Objects has verified over 2.8 billion contact records for clients from various industries including retail, technology, government, communications, leisure, utilities, and finance. Since 2001, thousands of businesses and developers have used our APIs to validate transactions to reduce fraud, increase conversions, and enhance incoming leads, Web orders, and customer lists.