Service Objects’ Blog

Thoughts on Data Quality and Contact Validation

Posts Tagged ‘Big Data’

Unique US and Canadian Zip Code Files – Now Available for Download

Customer Service Above All.  It is one of our core values here at Service Objects.  Recently, we’ve received several requests for a list of the unique zip codes throughout the US and Canada.  By leveraging our existing services, we’ve made this happen.  We are now offering both the US and Canada list as a free downloadable resource.

So why is Service Objects providing this data? Our goal is to provide the best data cleansing solutions possible for our clients. Part of this means using our existing data to provide our users with the data they need. While other data providers might charge for this type of information, we’ve decided to make it freely available for anyone’s use. These files can be used for several purposes, such as pre-populating a list of cities and states for a form where a user needs to enter address information. The County and State FIPS information is widely used in census and demographic data or could be used to uniquely identify States and counties within a database.  Additionally, the given time zone information can be used to determine appropriate times to place calls to a customer.

Where to Download

This link allows you to access a .zip file containing two CSV files.  One contains the US information, the other Canada's.  The file names indicate the month and year the records were created. Toward the middle of each month, the data in each file is updated to account for any changes in US and Canadian postal codes.

What Other Information is in the Files?

Both files include postal codes, states (or provinces for Canada) and time zone information.  The Canadian file is much larger, with over 800K records. This is because Canadian postal codes generally cover much smaller areas than US postal codes. Where a US postal code can sometimes encompass multiple cities or counties, a Canadian postal code can be the size of a couple of city blocks or, in some cases, a single high-rise building.

The US file has information for all United States postal codes including its territories. This file will also include the county that the zip code lies in. There will be County and State FIPS numbers for each of the records to help with processing that information as well.  The US file will be considerably smaller than the Canadian file at only 41K records.
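As a quick illustration, the US file could be indexed by postal code to pre-populate city and state fields on a form. The Python sketch below is purely illustrative: the column names and sample rows are assumptions, so check the actual CSV header after extracting the download.

```python
import csv
import io

# Hypothetical sample rows; the real column names in the downloaded
# file may differ -- inspect the CSV header after unzipping.
SAMPLE = """ZipCode,City,State,County,StateFIPS,CountyFIPS,TimeZone
93101,Santa Barbara,CA,Santa Barbara,06,083,Pacific
10001,New York,NY,New York,36,061,Eastern
"""

def build_zip_lookup(csv_file):
    """Index the file by postal code for O(1) form pre-population."""
    reader = csv.DictReader(csv_file)
    return {row["ZipCode"]: row for row in reader}

lookup = build_zip_lookup(io.StringIO(SAMPLE))

def prefill(zip_code):
    """Return the (city, state) pair to pre-populate on an address form."""
    row = lookup.get(zip_code)
    if row is None:
        return None
    return row["City"], row["State"]

print(prefill("93101"))  # ('Santa Barbara', 'CA')
```

The same lookup row also carries the FIPS and time zone columns, so one pass over the file covers all three use cases described above.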

In making these files freely accessible, our hope is to make the integration and business logic easier for our users. If you’d like to discuss your particular contact data validation needs, feel free to contact us!

Don’t Let Bad Data Scare You This Halloween

Most of us here in North America grew up trick-or-treating on Halloween. But did you know the history behind this day?

In early Celtic culture, the feast of All Hallows Eve (or Allhallowe’en) was a time of remembering the souls of the dead – and at a more practical level, preparing for the “death” of the harvest season and the winter to follow. People wore costumes representing the deceased, who by legend were back on earth to have a party or (depending upon cultural interpretation) cause trouble for one last night, and people gave them alms in the form of soul cakes – which evolved to today’s sweet treats – to sustain them.

So what were people preparing for in celebrating Halloween? Good data quality, of course. Back then, when your “data” consisted of the food you grew, people took precautions to protect it from bad things by taking the preventative measure of feeding the dead. Today, Halloween is a fun celebration that actually has some important parallels for managing your data assets. Here are just a few:

An automated process. The traditions of Halloween let people honor the dead and prepare for the harvest in a predictable, dependable way. Likewise, data quality ultimately revolves around automated tools that take the work – and risk – out of creating a smooth flow of business information.

Organizational buy-in. Unlike many other holidays, Halloween was a community celebration fueled by the collective efforts of everyone. Every household took part in providing alms and protecting the harvest. In much the same way, modern data governance efforts make sure that all of the touch points for your data – when it is entered, and when it is used – follow procedures to ensure clean, error free leads, contacts and e-commerce information.

Threat awareness. Halloween was designed to warn people away from the bad guys – for example, the bright glow of a Jack-o-lantern was meant to keep people away from the spirit trapped inside. Today, data quality tools like order and credit card BIN validation keep your business away from the modern-day ghouls that perpetrate fraud.

An ounce of prevention. This is the big one. Halloween represented a small offering to the dead designed to prevent greater harm. When it comes to your data, prevention is dramatically more cost-effective than dealing with the after-effects of bad data: this is an example of the 1-10-100 rule, where you can spend $1 preventing data problems, $10 correcting them, or $100 dealing with the consequences of leaving them unchecked.

These costs range from the unwanted marketing costs of bad or fraudulent leads to the cost in lost products, market share and customer good will when you ship things to the wrong address. And this doesn’t even count some of the potentially big costs for compliance violations, such as the Telephone Consumer Protection Act (TCPA) for outbound telemarketing, the CAN-SPAM act for email marketing, sales and use tax mistakes, and more.

So now you know: once upon a time, people mitigated threats to their data by handing out baked goods to people in costumes. Now they simply call Service Objects, to implement low-cost solutions to “treat” their data with API-based and batch-process solutions. And just like Halloween, if you knock on our door we’ll give you a sample of any of our products for free! For smart data managers, it’s just the trick.

The Talent Gap In Data Analytics

According to a recent blog by Villanova University, the amount of data generated annually has grown tremendously over the last two decades due to increased web connectivity, as well as the ever-growing popularity of internet-enabled mobile devices. Some organizations have found it difficult to take advantage of the data at their disposal due to a shortage of data-analytics experts. Small-to-medium businesses (SMBs), who struggle to match the salaries offered by larger businesses, are the most affected. This shortage of qualified and experienced professionals is creating a unique opportunity for those looking to break into a data-analysis career.

Below is some more information on this topic.

Data-Analytics Career Outlook

Job openings for computer and research scientists are expected to grow by 11 percent from 2014 to 2024. In comparison, job openings for all occupations are projected to grow by 7 percent over the same period. Besides this, 82 percent of organizations in the US say that they are planning to advertise positions that require data-analytics expertise. This is in addition to 72 percent of organizations that have already hired talent to fill open analytics positions in the last year. However, up to 78 percent of businesses say they have experienced challenges filling open data-analytics positions over the last 12 months.

Data-Analytics Skills

The skills that data scientists require vary depending on the nature of data to be analyzed as well as the scale and scope of analytical work. Nevertheless, analytics experts require a wide range of skills to excel. For starters, data scientists say they spend up to 60 percent of their time cleaning and aggregating data. This is necessary because most of the data that organizations collect is unstructured and comes from diverse sources. Making sense of such data is challenging, because the majority of modern databases and data-analytics tools only support structured data. Besides this, data scientists spend at least 19 percent of their time collecting data sets from different sources.

Common Job Responsibilities

To start with, 69 percent of data scientists perform exploratory data-analytics tasks, which in turn form the basis for more in-depth querying. Moreover, 61 percent perform analytics with the aim of answering specific questions, 58 percent are expected to deliver actionable insights to decision-makers, and 53 percent undertake data cleaning. Additionally, 49 percent are tasked with creating data visualizations, 47 percent leverage data wrangling to identify problems that can be resolved via data-driven processes, and 43 percent perform feature extraction, while 43 percent have the responsibility of developing data-based prototype models.

In-demand Programming-Language Skills

In-depth understanding of SQL is a key requirement cited in 56 percent of job listings for data scientists. Other leading programming-language skills include Hadoop (49 percent of job listings), Python (39 percent), Java (36 percent), and R (32 percent).

The Big-Data Revolution

The big-data revolution witnessed in the last few years has substantially changed the way businesses operate. In fact, 78 percent of corporate organizations believe big data is likely to fundamentally change their operational style over the next three years, while 71 percent of businesses expect the same resource to spawn new revenue opportunities. However, only 58 percent of executives believe that their employer has the capability to leverage the power of big data. Nevertheless, 53 percent of companies are planning to roll out data-driven initiatives in the next 12 months.

Recruiting Trends

Companies across all industries are facing a serious shortage of experienced data scientists, which means they risk losing business opportunities to firms that have found the right talent. Common responsibilities among these professionals include developing data visualizations, collecting data, cleaning and aggregating unstructured data, and delivering actionable insights to decision-makers. Leading employers include the financial services, marketing, corporate and technology industries.

View the full infographic created by Villanova University’s Online Master of Science in Analytics degree program.

http://taxandbusinessonline.villanova.edu/resources-business/infographic-business/the-talent-gap-in-data-analytics.html

Reprinted with permission.

What Can We Do? Service Objects Responds to Hurricane Harvey

The Service Objects’ team watched the steady stream of images from Hurricane Harvey and its aftermath and we wanted to know, ‘What can we do to help?’  We realized the best thing we could do is offer our expertise and services free to those who can make the most use of them – the emergency management agencies dedicated to helping those affected by this disaster.

We quickly realized that, as Hurricane Harvey continues to cause record floodwaters and entire neighborhoods are under water, these agencies are finding it nearly impossible to locate specific addresses in need of critical assistance. In response, we are offering emergency management groups the ability to quickly pinpoint addresses with latitude and longitude coordinates through unlimited, no-cost access to DOTS Address Geocode℠ (AG-US). By using Address Geocode, agencies will not have to rely on potentially incomplete online maps. Instead, using Service Objects' advanced address mapping services, they will be able to reliably identify specific latitude and longitude coordinates in real time and serve those in need.

“The fallout of the catastrophic floods in Texas is beyond description, and over one million locations in Houston alone have been affected,” said Geoff Grow, CEO and Founder of Service Objects.  “With more than 450,000 people likely to seek federal aid in recovering from this disaster, Service Objects is providing advanced address mapping to help emergency management agencies distribute recovery funds as quickly as possible. We are committed to helping those affected by Hurricane Harvey.”

In addition, as disaster relief efforts are getting underway, Service Objects will provide free access to our address validation products to enable emergency management agencies to quickly distribute recovery funds by address type, geoid, county, census-block and census-tract. These data points are required by the federal government to release funding.  This will allow those starting the recovery process from this natural disaster to get next-level services as soon as possible.

To get access to Service Objects address solutions or request maps, qualified agencies can contact Service Objects directly by calling 805-963-1700 or by emailing us at info@serviceobjects.com.

Our team wishes the best for all those affected by Hurricane Harvey.

Image by National Weather Service 

How Millennials Will Impact Your Data Quality Strategy

The so-called Millennial generation now represents the single largest population group in the United States. If they don’t already, they will soon represent your largest base of customers, and a majority of the work force. What does that mean for the rest of us?

It doesn’t necessarily mean that you have to start playing Adele on your hold music, or offering free-range organic lattes in the company cafeteria. What it does mean, according to numerous social observers, is that expectations of quality are changing radically.

The Baby Boomer generation, now dethroned as the largest population group, grew up in a world of amazing technological and social change – but also a world where wrong numbers and shoddy products were an annoying but inevitable part of life. Generation X and Y never completely escaped this either:  ask anyone who ever drove a Yugo or sat on an airport tarmac for hours. But there is growing evidence that millennials, who came of age in a world where consumer choices are as close as their smartphones, are much more likely to abandon your brand if you don’t deliver.

This demographic change also means you can no longer depend on your father's enterprise data strategy, with its focus on things like security and privacy. For one thing, according to USA Today, millennials couldn't care less about privacy. The generation that grew up oversharing on Instagram and Facebook understands that in a world where information is free, they – and others – are the product. Everyone agrees, however, that what they do care about is access to quality data.

This also extends to how you manage a changing workforce. According to this article, which notes that millennials will make up three quarters of the workforce by 2020, dirty data will become a business liability that can't be trusted for strategic purposes, whether it is being used to address revenues, costs or risk. This makes them much more likely to demand automated strategies for data quality and data governance, and to push to engineer these capabilities into the enterprise.

Here’s our take: more than ever, the next generation of both consumers and employees will expect data to simply work. There will be less tolerance than ever for bad addresses, mis-delivered orders and unwanted telemarketing. And when young professionals are launching a marketing campaign, serving their customers, or rolling out a new technology, working with a database riddled with bad contacts or missing information will feel like having one foot on the accelerator and one foot on the brake.

We are already a couple of steps ahead of the millennials – our focus is on API-based tools that are built right into your applications, linking them in real time to authoritative data sources like the USPS as well as a host of proprietary databases. They help ensure clean data at the point of entry AND at the time of use, for everything from contact data to scoring the quality of a marketing lead. These tools can also fuel their e-commerce capabilities by automating sales and use tax calculations, or ensure regulatory compliance with telephone consumer protection regulations.

In a world where an increasing number of both our customers and employees will have been born in the 21st century, and big data becomes a fact of modern life, change is inevitable in the way we do business. We like this trend, and feel it points the way towards a world where automated data quality finally becomes a reality for most of us.

How secure is your ‘Data at Rest’?

In a world where millions of customer and contact records are commonly stolen, how do you keep your data safe?  First, lock the door to your office.  Now you’re good, right?  Oh wait, you are still connected to the internet. Disconnect from the internet.  Now you’re good, right?  What if someone sneaks into the office and accesses your computer?  Unplug your computer completely.  You know what, while you are at it, pack your computer into some plain boxes to disguise it.   Oh wait, this is crazy, not very practical and only somewhat secure.

The point is, as we try to determine what kind of security we need, we also have to find a balance between functionality and security.  A lot of this depends on the type of data we are trying to protect.  Is it financial, healthcare, or government related, or is it personal, like pictures from the last family camping trip?  All of these have different requirements, and many of them are our clients' requirements. As a company dealing with such diverse clientele, Service Objects needs to be ready to handle data and keep it as secure as possible, in all the different states that digital data can exist.

So what are the states that digital data can exist in?  There are several, and understanding them should inform any data security strategy.  For the most part, data exists in three states: Data in Motion/Transit, Data at Rest/Endpoint and Data in Use, defined as:

Data in Motion/transit

“…meaning it moves through the network to the outside world via email, instant messaging, peer-to-peer (P2P), FTP, or other communication mechanisms.” – http://csrc.nist.gov/groups/SNS/rbac/documents/data-loss.pdf

Data at Rest/Endpoint

“data at rest, meaning it resides in files systems, distributed desktops and large centralized data stores, databases, or other storage centers” – http://csrc.nist.gov/groups/SNS/rbac/documents/data-loss.pdf

“data at the endpoint, meaning it resides at network endpoints such as laptops, USB devices, external drives, CD/DVDs, archived tapes, MP3 players, iPhones, or other highly mobile devices” – http://csrc.nist.gov/groups/SNS/rbac/documents/data-loss.pdf

Data in Use

“Data in use is an information technology term referring to active data which is stored in a non-persistent digital state typically in computer random access memory (RAM), CPU caches, or CPU registers. Data in use is used as a complement to the terms data in transit and data at rest which together define the three states of digital data.” – https://en.wikipedia.org/wiki/Data_in_use

How Service Objects balances functionality and security with respect to our clients’ data, which is at rest in our automated batch processing, is the focus of this discussion.  Our automated batch process consists of this basic flow:

  • Our client transfers a file to a file structure in our systems using our secure ftp. [This is an example of Data in Motion/Transit]
  • The file waits momentarily before an automated process picks it up. [This is an example of Data at Rest]
  • Once our system detects a new file; [The data is now in the state of Data in Use]
    • It opens and processes the file.
    • The results are written into an output file and saved to our secure ftp location.
  • Input and output files remain in the secure ftp location until client retrieves them. [Data at Rest]
  • Client retrieves the output file. [Data in Motion/Transit]
    • Client can immediately choose to delete all, some or no files as per their needs.
  • Five days after processing, if any files remain, the automated system encrypts them (minimum 256-bit encryption) and moves them off the secure ftp to another secure location. No non-encrypted version remains.  [Data at Rest and Data in Motion/Transit]
    • This delay gives clients time to retrieve the results.
  • 30 days after processing, the encrypted version is completely purged.
    • This provides a last chance, in the event of an error or emergency, to retrieve the data.
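
The five- and 30-day grace periods in the flow above boil down to a simple age check that a scheduled job can run against each processed file. The Python sketch below is illustrative of the logic only, not our production code; the function and variable names are assumptions, and the defaults shown are the standard (customizable) thresholds:

```python
from datetime import datetime, timedelta

# Default grace periods from the flow above; both are configurable per client.
ENCRYPT_AFTER = timedelta(days=5)
PURGE_AFTER = timedelta(days=30)

def retention_action(processed_at, now,
                     encrypt_after=ENCRYPT_AFTER, purge_after=PURGE_AFTER):
    """Decide what the scheduled job should do with a processed file.

    Returns 'keep' (still within the pickup window), 'encrypt'
    (move an encrypted copy off the secure ftp), or 'purge'
    (delete the encrypted copy for good).
    """
    age = now - processed_at
    if age >= purge_after:
        return "purge"
    if age >= encrypt_after:
        return "encrypt"
    return "keep"

t0 = datetime(2017, 9, 1)
print(retention_action(t0, t0 + timedelta(days=2)))   # keep
print(retention_action(t0, t0 + timedelta(days=6)))   # encrypt
print(retention_action(t0, t0 + timedelta(days=31)))  # purge
```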

We encrypt files five days after processing, but what keeps the files secure prior to the five-day expiration?  First off, we determined that the five- and 30-day rules were the best balance between functionality and security. But we also added flexibility.

If clients always picked up their files right when they were completed, we really wouldn't need to think much about security as the files sat on the secure ftp.  But this is real life: people get busy, have long weekends, go on vacation, or simply forget.  Whatever the reason, Service Objects can't immediately encrypt and move the data; if we did, clients would become frustrated trying to coordinate the retrieval of their data.  So we built in the five and 30 day rules, and we also added the ability to change these grace periods and customize them to our clients' needs.  This doesn't prevent anyone from purging their data sooner than any predefined threshold, and in fact, we encourage it.

When we are setting up the automated batch process for a client, we look at the type of data coming in, and if appropriate, we suggest to the client that they may want to send the file to us encrypted. For many companies this is standard practice.  Whenever we see any data that could be deemed sensitive, we let our client know.

When it is established that files need to be encrypted at rest, we use industry standard encryption/decryption methods.  When a file comes in and processing begins, the data is now in use, so the file is decrypted.  After processing, any decrypted file is purged and what remains is the encrypted version of the input and output files.

Not all clients are concerned with or require this level of security, but Service Objects treats all data the same: with the utmost care and the highest reasonable level of security.  We simply take no chances and always encourage strong data security.

Big Data – Applied to Day to Day Life

With so much data being constantly collected, it’s easy to get lost in how all of it is applied in our real lives. Let’s take a quick look at a few examples starting with one that most of us encounter daily.

Online Forms
One of the most common and fairly simple to understand instances we come across on a daily basis is completing online forms. When we complete an online form, our contact record data points, like name, email, phone and address, are individually verified and corrected in real time to ensure each piece of data is genuine, accurate and up to date. Not only does this verification process help mitigate fraud for the companies, but it also ensures that the submitted data is correct. The confidence in data accuracy allows for streamlined online purchases and efficient deliveries to us, the customers. Having our accurate information in the company's database also helps streamline customer service should there be a discrepancy with the purchase or we have follow-up questions about the product. The company can easily pull up our information with any of the data points initially provided (name, email, phone, address and more) to start resolving the issue faster than ever (at least where companies are dedicated to good customer service).

For the most part we are all familiar with business scenarios like the one described above. Let’s shift to India & New Orleans for a couple new examples of how cities are applying data to improve the day-to-day lives of citizens.

Addressing the Unaddressed in India
According to the U.S. Census Bureau, India is the second most populated country in the world with 1,281,935,911 people. With such a large population there is a shortage of affordable housing in many developed cities, leading to about 37 million households residing in unofficial housing areas referred to as slums. Being "unofficial" housing areas means they are neither mapped nor addressed, leaving residents with very little in terms of identification. However, the Community Foundation of Ireland (a Dublin-based non-profit organization) and the Hope Foundation recently began working together to provide each home in Kolkata's Chetla slum with its very first form of address, consisting of a nine-digit unique ID. Besides overcoming obvious challenges, like giving someone directions to their home and finally being able to receive mail, the implementation of addresses has given residents the ability to open bank accounts and access social benefits. Having addresses has also helped officials identify needs within a slum, including healthcare and education.

Smoke Detectors in New Orleans
A recent article, The Rise of the Smart City, from The Wall Street Journal highlights how cities closer to home have started using data to bring about city-wide enhancements. New Orleans, in particular, is ensuring that high-risk properties are provided smoke detectors. Although the fire department has been distributing smoke detectors for years, residents were required to request them. To change this, the city's Office of Performance and Accountability used Census Bureau surveys and other data, along with advanced machine-learning techniques, to create a map for the fire department that better targets areas more susceptible to deaths caused by fire. With the application of big data, more homes are being supplied with smoke detectors, increasing safety for entire neighborhoods and the city as a whole.

FIRE RISK | By combining census with additional data points, New Orleans mapped the combined risk of missing smoke alarms and fire deaths, helping officials target distribution of smoke detectors. PHOTO: CITY OF NEW ORLEANS/OPA

While these are merely a few examples of how data is applied to our day to day lives around the world, I hope they helped make “Big Data” a bit more relatable. Let us know if we can answer any questions about how data solutions can be applied to help your company as well.

Celebrating Earth Day

April 22 marks the annual celebration of Earth Day, a day of environmental awareness that is now approaching its first half century. Founded by US Senator Gaylord Nelson in 1970 as a nationwide teach-in on the environment, Earth Day is now the largest secular observance in the world, celebrated by over a billion people.

Earth Day has a special meaning here in our hometown of Santa Barbara, California. It was a massive 1969 oil spill off our coast that first led Senator Nelson to propose a day of public awareness and political action. Both were sorely needed back then: the first Earth Day came at a time when there was no US Environmental Protection Agency, environmental groups such as Greenpeace and the Natural Resources Defense Council were in their infancy, and pollution was simply a fact of life for many people.

If you visit our hometown today, you will find the spirit of Earth Day to be alive and well. We love our beaches and the outdoors, this area boasts over 50 local environmental organizations, and our city recently approved a master plan for bicycles that recognizes the importance of clean human-powered transportation. And in general, the level of environmental and conservation awareness here is part of the culture of this beautiful place.

Earth Day

It also has a special meaning for us here at Service Objects. Our founder and CEO Geoff Grow, an ardent environmentalist, started this company from an explicit desire to apply mathematics to the problem of wasted resources from incorrect and duplicate mailings. Today, our concern for the environment is codified as one of the company’s four core values, which reads as follows:

“Corporate Conservation – In addition to preventing about 300 tons of paper from landing in landfills each month with our Address Validation APIs, we practice what we preach: we recycle, use highly efficient virtualized servers, and use sustainable office supplies. Every employee is conscious of how they can positively impact our conservation efforts.”

Today, as Earth Day nears the end of its fifth decade, and Service Objects marks over 15 years in business, our own contributions to the environment have continued to grow. Here are just a few of the numbers behind the impact of our data validation products – so far, we have saved:

  • Over 85 thousand tons of paper
  • A million and a half trees
  • 32 million gallons of oil
  • More than half a billion gallons of water
  • Close to 50 million pounds of air pollution
  • A quarter of a million cubic yards of landfill space
  • 346 million KWH of energy

All of this is an outgrowth of more than two and a half billion transactions validated – and counting! (If you are ever curious about how we are doing in the future, just check the main page of our website: there is a real-time clock with the latest totals there.) And we are always looking for ways to continue making lives better through data validation tools.

We hope you, too, will join us in celebrating Earth Day. And the best way possible to do this is to examine the impact of your own business and community on the environment, and take positive steps to make the earth a better place. Even small changes can create a big impact over time. The original Earth Day was the catalyst for a movement that has made a real difference in our world – and by working together, there is much more good to come!

Medical Data is Bigger than You May Think

What do medical centers have in common with businesses like Uber, Travelocity, or Amazon? They have a treasure trove of data, that's what! The quality of that data and what's done with it can help organizations work more efficiently, more profitably, and more competitively. More importantly for medical centers, data quality can lead to even better quality care.

Here’s just a brief sampling of the types of data a typical hospital, clinic, or medical center generates:

  • Patient contact information
  • Medical records with health histories
  • Insurance records
  • Payment information
  • Geographic data for determining "Prime Distance" and "Drive Time Standards"
  • Employee and payroll data
  • Ambulance response times
  • Vaccination data
  • Patient satisfaction data

Within each of these categories, there may be massive amounts of sub-data, too. For example, medical billing relies on tens of thousands of medical codes. And several addresses may be collected for a single patient, such as the patient's home and mailing addresses, the insurance company's billing address, the employer's address, and so forth.

This data must be collected, validated for accuracy, and managed, all in compliance with rigorous privacy and security regulations. Plus, it’s not just big data, it’s important data. A simple transposed number in an address can mean the difference between getting paid promptly or not at all. A pharmaceutical mix-up could mean the difference between life and death.

With so much important data, it’s easy to get overwhelmed. Who’s responsible? How is data quality ensured? How is it managed? Several roles can be involved:

  • Data stewards – Develop data governance policies and procedures.
  • Data owners – Generate the data and implement the policies and procedures.
  • Business users – Analyze and make use of the data.
  • Data managers – Information systems managers and developers who implement and manage the tools needed to capture, validate, and analyze the data.

Defining a data quality vision, assembling a data team, and investing in appropriate technology is a must. With the right team and data validation tools in place, medical centers and any organization can get serious about data and data quality.

How Can Data Quality Lead to Quality Care?

Having the most accurate, authoritative and up-to-date information for patients can positively impact organizations in many ways. For example, when patients move, they don’t always think to inform their doctors, labs, hospitals, or radiology centers. With a real-time address validation API, not only could you instantly validate a patient’s address for billing and marketing purposes, you could confirm that the patient still lives within the insurance company’s “prime distance” radius before treatment begins.
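Once both the patient's address and the facility's address are geocoded to latitude/longitude, a "prime distance" check reduces to a standard great-circle (haversine) calculation. The Python sketch below is illustrative only; the 30-mile radius and the coordinates are assumptions, not values from any insurer.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/long points, in miles."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

def within_prime_distance(patient, facility, radius_miles=30.0):
    """True if the patient's geocoded address falls inside the radius."""
    return haversine_miles(*patient, *facility) <= radius_miles

# Santa Barbara, CA vs. downtown Los Angeles -- roughly 90 miles apart.
sb = (34.4208, -119.6982)
la = (34.0522, -118.2437)
print(within_prime_distance(sb, la, radius_miles=30.0))   # False
print(within_prime_distance(sb, la, radius_miles=100.0))  # True
```

In practice the coordinates would come from a geocoding service rather than being hard-coded, and the check would run before treatment begins, as described above.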

Accurate address and demographic data can trim mailing costs and improve patient satisfaction with appropriate timing and personalization. Meanwhile, aggregated health data could be analyzed to look at health outcomes or reach out to patients proactively based on trends or health histories. Just as online retailers recommend products based on past purchases or purchases by customers like you, medical providers can use big data to recommend screenings based on health factors or demographic trends.

Developing a data quality initiative is a major, but worthwhile, undertaking for all types of organizations — and you don’t have to figure it all out on your own. Contact Service Objects today to learn more about our data validation tools.

Service Objects is the industry leader in real-time contact validation services.

Service Objects has verified over 2.8 billion contact records for clients from various industries including retail, technology, government, communications, leisure, utilities, and finance. Since 2001, thousands of businesses and developers have used our APIs to validate transactions to reduce fraud, increase conversions, and enhance incoming leads, Web orders, and customer lists.