Posts Tagged ‘Big Data’

A Daisy Chain of Hidden Customer Data Factories

I published the provocatively titled article “Bad Data Costs the United States $3 Trillion per Year” in Harvard Business Review in September 2016. It is of special importance to those who need prospect, customer, or contact data in the course of their work.

First read the article.

Consider this figure: $136 billion per year. That’s the research firm IDC’s estimate of the size of the big data market, worldwide, in 2016. This figure should surprise no one with an interest in big data.

But here’s another number: $3.1 trillion, IBM’s estimate of the yearly cost of poor quality data, in the US alone, in 2016. While most people who deal in data every day know that bad data is costly, this figure stuns.

While the numbers are not really comparable, and there is considerable variation around each, one can only conclude that right now, improving data quality represents the far larger data opportunity. Leaders are well advised to develop a deeper appreciation for the opportunities that improving data quality presents and to take fuller advantage of them than they do today.

The reason bad data costs so much is that decision makers, managers, knowledge workers, data scientists, and others must accommodate it in their everyday work. And doing so is both time-consuming and expensive. The data they need has plenty of errors, and in the face of a critical deadline, many individuals simply make corrections themselves to complete the task at hand. They don’t think to reach out to the data creator, explain their requirements, and help eliminate root causes.

Quite quickly, this business of checking the data and making corrections becomes just another fact of work life.  Take a look at the figure below. Department B, in addition to doing its own work, must add steps to accommodate errors created by Department A. It corrects most errors, though some leak through to customers. Thus Department B must also deal with the consequences of those errors that leak through, which may include such issues as angry customers (and bosses!), packages sent to the wrong address, and requests for lower invoices.

The Hidden Data Factory

Visualizing the extra steps required to correct costly and time-consuming data errors.

I call the added steps the “hidden data factory.” Companies, government agencies, and other organizations are rife with hidden data factories. Salespeople waste time dealing with erroneous prospect data; service delivery people waste time correcting flawed customer orders received from sales. Data scientists spend an inordinate amount of time cleaning data; IT expends enormous effort lining up systems that “don’t talk.” Senior executives hedge their plans because they don’t trust the numbers from finance.

Such hidden data factories are expensive. They form the basis for IBM’s $3.1 trillion per year figure. But quite naturally, managers should be more interested in the costs to their own organizations than to the economy as a whole. So consider the costs of the hidden data factories in your own organization.

There is no mystery in reducing the costs of bad data — you have to shine a harsh light on those hidden data factories and reduce them as much as possible. Simple measurements, such as the Friday Afternoon Measurement and the rule of ten, help shine that harsh light. So too does the realization that hidden data factories represent non-value-added work.

To see this, look once more at the process above. If Department A does its work well, then Department B would not need the added steps of finding, correcting, and dealing with the consequences of errors, obviating the need for the hidden factory. No reasonably well-informed external customer would pay more for these steps. Thus, the hidden data factory creates no value. By taking steps to remove these inefficiencies, you can spend more time on the more valuable work that customers will pay for.

Note that, in the very near term, you probably have to continue doing this work. It is simply irresponsible to use bad data or pass it on to a customer. At the same time, all good managers know that they must minimize such work.

It is clear enough that the way to reduce the size of the hidden data factories is to quit making so many errors. In the two-step process above, this means that Department B must reach out to Department A, explain its requirements, cite some example errors, and share measurements. Department A, for its part, must acknowledge that it is the source of added cost to Department B and work diligently to find and eliminate the root causes of error. Those that follow this regimen almost always reduce the costs associated with hidden data factories by two thirds and often by 90% or more.

I don’t want to make this sound simpler than it really is. It requires a new way of thinking. Sorting out your requirements as a customer can take some effort, it is not always clear where the data originate, and there is the occasional root cause that is tough to resolve. Still, the vast majority of data quality issues yield.

Importantly, the benefits of improving data quality go far beyond reduced costs. It is hard to imagine any sort of future in data when so much is so bad. Thus, improving data quality is a gift that keeps giving — it enables you to take out costs permanently and to more easily pursue other data strategies. For all but a few, there is no better opportunity in data.

The article above was originally written for Harvard Business Review and is reprinted with permission.
_______________________________________________________________________

In January 2018, Service Objects spoke with the author, Tom Redman, and he gave us an update on the article above, particularly as it relates to the subject of data quality.

According to Tom, the original article anticipated people asking, “What’s going on?  Don’t people care about data quality?”

The answer is, “Of course they care.  A lot.  So much that they implement ‘hidden data factories’ to accommodate bad data so they can do their work.”  And the article explored such factories in a generic “two-department” scenario.

Of course, hidden data factories take a lot of time and cost a lot of money, both contributing to the $3T/year figure.  They also don’t work very well, allowing lots of errors to creep through, leading to another hidden data factory.  And another and another, forming a sort of “daisy chain” of hidden data factories.  Thus, when one extends the figure above and narrows the focus to customer data, one gets something like this:

I hope readers see the essential truth this picture conveys and are appalled.  Companies must get in front on data quality and make these hidden data factories go away!

©2018, Data Quality Solutions

Is Your Data Quality Strategy Gold Medal Worthy?

A lot of you – like many of us here at Service Objects – are enjoying watching the 2018 Winter Olympics in Pyeongchang, South Korea this month. Every Olympics is a spectacle where people perform incredible feats of athleticism on the world stage.

Watching these athletes reminds us of how much hard work, preparation, and teamwork go into their success. Most of these athletes spend years behind the scenes perfecting their craft, with the aid of elite coaches, equipment, and sponsors. And the seemingly effortless performances you see are increasingly becoming data-driven as well.

Don’t worry, we aren’t going to put ourselves on the same pedestal as Olympic medalists. But many of the same traits behind successful athletes do also drive reliable real-time API providers for your business. Here are just a few of the qualities you should look for:

The right partners. You probably don’t have access to up-to-the-minute address and contact databases from sources around the world. Or a database of over 400 million phone numbers that is constantly kept current. We do have all of this, and much more – so you can leverage our infrastructure to assure your contact data quality.

The right experience. The average Olympic skater has invested at least three hours a day in training for over a decade by the time you see them twirling triple axels on TV, according to Forbes. Likewise, Service Objects has validated nearly three billion transactions since we were founded in 2001, with a server uptime reliability of 99.999 percent.

The right strategy. In sports where success is often measured in fractions of a second, gold medals are never earned by accident: athletes always work against strategic objectives. We follow a strategy as well. Our tools are purpose-built for the needs of over 2500 customers, ranging from marketing to customer service, with capabilities such as precise geolocation of tax data, composite lead quality scores based on over 130 criteria, or fraud detection based on IP address matching. And we never stop learning and growing.

The right tools. Olympic athletes need the very best equipment to be competitive, from ski boots to bobsleds. In much the same way, our customers’ success is built on our providing the best infrastructure, including enterprise-grade API interfaces, cloud connectors and webhooks for popular CRM, eCommerce and marketing automation platforms, and convenient batch list processing.

The right support. No one reaches Olympic success by themselves – every athlete is backed by a team of coaches, trainers, sponsors and many others. We back our customers with an industry-leading support team as well, including a 24×7 Quick Response Team for urgent mission-critical issues.

The common denominator between elite athletes and industry-leading data providers is that both work hard to be the best at what they do and aren’t afraid to make big investments to get there. And while we can’t offer you a gold, silver, or bronze medal, we can give you a free white paper on how to make your data quality hit the perfect trifecta of being genuine, accurate and up-to-date. Meanwhile, enjoy the Olympics!

Unique US and Canadian Zip Code Files – Now Available for Download

Customer Service Above All.  It is one of our core values here at Service Objects.  Recently, we’ve received several requests for a list of the unique zip codes throughout the US and Canada.  By leveraging our existing services, we’ve made this happen.  We are now offering both the US and Canada lists as a free downloadable resource.

So why is Service Objects providing this data? Our goal is to provide the best data cleansing solutions possible for our clients. Part of this means using our existing data to provide our users with the data they need. While other data providers might charge for this type of information, we’ve decided to make it freely available for anyone’s use. These files can be used for several purposes, such as pre-populating a list of cities and states for a form where a user needs to enter address information. The county and state FIPS information is widely used in census and demographic data and can be used to uniquely identify states and counties within a database.  Additionally, the included time zone information can be used to determine appropriate times to place calls to a customer.

Where to Download

This link allows you to access a .zip file containing two CSV files: one contains the US information, the other the Canadian information.  The file names indicate the month and year the records were created. Toward the middle of each month, the data in each file will be updated to account for any changes in US and Canadian postal codes.

What Other Information is in the Files?

Both files contain postal codes, states (or provinces for Canada) and time zone information.  The Canadian postal code file is much larger, with over 800K records. This is because Canadian postal codes generally cover much smaller areas than US ZIP codes: where a US ZIP code can sometimes encompass multiple cities or counties, a Canadian postal code can be the size of a couple of city blocks or, in some cases, a single high-rise building.

The US file has information for all United States ZIP codes, including those in US territories. This file also includes the county that each ZIP code lies in, along with county and state FIPS numbers for each record to help with processing that information.  The US file is considerably smaller than the Canadian file, at only 41K records.
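
To illustrate one of the uses mentioned above, here is a minimal Python sketch that loads one of the CSV files into a postal-code lookup table, which could back a city/state pre-fill on a form or a check for an appropriate calling window. The file name and column names below are assumptions for illustration, not the actual headers, so adjust them to match the downloaded files.

```python
# A minimal sketch of loading one of the downloaded CSV files into a
# postal-code lookup table. The file name and column names are assumptions
# for illustration -- check the headers in the actual files and adjust.
import csv

def load_postal_lookup(csv_path: str) -> dict:
    """Build a postal-code -> location-info lookup from a downloaded CSV."""
    lookup = {}
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            lookup[row["PostalCode"]] = {
                "city": row.get("City"),
                "state": row.get("State"),           # province for Canada
                "time_zone": row.get("TimeZone"),
                "county": row.get("County"),         # US file only
                "state_fips": row.get("StateFIPS"),  # US file only
                "county_fips": row.get("CountyFIPS"),
            }
    return lookup

# Example: pre-populate city/state on a form, or pick a sensible calling window.
us_lookup = load_postal_lookup("US_ZipCodes.csv")  # hypothetical file name
info = us_lookup.get("93101")
if info:
    print(info["city"], info["state"], info["time_zone"])
```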

In making these files freely accessible, our hope is to make the integration and business logic easier for our users. If you’d like to discuss your particular contact data validation needs, feel free to contact us!

Don’t Let Bad Data Scare You This Halloween

Most of us here in North America grew up trick-or-treating on Halloween. But did you know the history behind this day?

In early Celtic culture, the feast of All Hallows Eve (or Allhallowe’en) was a time of remembering the souls of the dead – and at a more practical level, preparing for the “death” of the harvest season and the winter to follow. People wore costumes representing the deceased, who by legend were back on earth to have a party or (depending upon cultural interpretation) cause trouble for one last night, and people gave them alms in the form of soul cakes – which evolved to today’s sweet treats – to sustain them.

So what were people preparing for in celebrating Halloween? Good data quality, of course. Back then, when your “data” consisted of the food you grew, people protected it from bad things by taking the preventive measure of feeding the dead. Today, Halloween is a fun celebration that actually has some important parallels for managing your data assets. Here are just a few:

An automated process. The traditions of Halloween let people honor the dead and prepare for the harvest in a predictable, dependable way. Likewise, data quality ultimately revolves around automated tools that take the work – and risk – out of creating a smooth flow of business information.

Organizational buy-in. Unlike many other holidays, Halloween was a community celebration fueled by the collective efforts of everyone. Every household took part in providing alms and protecting the harvest. In much the same way, modern data governance efforts make sure that all of the touch points for your data – when it is entered, and when it is used – follow procedures to ensure clean, error free leads, contacts and e-commerce information.

Threat awareness. Halloween was designed to warn people away from the bad guys – for example, the bright glow of a Jack-o-lantern was meant to keep people away from the spirit trapped inside. Today, data quality tools like order and credit card BIN validation keep your business away from the modern-day ghouls that perpetrate fraud.

An ounce of prevention. This is the big one. Halloween represented a small offering to the dead designed to prevent greater harm. When it comes to your data, prevention is dramatically more cost-effective than dealing with the after-effects of bad data: this is an example of the 1-10-100 rule, where you can spend $1 preventing data problems, $10 correcting them, or $100 dealing with the consequences of leaving them unchecked.
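
To make the arithmetic concrete, here is a tiny illustrative calculation of the 1-10-100 rule; the record count is a made-up input, not a measured figure.

```python
# Illustrative arithmetic only: the 1-10-100 rule expressed as relative cost
# per record. The record count below is a made-up input for the example.
COST_PER_RECORD = {"prevent": 1, "correct": 10, "ignore": 100}

bad_records = 1_000
for stage, unit_cost in COST_PER_RECORD.items():
    print(f"{stage:>8}: ${bad_records * unit_cost:,}")
# prevent: $1,000  |  correct: $10,000  |  ignore: $100,000
```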

These costs range from the wasted marketing spend on bad or fraudulent leads to the cost of lost products, market share, and customer goodwill when you ship things to the wrong address. And this doesn’t even count some of the potentially large costs of compliance violations, such as the Telephone Consumer Protection Act (TCPA) for outbound telemarketing, the CAN-SPAM Act for email marketing, sales and use tax mistakes, and more.

So now you know: once upon a time, people mitigated threats to their data by handing out baked goods to people in costumes. Now they simply call Service Objects to “treat” their data with low-cost, API-based and batch-process solutions. And just like Halloween, if you knock on our door we’ll give you a sample of any of our products for free! For smart data managers, it’s just the trick.

The Talent Gap In Data Analytics

According to a recent blog by Villanova University, the amount of data generated annually has grown tremendously over the last two decades due to increased web connectivity, as well as the ever-growing popularity of internet-enabled mobile devices. Some organizations have found it difficult to take advantage of the data at their disposal due to a shortage of data-analytics experts. Small-to-medium businesses (SMBs), which struggle to match the salaries offered by larger companies, are the most affected. This shortage of qualified and experienced professionals is creating a unique opportunity for those looking to break into a data-analytics career.

Below is some more information on this topic.

Data-Analytics Career Outlook

Job openings for computer and information research scientists are expected to grow by 11 percent from 2014 to 2024. In comparison, job openings for all occupations are projected to grow by 7 percent over the same period. In addition, 82 percent of organizations in the US say they are planning to advertise positions that require data-analytics expertise, on top of the 72 percent of organizations that have already hired talent to fill open analytics positions in the last year. However, up to 78 percent of businesses say they have experienced challenges filling open data-analytics positions over the last 12 months.

Data-Analytics Skills

The skills that data scientists require vary depending on the nature of data to be analyzed as well as the scale and scope of analytical work. Nevertheless, analytics experts require a wide range of skills to excel. For starters, data scientists say they spend up to 60 percent of their time cleaning and aggregating data. This is necessary because most of the data that organizations collect is unstructured and comes from diverse sources. Making sense of such data is challenging, because the majority of modern databases and data-analytics tools only support structured data. Besides this, data scientists spend at least 19 percent of their time collecting data sets from different sources.

Common Job Responsibilities

To start with, 69 percent of data scientists perform exploratory data-analytics tasks, which in turn form the basis for more in-depth querying. Moreover, 61 percent perform analytics with the aim of answering specific questions, 58 percent are expected to deliver actionable insights to decision-makers, and 53 percent undertake data cleaning. Additionally, 49 percent are tasked with creating data visualizations, 47 percent leverage data wrangling to identify problems that can be resolved via data-driven processes, 43 percent perform feature extraction, and another 43 percent are responsible for developing data-based prototype models.

In-demand Programming-Language Skills

In-depth understanding of SQL is a key requirement cited in 56 percent of job listings for data scientists. Other frequently requested technical skills include Hadoop (49 percent of job listings), Python (39 percent), Java (36 percent), and R (32 percent).

The Big-Data Revolution

The big-data revolution witnessed in the last few years has substantially changed the way businesses operate. In fact, 78 percent of corporate organizations believe big data is likely to fundamentally change their operational style over the next three years, while 71 percent of businesses expect it to spawn new revenue opportunities. Yet only 58 percent of executives believe that their employer has the capability to leverage the power of big data. Nevertheless, 53 percent of companies are planning to roll out data-driven initiatives in the next 12 months.

Recruiting Trends

Companies across all industries are facing a serious shortage of experienced data scientists, which means they risk losing business opportunities to firms that have found the right talent. Common responsibilities among these professionals include developing data visualizations, collecting data, cleaning and aggregating unstructured data, and delivering actionable insights to decision-makers. Leading employers include the financial services, marketing, corporate and technology industries.

View the full infographic created by Villanova University’s Online Master of Science in Analytics degree program.

http://taxandbusinessonline.villanova.edu/resources-business/infographic-business/the-talent-gap-in-data-analytics.html

Reprinted with permission.

What Can We Do? Service Objects Responds to Hurricane Harvey

The Service Objects team watched the steady stream of images from Hurricane Harvey and its aftermath, and we wanted to know, ‘What can we do to help?’  We realized the best thing we could do is offer our expertise and services free to those who can make the most use of them – the emergency management agencies dedicated to helping those affected by this disaster.

We quickly realized that, as Hurricane Harvey continues to cause record floodwaters and entire neighborhoods are under water, these agencies are finding it nearly impossible to locate specific addresses in need of critical assistance. In response, we are offering emergency management groups the ability to quickly pinpoint addresses with latitude and longitude coordinates by providing unlimited, no-cost access to DOTS Address Geocode℠ (AG-US). By using Address Geocode, these agencies will not have to rely on potentially incomplete online maps. Instead, using Service Objects’ advanced address mapping services, they will be able to reliably identify specific latitude and longitude coordinates in real time and serve those in need.

“The fallout of the catastrophic floods in Texas is beyond description, and over one million locations in Houston alone have been affected,” said Geoff Grow, CEO and Founder of Service Objects.  “With more than 450,000 people likely to seek federal aid in recovering from this disaster, Service Objects is providing advanced address mapping to help emergency management agencies distribute recovery funds as quickly as possible. We are committed to helping those affected by Hurricane Harvey.”

In addition, as disaster relief efforts get underway, Service Objects will provide free access to our address validation products to enable emergency management agencies to quickly distribute recovery funds by address type, GEOID, county, census block, and census tract. These data points are required by the federal government to release funding.  This will allow those starting the recovery process from this natural disaster to get next-level services as soon as possible.

To get access to Service Objects address solutions or request maps, qualified agencies can contact Service Objects directly by calling 805-963-1700 or by emailing us at info@serviceobjects.com.

Our team wishes the best for all those affected by Hurricane Harvey.

Image by National Weather Service 

How Millennials Will Impact Your Data Quality Strategy

The so-called Millennial generation now represents the single largest population group in the United States. If they don’t already, they will soon represent your largest base of customers and a majority of the workforce. What does that mean for the rest of us?

It doesn’t necessarily mean that you have to start playing Adele on your hold music, or offering free-range organic lattes in the company cafeteria. What it does mean, according to numerous social observers, is that expectations of quality are changing radically.

The Baby Boomer generation, now dethroned as the largest population group, grew up in a world of amazing technological and social change – but also a world where wrong numbers and shoddy products were an annoying but inevitable part of life. Generation X and Y never completely escaped this either:  ask anyone who ever drove a Yugo or sat on an airport tarmac for hours. But there is growing evidence that millennials, who came of age in a world where consumer choices are as close as their smartphones, are much more likely to abandon your brand if you don’t deliver.

This demographic change also means you can no longer depend on your father’s enterprise data strategy, with its focus on things like security and privacy. For one thing, according to USA Today, millennials couldn’t care less about privacy. The generation that grew up oversharing on Instagram and Facebook understands that in a world where information is free, they – and others – are the product. Everyone agrees, however, that what they do care about is access to quality data.

This also extends to how you manage a changing workforce. According to this article, which notes that millennials will make up three quarters of the workforce by 2020, dirty data will become a business liability: data that can’t be trusted for strategic purposes, whether it is being used to address revenues, costs, or risk. This makes millennials much more likely to demand automated strategies for data quality and data governance, and to push to engineer these capabilities into the enterprise.

Here’s our take: more than ever, the next generation of both consumers and employees will expect data to simply work. There will be less tolerance than ever for bad addresses, mis-delivered orders and unwanted telemarketing. And when young professionals are launching a marketing campaign, serving their customers, or rolling out a new technology, working with a database riddled with bad contacts or missing information will feel like having one foot on the accelerator and one foot on the brake.

We are already a couple of steps ahead of the millennials – our focus is on API-based tools that are built right into your applications, linking them in real time to authoritative data sources like the USPS as well as a host of proprietary databases. They help ensure clean data at the point of entry AND at the time of use, for everything from contact data to scoring the quality of a marketing lead. These tools can also fuel their e-commerce capabilities by automating sales and use tax calculations, or ensure regulatory compliance with telephone consumer protection regulations.

In a world where an increasing number of both our customers and employees will have been born in the 21st century, and big data becomes a fact of modern life, change is inevitable in the way we do business. We like this trend, and feel it points the way towards a world where automated data quality finally becomes a reality for most of us.

How secure is your ‘Data at Rest’?

In a world where millions of customer and contact records are commonly stolen, how do you keep your data safe?  First, lock the door to your office.  Now you’re good, right?  Oh wait, you are still connected to the internet. Disconnect from the internet.  Now you’re good, right?  What if someone sneaks into the office and accesses your computer?  Unplug your computer completely.  You know what, while you are at it, pack your computer into some plain boxes to disguise it.   Oh wait, this is crazy, not very practical and only somewhat secure.

The point is, as we try to determine what kind of security we need, we also have to find a balance between functionality and security.  A lot of this depends on the type of data we are trying to protect.  Is it financial, healthcare, or government-related, or is it personal, like pictures from the last family camping trip?  All of these have different requirements, and many of them are driven by our clients’ requirements. As a company dealing with such diverse clientele, Service Objects needs to be ready to handle data and keep it as secure as possible in all the different states that digital data can exist in.

So what are the states that digital data can exist in?  There are several, and understanding them should be part of determining a data security strategy.  For the most part, data exists in three states: Data in Motion/Transit, Data at Rest/Endpoint, and Data in Use. They are defined as follows:

Data in Motion/Transit

“…meaning it moves through the network to the outside world via email, instant messaging, peer-to-peer (P2P), FTP, or other communication mechanisms.” – http://csrc.nist.gov/groups/SNS/rbac/documents/data-loss.pdf

Data at Rest/Endpoint

“data at rest, meaning it resides in files systems, distributed desktops and large centralized data stores, databases, or other storage centers” – http://csrc.nist.gov/groups/SNS/rbac/documents/data-loss.pdf

“data at the endpoint, meaning it resides at network endpoints such as laptops, USB devices, external drives, CD/DVDs, archived tapes, MP3 players, iPhones, or other highly mobile devices” – http://csrc.nist.gov/groups/SNS/rbac/documents/data-loss.pdf

Data in Use

“Data in use is an information technology term referring to active data which is stored in a non-persistent digital state typically in computer random access memory (RAM), CPU caches, or CPU registers. Data in use is used as a complement to the terms data in transit and data at rest which together define the three states of digital data.” – https://en.wikipedia.org/wiki/Data_in_use

How Service Objects balances functionality and security with respect to our clients’ data at rest in our automated batch processing is the focus of this discussion.  Our automated batch process consists of this basic flow (a simplified sketch of the retention logic follows the list):

  • Our client transfers a file to a file structure in our systems using our secure ftp. [This is an example of Data in Motion/Transit]
  • The file waits momentarily before an automated process picks it up. [This is an example of Data at Rest]
  • Once our system detects a new file; [The data is now in the state of Data in Use]
    • It opens and processes the file.
    • The results are written into an output file and saved to our secure ftp location.
  • Input and output files remain in the secure ftp location until client retrieves them. [Data at Rest]
  • Client retrieves the output file. [Data in Motion/Transit]
    • Client can immediately choose to delete all, some or no files as per their needs.
  • Five days after processing, if any files exist, the automated system encrypts the files (minimum 256-bit encryption) and moves them off the secure ftp to another secure location. Any non-encrypted version is no longer present.  [Data at Rest and Data in Motion/Transit]
    • This delay gives clients time to retrieve the results.
  • 30 days after processing, the encrypted version is completely purged.
    • This provides a last chance, in the event of an error or emergency, to retrieve the data.
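
Here is a minimal Python sketch of the five-day/30-day retention logic described above. The paths, file patterns, and the encrypt_file placeholder are illustrative assumptions, not Service Objects’ production code, and as described below, the real thresholds are configurable per client.

```python
# A simplified sketch of the retention schedule: encrypt and move files five
# days after processing, purge the encrypted copies after 30 days. Paths,
# thresholds, and encrypt_file() are illustrative placeholders only.
import shutil
import time
from pathlib import Path

SFTP_DIR = Path("/sftp/client_results")          # hypothetical location
ARCHIVE_DIR = Path("/secure/encrypted_archive")  # hypothetical location
ENCRYPT_AFTER = 5 * 24 * 3600   # five days, in seconds
PURGE_AFTER = 30 * 24 * 3600    # thirty days, in seconds

def encrypt_file(src: Path, dst_dir: Path) -> Path:
    """Placeholder for the real encryption step (minimum 256-bit in production)."""
    dst = dst_dir / (src.name + ".enc")
    shutil.copy2(src, dst)  # stand-in only; a real job would encrypt here
    return dst

def run_retention_pass(now: float | None = None) -> None:
    now = now if now is not None else time.time()
    # 1. Encrypt and move files that have sat on the secure FTP for 5+ days.
    for f in SFTP_DIR.glob("*.csv"):
        if now - f.stat().st_mtime >= ENCRYPT_AFTER:
            encrypt_file(f, ARCHIVE_DIR)
            f.unlink()  # no non-encrypted version remains on the FTP
    # 2. Purge encrypted archives roughly 30 days after processing.
    for f in ARCHIVE_DIR.glob("*.enc"):
        if now - f.stat().st_mtime >= PURGE_AFTER:
            f.unlink()

if __name__ == "__main__":
    run_retention_pass()
```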

We encrypt files five days after processing, but what is the strategy for keeping the files secure before that five-day mark?  First off, we determined that the five- and 30-day rules were the best balance between functionality and security. But we also added flexibility to this.

If clients always picked up their files right when they were completed, we really wouldn’t need to think much about security as the files sat on the secure ftp.  But this is real life: people get busy, have long weekends, go on vacation, or simply forget. Whatever the reason, Service Objects can’t always immediately encrypt and move the data.  If we did, clients would become frustrated trying to coordinate the retrieval of their data.  So we built in the five- and 30-day rules, but we also added the ability to change these grace periods and customize them to our clients’ needs.  This doesn’t prevent anyone from purging their data sooner than any predefined threshold; in fact, we encourage it.

When we are setting up the automated batch process for a client, we look at the type of data coming in, and if appropriate, we suggest to the client that they may want to send the file to us encrypted. For many companies this is standard practice.  Whenever we see any data that could be deemed sensitive, we let our client know.

When it is established that files need to be encrypted at rest, we use industry-standard encryption/decryption methods.  When a file comes in and processing begins, the data is in use, so the file is decrypted.  After processing, any decrypted file is purged, and what remains is the encrypted version of the input and output files.
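
For readers who want to see the shape of this pattern in code, here is an illustrative sketch using the Python cryptography package’s Fernet recipe. It is a stand-in, not Service Objects’ actual method: it simply shows decrypt-for-processing followed by re-encryption of the input and output, with the plaintext purged.

```python
# Illustrative only: the decrypt-for-processing / re-encrypt pattern using
# the `cryptography` package's Fernet recipe. Not the actual implementation;
# real key management (secure key store, rotation, etc.) is out of scope.
from pathlib import Path
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # in practice the key comes from a secure store
fernet = Fernet(key)

def encrypt_at_rest(plain_path: Path) -> Path:
    """Encrypt a file and purge the unencrypted copy."""
    enc_path = plain_path.with_name(plain_path.name + ".enc")
    enc_path.write_bytes(fernet.encrypt(plain_path.read_bytes()))
    plain_path.unlink()  # no plaintext remains at rest
    return enc_path

def process_encrypted(enc_path: Path) -> Path:
    """Decrypt input for processing, then store only encrypted output."""
    data = fernet.decrypt(enc_path.read_bytes())  # data in use, in memory only
    results = data.upper()  # stand-in for the real validation work
    out_path = enc_path.with_name("output_" + enc_path.name)
    out_path.write_bytes(fernet.encrypt(results))
    return out_path
```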

Not all clients are concerned about or require this level of security, but Service Objects treats all data the same: with the utmost care and the highest level of security that is reasonable.  We simply take no chances and always encourage strong data security.

Big Data – Applied to Day to Day Life

With so much data being constantly collected, it’s easy to get lost in how all of it is applied in our real lives. Let’s take a quick look at a few examples starting with one that most of us encounter daily.

Online Forms
One of the most common and easiest to understand instances we come across on a daily basis is completing online forms. When we complete an online form, our contact record data points, such as name, email, phone, and address, are individually verified and corrected in real time to ensure each piece of data is genuine, accurate, and up to date. Not only does this verification process help companies mitigate fraud, but it also ensures that the submitted data is correct. The confidence in data accuracy allows for streamlined online purchases and efficient deliveries to us, the customers. Having our accurate information in the company’s database also helps streamline customer service should there be a discrepancy with the purchase or follow-up questions about the product. The company can easily pull up our information with any of the data points initially provided (name, email, phone, address and more) to start resolving the issue faster than ever (at least where companies are dedicated to good customer service).
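
As a rough sketch of what this looks like from the integration side, the following Python example sends each contact field to a validation endpoint at the time a form is submitted. The endpoint URL, parameter names, and response fields are hypothetical placeholders for illustration, not the actual Service Objects API.

```python
# A minimal sketch of point-of-entry validation for an online form: each
# contact field is checked against a validation service before the record is
# accepted. The endpoint, parameters, and response shape are hypothetical.
import requests

VALIDATION_ENDPOINT = "https://api.example.com/validate"  # hypothetical

def validate_contact(form: dict) -> dict:
    """Return a field -> is_valid map for a submitted contact record."""
    results = {}
    for field in ("name", "email", "phone", "address"):
        resp = requests.get(
            VALIDATION_ENDPOINT,
            params={"type": field, "value": form.get(field, "")},
            timeout=5,
        )
        resp.raise_for_status()
        results[field] = resp.json().get("is_valid", False)
    return results

if __name__ == "__main__":
    print(validate_contact({
        "name": "Jane Doe",
        "email": "jane@example.com",
        "phone": "805-555-0100",
        "address": "27 E Cota St, Santa Barbara, CA 93101",
    }))
```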

For the most part, we are all familiar with business scenarios like the one described above. Let’s shift to India and New Orleans for a couple of new examples of how cities are applying data to improve the day-to-day lives of citizens.

Addressing the Unaddressed in India
According to the U.S. Census Bureau, India is the second most populated country in the world, with 1,281,935,911 people. With such a large population, there is a shortage of affordable housing in many developed cities, leading to about 37 million households residing in unofficial housing areas referred to as slums. Being “unofficial” housing areas means they are neither mapped nor addressed, leaving residents with very little in terms of identification. However, the Community Foundation of Ireland (a Dublin-based non-profit organization) and the Hope Foundation recently began working together to provide each home in Kolkata’s Chetla slum with its very first form of address, consisting of a nine-digit unique ID. Besides overcoming obvious challenges, like giving someone directions to their home and finally being able to receive mail, the implementation of addresses has given residents the ability to open bank accounts and access social benefits. Having addresses has also helped officials identify needs in the slum, including healthcare and education.

Smoke Detectors in New Orleans
A recent article, The Rise of the Smart City, from The Wall Street Journal highlights how cities closer to home have started using data to bring about city-wide enhancements. New Orleans, in particular, is ensuring that high-risk properties are provided smoke detectors. Although the fire department has been distributing smoke detectors for years, residents were required to request them. To change this, the city’s Office of Performance and Accountability used Census Bureau surveys and other data, along with advanced machine-learning techniques, to create a map for the fire department that better targets areas more susceptible to deaths caused by fire. With the application of big data, more homes are being supplied with smoke detectors, increasing safety for entire neighborhoods and the city as a whole.

FIRE RISK | By combining census data with additional data points, New Orleans mapped the combined risk of missing smoke alarms and fire deaths, helping officials target distribution of smoke detectors. PHOTO: CITY OF NEW ORLEANS/OPA

While these are merely a few examples of how data is applied to our day to day lives around the world, I hope they helped make “Big Data” a bit more relatable. Let us know if we can answer any questions about how data solutions can be applied to help your company as well.