The Rise of Big Data

January 17, 2012

On the eve of 2011, humanity was creating enough data each day to match all of the content in the Library of Congress -- 11,461 times over.  With more digital devices than there are humans in existence and internet users being added by the tens of millions with each passing month, data is streaming from all corners of the globe. The world is floating in a vast river of digital information that is getting faster, deeper, and wider every day.  Data pours forth from handheld phones, social media, credit card purchases, and Wal-Mart warehouses in quantities and rates never seen before in history.  

The possibilities found in this river of information are equally endless.  It can open up new markets, help unlock the secrets of science, defy corruption, and much more.  These aims are lofty.  Yet they are feasible through the rise of what’s being called “big data” and all of the analysis, storage, and human capital that goes into managing it.

We must ask then what the rising tide of big data means to businesses and individuals alike.  While companies can find opportunities, trends, and new approaches in the mass of data, the human element remains critical to understanding, interpreting, and using this information.

Data, Data Everywhere, Nor Any Drop to Sync

Global data flows are so vast that they overwhelm the imagination.  2010 saw the production of over 1,200 exabytes of information.  To put that number into perspective, 1,200 exabytes is roughly equivalent to the storage space of 300 billion DVDs.  Moreover, this number represents a ten-fold increase in data creation over just five years.  It's projected that in 2015 we will produce a staggering 7,910 exabytes of data.  In 2011, the world had 2 billion internet users, or a little over a quarter of the global population.  What isn’t reflected in that number is the total of mobile-phone subscribers, people who are fast becoming connected to this same data flow.  Bearing in mind that some individuals have multiple subscriptions while others are bundled together, the world now has 4.6 billion mobile phone subscribers.
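For readers who want to sanity-check these figures, here is a minimal Python sketch of the arithmetic.  It uses only the numbers quoted above; the function names are made up for illustration, and it assumes decimal units (1 exabyte = 10^18 bytes) and steady compound growth, neither of which the sources state.

```python
# Figures quoted in the article (exabytes and DVD count).
DATA_2010_EB = 1_200   # data produced in 2010
DATA_2015_EB = 7_910   # projected data production for 2015
DVD_COUNT = 300e9      # "300 billion DVDs"

BYTES_PER_EB = 10**18  # assumption: decimal exabytes
BYTES_PER_GB = 10**9   # assumption: decimal gigabytes


def implied_dvd_capacity_gb(total_eb, dvd_count):
    """Capacity per DVD (in GB) implied by the DVD comparison."""
    return total_eb * BYTES_PER_EB / dvd_count / BYTES_PER_GB


def annual_growth_rate(start, end, years):
    """Compound annual growth rate that turns `start` into `end`."""
    return (end / start) ** (1 / years) - 1


if __name__ == "__main__":
    gb = implied_dvd_capacity_gb(DATA_2010_EB, DVD_COUNT)
    past = annual_growth_rate(1, 10, 5)  # ten-fold increase over five years
    future = annual_growth_rate(DATA_2010_EB, DATA_2015_EB, 5)
    print(f"Implied DVD capacity: {gb:.1f} GB")
    print(f"Implied past growth: {past:.1%} per year")
    print(f"Implied 2010-2015 growth: {future:.1%} per year")
```

Under those assumptions, the DVD comparison works out to about 4 GB per disc, the ten-fold rise over five years to growth of roughly 58% a year, and the 2015 projection to roughly 46% a year.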

Turning to the organizational level, a recent McKinsey report found that overall data volume is growing by some 40% per year.  While 75% of all data created in the world stems from individual users, companies have liability for around 80% of it at some point.  In 15 of the U.S. economy’s 17 sectors, companies with more than 1,000 employees store on average over 235 terabytes of data.  At the more extreme end are companies like IBM, which boasts a storage capacity that comes to over 120 petabytes, and tech titan Microsoft, which in just one data storage center can hold up to 6.75 trillion photos.
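To get a feel for what 40% annual growth means, the short Python sketch below estimates how long it takes data volume to double at that rate.  It assumes the rate stays constant and compounds once a year, which is a simplification rather than anything the McKinsey report claims, and the function names are hypothetical.

```python
import math

ANNUAL_GROWTH = 0.40  # "some 40% per year", as quoted from McKinsey


def doubling_time_years(rate):
    """Years needed to double at a constant compound annual rate."""
    return math.log(2) / math.log(1 + rate)


def growth_multiple(rate, years):
    """Factor by which volume grows after `years` at a constant rate."""
    return (1 + rate) ** years


if __name__ == "__main__":
    print(f"Doubling time: {doubling_time_years(ANNUAL_GROWTH):.2f} years")
    print(f"Growth over 5 years: {growth_multiple(ANNUAL_GROWTH, 5):.2f}x")
```

At a steady 40% a year, volume doubles in a little over two years and grows more than five-fold in five.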

A Rising Tide of Data

As the tide of data rises, a number of key trends are starting to surface.  The first is the sheer ubiquity and complexity of data.  As noted earlier, raw data is being created at a faster rate and in greater quantities than ever before.  One reason for this is the increase in users and devices.  By 2015, Cisco Systems expects that 40% of the world's population will have internet access and that the number of network-connected devices will quadruple to some 15 billion.  These users and devices will also be able to access and create data at greater speeds thanks to the spread of broadband connectivity, which itself is expected to become four times as fast in less than five years.

This deluge of data is also becoming more complex.  Digital information comes in either structured or unstructured form and, according to SAS, upwards of 85% of data is this latter, unsorted type.  While structured data is categorized and sorted within easy reach, unstructured information is often a scattered morass of text-heavy data with few identifying marks.  The reality of data today is more like a bad episode of TLC’s Hoarders than something out of the streamlined, glowing structures seen in Disney’s Tron.  As such, there is tremendous potential value in leveraging cluttered data.  To quote The Economist, “Given enough raw data, today’s algorithms and powerful computers can reveal new insights that would previously have remained hidden.”

The second key trend is that this data is far more mobile today than ever before.  It’s not just being stored but shared.  One manifestation is in the realm of social networking.  Up to 65% of Americans use social networking sites such as Facebook and LinkedIn.  Now more than ever, these sites are yet another way to share information and structure e-mail data within a defined network of users.  Another manifestation is the ascent of what’s called the “cloud.”  Ubiquitous internet access allows data to be stored on centralized servers that can be accessed on demand and readily scaled up, rather than being siloed and scattered.  With the cloud, computing becomes less about the hardware and more of a service to a network of global consumers.  By 2015, upwards of 20% of all data will course through the servers of cloud computing providers.

The private sector was the first realm in which big data arose and so it’s apt that private firms are among the first to be utterly transformed.  As McKinsey put it, data “underpins processes that manage employees; it helps to track purchases and sales; and it offers clues about how customers will behave.”  Not only are industries forming around sorting, analyzing, and applying this data, but the data itself is giving rise to new data-driven business models.  In short, data is central to how business is done in the 21st century.

So What?

The rise of big data is proving to be profitable, for one thing.  Data analytics, as an example, offers a way to improve productivity by some 0.5% to 1% annually in sectors like health care and retailing.  Moreover, a recent study found that data-driven decision-making increases firm performance by 5-6%.  It is quite normal now for an organization to employ “business intelligence” systems to make sense of the data coming from its customers, employees, stores, and warehouses.  As one article in The Wall Street Journal said, “It means fewer hunches and more facts.”

We’re also seeing more organizations being shaped internally by information flows: that is, becoming more collaborative and less hierarchical.  Networked enterprises thrive off of collaborative software that encourages more of a team-based approach centered on open communication flows.  Think of a Silicon Valley firm that eschews cubicles and communicates through instant message.  A recent survey found a distinct correlation between networked enterprises and increased market share.  Only 3% of the companies surveyed were "fully networked," but that cohort is growing fast.

Big data is also impacting external engagement.  Companies are using social media platforms to interact with customers, web-based portals to communicate with suppliers, and overarching systems to coordinate the resulting data flows.  The result is incredibly precise segmentation of customers and the products and services they need.  In this we see that big data need not be for the tech titans alone.  Wal-Mart thrives off of real-time inventory data from across the world in order to properly allocate its products.  For instance, "Wal-Mart discovered in 2004, that along with flashlights, batteries, and other emergency supplies, Pop-Tart sales increased before a predicted hurricane."  "Thanks to those insights, trucks filled with toaster pastries and six-packs were soon speeding down Interstate 95 toward Wal-Marts in the path of [Hurricane] Frances. Most of the products that were stocked for the storm sold quickly, the company said."

On a personal level, we are also depositing data in our wake as we move through this river of information.  While the privacy concerns are real, this data has also enabled the rise of niche markets and services that become highly personalized in real time.  When companies use data to interact more accurately with their customers or to earn new ones, their advertising, offers, and service become more relevant and specific to those customers' needs.  As Erik Brynjolfsson and Andrew McAfee put it in a recent article in The Atlantic, “Customers are acting as unwitting business consultants for these companies.  Our purchases, searches, and online activities are being tracked to improve everything from websites to delivery routes and drug manufacturing.”  Ultimately, this enables you and me to capture the gains brought on by lower prices, lifestyle improvements, and niche products.

Staying Afloat in the River of Data

While we can have instant data ready to answer the “what” and the “how” of our world, the “why” and the “so what?” are still up to us to determine and understand.  Analytical talent is not replaced by streams of data or number-crunching machines.  In fact, the flood of data makes the insight and intuition of skilled workers that much more valuable.

As McKinsey concludes, “The demand for people with the deep analytical skills in big data (including machine learning and advanced statistical analysis) could outstrip current projections of supply by 50% to 60%.” Companies should work diligently to not drown in this river of data.  There's much to be said for thinking proactively about the opportunities afforded by an ever greater supply of data and the need for its effective storage, analysis, and integration.  

Big data allows companies to more dynamically compete in the marketplace and to capitalize on greater informational awareness.  Just as countries should think holistically about their technology agenda, companies must understand how big data fits into their overall strategy and not separate it from decision makers.  The rise of big data offers a way toward a more prosperous future for economies and businesses.

Big data is more than information.  It is an assembly of our collective experience.  We see ourselves in the stream of data, more aware and complex than ever before.  Our institutions -- political, economic, social -- are rising on tides of inter-connectivity.   The future of business will in many ways emerge from this river of data and it will be all the more human because of it.