This research article is Chapter 2 in the report, "The Future of Data-Driven Innovation."
Media coverage of the Big Data revolution tends to focus on new technology developments in data storage and new business opportunities for social analytics and performance management. Alongside these tech-sector updates, a parallel media narrative has focused on the public’s concern over data collection practices.
Market success stories and accentuated privacy concerns are both important and deserve the full attention of business leaders and policymakers. Yet, we should recognize that much of what we read in the press about data is only the protruding tip of the proverbial iceberg. The layer of information generated by Big Data permeates not only the densely linked networks of business, finance, and government, but also all aspects of quantitative research and scientific inquiry. Big Data is thus much more than an impressive technological development; it is a new framework for understanding and interacting with the world around us.
Big Data offers a new era of learning, where we can investigate and analyze a larger body (or the entire body) of information about a subject and gain insights that were inscrutable in smaller samples. Big Data is already reframing critical questions about the processes of research, best practices for engagement with all categories of digital data, and the constitution of knowledge itself.[i] Moreover, while business and consumer stories occupy the headlines, some of the most promising applications of the technology are found in nonprofit work, good governance initiatives, and especially, scientific research.
This chapter colors outside of the lines of familiar Big Data narratives and addresses some of the underreported and less understood aspects of the phenomenon. It explores the value-added aspects of Big Data that make it more than its component parts, and it makes the case that data is best conceptualized—and applied—as a complementary extension of human ingenuity. Computers can crunch numbers, but when it comes to contextualizing and applying that analysis, only a human mind will suffice.
This chapter also emphasizes the importance of data literacy as both an organizational best practice and a core curriculum. It explores how Big Data is being used by nonprofits and universities to alleviate some of the world's most pressing problems (such as disease control, environmental issues, and famine reduction) and considers how Big Data can be applied at the micro-level for individual optimization. The chapter concludes with options for reframing the Big Data debate into a benefits-oriented discussion of data-driven innovation.
A QUANTITATIVE SHIFT
One challenge we find when talking about Big Data is that the term is often described by way of anecdotal example (rather than formal denotation). Big Data might be "Google's satellite mapping imagery," "the streaming financial data supporting Wall Street," or even "all of the people on Facebook," depending on who you ask. Yet, do all these examples really represent the same thing? How big does data have to be before it can be considered Big Data? Is Big Data really a "thing" at all—or is it also a process? Can we effectively promote the benefits of Big Data when we can't even agree on what Big Data is?
Most observers would agree that Big Data is a broad, catch-all term that captures not only the size of particular datasets but also advances in data storage, analytics, and the process of digitally quantifying the world. Big Data may be a nebulous term, but that doesn't mean it is useless. We commonly speak about equally vague technological terms like "social media," "cloud computing," and even "the Internet" without prefacing every remark with peer-reviewed and linguist-approved qualifications. Big Data is more of a dynamic than a thing, but the different facets and technology developments that reflect that dynamic are well known—leading to a multitude of anecdotal examples.
What makes Big Data so useful? It's a complicated—and highly contextual—question, but a simple response really does begin with the defining descriptive attribute: volume. For analytical purposes, more data tends to produce better results. Peter Norvig, an artificial intelligence expert at Google, provides an illustrative analogy in his presentation on "The Unreasonable Effectiveness of Data."[ii]
Norvig notes that a 17,000-year-old cave painting effectively tells its audience as much about a horse—a four legged, hoofed mammal with a thick mane—as any photograph. While drawing the animal with dirt and charcoal is a much slower process than snapping a picture, the information conveyed is fundamentally the same. No matter how advanced the technology that produces it, a single piece of data will always contain a limited amount of both implicit and contextual information. Capturing consecutive images of a horse in the form of a video, however, produces a much fuller assessment of how the animal moves, behaves, and interacts with its environment. Even a modest quantitative shift in the data allows for a far more qualitatively rich assessment.
In the Big Data era, we can not only capture a series of videos of a horse; we could capture the animal's every movement for hours, days, or weeks. Before Big Data processing programs, organizations could not effectively analyze all of the data points they possessed or collected about a particular phenomenon. That was why accurate, representative sampling was so important. Today, it's not only possible, but preferable to pull and analyze all of the data.
Volume, however, isn't the whole story. Although many of the datasets identified in press accounts are staggeringly large (such as the 200-terabyte dataset for the 1000 Genomes Project, cataloging human genetic variation), other datasets lumped in with this trend are not nearly as extensive. Big Data is ultimately less about the size of any one dataset than: (1) a capacity to search, aggregate, and cross-reference an ever-expanding ecosystem of datasets, which include the incomparably large and the proportionally small; and (2) an ability to render previously qualitative research areas into quantitative data.
An excellent example of the latter is Google's Ngram Viewer. In 2004, Google began scanning the full text of the world's entire body of books and magazines as part of its Google Print Library Project. This digitization effort was eventually folded under the Google Books label, which today encompasses more than 20 million scanned books. The Ngram Viewer allows users to search through 7.5 million of these books (about one-seventh of all books ever published) and graph the frequency with which particular words or phrases have been used over time in English, Chinese, Russian, French, German, Italian, Hebrew, and Spanish-language literature.[iii]
Time referred to the project as perhaps the "closest thing we have to a record of what the world has cared about over the past few centuries."[iv]
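The idea of graphing how often a phrase appears over time can be illustrated with a minimal sketch. It is not Google's implementation; the toy corpus, the function name `yearly_frequency`, and the whitespace tokenization are all assumptions made for illustration. For each year, it computes the share of n-grams that match the phrase.

```python
from collections import Counter

def yearly_frequency(corpus, phrase):
    """Return {year: share of that year's n-grams equal to `phrase`}.

    `corpus` is an iterable of (year, text) pairs. Tokenization is a plain
    lowercase whitespace split, which is a simplifying assumption.
    """
    target = tuple(phrase.lower().split())
    n = len(target)
    hits = Counter()    # matching n-grams per year
    totals = Counter()  # all n-grams per year
    for year, text in corpus:
        tokens = text.lower().split()
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        totals[year] += len(grams)
        hits[year] += sum(1 for gram in grams if gram == target)
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}

# Hypothetical toy corpus: (publication year, text) pairs.
corpus = [
    (1900, "the horse pulled the cart"),
    (1900, "a horse and a carriage"),
    (2000, "the car passed the horse"),
    (2000, "the car and the bus"),
]

freq = yearly_frequency(corpus, "horse")
print(freq)
```

Dividing by each year's total n-gram count, rather than reporting raw counts, keeps years with more published text from dominating the graph.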
A more modern example is the proliferation of data-driven journalism outlets. While "data" journalism in the broadest sense has been around for ages (think political polls and census analysis), the influx of user-friendly statistical software, easily accessible spreadsheets, and data-curious reporters has transformed a niche concentration within newsrooms into a whole new type of journalism.[v]
The most prominent example is FiveThirtyEight. Despite some early stumbles, the site has found a loyal audience through compelling political analysis (especially election predictions and polling criticisms), statistics-heavy sports features (site founder Nate Silver got his start in baseball sabermetrics), and even irreverent lifestyle features (such as an advice column supported by statistical assessments of "normal").
What FiveThirtyEight, Vox, Wonkblog, and all of their number-crunching contemporaries have in common is a commitment to explaining public policy, international affairs, and the day's news through the lens of data analysis (rather than anecdotal reporting). This quantitative approach is reflective of a larger shift in how we read, learn, and process information. Today's readers are no longer satisfied with two-dimensional news; they want strong reporting and analysis presented in an engaging, easy-to-digest (read: mobile-optimized) format. The sort of interactive infographics, explanatory videos, and multi-platform features that are thriving online simply weren't possible in the old days of print. News organizations are responding to demand for these stories by actively recruiting reporters with a statistical background and designers skilled in data visualization.
FROM RAW DATA TO USEFUL INFORMATION
It is easy to imagine Big Data as a massive Excel spreadsheet just waiting for somebody to hit "sort," but that's not quite right. In many cases, Big Data is closer to the unsorted mess of memories and factoids floating around in our heads. This wide array of information and details can only be processed through the metadata (see Chapter 1) that facilitates dense linkages and logical pattern recognition. Changing raw data to actionable information, then, requires a full understanding of context.
The functional relationship between data and information is detailed in the Data-Information-Knowledge-Wisdom (DIKW) pyramid, a heuristic device brought to prominence in the late 1980s by organizational theorist Russell Ackoff.[vi] The pyramid explains that information is typically defined in terms of data, knowledge in terms of information, and wisdom in terms of knowledge. Theorists contest some of the finer points of this logical progression—especially the distinction (if there is one) between wisdom and knowledge—but as a general framework, the DIKW pyramid remains useful in demarcating analytically fuzzy concepts.
When critics challenge some of the claims surrounding Big Data, they are usually targeting misunderstandings about the relationship between information and data. For example, Wired editor Chris Anderson famously claimed that with enough data, the numbers would "speak for themselves" and "make the scientific method obsolete."[vii] Anderson's wide-eyed assessment (which was widely mocked, even by Big Data practitioners) was incorrect because he failed to recognize that data is useless without context, theory, and interpretation.
In a great New York Times op-ed, New York University's Ernest Davis and Gary Marcus point out that a Big Data analysis of the crime rate in all 3,143 counties in the United States between 2006 and 2011 might reveal that the declining murder rate is strongly correlated with the diminishing market share of Internet Explorer.[viii] A similarly comprehensive analysis of autism diagnosis cases and organic food consumption might reveal a statistically significant correlation. Big Data can produce endless examples of such correlations but is thus far ineffective at determining which correlations are meaningful.
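A small simulation shows why such correlations turn up so readily. The sketch below is an illustration under stated assumptions, not the analysis Davis and Marcus describe: it draws one random "target" series of six points (standing in for six annual observations, 2006 to 2011) and 1,000 unrelated random series, then reports the strongest correlation found purely by chance. The function names, series counts, and seed are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def strongest_chance_correlation(n_series=1000, n_points=6, seed=0):
    """Largest |r| between one random series and many unrelated random series."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_series):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson(target, candidate)))
    return best

best = strongest_chance_correlation()
print(f"strongest |r| among unrelated series: {best:.2f}")
```

With so few data points and so many candidate series, at least one near-perfect correlation is all but guaranteed, even though every series is independent noise. Deciding which correlations deserve attention still requires context and theory.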
Simply put, even the most advanced forms of number crunching and correlation recognition are useless without contextual application and analysis. In this area at least, even the fastest computers and most powerful analytic applications still trail the human mind, which is uniquely capable of making such connections. Writing in VentureBeat, ReD's Christian Madsbjerg and Mikkel Krenchel note that while computers excel at following narrowly defined rules, only the human brain is capable of reinterpreting, reframing, and redefining data to place it within a big picture.[ix]
Until computers are able to "think" creatively and contextually (or at least to mimic such cognitive functioning), the human brain will remain a necessary conduit between data analysis and data application, which is reassuring. Not only will Big Data not make humanity obsolete; advanced technology will make our most creative faculties more relevant than ever. Indeed, Big Data is the latest era-advancing piece of technology (not unlike world-changing innovations such as the printing press, steam engine, and semiconductor) that can be used to expand ontological horizons and scientific capabilities.
That said, a large caveat is in order: countless business headlines and anecdotal examples suggest that humans are just as capable of drawing the wrong conclusions from data as the correct ones. This is why data literacy is so important—both as an organizational best practice and as an educational praxis. Without the ability to understand and communicate data correctly, we may end up collecting the wrong data, ignoring the right data, failing to apply the data (or applying it incorrectly), extracting the wrong meaning from it, or twisting the results to support our preconceptions.
CULTIVATING A DATA-LITERATE WORKFORCE
Most contemporary usages of the term "literacy" refer not only to the ability to read and write but also to the skills required to think critically about how something is written and what it may represent. More sophisticated definitions also capture the ability to apply these skills for personal development and social transactions.[x] For example, policymakers recognize that an elementary level of "computer literacy" has essentially become a prerequisite for participation in modern society. As such, computer education has been seamlessly integrated into grade school curricula, and government programs and civil society initiatives have emerged to bring older adults up to speed.
The term "data literacy" captures a number of core deductive logic and statistical analysis skills that predate the shift to digital, but in the Big Data era, these abilities are more critical than ever. Data literacy is defined primarily by its active functional component—the ability to convert data into valuable and usable information. Retailers, marketers, and tech leaders have been ahead of the curve on this, transforming themselves into data-driven innovators through sizable investments in new technology and training.[xi] Universities have followed suit. Business analytics is gaining popularity as a curriculum focus within prominent MBA programs, while schools like Columbia University, Northwestern, New York University, and Stanford have launched quantitative studies and data mining programs.[xii] These courses of study prepare students to:
● Use statistical methods to extract patterns and trends from large datasets;
● Develop and use predictive models and analytics;
● Understand and use strategic decision-making applications; and
● Communicate findings in practical business language.
The increasing number of data-literate college graduates and business professionals is a good sign—although McKinsey still expects the United States to have a shortage of up to 190,000 data scientists by 2020.[xiii]
Yet, colleges and businesses are not the only sources of a data-literate workforce. Learning how to crunch numbers and use the results to tell stories with words and visuals can (and should) start as early as elementary school. An over-reliance on calculators, computers, and text, however, threatens some of our most innate and powerful tools that lead to data literacy. Thus, teaching critical reasoning and visual storytelling skills is essential throughout K-12 and college education, as well as in the professional world.
In a broader societal sense, data literacy should reflect a more passive level of competency and awareness among all people, much the way most people have a working knowledge of personal credit ratings or online banking. The proliferation of digital data impacts all of us, and it shouldn't require a master's degree in computer science for citizens and consumers alike to understand what sort of data is being collected and how and why this data is being analyzed and applied.
The private sector has an important role to play here as well. One of the reasons privacy concerns are raised in the Big Data discussion is that consumer data-collection practices remain opaque and poorly understood, even by practitioners. Forward-thinking businesses that “get” data would be well-advised to translate organizational data literacy into public-facing data resources. By proactively taking on consumer education, companies are able to responsibly pursue policies that benefit and also protect consumers. Businesses that don't understand—or deceitfully mask—their own data usage policies might best pull back and reevaluate. Making policies known, clear, and uncomplicated is a best practice in a data-driven, increasingly data-literate world.
DATA FOR DEVELOPMENT
News coverage of Big Data is most prominent in the business section of the Sunday paper, where readers find numerous stories detailing the newest tech developments from IT leaders, online giants, and big-box retailers. These pieces are always worth a read, but the science, health, and weather sections hold articles that reveal much broader, more altruistic uses of Big Data. For example:
● The Ocean Observatories Initiative recently began constructing a Big Data-scale cloud infrastructure that will store oceanographic data collected over the next 25 years by distributed sensors. The program will provide researchers with an unprecedented ability to study the Earth's oceans and climate.[xiv]
● Flatiron Health, a Big Data startup that consolidates cancer treatment records to offer practitioners a clearer and centralized overview of patient needs, recently raised $130 million in funding from some big name backers. The company plans to create the world's largest pool of structured real-world oncology data.[xv]
● Monsanto recently acquired The Climate Corporation, a San Francisco-based company that maintains a cloud-based farming information system that gathers weather measurements from 2.5 million locations every day. Climate Corporation uses this trove of weather data to help farmers cope with weather fluctuations.[xvi]
All of these examples underscore the idea that Big Data isn't limited to big business. Indeed, data-driven innovation has already been institutionalized within Harvard's Engineering Social Systems (ESS) program, where researchers are looking to census data, mobile phone records, and other newly available digital datasets to provide insights about the causal structure of food shortage in Uganda, the necessity of transportation planning in Rwanda, and the complex behavior of human societies everywhere.
The ESS program is part of a growing consortium of nonprofits, government agencies, universities, and private companies that have been given the label “Big Data for development.” DataKind is another bright star in that constellation. The New York-based nonprofit was created by New York Times R&D labs team member Jake Porway as a way to bring together data scientists and technology developers with civil society groups in a pro-bono capacity. Porway recognized that while many nonprofits and social ventures accumulate large datasets about issues relevant to their missions, they often lack the technology resources and skills to perform analytics.[xvii] DataKind started with local hackathon events but was soon working with the World Bank, Grameen Foundation, and the Red Cross to address problems ranging from fire prevention to good governance. The group now organizes data-dive events across the globe and will soon offer fellowships for longer-term engagements.
Another great example is Global Pulse, a UN initiative that develops critical connections between data mining and humanitarianism. The organization uses real-time monitoring and predictive analytics to locate early warning signs of distress in developing countries. Global Pulse scans cell phone activity, social networking sites, and online commerce platforms for signals of near-future unemployment, disease, and price hikes, thus allowing for more rapid responses from humanitarian groups. The personal nature of this data does, of course, bring up privacy concerns, but Global Pulse's analysis does not identify specific individuals or even groups of individuals. Rather, the organization looks at large datasets of anonymized, aggregated data—much of it Open Data, discussed further in Chapter 6—that can provide a sense of "how whole populations or communities are coping with shocks that can result in widespread behavioral changes."[xviii]
Big Data is not only driving how nonprofits operate; it is also dictating how they receive funding. The increasing amounts of public domain and voluntarily provided information about charities, nonprofits, and related-tech ventures can help donors and investors channel dollars to the organizations that are most effective at fulfilling their objectives. Such assessments can be further facilitated by applications that collate and update data from ongoing evaluations, common performance measures, and qualitative feedback. A recent Wall Street Journal article speculates about an ROI-optimized world where "foundations will be able to develop, assess and revise their giving strategies by pulling information from community surveys, organizational reports, and an up-to-date 'ticker' of other philanthropic giving."[xix]
This "Future of Philanthropy" is already happening. The Knight Foundation—which has emerged as the go-to philanthropic organization for funding "transformational ideas"—recently partnered with data analytics firm Quid to produce a detailed analysis of the financial investments that support "civic tech"-related ventures.[xx] "Civic tech" is something of a catch-all category that captures startups, nonprofits, and new technologies that focus on improving the health and vitality of cities. This ecosystem of established operations and new ventures is so large that it was previously difficult (if not impossible) to determine, in a schematic sense, precisely from where funding was coming and the results it was producing.
Quid's approach allowed the Knight Foundation to map out the field through semantic analysis of private and philanthropic investment data. This analysis revealed that the civic tech field has exploded over the past decade, growing at an annual rate of 23% from 2008 to 2012. Quid identified 209 unique civic tech projects within that landscape. Peer-to-peer projects—such as Lyft (an app that facilitates ridesharing) and Acts of Sharing (which addresses all aspects of collaborative consumption)—attracted the vast majority of investment, followed by clusters of ventures related to neighborhood forums, community organizing, and information crowdsourcing. The aim of the analysis was not simply to sketch out the existing civic tech investment ecosystem but to help guide its future development.
DATA FOR GOOD
Big Data encompasses not just the hardware and software advancements needed to work with data on a large scale but also the process of quantifying the world around us into observable and manageable digital data. Mayer-Schönberger and Cukier refer to this transformation as "datafication," and it is occurring constantly throughout the technology sector and at all levels of government and business.[xxi] It is well-established that this process also extends into our personal lives. It is nearly impossible to proceed through a normal day without leaving behind a digital trail of online activities (e.g., Amazon purchases; Netflix viewing). Some marketing firms and tech companies are even deploying anthropologists into natural social settings to further quantify (via sophisticated preference rankings) those few interactions that are not mediated through technology, such as our communal exchanges with one another and our impulsive interactions with branded products and new gadgets.[xxii]
There are generally two responses to our increasingly quantified lives. The first response is to push back. European Courts, for example, continue to recognize users' "Right to be Forgotten"—effectively placing the onus on the online giants (e.g., Facebook) to remove damaging personal information from search results when requested by wronged parties. Even in a world without social media and negative Yelp reviews, however, individuals would still generate an enormous amount of digital data by using credit cards, phone apps, and keycards. Meanwhile, marketers would still send out "individualized" coupons and e-mails based on circulated consumer profiles and publicly available data. In other words, no amount of pushback will stop the data-generating activities individuals perform every day, nor will it degrade the business advantage in analyzing available data and applying insights gleaned from it.
The second response to a quantified existence is that if it is going to happen, we might as well harness it in positive ways for our personal use, such as aiding in time management, career choices, weight management, and general decision making. For example, it is easy to begin keeping a detailed log of hours spent working, hours spent traveling, hours spent relaxing, and even miles logged on the treadmill. Numerous gadgets and software programs facilitate this personal quantification. The Up fitness band from Jawbone, for example, is designed to be worn 24 hours a day, 7 days a week. When used with the accompanying application, the device can collect data on calories consumed, activity levels, and rest patterns. Up allows users to analyze daily activity to see when (and for how long) they were most active or most idle. Maintaining quantitative data about our professional and personal routines can help us achieve a qualitatively better work-life balance. This example is indicative of larger Big Data trends that are breaking down the quantitative-qualitative barrier and transforming the way we interact with the world around us.
Unfortunately, the potential in Big Data is endangered by current frameworks that have a tendency to either over-complicate the topic and make it inaccessible to non-scientific audiences or create uneasiness around the topic by emphasizing privacy concerns. As George Orwell argues in his famous 1946 essay “Politics and the English Language,” “An effect can become a cause, reinforcing the original cause and producing the same effect in an intensified form, and so on indefinitely.”[xxiii]
In other words, for businesses and policymakers, the way we talk about Big Data will define its use and either unleash or limit its value. To move past frameworks that obscure or alarm, it is much more productive to think about Big Data in terms of what it can enable, and is already enabling, in every industry: innovation. To this point, it is widely recognized by policymakers and the business community that many of the most critical sectors of the economy are reaping the benefits of data-driven innovation. The healthcare industry uses digitized patient records to create more cohesive patient care between facilities; financial services use Big Data-enabled monitoring software for more accurate (and real-time) market forecasting; and public administrators use Open Data to increase transparency and facilitate more effective feedback loops. Drawing attention to these examples of how data drives innovation (and by consequence, economic growth) is far more beneficial than focusing on the size of the data, the processing power required to analyze it, and particularly, the rarely seen (though often hyped) negative ramifications for consumers.
The way we talk about Big Data can educate and clarify the dynamic through a results-oriented policy lens. Helping policymakers view Big Data from this big-picture perspective is important, and it better contextualizes benefits for individuals, organizations, and economies. Undue regulation may inadvertently hamper the technology's development and diminish near-future benefits. As argued in the Global Information Technology Report in 2014: “Decisions that affect data-driven innovation are usually focused on the problems of privacy and data protection, but fail to consider economic and social benefits that regulation could preclude.”[xxiv]
As with any transformational moment in business, there will be leaders and followers. Integrating Big Data thinking across the public and private sectors will not only benefit the bottom line for the companies that figure it out, but it will also benefit consumers, as they will be more informed and thus better able to navigate the Big Data landscape and enjoy all the benefits it offers. The companies that lead the way will therefore have a competitive advantage, both internally, through more efficient use of data, and externally, through deeper insight into their customers and a greater ability to serve them.
Leslie Bradshaw is a managing partner at Made by Many, a product innovation company with offices in New York City and London. Named one of the “Most Creative People in Business” by Fast Company in 2013, Bradshaw focuses on interpreting and visualizing data, understanding what it takes to build and grow companies, and creating lasting business impact through innovation. A graduate of the University of Chicago and contributor to Forbes, Bradshaw led her first company to the Inc. 500 list twice for revenue growth experienced during her tenure.
[i] Danah Boyd and Kate Crawford, “Critical Questions for Big Data: Provocations for a Cultural, Technological, and Scholarly Phenomenon,” Information, Communication & Society 15 no. 5 (2012): 662-679.
[ii] Alon Halevy, Peter Norvig, and Fernando Pereira, “The Unreasonable Effectiveness of Data,” IEEE Intelligent Systems Magazine 24, no. 2 (2009): 8-12.
[iii] Jon Orwant, “Ngram Viewer 2.0,” Research Blog, 18 Oct. 2012.
[iv] Harry McCracken, "The Rise and Fall of Practically Everything, as Told by the Google Books Ngram Viewer," Time, 16 Jan. 2014.
[v] Roger Yu, “Booming Market for Data-Driven Journalism,” USA Today, 17 March 2014.
[vi] Russell Ackoff, "From Data to Wisdom," Journal of Applied Systems Analysis 16 (1989).
[vii] Chris Anderson, “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete,” Wired, 23 June 2008.
[viii] Gary Marcus and Ernest Davis, “Eight (No, Nine!) Problems With Big Data,” The New York Times, 6 April 2014.
[ix] Mikkel Krenchel and Christian Madsbjerg, “No, Big Data Will Not Mirror the Human Brain—No Matter How Advanced Our Tech Gets,” VentureBeat, 17 Nov. 2013.
[x] Li Wei, ed., Applied Linguistics (Hoboken, NJ: Wiley-Blackwell, 2013).
[xi] In the mid-1990s, for example, British supermarket chain Tesco partnered with computer science firm Dunnhumby to establish Tesco's Clubcard, which allowed Tesco to track customers' purchasing behaviors and to optimize its product lines and targeted marketing. This statistical approach was so effective that Tesco began applying it to other operations as well and was able to expand its market share by more than 10% over the next decade. Tesco acquired a majority stake in Dunnhumby in 2006.
[xii] Elizabeth Dwoskin, "Universities Go in Big for Big Data," Wall Street Journal, 28 Aug. 2013.
[xiii] James Manyika et al., “Big Data: The Next Frontier for Innovation, Competition, and Productivity,” McKinsey Global Institute, May 2011.
[xiv] Dana Gardner, “Cloud and Big Data Give Scientists Unprecedented Access to Essential Climate Insights,” ZDNet, 13 Aug. 2012.
[xv] George Leopold, “Google Invests $130 Million in Cancer-Fighting Big Data Firm,” datanami, 21 May 2014.
[xvi] Bruce Upbin, “Monsanto Buys Climate Corp For $930 Million,” Forbes, 2 Oct. 2013.
[xvii] Joao Medeiros, "Jake Porway Wants to Turn His Network of Scientists into a League of Information Champions," Wired, 4 June 2013.
[xviii] "FAQs," United Nations Global Pulse, <http://www.unglobalpulse.org/about/faqs> (15 Aug. 2014).
[xix] Lucy Bernholz, “How Big Data Will Change the Face of Philanthropy,” Wall Street Journal, 15 Dec. 2013.
[xx] Mayur Patel et al., "The Emergence of Civic Tech: Investments in a Growing Field," Knight Foundation, Dec. 2013.
[xxi] Kenneth Cukier and Viktor Mayer-Schönberger, Big Data: A Revolution That Will Transform How We Live, Work, and Think (New York: Eamon Dolan/Houghton Mifflin Harcourt, 2013).
[xxii] Graeme Wood, "Anthropology Inc.," The Atlantic, March 2013.
[xxiii] George Orwell, "Politics and the English Language," Horizon, April 1946.
[xxiv] Pedro Less Andrade et al., “From Big Data to Big Social and Economic Opportunities: Which Policies Will Lead to Leveraging Data-Driven Innovation’s Potential?” in The Global Information Technology Report 2014: Rewards and Risks of Big Data, INSEAD, Cornell University and the World Economic Forum, 24 April 2014.