3.2M views Aug 2010
Written by the educators who created Visualizing Data, a brief look at the key facts, tough questions and big ideas in their field. Begin this TED Study with a fascinating read that gives context and clarity to the material.
All of us now are being blasted by information design. It's being poured into our eyes through the Web, and we're all visualizers now; we're all demanding a visual aspect to our information...And if you're navigating a dense information jungle, coming across a beautiful graphic or a lovely data visualization, it's a relief, it's like coming across a clearing in the jungle.David McCandless
In today's complex 'information jungle,' David McCandless observes that "Data is the new soil." McCandless, a data journalist and information designer, celebrates data as a ubiquitous resource providing a fertile and creative medium from which new ideas and understanding can grow. McCandless's inspiration, statistician Hans Rosling, builds on this idea in his own TEDTalk with his compelling image of flowers growing out of data/soil. These 'flowers' represent the many insights that can be gleaned from effective visualization of data.
We're just learning how to till this soil and make sense of the mountains of data constantly being generated. As Gary King, Director of Harvard's Institute for Quantitative Social Science says in his New York Times article "The Age of Big Data":
It's a revolution. We're really just getting under way. But the march of quantification, made possible by enormous new sources of data, will sweep through academia, business and government. There is no area that is going to be untouched.
How do we deal with all this data without getting information overload? How do we use data to gain real insight into the world? Finding ways to pull interesting information out of data can be very rewarding, both personally and professionally. The managing editor of Financial Times observed on CNN's Your Money: "The people who are able to in a sophisticated and practical way analyze that data are going to have terrific jobs." Those who learn how to present data in effective ways will be valuable in every field.
Many people, when they think of data, think of tables filled with numbers. But this long-held notion is eroding. Today, we're generating streams of data that are often too complex to be presented in a simple "table." In his TEDTalk, Blaise Aguera y Arcas explores images as data, while Deb Roy uses audio, video, and the text messages in social media as data.
Some may also think that only a few specialized professionals can draw insights from data. When we look at data in the right way, however, the results can be fun, insightful, even whimsical — and accessible to everyone! Who knew, for example, that there are more relationship break-ups on Monday than on any other day of the week, or that the most break-ups (at least those discussed on Facebook) occur in mid-December? David McCandless discovered this by analyzing thousands of Facebook status updates.
There is more data available to us now than we can possibly process. Every minute, Internet users add the following to the big data pool (i):
These numbers are almost certainly higher now, as you read this. And this just describes a small piece of the data being generated and stored by humanity. We're all leaving data trails — not just on the Internet, but in everything we do. This includes reams of financial data (from credit cards, businesses, and Wall Street), demographic data on the world's populations, meteorological data on weather and the environment, retail sales data that records everything we buy, nutritional data on food and restaurants, sports data of all types, and so on.
Governments are using data to search for terrorist plots, retailers are using it to maximize marketing strategies, and health organizations are using it to track outbreaks of the flu. But did you ever think of collecting data on every minute of your child's life? That's precisely what Deb Roy did. He recorded 90,000 hours of video and 140,000 hours of audio during his son's first years. That's a lot of data! He and his colleagues are using the data to understand how children learn language, and they're now extending this work to analyze publicly available conversations on social media, allowing them to take "the real-time pulse of a nation."
Data can provide us with new and deeper insight into our world. It can help break stereotypes and build understanding. But the sheer quantity of data, even in just any one small area of interest, is overwhelming. How can we make sense of some of this data in an insightful way?
Visualization can help transform these mountains of data into meaningful information. In his TEDTalk, David McCandless comments that the sense of sight has by far the fastest and biggest bandwidth of any of the five senses. Indeed, about 80% of the information we take in is by eye. Data that seems impenetrable can come alive if presented well in a picture, graph, or even a movie. Hans Rosling tells us that "Students get very excited — and policy-makers and the corporate sector — when they can see the data."
It makes sense that, if we can effectively display data visually, we can make it accessible and understandable to more people. Should we worry, however, that by condensing data into a graph, we are simplifying too much and losing some of the important features of the data? Let's look at a fascinating study conducted by researchers Emre Soyer and Robin Hogarth. The study was conducted on economists, who are certainly no strangers to statistical analysis. Three groups of economists were asked the same question concerning a dataset:
Visualizing data can sometimes be less misleading than using the raw numbers and statistics!
What about all the rest of us, who may not be professional economists or statisticians? Nathalie Miebach finds that making art out of data allows people an alternative entry into science. She transforms mountains of weather data into tactile physical structures and musical scores, adding both touch and hearing to the sense of sight to build even greater understanding of data.
Another artist, Chris Jordan, is concerned about our ability to comprehend big numbers. As citizens of an ever-more connected global world, we have an increased need to get useable information from big data — big in terms of the volume of numbers as well as their size. Jordan's art is designed to help us process such numbers, especially numbers that relate to issues of addiction and waste. For example, Jordan notes that the United States has the largest percentage of its population in prison of any country on earth: 2.3 million people in prison in the United States in 2005 and the number continues to rise. Jordan uses art, in this case a super-sized image of 2.3 million prison jumpsuits, to help us see that number and to help us begin to process the societal implications of that single data value. Because our brains can't truly process such a large number, his artwork makes it real.
The TEDTalks in this collection depend to varying degrees on sophisticated technology to gather, store, process, and display data. Handling massive amounts of data (e.g., David McCandless tracking 10,000 changes in Facebook status, Blaise Aguera y Arcas synching thousands of online images of the Notre Dame Cathedral, or Deb Roy searching for individual words in 90,000 hours of video tape) requires cutting-edge computing tools that have been developed specifically to address the challenges of big data. The ability to manipulate color, size, location, motion, and sound to discover and display important features of data in a way that makes it readily accessible to ordinary humans is a challenging task that depends heavily on increasingly sophisticated technology.
There are good ways and bad ways of presenting data. Many examples of outstanding presentations of data are shown in the TEDTalks. However, sometimes visualizations of data can be ineffective or downright misleading. For example, an inappropriate scale might make a relatively small difference look much more substantial than it should be, or an overly complicated display might obfuscate the main relationships in the data. Statistician Kaiser Fung's blog Junk Charts offers many examples of poor representations of data (and some good ones) with descriptions to help the reader understand what makes a graph effective or ineffective. For more examples of both good and bad representations of data, see data visualization architect Andy Kirk's blog at visualisingdata.com. Both consistently have very current examples from up-to-date sources and events.
Creativity, even artistic ability, helps us see data in new ways. Magic happens when interesting data meets effective design: when statistician meets designer (sometimes within the same person). We are fortunate to live in a time when interactive and animated graphs are becoming commonplace, and these tools can be incredibly powerful. Other times, simpler graphs might be more effective. The key is to present data in a way that is visually appealing while allowing the data to speak for itself.
While graphs and charts can lead to misunderstandings, there is ultimately "truth in numbers." As Steven Levitt and Stephen Dubner say in Freakonomics, "[T]eachers and criminals and real-estate agents may lie, and politicians, and even C.I.A. analysts. But numbers don't." Indeed, consideration of data can often be the easiest way to glean objective insights. Again from Freakonomics: "There is nothing like the sheer power of numbers to scrub away layers of confusion and contradiction."
Data can help us understand the world as it is, not as we believe it to be. As Hans Rosling demonstrates, it's often not ignorance but our preconceived ideas that get in the way of understanding the world as it is. Publicly-available statistics can reshape our world view: Rosling encourages us to "let the dataset change your mindset."
Chris Jordan's powerful images of waste and addiction make us face, rather than deny, the facts. It's easy to hear and then ignore that we use and discard 1 million plastic cups every 6 hours on airline flights alone. When we're confronted with his powerful image, we engage with that fact on an entirely different level (and may never see airline plastic cups in the same way again).
The ability to see data expands our perceptions of the world in ways that we're just beginning to understand. Computer simulations allow us to see how diseases spread, how forest fires might be contained, how terror networks communicate. We gain understanding of these things in ways that were unimaginable only a few decades ago. When Blaise Aguera y Arcas demonstrates Photosynth, we feel as if we're looking at the future. By linking together user-contributed digital images culled from all over the Internet, he creates navigable "immensely rich virtual models of every interesting part of the earth" created from the collective memory of all of us. Deb Roy does somewhat the same thing with language, pulling in publicly available social media feeds to analyze national and global conversation trends.
Roy sums it up with these powerful words: "What's emerging is an ability to see new social structures and dynamics that have previously not been seen. ...The implications here are profound, whether it's for science, for commerce, for government, or perhaps most of all, for us as individuals."
Let's begin with the TEDTalk from David McCandless, a self-described "data detective" who describes how to highlight hidden patterns in data through its artful representation.
i. Data obtained June 2012 from “How Much Data Is Created Every Minute?” on http://mashable.com/2012/06/22/data-created-every-minute/.