The human insights missing from big data
1,975,456 views | Tricia Wang • TEDxCambridge
Why do so many companies make bad decisions, even with access to unprecedented amounts of data? With stories from Nokia to Netflix to the oracles of ancient Greece, Tricia Wang demystifies big data and identifies its pitfalls, suggesting that we focus instead on "thick data" -- precious, unquantifiable insights from actual people -- to make the right business decisions and thrive in the unknown.
This talk was presented to a local audience at TEDxCambridge, an independent event. TED's editors chose to feature it for you.
Read more about TEDx.
Access emergent human insights about tech and youth culture from Mainland China through Magpie Kingdom's weekly digest.
Learn more about the social implications of big data from Data & Society.
Connect to other people who are integrating "thick data" with big data on the Ethnography Hangout Slack #datatalk channel.
About the speaker
With astronaut eyes and ethnographer curiosity, Tricia Wang helps corporations grow by discovering the unknown about their customers.
Foster Provost and Tom Fawcett | O'Reilly Media, 2013 | Book
Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking
If there were one data science book I wish everyone in industry would read, it would be Foster and Tom's book. They do a brilliantly effective job of explaining what kinds of questions data science can and can’t answer. You don't need a statistics background to dive into this book. But you do need to be open to learning exactly what kinds of problems are best suited for a purely quantitative solution. There isn't any discussion of how to integrate "thick data," but that isn't the purpose of this book.
Ken Anderson | Harvard Business Review, 2009 | Article
"Ethnographic Research: A Key to Strategy"
Now that Foster and Tom’s book has given you a great idea of what kinds of questions data science can answer, you're probably asking yourself: what kinds of questions does "thick data" ethnography answer? Instead of recommending a book, here's a 415-word article by Intel's Ken Anderson that will give you an overview and a case study of how Intel used ethnography to move towards the next phase of explosive growth.
Tricia Wang | Medium, 2016 | Article
"Why Big Data Needs Thick Data"
By now you have a solid sense of what big data and thick data can do on their own. But I believe that they're each more powerful when they're integrated. In fact, we can’t make the most of either method in our current business environment unless we integrate the two. Here, I share my Nokia experience in greater detail. I go into greater depth on exactly what the differences between big and thick data are. And I also explain why I felt the need to create the term "thick data" for the world to use.
Steve Lohr | New York Times, 2012 | Article
"How Big Data Became So Big"
When I use the term "big data," I cringe because it's so vague that it verges on meaninglessness. But it's here to stay, and I have to admit that it does a decent job of summarizing a new kind of scale around quantitative data that cloud computing enables. Steve's article does a great job of tracing how this term emerged over time.
Mimi Onuoha | Medium, 2016 | Article
"The Point of Collection"
Let's say you're ready to start collecting data about people. What do you need to know to make sure you're starting off on the right foot? Mimi provides five practical guidelines on what you need to think about and discuss with your team.
Molly Templeton | Ethnography Matters, 2016 | Article
"Why do brands lose their chill? How bots, algorithms, and humans can work together on social media"
If you're in marketing, this is the essay for you. Brands are now taking a "data-driven" approach to managing their social strategy. But Molly urges brands to look beyond the numbers when working in the digital entertainment and marketing industry. She gives specific examples where algorithms don't know how to parse tweets by humans that are coded with multiple layers of emotional and cultural meaning. She offers the industry a new way to balance the emotional labor in audience management with data analysis.
Cathy O’Neil | Crown, 2016 | Book
Weapons of Math Destruction
If there's one book that will convince anyone that over-reliance on big data is dangerous, it's Cathy's book. For years I've followed Cathy's career, from her work as a mathematician in academia to her transition into the finance industry. She's an active blogger and writer with two other books about data science, but this is the book where she shines. She's taken all of the wisdom from that journey and collected it into a concise treasure for the world to eat up.
Matt LeMay | Medium, 2016 | Article
"On Net Promoter and Data Golems"
Net Promoter Score (NPS) is often seen as the single most effective way for companies to get to know their customers. But over-reliance on it and blind use of it put companies in a very dangerous place. Matt LeMay does a beautiful job explaining why NPS creates quantitative data models that don't reflect actual social models. He ends the article by offering up an alternative to NPS. (Preview: the solution isn’t to abandon it altogether!)
Cristian S. Calude and Giuseppe Longo | Foundations of Science, 2016 | Article
"The Deluge of Spurious Correlations in Big Data"
While this is an academic article, I can't think of a better piece than Cristian and Giuseppe's airtight argument on why having more data can lead to spurious correlations. The rise of data science has led people to proclaim the "end of science" because databases will lead the way in computer-discovered correlations. The authors use results from ergodic theory, Ramsey theory and algorithmic information theory to show that just having more quantitative data doesn’t mean that we can forgo the scientific method. As they say, “too much information tends to behave like very little information.”
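A minimal sketch of a simpler, related effect may help make the claim concrete. It is not the authors' argument (their results rest on Ramsey theory and algorithmic information theory); it only illustrates that when many unrelated variables are compared pairwise, some pairs look strongly correlated by chance. The function names, the 20-observation sample size and the 0.5 threshold are all illustrative assumptions.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def count_spurious_pairs(n_variables, n_observations=20, threshold=0.5, seed=0):
    """Count pairs of independent random variables whose |r| exceeds threshold.

    Every column is pure Gaussian noise, so any "strong" correlation
    found here is spurious by construction.
    """
    rng = random.Random(seed)
    columns = [
        [rng.gauss(0, 1) for _ in range(n_observations)]
        for _ in range(n_variables)
    ]
    return sum(
        1
        for a, b in combinations(columns, 2)
        if abs(pearson(a, b)) > threshold
    )


if __name__ == "__main__":
    # More variables means more pairs to compare, and more chance hits.
    for n in (10, 50, 100):
        print(n, "variables ->", count_spurious_pairs(n), "spurious pairs")
```

The number of pairs grows roughly with the square of the number of variables, so the count of chance "findings" rises quickly even though no real relationship exists anywhere in the data.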
Lauren Kirchner | ProPublica, 2015 | Article
"When Big Data Becomes Bad Data"
What happens when corporations increasingly rely on algorithms to make decisions? Lauren dives into this question by pointing out recent examples of just how risky it is to leave out thick data.
Alex Rosenblat | Harvard Business Review, 2016 | Article
"The Truth About How Uber's App Manages Drivers"
Uber is the quintessential company that has leveraged algorithms to build an automated business. But the story isn't so clean. In fact, Alex's article shows that by shifting employee-driver management from people to algorithms, a new set of problems emerges. Any discussion about the future of work needs to consider the perils of a pure "big data" solution.
Kate Crawford | The New Inquiry, 2014 | Article
"The Anxieties of Big Data"
As the co-founder of Artificial Intelligence NOW, Kate has always looked in her research at the effects of big data on people. In this article, she outlines the concrete ways in which consumers, citizens and people have anxieties about big data.
Nassim N. Taleb | Wired, 2013 | Article
"Beware of the Big Errors in Big Data"
A former derivatives trader turned professor of risk engineering at New York University's Polytechnic Institute, Nassim has time and time again delivered research on how to make decisions under conditions of uncertainty. He provides a great op-ed that summarizes the main points from his latest book, Antifragile: Things That Gain From Disorder, about the scale of errors that happen with big data.
Francois Chollet | The Keras Blog, 2017 | Article
"The Limitations of Deep Learning"
Francois has written a great primer that makes deep learning understandable for the layperson. He explains with total clarity why human-level AI is not possible right now with deep learning. He stops short of providing solutions on how to get closer to human-level AI, but the next few articles below do exactly that.
Steven Gustafson | Ethnography Matters, 2016 | Article
"The Human Side of Artificial Intelligence and Machine Learning"
If deep learning isn’t able to achieve human-level AI, how will we get closer to that? Steven Gustafson, the founder of the Knowledge Discovery Lab at the General Electric Global Research Center, asks an important question in this article: What is the role of humans in the future of intelligent machines? He makes the case that in the foreseeable future, artificially intelligent machines are the result of creative and passionate humans, and as such, we embed our biases, empathy and desires into the machines, making them more "human" than we often think.
Madeleine Clare Elish | Ethnography Matters, 2016 | Article
"The future of designing autonomous systems will involve ethnographers"
Madeleine presents a case for why current cultural perceptions of the role of humans in automated systems need to be updated in order to protect against new forms of bias and worker harms.
Che-Wei Wang | Ethnography Matters, 2016 | Article
"Mindful Algorithms: the new role of the designer in generative design"
Che-Wei contemplates why engineers and architects will need to become more like ethnographers with generative design. He asks if it's possible to convert ethnographic data into quantitative data as algorithmic input. I've long admired Che-Wei’s ability to bring a poetic quality to the deeply mathematical nature of his world, and this piece does exactly that.
Caroline Sinders | Fast Company, 2017 | Article
"The Most Crucial Design Job Of The Future"
Perhaps you’re wondering what kind of person could bridge both the thick data of ethnography and the big data of quantitative methods. Caroline Sinders’s article introduces a new job that bridges both the quantitative and qualitative: the data ethnographer. In the future, the data ethnographer will help our AI and ML systems build data models that better reflect how humans actually interact.
Rochelle King, Elizabeth F Churchill, Caitlin Tan | O'Reilly Media, 2017 | Book
Designing with Data: Improving the User Experience with A/B Testing
If you work in the field of design, this is your new go-to book on how to start integrating quantitative and qualitative data. The three rock-star authors, who have a long history in tech, systematically lay out how designers can work with data scientists. Get everyone on your team a copy of this book if you need to answer immediately applicable research questions such as "do we do X or Y?" (not the upstream business questions that I covered in my talk, such as "what does X mean?", "how do we understand Y?" or "what is the future of Z?"). Additionally, the authors provide real examples from their experience. Rochelle is a pioneer in integrating data scientists into her design team at Spotify, while Elizabeth has a long history of running mixed-methods research labs.
Silberzahn, Raphael et al. | PsyArXiv, 2017 | Article
"Many analysts, one dataset"
I classify this research project as one of the most important contributions to data analysis. 29 different data teams took the same data set with the same research question: are soccer referees more likely to give red cards to players with dark skin tones than to players with light skin tones? The analytic results varied widely between the teams, showing that even expert analysts couldn’t create a single, objective quantitative model of human behavior. The real lesson is that allowing analysts to pursue a range of research strategies ultimately leads to a more transparent discussion of the diversity of data approaches that can wildly affect outcomes and interpretations. This experiment shows that data science is a creative act, and as with anything creative, there is no such thing as objective truth.
Michelle Nijhuis | The New Yorker, 2017 | Article
"How to Call B.S. on Big Data: A Practical Guide"
What happens when you pair an information scientist and a biologist together? You get a sensational sell-out online course called Calling Bullshit in the Age of Big Data. Michelle captures the highlights from professors Jevin West's and Carl Bergstrom's class and walks us through a few principles that guide their syllabus and case studies. Jevin and Carl emphasize that you don’t need to be a statistician to call bullshit on data; you just need to have common sense.
About TEDx
TEDx was created in the spirit of TED's mission, "ideas worth spreading." It supports independent organizers who want to create a TED-like event in their own community.