Big Data Needs Thick Data

Tricia Wang

Editor’s Note: Tricia provides an excellent segue between last month’s “Ethnomining” Special Edition and this month’s on “Talking to Companies about Ethnography.” She offers further thoughts building on our collective discussion (perhaps bordering on obsession?) with the big data trend. With nuance she tackles and reinvents some of the terminology circulating in the various industries that wish to make use of social research. In the wake of big data, ethnographers, she suggests, can offer thick data. In the face of derisive mention of “anecdotes” we ought to stand up to defend the value of stories.


image from Mark Smiciklas at Intersection Consulting

Big Data can have enormous appeal. Who wants to be thought of as a small thinker when there is an opportunity to go BIG?

The positivistic bias in favor of Big Data (a term often used to describe the quantitative data that is produced through analysis of enormous datasets) as an objective way to understand our world presents challenges for ethnographers. What are ethnographers to do when our research is seen as insignificant or of little value? Can we simply ignore Big Data as too muddled in hype to be useful?

No. Ethnographers must engage with Big Data. Otherwise our work can be all too easily shoved into another department, minimized as a small line item on a budget, and relegated to the small data corner. But how can our kind of research be seen as equally important to algorithmically processed data? What is the ethnographer’s 10-second elevator pitch to a room of data scientists?

…and GO!

Big Data produces so much information that it needs something more to bridge and/or reveal knowledge gaps. That’s why ethnographic work holds such enormous value in the era of Big Data.

Lacking the conceptual words to quickly position the value of ethnographic work in the context of Big Data, I have begun, over the last year, to employ the term Thick Data (with a nod to Clifford Geertz!) to advocate for integrative approaches to research. Thick Data uncovers the meaning behind Big Data visualization and analysis.

Thick Data: ethnographic approaches that uncover the meaning behind Big Data visualization and analysis.

Thick Data analysis primarily relies on human brain power to process a small “N” while Big Data analysis requires computational power (of course with humans writing the algorithms) to process a large “N”. Big Data reveals insights with a particular range of data points, while Thick Data reveals the social context of and connections between data points. Big Data delivers numbers; Thick Data delivers stories. Big Data relies on machine learning; Thick Data relies on human learning.

Insights from network data analysis that yield field observations

Fabien Girardin

Editor’s note: This post for the April ‘Ethnomining’ edition comes from Fabien Girardin (@fabiengirardin), who describes his work with networked/sensor data at the Louvre Museum in Paris. Based on this inspiring case study, he discusses the overall process, how mixed methods are relevant in his work, and what kinds of lessons he learnt doing this.

Fabien Girardin is Partner at the Near Future Laboratory, a research agency. He is active in the domains of user experience, data science and urban informatics.


Visitor congestion at the Louvre Museum, picture by Fabien Girardin

At the Near Future Laboratory we like to experiment and to go in different directions from the typical technology consultancy. We thrive on the involvement of multiple practices, and bet on the unordinary when it comes to question formulation, data collection and solution creation. After completing my PhD in Computer Science, I left the bounded disciplines of academia to embrace learning and connecting to the other “fields”, the other ways of knowing and seeing the world. Along with partners Julian Bleecker, Nicolas Nova and a network of tactical scouts, we formed a technology-based practice that combines insight and analysis, design and research, and rapid prototyping to transform ideas into material form.

Over the past 5 years, I have led investigations that aim to extract knowledge from the byproducts of people’s digital activities (i.e. network data, also often called digital shadows or digital footprints). That intangible material can take the form of logs of cellular network activity, aggregated credit card transactions, real-time traffic information, user-generated content or social network updates. Over time my contributions have evolved into helping transform this type of big data into insights, products and services. Whether applied for a client or as part of our self-started initiatives, this practice requires the basic skills of a “data scientist” (data analysis, information architecture, software engineering and creativity) along with a capacity to engage at the intersections with a wide variety of professionals, from physicists and engineers to lawyers, strategists and designers. Because investigations of network data cut across disciplines, they require understanding the different languages that shape technologies, reporting on the context of their use, and describing people’s practices. The model of inquiry blends qualitative field observations with quantitative evidence often extracted from logs.

The investigation with network data involves multiple practices and skills from engineering, to statistics, design, strategy planning, product management and ethnography; picture by Fabien Girardin

Past projects have led us to exploit untapped data sources, uncover opportunities to transform data into insights, and materialize new services or products. Our method first contemplates datasets and techniques to approach our objectives. Then we develop tangible solutions that engage the project stakeholders in exploring different scenarios and solutions. It is through the experiences of people with knowledge of the project domain that we are able to extract possible near-future changes and opportunities.

