Tag Archives: Wikipedia

Trace ethnography: a retrospective

Stuart Geiger (@staeiou) continues our edition of ‘The Person in the (Big) Data’ with a reflection on his practice of ‘trace ethnography’, which focuses on the trace-making techniques that render users’ activities and intentions legible to each other. Importantly, Stuart argues, we as researchers need to see these traces in the context of our active socialization within the community in question, rather than passively reading traces through lurking.

When I was an M.A. student back in 2009, I was trying to explain various things about how Wikipedia worked to my then-advisor David Ribes. I had been ethnographically studying the cultures of collaboration in the encyclopedia project, and I had gotten to the point where I could look through the metadata documenting changes to Wikipedia and know quite a bit about the context of whatever activity was taking place. I was able to do this because Wikipedians do this: they leave publicly accessible trace data in particular ways, in order to make their actions and intentions visible to other Wikipedians. However, this was practically illegible to David, who had not done this kind of participant-observation in Wikipedia and had therefore not gained this kind of socio-technical competency. 

For example, if I added “{{db-a7}}” to the top of an article, a big red notice would be automatically added to the page, saying that the page has been nominated for “speedy deletion.” Tagging the article in this way would also put it into various information flows where Wikipedia administrators would review it. If any of Wikipedia’s administrators agreed that the article met speedy deletion criterion A7, then they would be empowered to unilaterally delete it without further discussion. If I was not the article’s creator, I could remove the {{db-a7}} trace from the article to take it out of the speedy deletion process, which means the person who nominated it for deletion would have to go through the standard deletion process. However, if I was the article’s creator, it would not be proper for me to remove that tag, and if I did, others would find out and put it back. If someone added the “{{db-a7}}” trace to an article I created, I could add “{{hangon}}” below it in order to inhibit this process a bit, although a hangon is just a request: it does not prevent an administrator from deleting the article.
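To make the example above concrete, here is a minimal sketch in Python of how these two traces could be read out of a page's wikitext. It is an illustration only: the function names `read_deletion_traces` and `may_remove_tag` are hypothetical, the patterns cover just the `{{db-a7}}`-style tag and `{{hangon}}` as written in this post, and nothing here is meant to describe how Wikipedia's own software works.

```python
import re

# Matches the two trace forms mentioned in the text: {{db-a7}} and {{hangon}}.
# Real templates have more variants and parameters; those are out of scope here.
DB_TAG = re.compile(r"\{\{\s*db-([a-z]\d+)\s*\}\}", re.IGNORECASE)
HANGON_TAG = re.compile(r"\{\{\s*hangon\s*\}\}", re.IGNORECASE)


def read_deletion_traces(wikitext):
    """Return the speedy-deletion criterion (or None) and whether a hangon follows it."""
    db_match = DB_TAG.search(wikitext)
    if db_match is None:
        return {"criterion": None, "hangon": False}
    # A hangon only counts as a reply if it appears below the deletion tag.
    hangon = HANGON_TAG.search(wikitext, db_match.end()) is not None
    return {"criterion": db_match.group(1).upper(), "hangon": hangon}


def may_remove_tag(editor, creator):
    """Convention described above: anyone except the article's creator may remove the tag."""
    return editor != creator
```

For a page whose wikitext begins with `{{db-a7}}` followed by `{{hangon}}`, the sketch reports criterion A7 with a hangon request attached. The parsing only recovers the marks; what they mean to Wikipedians still has to be learned through participation.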

Wikipedians at an in-person edit-a-thon (the Women’s History Month edit-a-thon in 2012). However, most of the time, Wikipedians don’t get to do their work sitting right next to each other, which is why they rely extensively on trace data to coordinate and render their activities accountable to each other. Photo by Matthew Roth, CC-BY-SA 3.0

I knew all of this both because Wikipedians told me and because this was something I experienced again and again as a participant observer. Wikipedians had documented this documentary practice in many different places on Wikipedia’s meta pages. I had first-hand experience with these trace data, first on the receiving end with one of my own articles. Then later, I became someone who nominated others’ articles for deletion. When I was learning how to participate in the project as a Wikipedian (which I now consider myself to be), I started to use these kinds of trace data practices and conventions to signify my own actions and intentions to others. This made things far easier for me as a Wikipedian, in the same way that learning my university’s arcane budgeting and human resource codes helps me navigate that bureaucracy far more easily.

Read More… Trace ethnography: a retrospective

About a bot: Materiality, multiplicity, and memory in the study of software agents

Stuart Geiger (@staeiou)

Editors’ note: The next post for our Ethnographies of Objects edition is by one of the people who inspired it when he talked about an ‘ethnography of robots’ for EM last year. Stuart Geiger (@staeiou) is a PhD student at UC Berkeley’s School of Information and a long-time Wikipedia editor who has been studying Wikipedia bots for many years and who has brought us really great insights: not only into how Wikipedia works but also into new ways of thinking about how to do ethnography of largely-online communities. In this thoughtful post, Stuart talks about how his ideas about bots have changed over the years, and about which of the images below is the “real” bot.

A few weeks ago, Heather Ford wrote to me and told me about this special edition of Ethnography Matters, focusing on the ethnography of objects.  She asked me if there was something I’d like to write about bots, which I’ve been struggling to ethnographically study for some time.  As I said in an interview I did with EM last year, I want to figure out how to ethnographically study these automated software agents themselves, not just the people who build them or have to deal with them.  Among all the topics that are involved in the ethnography of objects, Heather briefly mentioned that she was asking all the authors to provide a picture of their given object, whatever weird form that may take for bots.

At first, I started to think about the more standard epistemological questions I’d been wrestling with: What is the relationship between the ethnographer and the ethnographic subject when that subject isn’t a human, but an autonomous software program? What does it mean to relate an emic account of such a being, and what does ethnographic fieldwork look like in such an endeavor? How do classic concepts like agency, materiality, and the fieldsite play out when investigating what is often seen as more of an object than a subject? What do we even mean when we say ‘object’, and what are we using this term to exclude? I could take any one of these topics and write far too much about them, I thought.

As always, after jotting down some notes, my mind started to wander as I entered procrastination mode. I shelved the more ‘theoretical’ questions and moved to what I thought was the easier part of Heather’s request: to provide a photo of a bot.  I thought that finding an image would be a fun diversion, and I had so many great cases to choose from.  There were humorous bots, horrifying bots, and hidden bots.  There were bots who performed controversial tasks, and bots whose work was more mundane.  There were bots I loved and bots I hated, bots that were new and bots that were old.  There were bots I knew backwards and forwards, and bots who were still a mystery to me.  I just had to find an image that I felt best encapsulated what it meant to be a bot, and then write about it.  However, I didn’t realize that this simple task would prove to be far more difficult than I anticipated — and working out how to use imagery rather than text to talk about bots has helped me come to articulate many of the more complicated issues at work in my ethnography, particularly those around materiality, multiplicity, and memory.

Read More… About a bot: Materiality, multiplicity, and memory in the study of software agents

August 2013: Ethnographies of Objects

This month’s edition is co-edited by CW Anderson (@chanders), Juliette De Maeyer (@juliettedm) and Heather Ford (@hfordsa). The three of us met in June for the ICA preconference entitled ‘Objects of Journalism’ organised by Chris and Juliette. Over the course of the day, we heard fascinating stories of insights garnered through a focus on the objects, tools and spaces surrounding and interspersed with the business and practice of newsmaking: about faked photographs through the ages, about the ways in which news app designers think about news when designing apps for mobile devices and tablets, and about the evolution of the ways in which newsroom spaces were designed. We also heard rumblings – rarely fully articulated – that a focus on objects is controversial in the social sciences. In this August edition of Ethnography Matters, we offer a selection of objects from the conference as well as from an open call to contribute and hope that it sparks a conversation started by a single question: what can we gain from an ethnography of objects – especially in the fields of technology, media and journalism research?


Hardware. Image by Cover.69 on Flickr CC BY

Why an *ethnography* of objects?

As well as the important studies of body snatching, identity tourism, and transglobal knowledge networks, let us also attend ethnographically to the plugs, settings, sizes, and other profoundly mundane aspects of cyberspace, in some of the same ways we might parse a telephone book. Susan Leigh Star, 1999

Susan Leigh Star, in ‘The ethnography of infrastructure’, noted that we need to go beyond studies of identity in cyberspace and networks to (also) look at the often invisible infrastructure that surfaces important issues around group formation, justice and change. Ethnography is a useful way of studying infrastructure, she writes, because of its strengths of ‘surfacing silenced voices, juggling disparate meanings, and understanding the gap between words and deeds’.

In her work studying archives of meetings of the World Health Organization and old newspapers and law books concerning cases of racial recategorization under apartheid in South Africa, Star ‘brought an ethnographic sensibility to data collection and analysis: an idea that people make meanings based on their circumstances, and that these meanings would be inscribed into their judgements about the built information environment’.

Read More… August 2013: Ethnographies of Objects

Onymous, pseudonymous, neither or both?

Heather Ford

Editor’s Note: For our Virtual Identity edition, contributing editor Heather Ford (@hfordsa) explores the complications of attribution and identification in online research. Are members of online communities research subjects, research participants, amateur artists? When is online participation public, private, or something in between?


Pic by moriza on Flickr, CC BY NC SA

When I published one of my first studies of online communities as part of my master’s research, I came up against one of the most challenging aspects of online research: how to reflect the identity of one’s research participants. I had been observing an open educational content community and quoted one of the participants’ missives from the publicly available mailing list without referring to his name or username. I had thought that this was the right thing to do: to anonymize the data, thus protecting the subjects. But the “subject” was angry that he had been quoted “without attribution”. And he was right. If I was really interested in protecting the privacy of my subjects, why would I quote his sentence when anyone could probably Google it and find out who wrote it?

Since then, my process has evolved a lot, but I still send my research participants a draft of my paper before it gets published so that they can choose whether I a) anonymize their statements, b) attribute them to their usernames or c) attribute them to their full (“real”) names. But the process becomes unwieldy when doing detailed content analysis (or “trace ethnography” as per Geiger and Ribes) on Wikipedia, where only some editors accept email and where other editors may have left the project. These are publicly available statements on a website that is explicitly open for copying and remixing, but I’m also taking those statements out of the context in which they were written. This is technically a “remix” but may make some editors uncomfortable.

So, do I quote users and attribute their comments to their username on publicly accessible websites like Wikipedia? Or do I need to get their written permission where they choose whether they want me to attribute their name, username, both or neither?

Read More… Onymous, pseudonymous, neither or both?

Isolated vs overlapping narratives: the story of an AFD

Heather Ford

Editor’s Note: This month’s Stories to Action edition starts off with Heather Ford’s (@hfordsa) story of watching a story unfold on Wikipedia and in person. While working as an ethnographer at Ushahidi, Heather was in Nairobi, Kenya when she heard news of Kenya’s army invading Somalia. She found out that the article about this story was being nominated for deletion on Wikipedia because it didn’t meet the encyclopedia’s “notability” criteria. This local story became a way for Heather to understand why there was a disconnect between what Wikipedia editors and Kenyans recognised as “notable”. She argues that, although Wikipedia frowns on using social media as sources, the “word on the street” can be an important way for editors to find out what is really happening and how important the story is when it first comes out. She also talks about how her ethnographic work helped her develop insights for a report that Ushahidi would use in their plans to develop new tools for rapid real-time events.

Heather shared this story at Microsoft’s annual Social Computing Symposium organized by Lily Cheng at NYU’s ITP. Watch the video of her talk, in which she refers to changing her mind on an article she wrote a few years ago, The Missing Wikipedians.


A few of us were on a panel at Microsoft’s annual Social Computing Symposium led by the inimitable Tricia Wang. In an effort to reach across academic (and maybe culture) divides, Tricia urged us to spend five minutes telling a single story and what that experience made us realize about the project we were working on. It was a wonderful way of highlighting the ethnographic principle of reflexivity, where the ethnographer reflects on their attitudes/thoughts/reactions in response to the experiences that they have in the field. I told this story about the misunderstandings faced by editors across geographical and cultural divides, and how I’ve come to understand Articles for Deletion (AFDs) on Wikipedia that are related to Kenya. I’ve also added thoughts that I had after the talk/conference based on what I learned there.

In November, 2011, I arrived in Nairobi for a visit to the HQ of Ushahidi and to conduct interviews about a project I was involved with to understand how Wikipedians managed sources during rapidly evolving news events. We were trying to figure out how to build tools to help people who collaboratively curate stories about such events – especially when they are physically distant from one another. When I arrived in Nairobi, I went straight to the local supermarket and bought copies of every local newspaper. It was a big news day in the country because of reports that the Kenyan army had invaded Southern Somalia to try and root out the militant Al Shabaab terrorist group. The newspapers all showed Kenyan military tanks and other scenes from the offensive, matched by the kind of bold headlines that characterize national war coverage the world over.

A quick search on Wikipedia, and I noticed that a page had been created but that it had been nominated for deletion on the grounds that it did not meet Wikipedia’s notability criteria. The nominator noted that the event was not being reported as an “invasion” but rather an “incursion” and that it was “routine” for troops from neighboring countries to cross the border for military operations.

Read More… Isolated vs overlapping narratives: the story of an AFD

Why Wikipedia is no ‘proxy for culture’ (Part 1 of 3)

Culture close-up by biomedical scientist Nathan Reading on Flickr (CC BY)

Last month’s Wired magazine showed an infographic with a headline that read: ‘History’s most influential people, ranked by Wikipedia reach’ with a group of 20 men arranged in hierarchical order, from Jesus at number 1 to Stalin at number 20. Curious, I wondered how ‘influence’ and ‘Wikipedia reach’ were being decided. According to the article, ‘Rankings (were) based on parameters such as the number of language editions in which that person has a page, and the number of people known to speak those languages’. What really surprised me was not the particular arrangement of figures on this page but the conclusions that were being drawn from it.

According to the piece, César Hidalgo, head of the Media Lab’s Macro Connections group, who researched the data, made the following claims about the data gathered from Wikipedia:

a) “It shows you how the world perceives your own national culture.”

b) “It’s a socio-cultural mirror.”

c) “We use historical characters as proxies for culture.”

And finally, perhaps most surprising is this final line in the story:

Using this quantitative approach, Hidalgo is now testing hypotheses such as whether cultural development is structured or random. “Can you have a Steve Jobs in a country that has not generated enough science or technology?” he wonders. “Ultimately we want to know how culture assembles itself.”

It is difficult to comment on the particular method used by this study because there is little more than the diagram and a few paragraphs of analysis, and the journalist may have misquoted him, but I wanted to draw attention to the statements being made because I think they represent the growing phenomenon of big data analysts using Wikipedia data to make assumptions about ‘culture’.

Read More… Why Wikipedia is no ‘proxy for culture’ (Part 1 of 3)

Where does ethnography belong? Thoughts on WikiSym 2012

On the first day of WikiSym last week, as we started preparing for the open space track and the crowd was being petitioned for new sessions over lunch, I suddenly thought that it might be a good idea for researchers who used ethnographic methods to get together to talk about the challenges we were facing and the successes we were having. So I took the mic and asked how many people used ethnographic methods in their research. After a few raised their hands, I announced that lunch would be spent talking about ethnography for those who were interested. Almost a dozen people – many of whom are big data analysts – came to listen and talk at a small Greek restaurant in the center of Linz. I was impressed that so many quantitative researchers came to listen and try to understand how they might integrate ethnographic methods into their research. It made me excited about the potential of ethnographic research methods in this community, but by the end of the conference, I was worried about the assumptions on which much of the research on Wikipedia is based, and about what this means for the way that we understand Wikipedia in the world.

WikiSym (Wiki Symposium) is the annual meeting of researchers, practitioners and wiki engineers to talk about everything to do with wikis and open collaboration. Founded by the father of the wiki, Ward Cunningham, and others, the conference started off as a place where wiki engineers would gather to advance the field. Seven years later, WikiSym is dominated by big data quantitative analyses of English Wikipedia.

Some participants were worried about the movement away from engineering topics (like designing better wiki platforms), while others were worried about the fact that Wikipedia (and its platform, MediaWiki) dominates the proceedings, leaving other equally valuable sites like Wikia and platforms like TikiWiki under-studied.

So, in the spirit of the times, I drew up a few rough analyses of papers presented.

(Wikipedia and its platform, MediaWiki, are but one of a host of wiki communities and platforms, which is why I’ve distinguished between Wikipedia and others.)

It would be interesting to look at this for other years to see whether the recent Big Data trend is having an impact on Wikipedia research and whether research related to Wikipedia (rather than other open collaboration communities) is on the rise. One thing I did notice was that the demo track was a lot larger this year than in the previous two years. Hopefully that is a good sign for the future because it is here that research is put into practice through the design of alternative tools. A good example is Jodi Schneider’s research on Wikipedia deletions, which she then used to conceptualize alternative interfaces that would simplify the process and help to ensure that each article would be dealt with more fairly.

Talking about ethnography?

I am still intrigued by the fact that so many quantitative analysts wanted to know about ethnography during our open space session. We started the session with those who had done ethnographic work talking about their experiences: Stuart Geiger talked about his ethnographic work on Wikipedia bots, Isis Amelie Hjorth talked about her ethnographic enquiry into Wreckamovie, the collaborative movie outfit from Finland, and Paško Bilić discussed how he studied breaking news stories on Wikipedia. Others wanted to know how you even begin to do ethnographic research on Wikipedia when editors are a) anonymous and b) located all around the world. One participant said, “I’m faced with 3 million edits (in my dataset) and I have to say something about them. How do I even begin?”

Read More… Where does ethnography belong? Thoughts on WikiSym 2012

The tools we use: Supporting Wikipedia analysis

The Ethnomatters team has been wanting to do a review of software tools for a while now, but when we got down to writing it, we realized that there are already very comprehensive software reviews in places like the University of Surrey’s website. So we decided instead to compile short posts on the tools that each of us used in our last ethnographic project, highlighting what worked, what didn’t work and what we’re thinking of trying in the future. We’d love to hear from you about your own experiences so please feel free to add yours in the comments below for further reading!

For my latest project (“Understanding sources”), I needed to collect data from a really wide variety of sources. I had interview data, articles and papers from the web, and then a multitude of Wikipedia talk pages, edits, history versions, related articles and image and video sources. For interviewing, I use my beautiful and incredibly trustworthy Zoom H2 audio recorder. I do my own transcriptions (as suggested by Jenna in order to get a really close understanding of the data) and for that I use ExpressScribe, which seems to work pretty well. I like that you can use “hot keys” to stop and play and that the speed dial is in a good place for slowing down the dictation.

Read More… The tools we use: Supporting Wikipedia analysis

Beyond reliability: An ethnographic study of Wikipedia sources

Almost a year ago, I was hired by Ushahidi to work as an ethnographic researcher on a project to understand how Wikipedians managed sources during breaking news events. Ushahidi cares a great deal about this kind of work because of a new project called SwiftRiver that seeks to collect and enable the collaborative curation of streams of data from the real time web about a particular issue or event. If another Haiti earthquake happened, for example, would there be a way for us to filter out the irrelevant and the misinformation, and build a stream of relevant, meaningful and accurate content about what was happening for those who needed it? And on Wikipedia’s side, could the same tools be used to help editors curate a stream of relevant sources as a team rather than as individuals?

Original designs for voting a source up or down in order to determine “veracity”

When we first started thinking about the problem of filtering the web, we naturally thought of a ranking system which would rank sources according to their reliability or veracity. The algorithm would consider a variety of variables involved in determining accuracy, as well as whether sources had been chosen or voted up or down by users in the past, and would eventually be able to suggest sources according to the subject at hand. My job would be to determine what those variables were, i.e. what editors were looking at when deciding whether to use a source or not.
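A ranking of this kind can be sketched in a few lines of Python. This is only an illustration of the up/down voting idea, not the SwiftRiver design: the function names, the shape of the example data, and the smoothing formula (every source is treated as if it already had one up-vote and one down-vote, so that unvoted sources start at 0.5) are all assumptions introduced here.

```python
def veracity_score(up, down):
    """Smoothed share of up-votes; a source with no votes scores 0.5."""
    return (up + 1) / (up + down + 2)


def rank_sources(votes):
    """Sort source names from highest to lowest score.

    `votes` maps a source name to an (up, down) tuple of past vote counts.
    """
    return sorted(votes, key=lambda name: veracity_score(*votes[name]), reverse=True)
```

Given hypothetical counts such as `{"Source A": (8, 2), "Source B": (1, 5), "Source C": (0, 0)}`, the sketch ranks A first, the unvoted C second and B last. A score like this captures only past votes; it says nothing about the variables editors actually weigh when choosing a source, which is what the research set out to discover.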

I started the research by talking to as many people as possible. Originally I was expecting that I would be able to conduct 10-20 interviews as the focus of the research, finding out how those editors went about managing sources individually and collaboratively. The initial interviews enabled me to hone my interview guide. One of my key informants urged me to ask questions about sources not cited as well as those cited, leading me to one of the key findings of the report (that the citation is often not the actual source of information and is often provided in order to appease editors who may complain about sources located outside the accepted Western media sphere). But I soon realized that the editors with whom I spoke came from such a wide variety of experience, work areas and subjects that I needed to restrict my focus to a particular article in order to get a comprehensive picture of how editors were working. I chose the 2011 Egyptian revolution article because I wanted a globally relevant breaking news event that would have editors from different parts of the world working together on an issue with local expertise located in a language other than English.

Read More… Beyond reliability: An ethnographic study of Wikipedia sources

From San Francisco to Cairo and back again: Collaborating across cultures

Annie Lin. Pic by Guillaume Paumier CC BY 3.0

I’ve been trying to talk to Egyptian Wikipedia editors for a project about the experience of Wikipedia editors in the Middle East and am finding it really difficult to connect to relevant people through their Talk pages. And so I went to talk to Annie Lin, Global Education Program Manager at the Wikimedia Foundation, about how she engaged with editors in Egypt at the start of a project to get students in local universities to write Wikipedia articles. In this interview, Lin talks about ways for outsiders to gain access by giving up power, encouraging participation and changing communication styles and platforms where the culture demands it. She’s given me some great things to think about as I build a more grounded understanding of editing in the Middle East, and I’m sure there are some gems in here that will help others as they think about doing ethnography starting from online places.

Annie Lin is excited. The first pilot project that she oversaw in Cairo, Egypt to encourage students in local universities to contribute to Wikipedia has been a success – and although the term has ended, many students are still editing.

May was the last month of classes but a lot of students say they’ll keep editing. It seems that the students are excited about the idea that they’re contributing Arabic topics in the Arab world.

The pilot project, involving 60 students from 7 classes in 2 universities, had students create articles in Arabic Wikipedia either as part of the curriculum or as an extracurricular activity. In an initial survey asking students what would motivate them to edit Wikipedia, the most common motivation was a sense of contributing information about Egypt or the Arab world. Lin says that when they show maps of Portuguese Wikipedia compared to Arabic Wikipedia, professors and students are shocked at the low numbers of Arabic articles.

Read More… From San Francisco to Cairo and back again: Collaborating across cultures