On Digital Ethnography, What do computers have to do with ethnography? (1 of 4)

Editor’s Note: While digital ethnography is an established field within ethnography, we don’t often hear of ethnographers building digital tools to conduct their fieldwork. Wendy Hsu wants to change that. In the first of her three-part guest post series, she shows how ethnographers can use software, and even build their own software, to explore online communities. By drawing on examples from her own research on independent rock musicians, she shares with us how she moved from being an ethnographer of purely physical domains to an ethnographer who built software programs to gather more relevant qualitative data. 

Wendy is currently a Mellon Digital Scholarship Postdoctoral Fellow in the Center of Digital Learning + Research at Occidental College. She recently completed a Ph.D. in the Critical and Comparative Studies program in the McIntire Department of Music at the University of Virginia. Her dissertation, an ethnography of Asian American independent rock musicians, deploys the methods of ethnomusicology and digital humanities to explore the complex interrelationships between popular music and geography in transnational contexts. She implemented methods of digital ethnography to map musicians’ social networks. She tweets at @wendyfhsu and blogs at beingwendyhsu.info. She also plays with the vintage Asian garage pop revivalist band Dzian!.

Check out past posts from guest bloggers. Here are some ideas for how you can contribute.


When Tricia asked me to contribute a series on Ethnography Matters, I thought that I would take this opportunity to bring together the notes on digital ethnography that I have collected over the last couple of years. I would like to push the boundaries of computational usage in ethnographic processes a bit here. I really want to expand the definition of digital ethnography beyond the use of computers, tablets, and smart phones as devices to interact with online communities, or to capture, transfer, and store field media.

In this three-part series, I am going to discuss how working with computational tools could widen the scope of ethnographic work and deepen our practice. I will stay mostly within the domain of data gathering in this first post. In the second post, I will talk about the process of interpreting and visualizing field data; and in the last post, I will focus on how the digital may transform ethnographic narrative and argumentation.

In this post, I’d like to foreground computational methodology in thinking about how we as ethnographers may deploy digital tools as we explore communities within and around digital infrastructures. I am particularly interested in how we use these tools to study communities that are digitally organized. How do we use and think about data ethnographically? How does one use computational tools to navigate in digital communities? What are the advantages of leveraging (small) data approaches in doing ethnographic work? While this post is focused on the study of digitally embedded communities, in my later posts, I will speak more broadly about how the digital may extend how we look at communities where face-to-face interactions are central.

From bars to Myspace

The Hsu-nami @ Don Hills Club, NYC, 9/26/2008

When I was writing my dissertation on the experiences of Asian American musicians playing independent rock music, I discovered that most of the musicians that I connected with spent more time online networking and promoting their music, than actually performing, rehearsing, and recording. This shifted the site of my investigation away from the strictly physical, i.e. in clubs, bars, basement parties, coffee shops where musicians hang out, to include sites of digital social media such as Myspace, Twitter, Facebook, G-chat, etc.

In particular, I noticed that Myspace (mid to late 2000s) was a hot spot for social interactions. The musicians in my study used Myspace to extend their peer and fan networks beyond the borders of the United States. Many of them have forged connections with bands who were geographically based in Asia. I began wondering, what do these online communities look like geographically? Where in the world are the Myspace friends of my musician-informants located?

Myspace profile pages of The Kominas’ friends, circa 2009

Building a software tool to gather data

I set out to explore these bands’ digital social terrain beyond what Internet browsers display by leveraging software techniques like web-scraping. Web-scraping refers to a set of programmatic methods used to extract targeted information from web pages.[i] To extract location information displayed on Myspace profile pages, I created a web-scraper in the form of an Application Programming Interface (API). APIs are, by definition, a set of software components that act as an interface for communication across applications, typically based in the web environment (a friendly explanation of API).

An example of an API at work is a Twitter client (for example, Tweetbot or Tweetdeck). Instead of using Twitter via Twitter.com, developers have leveraged the robust Twitter API to make available apps for users to interact with Twitter via a computer, tablet, mobile or smart phone. In the case of my API, I had to build it from scratch by writing the commands in the Ruby scripting language. I used the Mechanize Ruby gem to navigate the source code of a series of targeted Myspace pages.
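To make the idea of targeted extraction concrete, here is a minimal sketch in plain Ruby. It does not reproduce the scraper described in this post: it uses only the standard library rather than Mechanize, and the sample markup, the `class="location"` attribute, and the `extract_location` helper are all hypothetical, since the structure of the actual Myspace pages is not shown here.

```ruby
# Hypothetical profile markup; the real Myspace page structure is not shown in the post.
SAMPLE_PROFILE_HTML = <<~HTML
  <html><body>
    <div class="profile">
      <span class="name">Example Band</span>
      <span class="location">Boston, Massachusetts, United States</span>
    </div>
  </body></html>
HTML

# Return the text inside the first element carrying class="location",
# or nil when the page has no such element.
def extract_location(html)
  match = html.match(%r{<[^>]*class="location"[^>]*>(.*?)</}m)
  return nil unless match

  match[1].strip
end

puts extract_location(SAMPLE_PROFILE_HTML)
```

A regular expression is enough for a fixed, known snippet like this one; for real pages, a proper HTML parser would likely be more robust against changes in markup.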

source code of my scraper API

I will take my work with the South Asian American punk band The Kominas as an example. During the period that I was web-scraping, The Kominas had close to 3,000 friends on Myspace. These were all Myspace users who had requested to become friends with The Kominas, or vice versa. My homebrewed API successfully crawled through the profile pages of 2,867 friends of The Kominas on Myspace and parsed the location-specific text in the source code of these pages. Because I planned on mapping these points, I scripted the API to use the Geokit Ruby gem to turn these friend locations into longitude and latitude coordinates.
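The crawl-parse-geocode sequence described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script used in the study: an in-memory hash stands in for the fetched profile pages, a small lookup table stands in for a geocoding service such as the one Geokit wraps, the coordinates are approximate, and every name (`FRIEND_PAGES`, `GAZETTEER`, `collect_coordinates`) is hypothetical.

```ruby
# Hypothetical stand-in for fetched profile pages, keyed by friend ID.
FRIEND_PAGES = {
  "friend-1" => '<span class="location">Lahore, Pakistan</span>',
  "friend-2" => '<span class="location">Boston, Massachusetts</span>',
  "friend-3" => '<span class="name">No location listed</span>'
}.freeze

# Hypothetical gazetteer standing in for a geocoding service.
# Coordinates are approximate [latitude, longitude] pairs.
GAZETTEER = {
  "Lahore, Pakistan" => [31.55, 74.34],
  "Boston, Massachusetts" => [42.36, -71.06]
}.freeze

# Pull the location text out of one page, or return nil if none is present.
def extract_location(html)
  match = html.match(%r{class="location"[^>]*>(.*?)</}m)
  match && match[1].strip
end

# Walk every page, keep the friends whose location can be geocoded,
# and record the IDs of those that cannot.
def collect_coordinates(pages, gazetteer)
  rows = []
  skipped = []
  pages.each do |friend_id, html|
    location = extract_location(html)
    coords = location && gazetteer[location]
    if coords
      rows << { id: friend_id, location: location, lat: coords[0], lng: coords[1] }
    else
      skipped << friend_id
    end
  end
  [rows, skipped]
end

rows, skipped = collect_coordinates(FRIEND_PAGES, GAZETTEER)
rows.each { |row| puts row.values.join("\t") }
puts "skipped: #{skipped.join(', ')}"
```

Keeping a list of skipped profiles matters for ethnographic use: the pages that fail to parse or geocode are themselves data about who is missing from the resulting map.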

a sample of data scraped from Myspace profile pages

This software tool allowed me to go beyond the textual and discursive dimensions of collecting field data, a path previously unexplored by academic online participant-observers. With the geographical information that I gathered from the API, I mapped out these friend locations. I will go deeper into this in my next post on data interpretation and visualization. But for now, I will say that having this set of data has allowed me to extract empirical, place-based findings from an ostensibly placeless digital environment. Not only that, it has enabled me to deepen my analysis. From these software findings, I generated further questions that are geographically focused and theoretically interesting around notions of space and place. Juxtaposing my findings from traditional (and physical) participant observation and software explorations, I discovered patterns of social behaviors and cultural meanings that I would not have had access to otherwise.

Discovering boundaries of software spaces

With this API, I was able to reach beyond the user- and consumer-end experiences of technology. Using a computational tool — a machine-based script that communicates with other machines — I was able to explore quite literally the software infrastructures in which my field interactions occur. This became apparent when my API broke while trying to scrape location information from the friends of The Hsu-nami, a New-Jersey-based progressive erhu-rock band that I followed, on Myspace China.

In troubleshooting, I found that Myspace is in fact not as global as it has promised to be. The Myspace user networks of all (of the available) countries in the world exist on a server located in the U.S., with the exception of the users of Myspace China. Hosted on a server in China, Myspace China is positioned institutionally apart from the rest of the Myspace networks in “the world.” These institutional and social barriers are reinforced by the software barriers between Myspace China and Myspace (U.S.) where The Hsu-nami’s profile page is hosted.[ii] From this software “observation”, I have gathered enough evidence to argue that there is not one single cyber space, but rather multiple cyber spaces. The Internet is not one giant blob of space. There are borders and boundaries, software- and hardware-dependent, that bind and separate these cyber spaces. And in the case of The Hsu-nami, forging connections with friends in China potentially suggests that ethnic meanings from musical sound and performance may transcend software barriers.
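One hedged way to make this kind of software boundary visible before a scraper breaks is to group profile addresses by the host that serves them. The sketch below is not how the troubleshooting described above was carried out; the URLs and hostnames are placeholders rather than real Myspace addresses, and `group_by_host` is a hypothetical helper.

```ruby
require "uri"

# Hypothetical friend profile URLs; the hostnames are placeholders,
# not real Myspace addresses.
FRIEND_URLS = [
  "http://profiles.example.com/friend-1",
  "http://profiles.example.com/friend-2",
  "http://profiles.example.cn/friend-3"
].freeze

# Group URLs by the host that serves them, so that profiles living on a
# separate platform instance stand out before any scraping begins.
def group_by_host(urls)
  urls.group_by { |url| URI.parse(url).host }
end

group_by_host(FRIEND_URLS).each do |host, urls|
  puts "#{host}: #{urls.length} profile(s)"
end
```

A tally like this says nothing about why the hosts differ; that question still calls for the institutional and historical inquiry sketched in the end note.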

Certain digital communities are more open to software approaches than others. The Myspace community, for instance, is much more closed than Twitter and Facebook. Last.fm, by contrast, is built around an open technology that documents each user’s song selection and forms a musical taste profile unique to the user. The records of users’ listening patterns are transferred or “scrobbled” to Last.fm’s database. Last.fm makes these data available for builders to create APIs; for that reason, it has become a self-proclaimed “social music playground” on which curious programmers and designers play with (mostly visual) patterns of music listening. One outstanding visualization project that is built upon the Last.fm API is the thesis project of Christopher Adjei and Nils Holland-Cunz. [Here’s a neat video documentation of the application they built]. Unfortunately, none of these studies were ethnographically informed. Their frameworks of analysis are restricted by the domain of software data, and are not integrated with interviews or interaction-based observations.

Visualizing Last.FM / Data chart by Christopher Adjei & Nils Holland-Cunz

Ethnographers with programming skills – why not?

I am not a programmer, nor was I ever a computer science student. In high school, I dabbled in web design and learned HTML in a science and technology-focused governor’s school. I know how to code in HTML, CSS, and basic Javascript, and understand how PHP works in a content management system like WordPress. With this level of literacy, I can only build simple websites but can adequately comprehend the source code behind the creation of web infrastructures. My digital literacy has afforded me a critical perspective to approach interactions and communication in the digital environment. Seeing web interactions as they are intricately tied to their infrastructural context, both technical and institutional, has been an integral part of my ethnographic endeavors.

I should also mention that I got into thinking and doing things computationally through the back doors of digital humanities. In grad school, I worked with the awesome folks at the Scholars’ Lab, a hub that trains graduate students to acquire software skills and digital humanities perspectives at the University of Virginia. At the Scholars’ Lab, under the guidance of humanities-friendly technologists (shoutout to Joe Gilbert!), I learned just enough basic programming to execute what I had envisioned.

Within the community of academic ethnographers, unfortunately, I have not encountered much discussion of computational tools like APIs or web extraction. I have seen scholars in computer science, communications, and social sciences apply similar computational methods such as web-crawling in their works.[iii] But I welcome the opportunity to meet other software-oriented ethnographers or to engage in conversations with those interested in computational methodology.

Everyone’s doing it, why aren’t we?

It’s worth mentioning that I do traditional field work, capturing performances on my digital audio recorder, taking field notes on Twitter [and Storify], interviewing musicians in coffee shops, setting up shows for them, and sharing a stage with them. But with basic computational know-how, both applied and critical, I have had the opportunity to think wildly about what a mixed-method or “multimodal” ethnography means to me. The technology of web extraction, as I illustrated above, has enabled me to accomplish the following:

  • effectively gather relevant data in digital communities
  • reveal the space and boundaries created by software infrastructures
  • recontextualize findings from traditional field methods – in my case, in geographic terms
  • illuminate how the physical/geographic intersects with the digital

Integrating physical and software field practices has satisfied my thirst as someone who is curious about our contemporary society as it is organized by various digital infrastructures.

Zooming back out a bit, it is not hard to see the relevance of web scraping and other forms of web extraction in the media and tech industry. In fact, data-mining is a pervasive practice for acquiring marketing research data. There are bots everywhere. Everyone knows that Amazon stores our browsing patterns and that user information in social media regularly gets mined as marketing research analytics.

In light of these pervasive data practices, we as ethnographers should think about how we too can leverage these technologies to better understand the infrastructural context, thus closing our knowledge gap between the (cultural and social) content and the (technical and institutional) context of our scrutiny.

In my next post, I will talk about how digital tools could facilitate data interpretation and examination. I’ll focus on mapping as a method to discover and document patterns of place-based observations. Then I will discuss how we can take advantage of the digital capacity to zoom in and out on content so we could deepen our sensory engagement with physical ethnographic materials.

End Notes

[i] The term ‘web-crawling’ is sometimes used synonymously with web-scraping. Typically, crawling refers to the technology of extracting all information on the web, similar to the technology of Google’s search engine, whereas web-scraping refers to the extraction of specific online information.

[ii] The software disconnection between China and the United States (and the rest of the world) on Myspace may be a product of the financial and political relationship between the countries. In order to follow up on this inquiry, one could search news stories about the company structure of Myspace and changes to it. For more detail, read David Barboza’s article “Murdoch Is Taking MySpace to China”, April 27, 2007. http://www.nytimes.com/2007/04/27/business/worldbusiness/27myspace.html (accessed on January 13, 2011).

[iii] More on web-crawling as a social science method, read Halavais, A. (2000). National Borders on the World Wide Web. New Media & Society, 2(7), 7-28. doi: 10.1177; Lin, J., Halavais, A., & Zhang, B. Bin. (2007). The Blog Network in America: Blogs as Indicators of Relationships among US Cities. Connections, 27(2),15-23. Retrieved from http://www.insna.org/Connections-Web/Volume27-2/Lin.pdf


20 Responses to “On Digital Ethnography, What do computers have to do with ethnography? (1 of 4)”

  1. November 6, 2012 at 5:11 am #

    This is such an awesome, inspiring post! Thanks for sharing, Wendy!

    • Wendy Hsu
      November 6, 2012 at 11:39 pm #

      Thanks for reading the post, Heather! I would love to hear specific comments if you have any.

      Here’s a prompt: Would you consider using computational tools for your ethnography?

  2. November 9, 2012 at 2:35 am #

    Wendy, I love your research. A lot, so much so that I wish that I could learn how to program. But I think most ethnographers who would even want to try this don’t have programming skills, just as many programmers who may want to try ethnography may not have ethnography skills. Both have to be learned. So that’s just a massive practical limitation. But have you thought about talking to programmers and working with them also? Like, say, in the CSCW, CHI, or HCI community, or at research institutes like Microsoft Social Computing http://research.microsoft.com/en-us/projects/socialmedia/ to do any collaborations? Not that you wouldn’t be doing any programming, but it would set models and practices for ethnographers to work with programmers.

    • Wendy Hsu
      November 9, 2012 at 6:46 pm #

      Hey Tricia, Thanks for bringing up your concern about practical limitations. I like your idea about teaming up with programmers. In fact, my project kind of came out of a collaboration. I worked with Joe Gilbert, a technologist with degrees in English and Computer Science, and a love for music (who used to work at a record store!). As a tech mentor, he got me started on the project by giving me a book on Ruby [most of the book is available online here: http://pine.fm/LearnToProgram/]. I went through it, reading and doing the exercises in the book, in a couple of weeks. It got me to the point where I could do really simple programming and begin to understand rudimentary code scripted by others. But I was definitely not coding the API by myself. Joe walked me through the steps of creating a script that makes sense for scraping Myspace. The scripting process (testing and refining it) took a little more than a month (10 hours/week).

      All of this is to say that coding is less daunting than it seems. With a bit of guidance and encouragement, it too can be a skill acquired by non-technologists like most ethnographers. I have been hanging out with digital humanists who do some simple programming themselves. This group of people advocate for a kind of DIY, makers ethos. I agree with them on the power of code, not only as a makers’ tool, but as tool of comprehension. Having a rudimentary understanding of code (like to the extent of having gone through a thin tutorial book) can give us a tremendous insight on how one can more smartly navigate in software environments.

      But you’re right, we don’t have all the time and luxury to pick up a new skill. It would be cool if ethnographers and programmers had more contact, professional or informal. I bet all kinds of creative products would come out of that. I for one have been thinking a lot about the design of a field app that not only gathers and organizes field data (with metadata), but also enables informants to interact with the device by creating their own field documents (stories, pictures, videos, maps, etc). Any programmers out there interested in working with me or others on ethnographic projects?

      I should really seek out the possibility of working with programming experts to deepen my projects. I’m hoping that through interacting with EM readers, I will eventually find programmer-collaborators to work with.

      And I am totally willing to share what I know if any of you ethnographers out there is interested in getting started with programming. Drop me a note here or @wendyfhsu.

      • November 9, 2012 at 7:04 pm #

        I want to clarify the technical process of web scraping a bit here. The question about the learning curve for web scraping was brought up by Andrew Asher. Depending on the web data that you’re interested in mining and where it lives, you may be able to find simpler, less scripty solutions packaged as browser plugins or extensions.

        For instance, for a while, I was using Outwit Hub, a web extraction plugin for Firefox (now available as a standalone program) to scrape performance data listed on Myspace profile pages. There are other pre-packaged solutions out there. I heard from an ethnomusicology faculty member here at Oxy that the Scraper extension for Google Chrome is effective and user-friendly. The Scraper extension exports the scraped data to a Google spreadsheet. I should spend some time exploring this tool for sure.

        I decided to go with a more complex scripted solution because I needed to crawl thousands of pages. And the most efficient way to do this is to utilize the Mechanize Ruby gem.

        I hope this clarification helps a little, Andrew. And, I still remember that I need to write a more technical post on web scraping (taking one through the scripting process step-by-step, with Ruby and Xpath) later. I will keep you posted!

      • Andrew Asher
        November 9, 2012 at 7:36 pm #

        Hi Wendy,
        Thanks for posting these resources and more details on how you picked up programming and web scraping. I’m definitely interested if there are other programmers and ethnographers who are interested in exploring these methods. I’ve also been really interested in the potential of mobile apps for ethnography, especially the potential for doing ethnography at a distance. For example, a participant could have an app that an ethnographer could push questions out to, something like “take a film of the event you are at”, and then upload the results to a server for analysis. This would be a really powerful tool for scaling up the number of data points one person could collect.

        You’ve probably seen this, but here is a Google doc of mobile data collection tools: https://docs.google.com/a/bucknell.edu/spreadsheet/ccc?key=0Akj5_3vVWZ8tdGk4czI4eHcycGo2Y1NnWmhsUjdBTXc#gid=0


        • October 20, 2014 at 8:35 pm #

          Hi Andrew,

          I’m very interested in applying programmatic methods to data collection and analysis in the context of ethnography. I’m just returning to anthropology to do an MA before my PhD after more or less accidentally becoming a full-time software developer (coming from a background of anthropology and communications rather than comp sci).

          I’m particularly interested in the quantified self movement, so I’m potentially looking at a ton of quantitative data in several non-standardized formats.


  3. November 11, 2012 at 4:24 am #

    This is a great post, Wendy! It’s really inspiring me to think about how I can use computation in my own work. Is there a particular reason why you use Ruby, and not, say, Python or Java or something else?

    • Wendy Hsu
      November 13, 2012 at 7:47 pm #

      Thanks for your question, Calvin. I used Ruby for this particular project because it was (still is?) the scripting language of choice at the Scholars Lab. I have heard that Ruby is a more “human” language and that its syntax is less mathematical and more intuitive for new programmers.

      I always think about Wayne Graham’s (@wayne_graham) post whenever this “Why Ruby?” question comes up.


      Wayne compares Ruby to PHP though he doesn’t discuss Python and Java. While Wayne’s post is written for people who are already familiar with programming, especially for the parts on rails, I still think that it’s a good place to learn about Ruby. At the end of the post, he provides a number of helpful resources for learning Ruby.

      I don’t entirely get the technical depth of Wayne’s post, but I do find his discussion about learning curve useful. In particular, he illuminates the importance of learning a language that is used and supported by the people in your immediate social circle, physical or virtual. I, for instance, have been thinking about picking up Python because there is a local women-only Python group (PyLadies) that gathers informally in the LA metropolitan area.

      He recognizes that the choice of one’s first scripting language is important, but also notes that once you start to get the gist of it, it is not too hard to translate what you know from one into the other.

      Hope this helps, Calvin. And definitely let me know if you’re up for meeting to geek out on Ruby since you are in the LA area!


      • November 13, 2012 at 8:11 pm #

        I got some super thoughtful comments from Michael Kramer (@dighist) via email. I thought that I would post them here to deepen our conversations:

        Hi Wendy —

        Thanks for sharing this wonderful post with me. It’s inspiring on many levels: DIY programming exploration, collaboration (Scholars Lab is such a great model for other universities), the way in which your ethnographic research turned to issues of space, how your research revealed in a concrete way the infrastructure of the Internet and its relationship to politics and economics at a global level. That’s just a few things I noticed.

        I think the most important line for me in your post was:

        “With this API, I was able to reach beyond the user- and consumer-end experiences of technology. Using a computational tool — a machine-based script that communicates with other machines — I was able to explore quite literally the software infrastructures in which my field interactions occur.”

        That’s a great zinger of an observation.

        Here are a few thoughts:

        1 – Does this kind of digital ethnographic work have a particular relationship to understanding questions of “space” ethnographically, materially, politically? Could you imagine other lines of inquiry that web scraping might help to develop (time? aesthetics? race? gender? class? ethnicity? the very question of digital experience, what it is and how it relates to other aspects of life?) or do you think there’s something particularly important about digital research and issues of space?

        2 – How might you intertwine your “traditional” modes of ethnography with your digital research? Sounds like you might address this in an upcoming post!

        3 – I’m interested in your use of the word “leverage.” What do you mean by this word exactly? It comes from the business world, yes? How are you adapting it to your scholarship? (OK, I am playing language police a bit here, guilty as charged, but I really mean to ask about what drew you to that word. It’s an interesting kind of crossover move to take it from the business world and apply it to your use of the digital in ethnographic research.)

        4 – Finally, what would Alan Lomax think?!

        Thanks again for sharing this with me. Can’t wait to read the next installments!


