Ethics of Doing Research in Online Networks: Fellowship post #3

Posted on July 26th, by George Veletsianos in NPS, scholarship, sharing. 3 comments

This is post #3 of my summer Fellowship. The purpose is to examine practices undertaken by academics and educators in networked publics. These practices fall under the general heading of “digital scholarship” and these individuals have been called “digital scholars” or “open scholars.”

Though research in online spaces is gaining increasing acceptance, it has also gained notoriety recently. A story in the Chronicle of Higher Education last week raised ethical concerns about a Harvard study that included about 1,700 Facebook profiles as its data source. The sticking point was the use of data that should have been private but were considered public.

Online research involving public data (like the study I am conducting for my Fellowship) is becoming increasingly common. The question becomes: what is public and what is private? Can researchers treat all information posted online as public? If it’s public, is consent to use that information for research purposes still required? Offices of Research Support have adopted policies guiding researchers toward the ethical use of public data (though, as you will see below, these policies do not fit all molds). My institution’s Review Board, for example, states that the following does not constitute Human Subjects Research:

Public and/or published data sets, accessible without restriction (e.g., password not needed*), and containing readily identifiable information and where individuals can reasonably expect this information to be available to the public (examples include letters to the editor, blogs) (source)

As such, information posted publicly (e.g., on Twitter, YouTube, etc.) can be used for research purposes without informed consent. The problem with the Harvard case was that the individuals mining the data were accessing data that were restricted (i.e., not public) and thus should not have been used without first securing informed consent.

The more difficult questions that I’ve been grappling with in my use of public data for research purposes are the following:

  • Let’s assume I have a public Twitter account, a researcher downloads a set of my status updates for analysis, and I later delete those posts. Does that mean that the researcher can no longer use them in his/her research? Does that mean that I have “withdrawn my participation”? Or are the data still considered “public” just by virtue of having been public at one point in time?
  • Consider the case where my profile is private and someone whose profile is public re-tweets one of my status updates. Can a researcher archive the public re-tweet and use it in his/her research, even though the tweet originated from a non-public account?

These are important questions to consider. Both academics and students need to equip themselves with a greater understanding of their rights and responsibilities when conducting research in online spaces such as social networking sites.

As far as my data sets are concerned, I’ve gone to great lengths to anonymize and de-identify them (e.g., by rewriting narratives/tweets/etc., deleting any identifying information, and having a second researcher check whether the meaning changed). Rewriting narratives is an acceptable, even encouraged, strategy in various phenomenological circles (e.g., Kuiken, 2001; van Manen, 1997), and in this instance it also serves ethical purposes.
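To make the "deleting any identifying information" step a little more concrete, here is a minimal sketch of what a first automated pass over collected posts might look like. It is not the procedure I actually followed (the rewriting and the second-researcher check are manual). The function names, the regular expressions, and the `[user]`/`[link]` placeholders are all assumptions made up for illustration.

```python
import re

# Hypothetical patterns for a rough first pass; a human rewrite and a
# second-researcher check would still be needed afterwards.
URL = re.compile(r"https?://\S+")
MENTION = re.compile(r"@\w+")


def deidentify(text: str) -> str:
    """Replace links and @handles with neutral placeholders."""
    # Links are replaced first so that an "@" inside a URL is not
    # rewritten separately by the mention pattern.
    text = URL.sub("[link]", text)
    text = MENTION.sub("[user]", text)
    return text


def deidentify_all(posts):
    """Apply deidentify() to every post in a list."""
    return [deidentify(post) for post in posts]


if __name__ == "__main__":
    sample = ["@alice see my slides at http://example.com/talk today"]
    print(deidentify_all(sample))
```

A pass like this only strips the most obvious identifiers. Distinctive phrasing can still be searched for verbatim, which is one reason rewriting the text matters.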

3 thoughts on “Ethics of Doing Research in Online Networks: Fellowship post #3”

  2. Great post.

    A further distinction can be made between message boards or on-line services which require the user to have an account in order to log in, make use of the service and view other users’ postings, and services like Facebook where often you would have to be the user’s ‘friend’ in order to access their personal information.

    Surely the former scenario where account and password access is available to all should count as information which is available in the public domain. The user knows that anyone can choose to create an account and read their information thereby relinquishing their control over who reads their data.

    • Thanks for the comment, Ruth. I have to disagree though. It’s not as much a matter of platform as it is of the way one accesses that information. “Public” information is information that anyone can retrieve at any point in time, without any sort of restriction. For instance, you don’t need to sign up to a service to find public tweets. Or, you don’t need to sign up to a service to find public Facebook posts (e.g., see On the other hand, if I am participating in a forum that requires a username and a password, that information cannot be retrieved *unless* a username and a password are used. In other words, as a user, I am not “relinquishing [my] control over who reads [my] data.” I am allowing others who create an account on the site to read my data, *and* (this is important) I TRUST that they will not share my data outside of that controlled environment unless I grant explicit permission. To give an example: Let’s say that I enjoy playing chess and that I sign up to a chess-playing site that requires login credentials to access. Let’s also assume that I don’t want my colleagues to know about this because they might perceive it as a nerdy thing to do and I don’t want them “seeing” that part of my identity (hence my choice to participate on a chess-playing site that is not public). Wouldn’t it be unethical if someone decided to “out” me as a chess-playing nerd to colleagues who are not participating on the site? On the other hand, if a colleague signs up to the site and connects with me, I wouldn’t mind if they knew that I enjoy chess, because, presumably, they also enjoy chess, hence the reason they are on the site. If, however, they too decide to share that knowledge/information outside of the chess-playing site, then that would also violate my trust.

