This is post #3 of my summer Fellowship. The purpose is to examine practices undertaken by academics and educators in networked publics. These practices fall under the general heading of “digital scholarship” and these individuals have been called “digital scholars” or “open scholars.”
Though research in online spaces is gaining increasing acceptance, it has also gained notoriety recently. A story in the Chronicle of Higher Education last week raised ethical concerns about a Harvard study that included about 1,700 Facebook profiles as its data source. The sticking point was the use of data that should have been private but were treated as public.
Online research involving public data (like the study I am conducting for my Fellowship) is becoming increasingly common. The question becomes, what is public and what is private? Can researchers treat all information posted online as public? If it’s public, is consent to use that information for research purposes still required? Offices of Research Support have adopted policies guiding researchers toward the ethical use of public data (though as you will see below, these policies do not fit all molds). My institution’s Review Board, for example, states that the following does not constitute Human Subjects Research:
Public and/or published data sets, accessible without restriction (e.g., password not needed*), and containing readily identifiable information and where individuals can reasonably expect this information to be available to the public (examples include letters to the editor, blogs) (source)
As such, information posted publicly (e.g., on Twitter, YouTube, etc.) can be used for research purposes without informed consent. The problem with the Harvard case was that the individuals mining the data were accessing data that were restricted (i.e., not public), and thus should not have been used without first securing informed consent.
The more difficult questions that I’ve been grappling with in my use of public data for research purposes are the following:
- Let’s assume I have a public Twitter account, a researcher downloads a set of my status updates for analysis, and I later delete those posts. Does that mean that the researcher can no longer use them in his/her research? Does that mean that I have “withdrawn my participation”? Or are the data still considered “public” just by virtue of having been public at one point in time?
- Consider the case where my profile is private and someone whose profile is public re-tweets one of my status updates. Can a researcher archive the public re-tweet and use it in his/her research, even though the tweet originated from a non-public account?
These are important questions to consider. Both academics and students need to equip themselves with a greater understanding of their rights and responsibilities when conducting research in online spaces such as social networking sites.
As far as my data sets are concerned, I’ve gone to great lengths to anonymize and de-identify them (e.g., by rewriting narratives, tweets, etc., deleting any identifying information, and having a second researcher check whether the meaning changed). Rewriting narratives is an acceptable, even encouraged, strategy in various phenomenological circles (e.g., Kuiken, 2001; van Manen, 1997), and in this instance it also serves ethical purposes.