On the Popularity of Open Access Journals
In this post I demonstrate several ideas that I have been playing with over the years. On the one hand, the post takes a simple concept (the popularity of academic journals) and attempts to rethink it in the context of the digital, interconnected space. On the other hand, it demonstrates the power of the “cloud” and the opportunities provided by posting information in online spaces that are accessible via standardized formats (such as XML). The post also serves as an example of the kinds of opportunities mashups can provide to universities/education. And finally, I just wanted to learn how to remix data via online services :)
As you may have seen in my previous posting, we collected a list of all the open access online journals we could find that focus on publishing educational technology research. Having the list online in an open spreadsheet format allows anyone interested to update it, but it also allows us to manipulate and remix the data. As a simple example, consider the issue of journal rankings. I’ve seen it debated on ITForum, on Twitter, at the University of Minnesota where I did my PhD, and at the University of Manchester where I currently work. The issue is that “top tier” journals are good for tenure, but there are debates on what constitutes “top tier.” Is it readership? Rejection rates? Quality? Citations? All of the above? I could link to a few different resources here, but the only one I will refer interested readers to is the European Science Foundation ERIH listings, which I personally use as a guide.
My intention in this post is to rank the online open access journals according to “popularity.” As I see the rolling eyes through the tubes of the internet, let me say that popularity in this case refers to the number of sites that link to a particular page. Higher numbers denote more inbound links (= higher popularity). If you want to see the popularity metrics without reading the details of how this was done, the end result (that is generated every time you click on the link) is available on this page. At the time of writing, the least linked-to journal had 0 inbound links and the most linked-to journal had 31,534 links.
To be fair (or, “a word of caution”): The popularity index is not without its faults. Popularity doesn’t mean quality or even readership. The number of inbound links can be easily manipulated. The measure leaves out RSS subscriptions and the number of individuals receiving TOC alerts. Also, inbound links carry equal weight regardless of where they come from. Another issue relates to journals changing URLs. For example, the Journal of Computer-Mediated Communication used to be hosted at Indiana University but is now part of the Wiley InterScience group (and is still open access). Also, the URL we used to link to a journal might not be the most appropriate one. To fully understand and see the problems with this method, one has to dive under the hood of the whole process, and that’s what I am doing next.
The implementation in detail
The journal URLs are posted in a Google Spreadsheet, which allows the data to exist online in a variety of formats (e.g., CSV and HTML files). Those files can then be read into Yahoo Pipes (essentially, a drag-and-drop mashup tool). Once Yahoo Pipes has a list of journal URLs, those URLs are sent through the Yahoo Site Explorer API, which generates “information about the pages linking to a particular page or pages within a domain.” That information includes the magic numbers used in this exercise (i.e., the number of pages linking to a particular journal via its URL). Once the numbers are generated, Yahoo Pipes exports them as an RSS feed. That feed can then be imported back into a Google Spreadsheet. And that’s it. Whenever a journal URL is added to the spreadsheet, the pipe generates a popularity number for it without anyone needing to do anything. A new journal appears? No problem, just add the URL and its inbound links will be counted automatically. If you want the full details, feel free to grab the actual Yahoo Pipe that does all the work and clone it (at this point I should thank Mat Morisson and Tony Hirst, whose postings on Yahoo Pipes and online data manipulation helped me rethink how I was doing this). If you don’t have a Yahoo account and are interested in how the implementation looks, the image at the top of this post is the actual pipe created.
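For readers who prefer code to pipes, here is a minimal Python sketch of the same flow under stated assumptions: `SHEET_CSV_URL` is a placeholder for the spreadsheet’s CSV export link (not the real one), the spreadsheet is assumed to have a `url` column, and `inlink_count()` is a stand-in for the inbound-link lookup that the Yahoo Site Explorer API performed in the actual pipe.

```python
# Sketch only: approximates the spreadsheet -> inlink-count -> ranking flow.
# SHEET_CSV_URL and the "url" column name are assumptions, not the real setup.
import csv
import io
import urllib.request

SHEET_CSV_URL = "https://example.com/journals.csv"  # placeholder CSV export URL


def fetch_journal_urls(csv_url):
    """Read the published spreadsheet's CSV export and collect journal URLs."""
    with urllib.request.urlopen(csv_url) as resp:
        text = resp.read().decode("utf-8")
    reader = csv.DictReader(io.StringIO(text))
    return [row["url"] for row in reader if row.get("url")]


def inlink_count(journal_url):
    """Stand-in for the Yahoo Site Explorer inbound-link lookup.
    A real implementation would query a link-index service and parse
    the reported total; no such call is made here."""
    raise NotImplementedError


def rank_by_popularity(urls, count_fn=inlink_count):
    """Pair each URL with its inbound-link count, most-linked first."""
    counts = [(u, count_fn(u)) for u in urls]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)
```

As in the pipe, adding a row to the spreadsheet is all it takes: the next run of `fetch_journal_urls` picks up the new URL and `rank_by_popularity` slots it into the ranking.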
A final word of caution :)
This is not a valid method to decide where to send your next paper :). Yet, as I see more and more conversations online about open access (e.g., BJET published an editorial on the topic on Aug 12, 2009) and about alternative ways to evaluate one’s contribution to their chosen field, this simple example may ignite ideas for evaluating journal contributions (in the UK at least, the issue of journal impact is currently being debated as we await the transformation of the Research Assessment Exercise). Also, the ranking is less interesting to me than the implications behind our ability to remix available data to think about journal “impact”. Finally, if you are managing an online open access journal and you feel that the URL used is not representative of where users link to, please feel free to correct the URL by visiting the original listing. If we used an erroneous link, we apologize and we thank you for helping us correct it.