The different lists of university rankings have attracted increasing attention because of their potential as a weapon in the increasingly fierce global competition between universities. A university that is confronted with a lower position in the rankings has to provide a plausible explanation. And universities that are placed higher in the list naturally celebrate this.

Let us take a look at the Netherlands. A few weeks ago, the Leiden Ranking produced by CWTS brought good news for the Erasmus University (EUR) in Rotterdam: it was placed 6th among the universities of Europe. The university immediately published an advertisement in the national newspapers to congratulate its researchers on this leading position in the Netherlands. The advertisement had the facts right, but it emphasized the criterion that puts the EUR highest (number 6 in the list of the 100 largest European universities): the number of citations per publication. This indicator is favorable for universities with large medical faculties and hospitals, because these are large research fields with, on average, many more references and citations than, for example, the technical sciences or philosophy.

It also matters which universities are used as the reference group for the ranking. Using the same indicator of citations per paper puts the EUR at number 9 among the 250 largest European universities, because three smaller universities enter the top, ahead even of Oxford and Cambridge. Still a very good score, and still number 1 in the Netherlands in this ranking. But how does it look when we use other indicators? CWTS now uses two different indicators to take field differences into account. How does the EUR score in these lists? The traditional CWTS "crown indicator" puts the EUR at number 8 among the 100 largest and number 14 among the 250 largest European universities. The improved CWTS indicator gives the EUR a score of 11 among the 100 largest and 15 among the 250 largest universities in Europe.
In all these cases, the EUR is highest among the Dutch universities. If size is taken into account in combination with quality, however, the University of Utrecht has the highest score in the Netherlands (nr. 8) and the EUR ends at position 20, after Utrecht and the University of Amsterdam.
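The difference between the two CWTS indicators is worth making concrete. As I understand the published definitions, the traditional crown indicator divides a university's total citations by the total field-expected citations (a ratio of sums, CPP/FCSm), while the improved indicator averages the per-paper ratios (a mean of ratios, MNCS). A minimal sketch, with invented numbers, shows how the two can diverge:

```python
# Toy comparison of the old and new CWTS field-normalized indicators.
# All numbers are invented; the formulas follow the published
# definitions (CPP/FCSm as a ratio of sums, MNCS as a mean of ratios).

def crown_indicator(citations, field_means):
    """Old crown indicator: total citations / total expected citations."""
    return sum(citations) / sum(field_means)

def mncs(citations, field_means):
    """Improved indicator: average of per-paper citation ratios."""
    return sum(c / e for c, e in zip(citations, field_means)) / len(citations)

# Two hypothetical papers: one in a high-citation field (say, medicine),
# one in a low-citation field (say, philosophy).
citations   = [40, 2]    # actual citations received per paper
field_means = [20, 0.5]  # world-average citations in each paper's field

print(crown_indicator(citations, field_means))  # 42 / 20.5 ≈ 2.05
print(mncs(citations, field_means))             # (2.0 + 4.0) / 2 = 3.0
```

The ratio of sums is dominated by the high-citation field, whereas the mean of ratios weighs each paper equally — one reason the same university can occupy different positions in the two lists.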
So what is the lesson here? First, ranking is a complicated affair because there are many ways to rank universities. Rankings simplify comparisons along many different dimensions, and universities build on this and reduce the complexity even further. This is facilitated by the fact that different rankings produce different results. It enables universities to choose the most favorable ranking. It also enables them to debunk a ranking by pointing to other results in other rankings, or even to debunk ranking as such by showing contradictions among ranking results. However, this does not disempower these rankings. As Richard Griffiths (professor of social and economic history in Leiden) stated two weeks ago in the university weekly Mare: "Such a list can be a pile of junk, but it is best not to be at the bottom of the pile." Universities are therefore also discussing to what extent mergers can help improve their ranking scores. For example, it might be profitable for a technical university to be coupled to a large academic hospital.
Not only individual universities are actively engaged in the debate about rankings; the same holds for associations of universities. The Dutch university association VSNU concluded from the Times Higher Education Supplement (THES) ranking that the Netherlands is the fifth best academic country in the world. As science journalist Martijn van Calmthout wrote in De Volkskrant, this requires some creativity, because the Netherlands as a whole no longer belongs to the world top (which does not mean that there are no fields in which Dutch researchers belong to the best performers in the world). No Dutch university belongs to the 100 best universities in this ranking (which uses a very different set of indicators from the Leiden Ranking, see the next blog post). In fact, the Dutch universities cluster pretty closely together, and their relative position depends on the indicator used. Leiden scores highest when external funding is the main criterion in the THES ranking, and the Shanghai ranking puts Utrecht highest (number 50 in the world list), followed by Leiden (at 70). How significant are the differences among the Dutch universities, actually?
The differences between the rankings create a drive to keep producing new indicators that capture aspects and dimensions of quality not measured satisfactorily by the existing ones. This cannot go on endlessly. It may be time to take the perverse effects of one-dimensional ranking more seriously. One way is to further develop truly multi-dimensional indicators; another is to investigate the underlying properties of indicators more thoroughly; a third is to take the limits of indicators more seriously, especially in science policy. Will it be possible to combine these three strategies?
In the last two weeks, several new university rankings were published. Since universities are facing ever tougher competition, their placement in university rankings becomes increasingly important. So I'll spend a couple of blogs on rankings: how the lists are constructed, and what one needs to take into consideration when interpreting them. It struck me that the business of ranking has become more sophisticated over the years. Now that rankings are an instrument for universities in the competition for resources, researchers and students, the competition between the rankings themselves is also increasing. This can work to increase their quality; on the other hand, it might also promote an overly simple interpretation. Ranking is a complicated business, because it means that a complex phenomenon such as quality, which is by definition composed of many independent dimensions, is reduced to a one-dimensional list. The attraction of rankings is exactly this reduction of reality to an ordered list in which one's position is unambiguous. This also means that ranking is an inherently problematic business. For example, a university may have high-quality teaching as its core mission; such a university may not score high in a ranking that does not really take teaching into account. In other words, if one wants to evaluate the performance of an institution, one should take its mission into account. It would still be a difficult task to squeeze the complex network of performances of institutions into a simple ordered list. And perhaps we should abstain from ordered lists as such and develop a completely new form of presentation of performance data. The importance of university missions, and the fact that quality is a complex phenomenon with many different aspects, is central to a European research project led by CHEPS in which CWTS also participates. This project may produce a new way of monitoring university performance.
But for now, we are stuck with one-dimensional rankings. There are five different university rankings that are commonly used, and I will spend a blog on each of them in the course of this week. These are: the Times Higher Education Supplement ranking, the QS ranking (a spin-off of the THES ranking), the Leiden ranking produced by CWTS, the Shanghai ranking, and the somewhat lesser-known Web of World Universities ranking. In the next blog, I’ll discuss how rankings are being used by universities, then I will discuss each ranking in more detail, to conclude with some ideas about the future of rankings.
An observation at the CWTS Graduate Course Measuring Science: in most lectures, the presenters emphasize not only how indicators can be constructed, measured, and used, but also under what circumstances they should not be applied. Thed van Leeuwen, for example, showed on the basis of the coverage data of the Web of Science that citation analysis should not be applied in many fields in the humanities and social sciences, and certainly not for evaluation purposes. If the references in scientific articles in the Web of Science are analyzed, there are strong field differences in the extent to which they cite articles that are themselves covered by the Web of Science. In biochemistry this coverage is very high (92%), whereas in the humanities it drops to below 17%. Since citation analysis is almost always based on Web of Science data, most relevant data on communication in the humanities is missed by citation analysis. Of course, this is well known, and it is the usual argument in the humanities and social sciences against the application of citation analysis. However, it has also meant that most scholars associate CWTS principally with the use of citation analysis as such. CWTS currently does not have a strong reputation as a source of critique of citation analysis, although it has systematically criticized the Impact Factor since at least 1995 and has also been very critical of the very popular and equally problematic h-index. An interesting mismatch between practice and reputation?
Ron de Kloet, professor of medical pharmacology in Leiden and famous for his research on stress, on the journal impact factor in the university weekly Mare (my translation): "In the past, we did not have this complete idiocy around impact numbers." He thinks that those who have to judge scientists on their performance rely too easily on the journal impact factor. "In this way, the journal rather than the researcher is being assessed. And young researchers know that it is not their individual creativity that counts but the visibility of the journal. This can make people obsessed and take away the pleasure in science." Wise words!
Yesterday, the annual Graduate Course Measuring Science started here at CWTS. Twenty-four PhD students and professionals from the information industry (publishers and software houses) are taking a week-long crash course in bibliometrics and scientometrics. Virtually all researchers at CWTS teach one or more slots, which gives the students the unique opportunity to get a firm grip on the field from a variety of angles and perspectives. For me, this is a convenient way of immersing myself in the way scientometrics is being done at CWTS and of looking at the various methodological debates in the field from the perspective of CWTS. First impression yesterday: the students were bombarded with quite a lot of data and empirical findings, which they seemed to take in calmly. No furious debates yet. But it was only the opening day, so who knows? I am going to discuss our work on modelling the peer review system today; let us see how it goes.
At the STI conference 2010, my colleagues Andrea Scharnhorst and Krzysztof Suchecki from the Virtual Knowledge Studio and I presented our work in progress on modelling the peer review system. The basic idea is simple: is it possible to model the peer review system as if it were a computer game such as SimCity? We followed a strategy in which we try to make the model as simple and stupid as possible. So, initially we are not trying to mimic reality, but to set up an extremely simplified model of how peer review works in science and academia. Our model consists of two populations: researchers and journals. The researchers have two different roles: they are authors of scientific papers, and they are reviewers who judge the quality of scientific papers written by other researchers. Each researcher has her own specific behaviour, and the same holds for the journals. The trick of the model is that we incorporated a simulation of quality control, using multi-dimensional vectors. This is extracted from what we know about how peer review works. Basically, reviewers compare what they perceive of the work in different dimensions (such as the quality of the writing, the images, the statistical reliability, how interesting the questions are, etc.) with what they perceive as the required quality. We assume that this expected quality relates to the quality of the work that the reviewer produces herself. The project is in an early stage, and we are now in the process of writing it up for a proper first publication, mainly on the methodology. At the conference we presented the following poster, which contains more details (I posted it on my Facebook account, since this blog software is apparently not able to process images unless they are very small):
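To give a flavour of the mechanism, here is a deliberately naive sketch of such a model. Everything in it — the number of dimensions, the acceptance rule, the parameters — is invented for illustration and does not reproduce our actual model:

```python
import random

# Toy sketch of a peer review model: researchers write papers with
# multi-dimensional quality vectors, and reviewers judge a submission by
# comparing its perceived quality against the quality of their own work.
# All parameters and rules here are illustrative assumptions only.

DIMENSIONS = 4  # e.g. writing, images, statistics, interestingness
random.seed(42)

class Researcher:
    def __init__(self, skill):
        # Average quality this researcher produces on each dimension.
        self.skill = [random.gauss(skill, 0.1) for _ in range(DIMENSIONS)]

    def write_paper(self):
        # A paper's quality fluctuates around the author's skill.
        return [max(0.0, s + random.gauss(0, 0.1)) for s in self.skill]

    def review(self, paper):
        # Accept if the paper meets or exceeds the reviewer's own
        # standard on a majority of dimensions.
        wins = sum(1 for p, own in zip(paper, self.skill) if p >= own)
        return wins > DIMENSIONS / 2

def submit(author, reviewers):
    paper = author.write_paper()
    votes = [r.review(paper) for r in reviewers]
    return sum(votes) >= 2  # accepted with at least 2 of 3 positive reviews

pool = [Researcher(skill=random.uniform(0.3, 0.9)) for _ in range(50)]
accepted = sum(submit(author, random.sample(pool, 3)) for author in pool)
print(f"{accepted} of {len(pool)} submissions accepted")
```

Even a stripped-down rule like this already shows the key feature described above: acceptance depends not on some absolute quality, but on the relation between a paper's quality vector and the standards the reviewers derive from their own work.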
The Erasmus University opened the new academic year last week by embracing Open Access for all its research publications. From 1 January 2011, it will be obligatory for researchers at the university to deposit their publications, after peer review and corrections, in the institutional repository RePub. The repository staff will take care of web-based storage and accessibility in accordance with the specific requirements of the publisher of the research article. According to the Rector Magnificus of Rotterdam, prof. Henk Schmidt, the university aims to make a big leap forward in open access: "Research has made clear that Open Access publications lead to an increase in the number of citations of scientific work." He emphasized that open access is desirable from both a societal and a scientific point of view. The step by the Erasmus University clearly also has the potential to make academic work that takes a different form from the traditional journal article more visible and citable.
Recently, I read Daniel Kehlmann's fictional history about Alexander von Humboldt and Carl Friedrich Gauss, Die Vermessung der Welt. An intriguing way to write the history of science, because it enables the author to insert internal dialogues which are actually quite plausible, yet by definition unprovable. The two characters are quite different and perhaps symbolize the two basic modalities in quantitative research, recognizable also within the field of scientometrics. Alexander von Humboldt is the outgoing one, travelling the whole world. He is interested in the particulars of objects, collects huge amounts of birds, stones, insects and plants, and describes their characteristics meticulously. Gauss, on the other hand, wants to stay home and think about the mathematical properties of the universe. He is interested in the fundamentals of mathematical operations and suspects that they can shed light on the structure of reality. In scientometrics, these two attitudes come together, but never without a fight. Building indicators means thinking through the mathematical properties of indicators, because this directly affects what the indicator is actually supposed to measure: in technical terms, the validity of the indicator. One also needs other types of insight to understand validity, such as insight into what researchers are actually doing in their day-to-day routines, but a firm grip on the mathematical structure of indicators is indispensable. At the same time, the other attitude is also required. Von Humboldt's interest in statistical description gives insight into the range of phenomena that one can describe with a particular indicator. A good scientometric group, in other words, needs both people like Gauss and people like Von Humboldt. And indeed, both types are present at CWTS. Let us see how the interactions between them will stimulate new fundamental research in scientometrics and indicator building.
The book also has some interesting observations about the key actors' obsession with measuring the world and the universe. When Alexander von Humboldt travels through South America, he meets a priest, Father Zea, who is sceptical about his expedition. Zea suspects that space is actually created by the people trying to measure it. He mocks Von Humboldt and reminds him of the time "when the things were not yet used to being measured". In that past, three stones were not yet equal to three leaves, and fifteen grams of earth were not yet the same weight as fifteen grams of peas. An interesting idea, things having to get used to being measured, especially now that we are increasingly tagging our natural and social environments with RFID tags, social networking sites and smartphone applications such as Layar, which adds a virtual-reality layer of information to your current location. Later in the book, Gauss adds to this by pondering that his work in surveying (which he did for the money) did not only measure the land but created a new reality by the act of measuring. Before, there had been only trees, moss, stones, and grass; after his work, a network of lines, angles, and numbers had been added. Gauss wondered whether Von Humboldt would be able to understand this.