Still using the Hirsch index? Don’t!

“My research: > 185 papers, h-index 40.” A
random quote from a curriculum vitae on the World Wide Web. Some
researchers love their Hirsch index, better known as the h-index. But what does
the measure actually mean? Is it a reliable indicator of scientific impact?


Our colleagues Ludo Waltman and Nees Jan van
Eck have studied the mathematical and statistical properties of the h-index.
Their conclusion: the h-index can produce inconsistent results. For this
reason, it is actually not the reliable measure of scientific impact that most
users think it is. As a leading scientometric institute, we have therefore
published the advice to all universities, funders, and academies of science to
abandon the use of the h-index as a measure of the overall scientific impact of
researchers or research groups. There are better alternatives. The paper by
Waltman and Van Eck is now available as a preprint and
will soon be published by the Journal of the American Society for Information
Science and Technology.


The h-index is a measure of a combination of productivity and citation
impact. It is calculated by ranking the publications of a particular
researcher by the number of citations each has received, from most to least
cited. The h-index is the largest number h such that h publications have each
been cited at least h times. For example, someone who has an h-index of 40 has
published at least 40 articles that have each been cited at least 40 times.
Moreover, the remaining
articles have not been cited more than 40 times each. The higher the h-index, the
better.
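
The calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `h_index` and the citation counts are made up for the example and do not come from the paper by Waltman and Van Eck.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank the papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank   # the paper at this rank still has enough citations
        else:
            break      # all later papers have fewer citations than their rank
    return h


# Hypothetical citation counts for one researcher's five papers.
example = [10, 8, 5, 4, 3]
print(h_index(example))
```

With these invented counts, four papers have at least four citations each, but there are not five papers with at least five citations, so the function returns 4.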


The h-index was proposed by physicist Jorge Hirsch in 2005. It was an
immediate hit. Nowadays, there are about 40 variants of the h-index. About one
quarter of all articles published in the main scientometric journals have cited
Hirsch’s article in which he describes the h-index. Even more important has been
the response by scientific researchers using the h-index. The h-index has many
fans, especially in the fields that exchange many citations, such as the
biomedical sciences. The h-index is almost irresistible because it seems to
enable a simple comparison of the scientific impact of different researchers. Many
institutions have been seduced by the siren call of the h-index. For example,
the Royal Netherlands Academy of Arts and Sciences (KNAW) inquires
about the value of the h-index in its recent forms for new members. Individual
researchers can look up their h-index based on Google Scholar documents via
Harzing’s website Publish or Perish. Both economists and computer scientists
have produced a ranking of their field based on the h-index.


Our colleagues Waltman and Van Eck have now shown that the h-index has some
fatal shortcomings. For example, if two researchers with different h-indices
co-author a paper, it may lead to a reversal of their position in an
h-index based ranking. The same may happen when we compare research groups.
Suppose we have two groups and each member of group A has a higher h-index than
a paired researcher in group B. We would now expect that the h-index of group A
as a group is also higher than that of group B. Well, that does not have to be
the case. Please note that we are now speaking of a calculation of the h-index
based on a complete and reliable record of documents and citations. The
problematic nature of the data if one uses Google Scholar as data source is a
different matter. So, even when we have complete and accurate data, the h-index
may produce inconsistent results. Surely, this is not what one wants when using
the index for evaluation purposes!
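
Both kinds of inconsistency can be reproduced with small, invented numbers. The sketch below is not taken from the paper by Waltman and Van Eck. The citation counts are hypothetical, the first scenario assumes the two researchers share two joint papers, and the second assumes that group members have no papers in common, so that a group's record is simply the combined list of its members' papers.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    # In a descending ranking, "cites >= rank" holds for exactly the first h papers.
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


# Scenario 1: a ranking reversal after joint work (hypothetical numbers).
x_before = [4, 4, 4, 4]      # researcher X: h-index 4
y_before = [6, 6, 6]         # researcher Y: h-index 3
joint = [6, 6]               # two co-authored papers, 6 citations each

x_after = x_before + joint
y_after = y_before + joint
print(h_index(x_before), h_index(y_before))   # X ranks above Y
print(h_index(x_after), h_index(y_after))     # Y now ranks above X

# Scenario 2: member-by-member dominance does not carry over to the group.
group_a = [[4, 4, 4, 4], [4, 4, 4, 4]]        # each member of A: h-index 4
group_b = [[9, 9, 9], [9, 9, 9]]              # each member of B: h-index 3

# Assumption: no co-authorship within a group, so records are concatenated.
all_a = [c for member in group_a for c in member]
all_b = [c for member in group_b for c in member]
print(h_index(all_a), h_index(all_b))         # group B scores higher than group A
```

In the first scenario X and Y add exactly the same papers to their records, yet X stays at 4 while Y moves from 3 to 5. In the second, every member of group A beats the paired member of group B, but the groups score 4 and 6 respectively.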


At CWTS, we have therefore drawn the conclusion that the h-index should not be used as a measure of scientific
impact in the context of research evaluation.

Limitations of citation analysis

An observation at the CWTS Graduate Course Measuring Science: in most lectures, the presenters emphasize not only how indicators can be constructed, measured, and used, but also under what circumstances they should not be applied. Thed van Leeuwen, for example, showed on the basis of the coverage data of the Web of Science that citation analysis should not be applied in many fields in the humanities and social sciences, and certainly not for evaluation purposes. If the references in scientific articles in the Web of Science are analyzed, there are strong field differences in the extent to which they cite articles that are themselves covered by the Web of Science. In biochemistry this share is very high (92%), whereas in the humanities it drops to below 17%. Since citation analysis is almost always based on Web of Science data, most of the relevant data on communication in the humanities is missed by citation analysis. Of course, this is well known and it is the usual argument in the humanities and social sciences against the application of citation analysis. However, this has also meant that most scholars associate CWTS principally with the use of citation analysis. CWTS currently does not have a strong reputation as a source of critique of citation analysis, although it has systematically criticized the Impact Factor, at least since 1995, and has also been very critical of the very popular and equally problematic h-index. An interesting mismatch between practice and reputation?

“Idiocy of impact factors”

Ron de Kloet, professor of medical pharmacology in Leiden and famous for his research on stress, on the journal impact factor in the university weekly Mare (my translation): "In the past, we did not have this complete idiocy around impact numbers". He thinks that those who have to judge scientists on their performance rely too easily on the journal impact factor. "In this way, the journal rather than the researcher is being assessed. And young researchers know that it is not their individual creativity that counts but the visibility of the journal. This can make people obsessed and take away the pleasure in science." Wise words!