The Impact Factor Lives!

You don't have to look hard to find a scientist or an editor disparaging the impact factor.

Certainly, the impact factor is a measure of limited value -- a journal's ratio of citations in one year to its scholarly output over two years -- which is mostly relevant to editors and publishers, but also to librarians purchasing journals and authors selecting journals for submissions. It does not tell you how well a particular paper performed, or how important a particular researcher is. However, by showing us the ratio of citations to scholarly articles for the prior two-year period, it provides a manageable way to measure intellectual diffusion and uptake by a relevant audience -- other published researchers and academics. This number is then trended over time, and provides an interesting framework for measuring uptake and discussing quality.
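
As a rough sketch of that arithmetic (the journal and all counts below are invented for illustration, not drawn from any real title), the calculation boils down to one division:

```python
# Hypothetical 2015 impact factor for an invented journal.
# All counts are made up for illustration.

citations_in_2015_to_2013_2014_items = 1250  # citations received in the census year
citable_items_2013 = 210                     # articles and reviews published in 2013
citable_items_2014 = 240                     # articles and reviews published in 2014

impact_factor = citations_in_2015_to_2013_2014_items / (
    citable_items_2013 + citable_items_2014
)

print(f"{impact_factor:.3f}")  # 2.778 -- conventionally reported to three decimal places
```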

It is a measure of a journal's punching power.

Over time, it has been extended to include five-year measures, citation graphs, Eigenfactor data, and so forth. But the core metric remains the source of consternation, possibly owing to its enduring power.

Some critics have said it is a bygone measurement because libraries, often purchasing "big deal" bundles, can't use it as a meaningful buying guide anymore. Others say it is moribund because it's so flawed -- in its mathematics, and because it comes from a pre-networked computational era. Others point to academia's misappropriation and misuse of it as a reason journals should abandon it. (Interesting aside -- typically, none of these critics will offer an alternative.)

Some of the objections sound downright sophisticated. At a recent meeting, a few prominent academics took issue with it because "it's an average, not a median," and because "it suggests false precision by going to three decimal places." However, a less prosecutorial assessment might lead you to some insights rather than accusations.

The "three decimal places" complaint.
We have to start this discussion with the fact that what began as an idea quickly morphed into a commercial product, and one that has grown especially quickly in the past 20 years as information flows have increased. More information led to a desire to differentiate flows, one from another. A ranking system helps get this done. And, as in any business, choices are made that reinforce viability. Often, these commercial choices are virtuous. They cannot be dismissed simply because they are commercial choices. When managing a business based on a ranking system, these choices mostly revolve around making and maintaining a ranking system that works. 

In this rather sensible context, taking the impact factor to three decimal places makes perfect sense. Why? Imagine trying to sell a valuation scheme that creates a lot of ties in rankings. It's not viable. It doesn't solve the customer's problem -- telling one thing from another, telling which is better, even if the difference is slight or the initial ranking is later modified or trumped by other factors. And when you have thousands of journals, a measure with a few decimal places helps reduce the number of ties in the rankings.
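
A toy comparison, using invented impact factors for five hypothetical journals, shows the effect: rounded to one decimal place, several titles tie; carried to three, they all separate.

```python
from collections import Counter

# Invented impact factors for five hypothetical journals.
journals = {
    "Journal A": 2.412,
    "Journal B": 2.378,
    "Journal C": 2.351,
    "Journal D": 1.024,
    "Journal E": 1.017,
}

for places in (1, 3):
    rounded = Counter(round(value, places) for value in journals.values())
    ties = [value for value, count in rounded.items() if count > 1]
    print(f"{places} decimal place(s): tied values = {ties}")

# Output:
# 1 decimal place(s): tied values = [2.4, 1.0]
# 3 decimal place(s): tied values = []
```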

The need for differentiation in a ranking system leads to more precision among the measures. Stock market changes are stated in percentages that go to two decimal places, making them effectively four-decimal-place numbers. The same goes for most web analytics packages, which have percentages out to two decimal places. Universities and most K-12 schools take GPAs out to 2-3 decimal places. A great baseball batting average is 0.294 (three decimal places). Most sports' win percentages are pushed out to three decimal places.

The reason for all this precision is simple -- ties make ranking systems far less interesting, useful, and viable.

So it should be no surprise that this was part of the thinking in going out to three decimal places:

. . . reporting to 3 decimal places reduces the number of journals with the identical impact rank. However, it matters very little whether, for example, the impact of JAMA is quoted as 24.8 rather than 24.831.

This last statement was refined usefully in a later paper:

The last statement is inaccurate [quoting as above], and it will be shown . . . that it has a profound effect particularly at the lower frequencies on ordinal rankings by the impact factor, on which most journal evaluations are based.

In other words, avoiding ties helps smaller journals stand alone, and stand out. Is that such a bad thing?

It's not an "average," it's a ratio.
A more objective assessment of the mathematics might also help you avoid calling the impact factor an average (to be fair, ISI/TR describes it as an "average" in its main explanations, which doesn't help). Instead of an average, however, the impact factor is a ratio* -- the ratio of citations in one year to citable objects from the prior two years. It is not the average number of citations. It is not the average number of articles. It is not the average of two ratios. Those would be different numbers. This is why arguments that it should be a median instead of an average start from a flawed premise.

Consider this -- the ratio of people over 30 to people under 30 in a group may be stated as 500:400 or 10:8 or 5:4 or 1.25. The number 1.25 only tells you the relationship between the two groups. Similarly, an impact factor of 1.250 only tells you the ratio of citations to articles, no average or median included.
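
To make that distinction concrete with invented numbers for a hypothetical journal: the impact factor divides one pooled citation count by one pooled article count, which is not the same as averaging the two yearly citations-per-article ratios.

```python
# Invented counts for a hypothetical journal, for illustration only.
citations_to_2013_items = 300  # citations received in 2015 by items published in 2013
citations_to_2014_items = 150  # citations received in 2015 by items published in 2014
items_2013 = 100
items_2014 = 200

# One pooled ratio: this is how the impact factor is built.
impact_factor = (citations_to_2013_items + citations_to_2014_items) / (
    items_2013 + items_2014
)

# Averaging the two yearly ratios gives a different number.
mean_of_yearly_ratios = (
    citations_to_2013_items / items_2013 + citations_to_2014_items / items_2014
) / 2

print(f"{impact_factor:.3f}")          # 1.500
print(f"{mean_of_yearly_ratios:.3f}")  # 1.875
```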

What about how skewed it is?
A corollary complaint can be that citations skew heavily to a few dominant papers, a skew which, it is sometimes argued, invalidates the metric. After all, the ratio is not predictive of what each paper will get. (Of course, to argue this, you first have to forget that this is not what the impact factor was designed to calculate -- it is not predictive for authors or papers specifically, but rather a journal-level metric). But would any system that skews to a few big events therefore be invalid?

Perhaps not. There are similar sources of skew in academia, many of which are celebrated. For instance, if a Nobel Prize winner teaches or conducts research at a university, that is often touted as a sign of the quality of that university. Will each professor or post-doc or student at that university achieve the same level of success and win the Nobel Prize? Certainly not. But that's not the point. What these facts illustrate is that the university has an environment capable of producing a Nobel Prize winner. For ambitious students and researchers, that's a strong signal that speaks to their aspirations. 

Even within a career, not every year is as good as every other, and one really good year can make a career. Hit it big in the lab, win a teaching award, publish a great paper, do some great field work, or write an insightful editorial, and a scientist might leap from an obscure university to a top school, a government appointment, or national celebrity status. Does the fact that the next few decades might be lackluster invalidate the notoriety and honors? Certainly not. The accomplishment suggests the levels this person can reach, and that is the source of reputation -- they can reach those levels, and may do so again.

The bottom line is that inferring a promise of future results from past performance in academia is part of how academia works -- it is a culture of reputation. For journals, impact factor is a reasonable and useful measure of reputation (as we'll see below).

The impact factor is not dead.
Even if you were to accept the arguments denigrating the technical execution of the impact factor, journals should not abandon it, because it is not a dead metric. In fact, it's quite healthy.

Looking back at various tenures as publisher and advisor to publishers over my career so far, I have found the impact factor to be a responsive metric, one that reflects editorial and publishing improvements. You fix things, and it responds. Editors compete harder for papers, get righteous about discerning cutting-edge from "me too" papers, appear more at conferences, twist arms, and so forth. The publishing house does a better job with media placements and awareness campaigns so that more people in the community learn about the new scientific and research findings. In a few years, the impact factor climbs. There is a cause-and-effect that strongly suggests that, from an editorial and publishing perspective, and therefore from a reader and purchasing perspective (and perhaps from an author perspective), the impact factor does a good job reflecting journal vibrancy and importance.

Some critics say that instead of looking at the impact factor, works should be evaluated on their merits by experts qualified to do so, and everyone would agree with that. What these critics seem to forget is that the editorial practices that generally lead to improvements in impact factors are exactly what is desired -- expert editors and their expert editorial boards working harder and more aggressively to secure the best papers from the scientists doing the most interesting work. These are then evaluated, and a portion of them published. The papers are reviewed on their own merits by experts in the field. The journal is just doing the hard work of making the first-order selections.

Put forth a better editorial effort, and your impact factor generally increases.

Making the field more aware of the good science being published also drives impact factor legitimately. Citation begins with awareness. You can't cite what you don't know about, so using social and traditional media, meetings, SEO, and other ways to build awareness is an important publishing practice. Marry this with better papers that are more interesting and relevant, and you have a winning combination.

The impact factor seems to respond commensurately with these efforts. In some very competitive situations, where the editorial teams are evenly matched and equally competitive, you may only see a stalemate. But in fields where one journal takes the bit in its proverbial teeth while the others chew the proverbial grass, you can see a true performance difference within a fairly short amount of time.

If editors, libraries, readers, and authors had a measure that gave them a good way of quickly assessing the relative punching power of a journal they are considering -- that might show them which journals are headed up, which are headed down, and which are in a dead heat -- and this measure was fairly responsive to sound editorial and publishing practices, you'd suspect they'd want to use it. If it also made it easier to differentiate between smaller, lesser-known journals, that might also be good. And if it had a long track record that seemed to remain valid and provided good context, that might also be valuable.

Which is why it's very likely that, despite all the crepe being hung and predictions of its demise, given its responsiveness to solid editorial and publishing improvements and the signals to the larger library, author, and reader markets it provides, the impact factor . . . well, the impact factor lives . . . lives and breathes.

* Hat tip to BH for pointing out the ratio aspect.