Wednesday, January 31, 2018

Physics Facts and Figures

Physics is old. Together with astronomy, it’s the oldest scientific discipline. And the age shows. Compared to other scientific areas, physics is a slowly growing field. I learned this from a 2010 paper by Larsen and von Ins. The authors counted the number of publications per scientific area. In physics, the number of publications grows at an annual rate of 3.8%. This means it currently takes 18 years for the body of physics literature to double. For comparison, the growth rates for publications in electrical engineering and in technology are 9% and 7.5%, with doubling times of 8 years and 9.6 years, respectively.
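
The link between growth rate and doubling time can be checked with a few lines of Python. This is a minimal sketch assuming steady exponential growth at a constant annual rate; the function name `doubling_time` is made up for illustration and does not come from the paper.

```python
import math

def doubling_time(annual_rate):
    """Years needed to double at a constant annual growth rate (0.038 means 3.8%)."""
    return math.log(2) / math.log(1 + annual_rate)

# Growth rates quoted in the text above.
for rate in (0.038, 0.09, 0.075):
    print(f"{rate:.1%} per year -> doubles in {doubling_time(rate):.1f} years")
```

Under this assumption the three rates give roughly 18.6, 8.0 and 9.6 years, in line with the rounded figures quoted above.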

The total number of scientific papers closely tracks the total number of authors, irrespective of discipline. The relation between the two can be approximately fit by a power law, so that the number of papers is proportional to the number of authors to the power of β. But this number, β, turns out to be field-specific, which I learned from a more recent paper: “Allometric Scaling in Scientific Fields” by Dong et al.

In mathematics the exponent β is close to one, which means that the number of papers increases linearly with the number of authors. In physics, the exponent is smaller than one, approximately 0.877. And not only this, it has been decreasing in the last ten years or so. This means we are seeing here diminishing returns: More physicists result in a less than proportional growth of output.
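
To make the meaning of β concrete, here is a minimal sketch of how such an exponent can be estimated: a straight-line least-squares fit in log-log space. The fitting procedure, the function name `fit_power_law` and the data are assumptions for illustration; the numbers are synthetic and are not taken from Dong et al, whose actual method may differ.

```python
import math

def fit_power_law(authors, papers):
    """Least-squares fit of log(papers) = log(c) + beta * log(authors).

    Returns the pair (c, beta).
    """
    xs = [math.log(a) for a in authors]
    ys = [math.log(p) for p in papers]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    c = math.exp(mean_y - beta * mean_x)
    return c, beta

# Synthetic data generated with beta = 0.877 (made up, not from the paper).
authors = [100, 1_000, 10_000, 100_000]
papers = [2.0 * a ** 0.877 for a in authors]

c, beta = fit_power_law(authors, papers)
print(f"fitted beta = {beta:.3f}")

# Sub-linear scaling: doubling the authors multiplies the papers by 2**beta.
print(f"doubling authors scales papers by {2 ** beta:.2f}")
```

With β = 0.877, doubling the number of authors multiplies the number of papers by about 1.84 rather than 2, which is what "less than proportional" means here.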

Figure 2 from Dong et al, Scientometrics 112, 1 (2017) 583.
β is the exponent by which the number of papers
scales with the number of authors.
The paper also found some fun facts. For example, a few sub-fields of physics are statistical outliers in that their researchers produce more than the average number of papers. Dong et al quantified this by a statistical measure that unfortunately doesn’t have an easy interpretation. Either way, they offer a ranking of the most productive sub-fields in physics which is (in order):

(1) Physics of black holes, (2) Cosmology, (3) Classical general relativity, (4) Quantum information, (5) Matter waves, (6) Quantum mechanics, (7) Quantum field theory in curved space time, (8) General theory and models of magnetic ordering, (9) Theories and models of many electron systems, (10) Quantum gravity.

Isn’t it interesting that this closely matches the fields that tend to attract media attention?

Another interesting piece of information that I found in the Dong et al paper is that in all sub-fields the exponent relating the number of citations with the number of authors is larger than one, approximately 1.1. This means that on the average the more people work in a sub-field, the more citations they receive. I think this is relevant information for anyone who wants to make sense of citation indices.

A third paper that I found very insightful for understanding the research dynamics in physics is “A Century of Physics” by Sinatra et al. Among other things, they analyzed the frequency with which sub-fields of physics reference their own or other sub-fields. The most self-referential sub-fields, they conclude, are nuclear physics and the physics of elementary particles and fields.

Papers from these two sub-fields also have by far the lowest expected “ultimate impact” which the authors define as the typical number of citations a paper attracts over its lifetime, where the lifetime is the typical number of years in which the paper attracts citations (see figure below). In nuclear physics (labeled NP in figure) and particle physics (EPF), the interest of papers is short-term and the overall impact remains low. By this measure, the category with the highest impact is electromagnetism, optics, acoustics, heat transfer, classical mechanics and fluid dynamics (labeled EOAHCF).

Figure 3 e from Sinatra et al, Nature Physics 11, 791–796 (2015).

A final graph from the Sinatra et al paper which I want to draw your attention to is the productivity of physicists. As we saw earlier, the total number of papers normalized to the total number of authors is somewhat below 1 and has been falling in the recent decade. However, if you look at the number of papers per author, you find that it has been sharply rising since the early 1990s, ie, basically ever since there was email.

Figure 1 e from Sinatra et al, Nature Physics 11, 791–796 (2015)

This means that the reason physicists seem so much more productive today than when you were young is that they collaborate more. And maybe it’s not so surprising because there is a strong incentive for that: If you and I both write a paper, we both have one paper. But if we agree to co-author each other’s paper, we’ll both have two. I don’t mean to accuse scientists of deliberate gaming, but it’s obvious that accounting for papers by the number puts single-authors at a disadvantage.
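
The counting argument can be spelled out in a short sketch. The function `paper_counts` and the author names are hypothetical, and the fractional option (splitting each paper's credit equally among its authors) is an alternative accounting scheme added here for comparison, not something the papers discussed above use.

```python
from collections import Counter

def paper_counts(papers, fractional=False):
    """Count papers per author.

    papers is a list of author lists. With fractional=True, each paper's
    credit is split equally among its authors instead of counting fully
    for every author.
    """
    counts = Counter()
    for author_list in papers:
        credit = 1 / len(author_list) if fractional else 1
        for author in author_list:
            counts[author] += credit
    return dict(counts)

solo = [["you"], ["me"]]                    # each of us writes one paper alone
coauthored = [["you", "me"], ["you", "me"]]  # we co-author each other's paper

print(paper_counts(solo))                         # {'you': 1, 'me': 1}
print(paper_counts(coauthored))                   # {'you': 2, 'me': 2}
print(paper_counts(coauthored, fractional=True))  # {'you': 1.0, 'me': 1.0}
```

Counting whole papers rewards the co-authoring arrangement with twice the tally for the same two papers, while splitting credit leaves both tallies unchanged.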

So this is what physics is, in 2018. An ageing field that doesn’t want to accept its dwindling relevance.

23 comments:

auke said...

"This means it currently takes 18 years for the body of physics literature to double."

A missed opportunity to say that physics literature has a half-life of -18 years!

Sylvia Wenmackers said...

I don't see your final conclusion follow from the facts and figures that you discussed. Experimental work gets more challenging as time progresses: we're out of falling apples and low hanging fruit, so better collaborate to get to new physics (at higher energies, better resolution, etc.). Fewer new papers based on better collaboration isn't a "diminishing return"!
Since this effect overlaps with that of easier international communication and possibly pernicious publication incentives, it seems we need more data to tell which is which.

Sabine Hossenfelder said...

Sylvia,

Yes, as I said, physics is an old discipline, the easy things have been done, progress slows down etc etc

The final line isn't a conclusion, merely my interpretation.

The "diminishing returns" is a phrase from the paper; it just refers to the sub-linear relation.

Of course people collaborate more because that exploits so-far unused potential. If you think I am saying that's a bad thing, you assign opinions to me that I don't hold and didn't express. I am merely pointing out that the incentive to add co-authors to increase the number of publications exists. What we are seeing in the data is almost certainly a combination of both.

Uncle Al said...

Empirically sterile theory desperately cherry-picks, aberrantly fantasizes, and furthers its own propagation:

https://www.scientificamerican.com/article/missing-neutrons-may-lead-a-secret-life-as-dark-matter/

Nonperformance commonality points to correction. Beautiful fundamental symmetries (Noether) fail. Equivocate emergent gauge symmetries (arXiv:1710.01791). Emergent symmetry geometric chirality (ugly matrices!) is physics' common mode failure. It is quantifiable in existing apparatus as chemistry. Look, ending damnation with relief.

http://www.mpsd.mpg.de/59126/Research
... The artillery needs quantitatively better ammunition.

Bill said...

I'm unsure what field "physcis" falls into from Figure 2.

Sabine Hossenfelder said...

Bill,

Figure 2? You mean the second figure in the blogpost? All of the dots are sub-fields of physics. EPF stands for "elementary particles and fields" and NP stands for Nuclear Physics. You can download a pdf of the paper here, in which you find what the other abbreviations mean.

Potato Joe said...

If I may, I believe Bill is referring to fig 2 from the Dong et al paper (sorry, can’t get Beavis outa my head "heh heh, he said Dong").

The second set of data is labelled with the typo "Physcis (subsets)."

Dan Carney said...

"(1) Physics of black holes, (2) Cosmology, (3) Classical general relativity, (4) Quantum information (5) Matter waves (6) Quantum mechanics (7) Quantum field theory in curved space time (8) general theory and models of magnetic ordering (9) Theories and models of many electron systems (10) Quantum gravity."

What kind of terrible false 10-chotomy is that? ;) Even if the papers are allowed multiple classifications, this kind of list seems pretty bogus.

That said, the rest of this seems interesting and useful to think about for a bit. I like the long-term impact thing--do you think that could be converted into a useful metric for funding and hiring purposes?

neo said...

one solution i've proposed is that top physics departments deliberate create hire and train non-string QG researchers. i.e researchers in LQG CDT AS EG etc

a concrete implementation would be for every university that has a string theory group, universities like princeton harvard yale stanford, MIT they also establish a LQG/LQC, CDT, AS group, MOND as QG, with funding phd's post docs grad students, even undergrad courses

What do you think about this proposal? Is there a particular candidate QG that you personally favor and think is most promising, and should other universities hire faculty who specialize in it?

Louis Tagliaferro said...

I see a lot of this article being relevant to the article you posted on December 12th and the need to be published due to Academia’s reward structure. Perhaps, we also are reaching human limits to understand and explain the Universe beyond the physics we now know? Either way I think many would agree there are only so many minds capable of practicing in the field at the level required. How many of those minds feel they will be better rewarded by applying their talents elsewhere?

Arun said...

What is the effect of the very large number of authors on an accelerator experiment paper?

Jeff said...

"...its dwindling relevance."

Thank you. Your writing reflects my own feelings, which I usually avoid posting because I'm not a physicist. It seems to me that the spread of untestable ideas like the multiverse is in the end self-correcting. If key members of the discipline choose to champion ideas that are "ineffective"--i.e., ideas that are widely viewed as being disconnected from the physical world--then the discipline will wither from lack of interest. Right now I think it's a real concern.

Sabine Hossenfelder said...

Dan,

What makes you say the list is "bogus"?

Tobias Kosub said...

Please differ.

Exp and Theo physics are so different in all the respects discussed in this blog.

One thing i have witnessed throughout the last 10 years or so:

experimentalists now have good training in working with large data, in automization and in simulations. they didn't have this before when computer natives were still too young. at that time this kind of support was sometimes provided through experts on the Theo side of the physics.

but now my feeling is that the lot of theorists have become too detached from experiments even in graspable field like cond mat.

therefore the realm of theo is pushed out beyond experiments, which is indeed a very difficult situation for all concurrent purely theoretical fields of physics.

Dan Carney said...

Sabine, I just mean: take a random paper--lets say the AMPS paper--and try to apply that classification. It easily has four or five of those classifications. So I'm pretty skeptical of drawing conclusions based on a scheme like that, especially if papers only got one tag. If papers were allowed multiple classifications, then some categories will get overcounted, etc.--if I'm being too unkind to the authors' methodology, I apologize, these seem like pretty basic issues to me.

On the one hand I'm just complaining about ascribing very blurry categories to things, making statistical statements about the counts and then claiming that the study is "quantitative". Perhaps I have been forced to listen to too many psychology talks. But more seriously, I think it's similar to, eg, arxiv classifications. It seems to me that the choice to put something in hep-th, vs say gr-qc or quant-ph, is often at least as much a "political" issue as it is about the actual subject matter in a paper.

Sabine Hossenfelder said...

Dan,

These tags are PACS classifications. In the paper they use a sample from PRL of which all papers have such classifications. A paper can (and usually does) have several tags. The AMPS paper is not in this sample. It would almost certainly have been tagged as black hole physics though.

Topher said...

With respect to experimental particle physics, I think a lot of these statistics can be explained by the particular challenges we're facing rather than the maturity or dysfunctionality of the field. It's highly self-referential mostly because we have the Particle Data Group that nicely summarizes and averages all the individual measurements. If you're not in that sub-field, you are more interested in pdg.lbl.gov than you are in any one particular paper.

But it's unfortunate that this summary is so effective. If each measurement just went to refining some esoteric statistic that no one in their right mind should care about, you should say that we're wasting time and resources on trivia. But the real motivation behind almost all of these papers has been a frustrated attempt to reveal some behavior that's at odds with predictions of the standard model. So individually, each and every paper in this field is uninteresting. But the overall lesson is really very interesting and even unexpected, that we can't yet find at high energies any evidence for new phenomena. It's a sobering result, and one that does not make for good press.

Peter Erwin said...

Some comments:

The oldest science is arguably astronomy, yes. The next oldest is probably medicine, followed by mathematics. (There are discussions of both medicine and mathematics in ancient Egyptian and Babylonian records, but nothing that would qualify as "physics".) Physics per se dates to the Classical Greek period, and is roughly similar in age to non-medical biology (aka "natural history"); for example, Aristotle serves as a key founding figure for both.

So by the simplistic "age = scholarly decrepitude" argument, astronomy should probably have ground to a halt several centuries ago, followed by mathematics and medicine, and physics should be doing better than any of them. But no one is claiming, for example, that medical and biological research is slowing down dramatically.


"A final graph from the Sinatra et al paper which I want to draw your attention to is the productivity of physicists. As we saw earlier, the total number of papers normalized to the total number of authors is somewhat below 1 and has been falling in the recent decade. However, if you look at the number of papers per author, you find that it has been sharply rising since the early 1990s, ie, basically ever since there was email."

A minor point: You've got the references to the lines in that figure a bit mixed up: "number of papers per author" is the red line (roughly constant), which is the same as "total number of papers normalized to the total number of authors". What's been rising very recently is "productivity", which is "number of papers co-authored by each physicist".

I'll also note that the Sinatra et al. (2015) article explicitly disagrees with your assessment of physics. E.g., from the first page: "Note, however, that the growth rate of physics is indistinguishable from the growth of science in general. Hence, the field’s exponential growth is not driven by paradigm changes, but by societal needs, and capped by access to resources... Once again, the recent [post-1970] slowdown is not unique to physics, but characterizes the whole scientific literature contained in WoS."

Kaleberg said...

Did the LHC Higgs paper with 5,154 authors affect the statistics? (I doubt it, but there could be more "shared facility" papers with large numbers of authors.)

Sabine Hossenfelder said...

Kaleberg,

Which statistics? Almost all analysis of paper statistics exclude large collaborations. One would really need entirely different tools to analyse those.

metaplec said...

I believe different mechanisms are driving down the exponent in physics vs math. If you open a typical ArXiv page, you see that most preprints have one or two authors, roughly equally divided between the two. Math tends to be a solitary activity, driving to lower number of authors, but also difficult to get published, tending to lower number of papers.
In experimental physics, you have the ridiculously expensive areas like elementary particle, nuclear, fusion, and now gravitational. With 1k authors per paper that will drive the exponent down. Then you have areas like AMO and condensed matter. These typically have five authors. A well established group can put out a bunch of papers from a single experimental setup. But, there are a LOT of AMO and condensed matter groups, so competition for money is fierce. Often these groups will go through dry periods of little funding and work will grind to a halt. When graduate students leave, they take their lab knowledge with them, and the group has to rediscover the wheel when new students come in. This drives the exponent down for small groups.
On the other hand, in physics theory, the barrier to publication is low, so it doesn't have the math problem, and the resources required are low, so it doesn't have the physics experimental problem. So physics theory should drive the exponent up.

Unknown said...

It is obvious that one should divide credit for a paper by the number of authors. So if you collaborate with a co-author on your two papers or write them alone, is exactly the same, 2 times 0.5 or 1 paper credit. However your coauthor will cite both of your half credit papers but he won't cite yours. So twenty authors are great for your CI. Of course everybody is gaming this system as hard as they can until they have tenure. By that time they got used to it.

Seth Thatcher said...

This is interesting and is a new way to look at Physics. Each new piece of knowledge in Physics is hard fought to produce due to its age and the depths already plumbed. Still I am optimistic that physicists are on the verge of a huge breakthrough. A revolution in thinking is gathering steam. This is the most noble area of scientific study due to its profound implications about existence.