Feynman and transients

In 1966-1969 I was active in an experimental particle physics group at Caltech and taught intro physics using the Feynman Lectures on Physics as the textbook, which was a fantastic experience. I often saw Feynman at lunch in the Caltech cafeteria but only once had a substantive discussion with him, which had a long-lasting influence on my physics teaching.

I found myself puzzled about blackbody radiation, a topic in the course. One common way of talking about blackbody radiation is to imagine an oven with a small hole from which radiation escapes, and to imagine the oven walls to be made of material with evenly spaced energy levels that emit similarly quantized photons. But no solid material has such an energy level scheme (the Einstein solid is such a material but is just a highly simplified though useful model), so what’s going on? (The proper approach is to quantize the electromagnetic field, not the emitters.)

I decided to make an appointment to talk with Feynman about this. I was acutely aware that my question was likely to sound hopelessly confused and naive, so I overprepared for the meeting and spoke really fast to try to show that there was an issue. As I expected, initially his face darkened as he wondered why this idiot had been allowed on campus, let alone teaching the curriculum he had created. But by continuing to talk really fast I got far enough to get him intrigued, and he could see that there was an interesting question, and we had an interesting and substantive conversation.

We considered together an astronomically large cloud of atomic hydrogen with some initial energy in the form of atomic excitation. This cloud will emit a line spectrum, not blackbody radiation, yet thermodynamics tells us that eventually the cloud (together with the radiation) will reach thermal equilibrium and the energy distribution of the radiation will be the blackbody continuum. How do the cloud and radiation get from the initial state to this equilibrium state? (It’s not quite an equilibrium state because energy is being radiated away.)

We could see that various processes would alter the initial line spectrum, including the Doppler effect (from recoil associated with emission) and collisional broadening. So there’s no mystery in the fact that we don’t expect a clean line spectrum to persist. The details of the transient that gets us from initial state to final state may be quite complex, and hard to calculate in detail, but just recognizing that there must be a transient gives a sense of mechanism that is lacking if the final state is presented with no preamble.

Next Feynman supposed that somewhere in the cloud is a speck of dust. (I can no longer remember whether this was special magic dust with evenly spaced energy levels.) Thermodynamics assures us that eventually the hydrogen atoms must come into thermal equilibrium with that speck of dust. Thermodynamics has great power in this respect, but using it alone tends to remove all sense of mechanism.

I believe that this was my first experience of the sense of mechanism that comes from discussing the transient that leads to establishing an equilibrium state or a steady state. In retrospect I think that as a student I was somewhat puzzled by how certain states came into being, but as far as I can remember there was no talk of the transients. Another person who influenced my thinking on this was Mel Steinberg, the creator of the CASTLE electricity curriculum. In the late 1980s I took an AAPT (American Association of Physics Teachers) workshop from him that included desktop experiments with half-farad capacitors (“supercaps”). One of the things he stressed was that the several-second time constant for charging or discharging could be usefully thought of as an observable transient leading to an equilibrium state. This viewpoint in turn influenced the emphasis in Matter & Interactions on the transient that leads to the steady state in DC circuits.
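To make the capacitor transient concrete, here is a minimal sketch of stepping through the charging of a half-farad capacitor in a simple series circuit. The battery voltage and the constant 10-ohm resistance are values I have assumed for illustration (a real bulb's resistance changes as it heats), and the function name is hypothetical. The point is only that the equilibrium state emerges step by step from the loop rule.

```python
import math

def charge_capacitor(emf, resistance, capacitance, dt, t_end):
    """Step a series RC circuit through its charging transient.

    Returns a list of (time, capacitor voltage) pairs.
    """
    charge = 0.0                      # the capacitor starts uncharged
    history = [(0.0, 0.0)]
    steps = round(t_end / dt)
    for n in range(1, steps + 1):
        v_cap = charge / capacitance
        current = (emf - v_cap) / resistance   # loop rule: emf = I*R + Q/C
        charge += current * dt                 # charge that arrives during dt
        history.append((n * dt, charge / capacitance))
    return history

EMF = 3.0          # volts (assumed: two flashlight cells)
RESISTANCE = 10.0  # ohms (assumed, treated as constant)
CAPACITANCE = 0.5  # farads, the half-farad supercap
TAU = RESISTANCE * CAPACITANCE   # time constant, 5 s for these values

history = charge_capacitor(EMF, RESISTANCE, CAPACITANCE, dt=0.01, t_end=5 * TAU)
for t, v_step in history[::500]:               # one line per time constant
    v_exact = EMF * (1 - math.exp(-t / TAU))
    print(f"t = {t:5.1f} s   stepwise {v_step:.3f} V   analytic {v_exact:.3f} V")
```

With these assumed values the time constant is 5 seconds, slow enough to watch, and the stepwise voltage tracks the analytic exponential closely while showing how each small interval moves the circuit toward equilibrium.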

Another influence on both Ruth Chabay and me was doing numerical integrations with a computer, where you gain the strong sense of things happening step by step, not just described in terms of an analytical solution that is a known function of time, which gives little sense of the time evolution of the process.

All of this has had a big effect on the Matter & Interactions curriculum, plus Ruth Chabay’s insight that computational modeling must be a part of introductory physics. Thinking Iteratively is a 30-minute video of a talk we gave at the summer 2014 AAPT meeting which includes examples of these issues.

Bruce Sherwood


A taste of geometric algebra

David Hestenes’ goal for geometric algebra is to subsume under one umbrella many different kinds of mathematics used by scientists and engineers (see the Hestenes web site, and especially his Oersted Medal Lecture). The key to this unification is to provide powerful geometrical representations for all kinds of mathematical topics, many of which are typically learned without a strong connection to a good geometrical representation. Consider the benefit ordinary 3D vectors provide compared to dealing with components individually. Vectors and vector notation not only simplify many mathematical operations but also provide a richer conceptual space for thinking about geometry. Geometric algebra has the same goals and stands in relation to ordinary vectors as vectors stand in relation to operating with components individually.

My goal for this article is to give a small taste of geometric algebra to give a sense of its structure and to illustrate how it can span diverse branches of mathematics that physicists currently study in isolation from each other.

The fundamental entity in geometric algebra is the “multivector” consisting in 3D of four elements: scalar, vector, bivector (a 2D surface with a directed normal), and trivector (a 3D solid). Geometric algebra can also be used in 2D, or in dimensions higher than 3D, but for purposes of a brief introduction we’ll stick with the 3D context. One writes a multivector as a sum: scalar + vector + bivector + trivector. This may look odd, since one is taught that “you can’t add a scalar and a vector,” but note that one often writes a vector in the form x\hat{\imath} + y\hat{\jmath} + z\hat{k} where three very different things are added together. From a computer programming point of view, one might think of a multivector as a heterogeneous list: [scalar, vector, bivector, trivector], with methods for operating on such lists.

Fundamental to geometric algebra is the “geometric product” ab, where a and b are multivectors. This product is defined in such a way that multiplication is associative, abc = (ab)c = a(bc), but it is not necessarily commutative; ba is not necessarily equal to ab. If a and b are ordinary vectors, the geometric product is a\cdot b + a\wedge b, where a\wedge b is a bivector that is (only in 3D) closely related to the ordinary vector cross product (\wedge is pronounced “wedge”). For vectors a and b the geometric product ba will not be equal to ab if the wedge product is nonzero, since b\wedge a = -a\wedge b.

The dot product a\cdot b (the scalar part of ab) measures how parallel the two vectors are, while a\wedge b (the bivector part of ab) measures how perpendicular they are. Together these two measures provide all the information there is about the relationship between the two vectors, and thereby capture important information that neither the dot product nor the cross product alone provides. Another way of saying this is that the dot product is the symmetric part of ab and the wedge product is the antisymmetric part of ab.

One way to represent the wedge product of two vectors a\wedge b geometrically is to draw the two vectors tail to tail and make these two vectors the sides of a parallelogram. The area of the parallelogram is the magnitude of the bivector. Compare with the magnitude of the vector cross product |a||b|\sin{\theta} and you’ll see that this is equal to the area of the parallelogram associated with the two vectors.
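As a small illustration of the scalar and bivector parts of the product of two ordinary 3D vectors, here is a sketch in Python. The function names and the two sample vectors are my own choices for illustration, and the bivector is stored simply as its three components on the yz, zx, and xy planes (numerically the same three numbers as the cross product components, which holds in 3D only).

```python
import math

def dot(a, b):
    """Scalar (symmetric) part of the geometric product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def wedge(a, b):
    """Bivector (antisymmetric) part, as components on the (yz, zx, xy) planes."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def geometric_product(a, b):
    """Return the pair (scalar part, bivector part) of ab for vectors a and b."""
    return dot(a, b), wedge(a, b)

def bivector_magnitude(bivector):
    return math.sqrt(sum(c * c for c in bivector))

a = (2.0, 0.0, 0.0)   # sample vectors (assumed): base 2 along x ...
b = (1.0, 3.0, 0.0)   # ... and a side that rises 3 in y
scalar_ab, bivector_ab = geometric_product(a, b)
scalar_ba, bivector_ba = geometric_product(b, a)

print("ab:", scalar_ab, bivector_ab)
print("ba:", scalar_ba, bivector_ba)
print("parallelogram area:", bivector_magnitude(bivector_ab))
```

Swapping the order leaves the scalar part alone and flips the sign of the bivector part, and the bivector magnitude is the base times the height of the parallelogram.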

We’ll investigate some basic aspects of geometric algebra by starting with three ordinary vectors \sigma_1, \sigma_2, and \sigma_3 that are unit vectors in the x, y, and z directions. The geometric product \sigma_1\sigma_1 = 1, because the wedge product of a vector with itself has no area, so the bivector part of \sigma_1\sigma_1 is zero; similarly for the other two unit vectors.

The quantity \sigma_1\sigma_2 = 0 + \sigma_1\wedge\sigma_2 is a unit bivector which can be represented as a 1 by 1 square in the xy plane (the dot product is zero because the two vectors are perpendicular to each other). Similarly \sigma_2\sigma_3 is a unit bivector in the yz plane and \sigma_3\sigma_1 is a unit bivector in the zx plane. The wedge product is antisymmetric, so \sigma_2\sigma_1 = -\sigma_1\sigma_2; similarly for the other unit vectors.

Next, consider the geometric product of these bivectors with the unit vectors, using the fact that the geometric product is associative and that \sigma_2\sigma_1 = -\sigma_1\sigma_2:

\sigma_1\sigma_2\sigma_1 = \sigma_1(\sigma_2\sigma_1) = \sigma_1(-\sigma_1\sigma_2) = -(\sigma_1\sigma_1)\sigma_2 = -\sigma_2
\sigma_1\sigma_2\sigma_2 = \sigma_1(\sigma_2\sigma_2) = \sigma_1

We have similar results for other products of the bivectors and the vectors.

What is \sigma_1\sigma_2\sigma_3? This is a “trivector,” a cube 1 by 1 by 1. Something surprising results if we multiply this unit trivector by itself:

(\sigma_1\sigma_2\sigma_3)(\sigma_1\sigma_2\sigma_3) = \sigma_1\sigma_2\sigma_3\sigma_1(\sigma_2\sigma_3)
= \sigma_1\sigma_2\sigma_3\sigma_1(-\sigma_3\sigma_2)
= -\sigma_1\sigma_2\sigma_3(\sigma_1\sigma_3)\sigma_2
= -\sigma_1\sigma_2\sigma_3(-\sigma_3\sigma_1)\sigma_2
= \sigma_1\sigma_2(\sigma_3\sigma_3)\sigma_1\sigma_2
= \sigma_1\sigma_2\sigma_1\sigma_2
= \sigma_1\sigma_2(\sigma_1\sigma_2)
= \sigma_1\sigma_2(-\sigma_2\sigma_1)
= -\sigma_1\sigma_2\sigma_2\sigma_1
= -\sigma_1(\sigma_2\sigma_2)\sigma_1
= -\sigma_1\sigma_1
= -1

This result justifies identifying the trivector \sigma_1\sigma_2\sigma_3 with the imaginary number i. Now consider this:

(\sigma_1\sigma_2\sigma_3)\sigma_1 = \sigma_1\sigma_2(\sigma_3\sigma_1)

i\sigma_1 = \sigma_1\sigma_2(-\sigma_1\sigma_3)
= -\sigma_1(\sigma_2\sigma_1)\sigma_3
= -\sigma_1(-\sigma_1\sigma_2)\sigma_3
= (\sigma_1\sigma_1)\sigma_2\sigma_3
= \sigma_2\sigma_3

The bivector \sigma_2\sigma_3 lies in the yz plane. The standard vector cross product of \sigma_2 and \sigma_3 points in the +x direction, which is \sigma_1. The familiar cross product vector is the normal to the associated bivector (in 3D only), and evidently the bivector is i times the cross product vector. Similarly, you can show that i\sigma_2 = \sigma_3\sigma_1 and i\sigma_3 = \sigma_1\sigma_2. It turns out that bivectors are more useful and better behaved than their “duals,” the cross products. For example, in the old vector world one must sometimes make subtle distinctions between “polar” vectors (the ordinary kind) and “axial” vectors which behave differently under reflection (examples are magnetic field vectors). In geometric algebra there is no such distinction.
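The sign bookkeeping in these products is mechanical enough to hand to a computer. Here is a minimal sketch, using a representation I have made up for the purpose: a product of unit vectors is stored as a sign together with a tuple of indices, so that (1, (1, 2)) stands for \sigma_1\sigma_2. It uses only the two rules stated above: swapping two different neighbouring unit vectors flips the sign, and a unit vector times itself is 1.

```python
def blade_product(a, b):
    """Multiply two signed products of the unit vectors sigma_1, sigma_2, sigma_3.

    Each argument is (sign, indices); for example (1, (1, 2)) is sigma_1 sigma_2.
    """
    sign = a[0] * b[0]
    indices = list(a[1] + b[1])
    # Bubble sort the indices. Each swap of two different neighbours flips
    # the sign, because sigma_j sigma_k = -sigma_k sigma_j when j != k.
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(indices) - 1):
            if indices[i] > indices[i + 1]:
                indices[i], indices[i + 1] = indices[i + 1], indices[i]
                sign = -sign
                swapped = True
    # Cancel equal neighbours, because sigma_k sigma_k = 1.
    reduced = []
    for k in indices:
        if reduced and reduced[-1] == k:
            reduced.pop()
        else:
            reduced.append(k)
    return (sign, tuple(reduced))

S1, S2, S3 = (1, (1,)), (1, (2,)), (1, (3,))
I = (1, (1, 2, 3))                 # the unit trivector sigma_1 sigma_2 sigma_3

print("I I         =", blade_product(I, I))
print("I sigma_1   =", blade_product(I, S1))
print("I sigma_2   =", blade_product(I, S2))
print("I sigma_3   =", blade_product(I, S3))
```

In this representation (-1, ()) is the scalar -1, and (-1, (1, 3)) is -\sigma_1\sigma_3, which is the same thing as \sigma_3\sigma_1.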

When I first saw these relationships among the \sigma_1, \sigma_2, and \sigma_3, I was amazed. As a physics student I was introduced to the 2 by 2 “Pauli spin matrices” used to describe electron spin. The matrices, and their various product and commutation relationships, were taught as something special and particular to quantum mechanical spin systems. I was astonished to find that those 2 by 2 matrices behave exactly like the unit vectors in the geometric algebra context, as discussed above. This is an example of Hestenes’ argument that the mathematical education of physicists fails to bring together diverse branches of mathematics that can be unified in the geometric algebra context.
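For readers who want to check this correspondence themselves, here is a short sketch that multiplies the standard Pauli matrices using plain Python tuples and complex numbers. The helper names are my own. It tests the same relations derived above for the unit vectors, with the ordinary imaginary unit standing in for the trivector.

```python
def matmul(A, B):
    """Product of two 2 by 2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def scale(c, A):
    """Multiply every entry of matrix A by the number c."""
    return tuple(tuple(c * x for x in row) for row in A)

IDENTITY = ((1, 0), (0, 1))
SIGMA_1 = ((0, 1), (1, 0))
SIGMA_2 = ((0, -1j), (1j, 0))
SIGMA_3 = ((1, 0), (0, -1))

# Each matrix squares to the identity, like a unit vector.
print(matmul(SIGMA_1, SIGMA_1) == IDENTITY)
# Different matrices anticommute, like perpendicular unit vectors.
print(matmul(SIGMA_2, SIGMA_1) == scale(-1, matmul(SIGMA_1, SIGMA_2)))
# The triple product behaves like i, and i sigma_1 = sigma_2 sigma_3.
TRIPLE = matmul(matmul(SIGMA_1, SIGMA_2), SIGMA_3)
print(TRIPLE == scale(1j, IDENTITY))
print(matmul(TRIPLE, SIGMA_1) == matmul(SIGMA_2, SIGMA_3))
```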

Another example of a need for unification is that as a physics student one encounters many different schemes for handling rotations. There is a beautiful representation of rotations in geometric algebra. Consider the geometric product abb = a(bb) = a if b is a unit vector. If we write this as (ab)b = a, and consider ab to be a rotation operator, you see that ab can be thought of as a rotor that rotates b into a (there is also scaling if one doesn’t use unit vectors).

For extensive treatments of geometric algebra, see for example the textbooks “Geometric Algebra for Physicists” and “Geometric Algebra for Computer Science.”

Bruce Sherwood


What is Light? What are Radio Waves?

  • A talk given at a Santa Fe Science Cafe, 2013 Jan. 16
  • Abstract of the talk
  • Video of the talk
  • Interview on KSFR radio (15 minutes; choose the 3rd audio option)

The great discovery by Maxwell about 150 years ago of the real nature of light stands as one of the greatest discoveries in all of human history. The goal of this talk was to share with people what light really is, because its nature is not widely understood. I also wanted to demystify “electromagnetic radiation” and “electric fields”, terms that for many people are rather scary due to a lack of understanding of what the terms really mean.

Technical comment for physicists: As a result of preparing and giving this talk, I had a minor insight about the physics of light. A colleague has argued that magnetic fields are merely electric fields seen from a different reference frame. I’ve argued that this isn’t the whole story. I offer several examples that show that magnetic fields are not simply relativistic manifestations of electric fields.

(1) All experiments on electrons to date are consistent with them being true point particles, with zero radius, yet they have a magnetic moment even when at rest. There is no reference frame moving with constant velocity in which the magnetic field of the electron vanishes.

(2) Light consists of electric and magnetic fields that are perpendicular to each other, propagating at the speed of light. There is no physically realizable reference frame in which it is possible to transform away the magnetic field.

(3) Here is my minor recent insight: In the classical wave picture, light is produced by accelerated charges. Because the velocity is constantly changing, there is no constant-velocity reference frame in which the charge is at rest, and in which the magnetic field of the charge vanishes.

Bruce Sherwood


Calculus and formal reasoning in intro physics

A physicist asked me, “One thing I noticed in most recent introductory physics textbooks is the slow disappearance of calculus (integrals and derivatives). Even calculus-based physics now hardly uses any calculus. What is the reason for that?” Here is what I replied:

Concerning calculus, I would say that I’m not sure the situation has actually changed all that much from when I started teaching calculus-based physics in the late 1960s. Looking through a 1960s edition of Halliday and Resnick, I don’t see a big difference from the textbooks of today.

More generally, there is a tendency for older faculty to deplore what they perceive to be a big decline in the mathematical abilities of their students, but my experience is that the students are adequately capable of algebraic manipulation and even calculus manipulation (e.g. they know the evaluation formulas for many cases of derivatives and integrals). What IS however a serious problem, and is perhaps new, is that many students ascribe no meaning to mathematical manipulations. Here is an example that Ruth Chabay and I have seen in our own teaching:

The problem is to find the final kinetic energy. The student uses the Energy Principle to find that K_f = 50 joules. Done, right? No! Next the student uses the mass to determine what the final speed v_f is. Then the student evaluates the expression \frac12 m v_f^2 (and of course finds 50 joules). Now the student feels that the problem is solved, and the answer is 50 joules.

We have reason to believe that what’s going on here is that kinetic energy has no real meaning for the student; rather, kinetic energy is the thing you get when you multiply \frac12 times m times the square of v. Until and unless you’ve carried out that particular algebraic manipulation, you haven’t evaluated kinetic energy.

Another example: A student missed one of my classes due to illness and actually went to the trouble of coming to my office to ask about what he’d missed, so he was definitely above average. The subject was Chapter 12 on entropy. I showed him an exercise I’d had the class do. Suppose there is some (imaginary) substance for which S = aE^{0.5}. How does the energy depend on the temperature? I asked him to do this problem while I watched. (The solution is that 1/T = dS/dE = 0.5aE^{-0.5}, so E = 0.25a^2T^2.) The student knew the definition 1/T = dS/dE, but he couldn’t even begin the solution. I backed up and backed up until finally I asked him, “If y = ax^{0.5}, what is dy/dx?” He immediately said that dy/dx = 0.5ax^{-0.5}. So I said, okay, now do the problem. He still couldn’t! His problem was that he knew a canned procedure that if you have an x, and there’s an exponent, you put the exponent in front and reduce the exponent by one, and that thing is called “dy/dx” but has no meaning. There is no way to evaluate dS/dE starting from aE^{0.5}, because there is no x, there is no y, and nowhere in calculus is there a thing called dS/dE.
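One way to attach meaning to dS/dE is to compute it as an actual ratio of small changes. The sketch below does this for the imaginary substance with S = aE^{0.5}, using a value of a and a step size that I have simply assumed, and then compares the result with E = 0.25a^2T^2 from the solution above.

```python
def entropy(energy, a):
    """Entropy of the imaginary substance, S = a * E**0.5."""
    return a * energy ** 0.5

def temperature(energy, a, d_energy=1e-6):
    """Temperature from 1/T = dS/dE, with dS/dE taken as a ratio of small changes."""
    d_entropy = entropy(energy + d_energy, a) - entropy(energy - d_energy, a)
    return (2 * d_energy) / d_entropy      # T = dE/dS

A = 2.0   # the constant a (assumed value, arbitrary units)
for energy in (1.0, 4.0, 9.0):
    T = temperature(energy, A)
    print(f"E = {energy:.1f}   T = {T:.4f}   0.25 a^2 T^2 = {0.25 * A**2 * T**2:.4f}")
```

Nothing here requires an x or a y: the derivative is just how much S changes when E changes a little.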

We are convinced that an alarmingly large fraction of engineering and science students ascribe no meaning to mathematical expressions. For these students, algebra and calculus are all syntax and no semantics.

A related issue is the difficulty many students have with formal reasoning, and here there may well be a new problem. It used to be that an engineering or science student would have done a high school geometry course that emphasized formal proofs, but this seems to be no longer the case. Time and again, during class and also in detailed Physics Education Research (PER) interviews with experimental subjects we see students failing to use formal reasoning in the context of long chains of reasoning. An example: Is the force of the vine on Tarzan at the bottom of the swing bigger than, the same as, or smaller than mg? The student determines \vec{p} just before and just after and correctly determines that d\vec{p}/dt points upward. The student concludes correctly that the net force must point upward. The student determines that the vine pulls upward and the Earth pulls downward. The student then says that the force of the vine is equal to mg! Various studies by Ruth Chabay and her PER grad students have led to the conclusion that the students aren’t using formal reasoning, in which each step follows logically from the previous step. Often the students just seize on some irrelevant factor (in this case, probably the compiled knowledge that “forces cancel”).

This problem with formal reasoning may show up most vividly in the Matter & Interactions curriculum, where we want students to carry out analyses by starting from fundamental principles rather than grabbing some secondary or tertiary formula. We can’t help wondering whether the traditional course has come to be formula-based rather than principle-based because faculty recognized a growing inability of students to carry out long chains of reasoning using formal procedures, so the curriculum slowly came to depend more on having students learn lots of formulas and the ability to see which formula to use.

Coming back to calculus, I assert that our textbook has much more calculus in it than the typical calculus-based intro textbook. This may sound odd, since we have had students complain that there’s little or no calculus in our book (we heard this more often from unusually strong students at Carnegie Mellon than at NCSU). The complaint is based on the fact that we introduce and use real calculus in a fundamental way right from the start, but many students do not see that the sum of a large number of small quantities has anything to do with integrals, nor that the ratio of small quantities has anything to do with derivatives. For formula-based students, \Delta \vec{p} = \vec{F}_{\text{net}}\Delta t has nothing to do with calculus, despite our efforts to help them make a link between their calculus course and the physics course.
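Here is a minimal sketch of what that link looks like in practice, for a mass on a spring. The mass, stiffness, starting position, and step size are values I have assumed for illustration. Each pass through the loop applies \Delta \vec{p} = \vec{F}_{\text{net}}\Delta t once (in one dimension); adding up thousands of these small changes is the integral, and the result can be compared with the analytical solution.

```python
import math

def step_spring(x, p, mass, stiffness, dt, steps):
    """Advance a 1D mass-spring system by repeated small momentum updates."""
    for _ in range(steps):
        f_net = -stiffness * x        # spring force at the current position
        p += f_net * dt               # momentum update: delta p = F_net * delta t
        x += (p / mass) * dt          # position update using the new momentum
    return x, p

MASS = 0.5        # kg (assumed)
STIFFNESS = 2.0   # N/m (assumed)
X0 = 0.1          # m, initial stretch, released from rest (assumed)
DT = 0.001        # s
OMEGA = math.sqrt(STIFFNESS / MASS)
STEPS = round((2 * math.pi / OMEGA) / DT)   # about one full period

x_final, p_final = step_spring(X0, 0.0, MASS, STIFFNESS, DT, STEPS)
t_final = STEPS * DT
x_exact = X0 * math.cos(OMEGA * t_final)
p_exact = -MASS * OMEGA * X0 * math.sin(OMEGA * t_final)
print(f"stepwise: x = {x_final:.5f} m, p = {p_final:.5f} kg m/s")
print(f"analytic: x = {x_exact:.5f} m, p = {p_exact:.5f} kg m/s")
```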

Bruce Sherwood


Quantum entanglement

A non-physicist friend expressed deep puzzlement about measurements at opposite ends of the Universe being somehow linked. Here I describe what I take to be the current state of theoretical and experimental knowledge about the “spooky action at a distance” that bothered Einstein and many other people.

The aspect of quantum mechanics that is pretty widely known and accepted is that small objects (atoms, electrons, nuclei, molecules) have “quantized” properties and that when you go to measure one of these quantized properties you can get various results with various probabilities. For example, an electron can have spin “up” or “down” (counterclockwise or clockwise rotation as seen from above; the Earth as seen from above the North Pole rotates counterclockwise, and we say its spin is up if we take North as “up”). The Earth, being a large classical object, can have any amount of spin (the rate of rotation, currently one rotation per 24 hours). The electron on the other hand always has the same amount of spin, which can be either up or down.

Pass an electron into an apparatus that can measure its spin, and you always find the same amount of spin, and for electrons not specially prepared you find the spin to be “up” 50% of the time and “down” 50% of the time. (It is possible to build a source of “polarized” electrons which, when passed into the apparatus, always measure “up”, but the typical situation is that you have unpolarized electrons, with 50/50 up/down measures.) It is a fundamental discovery that with a beam of unpolarized electrons it is literally impossible – not just hard, but impossible – to predict whether the spin of any particular electron when measured will be up or down. All you can say is that there is a 50% probability of its spin being up. It’s also possible to prepare a beam of partially polarized electrons, where for example you know that there is a 70% probability of measuring an electron’s spin to be up, but that’s all you know and all you can know.

So much for a review of the probabilistic nature of measuring a quantized property such as spin for a single tiny object. Next for the aspect of quantum mechanics that is less widely appreciated, which has to do with measures on one of a group of tiny objects. A simple case is two electrons that are in a “two-particle state”, where one can speak of a quantized property of the combined two-particle system. For example, in principle it would be possible to prepare a two-electron state with total spin (“angular momentum”) zero, meaning that electron #1 could be up, and electron #2 would be down, or vice versa. As a matter of fact, it was only in the last few decades that experimental physicists learned how to prepare such multiparticle states and make measurements on them, and it is these experiments, together with superb theoretical analyses, that have clarified the issues that worried Einstein. (Actually, most experiments have involved photons rather than electrons, but I’ve chosen the two-electron system as being more concrete in being able to make analogies to the spinning Earth.)

Suppose Carl prepares a zero-spin electron pair and gives one electron to Alice and the other to Bob. (Alice and Bob are in fact names used in the scientific literature to help the reader keep the two observers straight.) Alice and Bob keep their electrons in special carrying cases carefully designed not to alter the state of their electron. They get in two warp-speed spaceships and travel to opposite ends of our galaxy or, if one prefers, to opposite ends of the Universe (if the Universe has ends). Many years later, Alice measures the state of her electron and finds that its spin is up (there’s an arrow pointing up on her carrying case indicating what will be called “up”, and a similar arrow pointing up on Bob’s carrying case). If Bob measures his electron, he will definitely find its spin to be down.

One might reasonably interpret these observations something like this: Carl happened to give Alice an “up” electron and (necessarily) gave Bob a “down” electron. There was a 50/50 chance of giving Alice an up electron, and this time Carl happened to give her an up electron. Then of course no matter how long Alice waits before measuring her electron, she’s going to find that it is “up”, and no matter how long he waits Bob is going to find that his electron is “down”. Yes, there are probabilities involved, because neither Carl nor Alice knows the spin of the electron until Alice makes her measurement, but the electron obviously “had” an up spin all the time.

The amazing truth about the Universe is that this reasonable, common-sense view has been shown to be false! The world doesn’t actually work this way!

Thanks to major theoretical and experimental work over the last few decades, we know for certain that until Alice makes her measurement, her electron remains in a special quantum-mechanical state which is referred to as a “superposition of states” – that her electron is simultaneously in a state of spin up AND a state of spin down. This idea is very hard to accept. Einstein never did accept it. In a famous paper in the 1930s, he and a couple of colleagues proposed experiments of this kind and, because quantum mechanics predicts that the state of Alice’s electron will remain in a suspended animation of superposed states, concluded that quantum mechanics must be wrong or at least incomplete. It took several decades of hard work before experimental physicists were able to carry out ingenious experiments of this kind and were able to prove conclusively that, despite the implausibility of the predictions of quantum mechanics, quantum mechanics correctly describes the way the world works.
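For readers who would like to see where the common-sense view and quantum mechanics actually part company, here is a rough sketch of one standard test (the CHSH form of Bell's inequality). It goes a step beyond the story as told here: Alice and Bob are each allowed to tilt their “up” arrow to one of two angles before measuring, and one looks at the average of the product of their results (+1 for up, -1 for down). The “preset” model below is just one toy version of the common-sense view that I have assumed for illustration: each pair leaves Carl with a definite hidden spin direction, and each measurement reports whether that direction is closer to up or to down. Any model of this preset kind is limited to |S| no larger than 2.

```python
import math

def quantum_correlation(a, b):
    """Quantum prediction for the zero-spin pair: average of (Alice result * Bob result)."""
    return -math.cos(a - b)

def preset_correlation(a, b, samples=3600):
    """Same average in a toy model where the spins were fixed at the source (assumed)."""
    total = 0
    for n in range(samples):
        hidden = 2 * math.pi * (n + 0.5) / samples   # hidden spin direction of the pair
        alice = 1 if math.cos(a - hidden) > 0 else -1
        bob = -1 if math.cos(b - hidden) > 0 else 1  # Bob's electron is always opposite
        total += alice * bob
    return total / samples

def chsh(correlation, a1, a2, b1, b2):
    """CHSH combination S of four correlations for two settings on each side."""
    return (correlation(a1, b1) - correlation(a1, b2)
            + correlation(a2, b1) + correlation(a2, b2))

# Alice's two arrow angles, then Bob's two arrow angles (radians).
ANGLES = (0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
s_preset = chsh(preset_correlation, *ANGLES)
s_quantum = chsh(quantum_correlation, *ANGLES)
print(f"|S| preset = {abs(s_preset):.3f}, quantum = {abs(s_quantum):.3f}")
```

When both arrows point the same way the two descriptions agree that Bob always finds the opposite of Alice. The difference shows up only at the tilted settings, where the quantum value of |S| is 2 times the square root of 2, about 2.83, which no preset model can reach.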

I find it both ironic and funny that Einstein’s qualms led him to propose experiments for which he quite reasonably expected quantum mechanics to be shown to be wrong or incomplete, only for it to turn out that these experiments show that the “unreasonable” description of nature provided by quantum mechanics is in fact correct. These aspects of quantum entanglement aren’t mere scientific curiosities. They lie at the heart of work being done to implement quantum computing and quantum encryption.

What about relativity, and that nothing can travel faster than light? Not a problem, actually. The key point is that Alice cannot send any useful information to Bob. She cannot control whether her measurement of her electron will be up or down. Once she makes her “up” measurement, she knows that Bob will get a “down” measurement, but so what? And all Bob knows when he makes his down measurement is that Alice will make an up measurement. To send a message, Alice would have to choose to make her electron be up or down, as a signal to Bob, but the act of forcing her electron into an up or down state destroys the two-electron “entangled” state.

I recommend a delightful popular science book on this, from which I learned a lot, “The Dance of the Photons” by Anton Zeilinger. Zeilinger heads a powerful experimental quantum mechanics group in Vienna that has made stunning advances in our understanding of the nature of reality in the context of quantum mechanics. In this book he makes the ideas come alive. The book includes detailed discussions of Bell’s inequalities and much else (Bell was a theoretical physicist whose analyses stimulated experimentalists to design and carry out the key experiments in recent decades).

It seems highly likely that Zeilinger will get the Nobel Prize for the work he and his group have done. A charming feature of the book is that Zeilinger is very generous in giving credit to many others working in this fascinating field. Incidentally, there is some movement in the physics community to bring contemporary quantum mechanics into the physics major’s curriculum, which in the past has been dominated by stuff from the 1920s.

Bruce Sherwood


The Feynman Lectures as textbook

As a young professor at Caltech I was assigned to teach intro physics (1966-1969), and I had the great good fortune to teach intro physics using the “Feynman Lectures on Physics” as the textbook. This experience had a huge impact on me. One of the effects was that I left experimental particle physics to work on university-level physics education, first in the PLATO computer-based education project at UIUC, and later at Carnegie Mellon and NCSU. Ruth Chabay used the Feynman book in an undergraduate course at the University of Chicago, and it was a major influence on us in the writing of the “Matter & Interactions” textbook.

Several times at public gatherings of physicists I have heard the claim that the Feynman course at Caltech was a failure, and I have always seized the opportunity to rebut these claims from my own experience. One of the things I’ve pointed out is that at the time he gave the original lectures there was no textbook, nor were there problems keyed to the lectures, whereas by the time I lectured in the course there was a lot of infrastructure, including the book. I also point out that in a traditional intro course students don’t understand everything, and ask the audience whether it is better to understand part of a traditional textbook or part of Feynman. In my judgment the course was a success in the late 60’s at Caltech, not a failure. When I moved to UIUC in 1969, I judged that it would have worked in an honors course there.

Matthew Sands with Robert Leighton translated Feynman’s unique spoken word into print. In his memoir “Capturing the Wisdom of Feynman”, Physics Today, April 2005, page 49, Sands provided confirmation for my own viewpoint. Feynman’s own assessment in the preface that it was a failure has helped perpetuate the notion that it didn’t work, and I was glad to learn from Sands’ memoir that this was off-the-cuff, not a carefully considered judgment. Moreover, Feynman’s view was not shared by others who were involved in teaching the course. As he says in the preface, “My own point of view — which, however does not seem to be shared by most of the people who worked with the students — is pessimistic”.

Kip Thorne has written some commentary on the history of the Lectures:


Lawrence Krauss’s excellent scientific biography “Quantum Man: Richard Feynman’s Life in Science” also discusses the Feynman course, and what he says is consistent with the views of Sands and me.

Bruce Sherwood


The Higgs boson and prediction in science

An aspect of the discovery of the Higgs boson to celebrate is the possibility of prediction in science: in this case, the prediction that a certain particle should exist so that the world can behave as it does, and even a prediction of its approximate mass, which made it possible to design an accelerator (the Large Hadron Collider) that could accelerate protons to a high enough energy to be able, in collisions with other protons, to produce the predicted particle if it has the predicted mass. The accelerator was built, the experimentalists looked, and they found something at the right mass. They will study the reactions that particle has, and the ways it decays (falls apart into other particles), to try to pin down whether it in fact has the right properties besides the right mass to be the Higgs boson.

There are some other examples of predicting and finding previously unsuspected particles.

In the 1860s, building on preliminary, partial work by others, Mendeleev was able to bring order to all the known elements in his famous periodic table. Moreover, he correctly interpreted holes in his table as representing elements that had yet to be discovered. For example, he not only predicted the existence of germanium but also predicted its approximate atomic weight and chemical properties, and he was right. In all, he correctly predicted 8 elements that were unknown at the time. The Wikipedia article shows his debt to the ancient Sanskrit grammarian Panini, who had recognized similar kinds of order among the sounds of human speech.

At the time, no one, including Mendeleev, had any way to explain the ordering of the elements made manifest in the periodic table. It was 40 years later that Rutherford and his coworkers discovered that atoms consist of a tiny, extremely dense positively charged core (the “nucleus”) surrounded by negatively charged electrons (which had recently been discovered by Thomson). A few years later experiments showed that the order of elements in the periodic table simply reflects the number of electrons in the atom (1 for hydrogen, 2 for helium, etc.).

In 1928 Dirac created the famous “Dirac equation”, constituting a version of quantum mechanics that is consistent with special relativity (the earlier Schrödinger equation is not consistent with relativity, though it remains useful in the nonrelativistic limit). An odd feature of the Dirac equation was its prediction of electron-like particles with negative energy, which led Dirac with some reluctance to predict the existence of an “anti-electron”, an electron-like object with positive charge. This particle, the positron, was soon found by Carl Anderson at Caltech, with the predicted properties.

In the 1920s there was a puzzle in “beta decay”, in which a nucleus emits an electron (and metamorphoses into a nucleus with one additional positive charge; see my post on neutron decay). The puzzle was that the energies of the parent and daughter nuclei were known (from their masses) to be fixed quantities, but the electron was observed to have a broad range of energies, not simply the difference of the two nuclear energies. This was an apparent violation of the well-established principle of energy conservation. There were suggestions that perhaps energy is not conserved in nuclear interactions, but Pauli could not accept that. In 1930 he proposed that the electron is not the only particle emitted in beta decay, that there is also another particle emitted but not observed. This implied that the unseen particle must have no electric charge, as otherwise it would be easily detected, and in fact it must also not interact with nuclei through the “strong interaction”, because again this would make the particle easily detectable. Also, the maximum observed energy of the electron was experimentally found to be about equal to the energy difference of the parent and daughter nuclei, which implied that the unseen particle must have very little mass. Pauli had predicted what is now called the neutrino, with specific properties: no electric charge, very small mass, no strong interactions.
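
The energy bookkeeping behind Pauli’s argument can be written out as a short sketch. The symbols are my own shorthand, not anything from the historical papers: M denotes a nuclear mass, E a total energy (including rest energy), and I assume the parent nucleus is at rest and neglect the small recoil energy of the daughter nucleus.

```latex
% Energy made available by the decay (recoil of the daughter neglected)
Q = \left( M_{\text{parent}} - M_{\text{daughter}} \right) c^2

% If only an electron were emitted, its energy would have a single value:
E_e = Q

% With a second, unseen particle \nu sharing the energy:
E_e + E_\nu = Q , \qquad E_\nu \ge m_\nu c^2
\quad\Longrightarrow\quad
m_e c^2 \le E_e \le Q - m_\nu c^2
```

In this sketch the electron energy can fall anywhere in a continuous range, and the observation that its maximum is about equal to Q is what forces the mass of the unseen particle to be very small.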

The neutrino was observed directly only much later, in 1956, when Reines and Cowan placed detectors behind thick shielding next to a nuclear reactor at the Savannah River Plant in South Carolina. Neutrino reactions are very rare, but the flux of neutrinos was so large that occasionally the experimenters observed “weak” interactions of the neutrinos with matter. The properties of the neutrino matched Pauli’s predictions.

In the early 1960s Gell-Mann and Ne’eman independently were able to classify the large zoo of “elementary” particles into groups of octets and decuplets. There was a decuplet (of 10 particles) arranged in a triangle, like bowling pins, in which the particle at the point was unknown. Gell-Mann was able not only to predict the existence of this particle, which he called the Omega-minus, but also to predict its charge and mass. A hunt for the Omega-minus was successful, and it had the predicted properties.

As was the case with Mendeleev’s periodic table, at first there was no explanation for the “why” of octet and decuplet groupings of the known particles. Soon however Gell-Mann and Zweig independently proposed that each “baryon” (heavy) particle was made of 3 “quarks” with unusual fractional electric charges, and each “meson” was made of a quark and antiquark. The quark model was at first somewhat controversial, but intense experimental work and closely related theoretical work by Feynman made it clear that it does indeed explain the “periodic table of the particles”.

The creation of antiprotons occurred in a context very similar to the creation of the Higgs boson. The Berkeley Bevatron was a particle accelerator built in 1954, designed to accelerate protons to an energy sufficient to produce antiprotons if, as everyone predicted, the antiproton would have the same mass as a proton (but negative charge). This design criterion was similar to the design consideration for the Large Hadron Collider, that of accelerating protons to an energy large enough to create Higgs bosons. Because by 1954 many particles were known to have antiparticle partners, it was not a surprise when antiprotons were indeed produced by the Bevatron.
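
To give a feel for what “an energy sufficient to produce antiprotons” means, here is a minimal sketch of the standard threshold calculation. It rests on assumptions that are mine, not a description of the actual Bevatron design study: the target is a single free proton at rest, the reaction is p + p → p + p + p + antiproton (the simplest one that conserves baryon number), and the antiproton has exactly the proton’s mass. The function name and constant are made up for illustration.

```python
# Proton rest energy in GeV (rounded value; assumed equal for the antiproton).
PROTON_MASS_GEV = 0.938272


def threshold_kinetic_energy(beam_mass, target_mass, final_masses):
    """Minimum beam kinetic energy for a beam particle striking a target
    at rest to produce the listed final-state particles.

    All masses are rest energies in the same units (here GeV).
    """
    total_final = sum(final_masses)
    total_initial = beam_mass + target_mass
    # At threshold the final particles are all at rest in the
    # center-of-mass frame, so the invariant mass of the collision
    # equals the sum of the final rest masses:
    #   (m_beam + m_target)^2 + 2 * m_target * T = (sum of final masses)^2
    return (total_final**2 - total_initial**2) / (2.0 * target_mass)


# p + p -> p + p + p + antiproton, with the target proton at rest.
t_min = threshold_kinetic_energy(
    PROTON_MASS_GEV, PROTON_MASS_GEV, [PROTON_MASS_GEV] * 4
)
print(f"Threshold kinetic energy: {t_min:.2f} GeV")
```

Under these assumptions the threshold comes out to six proton rest energies, a bit under 6 GeV. In a real target the protons are bound in nuclei and are moving, which this sketch ignores.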

I’ve listed some major predictions that were successful. However, it seems to me that “postdiction” is more common. For example, no one predicted that the rings of Saturn can be braided. When spacecraft first returned closeups of the rings, scientists were startled to see braided rings. A lot of work went into understanding these unusual structures (the key turned out to be the role of small “shepherd” moons).

Bruce Sherwood
