EMF in a solenoid

A colleague identified a puzzle concerning Faraday’s law. A circular loop at the center of a long solenoid with a time-varying magnetic field carries a current driven by the emf, which is the path integral of the (curly) non-Coulomb electric field. Place a second loop beside the centered one, and the puzzle is this: where the two loops nearly touch, the non-Coulomb electric field must point in the same direction along both loops, yet the current circulates in the same rotational sense (clockwise or counterclockwise) in both loops, so in the nearly-touching segments the currents run in opposite directions. Here is a solution to the puzzle.

In the figure, the dashed circle is centered in a solenoid in which there is a time-varying spatially-uniform magnetic field that is increasing with time. By symmetry, the non-Coulomb (NC) electric field is tangent to the dashed circle as indicated by the green arrows. By “non-Coulomb” is meant that this is a curly electric field associated with a time-varying magnetic field, not a non-curly “Coulomb” electric field associated with stationary charges.

At larger radii, E_{NC} increases linearly with increasing radius as indicated by the green arrows to the right of the dashed circle, as can be seen from this calculation, where k is a constant:

2\pi rE = \pi r^2\, dB/dt

E = \dfrac{1}{2}\dfrac{dB}{dt}\,r = kr, \qquad k = \dfrac{1}{2}\dfrac{dB}{dt}

The emf around the dashed circle, the path integral of the electric field, is

\text{emf}_1 = (2\pi r_1)(kr_1)

If we were to put a metal ring with resistance per unit length b where the dashed circle is, there would be a current in the ring of amount

I_1 = \text{emf}_1/(2\pi r_1 b) = E_1L_1/(2\pi r_1 b) = (k r_1)(2\pi r_1)/(2\pi r_1 b) = kr_1/b
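As a quick numeric sketch of this ring calculation, the following Python fragment uses arbitrary sample values for k, r_1, and b (none of these numbers come from the text):

```python
from math import pi

# Assumed sample values, chosen only for illustration:
k = 0.5   # (1/2) dB/dt, in V/m per meter of radius
r1 = 0.1  # ring radius in meters
b = 2.0   # resistance per unit length, ohm/m

E1 = k * r1                # non-Coulomb field at radius r1
emf1 = E1 * (2 * pi * r1)  # path integral around the circle
R1 = 2 * pi * r1 * b       # total resistance of the ring
I1 = emf1 / R1             # current in the ring

# The 2*pi*r1 factors cancel, leaving I1 = k*r1/b as in the text.
print(I1)
```

The cancellation of the circumference makes the current depend only on k, r_1, and b.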

Next consider the path marked by a blue line, which has been constructed to enclose the same amount of area as is enclosed by the dashed circle:

\pi r_1^2 = \dfrac{1}{4}(\pi r_2^2 - \pi r_1^2)

\dfrac{5}{4}r_1^2 = \dfrac{1}{4}r_2^2

r_2 = \sqrt{5}r_1

Because the two paths enclose the same area, with dB/dt the same everywhere, the emf must be the same around both paths. Moving counterclockwise from the lower left corner of the second path, we can calculate the path integral of the non-Coulomb electric field, and we do indeed find the same emf as that for the circular path:

\text{emf}_2 = 0 + (kr_2)(2\pi r_2)/4 + 0 + (-kr_1)(2\pi r_1)/4

\text{emf}_2 = 2\pi(kr_2^2 - kr_1^2)/4

\text{emf}_2 = 2\pi(5kr_1^2 - kr_1^2)/4 = 2\pi kr_1^2

\text{emf}_2 = (2\pi r_1)(kr_1) = \text{emf}_1
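The equality of the two emfs is easy to check numerically. In this sketch, k and r_1 are arbitrary sample values; r_2 = √5 r_1 comes from the equal-area condition derived above:

```python
from math import pi, sqrt

k = 0.5   # (1/2) dB/dt, assumed sample value
r1 = 0.1  # meters, assumed sample value
r2 = sqrt(5) * r1  # equal-area condition

# Radial legs: E_NC is tangential, hence perpendicular to the path -> 0.
# Outer quarter arc (traversed with E_NC): +E(r2) * (quarter circumference).
# Inner quarter arc (traversed against E_NC): -E(r1) * (quarter circumference).
emf2 = (k * r2) * (2 * pi * r2) / 4 - (k * r1) * (2 * pi * r1) / 4
emf1 = (k * r1) * (2 * pi * r1)

print(emf1, emf2)  # the two path integrals agree
```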

If we place a metal wire along the outer path, with the same resistance b per unit length, the current I_2 will be less than I_1 because the path L_2 is longer than the path L_1 (2\pi r_1):

L_2 = 2(r_2-r_1) + \dfrac{2\pi r_2}{4} + \dfrac{2\pi r_1}{4} \approx 7.56\,r_1 = 1.20L_1

With the same emf along the longer length of wire, I_2 = I_1/1.20.
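The length ratio is independent of the radius, so any sample r_1 will do in a quick check:

```python
from math import pi, sqrt

r1 = 1.0            # arbitrary; the ratio does not depend on r1
r2 = sqrt(5) * r1   # equal-area condition

L1 = 2 * pi * r1
L2 = 2 * (r2 - r1) + (2 * pi * r2) / 4 + (2 * pi * r1) / 4

ratio = L2 / L1     # about 1.20
print(ratio)        # with equal emf and equal b, I2 = I1 / ratio
```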

An important difference between the two configurations of wire is the distribution of polarization charges on the surface of the wire. In the circular case, electrons are accelerated inward along their circular path, so the wire must be polarized transverse to the wire, with the outer part of the wire charged negatively and the inner part charged positively, to provide an outward-pointing Coulomb electric field that accelerates the electrons inward. This transverse polarization is extremely small because in a good metal conductor the drift speed (and therefore the centripetal acceleration v^2/r) is extremely small.

The situation is much more complex when a wire follows the outer path. Conservation of charge in the steady state, with a wire of constant cross section and uniform resistivity, means that the current I_2 must be the same all along the path. This in turn means that the net electric field, non-Coulomb field plus Coulomb field due to surface charges, must have the same magnitude all along the path, as is indicated by the orange arrows in the diagram. These arrows are drawn slightly shorter than the arrows drawn on the circular path, because the emf is the same but the wire length is 1.20 times as long.

The non-Coulomb electric field E_\text{NC} polarizes the metal, making the upper left section of the wire positively charged and the lower right section negatively charged. These polarization charges are the source of a Coulomb field E_\text{C} which, when added to the non-Coulomb field, makes a net electric field that has constant magnitude around the path, corresponding to the constant current around the path. The Coulomb field can be visualized qualitatively by thinking about what Coulomb field is required to add to the non-Coulomb field to give the indicated net field E_\text{net}.

Note that the path integral of the Coulomb field E_\text{C} will be zero, because the electric field made by charges has zero curl. That part of the net field that is the non-Coulomb field is solely responsible for the non-zero path integral that is the emf.

For clarity in the diagram the arrows representing electric field are drawn just outside the paths, but in the presence of a wire they represent the electric field inside the wire; surface charges will also contribute a non-zero Coulomb field outside the wire, with components perpendicular to the wire.

Bruce Sherwood


A time line for VPython development

Here is a time line for the development of VPython, an extension to the Python programming language that makes it unusually easy to generate navigable real-time 3D animations (vpython.org).

1997: While at Carnegie Mellon, after writing a volume on introductory electricity and magnetism, Ruth Chabay and I teach introductory “modern mechanics” for the first time, including having the students program computational models, using the cT language I had created, something a bit like a good Basic, with (2D) graphics built-in, but running in a windowing environment on Unix workstations, Macintosh, and Windows (cT overview). cT was based on the TUTOR language of the PLATO computer-based education system.

1998: We have a remarkable student in our mechanics class, David Scherer. While in high school he led a team of his friends to create a 3D game that later won a national prize. He’s intrigued that cT has allowed students to write computational models that work on all platforms, but he glimpses a more powerful approach that would support 3D.

2000: We abandon cT, and in the spring Scherer creates VPython, with Ruth and me deeply involved in design and testing. Many powerful programmers have no interest in or patience for novice programmers, but Scherer saw making programmatic 3D animations accessible to novices as an interesting challenge. His answer is to make real-time navigable 3D animations a side effect of computations, lifting a huge task from the shoulders of the novice; this is, of course, a huge benefit to sophisticated programmers as well. The original version of VPython is now called “Classic” VPython. It requires installing Python, the “visual” module, and an improved program editor based on the IDLE editor that comes with Python. In the fall of 2000 we start having students use VPython to do computational modeling in our course.

2002-2006: Jonathan Brandmeyer, an engineering student at NCSU, makes major contributions to VPython 3. He introduces the use of the C++ Boost libraries to glue the core of VPython, implemented in threaded C++ code, to the components written in Python, and builds autoconfigurable installers for Linux. In the 16-year history of VPython, only three people made major contributions to the complex C++ code: Scherer, Brandmeyer, and me.

2008: Scherer, having sold his first software company and thinking about what to do next, and I work on VPython 5. Jonathan Brandmeyer provided support in VPython 4beta for opacity, local lighting, and textures, and made some important architectural changes, but had to stop work on the project before it was completed. Further development led to API changes that were incompatible with the VPython 4beta release, so there was no version 4.

2011: Kadir Haldenbilen, a retired IBM engineer in Turkey, and I collaborate to create the 3D text object and the extrusion object for VPython 5.

2011: Ruth and I learn about WebGL from very knowledgeable computer colleagues in Santa Fe. WebGL is a 3D library built into modern browsers. I poke at it and see that it is quite difficult to use, like its big sister OpenGL used by Classic VPython, but I realize that the VPython API provides a model for making WebGL accessible to novices. I mock up a demo and show it to Scherer, who at the time was CEO of his second major software company. He’s intrigued and in a couple of months puts together the glowscript.org site (it’s a Google App Engine application), solves the problems of operator overloading (to permit adding vectors A and B as A+B) and synchronous code (such as perpetual loops), neither of which is native to JavaScript. He does all this as a project where he can see progress, as relief from his work at FoundationDB where the extreme difficulty of solving once and for all the problems of distributed databases has been getting him frustrated. After setting up GlowScript he leaves, and since then I’ve been developing GlowScript. GlowScript programs are written in JavaScript, as are the GlowScript libraries.

2013: Release of VPython 6, based on wxPython, which was initiated by me in June 2012 to address the serious problem that the Carbon programming framework for the Mac would no longer be supported. Major contributions to the release were made by Steve Spicklemire, a physics professor at the University of Indianapolis.

Late 2014: Thanks to learning from Salvatore di Dio, a programmer in France, about the RapydScript Python-to-JavaScript transpiler, I’m able to make it possible for GlowScript users to write their programs using the VPython API, somewhat modified due to the very different environment (browser, GPU). It is also around this time that John Coady, a programmer in Vancouver, implements the Classic VPython API in pure Python, in the IPython environment, in which your Python program runs on a local server and sends data to your browser, where his JavaScript program acts on the data to display the 3D animation in a Jupyter notebook cell (the Jupyter notebook in the browser is similar to a Mathematica or Sage notebook). He uses the GlowScript libraries to render the 3D images. The advantage of this Jupyter implementation in comparison with GlowScript VPython is that you’re writing real Python, not the necessarily imperfect RapydScript representation of Python, and you have access to the large universe of Python modules, which are not accessible from within a JavaScript-based browser.

Fall 2015: Some institutions using our textbook, including Georgia Tech, report switching from Classic VPython to GlowScript VPython, and note with surprise how much more enthusiastic students are about using VPython now that they don’t have to install anything. In contrast, Classic VPython requires the installation of Python, the installation of the visual module, and, on the Mac, installation of an update to Tcl. This can be daunting and can fail for non-obvious reasons. The use of GlowScript VPython rises rapidly; here is a graph of usage vs. time.

January 2016: Coady, Ruth and I, and several well-known physics education colleagues (all of them users of our textbook and of VPython) publish a document on the further evolution of VPython, in which we announce abandonment of the 16-year-old Classic VPython in favor of the GlowScript and Jupyter versions. Here is that document, detailing our reasons.

January-September 2016: In collaboration with Coady, Ruth and I modify and complete Jupyter VPython to use the GlowScript VPython API instead of the Classic API that Coady had started with, because it is much better suited to the distributed nature of the Jupyter environment. Steve Spicklemire and Matthew Craig, a physics professor at Minnesota State University Moorhead, contribute mechanisms for creating pip and conda installers. Here are demo programs running in Jupyter notebooks.

July 2016: 3D text object implemented in GlowScript, with major contributions from Kadir Haldenbilen. I complete the GlowScript implementation of the extrusion object.

February 2017: I make the 3D text object and extrusion object available in Jupyter VPython.

June 2017: In response to requests from users, I release a version of the Python module vpython that can run outside the Jupyter notebook environment. The module detects whether the program is running in the notebook, and if not, it sets up http and websocket server mechanisms that display the 3D animations in a browser page. This makes it possible to work in IDLE or Spyder or another environment that can launch Python programs. I thank John Coady for providing helpful advice, Matt Craig for new installers, and Ruth Chabay for useful discussions.

Bruce Sherwood


Pseudowork and real work

I have a story to tell about pseudowork, the integral of a force along the displacement of the center of mass, which is different from the true work done by a force on a system, which must be calculated as the integral of the force along the displacement of the point of application of that force. If the system deforms or rotates, the work done by a force may be different from the pseudowork done by that force. For example, stretch a spring by pulling to the left on the left end and to the right on the right end. The center of mass of the spring does not move, so the pseudowork done by each force is zero, whereas the real work done by each force is positive. Because the total pseudowork is zero (which can also be thought of as the integral of the net force through the displacement of the center of mass), the translational kinetic energy of the spring does not change (more generally, the work-energy theorem for a point particle shows that the change in translational kinetic energy is equal to the total pseudowork). Because the total work done on the spring is positive, the internal energy of the spring increases.
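The spring example above can be made concrete with a numeric sketch; the force magnitude and end displacements below are assumed sample values, not taken from the text:

```python
# Stretch a spring by pulling left on the left end and right on the right end.
F = 10.0   # N, magnitude of the force on each end (assumed sample value)
d = 0.05   # m, each end moves outward by this much (assumed sample value)

# Real work: each force acts through the displacement of its own
# point of application, and each does positive work.
real_work = F * d + F * d   # goes into internal energy of the spring

# Pseudowork: each force integrated through the displacement of the
# center of mass, which does not move at all.
d_cm = 0.0
pseudowork = F * d_cm + (-F) * d_cm  # zero: translational KE unchanged

print(real_work, pseudowork)
```

The total pseudowork is zero, so the translational kinetic energy does not change, while the positive total work raises the spring's internal energy.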

In 1971 in the context of the big PLATO computer-based education project at UIUC I had several physics grad students working with me to develop a PLATO-based mechanics course. They and I each picked an important/difficult mechanics topic and started writing tutorials on the topics. Lynell Cannell was assigned energy and I became concerned that she was the only member of the group not making progress. I was about to have a talk with her about this when she came to me to say that she was hung up on a simple case.

She said, “Suppose you push a block across the floor at constant speed. The net force (your push and the opposing friction force) is zero, so choosing the block as the system no work is done, yet the block’s temperature rises, so the internal energy is increasing. I’m very confused.” I said, “Oh, I can explain this. You just, uh, well, you see, uh…..I have no idea.”

We went and talked to Jim Smith, an older physicist very interested in education, very smart, and a good mentor for my then-young self. Jim had thought it through and explained the facts of life to us, with a micro/meso model of the deformations that occur at the contact points on the underside of the block, such that the work done on the block is different from the pseudowork done on the block.

I got very interested in the matter and fleshed out Jim’s insight in more and more detail, but when I showed my analyses to physics colleagues they weren’t having any. Finally I decided to send my paper to AJP (the American Journal of Physics), and the reviewers rejected it. One reviewer said, “Sherwood applies Newton’s 2nd law to a car, which is illegitimate, because a car isn’t a point particle.” I sent it to The Physics Teacher, and the editor replied that he wouldn’t even send it out to reviewers because the physics was so obviously completely wrong.

I asked AJP for an editorial review, and the reluctant response by an associate editor was, “Well, I guess Sherwood is right….but that’s not how we teach this subject!” Finally, in 1983, AJP did reluctantly print the paper “Pseudowork and real work” which you’ll find on my website. This was the first half of the original paper. The second half, applying the theory to the case of friction, “Work and heat transfer in the presence of sliding friction” (also available on my web site), was published jointly in 1984 with William Bernard, because AJP had received a related paper from Bernard and put the two of us in contact with each other.

At that time there had been some short articles in AJP on the topic, but there hadn’t been a longer article on all the aspects. In fact, given physicist resistance to the truth, Bernard was engaged in a war of attrition, sending short articles to AJP on various aspects of the problem, trying to build up to the full story. Nor had there been any article on friction.

The grand old man of Physics Education Research (PER), Arnold Arons, was a fan of my first paper and summarized it in his books on how to teach intro physics (A Guide to Introductory Physics Teaching, 1990, and Teaching Introductory Physics, 1997). Even he, however, was quite skittish about the friction analysis, in large part because he was strenuously opposed to mentioning atoms in the intro physics course, for philosophical reasons. Arons tried to explain the pseudowork issue to his friend Cliff Schwartz, the editor of The Physics Teacher, but he never succeeded; Schwartz remained forever convinced that this was all massively wrong.

After the papers were published, in 1983 I wrote to Halliday and Resnick about the matter, emphasizing that their textbook was certainly not alone in mishandling the energetics of deformable systems. I got a nice letter back from Halliday which said about their book, “Let me say at once that we are well aware of its serious flaws, along precisely the lines that you describe. We have tried several times to patch things up in successive printings but the matter runs too deep for anything but a total rewrite. We have, in fact, such a rewrite at hand, awaiting a possible next edition.” I have the impression that this major rewrite never occurred, as I don’t know of an edition that fully addresses the issues. It is amusing that Ruth Chabay and I were given the 2014 AAPT Halliday and Resnick Award for Excellence in Undergraduate Teaching (here is a video of our talk on the occasion, dealing with thinking iteratively).

Most textbooks make major errors in the energetics of deformable systems, or simply ignore the issues. A few textbooks have a brief section on related matters, but as Halliday discerned, handling the physics correctly requires significant revisions throughout introductory mechanics. Since the early 1980s there have been many good articles about these matters in AJP, with little impact on the teaching of introductory physics. In 2008 John Jewett published a solid five-part tutorial on the subject in The Physics Teacher.

In my original articles the analysis is couched in terms of the two different integrals, for work and for pseudowork. We found that even strong Carnegie Mellon students had difficulty distinguishing between these two very similar-looking integrals. So eventually we changed our textbook to emphasize two different systems (point-particle and extended) instead of two different integrals. The distinction between the two systems is more vivid than the subtle distinction between the two integrals.

The point-particle model of a system is a particle with the same mass as the extended system, moving along the same path as the extended system’s center of mass. The change in kinetic energy of the point-particle model is given by the integral of the net force along the path of the point particle, and this is equal to the change in the translational kinetic energy of the extended system. The change in the total energy of the extended system is equal to the sum of the integrals of each force along the path of its point of application on the system.

Here is a video of an apparatus that shows the effects. Two pucks are pulled with the same net force, but one is pulled from the center and doesn’t rotate, whereas the other puck has the string wound around the disk, and it rotates. Somewhat surprisingly, the two pucks move together, but in fact the Momentum Principle guarantees that the centers of mass of the two pucks must move in the same way if the same net force is applied. Here is a computer visualization of the situation.
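The energy bookkeeping for the two pucks can be sketched numerically; the tension, center-of-mass displacement, and unwound string length below are assumed sample values:

```python
# Two pucks pulled with the same string tension F: one from the center
# (no rotation), one with the string wound around its rim (it rotates).
F = 2.0          # N, string tension (assumed sample value)
d_cm = 0.5       # m, displacement of each center of mass (assumed)
s_unwound = 0.3  # m, extra string unwound from the rotating puck (assumed)

# Point-particle model: identical for both pucks, since the net force
# and center-of-mass motion are the same.
dKE_trans = F * d_cm

# Extended system: work uses the displacement of the point of application.
work_plain = F * d_cm                # non-rotating puck
work_wound = F * (d_cm + s_unwound)  # hand moves farther for this puck
dKE_rot = work_wound - dKE_trans     # the extra work becomes rotational KE

print(dKE_trans, dKE_rot)
```

Both pucks gain the same translational kinetic energy, but the hand pulling the winding puck does extra work, which shows up as rotational kinetic energy.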

Bruce Sherwood


Feynman and transients

In 1966-1969 I was active in an experimental particle physics group at Caltech and taught intro physics using the Feynman Lectures on Physics as the textbook, which was a fantastic experience. I often saw Feynman at lunch in the Caltech cafeteria but only once had a substantive discussion with him, which had a long-lasting influence on my physics teaching.

I found myself puzzled about blackbody radiation, a topic in the course. One common way of talking about blackbody radiation is to imagine an oven with a small hole from which radiation escapes, and to imagine the oven walls to be made of material with evenly spaced energy levels that emit similarly quantized photons. But no real solid has such an energy level scheme (the Einstein model of a solid does, but it is a highly simplified though useful model, not a real material), so what’s going on? (The proper approach is to quantize the electromagnetic field, not the emitters.)

I decided to make an appointment to talk with Feynman about this. I was acutely aware that my question was likely to sound hopelessly confused and naive, so I overprepared for the meeting and spoke really fast to try to show that there was an issue. As I expected, initially his face darkened as he wondered why this idiot had been allowed on campus, let alone teaching the curriculum he had created. But by continuing to talk really fast I got far enough to get him intrigued, and he could see that there was an interesting question, and we had an interesting and substantive conversation.

We considered together an astronomically large cloud of atomic hydrogen with some initial energy in the form of atomic excitation. This cloud will emit a line spectrum, not blackbody radiation, yet thermodynamics tells us that eventually the cloud (together with the radiation) will reach thermal equilibrium and the energy distribution of the radiation will be the blackbody continuum. How do the cloud and radiation get from the initial state to this equilibrium state? (It’s not quite an equilibrium state because energy is being radiated away.)

We could see that various processes would alter the initial line spectrum, including the Doppler effect (from recoil associated with emission) and collisional broadening. So there’s no mystery in the fact that we don’t expect a clean line spectrum to persist. The details of the transient that gets us from initial state to final state may be quite complex, and hard to calculate in detail, but just recognizing that there must be a transient gives a sense of mechanism that is lacking if the final state is presented with no preamble.

Next Feynman supposed that somewhere in the cloud is a speck of dust. (I can no longer remember whether this was special magic dust with evenly spaced energy levels.) Thermodynamics assures us that eventually the hydrogen atoms must come into thermal equilibrium with that speck of dust. Thermodynamics has great power in this respect, but using it alone tends to remove all sense of mechanism.

I believe that this was my first experience of the sense of mechanism that comes from discussing the transient that leads to establishing an equilibrium state or a steady state. In retrospect I think that as a student I was somewhat puzzled by how certain states came into being, but as far as I can remember there was no talk of the transients. Another person who influenced my thinking on this was Mel Steinberg, the creator of the CASTLE electricity curriculum. In the late 1980s I took an AAPT (American Association of Physics Teachers) workshop from him that included desktop experiments with half-farad capacitors (“supercaps”). One of the things he stressed was that the several-second time constant for charging or discharging could be usefully thought of as an observable transient leading to an equilibrium state. This viewpoint in turn influenced the emphasis in Matter & Interactions on the transient that leads to the steady state in DC circuits.

Another influence on both Ruth Chabay and me was doing numerical integrations with a computer, where you gain the strong sense of things happening step by step, not just described in terms of an analytical solution that is a known function of time, which gives little sense of the time evolution of the process.

All of this has had a big effect on the Matter & Interactions curriculum, plus Ruth Chabay’s insight that computational modeling must be a part of introductory physics. Thinking Iteratively is a 30-minute video of a talk we gave at the summer 2014 AAPT meeting which includes examples of these issues.

Bruce Sherwood


A taste of geometric algebra

David Hestenes’ goal for geometric algebra is to subsume under one umbrella many different kinds of mathematics used by scientists and engineers (see the Hestenes web site, and especially his Oersted Medal Lecture). The key to this unification is to provide powerful geometrical representations for all kinds of mathematical topics, many of which are typically learned without a strong connection to a good geometrical representation. Consider the benefit ordinary 3D vectors provide compared to dealing with components individually. Vectors and vector notation not only simplify many mathematical operations but also provide a richer conceptual space for thinking about geometry. Geometric algebra has the same goals and stands in relation to ordinary vectors as vectors stand in relation to operating with components individually.

My goal for this article is to give a small taste of geometric algebra to give a sense of its structure and to illustrate how it can span diverse branches of mathematics that physicists currently study in isolation from each other.

The fundamental entity in geometric algebra is the “multivector” consisting in 3D of four elements: scalar, vector, bivector (a 2D surface with a directed normal), and trivector (a 3D solid). Geometric algebra can also be used in 2D, or in dimensions higher than 3D, but for purposes of a brief introduction we’ll stick with the 3D context. One writes a multivector as a sum: scalar + vector + bivector + trivector. This may look odd, since one is taught that “you can’t add a scalar and a vector,” but note that one often writes a vector in the form x\hat{\imath} + y\hat{\jmath} + z\hat{k} where three very different things are added together. From a computer programming point of view, one might think of a multivector as a heterogeneous list: [scalar, vector, bivector, trivector], with methods for operating on such lists.

Fundamental to geometric algebra is the “geometric product” ab, where a and b are multivectors. This product is defined in such a way that multiplication is associative, abc = (ab)c = a(bc), but it is not necessarily commutative; ba is not necessarily equal to ab. If a and b are ordinary vectors, the geometric product is a\cdot b + a\wedge b, where a\wedge b is a bivector that is (only in 3D) closely related to the ordinary vector cross product (\wedge is pronounced “wedge”). For vectors a and b the geometric product ba will not be equal to ab if the wedge product is nonzero, since b\wedge a = -a\wedge b.

The dot product a\cdot b (the scalar part of ab) measures how parallel the two vectors are, while a\wedge b (the bivector part of ab) measures how perpendicular they are. Together these two measures provide all the information there is about the relationship between the two vectors, thereby capturing important information that neither the dot product nor the cross product alone provides. Another way of saying this is that the dot product is the symmetric part of ab and the wedge product is the antisymmetric part of ab.

One way to represent the wedge product of two vectors a\wedge b geometrically is to draw the two vectors tail to tail and make these two vectors the sides of a parallelogram. The area of the parallelogram is the magnitude of the bivector. Compare with the magnitude of the vector cross product |a||b|\sin{\theta} and you’ll see that this is equal to the area of the parallelogram associated with the two vectors.
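As a small sketch with arbitrary sample vectors, the dot product and the parallelogram area (computed here via the cross product, the 3D dual of the wedge product) give the “parallel” and “perpendicular” measures:

```python
from math import sqrt

# Arbitrary sample vectors (assumed for illustration only).
a = (1.0, 2.0, 0.0)
b = (3.0, 1.0, 0.0)

dot = sum(ai * bi for ai, bi in zip(a, b))  # "how parallel"

# |a x b| = |a||b|sin(theta) = area of the parallelogram,
# which is the magnitude of the bivector a ^ b.
cross = (a[1] * b[2] - a[2] * b[1],
         a[2] * b[0] - a[0] * b[2],
         a[0] * b[1] - a[1] * b[0])
area = sqrt(sum(c * c for c in cross))      # "how perpendicular"

print(dot, area)
```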

We’ll investigate some basic aspects of geometric algebra by starting with three ordinary vectors \sigma_1, \sigma_2, and \sigma_3 that are unit vectors in the x, y, and z directions. The geometric product \sigma_1\sigma_1 = 1, because the wedge product of a vector with itself has no area, so the bivector part of \sigma_1\sigma_1 is zero; similarly for the other two unit vectors.

The quantity \sigma_1\sigma_2 = 0 + \sigma_1\wedge\sigma_2 is a unit bivector which can be represented as a 1 by 1 square in the xy plane (the dot product is zero because the two vectors are perpendicular to each other). Similarly \sigma_2\sigma_3 is a unit bivector in the yz plane and \sigma_3\sigma_1 is a unit bivector in the zx plane. The wedge product is antisymmetric, so \sigma_2\sigma_1 = -\sigma_1\sigma_2; similarly for the other unit vectors.

Next, consider the geometric product of these bivectors with the unit vectors, using the fact that the geometric product is associative and that \sigma_2\sigma_1 = -\sigma_1\sigma_2:

\sigma_1\sigma_2\sigma_1 = \sigma_1(\sigma_2\sigma_1) = \sigma_1(-\sigma_1\sigma_2) = -(\sigma_1\sigma_1)\sigma_2 = -\sigma_2
\sigma_1\sigma_2\sigma_2 = \sigma_1(\sigma_2\sigma_2) = \sigma_1

We have similar results for other products of the bivectors and the vectors.
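These reduction rules are easy to mechanize. Here is a minimal sketch (a hypothetical helper, not any standard library) that represents a product of the unit vectors as a sign and a tuple of indices, and reduces it using only \sigma_i\sigma_i = 1 and \sigma_i\sigma_j = -\sigma_j\sigma_i for i \ne j:

```python
def reduce_product(indices):
    """Reduce a product of unit vectors sigma_i, given as a sequence of
    indices, to canonical form: returns (sign, sorted tuple of indices)."""
    sign, idx = 1, list(indices)
    changed = True
    while changed:
        changed = False
        for i in range(len(idx) - 1):
            if idx[i] == idx[i + 1]:      # sigma_i sigma_i = 1
                del idx[i:i + 2]
                changed = True
                break
            if idx[i] > idx[i + 1]:       # swapping adjacent distinct
                idx[i], idx[i + 1] = idx[i + 1], idx[i]
                sign = -sign              # unit vectors flips the sign
                changed = True
                break
    return sign, tuple(idx)

print(reduce_product((1, 2, 1)))  # sigma_1 sigma_2 sigma_1 = -sigma_2
print(reduce_product((1, 2, 2)))  # sigma_1 sigma_2 sigma_2 = sigma_1
```

Each step of the reduction preserves the value of the product in the algebra, so the final signed tuple is the canonical form of the original product.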

What is \sigma_1\sigma_2\sigma_3? This is a “trivector,” a cube 1 by 1 by 1. Something surprising results if we multiply this unit trivector by itself:

(\sigma_1\sigma_2\sigma_3)(\sigma_1\sigma_2\sigma_3) = \sigma_1\sigma_2\sigma_3\sigma_1(\sigma_2\sigma_3)
= \sigma_1\sigma_2\sigma_3\sigma_1(-\sigma_3\sigma_2)
= -\sigma_1\sigma_2\sigma_3(\sigma_1\sigma_3)\sigma_2
= -\sigma_1\sigma_2\sigma_3(-\sigma_3\sigma_1)\sigma_2
= \sigma_1\sigma_2(\sigma_3\sigma_3)\sigma_1\sigma_2
= \sigma_1\sigma_2\sigma_1\sigma_2
= \sigma_1\sigma_2(\sigma_1\sigma_2)
= \sigma_1\sigma_2(-\sigma_2\sigma_1)
= -\sigma_1\sigma_2\sigma_2\sigma_1
= -\sigma_1(\sigma_2\sigma_2)\sigma_1
= -\sigma_1\sigma_1
= -1

This result justifies identifying the trivector \sigma_1\sigma_2\sigma_3 with the imaginary number i. Now consider this:

(\sigma_1\sigma_2\sigma_3)\sigma_1 = \sigma_1\sigma_2(\sigma_3\sigma_1)

i\sigma_1 = \sigma_1\sigma_2(-\sigma_1\sigma_3)
= -\sigma_1(\sigma_2\sigma_1)\sigma_3
= -\sigma_1(-\sigma_1\sigma_2)\sigma_3
= (\sigma_1\sigma_1)\sigma_2\sigma_3
= \sigma_2\sigma_3

The bivector \sigma_2\sigma_3 lies in the yz plane. The standard vector cross product of \sigma_2 and \sigma_3 points in the +x direction, which is \sigma_1. The familiar cross product vector is the normal to the associated bivector (in 3D only), and evidently the bivector is i times the cross product vector. Similarly, you can show that i\sigma_2 = \sigma_3\sigma_1 and i\sigma_3 = \sigma_1\sigma_2. It turns out that bivectors are more useful and better behaved than their “duals,” the cross products. For example, in the old vector world one must sometimes make subtle distinctions between “polar” vectors (the ordinary kind) and “axial” vectors which behave differently under reflection (examples are magnetic field vectors). In geometric algebra there is no such distinction.

When I first saw these relationships among the \sigma_1, \sigma_2, and \sigma_3, I was amazed. As a physics student I was introduced to the 2 by 2 “Pauli spin matrices” used to describe electron spin. The matrices, and their various product and commutation relationships, were taught as something special and particular to quantum mechanical spin systems. I was astonished to find that those 2 by 2 matrices behave exactly like the unit vectors in the geometric algebra context, as discussed above. This is an example of Hestenes’ argument that the mathematical education of physicists fails to bring together diverse branches of mathematics that can be unified in the geometric algebra context.
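The correspondence can be checked directly. This sketch multiplies the standard 2 by 2 Pauli matrices (with a hand-rolled complex matrix product) and confirms they satisfy the same relations derived above for the unit vectors:

```python
def mul(A, B):
    """Product of two 2x2 (complex) matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# The standard Pauli spin matrices.
s1 = [[0, 1], [1, 0]]
s2 = [[0, -1j], [1j, 0]]
s3 = [[1, 0], [0, -1]]
I = [[1, 0], [0, 1]]

# s1*s1 = I, just as sigma_1 sigma_1 = 1.
# s2*s1 = -(s1*s2), just as sigma_2 sigma_1 = -sigma_1 sigma_2.
trivector = mul(mul(s1, s2), s3)  # plays the role of the trivector i
square = mul(trivector, trivector)
print(square)  # equals -I, just as (sigma_1 sigma_2 sigma_3)^2 = -1
```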

Another example of a need for unification is that as a physics student one encounters many different schemes for handling rotations. There is a beautiful representation of rotations in geometric algebra. Consider the geometric product abb = a(bb) = a if b is a unit vector. Writing this as (ab)b = a, the geometric product ab can be thought of as a rotor that rotates b into a (with a scaling factor as well if a and b are not unit vectors).
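Using the Pauli-matrix representation of vectors, the rotor identity (ab)b = a can be checked directly. In this sketch (my addition; the two vectors chosen are arbitrary illustrative unit vectors) a 3D vector v is represented as the matrix v₁σ₁ + v₂σ₂ + v₃σ₃, and the geometric product becomes ordinary matrix multiplication:

```python
import numpy as np

# Pauli matrices representing the unit vectors sigma_1, sigma_2, sigma_3
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def to_matrix(v):
    """Represent the 3D vector v as v1*sigma_1 + v2*sigma_2 + v3*sigma_3."""
    return v[0] * s1 + v[1] * s2 + v[2] * s3

# Two arbitrary unit vectors (illustrative choice)
a = np.array([0.6, 0.8, 0.0])
b = np.array([0.0, 1.0, 0.0])
A, B = to_matrix(a), to_matrix(b)

# bb = |b|^2 = 1 for a unit vector, so the rotor R = ab satisfies (ab)b = a
R = A @ B
assert np.allclose(B @ B, np.eye(2))
assert np.allclose(R @ B, A)
print("rotor check passed: (ab)b = a")
```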

For extensive treatments of geometric algebra, see for example the textbooks “Geometric Algebra for Physicists” and “Geometric Algebra for Computer Science.”

Bruce Sherwood

Posted in Uncategorized | 1 Comment

What is Light? What are Radio Waves?

  • A talk given at a Santa Fe Science Cafe, 2013 Jan. 16
  • Abstract of the talk
  • Video of the talk
  • Interview on KSFR radio (15 minutes; choose the 3rd audio option)

The great discovery by Maxwell about 150 years ago of the real nature of light stands as one of the greatest discoveries in all of human history. The goal of this talk was to share with people what light really is, because its nature is not widely understood. I also wanted to demystify “electromagnetic radiation” and “electric fields”, terms that for many people are rather scary due to a lack of understanding of what the terms really mean.

Technical comment for physicists: As a result of preparing and giving this talk, I had a minor insight about the physics of light. A colleague has argued that magnetic fields are merely electric fields seen from a different reference frame. I’ve argued that this isn’t the whole story. I offer several examples that show that magnetic fields are not simply relativistic manifestations of electric fields.

(1) All experiments on electrons to date are consistent with them being true point particles, with zero radius, yet they have a magnetic moment even when at rest. There is no reference frame moving with constant velocity in which the magnetic field of the electron vanishes.

(2) Light consists of electric and magnetic fields that are perpendicular to each other, propagating at the speed of light. There is no physically realizable reference frame in which it is possible to transform away the magnetic field.

(3) Here is my minor recent insight: In the classical wave picture, light is produced by accelerated charges. Because the velocity is constantly changing, there is no constant-velocity reference frame in which the charge is at rest, and in which the magnetic field of the charge vanishes.

Bruce Sherwood

Posted in Uncategorized | 1 Comment

Calculus and formal reasoning in intro physics

A physicist asked me, “One thing I noticed in most recent introductory physics textbooks is the slow disappearance of calculus (integrals and derivatives). Even calculus-based physics now hardly uses any calculus. What is the reason for that?” Here is what I replied:

Concerning calculus, I would say that I’m not sure the situation has actually changed all that much from when I started teaching calculus-based physics in the late 1960s. Looking through a 1960s edition of Halliday and Resnick, I don’t see a big difference from the textbooks of today.

More generally, there is a tendency for older faculty to deplore what they perceive to be a big decline in the mathematical abilities of their students, but my experience is that the students are adequately capable of algebraic manipulation and even calculus manipulation (e.g. they know the evaluation formulas for many cases of derivatives and integrals). What IS however a serious problem, and is perhaps new, is that many students ascribe no meaning to mathematical manipulations. Here is an example that Ruth Chabay and I have seen in our own teaching:

The problem is to find the final kinetic energy. The student uses the Energy Principle to find that K_f = 50 joules. Done, right? No! Next the student uses the mass to determine what the final speed v_f is. Then the student evaluates the expression \frac12 m v_f^2 (and of course finds 50 joules). Now the student feels that the problem is solved, and the answer is 50 joules.

We have reason to believe that what’s going on here is that kinetic energy has no real meaning; rather, kinetic energy is the thing you get when you multiply \frac12 times m times the square of v. Until and unless you’ve carried out that particular algebraic manipulation, you haven’t evaluated kinetic energy.

Another example: A student missed one of my classes due to illness and actually went to the trouble of coming to my office to ask about what he’d missed, so he was definitely above average. The subject was Chapter 12 on entropy. I showed him an exercise I’d had the class do. Suppose there is some (imaginary) substance for which S = aE^{0.5}. How does the energy depend on the temperature? I asked him to do this problem while I watched. (The solution is that 1/T = dS/dE = 0.5aE^{-0.5}, so E = 0.25a^2T^2.) The student knew the definition 1/T = dS/dE, but he couldn’t even begin the solution. I backed up and backed up until finally I asked him, “If y = ax^{0.5}, what is dy/dx?” He immediately said that dy/dx = 0.5ax^{-0.5}. So I said, okay, now do the problem. He still couldn’t! His problem was that he knew a canned procedure that if you have an x, and there’s an exponent, you put the exponent in front and reduce the exponent by one, and that thing is called “dy/dx” but has no meaning. There is no way to evaluate dS/dE starting from aE^{0.5}, because there is no x, there is no y, and nowhere in calculus is there a thing called dS/dE.
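The entropy exercise itself takes only a few lines symbolically. As a quick check of the derivation above (my addition, using sympy; the symbol names mirror the post), one can differentiate S = aE^{0.5} and solve 1/T = dS/dE for E:

```python
import sympy as sp

a, E, T = sp.symbols('a E T', positive=True)

# Entropy of the imaginary substance: S = a*E**0.5
S = a * sp.sqrt(E)

# Definition of temperature: 1/T = dS/dE = 0.5*a*E**(-0.5)
dSdE = sp.diff(S, E)

# Solve 1/T = dS/dE for the energy E
sol = sp.solve(sp.Eq(1/T, dSdE), E)

# The result matches E = 0.25*a**2*T**2
assert sp.simplify(sol[0] - a**2 * T**2 / 4) == 0
print("E =", sol[0])
```

Of course, the point of the anecdote is precisely that the student’s difficulty was not with this manipulation but with recognizing that dS/dE is a derivative at all.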

We are convinced that an alarmingly large fraction of engineering and science students ascribe no meaning to mathematical expressions. For these students, algebra and calculus are all syntax and no semantics.

A related issue is the difficulty many students have with formal reasoning, and here there may well be a new problem. It used to be that an engineering or science student would have done a high school geometry course that emphasized formal proofs, but this seems to be no longer the case. Time and again, during class and also in detailed Physics Education Research (PER) interviews with experimental subjects we see students failing to use formal reasoning in the context of long chains of reasoning. An example: Is the force of the vine on Tarzan at the bottom of the swing bigger than, the same as, or smaller than mg? The student determines \vec{p} just before and just after and correctly determines that d\vec{p}/dt points upward. The student concludes correctly that the net force must point upward. The student determines that the vine pulls upward and the Earth pulls downward. The student then says that the force of the vine is equal to mg! Various studies by Ruth Chabay and her PER grad students have led to the conclusion that the students aren’t using formal reasoning, in which each step follows logically from the previous step. Often the students just seize on some irrelevant factor (in this case, probably the compiled knowledge that “forces cancel”).

This problem with formal reasoning may show up most vividly in the Matter & Interactions curriculum, where we want students to carry out analyses by starting from fundamental principles rather than grabbing some secondary or tertiary formula. We can’t help wondering whether the traditional course has come to be formula-based rather than principle-based because faculty recognized a growing inability of students to carry out long chains of reasoning using formal procedures, so the curriculum slowly came to depend more on having students learn lots of formulas and the ability to see which formula to use.

Coming back to calculus, I assert that our textbook has much more calculus in it than the typical calculus-based intro textbook. This may sound odd, since we have had students complain that there’s little or no calculus in our book (we heard this more often from unusually strong students at Carnegie Mellon than at NCSU). The complaint is based on the fact that we introduce and use real calculus in a fundamental way right from the start, but many students do not see that the sum of a large number of small quantities has anything to do with integrals, nor that the ratio of small quantities has anything to do with derivatives. For formula-based students, \Delta \vec{p} = \vec{F}_{\text{net}}\Delta t has nothing to do with calculus, despite our efforts to help them make a link between their calculus course and the physics course.

Bruce Sherwood

Posted in Uncategorized | Tagged | 23 Comments