Hermitian or anti-Hermitian? That is the question

There is a cultural difference between physicists and mathematicians: physicists use Hermitian matrices to generate Lie algebras, whereas mathematicians use anti-Hermitian matrices. Why do they do this, and does it matter? Mathematicians define the Lie bracket of two matrices as [A,B]:=AB-BA (or sometimes BA-AB or (AB-BA)/2, but these differences don’t matter). This convention works because AB-BA is always anti-Hermitian if A and B are either both anti-Hermitian or both Hermitian. Physicists want [A,B] to be Hermitian, so they have to define [A,B]:=i(AB-BA), where i is a complex square root of -1. This converts the complex anti-Hermitian matrix AB-BA into the complex Hermitian matrix i(AB-BA). This convention works because multiplying by i turns each Hermitian matrix into an anti-Hermitian one and vice versa, so the number of independent Hermitian matrices is equal to the number of independent anti-Hermitian matrices, that is n^2 (counting real parameters) for n×n complex matrices.
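
The two conventions can be checked numerically on small matrices. The following is a minimal pure-Python sketch, using the Pauli matrices as sample Hermitian matrices; the helper names (`matmul`, `dagger`, `commutator` and so on) are illustrative choices, not part of any library.

```python
# Check both bracket conventions on 2x2 complex matrices (lists of rows).

def matmul(a, b):
    """Multiply two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def scale(c, a):
    """Multiply every entry of a by the scalar c."""
    return [[c * x for x in row] for row in a]

def commutator(a, b):
    """The mathematicians' bracket AB - BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def is_hermitian(a):
    return a == dagger(a)

def is_anti_hermitian(a):
    return a == scale(-1, dagger(a))

# Sample Hermitian matrices (Pauli matrices).
sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]

c = commutator(sx, sy)          # AB - BA of two Hermitian matrices
print(is_anti_hermitian(c))     # the plain commutator is anti-Hermitian
print(is_hermitian(scale(1j, c)))  # the physicists' bracket i(AB - BA) is Hermitian

# Anti-Hermitian versions i*sx, i*sy close under the plain commutator.
ax, ay = scale(1j, sx), scale(1j, sy)
print(is_anti_hermitian(commutator(ax, ay)))
```

All three checks print `True`. This is only a spot check on one pair of matrices; the general statement follows from taking the conjugate transpose of AB-BA.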

But this conversion between Hermitian and anti-Hermitian does not work for real (i.e. orthogonal) Lie algebras, or for quaternionic (i.e. symplectic) Lie algebras. This is because the numbers are different. In the real case, there are n(n+1)/2 independent Hermitian and n(n-1)/2 independent anti-Hermitian matrices, so more Hermitian than anti-Hermitian. In the quaternionic case, there are n(2n-1) Hermitian and n(2n+1) anti-Hermitian, so more anti-Hermitian than Hermitian. Both cause insurmountable problems for physicists, who try to create Lie algebras that are either too large or too small, by using Hermitian instead of anti-Hermitian matrices.
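
The counts quoted above follow from a simple parameter count, sketched here under the assumption that "independent" means independent real parameters. The function name `counts` and the field labels `"R"`, `"C"`, `"H"` are hypothetical conveniences for this example.

```python
def counts(n, field):
    """Return (hermitian, anti_hermitian) real dimensions for n x n matrices.

    field is "R", "C" or "H"; d is the real dimension of that number system.
    Each pair of off-diagonal entries contributes d real parameters in both
    cases. Diagonal entries are real for Hermitian matrices (1 parameter)
    and purely imaginary for anti-Hermitian matrices (d - 1 parameters).
    """
    d = {"R": 1, "C": 2, "H": 4}[field]
    off_diagonal = d * n * (n - 1) // 2
    return off_diagonal + n, off_diagonal + (d - 1) * n

for field in ("R", "C", "H"):
    for n in (1, 2, 3, 4):
        print(field, n, counts(n, field))
```

The real case reproduces n(n+1)/2 and n(n-1)/2, the complex case gives n^2 for both, and the quaternionic case gives n(2n-1) and n(2n+1).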

Let’s look first at the real case, where if you want to convert Hermitian into anti-Hermitian matrices you have to increase n to n+1. Physicists actually do this. They look at an electromagnetic field in three dimensions, and think of it as (real) Hermitian 3×3 matrices, so having 3×4/2=6 dimensions altogether, but when they calculate with it they convert to 4×4 (real) anti-Hermitian matrices (again having 4×3/2=6 dimensions altogether), and then multiply the extra dimension (time) by i to make the electric field (but not the magnetic field) Hermitian. It’s a mess, but it works, so physicists don’t care.
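
The 4×4 antisymmetric packaging of the six field components is usually written as the field tensor. Here is a minimal sketch of it as a real matrix, before any factor of i is introduced. It assumes one common sign convention with c = 1 (conventions differ between textbooks), and `field_tensor` is a hypothetical helper name.

```python
def field_tensor(E, B):
    """Real antisymmetric 4x4 matrix built from 3-vectors E and B.

    Assumed convention: row/column 0 is time, c = 1, and the signs follow
    one common textbook choice; other sources flip some of them.
    """
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0, -Ex, -Ey, -Ez],
        [Ex, 0, -Bz, By],
        [Ey, Bz, 0, -Bx],
        [Ez, -By, Bx, 0],
    ]

def is_antisymmetric(m):
    """True if m equals minus its transpose."""
    n = len(m)
    return all(m[i][j] == -m[j][i] for i in range(n) for j in range(n))

F = field_tensor((1, 2, 3), (4, 5, 6))
print(is_antisymmetric(F))

# The strictly upper triangle holds the 4*3/2 = 6 independent components.
upper = [F[i][j] for i in range(4) for j in range(i + 1, 4)]
print(len(upper))
```

The three entries in the time row carry the electric field and the three purely spatial entries carry the magnetic field, which is where the count of 6 comes from.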

You can see it in the theory of gravity as well. General relativity works with symmetric (i.e. Hermitian) rank 2 tensors over spacetime, so with 4×5/2=10 dimensions. Kaluza-Klein gravity recognises that these 10 dimensions ought to form a Lie algebra, but they don’t, so extends spacetime to 5 dimensions, so that they can use the 5×4/2=10 dimensions of anti-symmetric (i.e. anti-Hermitian) rank 2 tensors instead. But they don’t correct the unnecessary and confusing multiplication of time by i, so they don’t get the correct Lie algebra, and so they don’t get a correct theory of gravity.

From a mathematical point of view, therefore, electromagnetism is best modelled with the Lie algebra so(4) of anti-symmetric real 4×4 matrices, without any of this spurious multiplication by i that makes time imaginary in the standard theory. Gravity is similarly best modelled with the Lie algebra so(5) of anti-symmetric real 5×5 matrices, not the symmetric 4×4 real matrices that Einstein used. But physicists are absolutely convinced that matrices that are not Hermitian do not exist, so they are incapable of making this leap. What they don’t see is that if you want a real differential equation to describe real physical quantities, then the Lie algebra that describes the infinitesimals actually does contain anti-Hermitian matrices.

Now let’s look at the quaternionic case. Here the situation is even worse, because you cannot rescue the situation by inventing an extra dimension of space or time. In one dimension, you have one Hermitian matrix, and three anti-Hermitian. If you work with Hermitian matrices, as physicists do, then you only see one generation of electrons. If you work with anti-Hermitian matrices, as mathematicians do, you see all three generations. If you are not prepared to make the leap to anti-Hermitian matrices, then you will never find a theory that deals adequately with the three generations. And when I say never, I mean NEVER.
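
The algebra behind the one-dimensional case can be checked directly: a 1×1 quaternionic matrix is a single quaternion, Hermitian means real, and anti-Hermitian means purely imaginary. The sketch below stores a quaternion as a tuple `(w, x, y, z)`; the helper names are illustrative. It only verifies the closure of the imaginary quaternions under the commutator, and says nothing about the physical interpretation.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qconj(q):
    """Quaternion conjugate (the 1x1 'conjugate transpose')."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def qcomm(p, q):
    """Commutator pq - qp."""
    return tuple(s - t for s, t in zip(qmul(p, q), qmul(q, p)))

def is_anti_hermitian(q):
    return q == tuple(-c for c in qconj(q))

ONE = (1, 0, 0, 0)   # the single Hermitian generator
I = (0, 1, 0, 0)     # the three anti-Hermitian generators
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

print(qcomm(I, J))   # twice K
print(qcomm(J, K))   # twice I
print(qcomm(K, I))   # twice J
print(qcomm(ONE, I)) # zero: the real scalar commutes with everything
```

The three imaginary units close under the commutator into a 3-dimensional Lie algebra, while the single real (Hermitian) unit commutes with everything.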

In two dimensions, there are 6 Hermitian and 10 anti-Hermitian (independent) quaternion matrices. The Lie algebra you get from the anti-Hermitian matrices is in fact the same as the one you get from the real 5×5 anti-Hermitian matrices, so you can translate from one notation to the other if you like. This enables you to translate between quantum mechanical notation (2×2 quaternions) and gravitational notation (5×5 reals) and eventually (hopefully) unify the two approaches. At any rate, you’ve got the same Lie algebra in both cases, and therefore the same differential equations in both cases. But if you are not prepared to make the leap to anti-Hermitian matrices, then you will never find a theory that deals adequately with the unification of gravity and particle physics. And when I say never, I mean NEVER.

Of course, you need to go up to three dimensions to understand how space and the strong force work. Here there are 15 Hermitian and 21 anti-Hermitian matrices. The anti-Hermitian ones are the ones you need to generalise strong su(3) to three generations. Naively, you might expect to need 8×3=24 dimensions, but it seems that only 21 are available. So there are 24-21=3 parameters over which you have no control, corresponding to the three quark-generation-mixing parameters of the standard model.

And finally, in order to make the theory relativistic, you have to go to 4 dimensions, where there are 28 Hermitian and 36 anti-Hermitian matrices. People who are interested in GUTs like the number 28, because it is the dimension of so(8), and they love so(8) because of its exotic properties like triality. But unfortunately these 28 matrices are Hermitian, and therefore they do not generate a Lie algebra, so they do not generate so(8). Any resemblance of so(8) to real physics is purely coincidental. The Lie algebra you actually need to describe the whole of fundamental physics is sp(4), and the Lie algebra you need to make this model relativistic is sl(4,H). The compact part, sp(4), is generated by the anti-Hermitian matrices. The Hermitian matrices extend to gl(4,H), and deal with relativity.

The situation is analogous to Dirac’s relativistic spin group SL(2,C), or rather its Lie algebra sl(2,C). The non-relativistic spin algebra su(2) is generated by the anti-Hermitian 2×2 complex matrices, which also generate a scalar u(1), and the extension to sl(2,C) is generated by Hermitian 2×2 complex matrices, which also generate a real scalar. But extending from Dirac’s 6 real dimensions, or his 16 complex dimensions in the Dirac algebra, to 16 quaternion dimensions, allows us to include three independent generations, three independent directions of spin, and three independent directions of quantised gravity, all in a 36-dimensional compact Lie group.
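
The dimension bookkeeping behind this analogy can be written out as follows. This is a sketch of the splitting into anti-Hermitian and Hermitian parts as real vector spaces (not as a direct sum of Lie algebras), using the counts n^2 and n^2 for complex matrices and n(2n+1) and n(2n-1) for quaternionic ones.

```latex
\begin{aligned}
\mathfrak{gl}(2,\mathbb{C})
  &= \underbrace{\mathfrak{su}(2)\oplus\mathfrak{u}(1)}_{\text{anti-Hermitian: } 3+1}
  \;\oplus\;
  \underbrace{i\,\mathfrak{su}(2)\oplus\mathbb{R}}_{\text{Hermitian: } 3+1},
  &\dim_{\mathbb{R}} &= 8,\\
\mathfrak{sl}(2,\mathbb{C})
  &= \mathfrak{su}(2)\oplus i\,\mathfrak{su}(2),
  &\dim_{\mathbb{R}} &= 6,\\
\mathfrak{gl}(4,\mathbb{H})
  &= \underbrace{\mathfrak{sp}(4)}_{\text{anti-Hermitian: } 36}
  \;\oplus\;
  \underbrace{\{\text{Hermitian } 4\times 4\}}_{28},
  &\dim_{\mathbb{R}} &= 64,\\
\mathfrak{sl}(4,\mathbb{H})
  &= \mathfrak{gl}(4,\mathbb{H}) \text{ with the real scalar removed},
  &\dim_{\mathbb{R}} &= 63.
\end{aligned}
```

In both cases the anti-Hermitian part is the compact subalgebra, and the real scalar is the one Hermitian direction that separates gl from sl.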

So which is it to be, Hermitian or anti-Hermitian? Over to you, Prince of Denmark.

46 Responses to “Hermitian or anti-Hermitian? That is the question”

  1. Robert A. Wilson Says:

    Peter Woit is very concerned about how to convert Minkowski spacetime, as used in relativity, to Euclidean spacetime, as required in quantum mechanics. Neil Turok is very concerned about why you need to introduce a factor of i into the path integral for it to make sense. I believe these are basically the same problem, and I believe they have the same solution. I have tried, largely unsuccessfully, to talk to both of them about it.

    It is simply the fact that physicists think that Hermitian matrices generate Lie algebras. They don’t. Compact Lie algebras are generated by anti-Hermitian matrices, not Hermitian matrices. Correct this fundamental error, that is now 100 years old, and the problems all go away simultaneously.

  2. Robert A. Wilson Says:

    Perhaps I’d better try and clarify what I said about gravity being modelled by so(5), generated by anti-symmetric 5×5 matrices. This is a model, and I am not claiming that it necessarily agrees with experiment. It is a model produced by the binary icosahedral group, as I explained quite some time ago, but it is a model I now feel is unlikely to be compatible with particle physics, so I am not working on it any more.

    An alternative model is obtained using sl(4,R) or gl(4,R), that contains all the matrices, both Hermitian and anti-Hermitian. This is closer to GR than the so(5) model, but includes extra terms in the equations, since the anti-Hermitian terms cannot be avoided. So it has 15 or 16 field equations, rather than Einstein’s 10. The 6 anti-Hermitian terms come from rotations in spacetime, so that GR arises in the limit that the rotations can be ignored.

    This is the model of gravity that I favour at the moment, because it agrees (at a qualitative level) with astronomical observations that show that when the Newtonian (or Einsteinian) gravity gets very weak, the rotations can no longer be ignored.

  3. Lars Says:

    Whenever people (physicists or mathematicians) feel the need to multiply time by i to make their equations work out, I get queasy.

    All indications are that time is real, not imaginary (Wick notwithstanding and notwickrotating)

    And in defense of physicists, Hermitian matrices have real eigenvalues, which are required of operators that correspond to experimental results.

    • Lars Says:

      Wick Rotation

      Wick rotate

      And what results

      Is folks in state

      Of summersaults

    • Lars Says:

      “For the Love of i”

      Euler loved i

      And so did Gauss

      And by and by

      Did everyone else

    • Lars Says:

      or maybe the left i, the right i and the blind i.

    • Robert A. Wilson Says:

      I disagree with the statement that real eigenvalues are “required” for experimental results. Take “spin” for example. A spin 1 particle, such as a photon, has a helicity, which can reasonably be taken to be +1 or -1 according to whether it is right-handed or left-handed with respect to the direction of motion. But for a spin 1/2 particle, the spin operator is the square root of the (spin 1) helicity operator, and has eigenvalues +1, -1, +i and -i. So the eigenvalues are real on the left-handed spin, and imaginary on the right-handed spin (or vice versa).

      That is what mathematics tells us, and it is also what nature tells us. Left-handed and right-handed spins are completely different things physically. You can’t model that if you insist (as physicists do) on modelling both spins with the same mathematics.

      The other reason why insisting on real eigenvalues is a bad idea is that the very idea of eigenvalues is a complex notion. It cannot be restricted to real numbers, and it cannot be easily extended to quaternions. Actually it can be extended to quaternions, but then it very much matters whether your eigenvalues are real or imaginary or a mixture. Imaginary quaternionic eigenvalues are required for every operator (such as the mass operator) that has triplets of eigenvalues.

      The experimental existence of three generations of electrons, therefore, requires imaginary eigenvalues.

      • Robert A. Wilson Says:

        This isn’t just philosophical idealism, it is practical calculation. In two of my recent papers I have actually written down a mass operator, with its (quaternionic) eigenvalues and eigenvectors, and demonstrated that it explains (a) the electron/muon mixing angle, (b) the electroweak mixing angle, (c) the CP-violating phase of quark mixing, and (d) the strange/bottom quark mixing angle.

        So that adds a phenomenological reason for preferring quaternionic operators and non-real eigenvalues over complex Hermitian operators and real eigenvalues, to add to the theoretical and experimental reasons already given.

      • Robert A. Wilson Says:

        Since I’m on the subject of polarisation of photons, I’d better talk about the difference between helicity (circular polarisation) and linear polarisation. Helicity has real eigenvalues +1 and -1, but linear polarisation requires extending the eigenvalues to complex numbers of norm 1, that is the group U(1). Now in standard theory, the complex number cos t + i sin t is decomposed into “amplitudes” cos t and sin t, whose squares cos^2 t and sin^2 t are the “probabilities” of detecting a photon in one of the two perpendicular “directions” of polarisation chosen by the experimenter.

        But this experimental result is a macroscopic measurement of a stream of photons, because “probability” only makes sense if you can “count” the number of photons in each of the chosen “eigenstates”. The “operator” being applied is a macroscopic operator that describes a measurement. It is not an “internal” operator that describes the “actual” state of an individual photon. We know this both for theoretical reasons (Bell’s theorem, etc), and for experimental reasons (the probabilities are correct). The actual state of a photon is believed to be either helicity +1 or helicity -1, and the observations are supposed to derive from (complex) superposition of helicity eigenstates.

        In standard theory, this “superposition” is supposed to be an actual real photon. In my interpretation, and Bohm’s interpretation, this superposition is a mathematical fiction, not a physical particle. It is a description of the “average” photon in the sample. As we all know, normal people are not average people, normal weather is not average weather, and normal photons are not average photons. Average photons are characterised by interference between real photons, that is by phase differences that the real eigenvalues ignore. Phase differences are macroscopic properties, not internal properties. Physicists know that there is no internal “phase” that you can take differences of – only the difference has any physical reality. It’s the same as the use of “potential” in classical physics – there is no physical “potential”, but only the differences in potential have physical reality. That is why U(1) is a “gauge group” – it describes the mathematical transformations you can apply to your theory without changing any of the physics.

      • Robert A. Wilson Says:

        As a follow-up to the above comment, it might be worth commenting again on this notion of “quantum superposition”. In all circumstances, “quantum superposition” is a mathematical fiction designed to make the probabilities come out right. It is a description of the “average” particle in a given situation. It is not a description of any individual particle in that situation, and it is therefore not a description of a normal particle in that situation. It is a description of an average particle.

        If you are in the business of quantum mechanics, you are in the business of calculating probabilities, and for that purpose an “average” particle is extremely useful. Just as long as you don’t make the mistake of thinking that an average particle is a normal particle. The average household in the UK has 2.36 people in it. That is a number that is very useful for statisticians, planners and policy makers. But don’t make the mistake of thinking that a normal household has 2.36 people in it.

      • Robert A. Wilson Says:

        Or take mass as another example. The average rest mass of neutrons that are measured in our experiments is 939.56542 MeV/c^2. Does that mean that this is the rest mass of every neutron in the universe? Physicists appear to assume that it is, but I am more cautious. Does it mean that this is the rest mass of a normal neutron? Physicists appear to assume that it is, but I am more cautious. I only assume what can be proved by experiment, which is that the average mass of a neutron at rest in the laboratory is 939.56542 MeV/c^2, assuming that the relativistic corrections are accurate.

        I have argued in several posts recently that the relativistic corrections are not accurate, because only special relativistic corrections are used, and accelerations due to gravity are ignored. I have also argued that the “mass” is an essentially fictitious concept, that tells us more about the environment than it does about the particle. It is a useful concept, don’t get me wrong, because it enables us to predict accurately what happens to neutrons on average in environments that we can control and observe closely.

        It enables us to predict what happens to neutrons in other environments as well, but there is no evidence to demonstrate that these predictions are accurate. Indeed, astronomy and cosmology have produced a great deal of evidence that these predictions, applied to aggregates of vast numbers of neutrons in vast regions of the universe, are not always accurate.

        It also enables us to predict what happens to “free” neutrons in a “cold” beam and in an “ultracold” bottle, but experiment seems to be trying to tell us that these predictions are wrong: these two sorts of neutrons behave as if their average masses are greater (by about 0.0007%) in the bottle than in the beam. This can only happen if the “mass” is not a universal constant, but instead is a function of the environment, as I have been saying constantly for the past 9+ years.

        You only have to look at the Schroedinger equation, or the Dirac equation, to see that this must be true. These equations do not tell you what a particle will do, they tell you what a particle can do, and with what probabilities. All terms in the equations vary over time and space, except the mass. All terms in the equations describe variables that are averaged over multiple particles, except the mass. Why is this? Is it a physical fact that the mass of a normal particle is always the same as the mass of an average particle, in defiance of all the normal laws of probability and statistics? Or is this an unwarranted assumption, that nature is now trying to tell us is actually false?

      • Lars Says:

        Perhaps I’m just confused, but I thought because experimental results are real, the operators/matrices corresponding to measurements had to have real eigenvalues.

      • Robert A. Wilson Says:

        I think this confuses two different meanings of the word “real”. A “real” number is a mathematical concept that has nothing to do with the “real” physical world.

      • Lars Says:

        I referred to the real world in some of my comments above, but was using real in the sense of real numbers in the case of both the eigenvalues AND the results of experiment (i.e., measurements).

      • Robert A. Wilson Says:

        Yes, but why should the results of physically real experiments be mathematically real numbers? That is an assumption about what mathematics to use to describe the experiment. And it is that assumption that I am questioning. Both nature and mathematics seem to be telling us that that assumption is not valid.

        However, there are other ways of looking at the issue. If the mathematical model uses both Hermitian and anti-Hermitian matrices, as it often does (for example in relativity theory), then typically the Hermitian matrices describe the macroscopic variables, and the anti-Hermitian matrices describe the internal symmetries and the gauge groups. If you don’t use both, then you have no viable theory of measurement.

        Well, maybe that hits the nail on the head. Standard approaches to QM do not contain a viable theory of measurement.

      • Robert A. Wilson Says:

        Or to take another example, consider the masses of the three generations of electrons. Are these better modelled as real numbers, or as imaginary numbers? If you really want three of them, they must be imaginary quaternions, because there is no other algebra that can describe them. If you only want one of them, you can argue about whether it is real or imaginary, and you can argue about whether the factor of i goes on the left hand side of the Dirac equation or the right hand side. But experiment says there are three, and mathematics then implies they are imaginary.

  4. Robert A. Wilson Says:

    Quantum mechanics is entirely based on complex Lie algebras. My models are based on quaternionic Truth algebras.

  5. Robert A. Wilson Says:

    OK, so let’s look at the U(1) gauge group a bit more carefully. Experts tell me it is actually U(1)/Z_6, because the scalars of order 6 arise from the scalars of order 2 in SU(2) and the scalars of order 3 in SU(3). So what this means is that U(1)/Z_6 describes the mathematical “gauge group” that changes the mathematical description without changing the physical reality. So that means that the scalars Z_6 describe actual physical reality.

    Pause and allow that to sink in for a moment. The gauge group U(1) has no physical reality, but its finite subgroup Z_6 does have physical reality. The gauge group SU(2) has no physical reality, but it has a finite subgroup that does have physical reality. Which subgroup is this? Well, I think it has to be one in which when you restrict to U(1), you get Z_6, and it has to be non-abelian. There are about four possibilities, and I’ve tried them all. I’ve written hundreds of pages about them, but I couldn’t work out which one is correct until I looked at SU(3) as well.

    SU(3) has loads and loads of finite subgroups. Which one do you think we need? The clue is in the fact that it must mix non-trivially with the finite subgroup of SU(2). And now there is only one possibility. There is only one finite subgroup of SU(3) that is available to describe actual physical reality, as opposed to a mathematical model that includes loads of unobservable “potentials”.
