So, in the last lecture, I had introduced
the Klein-Gordon equation and then written the current conservation relation which follows
from the equation. And we will continue the discussion of that
result. The conservation law can be related to the non-relativistic form by defining
two quantities: rho, which is the density, and j, which describes the current. And we
obtained the form which looks like what is there above. But in the four dimensional notation,
which shows the Lorentz transformation properties explicitly, it is convenient to write down these
things in terms of 4 vectors.
And that expression will now be defined by a 4 current, whose
components are rho and the 3 components of the current. It will satisfy a relation which now absorbs
both these quantities into one equation using 4 derivatives. The 4 current can be written as j mu equal to i
h cross by 2m into psi star del by del x mu psi minus psi del by del x mu psi star, and the conservation law is equivalent
to del mu j mu is equal to 0.
So, this notation makes very clear that the 4 components of
the currents are essentially defined by derivatives of the wave function and they obey a conservation
law, which is invariant under Lorentz transformation. On the other hand, the components themselves
are going to change when one applies a Lorentz transformation, just like the space and the
time mix, in coordinate transformation, this is another 4 vector, and rho and j end up
mixing when you go from one Lorentz frame to another.
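For reference, the spoken expressions above can be written out in symbols. This is only a sketch: the metric signature (+, -, -, -) and the identification of the time component of the 4 current with c times rho are assumptions, since the lecture does not spell them out here.

```latex
\begin{align}
j^\mu &= \frac{i\hbar}{2m}\left(\psi^*\,\partial^\mu\psi - \psi\,\partial^\mu\psi^*\right),
\qquad j^\mu = (c\rho,\ \vec{j}\,), \\
\rho &= \frac{i\hbar}{2mc^2}\left(\psi^*\frac{\partial\psi}{\partial t} - \psi\frac{\partial\psi^*}{\partial t}\right),
\qquad
\vec{j} = \frac{\hbar}{2mi}\left(\psi^*\nabla\psi - \psi\nabla\psi^*\right), \\
\partial_\mu j^\mu &= \frac{\partial\rho}{\partial t} + \nabla\cdot\vec{j} = 0 .
\end{align}
```

With this sign convention the 3 vector part coincides with the Schrodinger current, and the conservation law follows by using the Klein-Gordon equation to replace the second time derivatives of psi and psi star.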
This however is different from the non-relativistic situation. I have chosen the normalization
so that the 3 vector part of the current looks exactly the same as what came out from Schrodinger’s
equation. But the time component is different. In the non-relativistic situation, rho
was just the absolute value of psi squared. Here it is a different combination.
The other feature
is that, in non-relativistic situation, time is absolute. So, the value of rho did not
depend on the frame in which one is working. But, here the value of rho is going to depend
on the frame. And we have to now reconsider what we mean by rho in terms of physical
interpretation. In the non-relativistic situation, it was quite easy to interpret rho as a probability
density or a number density which, when integrated over all of space, was normalized to the number
of particles; the probability of finding a particle in the whole of space was fixed. And
that was the interpretation assigned to this quantity Q; and Q was just a normalization
constant. Here rho is not going to be independent
of the frame. And to see explicitly what happens, let us consider a simple situation,
which is easy for the purpose of illustration.
That is the result for stationary states.
In this case, the wave function has a very simple time dependence: psi is e raised
to minus i E t by h cross multiplied by some function of space. And if you now take
this particular time dependence and stick it into the expression for rho, the result immediately
comes out, because the time derivative is essentially acting only on this particular phase. And
one can rewrite the whole expression as rho is equal to E by m c square times psi star psi.
And this now makes the connection with the non-relativistic situation explicit. We do
obtain mod psi square, but there is an overall factor, which is E by m c square. If the velocity
is small, then the energy is indeed close to the rest mass energy, which is m c square.
And in that case, we recover the non-relativistic limit explicitly.
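The stationary state result can be checked in two lines. The sketch below assumes the standard Klein-Gordon density with the normalization i h cross by 2 m c square (the form taken to be on the board), and writes the spatial function as varphi, a symbol introduced here only for illustration.

```latex
\begin{align}
\psi(\vec{x},t) &= e^{-iEt/\hbar}\,\varphi(\vec{x})
\;\Rightarrow\;
\frac{\partial\psi}{\partial t} = -\frac{iE}{\hbar}\,\psi, \qquad
\frac{\partial\psi^*}{\partial t} = +\frac{iE}{\hbar}\,\psi^*, \\
\rho &= \frac{i\hbar}{2mc^2}\left(\psi^*\frac{\partial\psi}{\partial t} - \psi\frac{\partial\psi^*}{\partial t}\right)
= \frac{i\hbar}{2mc^2}\left(-\frac{2iE}{\hbar}\right)\psi^*\psi
= \frac{E}{mc^2}\,|\psi|^2 .
\end{align}
```

For E close to m c square the prefactor goes to 1, which is the non-relativistic limit; for E negative the prefactor, and hence rho, is negative.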
But, in other cases, the
value of rho is not the same as mod psi square. Now, there is a very simple interpretation
of this overall factor, which is connected to the fact that, the space and time coordinates
mix under Lorentz transformations. And so the overall objects, which are involved in
say the definition of Q; they undergo certain transformations. And one famous result is
that, there is a Lorentz contraction of the volume. And the contraction factor
is precisely the ratio of E by m c square.
What is happening in this particular case
is that, the total integral, which defines the number or Q does not change; the volume
factor d cube x shrinks by this particular factor; the density therefore, increases by
the same factor; and density times the volume remains invariant.
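In symbols, and assuming a free particle of positive energy moving with velocity v (so that E is gamma times m c square), the bookkeeping looks as follows. The subscript 0 for rest frame quantities is notation introduced here, not taken from the lecture.

```latex
\begin{align}
\frac{E}{mc^2} &= \gamma = \frac{1}{\sqrt{1 - v^2/c^2}}, \\
d^3x &= \frac{d^3x_0}{\gamma}, \qquad \rho = \gamma\,\rho_0, \\
Q &= \int \rho\, d^3x = \int \rho_0\, d^3x_0 .
\end{align}
```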
So, we have a simple understanding of why rho changes under Lorentz transformation. The result for
this particular type of stationary states also shows the ambiguity of the overall square
root, which gave the two solutions for the energy: one with positive sign and one with negative
sign. And so we can see that, depending on whether you choose a positive or a negative
energy solution, the density will also be positive or negative; and we must find
an interpretation of what we mean by a negative density.
And that takes us back to the business
of reinterpreting the various quantities, which appear in relativistic theory. In case of non-relativistic mechanics, we
could label this rho as a density. But that density could be either the number density
(or the probability density), or it could also be interpreted as a charge density for the particles. And
both these objects are equal in non-relativistic formulations; one can use either as a
physical realization of what we mean by this particular mathematical variable. On the other
hand, in relativistic theory these two objects
in general do not coincide. And the reason for that is, once you introduce
relativity, you automatically have to deal with creation and annihilation of particles.
This is a deep result, but experimentally it is very well seen that you can create
particles out of the vacuum by supplying enough energy, equal to two times the rest mass
energy, when there is a pair creation.
And you can also annihilate particles, where 2m
c square energy disappears into something else, which is radiation, photons or similar
objects. So, this is a feature, which was seen later historically after all this mathematical
formulation. But, that is our modern understanding that, once we allow the processes which create
or destroy particles, the number is not going to be conserved.
But, on the other hand, we can still have the charge conserved exactly by demanding
that, when there is a pair creation or annihilation, it is always a particle together with
its antiparticle which are created or annihilated. The particle and antiparticle have opposite
values of charge. And so in this process, charge conservation is not violated. And so that is
the interpretation which we now assign to the variables which we have constructed
from the Klein-Gordon equation. So, the rho which came out of the Klein-Gordon equation actually corresponds to charge density. And we have no restriction on the sign of the charge density; there
could be positive charge; there could be negative charge; and either of them is perfectly physical,
and we have conservation of charge as well.
And that all fits in very well with
the mathematical formulation. So, that is one part of the story.
One can also ask, what is the relation between the energy and the contribution to the charge
density? This becomes obvious from the relation for the stationary states, which we discovered
that, in some sense, whenever the energy is positive, it corresponds to particles with
rho positive; and if energy is negative, that corresponds to antiparticles with rho less
than 0. So, this interpretation now fits very well both with experimental observations and
the mathematical formulation. And the details of this we will encounter later when we
study Lorentz transformations and quantum mechanics in more detail, in the sense of
how the particles and antiparticles are related to each other by several discrete
symmetries, the meaning of the negative energy solutions, and so on and so forth.
But, meanwhile, one can also look at another aspect of this formulation.
And that is the
question: why can we not by hand restrict the solution space, so that only positive
energy solutions are allowed? Because sometimes dealing with negative energy solutions, where
the energy is not bounded from below, creates confusion. And we can see that there must be some
limitation when you try to avoid solutions of one sign or the other. And this becomes
very obvious when dealing with the solutions not in the position space, but in the Fourier
space: what is the meaning of the positive energy solutions and the negative energy solutions,
when are both of them required, and when can one bypass one of them.
So, let us look at the Fourier transform of the wave functions. Say we define a particular wave
function as a function of time by taking an integral over energy. And in this notation, the standard
Fourier integrals will say that the limits of the integration are from minus infinity
to plus infinity. Now, let us look at just the subspace of this
energy region where, by hand, we say that we want to change the integration limits
to run from 0 to infinity. And that will put some restrictions on the time dependence
of this function, because the integration limits are changing. And there are certain
kinds of wave functions which have trouble getting expressed when the integration limits
are restricted. For example, for wave functions localized in time, we would like a dependence
where phi of t is concentrated at some particular instant t 0. And
this mathematically is represented by a delta function.
But, the Fourier transform of a delta function is a constant; and so in order to get localized
functions in time, you must integrate with a flat measure all the way from minus infinity
to plus infinity. If you just integrate from 0 to infinity, you will never get a delta
function in time. And that can be interpreted as a limitation that, if you want to restrict
the solutions only to certain subspace, you will not be able to obtain certain kind of
localized wave functions. If you want localized wave functions, you must cover the whole energy
space; and so you must deal with antiparticles. And this can also be looked at in position
space. In the position space, if you want to localize the particle to a region smaller
than the Compton wavelength, then solutions with relativistic energies, and so also those with E less than 0, become
necessary. In particular, we must include solutions with E less than 0. And this is
a consequence of going to very short scales. If you want to probe very short distances
or very short intervals in time, you must allow for negative energy solutions; or, in
other language, you must have the flexibility to create particle-antiparticle pairs or annihilate
them at very short scales. And this is indeed very well confirmed experimentally in high
energy probes where, when you try to probe very short distances, immediately there are
states which will either create or annihilate particles.
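The Fourier argument can be made concrete with a standard distributional identity. This is a sketch, assuming a flat weight in energy and an implicit convergence factor at large E; the symbol P denotes the principal value.

```latex
\begin{align}
\delta(t-t_0) &= \frac{1}{2\pi\hbar}\int_{-\infty}^{\infty} dE\; e^{-iE(t-t_0)/\hbar}, \\
\frac{1}{2\pi\hbar}\int_{0}^{\infty} dE\; e^{-iE(t-t_0)/\hbar}
&= \frac{1}{2}\,\delta(t-t_0) \;-\; \frac{i}{2\pi}\,\mathcal{P}\,\frac{1}{t-t_0} .
\end{align}
```

The half range integral therefore carries a tail falling off only as 1 over t minus t 0, so it is not localized at the single instant t 0; the tail cancels only when the E less than 0 half of the integral is added back.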
And we have to accept that as a
physical situation. So, E less than 0 is not unphysical; we just have to find a correct
interpretation in terms of the connection between mathematical expressions and physical
experimental situations. So, let me come to the situations where we
can now apply this. The result is that this Klein-Gordon equation
describes charged scalar particles. These are not the fundamental particles of atoms
like electrons and protons; but we do have other charged objects, which are created at
high energy accelerators. For example, the charged pion, which appears as pi plus, and
its antiparticle pi minus.
These can be very well described using the Klein-Gordon equation.
They have only the space-time degree of freedom. So, that is why they are labeled as scalars;
they do not have any spin associated with them. But, they do have this particle-antiparticle
degree of freedom, which is associated with the value of the charge. There are other possibilities
as well, which again follow from the same equation that, if for some reason, the wave
function satisfying Klein-Gordon equation is real; then one can immediately see that,
both the charge density as well as the current will vanish identically, because psi star
is equal to psi. And this then describes
neutral scalar particles. And an example of that is a neutral pion.
Now, these neutral particles fall into a different category.
And that is because the wave functions
are real. In particular, one does not have to worry too much about the ambiguity
of particle and antiparticle, because there are no opposite values of charge over
here. And one can rephrase things slightly differently: these neutral particles are their own antiparticles. So, one can try
to define a transformation going from particle to antiparticle. In the case of charged particles,
it is nothing but complex conjugation: if psi has an e raised to minus i E t by h cross
kind of time dependence, psi star will have a time dependence of the same type, but
with the opposite sign of energy. And so if psi is a particle wave function, psi star can be associated with the
antiparticle wave function. In the neutral particle case, psi and psi star are equal,
or one can say that the particles are antiparticles of themselves.
And one peculiarity of such neutral particles is that they can be created or annihilated
singly in some physical process of scattering, collision or other kinds of interaction.
Because of charge conservation, charged objects must be created in pairs. If
one of them is a particle carrying one kind of charge, it must always be accompanied
by another object which has the opposite value of charge, and which we interpret as the antiparticle.
And so for the charged objects which we see, like electrons and positrons, if you want
to create them or annihilate them, you will always get e plus and e minus appearing or
disappearing together. The same holds for proton and antiproton, and the same for neutron and
antineutron; neutrons do not have electric charge, but they can be assigned another charge,
which is the baryon number. And that will again be of opposite sign for neutron and antineutron.
On the other hand, neutral particles like pi 0 – there is no problem in creating or
annihilating them singly; the charge in the process is conserved; and no physical principle
is violated. Another example of such a neutral particle, which can be created and annihilated
singly is of course the very common photon, which will appear every time there is a transition
between different atomic energy levels; either it is emitted or absorbed and that occurs
basically with a single photon.
So, this is our interpretation of how to deal with this
Klein-Gordon equation and the solutions of opposite signs of energy associating those
two kind of solutions with particle and antiparticle; and also, looking at the possibility of neutral
particles in the same framework. Now, let me look at the same equation in a
slightly different framework, where I am going to separate the particle and antiparticle
degrees of freedom by hand. And that makes some things more obvious, in the sense
that the connection between the relativistic framework and its non-relativistic limit is
a little bit easier to understand. And I will call this reformulation a two component
framework. And the two components which we are going to deal with are essentially the
particle and antiparticle degrees of freedom. So, I now define the quantities. One of them
is zeta, which is a combination of psi and psi dot with a particular sign; and the other
quantity is chi, which is the same combination with the opposite sign.
The purpose of defining these auxiliary variables is that instead of having a second order equation
in time, we want to construct a first order equation in time, which looks similar to the
framework of Schrodinger equation.
And these two components I have constructed with a particular
coefficient in front of psi dot such that, if I take one particular sign of energy, the
solution will go one way or the other. In particular, if I choose a free
particle at rest, which means energy is equal to m c square, then zeta is nothing but
psi and chi is equal to 0. In one case the two terms add; in the other, the two terms
cancel. The psi dot basically just produces the energy, which is m c square, and the factor of 2
cancels out. And, if I have a free antiparticle at rest,
then E is equal to minus m c square; and then we have zeta is equal to 0 and chi is equal
to psi. So, these variables are conveniently defined such that, depending on the sign of the
energy, you will have one variable or the other; and that is the reason for giving
the labels of particle and antiparticle to the combinations zeta and chi rather explicitly.
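The transcript does not record the coefficients written on the board, so the following is a reconstruction rather than a quotation. It assumes an overall factor of one half, chosen so that the statements made in the lecture (zeta minus chi equals i h cross psi dot by m c square, and zeta equals psi for a particle at rest) come out right.

```latex
\begin{align}
\zeta &= \frac{1}{2}\left(\psi + \frac{i\hbar}{mc^2}\,\dot\psi\right), &
\chi &= \frac{1}{2}\left(\psi - \frac{i\hbar}{mc^2}\,\dot\psi\right), \\
\psi &= \zeta + \chi, &
\frac{i\hbar}{mc^2}\,\dot\psi &= \zeta - \chi, \\
\psi \propto e^{-iEt/\hbar}: \quad
\zeta &= \frac{1}{2}\left(1 + \frac{E}{mc^2}\right)\psi, &
\chi &= \frac{1}{2}\left(1 - \frac{E}{mc^2}\right)\psi .
\end{align}
```

Setting E equal to plus m c square gives zeta equal to psi and chi equal to 0; setting E equal to minus m c square gives the reverse.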
So, now, let us work through the equations which will explicitly be obeyed by these two
variables. So, let us now calculate the time derivative of these objects;
and I am going to stick with the first derivative, as in the case of Schrödinger’s equation.
So, the derivative is rather easily calculated by just taking the definition. And it produces
a combination of psi dot and psi double dot. But, now, we can use the various equations which we know
to rewrite this in a particular form. For example, psi dot
can be eliminated; i h cross psi dot by m c square is nothing but zeta minus chi. And
to eliminate psi double dot, we can use the original Klein-Gordon equation. And of course, psi
can then be eliminated completely by rewriting it in terms of zeta and chi.
So, with all these substitutions, the first derivative form can be completely
rewritten in terms of the two variables zeta and chi, with no time derivatives on the right-hand side. The
result is: i h cross del zeta by del t is equal to minus h cross square by 2m del square of zeta
plus chi, plus m c square zeta. And there is a similar equation for the second variable, where the
signs are essentially the opposite: i h cross del chi by del t is equal to h cross
square by 2m del square of zeta plus chi, minus m c square chi. And these two equations now
look very similar to Schrodinger’s equation; there is a first order time derivative; there
is an h cross square del square by 2m operator, which represents the kinetic energy.
And this m c square essentially represents the rest mass energy, which can be added easily
by hand into the Schrödinger equation. What is different is that the equation for zeta
now involves an extra contribution from chi and vice versa. And this extra contribution
can be interpreted as the contribution of relativistic corrections. In the Schrodinger
equation, it was not there. And now, the Klein-Gordon equation gets that extra correction. And one
can analyze the consequences of that extra correction, and it also helps to understand
the particle-antiparticle interpretation a little better.
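The steps described in words can be written out as a short derivation. It is a sketch under one assumption: that zeta and chi are one half of psi plus or minus i h cross psi dot by m c square, which is consistent with the relations quoted in the lecture but is not recorded in the transcript.

```latex
\begin{align}
i\hbar\,\dot\zeta
&= \frac{1}{2}\left(i\hbar\,\dot\psi - \frac{\hbar^2}{mc^2}\,\ddot\psi\right),
\qquad
-\hbar^2\,\ddot\psi = -\hbar^2c^2\,\nabla^2\psi + m^2c^4\,\psi \\
&= \frac{1}{2}\left[mc^2(\zeta-\chi) - \frac{\hbar^2}{m}\nabla^2(\zeta+\chi) + mc^2(\zeta+\chi)\right]
= -\frac{\hbar^2}{2m}\nabla^2(\zeta+\chi) + mc^2\,\zeta, \\
i\hbar\,\dot\chi
&= \frac{1}{2}\left(i\hbar\,\dot\psi + \frac{\hbar^2}{mc^2}\,\ddot\psi\right)
= +\frac{\hbar^2}{2m}\nabla^2(\zeta+\chi) - mc^2\,\chi .
\end{align}
```

The second order time derivative has been traded for two coupled first order equations, which is the point of introducing the auxiliary variables.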
So, let me rewrite these equations in a two component notation, where the two components
are just written as a simple vector consisting of zeta and chi: i h cross del by del t of the vector zeta, chi is equal to minus h cross square by 2m into
the matrix 1, 1, minus 1, minus 1 acting on del square of zeta, chi, plus m c square into the matrix 1, 0, 0, minus 1
acting on zeta, chi.
And this equation can now be rewritten in various possible ways by choosing
a suitable frame; and that change of frame will correspond to a linear transformation on this 2 component
vector. In particular, what can be done is you can go from a frame in which the particle
is moving to one in which the particle is at rest. And, by construction, we have defined the
states so that the identification of zeta and chi in the rest frame is very easy in terms
of what charges they carry and which degree of freedom they correspond to, particle or
antiparticle. So, to be able to do that, we need to decouple these two equations, so that
one degree of freedom will be completely independent of the other, instead of the relativistic
corrections here mixing up the two degrees of freedom.
And, in order to do that, we have to look at the various matrices, which appear over
here and diagonalize them.
What turns out to be easy to see is that these matrices are simple
combinations of Pauli matrices. In particular, the matrix with 1, minus 1 on the diagonal is sigma
3; while the matrix which is appearing over here is sigma 3 plus i times sigma 2. And
what we need to do is somehow rotate this particular matrix to a diagonal form without
changing the diagonal form of the matrix which goes with the rest mass term. And if
you do that, the equations will decouple, and then one part we will call the particle and the
other part we will call the antiparticle. This can be done as a general linear algebra
problem, but it is convenient to look at a geometrical representation, which is
quite straightforward.
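Written out with the usual convention for the Pauli matrices, the two component equation takes the form below. The last line is an added illustration, assuming a plane wave of momentum p so that minus h cross square del square becomes p square; the name H of p is introduced here only for convenience.

```latex
\begin{align}
\sigma_1 &= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
\sigma_3 + i\sigma_2 = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}, \\
i\hbar\,\frac{\partial}{\partial t}\begin{pmatrix}\zeta\\ \chi\end{pmatrix}
&= \left[-\frac{\hbar^2}{2m}\,(\sigma_3 + i\sigma_2)\,\nabla^2 + mc^2\,\sigma_3\right]
\begin{pmatrix}\zeta\\ \chi\end{pmatrix}, \\
H(\vec{p}\,) &= \left(mc^2 + \frac{p^2}{2m}\right)\sigma_3 + i\,\frac{p^2}{2m}\,\sigma_2 .
\end{align}
```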
And so I will draw a little diagram, which
represents these two particular terms, which we want to rotate by a linear transformation.
So, this is a little right-angled triangle; one side is the coefficient coming from here,
which in terms of the various quantities is the coefficient of sigma 2; the other side is
the coefficient of
sigma 3. And we want to go to a frame where these two objects are combined into a single
diagonal matrix. Now, I have chosen to draw it as a right-angled triangle, because in this
2 dimensional space, sigma 2 and sigma 3 represent orthogonal coordinates. And one can indeed
rotate them by performing a rotation about the direction which is orthogonal to both
of them, and which happens to be the direction corresponding to the third Pauli matrix, sigma 1. So, if you perform a suitable rotation, what
happens is that these 2 vectors get realigned in a single direction, with a magnitude given by
the hypotenuse of this particular triangle.
It is very easy to square these objects and
find out what that length is. And indeed it is what we expect from the relativistic dispersion
relation, and there will be a certain angle of rotation. Typically, this rotation angle is written
as 2 theta; that has to do with the representation of Pauli matrices in terms of half angles,
but we will not worry about it. The important point is the rotation angle, which is specified
about the third direction.
And that comes out from this geometry as the tangent inverse
of the ratio of these two terms. Because we are dealing with a Lorentz transformation,
what appears are not the trigonometric functions but the hyperbolic functions; and
that is buried inside this little factor: there is an i accompanying sigma 2 and no
such i with sigma 3. So, the angle of rotation is the hyperbolic tangent inverse of the ratio
of these two particular sides. And one can now easily define a transformation
which achieves this. So, the transformation is that the primed components are a certain rotation
matrix, e raised to i s, acting on the unprimed ones; where this s specifying the rotation matrix involves
the matrix sigma 1 and
the angle. It is connected with the fact that we are dealing with a Lorentz transformation,
and so with hyperbolic functions, that this matrix s is actually not a Hermitian matrix; it
happens to be anti-Hermitian. And so e raised to i s is a Hermitian matrix. And if you want
to explicitly denote that, s is equal to minus s dagger. So, if one performs this particular transformation,
then the resulting equation is indeed of the form which we wanted, which makes the separation
of the two degrees of freedom very explicit.
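The geometric argument can be summarized in formulas. This is a sketch for a plane wave of momentum p; the shorthand a and b for the two sides of the triangle, the symbol E p for the hypotenuse, and the explicit form and sign convention chosen for s are assumptions made here, since the transcript only says that s involves sigma 1 and the angle.

```latex
\begin{align}
H(\vec{p}\,) &= a\,\sigma_3 + i\,b\,\sigma_2, \qquad
a = mc^2 + \frac{p^2}{2m}, \quad b = \frac{p^2}{2m}, \\
a^2 + (ib)^2 &= a^2 - b^2 = m^2c^4 + p^2c^2 = E_p^2, \qquad
\tanh 2\theta = \frac{b}{a}, \\
e^{is} &= e^{\theta\sigma_1} = \cosh\theta + \sigma_1\sinh\theta, \qquad
s = -i\,\theta\,\sigma_1 = -s^\dagger, \\
e^{is}\,H(\vec{p}\,)\,e^{-is} &= E_p\,\sigma_3 .
\end{align}
```

The factor of i with sigma 2 is what turns the Pythagorean sum into a difference of squares, and the rotation into a hyperbolic one.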
The total energy has two signs, which are essentially
denoted by plus 1 and minus 1; and the two solutions of opposite energies are what we
called earlier particle and antiparticle. This transformation also illustrates another
point, which we will encounter in more detail when dealing with the Dirac equation; and
that is how to construct observable quantities out of this particular object. It is very
easy to see that, once you have got to the diagonal form, we can rewrite the current
components encountered earlier in terms of these particular variables.
And, to simplify the notation, let me define a symbol for this two component object, which
I am just calling phi. And then the observables are bilinears in this quantity. Just like in
Schrödinger’s equation, there was a form where you could write all observables
as psi star, then some operator and then psi. In this particular case, we can write down
the observable as phi dagger, where the dagger comes in because of this two component notation,
then some operator and then phi.
And one can now see what is a convenient notation to incorporate
this Lorentz transformation property as well as the physical observable.
So, when we change the basis, phi will go to
e raised to i s times phi. That is indeed what we did in
going from the unprimed frame to the primed frame. The observables transform as: O will go to e raised to i s O e raised to minus i s. This is the general transformation rule of
linear algebra, and it is constructed so that the various factors corresponding to this
rotation of the basis cancel out.
But, now, it is easy to see
which bilinears are invariant under these transformations; they are
phi dagger eta O phi, with eta equal to the matrix 1, 0, 0, minus 1. And this is necessitated
by the fact that e raised to i s is Hermitian. If you construct this particular quantity,
the various factors of e raised to i s and e raised to minus i s have to cancel each
other to create an invariant object. On the right-hand part, O times phi, the e raised to minus i
s from O and the e raised to i s from phi do cancel.
But, on the other side, the phi dagger also produces
e raised to i s, and that is because it happens to be Hermitian. And the O also provides e
raised to i s. To cancel them against each other, we must flip the sign of the exponent of one
transformation matrix. Here s involves the Pauli matrix sigma 1, and by putting in the Pauli
matrix sigma 3, which is denoted here by eta, one can anticommute it through and flip
the sign. And then the object which is appearing over here will not change its value.
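Both claims, that the hyperbolic rotation diagonalizes the two component equation and that phi dagger eta O phi does not change its value, can be checked numerically. The following is a minimal sketch in plain Python, not the lecturer's own construction. It assumes a plane wave of momentum p (so the matrix is m c square plus p square by 2m times sigma 3, plus i times p square by 2m times sigma 2), takes e raised to i s to be the exponential of theta times sigma 1, and uses units with m and c equal to 1 by default. All function names are hypothetical helpers introduced for this illustration.

```python
import math

# Pauli matrices as plain nested lists (standard library only, no NumPy).
ID = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]   # also plays the role of eta


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(c1, m1, c2, m2):
    """Linear combination c1*m1 + c2*m2 of two 2x2 matrices."""
    return [[c1 * m1[i][j] + c2 * m2[i][j] for j in range(2)] for i in range(2)]


def rotation(theta):
    """exp(theta*sigma_1) = cosh(theta)*1 + sinh(theta)*sigma_1 (Hermitian)."""
    return combine(math.cosh(theta), ID, math.sinh(theta), S1)


def hamiltonian(p, m=1.0, c=1.0):
    """Momentum-space matrix (mc^2 + p^2/2m)*sigma_3 + i*(p^2/2m)*sigma_2."""
    kinetic = p * p / (2 * m)
    return combine(m * c * c + kinetic, S3, 1j * kinetic, S2)


def rotation_angle(p, m=1.0, c=1.0):
    """theta defined by tanh(2*theta) = (p^2/2m) / (mc^2 + p^2/2m)."""
    kinetic = p * p / (2 * m)
    return 0.5 * math.atanh(kinetic / (m * c * c + kinetic))


def change_frame(op, theta):
    """O -> exp(is) O exp(-is), with exp(is) taken to be rotation(theta)."""
    return matmul(matmul(rotation(theta), op), rotation(-theta))


def apply(mat, vec):
    """2x2 matrix acting on a two component vector."""
    return [mat[i][0] * vec[0] + mat[i][1] * vec[1] for i in range(2)]


def bilinear(phi, op):
    """phi^dagger eta O phi, with eta = sigma_3."""
    v = apply(S3, apply(op, phi))
    return phi[0].conjugate() * v[0] + phi[1].conjugate() * v[1]


if __name__ == "__main__":
    p = 0.75
    theta = rotation_angle(p)
    # Compare the printed diagonal with +/- sqrt(p^2 c^2 + m^2 c^4).
    print(change_frame(hamiltonian(p), theta))
    phi, op = [0.3 + 0.4j, -0.2 + 0.1j], S2
    # The two printed numbers are the bilinear before and after the rotation.
    print(bilinear(phi, op),
          bilinear(apply(rotation(theta), phi), change_frame(op, theta)))
```

Dropping eta from `bilinear` (that is, using phi dagger O phi) breaks the invariance, which is one way to see why the matrix with plus 1 and minus 1 on the diagonal is needed.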
And this then becomes a prescription: the various invariant observables
are written in this form. Now, I will introduce another notation which is common: phi bar
O phi, where phi bar is identical to phi dagger eta. In particular, now, one can go
back and look at the charge density, which we had dealt with earlier.
The charge density
rho corresponds to O just being the trivial identity operator. The opposite
signs which were present in the expression are taken care of by this matrix eta, so that
the upper component will contribute a positive term to rho, and the lower component will
contribute a negative term to rho. And in general, all the other components, the various
components of the so-called current,
can be mapped to the various operators, which I will label just as the Pauli matrices,
where sigma 0 is the identity and the sigma i are the Pauli matrices. So, this gives a prescription of
how to construct various observable operators. It also explicitly shows that the plus
1 and minus 1 on the diagonal are very much part of the formulation, and their opposite
signs indeed are necessary to construct observables which are invariant under Lorentz transformations.
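The statement about the charge density can be verified directly. The sketch assumes the Klein-Gordon density with normalization i h cross by 2 m c square, together with the relations psi equal to zeta plus chi and i h cross psi dot by m c square equal to zeta minus chi.

```latex
\begin{align}
\rho &= \frac{i\hbar}{2mc^2}\left(\psi^*\dot\psi - \psi\,\dot\psi^*\right)
= \frac{1}{2}\left[(\zeta+\chi)^*(\zeta-\chi) + (\zeta+\chi)(\zeta-\chi)^*\right] \\
&= |\zeta|^2 - |\chi|^2
= \begin{pmatrix}\zeta^* & \chi^*\end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\begin{pmatrix}\zeta \\ \chi\end{pmatrix}
= \phi^\dagger\,\eta\,\phi = \bar\phi\,\phi .
\end{align}
```

The cross terms cancel, leaving the upper component with a positive sign and the lower component with a negative sign.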
And that is a very important lesson, which keeps on appearing in relativistic formulations
of field theory in many places. One must deal with opposite signs of solutions for energy
and a corresponding interpretation of particles and antiparticles. There is one more thing
which I can point out at this particular stage, which helps this particle-antiparticle identification;
it is the introduction of electromagnetic coupling in this particular formulation.
We will deal with the general formulation
a little bit later. But the simple prescription needed right now is that the electromagnetic
field is introduced in this differential equation by a very simple substitution, where
the momentum p mu is replaced by p mu minus e by c times the gauge potential A mu. And this prescription
is called minimal coupling. We will deal with a more general situation with electromagnetic
interaction a little bit later. But, right now, this is enough. And this now tells us what
will happen to the Klein-Gordon equation if you do this particular substitution. It
just amounts to shifting the derivative operator by the particular potential. And what I am interested in right now is just
the electric potential, which just modifies the time-derivative part in
the equation. And in the equation which we had in the 2 component form, it appears with a coefficient which
is the identity. On the other hand, the coefficient
of the rest mass term is the Pauli matrix sigma 3. And this says that the 2 types of
solutions behave differently in terms of mass and in terms of charge.
So, if you introduce an electric
potential, the direction in which the energy moves is opposite for the two types. The energy is shifted in opposite
directions, relative to the rest mass term, for E greater than 0 and E less than 0 solutions when the electric potential is
introduced. And this indeed confirms the labels which
we gave, of particle and antiparticle having opposite charges. Positive energy solutions
will be shifted one way, in the sense that the magnitude of the energy in one case will
increase. But in the other case, when the energy is negative, the magnitude of the energy
in the presence of the potential will decrease.
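A sketch of how this works out in the two component form is given below. It rests on several assumptions not stated in the transcript: Gaussian units with the time component of the gauge potential equal to the electric potential Phi, zero vector potential, the product e Phi taken positive for the final comparison, and zeta and chi defined with the shifted time derivative, i h cross del by del t minus e Phi, in place of i h cross del by del t.

```latex
\begin{align}
p^\mu \to p^\mu - \frac{e}{c}A^\mu
\quad &\Rightarrow \quad
i\hbar\frac{\partial}{\partial t} \;\to\; i\hbar\frac{\partial}{\partial t} - e\,\Phi, \\
i\hbar\,\frac{\partial\phi}{\partial t}
&= \left[-\frac{\hbar^2}{2m}\,(\sigma_3 + i\sigma_2)\,\nabla^2 + mc^2\,\sigma_3 + e\,\Phi\,\mathbb{1}\right]\phi, \\
\text{at rest, constant } \Phi: \quad E &= \pm\, mc^2 + e\,\Phi .
\end{align}
```

With e Phi positive and small compared to m c square, the magnitude of E grows for the upper sign and shrinks for the lower sign, which is the behaviour of two objects carrying opposite charges.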
These opposite shifts of the energy indeed
confirm opposite values of charge for particles and antiparticles. So, these are the important
interpretation problems of Klein-Gordon equations and their resolution. And, we will now deal
with the solution of this Klein-Gordon equation in presence of electromagnetic field the next