Hilbert Spaces and their Duals
Defining the inner product on an abstract vector space of physical states.
For the past few installments we’ve been studying the physics of the simple harmonic oscillator in Quantum Mechanics. I’m emphasizing this system for two reasons. First, it is representative of the mathematics behind Quantum Mechanics. Second, it’s an extremely simple example of the representation theory of Lie algebras. Quantum Mechanics and the representation theory of infinite-dimensional Lie algebras are intimately connected, as we shall explore in due course.
Soon we will take the mathematician’s point of view seriously, expanding upon and abstracting from all the details of the Heisenberg algebra. For the moment, however, we’ve got a few loose ends to tie off.
Today we’re focusing on Hilbert spaces. We’ve had a lot of them flying around in the notes and it’s time to unify our descriptions of them.
Hilbert Spaces and Inner Products
A Hilbert space H is a vector space equipped with an inner product. This is a generalization of the notion of the Euclidean dot product from three-dimensional geometry.
In three dimensions, we simply summed the products of the components of the two vectors together.
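In symbols, writing v_i for the Cartesian components of the vectors,

$$\vec{v} \cdot \vec{w} = \sum_{i=1}^{3} v_i\, w_i\,.$$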
The inner product allows us to assign a norm or magnitude to each vector,
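$$\lVert \vec{v} \rVert = \sqrt{\vec{v} \cdot \vec{v}}\,,$$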
which in turn gives us a meaningful notion of distance in that vector space,
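$$d(\vec{v}, \vec{w}) = \lVert \vec{v} - \vec{w} \rVert\,.$$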
It just so happens that this norm is the standard, Euclidean norm on R^3.
We seek to build the same kind of inner product for Hilbert spaces like H. The added complexities for us include the fact that H is complex and can be infinite-dimensional.
Without getting too formal, let us take these two complications in turn. First, we complex conjugate the components of the first vector in the pairwise product,
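Writing a bar for complex conjugation, this gives

$$\langle v, w \rangle = \sum_{i} \bar{v}_i\, w_i\,.$$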
Exercise: Show that if H is a finite-dimensional, complex vector space, this new inner product defines a norm and a distance on H.
Second, if the components of our vectors are labeled by a continuous variable - as with functional representations - we have seen previously that the sum generalizes to an integral1,
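$$\langle f, g \rangle = \int \overline{f(x)}\, g(x)\, dx\,.$$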
Exercise: What requirements or restrictions do you anticipate for such a vector space of functions? Describe how this construction also defines a norm and a distance on those functions.
If H is infinite-dimensional but discrete, we can instead write,
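$$\langle v, w \rangle = \sum_{n=0}^{\infty} \bar{v}_n\, w_n\,,$$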
but we have to be careful here to make sure that the sum converges.
States of the Harmonic Oscillator
So far we’ve encountered two explicit representations for the harmonic oscillator: one based on Hermite Polynomials and Dirac’s abstract vectors. These were spanned by linearly independent basis elements indexed by nonnegative integers,
respectively.
In the former case, we used a functional representation of the inner product,
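$$\langle \psi_m, \psi_n \rangle = \int_{-\infty}^{\infty} \overline{\psi_m(x)}\, \psi_n(x)\, dx\,.$$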
The latter case - Dirac’s case - was very abstract. There was no direct computation of the inner product available. We will remedy that presently.
The Transpose and Duality
In elementary physics, we often represent vectors in R^3 as a sum over components with basis vectors,
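For example, with unit vectors along the coordinate axes,

$$\vec{v} = v_x\, \hat{x} + v_y\, \hat{y} + v_z\, \hat{z}\,.$$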
Another, more practical representation is that of the column vector,
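$$v = \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix}.$$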
Hopefully the map between these two representations is clear. The column vector representation is favored for doing computations, especially when matrices or computers are involved. For example, if we wanted to rotate in the x-y plane, we would multiply v by a matrix M(θ),
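Explicitly, for a rotation by angle θ about the z-axis (in one standard sign convention, assumed here),

$$M(\theta)\, v = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix}.$$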
Instead of a column vector, which can be thought of as a 3 x 1 matrix, we could have written down a row vector,
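$$v^T = \begin{pmatrix} v_x & v_y & v_z \end{pmatrix},$$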
which is a 1 x 3 matrix.
Here we use the T symbol to represent the fact that a row vector is the transpose of a column vector - we have swapped columns and rows.
We can also act on a row vector using the rotation matrix M(θ), which we might naively write as
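$$M(\theta)\, v^T.$$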
But notice that our rules of matrix multiplication require us to multiply from the right. In general you can multiply any two matrices together only if the number of columns on the left matches the number of rows on the right.
Carrying out this computation we notice something interesting,
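With the explicit form of M(θ) written above,

$$v^T M(\theta) = \begin{pmatrix} v_x \cos\theta + v_y \sin\theta & \; -v_x \sin\theta + v_y \cos\theta & \; v_z \end{pmatrix} \neq \left( M(\theta)\, v \right)^T.$$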
That is to say, these two vectors are different.
Instead, we see that to get the same rotated row vector, we need to operate on the right by the transpose of M(θ),
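$$\left( M(\theta)\, v \right)^T = v^T\, M(\theta)^T.$$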
This is an illustration of a general point:
Rows and columns are two distinct representations of the same vector. They are related by the transpose operation. In general, a matrix acting on a column vector from the left is the same as the transpose of a matrix acting on a row vector from the right.
Now,
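$$\left( v^T \right)^T = v\,,$$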
and also
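$$\left( M^T \right)^T = M\,.$$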
So the transpose operation squares to the identity. In this sense, we say that row vectors are dual to column vectors, and vice versa.
Exercise: Show that the set of m x n matrices forms a vector space, and hence argue that it is dual to the vector space of n x m matrices.
Because matrix multiplication requires that the number of columns on the left match the number of rows on the right, we can immediately see that we can always multiply two vectors from dual vector spaces2. This is another reason we call them dual.
In particular, we may multiply a row vector and a column vector,
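$$v^T\, w = \sum_{i=1}^{3} v_i\, w_i\,.$$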
This is precisely the inner product. In this case, we also see that the product gives the Euclidean norm (and therefore distance) on R^3,
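$$v^T\, v = \sum_{i=1}^{3} v_i^2 = \lVert \vec{v} \rVert^2\,.$$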
This result generalizes to all real matrices. To get an inner product on the vector space of (m x n)-matrices, we take the trace of the associated product,
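$$\langle V, W \rangle = \operatorname{tr}\!\left( V^T\, W \right) = \sum_{j,k} V_{jk}\, W_{jk}\,.$$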
Because of the cyclic property of the trace3, it doesn't matter whether you multiply the pair V^T W or W^T V. This is sometimes called the Frobenius inner product. We didn't have to take the trace for the case of row and column vectors because their product was already a single number.
Duality and Hilbert Spaces
We can generalize this notion to any vector space. When we treat the mathematics of this with precision, we will clarify the definitions. For now, let us simply deal with our two complications: the case of complex vector spaces and the case of infinite-dimensional vector spaces of complex functions.
For complex vector spaces of finite-dimension, we also require that dual vectors be complex conjugated. Hence the dual to a column vector is the complex conjugate of a row vector.
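$$v^\dagger \equiv \left( \bar{v} \right)^T = \begin{pmatrix} \bar{v}_1 & \bar{v}_2 & \cdots & \bar{v}_n \end{pmatrix}.$$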
Just like the transpose operation, this joint operation, the so-called Hermitian conjugate or “dagger” operation, squares to the identity. For complex vector spaces, we must take care to use the Hermitian conjugate if we want to multiply matrices on row vectors from the right,
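$$\left( M\, v \right)^\dagger = v^\dagger\, M^\dagger\,, \qquad M^\dagger \equiv \left( \bar{M} \right)^T.$$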
Similarly, for the vector spaces of complex functions, we must remember to take the complex conjugate of the left function, as previously seen,
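$$\langle f, g \rangle = \int \overline{f(x)}\, g(x)\, dx\,.$$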
In all cases we do this to end up with the inner product on the vector space, which gives us a norm and hence a notion of distance.
A vector space with such an inner product is called a Hilbert space.
Exercise: For a finite-dimensional Hilbert space, argue that its dual space is also a Hilbert space. In particular, show that the two dual Hilbert spaces necessarily share the same dimension, and are therefore isomorphic as vector spaces.
For infinite-dimensional Hilbert spaces, the notion of a dual space is more nuanced. Almost everything about infinite-dimensional vector spaces is nuanced. We’ll pass on those nuances for now.
The Inner Product on Dirac’s Representation
Last time we saw that physical states in Dirac’s representation of the harmonic oscillator were given by polynomials in the creation operator a† acting on the vacuum vector,
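Writing Ω for that vacuum vector, as above, such a state takes the form

$$p(a^\dagger)\, \Omega\,, \qquad p \text{ a polynomial}\,.$$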
The monomials in a† formed a “basis” for the physical states4,
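$$\left\{ (a^\dagger)^n\, \Omega \right\}_{n=0}^{\infty}\,.$$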
Up until now we haven't been clear on what an inner product for Dirac's abstract representation looks like. We are now equipped to discuss it.
BraKets
In 1939, Dirac published a paper that introduced the BraKet notation for such abstract Hilbert spaces. Following Dirac, we define a ket to be an abstract vector in H,
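$$|\psi\rangle \in H\,.$$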
For the case of the harmonic oscillator, we can write a ket for the vacuum vector,
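$$\Omega = |0\rangle\,.$$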
We can then rewrite our “basis” vectors as,
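$$|n\rangle \equiv (a^\dagger)^n\, |0\rangle\,,$$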
so that
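$$a^\dagger\, |n\rangle = |n+1\rangle\,, \qquad a\, |0\rangle = 0\,.$$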
That is to say, we kept our notation for the operators of the Heisenberg algebra a, a†, and 1.
Dirac then went on to define another set of vectors: the bra vectors, which are dual to H,
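$$\langle \psi | \in H^{*}\,.$$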
Dirac then simply asserted that the map between bras and kets was essentially the same as transposition, so that
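$$\left( |\psi\rangle \right)^\dagger = \langle \psi |\,, \qquad \left( \langle \psi | \right)^\dagger = |\psi\rangle\,.$$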
This is consistent in so far as the dagger operation here squares to the identity, just like the transpose.
Note that the same rules apply for any complex vector space. A scalar multiple of a ket maps to a bra multiplied by the complex-conjugated scalar,
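$$\left( c\, |\psi\rangle \right)^\dagger = \bar{c}\, \langle \psi |\,.$$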
Now here’s the fun thing. You might recall that the operators are complex linear combinations of position and momentum,
Because position and momentum are real linear operators on a complex vector space, their eigenvalues must be real. Hence they must be their own Hermitian conjugate5,
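$$x^\dagger = x\,, \qquad p^\dagger = p\,.$$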
All this is to say that we should take the dagger in a† literally. So in particular, the action of the operator a on a ket maps to a† on the bra
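$$\left( a\, |\psi\rangle \right)^\dagger = \langle \psi |\, a^\dagger\,.$$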
This means that we can find a bra representation of the simple harmonic oscillator by defining its vacuum as the bra annihilated by a†,
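$$\langle 0 |\, a^\dagger = 0\,.$$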
We can then build the space of physical states as before, but this time as polynomials in a. Hence we have an alternative “basis” of monomials,
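$$\langle n | \equiv \langle 0 |\, a^n\,.$$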
Now you know what we must do. We must take an inner product by considering the product of bras and kets - brakets6, in other words.
First let us normalize our vacua so that,
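$$\langle 0 | 0 \rangle = 1\,.$$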
Next observe that an operator sandwiched between a bra and a ket can act either on the left or the right. So we immediately have
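For instance, acting with a† to the left or with a to the right,

$$\langle 0 |\, a^\dagger\, | 0 \rangle = 0 = \langle 0 |\, a\, | 0 \rangle\,.$$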
But this is fine, because this implies that
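$$\langle 0 | n \rangle = 0 = \langle m | 0 \rangle \qquad \text{for all } m, n \geq 1\,.$$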
The only chance we have of getting a nonzero braket is if there are factors of both a and a† between the vacua. Hence we are left to consider the case
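$$\langle 0 |\, a^m\, (a^\dagger)^n\, | 0 \rangle\,, \qquad m, n \geq 1\,.$$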
We’ll cut to the chase:
Proposition:
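$$\langle 0 |\, a^m\, (a^\dagger)^n\, | 0 \rangle = n!\, \delta_{mn}\,.$$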
Proof.
First suppose that m > n. We can compute,
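Commuting one factor of a past the block of a†'s,

$$\langle 0 |\, a^m (a^\dagger)^n\, | 0 \rangle = \langle 0 |\, a^{m-1} \left[ a, (a^\dagger)^n \right] | 0 \rangle + \langle 0 |\, a^{m-1} (a^\dagger)^n\, a\, | 0 \rangle\,.$$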
The second term vanishes as a annihilates the ket vacuum. Evaluating the commutator using our prior discussion, we have
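$$\left[ a, (a^\dagger)^n \right] = n\, (a^\dagger)^{n-1}\,, \qquad \text{so} \qquad \langle 0 |\, a^m (a^\dagger)^n\, | 0 \rangle = n\, \langle 0 |\, a^{m-1} (a^\dagger)^{n-1}\, | 0 \rangle\,.$$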
We can iterate this n-1 more times to find,
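$$\langle 0 |\, a^m (a^\dagger)^n\, | 0 \rangle = n!\, \langle 0 |\, a^{m-n}\, | 0 \rangle\,.$$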
But a annihilates the ket vacuum, so this vanishes.
Next suppose that m < n. The exact same computation yields
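$$\langle 0 |\, a^m (a^\dagger)^n\, | 0 \rangle = \frac{n!}{(n-m)!}\, \langle 0 |\, (a^\dagger)^{n-m}\, | 0 \rangle\,.$$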
But here a† annihilates the bra vacuum. So this also vanishes.
Finally, if m = n, we see that
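$$\langle 0 |\, a^n (a^\dagger)^n\, | 0 \rangle = n!\, \langle 0 | 0 \rangle = n!\,.$$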
This proves the proposition. ∎
Putting all this together, we see that
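$$\langle m | n \rangle = \langle 0 |\, a^m\, (a^\dagger)^n\, | 0 \rangle = n!\, \delta_{mn}\,.$$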
This finally gives us the ability to concretely compute the inner product on H, Dirac's abstract representation.
1. In the case of the wavefunctions of the simple harmonic oscillator, this integral was performed over all the real numbers, although the functions involved were complex.
2. To be concrete, an (m x n)-matrix is dual to an (n x m)-matrix, so we can multiply them either way, (m x n)(n x m) or (n x m)(m x n). Of course, different orderings will give structurally different results in general.
3. Which you can infer from the equation above by summing either over j or k first. For finite-dimensional matrices it doesn't matter.
4. For finite-dimensional vector spaces, a basis is just a choice of a maximal subset of linearly independent vectors. For the case of infinite dimension - as with the harmonic oscillator - this notion is more nuanced. This is one reason we will come to prefer the notion of a graded vector space instead of a basis.
5. There is a slight subtlety to this if you use the Schrödinger representation, related to the notion of duality for an infinite-dimensional vector space. Essentially you need to include the integral as part of your definition. We can discuss that later if there is interest, but this is yet another reason to stay in the realm of algebra whenever possible.
6. Sometimes it's hard to tell the difference between dad jokes and math jokes.