Structural insight: April 2019

They [quaternions] are relatively poor 'magicians'; and, certainly, they are no match for complex numbers in this regard.
— Roger Penrose, The Road to Reality, 2005, §11.2.

In this post, I'm going to explore some deep questions about the nature of quaternion differentiation.

Along the way I'm going to suggest some reasons Penrose's assessment quoted above may be somewhat off-target. I'm quite interested in Penrose's view of quaternions because it's a twenty-first-century form of the classical arguments against quaternions, presented with an (afaics) sincere effort at objectivity by someone who patently does appreciate the profound power of elegant mathematics (in apparent contrast to the vectorists' side of the 1890s debate). The short-short version: not only do I agree that quaternions lack the magic of complex numbers, I think it would be bizarre if they had that magic, since they aren't complex numbers; but I see clues suggesting they've other magic of their own.

If I claimed to know just what the magic of quaternions is, it would be safe to bet I'd be wrong; the challenge is way too big for answers to come that easily. However, in looking for indirect evidence that some magic is there to find, I'll pick up some clues to where to look next, the ambiguous note on which this post will end... or rather, trail off.

Before I can start in on all that, though, I need to provide some background.

Contents
Setting the stage
Quaternions
Doubting classical nabla
Considering quaternions
Full nabla
Partial derivatives
Generalized quaternions
Rotation
Minkowski
Langlands

Setting the stage

When I was first learning vector calculus as a freshman in college (for perspective, that's about when Return of the Jedi came out), I initially supposed that the use of symbol ∇ in three different differential operators — ∇, ∇×, ∇· — was just a mnemonic device. My father, who'd been interested in quaternions since, as best as I can figure, when he was in college (about when Casablanca came out), promptly set me straight: those operators look similar because they're all fragments of a single quaternion differential operator called nabla:

$$\nabla = i\,\frac{\partial}{\partial x} + j\,\frac{\partial}{\partial y} + k\,\frac{\partial}{\partial z}\,.$$

If you know a smattering of vector calculus, you may be asking, isn't that just the definition of the gradient operator? No, because of the seemingly small detail that the factors i, j, k aren't the unit vectors of vector calculus; they're imaginary numbers. Which has extraordinary consequences. I'd better take a moment to explain quaternions.

Quaternions

A vector space over some kind of scalar numbers is the n-tuples of scalars, for some fixed n, together with scalar multiplication (to multiply a vector by a scalar, just multiply all the elements of the vector by that scalar) and vector addition (add corresponding elements of the vectors). An algebra is a vector space equipped with an internal operation called multiplication ("internal" meaning you multiply a vector by a vector, rather than by a scalar) and a multiplicative identity, such that scalar multiplication commutes with internal multiplication, and internal multiplication is bilinear (fancy term, simple once you've seen it: each element of the product is a polynomial in the elements of the factors, where each term in each polynomial has one element from each factor).

Whatever interesting properties a particular algebra has are, in a sense, contained in its internal multiplication. So when we speak of the "algebraic structure" of an algebra, what we're talking about is really just its multiplication table.

Quaternions are a four-dimensional hypercomplex algebra. They're denoted by the symbol ℍ (after Hamilton, their discoverer). Hypercomplex just means that the first of the four basis elements is the multiplicative identity, so that the first dimension of the vector space can be identified with the scalars, in this case the real numbers, ℝ. Traditionally the four basis elements are called 1, i, j, k; which said, hereafter I'll prefer to call the imaginaries i₁, i₂, i₃, and occasionally use i₀ as a synonym for 1. The four real vector-space components of a quaternion, I'll indicate by putting subscripts 0,1,2,3 on the name of the quaternion; thus, a = a₀ + a₁i₁ + a₂i₂ + a₃i₃ = Σ a_ki_k.

Quaternion multiplication is defined by i₁² = i₂² = i₃² = i₁ i₂ i₃ = −1, where multiplication of the imaginary basis elements is associative (i₁(i₂ i₃) = (i₁ i₂) i₃ and so on) and different imaginary basis elements anticommute (i₁ i₂ = − i₂ i₁ and so on). The whole multiplication table can be put together from these few rules, and we have the quaternion product (take a deep breath):

ab = (a₀b₀ − a₁b₁ − a₂b₂ − a₃b₃)

+ i₁ (a₀b₁ + a₁b₀ + a₂b₃ − a₃b₂)

+ i₂ (a₀b₂ − a₁b₃ + a₂b₀ + a₃b₁)

+ i₃ (a₀b₃ + a₁b₂ − a₂b₁ + a₃b₀) .

This is that bilinear multiplication table mentioned earlier, where each element of the product is a simple polynomial in elements of the factors. If you stare at this a bit, you can also see that when a and b are imaginary (that is, a₀ = b₀ = 0), the real part of the product is minus the dot product of the vectors, and the imaginary part of the product is the cross product of the vectors: ab = a×b − a·b.
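As a quick sanity check on the table above, here is a minimal Python sketch. The function names (`qmul`, `dot`, `cross`) and the tuple representation (a₀, a₁, a₂, a₃) are my own choices for illustration, not anything standard. It confirms the defining relations and the dot/cross pattern for one pair of imaginary quaternions.

```python
def qmul(a, b):
    """Quaternion product, with quaternions as (a0, a1, a2, a3) tuples."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    )

def dot(u, v):
    """Ordinary dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Ordinary cross product of two 3-vectors."""
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

I1, I2, I3 = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
MINUS_ONE = (-1, 0, 0, 0)

# The defining relations: each imaginary squares to -1, and so does i1 i2 i3.
assert qmul(I1, I1) == qmul(I2, I2) == qmul(I3, I3) == MINUS_ONE
assert qmul(qmul(I1, I2), I3) == MINUS_ONE

# Different imaginaries anticommute.
assert qmul(I1, I2) == I3 and qmul(I2, I1) == (0, 0, 0, -1)

# For imaginary a and b: real part is minus the dot product,
# imaginary part is the cross product.
u, v = (2, -1, 3), (1, 4, -2)
product = qmul((0,) + u, (0,) + v)
assert product == (-dot(u, v),) + cross(u, v)
print(product)  # (8, -10, 7, 9)
```

Integer components keep every comparison exact, so the assertions test the algebra rather than floating-point luck.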

A few handy notations: real part re(a) = a₀; imaginary part i‍m(a) = a − a₀; conjugate a^* = re(a) − i‍m(a); norm ||a|| = sqrt (Σ a_k²).

Quaternion multiplication is associative. Quaternion multiplication is also non-commutative, which was a big deal when Hamilton first discovered quaternions in 1843, because until then all known kinds of numbers had obeyed all the "usual" laws of arithmetic. But what's really interesting about quaternion multiplication — at least, on the face of it — is that it has unique division. That is, for all quaternions a and b, where a is non-zero, there is exactly one quaternion x such that ax = b, and exactly one quaternion x such that x‍a = b. In particular, with b=1, this says that every non-zero a has a unique left-multiplicative inverse, and a unique right-multiplicative inverse. These are actually the same number, which we write

$$a^{-1} = \frac{a^*}{\|a\|^2}$$

(the conjugate divided by the square of the norm). So a a⁻¹ = a⁻¹a = 1.

Division algebras are very special; pathological cases aside, there are only four of them: real numbers, complex numbers, quaternions, and octonions. (Yes, there are hypercomplex numbers with seven imaginaries that are even more mind-bending than quaternions. But that's another story.)

To solve equation ax=b, we left-multiply both sides by a⁻¹, thus a⁻¹b = a⁻¹(ax) = (a⁻¹a)x = x; and likewise, the solution to x‍a=b is x = b a⁻¹. We call right-multiplication by a⁻¹ "right-division by a", and write b / a = b a⁻¹; similarly, left-division a \ b = a⁻¹ b. (Backslash, btw, is such a dreadfully overloaded symbol, I can somewhat understand why I haven't seen others use it this way; but I'm quite bowled over by how elegantly natural this use seems to me. It preserves the order of symbols when applying associativity: (a / b) c = a (b \ c).) Naturally, I won't write division vertically unless the denominator is real.
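Here is a small sketch of the inverse and the two divisions, using Python's `fractions.Fraction` so the arithmetic is exact. The names `inv`, `rdiv` and `ldiv` are hypothetical helpers of my own; `rdiv(b, a)` stands for b / a and `ldiv(a, b)` for a \ b.

```python
from fractions import Fraction

def qmul(a, b):
    """Quaternion product, with quaternions as (a0, a1, a2, a3) tuples."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    )

def conj(a):
    """Conjugate: keep the real part, negate the imaginary part."""
    return (a[0], -a[1], -a[2], -a[3])

def inv(a):
    """Conjugate divided by the squared norm (raises ZeroDivisionError for a = 0)."""
    n2 = Fraction(sum(x * x for x in a))
    return tuple(x / n2 for x in conj(a))

def rdiv(b, a):
    """Right-division b / a = b a^-1, the solution of x a = b."""
    return qmul(b, inv(a))

def ldiv(a, b):
    """Left-division a \\ b = a^-1 b, the solution of a x = b."""
    return qmul(inv(a), b)

ONE = (1, 0, 0, 0)
a, b, c = (1, 2, 0, -1), (3, 0, 1, 2), (0, 1, -1, 4)

# The left and right inverses coincide.
assert qmul(a, inv(a)) == ONE and qmul(inv(a), a) == ONE

# Each division solves its own equation, and the two answers differ.
assert qmul(a, ldiv(a, b)) == b
assert qmul(rdiv(b, a), a) == b
assert ldiv(a, b) != rdiv(b, a)

# The order-preserving identity (a / b) c = a (b \ c).
assert qmul(rdiv(a, b), c) == qmul(a, ldiv(b, c))

# For an imaginary unit, 1/i_k = -i_k.
assert inv((0, 1, 0, 0)) == (0, -1, 0, 0)
```

The two divisions disagree here because the imaginary parts of a and b aren't parallel; when they are parallel the two quaternions commute and the distinction disappears.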

Okay, we're girded for battle. Back to nabla.

Doubting classical nabla

Our definition of nabla, you'll recall, was

$$\nabla = i_1\,\frac{\partial}{\partial x_1} + i_2\,\frac{\partial}{\partial x_2} + i_3\,\frac{\partial}{\partial x_3}\,.$$

This operator is immensely useful; people have been making great use of it, or its fragments in vector calculus, for well over a century and a half. But three things about it bother me, the first two of which I've seen remarked by papers written in the past few decades.

Handedness

My first bother follows from the fact that quaternion multiplication isn't commutative. This was, remember, a dramatic new idea in 1843; the key innovation that empowered Hamilton's discovery, because what he wanted — something akin to complex numbers but with arithmetic sensible in three dimensions — requires non-commutative multiplication. But if multiplication isn't commutative, why should the partial derivatives in the definition of ∇ necessarily be left-multiplied by the imaginaries? Why shouldn't they be right-multiplied, instead?

I've seen modern papers that use together both left and right versions of nabla. Peter Michael Jack, who's had a web presence for (I believe) nearly as long as there's been a web, has suggested using a prefix operator for the left-multiplying nabla and a postfix operator for the right-multiplying nabla. Candidly, I find that notation dire hard to read. The point of prefix operators (which Hamilton championed, the better part of a century before Jan Łukasiewicz) is to make expressions much simpler to parse, and mixing prefix with postfix doesn't do that. Another notation I see used in at least one place modernly is a subscript q on the left or right of the nabla symbol to indicate which side to put the imaginary factors on. I'm not greatly enthused by that notation either because it uses a multi-part symbol. I have an alternative solution in mind for the notational puzzle; but I'll want to make clear first the whole of what I'm trying to notate.

Truncation

My second bother is that traditionally defined nabla isn't even a full quaternion operator. It only has the partial derivatives with respect to the three imaginary components. Where's the partial with respect to the real component? In the 1890s debate, the quaternionists said quaternions are profoundly meaningful as coherent entities, and the vectorists said scalars and vectors are meaningful while their sum is meaningless. Now, I'm quite sympathetic to the importance of mathematical elegance, but come on, guys, make up your mind! Either you go full-quaternion, or you don't. A nabla that only acts on vector functions is just lame.

There's a good deal of history related to the question of imaginary nabla versus full nabla. The truncation of nabla to the imaginary components throughout the nineteenth century may have been partly an historical accident. Consideration of the four-dimensional operator seems to have started just before the turn of the twentieth century, and modern quaternionists I've observed use a full-quaternion operator. I'll have more to say about this history in the next section.

Meaning

My third bother is a byproduct, as best I can figure, of my persistent sense of arbitrariness about the nabla operator. (This is the difficulty I've never seen anyone else remark upon. Perhaps I'm missing something everyone else gets, or maybe I've just never looked in the right place; but then again, maybe people are just reluctant to publicly admit something doesn't make sense to them. That might explain a lot about the world.) It was never obvious to me why it should be meaningful — or, if you prefer the word, useful — to multiply the partial derivatives by the imaginaries in the first place. It's clear to me why you'd do that if you were defining gradient, because gradient is meaningful for any number of dimensions, and doesn't depend in any way on the existence of a division operation. But quaternions do have unique division, in fact it's rather a big deal that they have unique division, and the usual definition of ordinary derivative involves dividing by Δx. So why are we multiplying by the imaginaries, instead of dividing by them?

Considering quaternions

Some of my above questions have historical answers, which also bear on the challenge raised by Penrose in the epigraph at the beginning of this post.

By Chapter 11 of The Road to Reality, where Penrose makes that remark (and where he also candidly describes the question of quaternions' use in physics as a "can of worms"), he's already described some marvelous properties of complex numbers, culminating with one (hyper‍functions) only published in 1958. Which raises an important point. Complex numbers have been intensely studied by mainstream researchers throughout the modern era of physics, yet Penrose's crowning bit of complex 'magic' wasn't discovered until 1958?

Compare that to how much, or rather how little, scrutiny quaternions have received. Hamilton discovered them in 1843; but Hamilton was a mathematical genius, not a great communicator. Quaternions, so I gather, remained the archetype of a baffling abstract theory until Einstein's General Theory of Relativity took over that role. The first tome Hamilton wrote on the subject, Lectures on Quaternions, daunted the mathematical luminaries of the day; his later Elements of Quaternions, published incomplete in 1866 following his death in 1865, wasn't easy either. The first accessible introduction to the subject was Peter Guthrie Tait's 1867 Elementary Treatise on Quaternions. Quaternions got a big publicity boost when James Clerk Maxwell used them (for their conceptual clarity, rather than for mundane calculations) in his 1873 Treatise on Electricity and Magnetism. And then in the 1890s quaternions "lost" the great vectors-versus-quaternions debate and their use gradually faded thereafter. There simply weren't all that many people working with quaternions in the nineteenth century, and as world population increased in the twentieth century quaternions were no longer a hot topic.

Moreover, exploration of nabla got off on the wrong foot. Hamilton seems to have first dabbled with it several years before he discovered quaternions, as a sort of "square root" of the Laplacian, at which time naturally he only gave it three components; and when he adapted it to a quaternionic form it still had only three components. He didn't do much with it in the Lectures, and planned a major section on it for the Elements but, afaict, died before he got to it. James Clerk Maxwell was a first-class mind and a passionate quaternion enthusiast, but died at the age of forty-eight in 1879 — the same year as William Kingdon Clifford, who was only thirty-three, another first-class mind who had explored quaternions. The full-quaternion nabla was finally looked at, preliminarily, in 1896 by Shunkichi Kimura, but by that time the quaternionic movement was starting to wind down. Yes, quaternions were still being used for some decades thereafter, but less and less, and the notations get harder and harder to follow as quaternionic notations were hybridized with Gibbs vector notations, further disrupting the continuity of the tradition and undermining systematic progress. Imho, it's entirely possible for major insights to still be waiting patiently.

A subtle point on which Penrose's portrayal of quaternions is somewhat historically off: Penrose says that although to a modern mind the one real and three imaginary components of quaternions naturally suggest the one time and three space dimensions of spacetime, that's just because we've been acclimated to the idea of spacetime by Einstein's theory of relativity; and quaternions don't actually work for relativity because they have the wrong signature (I'll say a bit more about this below). But as far as the notion of spacetime goes, the shoe is on the other foot. Hamilton expected mathematics to coincide with reality (a principle Penrose also, broadly, embraces), and as soon as he discovered quaternions he did connect their structure metaphysically with the four dimensions of space and time. Penrose is quite right, I think, that ideas like this get to be "in the air"; but in this case it looks to me like it first got into the air from quaternions. So I'm more inclined to suspect quaternions suggested spacetime and thereby subtly contributed to relativity, rather than relativity and spacetime suggesting a connection to quaternions. The latter implies an anachronistic influence that must be illusory (for relativity to influence Hamilton would seem to require a TARDIS); the former hints at some deeper magic.

The point about quaternions having the wrong signature has its own curious historical profile. Penrose expresses very much the mainstream party line on the issue, essentially echoing the assessment of Hermann Minkowski a century earlier who, in formulating his geometry of spacetime, explicitly rejected quaternions, saying they were "too narrow and clumsy for the purpose". The basic mathematical point here (or, at least, a form of it) is that the norm of a quaternion is the square root of the sum of the squares of its components, √(t²+x²+y²+z²), whereas in Minkowski spacetime the three spatial elements should be negative, √(t²−x²−y²−z²). But here the plot thickens. Minkowski, who so roundly rejected quaternions, defines a differential operator that is, structurally, the four-dimensional nabla. As for quaternions and relativity, Ludwik Silberstein (a notable popularizer of relativity, in his day) did use quaternions for special relativity — except that, to be precise, he used biquaternions.

Biquaternions (which Hamilton had also worked with) are quaternions whose four coefficients are complex numbers in a fourth, independent imaginary. Or, equivalently, they're complex numbers whose two coefficients are quaternions in an independent set of three imaginaries. Either way, that's a total of eight real coefficients. Biquaternions do not, of course, have unique division. However, there are some oddly suggestive features to Silberstein's treatment. His spacetime vectors have only four non-zero real coefficients (of the four quaternion coefficients, a₀ is real while the a_k for k≥1 are imaginary, so that Σ(a_ki_k)² = a₀²−||a₁||²−||a₂||²−||a₃||²; while other biquaternions he considers have imaginary a₀ and real a_k for k≥1). Moreover, he prominently uses the "inverse" of a biquaternion, defined structurally just as for quaternions, a^* / ||a||², notwithstanding the technical lack of general biquaternion division.
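Both halves of that can be seen in a few lines of Python. This is only a sketch under my own assumptions: Python's built-in complex numbers stand in for the fourth, independent imaginary (called `H` here, a name I made up), and the coordinate values are arbitrary. It shows a pair of non-zero biquaternions whose product is zero, which is why general division fails, and it shows the product a a^* of a Silberstein-style vector coming out with the Minkowski pattern of signs.

```python
def qmul(a, b):
    """Quaternion product; the coefficients may be real or complex."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    )

def qconj(a):
    """Quaternion conjugate: negates i1, i2, i3 but leaves H alone."""
    return (a[0], -a[1], -a[2], -a[3])

H = 1j  # stand-in for the independent fourth imaginary; it commutes with i1, i2, i3

# Two non-zero biquaternions, 1 + H i1 and 1 - H i1, multiply to zero,
# so neither can have an inverse.
p = (1, H, 0, 0)
q = (1, -H, 0, 0)
assert qmul(p, q) == (0, 0, 0, 0)

# q is also the conjugate of p, so the would-be squared norm of p vanishes
# and the formula a^* / ||a||^2 would divide by zero.
assert qconj(p) == q

# A spacetime vector with real a0 and imaginary a1, a2, a3.
t, x, y, z = 5, 1, 2, 3
v = (t, H * x, H * y, H * z)
assert qmul(v, qconj(v)) == (t*t - x*x - y*y - z*z, 0, 0, 0)
print(qmul(v, qconj(v))[0])  # (11+0j)
```

So the structural "inverse" works wherever a a^* is non-zero, and for these vectors a a^* is the spacetime interval rather than a sum of four squares.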

Silberstein's approach contrasts with the quaternionic treatment of special relativity by P.A.M. Dirac, published in 1945 as part of the centennial celebration for the discovery of quaternions. Dirac used real quaternions on the grounds that since the merit of quaternions is in their having division, it would be pointless to use biquaternions which are of no particular mathematical interest. His mapping of spacetime coordinates onto real quaternions was unintuitive. But looking at the oddly familiar-looking patterns in Silberstein's treatment, and Minkowski's operator which is hard not to think of as a full quaternionic nabla, one might well wonder if there is something going on that defies Dirac's claim about the importance of unique division. Perhaps we've been incautious in our assumptions about just where the deep magic is to be found.

There are two pitfalls in this kind of thinking, which the inquirer must thread carefully between. On one hand, one might assume there is some unknown deep magic here, rather than trying to work out what it is; this not only would lean toward numerology, but if there really is something to be found, would miss out on the benefits of finding it. On the other hand, one could derive some superficial mathematical account of the particular mathematical relationships involved, based on math one already knows about, and assume this is all there is to the matter; which would again guarantee that any deeper insight waiting to be found would not be found. (Current mainstream thinking, btw, falls into the latter pitfall, essentially reasoning that geometric algebras are useful in a way that quaternions are not, therefore quaternions are not useful.) Is there any situation where it would really be time to give up the search altogether? Well, yes, one does come to mind — if one were to arrive at some deep insight into why one should really believe there isn't some deep magic here. Which might itself be some rather deep and interesting magic.

Frankly, I don't even know quite where to look for this hypothetical deep magic. I sense its presence, as I've just described; but so far, I'm exploring various questions in the general neighborhood, patiently, with the notion that if these sorts of great insights naturally emerge from a large, broad body of research (as they have done for complex numbers), the chances of finding such a thing should improve as one increases the overall size and breadth of one's body of lesser insights.

Which brings me back to the particular point I'm pursuing in this post, the full quaternion nabla.

Full nabla

From a purely technical perspective, it isn't difficult to define four versions of the full quaternion nabla, differing only by whether each imaginary acts on its corresponding partial derivative by left-multiplying, right-multiplying, left-dividing, or right-dividing. The only remaining — purely technical — question is how to write these four different operators in an uncluttered way that keeps them straight. Since the traditional nabla has three partial derivatives and is denoted by a triangle, I'll denote these full nablas, with four partial derivatives, by a square. To keep track of how the imaginaries are introduced, I'll put a dot inside the square, near one of the corners: upper left for left-multiplying by imaginaries, upper right for right-multiplying, lower left for left-dividing, lower right for right-dividing. (This operator notation affords coherence, as the dot is inside so there's no mistaking it for a separate element, and, as a bonus, should also be easy to write quickly and accurately by hand on the back of an envelope.)

Let a = f(x). Noting that for imaginary i_k, 1/i_k = −i_k, the full-quaternion nablas are

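$$\boxed{{}^{\bullet}\;\;}\,a = \frac{\partial a}{\partial x_0} + i_1\frac{\partial a}{\partial x_1} + i_2\frac{\partial a}{\partial x_2} + i_3\frac{\partial a}{\partial x_3}$$

$$\boxed{\;\;{}^{\bullet}}\,a = \frac{\partial a}{\partial x_0} + \frac{\partial a}{\partial x_1}i_1 + \frac{\partial a}{\partial x_2}i_2 + \frac{\partial a}{\partial x_3}i_3$$

$$\boxed{{}_{\bullet}\;\;}\,a = \frac{\partial a}{\partial x_0} - i_1\frac{\partial a}{\partial x_1} - i_2\frac{\partial a}{\partial x_2} - i_3\frac{\partial a}{\partial x_3} = \left(\boxed{{}^{\bullet}\;\;}\right)^{*} a$$

$$\boxed{\;\;{}_{\bullet}}\,a = \frac{\partial a}{\partial x_0} - \frac{\partial a}{\partial x_1}i_1 - \frac{\partial a}{\partial x_2}i_2 - \frac{\partial a}{\partial x_3}i_3 = \left(\boxed{\;\;{}^{\bullet}}\right)^{*} a$$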

and when we expand a = a₀ + a₁i₁ + a₂i₂ + a₃i₃,

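$$\begin{aligned}
\boxed{{}^{\bullet}\;\;}\,a \;=\;{}& \left( \frac{\partial a_0}{\partial x_0} - \frac{\partial a_1}{\partial x_1} - \frac{\partial a_2}{\partial x_2} - \frac{\partial a_3}{\partial x_3} \right) \\
{}+{}& i_1 \left( \frac{\partial a_1}{\partial x_0} + \frac{\partial a_0}{\partial x_1} + \frac{\partial a_3}{\partial x_2} - \frac{\partial a_2}{\partial x_3} \right) \\
{}+{}& i_2 \left( \frac{\partial a_2}{\partial x_0} - \frac{\partial a_3}{\partial x_1} + \frac{\partial a_0}{\partial x_2} + \frac{\partial a_1}{\partial x_3} \right) \\
{}+{}& i_3 \left( \frac{\partial a_3}{\partial x_0} + \frac{\partial a_2}{\partial x_1} - \frac{\partial a_1}{\partial x_2} + \frac{\partial a_0}{\partial x_3} \right) .
\end{aligned}$$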

Here, the left-hand column is the partial with respect to x₀, and the rest is the fragmentary differential operators from vector calculus: the rest of the top row is minus the divergence, the rest of the diagonal is the gradient, and the remaining six terms are the curl. When we reverse the order of multiplication for the right-multiplying nabla (the square with its dot at upper right), the imaginaries commute with the scalars and with themselves, but anticommute with each other, so everything stays the same except that the sign of the curl is reversed. We have

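$$\begin{aligned}
\boxed{{}^{\bullet}\;\;} &= \frac{\partial}{\partial x_0} - \operatorname{div} + \operatorname{grad} + \operatorname{curl} \\
\boxed{\;\;{}^{\bullet}} &= \frac{\partial}{\partial x_0} - \operatorname{div} + \operatorname{grad} - \operatorname{curl} \\
\boxed{{}_{\bullet}\;\;} &= \frac{\partial}{\partial x_0} + \operatorname{div} - \operatorname{grad} - \operatorname{curl} \\
\boxed{\;\;{}_{\bullet}} &= \frac{\partial}{\partial x_0} + \operatorname{div} - \operatorname{grad} + \operatorname{curl} \,.
\end{aligned}$$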

By taking sums and differences of these nablas, one can isolate the partial with respect to x₀, and the curl, and... the gradient minus the divergence. One cannot, however, separate the gradient from the divergence this way, which raises the suspicion that the gradient and divergence are, in some profound sense, a single entity. There may be some insights waiting here into the intuitive meanings of these various fragments of the full nabla.
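The decomposition can be checked mechanically. The sketch below is my own construction, not anything from the literature: it takes an arbitrary integer matrix `J` to play the part of the partial derivatives, with `J[j][k]` standing for ∂a_j/∂x_k, builds the four nablas directly from quaternion products, and compares them with the div/grad/curl fragments. The helper names (`full_nablas`, `fragments`) are hypothetical.

```python
def qmul(a, b):
    """Quaternion product, with quaternions as (a0, a1, a2, a3) tuples."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    )

def qadd(a, b):
    return tuple(x + y for x, y in zip(a, b))

def qsub(a, b):
    return tuple(x - y for x, y in zip(a, b))

UNITS = ((1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1))

def full_nablas(J):
    """Return (left-mult, right-mult, left-div, right-div) nablas applied to a."""
    # Column k of J is the quaternion da/dx_k.
    cols = [tuple(J[j][k] for j in range(4)) for k in range(4)]
    lm = rm = ld = rd = cols[0]
    for k in (1, 2, 3):
        left = qmul(UNITS[k], cols[k])    # i_k times the partial
        right = qmul(cols[k], UNITS[k])   # the partial times i_k
        lm, rm = qadd(lm, left), qadd(rm, right)
        ld, rd = qsub(ld, left), qsub(rd, right)  # dividing: 1/i_k = -i_k
    return lm, rm, ld, rd

def fragments(J):
    """Return (d/dx0, div, grad, curl) of a, each as a quaternion."""
    d0 = tuple(J[j][0] for j in range(4))
    div = (J[1][1] + J[2][2] + J[3][3], 0, 0, 0)
    grad = (0, J[0][1], J[0][2], J[0][3])
    curl = (0, J[3][2] - J[2][3], J[1][3] - J[3][1], J[2][1] - J[1][2])
    return d0, div, grad, curl

J = [[2, -1, 3, 5], [7, 4, -2, 1], [0, 6, -3, 8], [-4, 9, 2, -5]]
lm, rm, ld, rd = full_nablas(J)
d0, div, grad, curl = fragments(J)

# The four sign patterns.
assert lm == qadd(qadd(qsub(d0, div), grad), curl)
assert rm == qsub(qadd(qsub(d0, div), grad), curl)
assert ld == qsub(qsub(qadd(d0, div), grad), curl)
assert rd == qadd(qsub(qadd(d0, div), grad), curl)

def double(q):
    return tuple(2 * x for x in q)

# What sums and differences can isolate.
assert qadd(lm, ld) == double(d0)               # the partial with respect to x0
assert qsub(lm, rm) == double(curl)             # the curl
assert qsub(lm, rd) == double(qsub(grad, div))  # gradient minus divergence, together
```

In all four sign patterns grad and div carry opposite signs, so no linear combination of the four operators pulls them apart.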

Wait. Wasn't part of the point of the 1890s debate that the quaternionists maintained the whole quaternion was in a profound sense a single entity? Why are we still talking about the meanings of fragments of this thing, instead of the whole? And while we're at it, why is it in any way meaningful to multiply-or-divide the partial derivatives by the basis elements?

Partial derivatives

From here, the path I've been following breaks up, with faint trails scattering off in many directions. No one trail immediately suggests itself to me as especially worth a protracted stroll, so for now I'll take a quick look down the first turn or so of several, getting a sense of the immediate neighborhood, and let my back‍brain mull over what to explore in some future post.

Possibly, in my quest for the deeper meaning of the nabla operator, I may be asking too much; with the caveat that this may be one of those situations where it's right to ask too much, since some kinds of results must be pursued that way. But it's worth keeping in mind that, idealism aside, there's always been a strong element of utility in the nabla tradition. Starting with the pre-quaternion history of nabla noted above, the choice of operator has been in significant part a matter of what works.

A secondary theme that's been in play at least since Shunkichi Kimura's 1896 treatment is total derivatives versus partial derivatives. Without tangling in the larger question of coherent meaning, Kimura did address this point explicitly and up-front: why write

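$$\boxed{{}^{\bullet}\;\;}\,a = \frac{\partial a}{\partial x_0} + i_1\frac{\partial a}{\partial x_1} + i_2\frac{\partial a}{\partial x_2} + i_3\frac{\partial a}{\partial x_3}$$

rather than

$$\boxed{{}^{\bullet}\;\;}\,a = \frac{da}{dx_0} + i_1\frac{da}{dx_1} + i_2\frac{da}{dx_2} + i_3\frac{da}{dx_3}\;?$$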

Kimura, after noting that the two forms are interchangeable when the x_k are independent, chose partial derivatives. He reached this choice by considering the utility of the two candidate operators in expressing some standard equations, and adopting the operator he found notationally more convenient. It figures this would be the operator using partial derivatives, which are more technically primitive building blocks and thus (one would think) ought logically to provide a more versatile foundation.

An (arguably) more definite form of the total/partial question appears in modern quaternionic treatments of Maxwell's equations ([1], [2]), with the peculiar visible consequence that the definition of full nabla in these treatments has a stray factor of 1/c on the partial with respect to time (x₀). On investigation, this turns out to be a consequence of starting out with the total derivative with respect to time, supposing (as I track this, three and a half decades after I took diffy Q‍s) the whole is time-dependent. Expanding the partials,

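$$\frac{d}{dx_0} = \frac{\partial}{\partial x_0} + \frac{\partial x_1}{\partial x_0}\frac{\partial}{\partial x_1} + \frac{\partial x_2}{\partial x_0}\frac{\partial}{\partial x_2} + \frac{\partial x_3}{\partial x_0}\frac{\partial}{\partial x_3}\,.$$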

Now, the partials ∂x_k/∂x₀ (for k≥1) are the velocities of propagation along the spatial axes, which for Maxwell's equations are taken to be the speed of light, c. This factor of c therefore shows up on three out of four partials, but not on the partial with respect to time; for convenience (that again) one defines an operator with a factor of 1/c on it, which eliminates the extra factors of c on three of the partials, but introduces a 1/c on the partial with respect to time.
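Written out as a sketch, assuming (as just described) that each ∂x_k/∂x₀ for k≥1 is simply set to c, and leaving the imaginaries out of it entirely:

$$\begin{aligned}
\frac{d}{dx_0} &= \frac{\partial}{\partial x_0} + c\,\frac{\partial}{\partial x_1} + c\,\frac{\partial}{\partial x_2} + c\,\frac{\partial}{\partial x_3} \\
\frac{1}{c}\,\frac{d}{dx_0} &= \frac{1}{c}\,\frac{\partial}{\partial x_0} + \frac{\partial}{\partial x_1} + \frac{\partial}{\partial x_2} + \frac{\partial}{\partial x_3}
\end{aligned}$$

The second line is just the first divided by c, which is where the stray 1/c on the time partial comes from. How the imaginaries then attach to the individual terms is a separate question.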

And then there is the matter of orienting the partials. Which I'm still foggy on, how the imaginaries get in there and thus whether they multiply or divide, on the left or on the right. I see treatments just splicing the imaginaries in with at most a casual reference to orientation in an algebra, which early classroom experience has conditioned me to treat as someone who understands it all and doesn't take time to explain every little thing (I've been in that position a few times myself); but over time I've started to suspect that the folks acting so in this case might not really understand it any better than I do (I've been in that position, too).

Generalized quaternions

Quaternions lost out on the concrete front to vector calculus. But they also lost out on the abstract front. Mathematicians took Hamilton's idea of using axioms to define more general forms of numbers and reason about their properties, and ran with it. Linear algebra. Clifford algebras. Lie and Jordan algebras. Rings. Groups. Monoids. Semi-groups. People who want special numbers won't go as far as quaternions, and people who want general numbers won't stop at quaternions.

Yet, generalized quaternions — quaternions whose four coefficients aren't real numbers — have occasionally been employed. Why? On the face of it, generalized quaternions don't have the specific properties that make real quaternions unique. Are they used, then, out of some perceived mystical significance of quaternions, or is there actually something structural about quaternions, aside from their unique mathematical properties as a division algebra, that they can confer even in the generalized setting? I do not, of course, have a decisive answer for this question. I do have some places to look for small insights building toward prospects of an answer.

The places to look evidently fall into two groups, those that look within the scope of real quaternions and those that look at generalized forms of quaternions. In looking at real quaternions the point is to understand what they have to offer beyond mere unique division, that might possibly linger after the unique division itself has dropped away. I'll have more to say, further below, about real than generalized quaternions; I'm simply not familiar with much research using generalized quaternions as such, as most researchers either stick with real quaternions or drop quaternion structure.

On the generalized-quaternions front, I've already mentioned Silberstein; but, tbh, all I get from Silberstein is the question. That is, Silberstein's work suggests to me there's something of interest in generalized quaternions, but doesn't go far enough to identify what. There are some well-known generalizations that go off in different directions from Silberstein; besides geometric algebras, which are enjoying some popularity atm, there's the Cayley–Dickson construction, which offers an infinite sequence of hypercomplex algebras with 2ⁿ components, each losing just a bit more well-behavedness: complexes, quaternions, octonions, sedenions, and on indefinitely (though usually not bothering with fancy names beyond the sedenions). So far, I haven't felt any of those sorts of generalizations were retaining the character of quaternions; so that, whatever merits those generalizations might enjoy in themselves, they wouldn't offer insights into the peculiar merits of quaternions.
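For concreteness, here is a minimal sketch of the Cayley–Dickson doubling in Python. Several sign conventions are in circulation; the one assumed here is (a, b)(c, d) = (ac − d^*b, da + bc^*) with conjugate (a, b)^* = (a^*, −b), and the function names are my own. Numbers are flat lists of 2ⁿ integer coefficients. The checks show commutativity being lost at the quaternions and associativity at the octonions.

```python
def cd_conj(x):
    """Cayley-Dickson conjugate: (a, b)* = (a*, -b); reals are self-conjugate."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return cd_conj(x[:h]) + [-t for t in x[h:]]

def cd_mul(x, y):
    """Cayley-Dickson product of two coefficient lists of equal length 2**n."""
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    # (a, b)(c, d) = (ac - d* b, da + b c*)   [assumed convention]
    first = [p - q for p, q in zip(cd_mul(a, c), cd_mul(cd_conj(d), b))]
    second = [p + q for p, q in zip(cd_mul(d, a), cd_mul(b, cd_conj(c)))]
    return first + second

def unit(n, k):
    """The k-th basis element in dimension n."""
    e = [0] * n
    e[k] = 1
    return e

def neg(x):
    return [-t for t in x]

# Dimension 4: this convention reproduces i1 i2 = i3 = -(i2 i1).
i1, i2, i3 = unit(4, 1), unit(4, 2), unit(4, 3)
assert cd_mul(i1, i2) == i3
assert cd_mul(i2, i1) == neg(i3)

# Quaternions: not commutative, but still associative.
p, q, r = [1, 2, 3, 4], [0, -1, 2, 5], [3, 1, -2, 0]
assert cd_mul(p, q) != cd_mul(q, p)
assert cd_mul(cd_mul(p, q), r) == cd_mul(p, cd_mul(q, r))

# Dimension 8: associativity fails for the octonions.
e1, e2, e4 = unit(8, 1), unit(8, 2), unit(8, 4)
assert cd_mul(cd_mul(e1, e2), e4) == neg(cd_mul(e1, cd_mul(e2, e4)))
assert cd_mul(cd_mul(e1, e2), e4) != cd_mul(e1, cd_mul(e2, e4))
```

The same two functions keep going to dimension 16 and beyond without change, which is part of why the construction feels so mechanical compared with whatever it is that makes quaternions in particular interesting.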

As it happens, I do know of someone who continues further in what appears to be the same direction as Silberstein. But there's a catch.

The work I'm thinking of was done about sixty years ago by a Swedish civil engineer by the name of Otto Fischer. He wrote two books on the subject, Universal Mechanics and Hamilton's Quaternions (1951) and Five Mathematical Structural Models in Natural Philosophy with Technical Physical Quaternions (1957). It happens I can study the earlier book all I want, because my father bought a copy which I've inherited. Fischer indeed did not stop at real quaternions nor biquaternions. He moved on to what he called quadric quaternions — quaternions whose coefficients are themselves quaternions in an independent set of imaginaries, thus with six elementary imaginaries in two sets of three, and sixteen real coefficients — and thence to double quadric quaternions, which are quadric quaternions whose sixteen coefficients are themselves quadric quaternions in independent imaginaries, thus twelve elementary imaginaries in four sets of three, and 256 real coefficients. If what is needed to bring out the secrets of generalized quaternions is a sufficiently general treatment, Fischer should qualify.

Looking back now, Fischer's work looks a bit fringe; but it didn't look so extra-paradigmatic at the time. The 1890s vectors-quaternions debates were in the outer reaches of living memory, about as far removed as the 1950s are today; and work on quaternions had been done by some prominent physicists within the past few years. In particular, Sir Arthur Eddington, who had tinkered with quaternions, had only recently died. Fischer's work was — deservedly — criticized for its density, but afaict wasn't dismissed out of hand, as such.

In any case, my current interest is on the periphery of things, rather than in the center of prevailing paradigm research; so I can afford to tolerate a certain off-beat character in Fischer's work — up until Fischer gives me a reason to think I've nothing further worthwhile to find in it. And Fischer comes across as competent and quite self-aware of the density and indirection of his work, which he seeks to mitigate — though there's a real question as to whether he succeeds.

What I really want to understand about Fischer's work is, having provided himself with such an immense array of generalized quaternionic structure, what does he use it for? There are some clues readily visible in the preface and final sections of the book; somehow he seems to be associating different quaternion subsets of his general numbers with different specialties, and he's playing some kind of games with "pyramids" of differential operators. To really get a handle on it all, I fear it may be necessary to confront the book in full depth from page 1, which I've tried far enough to realize it's the single densest mathematical treatment I've encountered (though he does take very seriously his own advice to "begin at the beginning", else I'd hold out no hope at all of making sense of it).

So, studying Fischer's work may be one source of... eventual... insights into the puzzle of generalized quaternions. It certainly isn't a short-term prospect; but, there it is.

Before getting back to real quaternions in the next section, I'll digress to remark that Fischer reinforces a belief I've held ever since I really started researching the history of quaternions — in 1986 — that what we really need in mathematics is a certain type of software.

By my reading of the history, the vectorists in the 1890s debate really did have one important practical point in their favor: if you have to deal with the algebra by hand, it seems it'd be vastly easier to not make careless errors when following the rectangular regimentation of matrix algebra than the spinning vortices of quaternion algebra. (Recalling from my earlier post, the equivalence between matrix and quaternion methods is akin to the equivalence between particles and waves — with quaternions playing the part of waves.) That is, if you try to do quaternion algebra, involving breaking things down into components, on the back of an envelope you're awfully likely to make a mistake; so I immediately imagined having a computer help you get it right. (I didn't imagine a graphical user interface, btw, as that technology really didn't exist yet for personal computing. Looking back, I find myself ambivalent about GUIs; sure, they can be sparkly, but they don't always help us think clearly; we're so busy thinking of how to use the graphics, we forget to think first and foremost about the logical structures we'd like to interface with.)

Thinking about this idea, I eventually decided the underlying logical structures one wants would be essentially proofs, so that in a sense the software would be a sort of "proof processor", by loose analogy with the "word processor". Achieving the fluidity of back-of-the-envelope algebra was always key to my concept; my occasional encounters with "symbolic math" software have given me the impression of something far too cumbersome for what I envision. Facilely moving between alternative paths of reasoning should be easy; symbol definition would seem to call for something halfway between conventional "declarative" and "imperative" styles. I also imagined the computer trying, in its free moments, to devise context-sensitive helpful suggestions for what to do next — without trying to take control of the proof task away from the human user. I've never been a fan of fully-automated proof, as such; in the early days of personal computing (as a commenter on another of my posts reminded me) we anticipated computers of the future would enhance our brain power, not attempt to replace it, and the enhancements weren't to be just increasing our ability to look things up, either.

Where does Fischer come into this? Well, Fischer not only deals with massive grids of coordinates, his notation looks extremely idiosyncratic to me, using different conventions than anything else I've seen. Perhaps a typical 1950s Swedish civil engineer would find much of it quite conventional. But, unless you spend all of your time in one narrow mathematical subcommunity, studying mathematics is a pretty heavily linguistic exercise, because every subcommunity has its own language and one is forever having to translate between them. Wouldn't it be nice to be able to just toggle some controls and switch between the way one author (such as Fischer) wrote, and the conventions used by whichever other author you prefer?

Btw, this software I'm describing? Not a minor interest. Not just a past interest. I still want it, all the more because, even though I've always felt it was doable and would be immensely valuable, afaict we're no closer to having it now than we were thirty years ago. Never assume that what you think is needed will be provided by somebody else. Think of it this way: if you can see it's doable and would be valuable, presumably you'd be more likely than most people to make it happen; so if you aren't going to the effort to make it happen, that's a sample of one suggesting that nobody else will go to the effort either. I've also never felt I could properly describe this software in words, so even if I were gifted with a team of programmers to implement it I couldn't tell them what to do; so I figured if it was going to happen I'd have to do it myself. Only, it looks like a huge project, so for one person to implement it would require a programming language with unprecedentedly vast abstractive power. By some strange coincidence, designing a programming language like that is something I've been striving for ever since.

Rotation

Another trail that, sooner or later, clearly needs to be explored is the relationship, at its most utterly abstract, between quaternions and rotation.

Hamilton was looking at rotations, from the start. Quaternions, as noted, stand in relation to matrices as waves to particles; in some profound sense, quaternions seem to be the essence of rotation. The ordinary understanding of quaternion division is that a quaternion is the ratio of two three-vectors, and the non-commutativity of quaternion multiplication then follows directly from recognition that rotations on a sphere produce different results if done in a different order. Even Silberstein, who was using biquaternions rather than real quaternions and was working in Minkowski rather than Euclidean spacetime, was doing rotations, which in itself suggests that what's going on is more than meets the eye.
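The order-dependence of rotations is easy to see in a few lines of Python. This is only a sketch of the standard textbook recipe (rotate a vector v by forming q v q⁻¹), not anything specific to Hamilton's or Silberstein's presentation. The helper names `qmul`, `conj` and `rotate` are made up here, and the quaternions are left unnormalized so the arithmetic stays exact.

```python
def qmul(a, b):
    """Hamilton product of 4-tuples (scalar, i1, i2, i3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    )


def conj(q):
    """Quaternion conjugate: negate the three imaginary parts."""
    return (q[0], -q[1], -q[2], -q[3])


def rotate(q, v):
    """Rotate the 3-vector v by a nonzero quaternion q, i.e. q v q^-1."""
    n = sum(c * c for c in q)              # squared norm of q
    w = qmul(qmul(q, (0, *v)), conj(q))    # q v conj(q) = n * (q v q^-1)
    return tuple(c / n for c in w[1:])


QX = (1, 1, 0, 0)   # quarter turn about the i1 axis (unnormalized)
QZ = (1, 0, 0, 1)   # quarter turn about the i3 axis (unnormalized)
v = (0, 1, 0)       # the unit vector along i2

x_then_z = rotate(QZ, rotate(QX, v))   # ends up along +i3
z_then_x = rotate(QX, rotate(QZ, v))   # ends up along -i1

# Doing QX first and QZ second is the single rotation by the product QZ*QX;
# the product in the other order is a different quaternion.
same_as_product = rotate(qmul(QZ, QX), v) == x_then_z
products_differ = qmul(QZ, QX) != qmul(QX, QZ)
```

The two orders send the same starting vector to different places, and the non-commutativity of the quaternion product is the same fact in algebraic form.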

This is a tricky point. The relationship between quaternions and rotation is readily explained, indeed rather trivialized, in terms of peculiarities of rotation in three-dimensional Euclidean space. This is very much the canonical view, the one embraced by Penrose. Real quaternions become a single case in a general framework, and are then easily dismissed as merely an aberration that loses its seeming specialness when the wider context is properly appreciated.

The weakness in this reasoning is that it depends on the choice of general framework. This would be easier to see if the framework involved were alternative rather than mainstream. Suppose there were two different general frameworks in which the specific case (here, quaternions) could be fit; and in one of these frameworks, the specific case appears incidental, while in the other framework it appears pivotal. It would then be hard to make a compelling case, based on the first framework, that the specific case is incidental, because the second framework would be right there calling that conclusion into question. If the first framework is the only one we know about, though, the same case can be quite persuasive. To even question the conclusion we'd have to imagine the possibility of an alternative framework; and actually finding such an alternative could be a formidable challenge. Especially with the possibility hanging over us that perhaps the alternative mightn't really exist after all.

Investigating this trail seems likely to become an intensive study in avoiding conceptual pitfalls while dowsing for new mathematics.

Minkowski

A narrow, hence more technically fraught, target for mathematical dowsing is Minkowski spacetime. Minkowski's decisive condemnation of a quaternionic approach —"too narrow and clumsy for the purpose"— is a standard quote on the subject, cited by quaternion opponents and proponents alike. If there is an alternative general framework to be found, after all, it'd have to handle Minkowski.

Without actually wading into this thing (not to be undertaken lightly), I can only note from a distance a few features that may be of interest when the time comes. The mechanical trouble in this is evidently to do with the pattern of signs, which seems reminiscent of the multiple variants of nabla (though the pessimist in me insists it can't be quite that easy); which, logically, oughtn't be applicable to the situation unless one were really already dealing with a derivative. Off hand, the only way that comes to mind for derivatives to come into it is if the whole physical infrastructure is something less obvious than what Minkowski was doing — which, yes, is cheating; and cheating (so to speak) is likely the only way to end up with a different answer than Minkowski did, so this might, just conceivably, be a hopeful development.
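To make the sign-pattern trouble concrete at its most elementary level, here is a small Python check. It is a textbook observation offered as illustration, not a reconstruction of Minkowski's objection or of any particular quaternionic formulation of relativity; the helper names are made up. For a real quaternion q = t + x i₁ + y i₂ + z i₃, the norm q q* has all plus signs, while the scalar part of q² carries the mixed signs of a spacetime interval.

```python
def qmul(a, b):
    """Hamilton product of 4-tuples (scalar, i1, i2, i3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    )


def conj(q):
    """Quaternion conjugate: negate the three imaginary parts."""
    return (q[0], -q[1], -q[2], -q[3])


t, x, y, z = 2, 3, 5, 7          # arbitrary sample values
q = (t, x, y, z)

norm = qmul(q, conj(q))          # (t^2 + x^2 + y^2 + z^2, 0, 0, 0)
scalar_of_square = qmul(q, q)[0] # t^2 - x^2 - y^2 - z^2
```

So the quantity the algebra multiplies nicely (the norm) has the Euclidean signs, and the quantity with the Minkowski-like signs is only one component of a square. That mismatch is one simple face of the difficulty; whether it is the face that matters for the speculation above is left open.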

Langlands

I wondered whether even to mention this. The geometric Langlands correspondence lies at the extreme wide end of mathematical dowsing targets; about as poetic as mathematics comes (which is very poetic indeed), and at the same time about as esoteric as it comes (yea, verily).

Mathematics in its final form is, of course, highly formal (I say "of course", but see my earlier remarks on axioms as a legacy of quaternions). The ideas don't start out formal though; and there's always lots of material that hasn't yet worked its way across to the formal side. Moreover, attempts to describe the poetry of mathematics for non-mathematicians, in my experience, ultimately fail because they're trying to do something that can't really be done: they're trying to divorce the (very real) poetic nature of mathematics from the technical nature of the subject, and in the end that fails because the true poetry is that the elegance arises from the technicalities.

Poking around on the internet, I found a discussion on Quora from a few years ago on the question "Can the Langlands Program be described in layman's terms?" There were some earnest attempts that ultimately devolved into technical arcana; but my favorite answer, offered by a couple of respondents, was in essence: no.

My own hand-wavy assessment: Robert Langlands conjectured broad, deep connections between the seemingly distant mathematical subjects of number theory and algebraic geometry. Especially distant in that, poetically speaking, number theory is a flavor of "discrete" math, while algebraic geometry is toward the continuous side of things. (I riffed on the discrete/continuous balance in physics some time back.) An especially high-publicity result fitting within this vast program was Andrew Wiles's proof of Fermat's Last Theorem, which hinged on proving a conjecture about elliptic curves.

Why would I even bring up such a thing? The Langlands program has gotten tangled up, in this century, with supersymmetry in physics; and the geometric side of Langlands is about complex curves. In effect, Langlands biases mathematical speculations toward further enhancing the reputation of complex numbers. So if one suspects physics may also lean toward the quaternionic, and one is also looking for interesting mathematical properties of quaternions, it seems fair game to ask whether quaternions can play into some variation on Langlands.

Structural insight

Thursday, April 18, 2019

Nabla

ab	=	(a₀b₀ − a₁b₁ − a₂b₂ − a₃b₃)
		+ i₁ (a₀b₁ + a₁b₀ + a₂b₃ − a₃b₂)
		+ i₂ (a₀b₂ − a₁b₃ + a₂b₀ + a₃b₁)
		+ i₃ (a₀b₃ + a₁b₂ − a₂b₁ + a₃b₀)	.
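The component formula above transcribes directly into code. The following is a minimal Python sketch, with quaternions stored as 4-tuples (a₀, a₁, a₂, a₃); the function name `qmul` and the constants `I1`, `I2`, `I3` are just labels chosen for this example.

```python
def qmul(a, b):
    """Quaternion product ab, term by term as in the formula above."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0*b0 - a1*b1 - a2*b2 - a3*b3,   # scalar part
        a0*b1 + a1*b0 + a2*b3 - a3*b2,   # coefficient of i1
        a0*b2 - a1*b3 + a2*b0 + a3*b1,   # coefficient of i2
        a0*b3 + a1*b2 - a2*b1 + a3*b0,   # coefficient of i3
    )


I1 = (0, 1, 0, 0)
I2 = (0, 0, 1, 0)
I3 = (0, 0, 0, 1)

i1_squared = qmul(I1, I1)   # (-1, 0, 0, 0)
i1_i2 = qmul(I1, I2)        # (0, 0, 0, 1), i.e. i3
i2_i1 = qmul(I2, I1)        # (0, 0, 0, -1), i.e. -i3
```

Feeding in the basis elements recovers the familiar rules: each imaginary squares to −1, and swapping the order of two different imaginaries flips the sign of their product.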