Tuesday, July 31, 2018

Co-hygiene and emergent quantum mechanics

Thus quantum mechanics occupies a very unusual place among physical theories:  it contains classical mechanics as a limiting case, yet at the same time it requires this limiting case for its own formulation.
Lev Landau and Evgeny Lifshitz, Quantum Mechanics: Non-relativistic Theory (3rd edition, 1977, afaik).

Gradually, across a series of posts exploring alternative structures for a basic theory of physics, I've been trying to tease together a strategy wherein quantum mechanics is, rather than a nondeterministic foundation of reality, an approximation valid for sufficiently small systems.  This post considers how one might devise a concrete mathematical demonstration that the strategy can actually work.

I came into all this with a gnawing sense that modern physics had taken a conceptual wrong turn somewhere, that it had made some —unidentified— incautious structural assumption that ought not have been made and was leading it further and further astray.  (I explored the philosophy of this at some depth in an earlier post in the series, several years ago by now.)  The larger agenda here is to shake up our thinking on basic physics, accumulating different ways to structure theories so that our structural choices are made with eyes open, rather than just because we can't imagine an alternative.  The particular notion I'm stalking atm —woven around the concept of co-hygiene, to be explained below— is, in its essence, that quantum mechanics might be an approximation, just as Newtonian mechanics is, and that the quantum approximation may be a consequence of the systems-of-interest being almost infinitesimally small compared to the cosmos as a whole.  Quantum mechanics suggests that all the elementary parts of the cosmos are connected to all the other elementary parts, which is clearly not conducive to practical calculations.  In the model I'm pursuing, each element is connected to just a comparatively few others, and the whole jostles about, with each adjustment to an element shuffling its remote connections so that over many adjustments the element gets exposed to many other elements.  Conjecturally, if a sufficiently small system interacts in this way with a sufficiently vast cosmos, the resulting behavior of the small system could look a lot like nondeterminism.

The question is, could it look like quantum mechanics?

As I've remarked before, my usual approach to these sorts of posts is to lift down off my metaphorical shelf the assorted fragments I've got on the topic of interest; lay out the pieces on the table, adding at the same time any new bits I've lately collected; inspect them all severally and collectively, rearranging them and looking for new patterns as I see them all afresh; and record my trail of thought as I do so.  Sometimes I find that since the last time I visited things, my whole perception of them has shifted (I was, for example, struck in a recent post by how profoundly my perception of Church's λ-calculus has changed just in the past several years).  Hopefully I glean a few new insights from the fresh inspection, some of which find their way into the new groupings destined to go back up on the shelf to await the next time, while some other, more speculative branches of reasoning that don't make it into my main stream of thought are preserved in my record for possible later pursuit.

Moreover, each iteration achieves focus by developing some particular theme within its line of speculation; some details of previous iterations are winnowed away to allow an uncluttered view of the current theme; and once the new iteration reaches its more-or-less-coherent insights, such as they are, a reset is then wanted, to unclutter the next iteration.  Most of the posts in this series —with a couple of exceptions (1, 2)— have focused on the broad structure of the cosmos, touching only lightly on concrete mathematics of modern physics that, after all, I've suspected from the start of favoring incautious structural assumptions.  This incremental shifting between posts is why, within my larger series on physics, the current post has a transitional focus:  reviewing the chosen cosmological structure in order to apply it to the abstract structure of the mathematics, preparing from abstract ground to launch an assault on the concrete.

Though I'll reach a few conclusions here —oriented especially toward guidance for the next installment in the series— much of this is going to dwell on reasons why the problem is difficult, which if one isn't careful could create a certain pessimism toward the whole prospect.  I'm moderately optimistic that the problem can be pried open, over a sufficient number of patient iterations of study.  The formidable appearance of a mountain in-the-large oughtn't prevent us from looking for a way to climb it.

Primitive wave functions
Probability distributions
Quantum/classical interface
The universe says 'hi'
The upper box

The schematic mathematical model I'm considering takes the cosmos to be a vast system of parts with two kinds of connections between them:  local (geometry), and non-local (network).  The system evolves by discrete transformational steps, which I conjecture may be selected based entirely on local criteria but, once selected, may draw information from both local and non-local connections and may have both local and non-local effects.  The local part of all this would likely resemble classical physics.

When a transformation step is applied, its local effect must be handled in a way that doesn't corrupt the non-local network; that's called hygiene.  If the non-local effect of a step doesn't perturb pre-existing local geometry, I call that co-hygiene.  Transformation steps are not required in general to be co-hygienic; but if they are, then local geometry is only affected by local transformation steps, giving the steps a close apparent affinity with the local geometry, and I conjectured this could explain why gravity seems more integrated with spacetime than do the other fundamental forces.  (Indeed, wondering why gravity would differ from the other fundamental forces was what led me into the whole avenue of exploration in the first place.)

Along the way, though, I also wondered if the non-local network could explain why the system deviated from "classical" behavior.  Here I hit on an idea that offered a specific reason why quantum mechanics might be an approximation that works for very small systems.  My inspiration for this sort of mathematical model was a class of variant λ-calculi (in fact, λ-calculus is co-hygienic, while in my dissertation I studied variant calculi that introduce non-co-hygienic operations to handle side-effects); and in those variant calculi, the non-local network topology is highly volatile.  That is, each time a small subsystem interacts non-locally with the rest of the system, it may end up with different network neighbors than it had before.  This means that if you're looking at a subsystem that is smaller than the whole system by a cosmically vast amount (say, if the system as a whole is larger than the subsystem by a factor of 10⁷⁰ or 10⁸⁰), you might perform a very large number of non-local interactions and never interact with the same network-neighbor twice.  It would be, approximately, as if there were an endless supply of other parts of the system for you to interact non-locally with, making the non-local interactions look rather random.
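
The scale argument can be sanity-checked with a birthday-problem estimate.  What follows is only a rough sketch, and it rests on an assumption the model itself does not make explicit:  that each non-local interaction picks its network neighbor uniformly at random from the rest of the system.  The function name and the sample numbers are hypothetical.

```python
import math

def repeat_probability(interactions: int, system_size: int) -> float:
    """Approximate chance that at least one network neighbor is met twice.

    Assumes (for illustration only) that each interaction draws a neighbor
    uniformly at random from `system_size` candidates, and uses the standard
    birthday-problem approximation 1 - exp(-k(k-1)/(2N)).
    """
    k, n = interactions, system_size
    exponent = k * (k - 1) / (2 * n)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-exponent)

# Hypothetical numbers: 10**30 interactions inside a system of 10**80 parts.
print(repeat_probability(10**30, 10**80))
```

Under that uniform-draw assumption, the chance of a repeat stays negligible until the number of interactions approaches the square root of the system size, which is the sense in which the supply of fresh neighbors looks endless.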

Without the network-scrambling, non-locality alone would not cause this sort of seeming-randomness.  The subsystem of interest could "learn" about its network neighbors through repeated interaction with them, and they would become effectively just part of its internal state.  Thus, the network-scrambling, together with the assumption that the system is vastly larger than the subsystem, would seem to allow the introduction of an element of effective nondeterminism into the model.

But, is it actually useful to introduce an element of effective nondeterminism into the model?  Notwithstanding Einstein's remark about whether or not God plays dice, if you start with a classical system and naively introduce a random classical element into it, you don't end up with a quantum wave function.  (There is a vein of research, broadly called stochastic electrodynamics, that seeks to derive quantum effects from classical electrodynamics with random zero-point radiation on the order of Planck's constant, but apparently they're having trouble accounting for some quantum effects, such as quantum interference.)  To turn this seeming-nondeterminism to the purpose would require some more nuanced tactic.

There is, btw, an interesting element of flexibility in the sort of effective-nondeterminism introduced:  The sort of mathematical model I'm conjecturing has deterministic rules, so conceivably there could be some sort of invariant properties across successive rearrangements of the network topology.  Thus, some kinds of non-local influences could be seemingly-random while others might, at least under some particular kinds of transformation (such as, under a particular fundamental force), be constant.  The subsystem of interest could "learn" these invariants through repeated interactions, even though other factors would remain unlearnable.  In effect, these invariants would be part of the state of the subsystem, information that one would include in a description of the subsystem but that, in the underlying mathematical model, would be distributed across the network.

Primitive wave functions

Suppose we're considering some very small physical system, say a single electron in a potential field.

A potential field, as I suggested in a previous post, is a simple summation of combined influences of the rest of the cosmos on the system of interest, in this case our single electron.  Classically —and under Relativity— the potential field would tell us nothing about non-local influences on the electron.  In this sort of simple quantum-mechanical exercise, the potential field used is, apparently, classical.

The mathematical model in conventional quantum mechanics posits, as its underlying reality, a wave function — a complex- (or quaternion-, or whatever-) valued field over the state space of the system, obeying some wave equation such as Schrödinger's,

iℏ ∂Ψ/∂t  =  Ĥ Ψ .
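
For the single electron in a potential field, the textbook non-relativistic form of Schrödinger's equation can be written out in full.  The symbols m (the electron's mass), V (the potential) and r (position) are standard notation assumed here, not something defined elsewhere in this post.

```latex
i\hbar \frac{\partial \Psi(\mathbf{r}, t)}{\partial t}
  = \hat{H}\,\Psi(\mathbf{r}, t),
\qquad
\hat{H} = -\frac{\hbar^{2}}{2m}\nabla^{2} + V(\mathbf{r})
```

The two terms of Ĥ mirror the classical energy, kinetic p²/2m plus potential V, which is the sense in which Ĥ carries the classical description of the system.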

This posited underlying reality has no electron in the classical sense of something that has a precise position and momentum at each given time; the wave function is what's "really" there, and any observation we would understand as measuring the position or momentum of the electron is actually drawing on the information contained in the wave function.

While the wave function evolves deterministically, the mathematical model as a whole presents a nondeterministic theory.  This nondeterminism is not a necessary feature of the theory.  An alternative mathematical model exists, giving exactly the same predictions, in which there is an electron there in the classical sense, with precise position and momentum at each given time.  Of course its position and momentum can't be simultaneously known by an observer (which would violate the Heisenberg uncertainty principle); but in the underlying model the electron does have those unobservable attributes.  David Bohm published this model in 1952.  However Bohm's model doesn't seem to have offered anything except a demonstration that quantum theory does not prohibit the existence of an unobservable deterministic classical electron.  In Bohm's model, the electron had a definite position and momentum, yes, but it was acted on by a "pilot wave" that, in essence, obeyed Schrödinger's equation.  And Schrödinger's equation is non-local, in the sense that not only does it allow information (unobservable information) to propagate faster than light, it allows it to "propagate" infinitely fast; the hidden information in the wave function does not really propagate "through" space, it just shows up wherever the equation says it should.  Some years later, Bell's Theorem would show that this sort of non-locality is a necessary feature of any theory that always gives the same predictions as quantum mechanics (given some other assumptions, one of which I'm violating; I'll get back to that below); but my main point atm is that Bohm's model doesn't offer any new way of looking at the wave function itself.  You still have to just accept the wave function as a primitive; Bohm merely adds an extra stage of reasoning in understanding how the wave function applies to real situations.  If there's any practical, as opposed to philosophical, advantage to using Bohm's model, it must be a subtle one.  
Nevertheless, it does reassure us that there is no prohibition against a model in which the electron is a definite, deterministic thing in the classical sense.

The sort of model I'm looking for would have two important differences from Bohm's.

First, the wave function would not be primitive at all, but instead would be a consequence of the way the local-geometric aspect of the cosmos is distorted by the new machinery I'm introducing.  The Schrödinger equation, above, seems to have just this sort of structure, with Ĥ embodying the classical behavior of the system while the rest of the equation is the shape of the distorting lens through which the classical behavior passes to produce its quantum behavior.  The trick is to imagine any sensible way of understanding this distorting lens as a consequence of some deeper representation (keeping in mind that the local-geometric aspect of the cosmos needn't be classical physics as such, though this would be one's first guess).

A model with different primitives is very likely to lead to different questions; to conjure a quote from Richard Feynman, "by putting the theory in a certain kind of framework you get an idea of what to change".  Hence a theory in which the wave function is not primitive could offer valuable fresh perspective even if it isn't in itself experimentally distinguishable from quantum mechanics.  There's also the matter of equivalent mathematical models that are easier or harder to apply to particular problems — conventional quantum mechanics is frankly hard to apply to almost any problem, so it's not hard to imagine an equivalent theory with different primitives could make some problems more tractable.

Second, the model I'm looking for wouldn't, at least not necessarily, always produce the same predictions as quantum mechanics.  I'm supposing it would produce the same predictions for systems practically infinitesimal compared to the size of the cosmos.  Whether or not the model would make experimentally distinguishable predictions from quantum mechanics at a cosmological scale, would seem to depend on how much, or little, we could work out about the non-local-network part of the model; perhaps we'd end up with an incomplete model where the network part of it is just unknown, and we'd be none the wiser (but for increased skepticism about some quantum predictions), or perhaps we'd find enough structural clues to conjecture a more specific model.  Just possibly, we'd end up with some cosmological questions to distinguish possible network structures, which (as usual with questions) could be highly fruitful regardless of whether the speculations that led to the questions were to go down in flames, or, less spectacularly, were to produce all the same predictions as quantum mechanics after all.

Probability distributions

Wave functions have always made me think of probability distributions, as if there ought to be some deterministic thing underneath whose distribution of possible states is generating the wave function.  What's missing is any explanation of how to generate a wave-function-like thing from a classical probability distribution.  (Not to lose track of the terminology, this is classical in the sense of classical probability, which in turn is based on classical logic, rather than classical physics as such.  Though they all come down to us from the late nineteenth century, and complement each other.)

A classical probability distribution, as such, is fairly distinctive.  You have an observable with a range of possible values, and you have a range of possible worlds each of which induces an observable value.  Each possible world has a non-negative-real likelihood.  The (unnormalized) probability distribution for the observable is a curve over the range of observable values, summing for each observable value the likelihoods of all possible worlds that yield that observable value.  The probability of the observable falling in a certain interval is the area under the curve over that interval, divided by the area under the curve over the entire range of observable values.  If you add together two mutually disjoint sets of possibilities, the areas under their curves simply add, since for each observable value the set of possible worlds yielding it is just the ones in the first set and the ones in the second set.
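
As a minimal sketch of that pattern, here is a discrete toy version, with a handful of observable values standing in for a continuous curve and sums standing in for areas.  The world lists and their likelihoods are made-up numbers, and the function names are hypothetical.

```python
from collections import defaultdict

def distribution(worlds):
    """Unnormalized distribution: sum the likelihoods of worlds per observable value.

    `worlds` is an iterable of (observable_value, likelihood) pairs, one per
    possible world; likelihoods must be non-negative reals.
    """
    curve = defaultdict(float)
    for value, likelihood in worlds:
        if likelihood < 0:
            raise ValueError("classical likelihoods must be non-negative")
        curve[value] += likelihood
    return dict(curve)

def probability(curve, values):
    """Probability that the observable falls in `values`, normalized by the total."""
    total = sum(curve.values())
    return sum(curve.get(v, 0.0) for v in values) / total

# Two mutually disjoint sets of possible worlds (toy numbers, purely illustrative).
set_a = [("left", 1.0), ("right", 3.0)]
set_b = [("left", 2.0), ("right", 2.0)]

curve_a = distribution(set_a)
curve_b = distribution(set_b)
curve_ab = distribution(set_a + set_b)

# Combining disjoint sets simply adds the curves, value by value.
assert all(curve_ab[v] == curve_a[v] + curve_b[v] for v in curve_ab)
print(probability(curve_ab, ["left"]))
```

Because every likelihood is non-negative, adding possibilities can only raise the curve; nothing in this structure lets one possibility subtract from another.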

The trouble is, that distinctive pattern of a classical probability distribution is not how wave functions work.  When you add together two wave functions, the two curves get added all right, but the values aren't unsigned reals; they can cancel each other, producing an interference pattern as in classic electron diffraction.  (I demonstrated the essential role of cancellation, and a very few other structural elements, in quantum mechanical behavior in a recent post.)  As an additional plot twist, the wave function values add, but the probability isn't their sum but (traditionally) the square of the magnitude of their sum.
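
The contrast with the classical rule can be shown with two complex amplitudes.  This is a bare-bones illustration, not a model of any particular experiment:  the amplitudes and phases are chosen by hand, and `born_probability` is a hypothetical helper name for the traditional squared-magnitude rule.

```python
import cmath
import math

def born_probability(*amplitudes: complex) -> float:
    """Unnormalized probability: squared magnitude of the summed amplitudes."""
    return abs(sum(amplitudes)) ** 2

a = 1 / math.sqrt(2)                                # first contribution, phase 0
b_in_phase = cmath.rect(1 / math.sqrt(2), 0.0)      # second contribution, same phase
b_opposed = cmath.rect(1 / math.sqrt(2), math.pi)   # second contribution, opposite phase

print(born_probability(a, b_in_phase))    # amplitudes reinforce each other
print(born_probability(a, b_opposed))     # amplitudes cancel each other
print(born_probability(a) + born_probability(b_opposed))  # what a classical sum would give
```

The third line is the classical-style answer, adding the two contributions' separate probabilities; the first two lines show how far the wave-function rule can depart from it in either direction, depending only on relative phase.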

One solution is to reject classical logic, since classical logic gives rise to the addition rule for deterministic probability distributions.  Just say the classical notion of logical disjunction (and conjunction, etc.) is wrong, and quantum logic is the way reality works.  While you're at it, invoke the idea that the world doesn't have to make sense to us (I've remarked before on my dim view of the things beyond mortal comprehension trope).  Whatever its philosophical merits or demerits, this approach doesn't fit the current context for two reasons:  it treats the wave function as primitive whereas we're interested in alternative primitives, so it doesn't appear to get us anywhere new/useful; and, even if it did get us somewhere useful (which it apparently doesn't), it's not the class of mathematical model I'm exploring here.  I'm pursuing a mathematical model spiritually descended from λ-calculus, which is very much in the classical deterministic tradition.

So, we're looking for a way to derive a wave function from a classical probability distribution.  One has to be very canny about approaching something like this.  It's not plausible this would be untrodden territory; the strategy would naturally suggest itself, and lots of very smart, highly trained physicists with strong motive to consider it have had nearly a century in which to do so.  Yet, frankly, if anyone had succeeded it ought to be well-known in alternative-QM circles, and I'd hope to have at least heard of it.  So going into the thing one should apply a sort of lamppost principle, and ask what one is bringing to the table that could possibly allow one to succeed where they did not.  (A typical version of the lamppost principle would say, if you've lost your keys at night somewhere on a dark street with a single lamppost, you should look for them near the lamppost since your chances of finding them if they're somewhere else are negligible.  Here, to mix the metaphors, the something-new you bring to the table is the location of your lamppost.)

I'm still boggled by how close the frontier of human knowledge is.  In high school I chose computer science for a college major partly (though only partly) because it seemed to me like there was so much mathematics you could spend a lifetime on it without reaching the frontier — and yet, by my sophomore year in college I was exploring extracurricularly some odd corner of mathematics (I forget what, now) that had clearly never been explored before.  And now I'm recently disembarked from a partly-mathematical dissertation; a doctoral dissertation being, rather by definition, stuff nobody has ever done before.  The idea that the math I was doing in my dissertation was something nobody had ever done before, is just freaky.  At any rate, I'm bringing to this puzzle in physics a mathematical perspective that's not only unusual for physics, but unique even in the branch of mathematics I brought it from.

The particular mathematical tools I'm mainly trying to apply are:

  • "metatime" (or whatever else one wants to call it), over which the cosmos evolves by discrete transformation steps.  This is the thing I'm doing that breaks the conditions for Bell's Theorem; but all I've shown so far, in an earlier post, is that it works for reshaping a uniform probability distribution into one that violates Bell's Inequality, whereas now we're not just reshaping a particular distribution but trying to mess with the rules by which distributions combine.

    My earlier post on metatime was explicitly concerned with the fact that quantum-mechanical predictions, while non-local with respect to time, could still be local with respect to some orthogonal dimension ("metatime").  Atm I'm not centrally interested in strict locality with respect to metatime; but metatime still interests me as a potentially useful tactic for a mathematical model, offering a smooth way to convert a classical probability distribution into time-non-locality.

  • transformation steps that aggressively scramble non-local network topology.  This seems capable of supplying classical nondeterminism (apparently, on a small scale); but the apparent nondeterminism we're after isn't classical.

  • a broad notion that the math will stop looking like a wave function whenever the network scrambling ceases to sufficiently approximate classical nondeterminism (which ought to happen at large scales).  But this only suggests that the nondeterminism would be a necessary ingredient in extracting a wave function, without giving any hint of what would replace the wave function when the approximation fails.

These are some prominent new things I'm bringing to the table.  At least the second and third are new.  Metatime is a hot topic atm, under a different name (pseudo-time, I think), as a device of the transactional interpretation of QM (TI).  Advocates recommend TI as eliminating the conceptual anomalies and problems of other interpretations — EPR paradox, Schrödinger's cat, etc. — which bodes well for the utility of metatime here.  I don't figure TI bears directly on the current purpose though because, as best I can tell, TI retains the primitive wave function.  (TI does make another cameo appearance, below.)

On the problem of deriving the wave function, I don't know of any previous work to draw on.  There certainly could be something out there I've simply not happened to cross paths with, but I'm not sanguine of finding such; for the most part, the subject suffers from a common problem of extra-paradigm scientific explorations:  researchers comparing the current paradigm to its predecessor are very likely to come to the subject with intense bias.  Researchers within the paradigm take pains to show that the old paradigm is wrong; researchers outside the paradigm are few and idiosyncratic, likely to be stuck on either the old paradigm or some other peculiar idea.

The bias by researchers within the paradigm, btw, is an important survival adaptation of the scientific species.  The great effectiveness of paradigm science — which benefits its evolutionary success — is in enabling researchers to focus sharply on problems within the paradigm by eliminating distracting questions about the merits of the paradigm; and therefore those distracting questions have to be crushed decisively whenever they arise.  It's hard to say whether this bias is stronger in the first generation of scientists under a paradigm, who have to get it moving against resistance from its predecessor, or amongst their successors trained within the zealous framework inherited from the first generation; either way, the bias tends to produce a dearth of past research that would aid my current purpose.

A particularly active, and biased, area of extra-paradigm science is no-go theorems, theorems proving that certain alternatives to the prevailing paradigm cannot be made to work (cf. old post yonder).  Researchers within the paradigm want no-go theorems to crush extra-paradigm alternatives once and for all, and proponents of that sort of crushing agenda are likely, in their enthusiasm, to overlook cases not covered by the formal no-go-result.  Extra-paradigm researchers, in contrast, are likely to ferret out cases not covered by the result and concentrate on those cases, treating the no-go theorems as helpful hints on how to build alternative ideas rather than discouragement from doing so.  The paradigm researchers are likely to respond poorly to this, and accuse the alternative-seekers of being more concerned with rejecting the paradigm than with any particular alternative.  The whole exchange is likely to generate much more heat than light.

Quantum/classical interface

A classical probability distribution is made up of possibilities.  One of them is, and the others are not; we merely don't know which one is.  This is important because it means there's no way these possibilities could ever interact with each other; the one that is has nothing to interact with because in fact there are no other possibilities.  That is, the other possibilities aren't; they exist only in our minds.  This non-interaction is what makes the probability distribution classical.  Therefore, in considering ways to derive our wave function from classical probability distributions, any two things in the wave function that interact with each other do not correspond to different classical possibilities.

It follows that quantum states — those things that can be superposed, interfere with each other, and partly cancel each other out — are not separated by a boundary between different classical possibilities.  This does not, on the face of it, prohibit superposable elements from being prior or orthogonal to such boundaries, so that the mathematical model superposes entities of some sort and then applies them to a classical probability distribution (or applies the distribution to them).  Also keep in mind, though we're striving for a model in which the wave function isn't primitive, we haven't pinned down yet what is primitive.

Now, the wave function isn't a thing.  It isn't observable, and we introduce it into the mathematics only because it's useful.  So if it also isn't primitive, one has to wonder whether it's even needed in the mathematics, or whether perhaps we're simply to replace it by something else.  To get a handle on this, we need to look at how the wave function is actually used in applying quantum mechanics to physical systems; after all, one can't very well fashion a replacement for one part of a machine unless one understands how that part interacts with the rest of the machine.

The entire subject of quantum mechanics appears imho to be filled with over-interpretation; to the extent any progress has been made in understanding quantum mechanics over the past nearly-a-century, it's consisted largely in learning to prune unnecessary metaphysical underbrush so one has a somewhat better view of the theory.

The earliest, conventional "interpretation" of QM, the "Copenhagen interpretation", says properties of the physical system don't exist until observed.  This, to be brutally honest, looks to me like a metaphysical statement without practical meaning.  There is a related, but more practical, concept called contextuality; and an associated — though unfortunately technically messy — no-go theorem called the Kochen–Specker theorem, a.k.a. the Bell–Kochen–Specker theorem.  This all relates to the Heisenberg uncertainty principle, which says that you can't know the exact position and momentum of a particle at the same time; the more you know about its position, the less you can know about its momentum, and vice versa.  One might think this would be because the only way to measure the particle's position or momentum is to interact with it, which alters the particle because, well, because to every action there is an equal and opposite reaction.  However, in the practical application of the wave function to a quantum-mechanical system, there doesn't appear to be any experimental apparatus within the quantum system for the equal-and-opposite-reaction to apply to.  Instead, there's simply a wave function and then it collapses.  Depending on what you choose to observe (say, the position or the momentum), it collapses differently, so that the unobservable internal state of the system actually remembers which you chose to observe.  This property, that the (unobservable) internal state of the system changes as a result of what you choose to measure about it, is contextuality; and the Kochen–Specker theorem says a classical hidden-variable theory, consistent with QM, must be contextual (much as Bell's Theorem says it must be non-local).  Remember Bohm's hidden-variable theory, in which the particle does have an unobservable exact position and momentum?  Yeah.  
Besides being rampantly non-local, Bohm's model is also contextual:  the particle's (unobservable, exact) position and momentum are guided by the wave function, and the wave-function is perturbed by the choice of measurement, therefore the particle's (unobservable, exact) position and momentum are also perturbed by the choice of measurement.

Bell, being of a later generation than Bohr and Einstein (and thus, perhaps, less invested in pre-quantum metaphysical ideas), managed not to be distracted by questions of what is or isn't "really there".  His take on the situation was that the difficulty was in how to handle the interface between quantum reality and classical reality — not philosophically, but practically.  To see this, consider the basic elements of an exercise in traditional QM (non-relativistic, driven by Schrödinger's equation):

  • A set of parameters defines the classical state of the system; these become inputs to the wave equation.

  • A Hamiltonian operator Ĥ embodies the classical dynamics of the system.

  • Schrödinger's equation provides quantum distortion of the classical system.

  • A Hermitian operator called an "observable" embodies the experimental apparatus used to observe the system.  The wave function collapses to an eigenstate of the observable.
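
The four elements can be lined up in a toy two-level system.  This is a sketch under loud assumptions, none of which come from the post:  units with ℏ = 1, a Hamiltonian Ĥ = ω σ_x whose evolution has a closed form, the observable σ_z, and a single number ω standing in (loosely) for the classical parameters.  All names are hypothetical.

```python
import math
import random

def evolve(t: float, omega: float):
    """State at time t for H = omega * sigma_x (hbar = 1), starting in |0>.

    For this Hamiltonian, Schrodinger's equation has the closed-form
    solution psi(t) = (cos(omega t), -i sin(omega t)).
    """
    return (complex(math.cos(omega * t), 0.0),
            complex(0.0, -math.sin(omega * t)))

def measure_z(psi, rng: random.Random):
    """Observe sigma_z: choose an eigenstate by the Born rule, then collapse to it."""
    p0 = abs(psi[0]) ** 2
    outcome = 0 if rng.random() < p0 else 1
    collapsed = (1 + 0j, 0j) if outcome == 0 else (0j, 1 + 0j)
    return outcome, collapsed

rng = random.Random(0)
omega = 1.0                               # stand-in for the classical parameters
psi = evolve(math.pi / 4, omega)          # deterministic evolution of the wave function
outcome, psi_after = measure_z(psi, rng)  # the observable collapses the state
print(abs(psi[0]) ** 2, abs(psi[1]) ** 2, outcome)
```

Everything up to `measure_z` is deterministic; the random draw and the discrete outcome enter only at the observable.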

The observable is the interface between the quantum system and the classical world of the physicist; and Bell ascribes the difficulty to this interface.  Consider a standard double-slit experiment in which an electron gun fires electrons one at a time through the double slit at a CRT screen where each electron causes a scintillation.  As long as you don't observe which slit the electron passes through, you get an interference pattern from the wave function passing through the two slits, and that is quantum behavior; but there's nothing in the wave function to suggest the discreteness of the resulting scintillation.  That discreteness results from the wave function collapse due to the observable, the interface with classical physics — and that discreteness is an essential part of the described physical reality.  Scan that again:  in order to fully account for physical reality, the quantum system has to encompass only a part of reality, because the discrete aspect of reality is only provided by the interface between the quantum system and surrounding classical physics.  It seems that we couldn't describe the entire universe using QM even if we wanted to because, without a classical observable to collapse the wave function, the discrete aspect of physical reality would be missing.  (Notice, this account of the difficulty is essentially structural, with only the arbitrary use of the term observable for the Hermitian operator as a vestige of the history of philosophical angst over the "role of the observer".  It's not that there isn't a problem, but that presenting the problem as if it were philosophical only gets in the way of resolving it.)
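
The gap between the smooth wave function and the discrete scintillations can be made concrete with a small simulation.  This is a sketch under simplifying assumptions of my own choosing:  two equal-strength point sources, no fall-off of amplitude with distance, arbitrary units for the slit gap, wavelength and screen distance, and a screen divided into bins.  The function and parameter names are hypothetical.

```python
import cmath
import math
import random

def two_slit_intensity(x: float, slit_gap: float = 1.0, wavelength: float = 0.2,
                       distance: float = 10.0) -> float:
    """|psi1 + psi2|^2 at screen position x, for two equal-strength point sources."""
    k = 2 * math.pi / wavelength
    r1 = math.hypot(distance, x - slit_gap / 2)   # path length from the first slit
    r2 = math.hypot(distance, x + slit_gap / 2)   # path length from the second slit
    psi = cmath.exp(1j * k * r1) + cmath.exp(1j * k * r2)
    return abs(psi) ** 2

def sample_hits(n: int, rng: random.Random, half_width: float = 5.0, bins: int = 200):
    """Draw n discrete screen positions with probability proportional to intensity."""
    xs = [-half_width + (i + 0.5) * 2 * half_width / bins for i in range(bins)]
    weights = [two_slit_intensity(x) for x in xs]
    return rng.choices(xs, weights=weights, k=n)

hits = sample_hits(1000, random.Random(1))
print(hits[:5])
```

Note where the discreteness comes from:  `two_slit_intensity` yields only a smooth curve, and the individual hits appear solely because `sample_hits` was bolted on by hand, playing the part of the observable.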

The many-worlds interpretation of QM (MWI) says that the wave function does not, in fact, collapse, but instead the entire universe branches into multiples for the different possibilities described by the wave function.  Bell objected that, while this is commonly presented as supposing that the wave function is "all there is", in fact it arbitrarily adds the missing discreteness:

the extended wave does not simply fail to specify one of the possibilities as actual...it fails to list the possibilities.  When the M‍WI postulates the existence of many worlds in each of which the photographic plate is blackened at particular position, it adds, surreptitiously, the missing classification of possibilities.  And it does so in an imprecise way, for the notion of the position of a black spot (it is not a mathematical point) [...] [or] reading of any macro‍scope instrument, is not mathematically sharp.  One is given no idea of how far down towards the atomic scale the splitting of the world into branch worlds penetrates.
— J.S. Bell, "Six possible worlds of quantum mechanics", Speakable and unspeakable in quantum mechanics (anthology), 1993.
I'm inclined to agree:  whatever philosophical comfort the M‍WI might provide to its adherents, it doesn't clarify the practical situation, and adds a great deal of conceptual machinery in the process of not doing so.

The transactional "interpretation" of QM is, afaik, somewhat lower-to-the-ground metaphysically.  To my understanding, TI keeps everything in quantum form, and posits that spacetime events interact through a "quantum handshake":  a wave propagates forward in time from an emission event, while another propagates backward in time from the corresponding absorption event, and they form a standing wave between the two while backward waves cancel out before the emission and forward waves cancel after the absorption.  Proponents of the TI report that it causes the various paradoxes and conceptual anomalies of QM to disappear (cf. striking natural structure), and this makes sense to me because the "observable" Hermitian operator should be thus neatly accounted for as representing half of a quantum handshake, in which the "observer" half of the handshake is not part of the particular system under study.  Wherever we choose to put the boundary of the system under study, the interface to our experimental apparatus would naturally have this half-a-handshake shape.

The practical lesson from the transactional interpretation seems to be that, for purposes of modeling QM, we don't have to worry about the wave function collapsing.  If we can replicate the wave function, we're in.  Likewise if we can replicate the classical probability distributions that the wave function generates, so long as this includes all the probability distributions that result from weird quantum correlations (spooky action-at-a-distance).  That the latter suffices should be obvious, since generating those probability distributions is the whole point of quantum theory; that the latter is possible is demonstrated by Bohm's hidden-variable theory (sometimes called the "Bohm Interpretation" by those focusing on its philosophy).


There is something odd about the above list of basic elements of a QM exercise, when compared to the rewriting-calculus-inspired model we're trying to apply to it.  When one thinks of a calculus term, it's a very concrete thing, with a specific representation (in fact over-specific, so that maintaining it may require α-renaming to prevent specific name choices from disrupting hygiene); and even classical physics seems to present a rather concrete representation.  But the quantum distortion of the wave equation apparently applies to whatever description of a physical system we choose; to any choice of parameters and Ĥ, regardless of whether it bears any resemblance to classical physics.  It certainly isn't specific to the representation of any single elementary unit, since it doesn't even blink (metaphorically) at shifting application from a one-electron to a two-electron system.

This suggests, to me anyway, two things.  On the negative/cautionary side, it suggests a lack of information from which to choose a concrete representation for the "local" part of a physical system, which one might have thought would be the most straightforward and stable part of a cosmological "term".  Perhaps more to the point, though, on the positive, insight-aiding side it suggests that if the quantum distortion is caused by some sort of non-local network playing out through rewrites in a dimension orthogonal to spacetime, we should consider trying to construct machinery for it that doesn't depend, much, on the particular shape of the local representation.  If our distortion machinery does place some sort of constraints on local representation, they'd better be constraints that say something true about physics.  Not forgetting, we expect our machinery to notice the difference between gravity and the other fundamental forces.

My most immediate goal, though, lest we forget, is to reckon whether it's at all possible any such machinery can produce the right sort of quantum distortion:  a sanity check.  Clues to the sort of thing one ought to look for are extremely valuable; but, having assimilated those clues, I don't atm require a full-blown theory, just a sense of what sort of thing is possible.  Anything that can be left out of the demonstration probably should be.  We're not even working with the best wave equation available; the Schrödinger equation is only an approximation covering the non-relativistic case.  In fact, the transactional-interpretation folks tell us their equations require the relativistic treatment, so it's even conceivable the sanity check could run into difficulties because of the non-relativistic wave equation (though one might reasonably hope the sanity check wouldn't require anything so esoteric).  But all this talk about relativistic and non-relativistic points out that there is, after all, something subtle about local geometry built into the form of the wave equation even though it's not directly visible in the local representation.  In which case, the wave equation may still contain the essence of that co-hygienic difference between gravity and the other fundamental forces (although... for gravity even the usual special-relativistic Dirac equation might not be enough, and we'd be on to the Dirac equation for curved spacetime; let's hope we don't need that just yet).

The universe says 'hi'

Let's just pause here, take a breather and see where we are.  The destination I've had my eye on, from the start of this post, was to demonstrate that a rewriting system, of the sort described, could produce some sort of quantum-like wave function.  I've been lining up support, section by section, for an assault on the technical specifics of how to set up rewriting systems — and we're not ready for that yet.  As noted just above, we need more information from which to choose a concrete representation.  If we try to tangle with that stuff before we have enough clues from... somewhere... to guide us through it, we'll just tie ourselves in knots.  This kind of exploration has to be approached softly, shifting artfully from one path to another from time to time so as not to rush into hazard on any one angle of attack.  So, with spider-sense tingling —or perhaps thumbs pricking— I'll shift now to consider, instead of pieces of the cosmos, pieces of the theory.

In conventional quantum mechanics, as noted a couple of sections above, we've got basically three elements that we bring together:  the parameters of our particular system of study, our classical laws of physics, and our wave equation.  Well, yeah, we also have the Hermitian operator, but, as remarked earlier, we can set that aside since it's to do with interfacing to the system, which was our focus in that section but isn't what we're after now.  The parameters of the particular system are what they are.  The classical laws of physics are, we suppose, derived from the transformation rules of our cosmic rewriting system, with particular emphasis on the character of the primitive elements of the cosmos (whatever they are) and the geometry, and some degree of involvement of the network topology.  The wave equation is also derived from the transformation rules, especially from how they interact with the network topology.

This analysis is already deviating from the traditional quantum scenario, because in the traditional scenario the classical laws of physics are strictly separate from the wave equation.  We've had hints of something deep going on with the choice of wave equation; Transactional Interpretation researchers reporting that they couldn't use the non-relativistic wave equation; and then there was the odd intimation, in my recent post deriving quantum-like effects from a drastically simplified system that lacked a wave equation, that the lack of a wave equation was somehow crippling something to do with systemic coherence buried deep in the character of the mathematics.  Though it does seem plausible that the wave equation would be derived more from the network topology, and perhaps the geometry, whereas the physical laws would be derived more from the character of the elementary physical components, it is perhaps only to be expected that these two components of the theory, laws and wave equation, would be coupled through their deep origins in the interaction of a single cosmological rewriting calculus.

Here is how I see the situation.  We have a sort of black box, with a hand crank and input and output chutes, and the box is labeled physical laws + wave equation.  We can feed into it the parameters of the particular physical system we're studying (such as a single electron in a potential field), carefully turn the crank (because we know it's a somewhat cantankerous device so that a bit of artistry is needed to keep it working smoothly), and out comes a wave function, or something akin, describing, in a predictive sense, the observable world.  What's curious about this box is that we've looked inside, and even though the input and output are in terms of a classical world, inside the box it appears that there is no classical world.  Odd though that is, we've gotten tolerably good at turning the crank and getting the box to work right.  However, somewhere above that box, we are trying to assemble another box, with its own hand crank and input/output chutes.  To this box, we mean to feed in our cosmic geometry, network topology, and transformation rules, and possibly some sort of initial classical probability distribution, and if we can get the ornery thing to work at all, we mean to turn the crank and get out of it — the physical laws plus wave equation.

Having arrived at this vision of an upper box, I was reading the other day a truthfully rather prosaic account of the party line on quantum mechanics (a 2004 book, not at all without merit as a big-picture description of mainstream thought, called Symmetry and the beautiful universe), and encountered a familiar rhetorical question of such treatments:  when considering a quantum mechanical wave function, "‍[...] what is doing the waving?"  And unlike previous times I'd encountered that question (years or decades before), this time the answer seemed obvious.  The value of the wave function is not a property of any particular particle in the system being studied, nor is it even a property of the system-of-interest as a whole; it's not part of the input we feed into the lower box at all, rather it's a property of the state of the system and so part of the output.  The wave equation describes what happens when the system-of-interest is placed into the context of a vastly, vastly larger cosmos (we're supposing it has to be staggeringly vaster than the system-of-interest in order for the trick to work right), and the whole is set to jostling about till it settles into a stable state.  Evidently, the shape that the lower box gives to its output is the footprint of the surrounding cosmos.  So this time when the question was asked, it seemed to me that what is waving is the universe.

The upper box

All we have to work with here are our broad guesses about the sort of rewriting system that feeds into the upper box, and the output of the lower box for some inputs.  Can we deduce anything, from these clues, about the workings of the upper box?

As noted, the wave function that comes out of the lower box assigns a weight to each state of the entire system-of-interest, rather than to each part of the system.  Refining that point, each weight is assigned to a complete state of the system-of-interest rather than to a separable state of a part of the system-of-interest.  This suggests the weight (or, a weight) is associated with each particular possibility in the classical probability distribution that we're supposing is behind the wave equation generated by the upper box.  Keep in mind, these possibilities are not possible states of the system-of-interest at a given time; they're possible states of the whole of spacetime; the shift between those two perspectives is a slippery spot to step carefully across.

A puzzler is that the weights on these different possibilities are not independent of each other; they form a coherent pattern dictated by the wave equation.  Whatever classical scenario spacetime settles into, it apparently has to incorporate effective knowledge of other possible classical scenarios that it didn't settle into.  Moreover, different classical scenarios for the cosmos must —eventually, when things stabilize— settle down to a weight that depends only on the state of our system-of-interest.  Under the sort of structural discipline we're supposing, that correlation between scenarios is generated by any given possible spacetime jostling around between classical scenarios, and thus roaming over various possible scenarios to sample them.  Evidently, the key to all of this must be the transitions between cosmic scenarios:  these transitions determine how the weight changes between scenarios (whatever that weight actually is, in the underlying structure), how the approach to a stable state works (whatever exactly a stable state is), and, of course, how the classical probabilities eventually correlate with the weights.  That's a lot of unknowns, but the positive insight here is that the key lever for all of it is the transitions between cosmic scenarios.

And now, perhaps, we are ready (though we weren't a couple of sections above) to consider the specifics of how to set up rewriting systems.  Not, I think, at this moment; I'm saturated, which does tend to happen by the end of one of these posts; but as the next step, after these materials have gone back on the shelf for a while and had a chance to become new again.  I envision practical experiments with how to assemble a rewriting system that, fed into the upper box, would cause the lower box to produce simple quantum-like systems.  The technique is philosophically akin to my recent construction of a toy cosmos with just the barest skeleton of quantum-like structure, demonstrating that the most basic unclassical properties of quantum physics require almost none of the particular structure of quantum mechanics.  That treatment particularly noted that the lack of a wave equation seemed especially problematic; the next step I envision would seek to understand how something like a wave equation could be induced from a rewriting system.  Speculatively, from there one might study how variations of rewriting system produce different sorts of classical/quantum cosmos, and reason on toward what sort of rewriting system might produce real-world physics; a speculative goal perhaps quite different from where the investigation will lead in practice, but for the moment offering a plausible destination to make sail for.

Monday, June 25, 2018

Why quantum math is unclassical

For me, the important thing about quantum mechanics is the equations, the mathematics.  If you want to understand quantum mechanics, just do the math.  All the words that are spun around it don't mean very much.  It's like playing the violin.  If violinists were judged on how they spoke, it wouldn't make much sense.
Freeman Dyson, in an interview with Onnesha Roychoudhuri, Salon, 2007.

Put aside all metaphysical questions about what sort of universe could be described by quantum mechanics.  Given that quantum mechanics is a recipe for making predictions about the physical world, and that those predictions are rather peculiar by classical standards, what is it about the recipe that causes these peculiarities?

In this post, I'm going to try to vastly simplify the recipe while still producing those peculiarities:  I'm going to build a toy cosmos, a really tiny system with really simple rules that, on their face, have almost none of the specific structure of quantum mechanics; yet, if it works out right, the system will still exhibit certain particular effects whose origins —whose mathematical origins— I want to understand better.  Here's my list of effects I want:

  • Nondeterminism.
  • Quantum interference.
  • Disappearance of quantum interference under observation.
  • Quantum entanglement.

I've tried this before, more than a decade ago, but my perspective has recently changed from my explorations of co-hygiene.  A little after the turn of the millennium I was studying a 1988 MIT AI Lab memo by Gary Drescher, "Demystifying Quantum Mechanics:  A Simple Universe with Quantum Uncertainty", and wanted to use a similar technique to explore some specific peculiarities of quantum math.  I used an even simpler toy cosmos than the 1988 memo had, which I could because my goals were narrower than Drescher's.  I eventually put my results up on the web through my WPI CS Department account (2006), though I didn't feel right at the time about making it a WPI CS Department tech report (a decision I eventually came to regret, after I'd got my doctoral hood and left, and it was too late).  But, nifty though the 2006 paper was in some ways, I now feel it didn't go far enough in simplifying the simple universe.  At the time I wanted to keep the "quantum" math similar enough to actual quantum mechanics to retain its look-and-feel, so that the reader would still think, yes, that is like quantum mechanics.  Now, though, I really want to strip away almost all the structure of quantum mechanics; because I'm now very interested to know which consequences of quantum mechanics are caused by which parts of the mathematical model from which they flow.

The result, with most of the instrument missing, won't be recital-quality violin; not even musical, really.  But I hope to learn from it a bit of how the instrument works.

Classical toy cosmos

A quantum view of a cosmos can only be constructed relative to a classical view.  So we have to start with a classical toy cosmos.

The instantaneous state of this cosmos consists of just two boolean —true/false— variables, a and b; so there are only four possible states for the cosmos to be in, which we call TT, TF, FT, FF (listing a then b).  Time advances discretely from one moment  t  to the next  t+1, and we're allowed to apply some experimental apparatus across that interval that determines how the state at  t+1  depends on the state at  t.  There are just three kinds of experimental apparatus, each of which has two variants depending on whether it's focused on  a  or  b:

  • set v: causes the variable to be true in the next state.
  • clear v: causes the variable to be false in the next state.
  • copy v: causes the value of the variable in the old state to become the value of both variables in the next state.
Nothing changes unless explicitly changed by the apparatus.

For example, from state TF, here are the states produced by the six possible experiments:

before   experiment   after
TF       set a        TF
TF       set b        TT
TF       clear a      FF
TF       clear b      TF
TF       copy a       TT
TF       copy b       FF
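The classical rules are small enough to check mechanically. Here is a minimal sketch in Python; the function names (`apply_classical`, `name`, `parse`) and the encoding of a state as a pair of booleans are my own choices for illustration, not part of the toy cosmos as described.

```python
# Classical toy cosmos: a state is a pair of booleans (a, b).

def apply_classical(state, op, var):
    """Return the next classical state after experiment `op` on variable `var`."""
    a, b = state
    if op == "set":      # the chosen variable becomes true
        return (True, b) if var == "a" else (a, True)
    if op == "clear":    # the chosen variable becomes false
        return (False, b) if var == "a" else (a, False)
    if op == "copy":     # the old value of the chosen variable goes to both
        v = a if var == "a" else b
        return (v, v)
    raise ValueError(f"unknown experiment: {op}")

def name(state):
    """Render (True, False) as 'TF', and so on."""
    return "".join("T" if x else "F" for x in state)

def parse(text):
    """Inverse of name(): 'TF' -> (True, False)."""
    return tuple(ch == "T" for ch in text)

start = parse("TF")
results = {
    f"{op} {var}": name(apply_classical(start, op, var))
    for op in ("set", "clear", "copy")
    for var in ("a", "b")
}
for experiment, after in results.items():
    print(f"TF  {experiment:8s} ->  {after}")
```

Running it reproduces the six results listed for state TF.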

Quantum toy cosmos

A quantum state of the cosmos consists of a vector indexed by classical states; that is,  $q = \langle w_s \rangle$  where  $s$  varies over the four classical states of the cosmos (in order TT, TF, FT, FF).

We understand a quantum state to determine a probability distribution of classical states of the toy cosmos; for quantum state  $q$, we denote the probability of classical state  $s$  by  $p_s(q)$.

As always when reasoning about quantum mechanics (but this bears repeating, to keep the concepts straight), we, as physicists studying the mathematics of the situation, are not observers in the technical sense of quantum theory.  That is, we are not part of the toy cosmos at all.  We can reason about the evolution of the quantum state of the toy cosmos; how an experiment changes the probabilities from time  $t$  to time  $t+1$, from  $p_s(q_t)$  to  $p_s(q_{t+1})$; and our reasoning does not alter the system.  Observation is one of the possible processes within the toy cosmos, which we will eventually get around to reasoning about, below.

What sorts of values, though, are the weights  $w_s$  within the quantum state?

In current mathematical physics, one would expect these weights to be what's called a gauge field — one of those terms that doesn't mean much to outsiders but, to those in the know, carries along a great deal of extra baggage.  We don't want that baggage here; and it's worth a moment just to consider why we don't want it.

In classical Lagrangian mechanics, one considers the evolution of a system as a path through the system's classical state-space (where points in the space are classical states of the system).  A function called a Lagrangian maps points in the state-space to energies.  The action of the system is the line integral of the Lagrangian along this path.  The principle of least action says that from a given state, the system will follow the path that minimizes the action.  One solves for this minimal path using a mathematical technology called the calculus of variations.  And Noether's theorem (yeah, yeah, Noether's first theorem) says that each differentiable invariant of the action, that is, each symmetry of the action, gives rise to a conservation law.
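In symbols, a sketch of the usual textbook form; the names $x(t)$ for the path, $t_0$ and $t_1$ for its endpoints, and $S$ for the action are conventional choices, not notation used elsewhere in this post. Strictly, the condition is that the action be stationary, which in the familiar cases means a minimum.

```latex
% Action: the Lagrangian integrated along the path x(t)
S[x] \;=\; \int_{t_0}^{t_1} L\bigl(x(t), \dot{x}(t), t\bigr)\, dt

% Principle of least (stationary) action
\delta S \;=\; 0

% The calculus of variations turns this into the Euler-Lagrange equation
\frac{d}{dt}\,\frac{\partial L}{\partial \dot{x}} \;-\; \frac{\partial L}{\partial x} \;=\; 0
```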

In recent quantum physics, the system state — the range of points in the state-space — consists of a classical state together with what I've called here a "weight"; that's the wavy part of the wave function.  While part of that weight can be perceived more-or-less directly as probability (traditionally, probability proportional to the square of the amplitude of a complex number), the rest of it can't be perceived; but its symmetries give rise to conservation laws which in turn come out as classes of particles.  Photons, gluons, and whatnot.  The weights form a gauge field, the invariances that give rise to the conservation laws are gauge symmetries, etc.

Physicists tend to ground their thinking in an imagined "real world"; a century or so of quantum mechanics hasn't really dimmed this attitude, even if the "real world" now imagined is Platonic such as a gauge field.  The attitude has considerable merit imho (leading, e.g., to the profound change I've noted in my view of λ-calculus, which was after all originally an exercise in formalist meta-mathematics, essentially a manipulation of syntax deliberately disregarding any possible referent); but the attitude does seem to make physicists especially vulnerable to mistaking the map for the territory.  That is, in treating the gauge field as if it were "really there", the physicist may forget to distinguish between a mathematical theory that successfully describes observable features of reality, and mathematics that is "known" to underlie reality.  The Lagrangian (as I pointed out in an earlier post) isn't some magic deeper level of reality, it's just whatever works to cause the principle of least action to give the right answer; and Noether's theorem, profound as it is, points out the physical consequences of a mathematical structure that was devised in the first place from the physical world, with the mathematical structure thus serving as essentially a catalyst to reasoning.  Physicists, lacking a traditional classical-style model of reality, observe (say) a force and construct a gauge theory for it which they then think of as a theorized "real thing" (not necessarily a bad attitude), reason through Noether's theorem to a class of particles, look for them using massive devices such as the Large Hadron Collider, and when they observe the phenomenon they predicted, then treat the particle as "known" and even take some properties of the gauge field as "known".  The chain of reasoning is so long that even the question of whether the observed particle "exists" is somewhat open to interpretation; and the gauge field is even more problematic.

More to the immediate point, the purpose of this post calls for avoiding the entire baggage train attached to the term "gauge", in pursuit of a minimal mathematical structure giving rise to the specifically named peculiar behaviors of quantum mechanics.

Taking a semi-educated stab at minimality, let's have just three possible weights:  a neutral weight, and two polar opposites.  Call the neutral weight 0 (zero).  One might call the other two 1 and −1, but really the orientation of those has to do with multiplication, and we're not going to have any sort of multiplication of weights by each other, so to avoid implying any particular orientation, let's unimaginatively call them left and right.  Two operations are provided on weights.  Unary negation, −w, transforms left to right, transforms right to left, and leaves 0 unchanged.

In the classical toy cosmos, each experiment determined, given the classical state  $s$  at time  $t$, the resulting classical state  $s'$  at time  $t+1$.  In the quantum version, each experiment determines, for each possible classical state  $s$  at time  $t$, and each possible classical state  $s'$  at time  $t+1$, what contribution weight  $w_{t,s}$  makes to weight  $w_{t+1,s'}$.  Each weight at time  $t+1$  is simply the sum of the contributions to that weight from each of the weights at time  $t$.  This requires, of course, that we sum a set of weights; let the sum of a set of weights be whichever of left or right there are more of amongst the arguments, or zero if there are the same number of left and right arguments.  This summation operation (for which we'll freely use the usual additive notation) is, btw, not at all mathematically well-behaved; commutative, but not associative since, for example,

left + left + (right + right)  =  left
left + (left + right) + right  =  0
(left + left) + right + right  =  right.
The ill-behavedness however is a bit moot, because in the six possible experiments of our toy cosmos, no sum will ever have more than two non-zero addends, and non-associativity only happens when there are at least three non-zero addends.
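The two operations on weights, and the non-associativity just shown, can be sketched in a few lines of Python. The string encoding of the three weights and the names `neg` and `wsum` are assumptions made for this illustration only.

```python
# The three weights; the strings are arbitrary labels with no built-in orientation.
LEFT, ZERO, RIGHT = "left", "0", "right"

def neg(w):
    """Unary negation: swaps left and right, leaves 0 unchanged."""
    return {LEFT: RIGHT, RIGHT: LEFT, ZERO: ZERO}[w]

def wsum(*weights):
    """Sum of weights: whichever of left/right occurs more often, else 0."""
    lefts = weights.count(LEFT)
    rights = weights.count(RIGHT)
    if lefts > rights:
        return LEFT
    if rights > lefts:
        return RIGHT
    return ZERO

# The three groupings from the text give three different answers.
g1 = wsum(LEFT, LEFT, wsum(RIGHT, RIGHT))   # left + left + (right + right)
g2 = wsum(LEFT, wsum(LEFT, RIGHT), RIGHT)   # left + (left + right) + right
g3 = wsum(wsum(LEFT, LEFT), RIGHT, RIGHT)   # (left + left) + right + right
print(g1, g2, g3)
```

The grouping matters only because an inner sum collapses two equal addends into one before the outer count is taken.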

We understand a zero weight to mean that classical state is not possible at that time; and assign equal probabilities to all non-zero-weighted classical states in the quantum state.  Presumably, for all possible experiments, a zero weight at time  t  contributes zero to each weight at time  t+1. 
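The probabilistic reading of a quantum state is equally short to sketch. In this illustration a quantum state is a Python dict from classical state names to weights, with left and right encoded arbitrarily as +1 and −1; both the encoding and the function name `probabilities` are my assumptions, and the numeric encoding implies nothing about multiplying weights.

```python
from fractions import Fraction

STATES = ("TT", "TF", "FT", "FF")

def probabilities(q):
    """Equal probability for every classical state whose weight is non-zero."""
    possible = [s for s in STATES if q[s] != 0]
    if not possible:
        raise ValueError("an all-zero quantum state has no probabilistic reading")
    share = Fraction(1, len(possible))
    return {s: (share if q[s] != 0 else Fraction(0)) for s in STATES}

# Example: two possible classical states with opposite polarity.
q = {"TT": +1, "TF": 0, "FT": -1, "FF": 0}
p = probabilities(q)
print(p)
```

Note that the polarity of a weight plays no role here; only whether the weight is zero matters for the probabilities.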

It remains to define, for each experiment, the contribution of each weight before the experiment to each weight after the experiment.  We'll write  $s$  for a classical state before,  $s'$  after; before weight  $w_s$, after weight  $w'_{s'}$, and contribution of the former to the latter  $w_{ss'}$.  We have  $w'_{s'} = \sum_s w_{ss'}$  (that is, each after-weight is the sum of the contributions to it from each of the before-weights).  We'll mainly represent these transformations by tables, rather than depending on all this elaborate notation.

Consider any  set/clear v  experiment.  Before-state  $s$  contributes nothing to any after-state that changes the non-v variable.  If  $s$  already has v with the value called for, only the contribution to  $s'=s$  can be non-zero,  $w_{ss} = w_s$.  If  $s$  doesn't have the value of v called for, it contributes its weight to the state with v changed, and also contributes the negation of its weight to the unchanged state.  In all,

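set a
state   before      after
TT      $w_{TT}$    $w_{TT} + w_{FT}$
TF      $w_{TF}$    $w_{TF} + w_{FF}$
FT      $w_{FT}$    $-w_{FT}$
FF      $w_{FF}$    $-w_{FF}$

set b
state   before      after
TT      $w_{TT}$    $w_{TT} + w_{TF}$
TF      $w_{TF}$    $-w_{TF}$
FT      $w_{FT}$    $w_{FT} + w_{FF}$
FF      $w_{FF}$    $-w_{FF}$

clear a
state   before      after
TT      $w_{TT}$    $-w_{TT}$
TF      $w_{TF}$    $-w_{TF}$
FT      $w_{FT}$    $w_{TT} + w_{FT}$
FF      $w_{FF}$    $w_{TF} + w_{FF}$

clear b
state   before      after
TT      $w_{TT}$    $-w_{TT}$
TF      $w_{TF}$    $w_{TT} + w_{TF}$
FT      $w_{FT}$    $-w_{FT}$
FF      $w_{FF}$    $w_{FT} + w_{FF}$

Follow the same pattern for a  copy v  experiment, adjusting which values are changed.

copy a
state   before      after
TT      $w_{TT}$    $w_{TT} + w_{TF}$
TF      $w_{TF}$    $-w_{TF}$
FT      $w_{FT}$    $-w_{FT}$
FF      $w_{FF}$    $w_{FT} + w_{FF}$

copy b
state   before      after
TT      $w_{TT}$    $w_{TT} + w_{FT}$
TF      $w_{TF}$    $-w_{TF}$
FT      $w_{FT}$    $-w_{FT}$
FF      $w_{FF}$    $w_{TF} + w_{FF}$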
This has, btw, all been constructed to avoid awkward questions when interpreting quantum states probabilistically by guaranteeing that each experiment, operating on a predecessor quantum state with at least one non-zero weight, will always produce a successor quantum state with at least one non-zero weight.
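All six transformations follow one rule, so they can be generated from the classical behavior rather than tabulated by hand. The sketch below does that and then brute-forces the guarantee just stated. The encoding is an assumption for illustration: weights left, right, 0 become +1, −1, 0, so the majority sum is the sign of the ordinary integer sum, and the names `classical`, `majority`, `step` are mine.

```python
from itertools import product

STATES = ("TT", "TF", "FT", "FF")
EXPERIMENTS = [(op, var) for op in ("set", "clear", "copy") for var in ("a", "b")]

def classical(state, op, var):
    """Classical successor of a two-letter state such as 'TF'."""
    a, b = state
    if op == "set":
        return "T" + b if var == "a" else a + "T"
    if op == "clear":
        return "F" + b if var == "a" else a + "F"
    v = a if var == "a" else b          # copy: the old value goes to both variables
    return v + v

def majority(weights):
    """Majority sum of weights encoded as +1 / -1 / 0."""
    total = sum(weights)
    return (total > 0) - (total < 0)

def step(q, op, var):
    """One quantum experiment: map before-weights to after-weights."""
    incoming = {s: [] for s in STATES}
    for s in STATES:
        target = classical(s, op, var)
        incoming[target].append(q[s])   # weight goes to the classical result
        if target != s:
            incoming[s].append(-q[s])   # negated weight stays on the unchanged state
    return {s: majority(incoming[s]) for s in STATES}

# Check every non-zero quantum state against every experiment.
never_all_zero = all(
    any(step(dict(zip(STATES, ws)), op, var).values())
    for ws in product((-1, 0, 1), repeat=4) if any(ws)
    for op, var in EXPERIMENTS
)
print(never_all_zero)
```

The reason the guarantee holds is visible in `step`: any state that gets moved leaves its negated weight behind, and any state that does not get moved keeps its own weight.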

Demonstrating the intended quantum effects is —if it can be done— then just a matter of assembling suitable compositions of experiments.


The fundamental difference between quantum state and classical state is, always, that any observed state of reality is classical.  Quantum state evolves deterministically — we've just specified precisely how it evolves through each experiment — and our difficulty is that we see no way to interpret the probability distributions of quantum mechanics as deterministic evolution of classical states.


The effect to be demonstrated is that a sequence of two experiments produces a probability distribution that doesn't compose the probability distributions of the two individual experiments.

Suppose we  set a  and then  clear a.  To be clear on what's going on, we start from a pure state, that is, a quantum state in which only one classical state is possible.  If that pure state has a=true, the quantum state after  set a  would be unchanged, so the final probability distribution would be just that of the second experiment,  clear a.  So choose instead a pure starting state with a=false.

  set a
clear a
Here, the second experiment produces a quantum state at time  $t+2$  where the weight on classical state FT is the sum of the weights on states TT and FT at time  $t+1$; and since the first experiment has left those two as polar opposites, they cancel,  $w_{FT} - w_{FT} = 0$, so the outcome of the sequence of two experiments is pure state TT.  This happens even though each of the experiments individually, when applied to a pure state where the value isn't what the experiment seeks to make it, would produce a probability distribution between two possible classical result states.
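Spelled out step by step, as a sketch: take the pure start state FT with non-zero weight $w$, list the weights in the order TT, TF, FT, FF, and apply the set/clear rule stated earlier (a state that is changed sends its weight to the changed state and leaves its negation behind).

```latex
q_t     = \langle\, 0,\; 0,\; w,\; 0 \,\rangle

% set a: FT sends w to TT and leaves -w on FT
q_{t+1} = \langle\, 0 + w,\; 0 + 0,\; -w,\; -0 \,\rangle
        = \langle\, w,\; 0,\; -w,\; 0 \,\rangle

% clear a: TT sends w to FT and leaves -w on TT; FT keeps its own -w
q_{t+2} = \langle\, -w,\; -0,\; w + (-w),\; 0 + 0 \,\rangle
        = \langle\, -w,\; 0,\; 0,\; 0 \,\rangle
```

Only TT has non-zero weight at  $t+2$, so the final state is pure.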


In the standard two-slit experiment, electron wave interference disappears when we observe which slit the electron goes through.  So, to disrupt the interference effect we've just demonstrated, put a  copy a  in between the other two operations, to observe, within the toy cosmos, the intermediate classical state of the system.

set a
copy a
clear a
Here, the final experiment gives a time  t+3  weight for FT that is the sum of the time  t+2 weights for TT and FT, but now they have the same sign so they don't cancel.
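To compare the two sequences side by side, here is a self-contained sketch that runs both from the pure start state FT. As before, the encoding of left/right/0 as +1/−1/0 and all the function names are assumptions for illustration.

```python
from fractions import Fraction

STATES = ("TT", "TF", "FT", "FF")

def classical(state, op, var):
    """Classical successor of a two-letter state such as 'FT'."""
    a, b = state
    if op == "set":
        return "T" + b if var == "a" else a + "T"
    if op == "clear":
        return "F" + b if var == "a" else a + "F"
    v = a if var == "a" else b          # copy
    return v + v

def step(q, op, var):
    """One quantum experiment on weights encoded as +1 / -1 / 0."""
    incoming = {s: [] for s in STATES}
    for s in STATES:
        target = classical(s, op, var)
        incoming[target].append(q[s])
        if target != s:
            incoming[s].append(-q[s])
    sign = lambda n: (n > 0) - (n < 0)  # majority sum
    return {s: sign(sum(incoming[s])) for s in STATES}

def run(q, experiments):
    """Apply a sequence of (op, var) experiments in order."""
    for op, var in experiments:
        q = step(q, op, var)
    return q

def prob_true(q, var):
    """Probability that `var` is true, with equal shares over non-zero states."""
    possible = [s for s in STATES if q[s] != 0]
    index = 0 if var == "a" else 1
    return Fraction(sum(s[index] == "T" for s in possible), len(possible))

start = {"TT": 0, "TF": 0, "FT": 1, "FF": 0}       # pure state FT
unobserved = run(start, [("set", "a"), ("clear", "a")])
observed = run(start, [("set", "a"), ("copy", "a"), ("clear", "a")])

print(unobserved, prob_true(unobserved, "a"))
print(observed, prob_true(observed, "a"))
```

Without the intervening copy, only TT survives; with it, TT, FT and FF all end up with non-zero weight, so the probability of a=true drops from 1 to 1/3.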

Interestingly, although this does spoil the interference pattern from the previous demonstration, it doesn't produce the crisp "classical" probability distribution that we expect observation to exhibit in a similar scenario in real-world quantum mechanics.  In my 2006 paper, I did get a crisply classical distribution; but there, the transformation of weights by the  copy v  operation was itself deterministic, assigning zero weight to those classical outcomes in which the value was not copied.  I defined the copy transformation differently this time because it had always bothered me that the 2006 paper did not guarantee that an experiment could not result in an all-zero quantum state.  My best guess, atm, as to why this zero-outcome problem doesn't ordinarily arise in full-blown quantum mechanics is that it has to do with the overall coherence provided by the wave equation, a structural component of quantum mechanics entirely omitted here.  At least, I've never heard of this particular anomaly arising in full-blown quantum mechanics; though full-blown quantum mechanics does have anomalies of its own that seem no less alarming if perhaps more sophisticated, such as infinities that may crop up causing renormalization problems in quantum gravity.

Conceivably, this may be a clue that the presence of a wave equation is profoundly fundamental to the overall structure of quantum mechanics.  Identifying the deep structural role of a wave equation, independent of the details of any particular wave equation, would seem to be another exercise for another day — though possibly not all that distant a day, given the sorts of questions I've been asking regarding co-hygiene.

At any rate, the intervening  copy a  experiment does alter the probability distribution of values of a despite the fact that the classical effect of the experiment on a pure classical state never alters the value of a.


The idea of entanglement, in its strongest sense, is that things done to one variable affect the other variable.  Loosely, we want to perform experiments on one variable that don't touch the other variable, yet alter the probability distribution of the other variable.  There is so little mathematical structure left in our toy cosmos that there aren't a lot of options to consider for demonstrating this effect.  The only operations that don't touch one variable are set/clear of the other variable.  Asymmetric handling of states can be derived from the fact that the set-clear sequence we used to demonstrate interference only causes interference on a pure start state if a=false.  So, suppose we run our  set-clear  on an initial quantum state with a correlation between a and b.

set a
clear a
The two starting weights never get added to each other, so it doesn't matter for this sequence whether they have the same polarity, as long as they're both non-zero.  In the start state, the probability of  b=true  is 1/2, as is the probability of  a=true; in the final state, the probability of  b=true  is 1/3, while the probability of  a=true  is 2/3.
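
The arithmetic behind these probabilities can be checked with a short Python sketch.  Several things here are assumptions rather than the post's own definitions: the transition rules in `set_a` and `clear_a` are hypothetical, the start state is taken to be an equal-weight mix of the two classical states in which a and b differ, and the probability of a classical state is taken to be proportional to the absolute value of its weight.  These choices happen to reproduce the 1/2, 1/3 and 2/3 figures quoted above, but other rules might as well.

```python
from collections import Counter
from fractions import Fraction

# ASSUMPTION: illustrative transition rules, keyed by successor (a, b) state.
def set_a(a, b):
    return {(True, b): 1} if a else {(True, b): 1, (False, b): -1}

def clear_a(a, b):
    return {(False, b): 1} if not a else {(False, b): 1, (True, b): 1}

def run(experiment, quantum):
    """Sum weighted contributions from every predecessor classical state."""
    successor = Counter()
    for (a, b), weight in quantum.items():
        for target, factor in experiment(a, b).items():
            successor[target] += weight * factor
    return {s: w for s, w in successor.items() if w != 0}

def probability(quantum, predicate):
    # ASSUMPTION: probability is proportional to |weight|.  An all-zero
    # quantum state would make the total zero, leaving this undefined.
    total = sum(abs(w) for w in quantum.values())
    chosen = sum(abs(w) for s, w in quantum.items() if predicate(*s))
    return Fraction(chosen, total)

start = {(True, False): 1, (False, True): 1}    # assumed correlated start state
final = run(clear_a, run(set_a, start))

p_a = probability(final, lambda a, b: a)
p_b = probability(final, lambda a, b: b)
print(p_a, p_b)
```

In this sketch only the a=false branch interferes, so its two successors collapse into one while the a=true branch keeps both of its successors; that asymmetry is what shifts the distribution of b even though neither experiment touches b.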


Our toy cosmos deliberately leaves out most complications of quantum mechanics.  We do require, in order that the theory be at all quantum-y, that we be able to understand the mathematical model as describing a probability distribution of possible perceived classical states; to understand the quantum state as being partitioned into elements associated with particular classical states; and to understand each of these elements as contributing to various elements of the successor quantum state.  That leaves the question of what sort of information a quantum state associates with each classical state; that is, what is the range over which each weight varies; and then, of course, what are the rules by which a given experiment transforms predecessor quantum state to successor quantum state.  In order to exhibit interference, it seems there must be a way for weights to cancel each other out during the summation process, and in this post I've deliberately taken the simplest sort of weight I could imagine that would allow canceling.

The resulting toy cosmos does exhibit the quantum interference effect, and clearly the demonstration of this effect does rely on weights canceling during summation.

Nondeterminism —relative, that is, to classical states— arises, potentially, when a single predecessor classical-state contributes non-zero weight to more than one successor classical-state.  Interference arises (given the cancellation provided for), again potentially, when a single successor classical-state receives non-zero contributions from more than one predecessor classical-state.

The quantum interference effect depends crucially on the fact that weights are holistic.  That is, a weight is assigned to a classical state of the entire cosmos; it isn't a characteristic of any particular feature within the classical state of the cosmos.  This is why observation within the toy cosmos disrupts interference:  once the particular part of the cosmos we're manipulating (variable a in our demonstration) is "observed" by another part of the cosmos (variable b in our demonstration), the classical state of the cosmos as a whole may differ because of what the observer saw, so that interference does not occur.  (Tbh, this point was more clearly exhibited in the 2006 paper, where observation was absolute — as it is in the full-blown quantum mechanics of our physical world; but it is still there to be found in the toy cosmos of this blog post.)

Entanglement was something I really wanted to understand in 2006; curiously, in 2018 I'm finding it less interesting than observation.  An experiment can cause interference amongst the successors of one classical-state and not amongst the successors of another classical state, so that, in the quantum successor-state, successors of one classical-state are collectively more probable than successors of another classical-state.  If the experiment only manipulates one variable (a) without affecting the other (b), this difference in probabilities of successor states can mean a difference in probabilities of values of the unmanipulated variable (b).

These latter two points are somewhat murkier from the above demonstrations than they were from the 2006 paper; the murkiness is apparently due to my decision in this blog post to define the  copy v  operation as something that might or might not change the state, rather than, as in the 2006 paper, something that always changes the state; and that decision was made here due to considerations of avoiding possible quantum zero-states.  As noted earlier, this seems to have something to do with the absence, from this immensely simplified mathematical structure, of a wave equation that would ward off such anomalies.

It seems, then, that I went into this blog post seeking to clarify minimal structure needed to produce certain quantum effects; and confirmed that those effects could still be produced by the chosen reduced structure; but the structure became so reduced that the demonstrations were less clear than in the 2006 paper, and questions arose about what other primal characteristics of quantum mechanics may have already been lost due to evisceration of internal structure of the transformation of quantum state, i.e., the "wave equation" which has been replaced above by ad hoc tables specifying the successor weights for each experiment.

Saturday, June 2, 2018

Sapience and the limits of formal reasoning

Anakin:      Is it possible to learn this power?
Palpatine:  Not from a Jedi.
Star Wars: Episode III – Revenge of the Sith, George Lucas, 2005.

In this post I mean to tie together several puzzles I've struggled with, on this blog and elsewhere, for years; especially, on one hand, the philosophical implications of Gödel's results on the limitations of formal reasoning (post), and on the other hand, the implications of evidence that sapient minds are doing something our technological artifacts do not (post).

From time to time, amongst my exploratory/speculative posts here, I do come to some relatively firm conclusion; so, in this post, with the philosophical implications of Gödel.  A central notion here will be that formal systems manipulate information from below, while sapiences manipulate it from above.

As a bonus I'll also consider how these ideas on formal logic might apply to my investigations on basic physics (post); though, that will be more in the exploratory/speculative vein.

As this post is mostly tying together ideas I've developed in earlier posts, it won't be nearly as long as the earlier posts that developed them.  Though I continue to document the paths my thoughts follow on the way to any conclusions, those paths won't be long enough to meander too very much this time; for proper meandering, see the earlier posts.


Through roughly the second half of the nineteenth century, mathematicians aggressively extended the range of formal reasoning, ultimately reaching for a single set of axioms that would found all of logic and mathematics.  That last goal was decisively nixed by Gödel's Theorem(s) in 1931.  Gödel proved, in essence, that any sufficiently nontrivial formal axiomatic system, if it doesn't prove anything false, cannot prove itself to be self-consistent.  It's still possible to construct a more powerful axiomatic system that can prove the first one self-consistent, but that more powerful system then cannot prove itself self-consistent.  In fact, you can construct an infinite series of not-wrong axiomatic systems, each of which can prove all of its predecessors self-consistent, but each system cannot prove its own self-consistency.

In other words, there is no well-defined maximum of truth obtainable by axiomatic means.  By those means, you can go too far (allowing proofs of some things that aren't so), or you can stop short (failing to prove some things that are so), but you can't hit the target.

For those of us who work with formal reasoning a lot, this is a perplexing result.  What should one make of it?  Is there some notion of truth that is beyond the power of all these formal systems?  And what would that even mean?

For the question of whether there is a notion of objective mathematical truth beyond the power of all these formal systems, the evident answer is, not formally.  There's more to that than just the trivial observation that something more powerful than any axiomatic system cannot itself be an axiomatic system; we can also reasonably expect that whatever it is, we likely won't be able to prove axiomatically that its power is greater.

I don't buy into the notion that the human mind mystically transcends the physical; an open mind I have, but I'm a reductionist at heart.  Here, though, we have an out.  In acknowledging that a hypothetical more-powerful something might not be formally provable more powerful, we open the door to candidates that we can't formally justify.  Such as, a sapient mind that emerges by some combination of its constituent parts and so seemingly ought to be no more powerful than those parts, but... is.  In practice.  (There's a quip floating around, that "In theory, there is no difference between theory and practice. But, in practice, there is.")

A related issue here is the Curry-Howard correspondence, much touted in some circles as a fundamental connection between computation and logic.  Except, I submit it can't be as fundamental as all that.  Why?  Because of the Church-Turing thesis.  Which says, in essence, that there is a robust most-powerful sort of computation.  In keeping with our expectation of an informal cap on formal power, the Church-Turing thesis in this general sense is inherently unprovable; however, specific parts of it are formally provable, namely formal equivalences between particular formal models of computation.  The major proofs in that vein, establishing the credibility of the general principle, were done within the next several years after Gödel's Theorems proved that there isn't a most-powerful sort of formal logic.  Long story short:  most-powerful sort of computation, yes; most-powerful sort of formal logic, no; therefore, computation and formal logic are not the same thing.

Through my recent post exploring the difference between sapient minds and all our technological artifacts, I concluded, amongst other things, that  (1) sapience cannot be measured by any standardized test, because for any standardized test one can always construct a technological artifact that will outperform sapient minds; and  (2) sapient minds are capable of grasping the "big picture" within which all technology behaves, including what the purpose of a set of formal rules is, whether the purpose is achieved, when to step outside the rules, and how to improvise behavior once outside.

A complementary observation about formal systems is that each individual action taken —each axiomatic application— is driven by the elementary details of the system state.  That is, the individual steps of the formal system are selected on a view looking up from the bottom of the information structure, whereas sapience looks downward from somewhere higher in the information structure.  This can only be a qualitative description of the difference between the sapient and formal approaches, for the simple reason that we do not, in fact, know how to do sapience.  As discussed in the earlier post, our technology does not even attempt to achieve actual sapience because we don't know, from a technical perspective, what we would be trying to achieve — since we can't even measure it, though we have various informal ways to observe its presence.

Keep in mind that this quality of sapience is not uniform.  Though some cases are straightforward, in general clambering up into the higher levels of structure, from which to take a wide-angle view, may be extremely difficult even with sapience, and some people are better at it than others, apparently for reasons of nature, nurture, and circumstance.  Indeed, the mix of reasons that leads a Newton or an Einstein to climb particularly high in the structure is just the sort of thing I'd expect to be quite beyond the practical grasp of formal analysis.

What we see in Gödel's results is, then, that even when we accept a reductionist premise that the whole structure is built up by axioms from an elementary foundation, for a sufficiently powerful system there are fundamental limits to the sorts of high-level insights that can be assembled by building strictly upward from the bottom of the structure.

Is that a big insight?  Formally it says nothing at all.  But I can honestly say that, having reached it, for the first time in <mumble-mumble> decades of contemplation I see Gödel's results as evidence of something that makes sense to me rather than evidence that something is failing to make sense to me.


In modern physics, too, we have a large-scale phenomenon (classical reality) that evidently cannot be straightforwardly built up by simple accretion of low-level elements of the system (quanta).  Is it possible to understand this as another instance of the same broad phenomenon as the failure, per Gödel, to build a robust notion of truth from elementary axioms?

Probably not, as I'll elaborate below.  However, in the process I'll turn up some ideas that may yet lead somewhere, though quite where remains to be seen; so, a bit of meandering after all.

Gödel's axiomatic scenario has two qualitative features not immediately apparent for modern physics:

  • Axiomatic truth appears to be part of, and therefore to evolve toward, absolute truth; the gap between the two appears to be a quantitative thing that shrinks as one continues to derive results axiomatically, even though it's unclear whether it shrinks toward zero, or toward some other-sized gap.  Whereas, the gap between quantum state and classical state is clearly qualitative and does not really diminish under any circumstances.
  • The axiomatic shortfall only kicks in for sufficiently powerful systems.  It's not immediately clear what property in physics would correspond to axiomatic power of this sort.

The sapience/formalism dichotomy doesn't manifest the same way for different sorts of structure; witness the aforementioned difference between computational power and axiomatic power, where apparently one has a robust maximum while the other does not.  There is no obvious precedent to expect the dichotomy to generate a Gödel-style scale-gap in arbitrary settings.  Nonetheless, might there still be a physics analog to these features of axiomatic systems?

Quantum state-evolution does not smooth out toward classical state-evolution at scale; this is the point of the Schrödinger's-cat thought experiment.  A Gödel-style effect in physics would seem to require some sort of shading from quantum state-evolution toward classical state-evolution.  I don't see what shading of that sort would mean.

There is another possibility, here:  turn the classical/quantum relationship on its head.  Could classical state-evolution shade toward quantum state-evolution?  Apparently, yes; I've already described a way for this to happen, when in my first post on co-hygiene I suggested that the network topology of spacetime, acting at a cosmological scale, could create a seeming of nondeterminism at comparatively small scales.  Interestingly, this would also be a reversal in scale, with the effect flowing from cosmological scale to small scale.  However, the very fact that this appears to flow from large to small does not fit the expected pattern of the Gödel analogy, which plays on the contrast between bottom-up formalism and top-down sapience.

On the other front, what of the sufficient-power threshold, clearly featured on the logic side of the analogy?  If the quantum/classical dichotomy is an instance of the same effect, it would seem there must be something in physics corresponding to this power threshold.  Physics considered in the abstract as a description of physical reality has no obvious place for power in a logical or computational sense.  Interestingly, however, the particular alternative vein of speculation I've been exploring here lately (co-hygiene and quantum gravity) recommends modeling physical reality as a discrete structure that evolves through a dimension orthogonal to spacetime, progressively toward a stable state approximating the probabilistic predictions of quantum mechanics — and it is reasonable to ask how much computational power the primitive operations of this orthogonal evolution of spacetime ought to have.

In such a scenario, the computational power is applied to state-evolution from some initial state of spacetime to a stable outcome, for some sense of stable to be determined.  As a practical matter, this amounts to a transformation from some probability distribution of initial states of spacetime, to a probability distribution of stable states of spacetime that presumably resembles the probability distributions predicted by quantum mechanics.  As it is unclear how one chooses the initial probability distribution, I've toyed with the idea that a quantum mechanics-like distribution might be some sort of fixpoint under this transformation, so that spacetime would tend to come out resembling quantum mechanics more-or-less-regardless of the initial distribution.
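
What a fixpoint under a transformation of probability distributions would look like can be illustrated with a deliberately mundane stand-in.  The Python sketch below is not a model of the spacetime-rewriting relation, which is left unspecified here; it assumes a small, made-up row-stochastic matrix (`transition`) as the transformation, and shows only the general behavior of different initial distributions being driven to the same fixed distribution.  The names `step` and `iterate` are hypothetical helpers.

```python
def step(distribution, transition):
    """Apply the transformation once: new[j] = sum_i old[i] * transition[i][j]."""
    n = len(distribution)
    return [sum(distribution[i] * transition[i][j] for i in range(n))
            for j in range(n)]

def iterate(distribution, transition, rounds=200):
    """Apply the transformation repeatedly, returning the final distribution."""
    for _ in range(rounds):
        distribution = step(distribution, transition)
    return distribution

# ASSUMPTION: an arbitrary three-state stochastic matrix (each row sums to 1),
# standing in for whatever the real transformation might be.
transition = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

from_first = iterate([1.0, 0.0, 0.0], transition)   # all mass on state 0
from_last = iterate([0.0, 0.0, 1.0], transition)    # all mass on state 2

print(from_first)
print(from_last)
```

Both starting points end up at essentially the same distribution, which is then unchanged by a further `step`.  Whether anything analogous holds for a rewriting relation on spacetime is exactly the open question.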

The spacetime-rewriting relation would also be the medium through which cosmological-scale determinism would induce small-scale apparent nondeterminism.

Between inducing nondeterminism and transforming probability distributions, there would seem to be, potentially, great scope for dependence on the relative computational power of the rewriting relation.  With such a complex interplay of factors at stake, it seems likely that even if there were a Gödel-like power threshold lurking, it would have to be deduced from a much better understanding of the rewriting relation, rather than contributing to a basic understanding of the rewriting relation.  Nevertheless, I'm inclined to keep a weather eye out for any such power threshold as I move forward.