Wednesday, November 30, 2011

Rhyming in US politics

Just a small structural insight into US politics (as advertised).
Republicans in Congress have allowed their agenda to be set by President Obama.

Republicans in Congress are being obstructionist; this shouldn't be a controversial statement, since historically they haven't made a secret of it (though, in something rather like a Catch-22, the insight I'm heading for makes it natural to expect disagreement on this along party lines).  But what one ought to be asking is why.

It's simple, really.  When an administration comes in, the opposition usually aligns itself squarely against the central priority of the new administration.  Although this may be a disagreement that predates the new administration, a sadder scenario —for all parties, and for the electorate— is that the opposition may be in disarray and simply not have any better focus than, well, opposing.  On secondary issues there may be all kinds of cooperation, but by default, not on that central priority.

And here we have a president who was really pretty explicit, before he was elected, that his central priority is cooperation.  It's the message that got him national attention in the first place:  just because we have disagreements doesn't mean we can't cooperate.

Now, consider how one would go about opposing that priority, and compare it to the current situation.

And, to see the other side of the coin, consider how one would go about pursuing that priority — in the face of opposition to it.

Monday, November 14, 2011

Where do types come from?

[On the phone] There's a man here with some sort of a parasite on his arm, assimilating his flesh at a frightening speed.  I may have to get ahead of it and amputate.  No... I don't know what it is or where it came from.
— Dr. Hallen, The Blob
I took some tolerably advanced math courses in college and graduate school.  My graduate research group of choice was the Theory Umbrella Group (THUG), joint between the math and computer science departments.  But one thing I never encountered in any of those courses, nor that-I-recall even in the THUG talks, was a type.  Sets aplenty, but not types.  Types seem to arise from specifically studying computer science, mathematics proper having no native interest in them.  There are the "types" in Russell and Whitehead's Principia Mathematica, but those don't seem to me to have anything really to do with types as experienced in programming.

Yet, over in the computer science department, we're awash in types.  They're certainly used for reasoning about programs (both practically and theoretically) — but at some point our reasoning may become more about the types themselves than about the programs they apply to.  Type systems can be strikingly reminiscent of bureaucratic red tape when one is getting tangled up in them.  So, if they aren't a natively mathematical concept, why are they involved in our reasoning at all?  Are they natural to what we're reasoning about (programs), or an unfortunate historical artifact?  From the other side, is reasoning in mathematics simpler because it doesn't use types, or does it not need to use types because what it's reasoning about is simpler?

Representation format

Looking back at the early history of programming, types evidently arose from the need to keep track of what format was being used by a given block of binary data.  If a storage cell was assigned a value using a floating-point numerical representation, and you're trying to treat it as a series of ASCII characters, that's probably because you've lost track of what you meant to be doing.  So we associate format information with each such cell.  Note that we are not, at this point, dealing directly with the abstract entities of mathematics, but with sequences of storage bits, typically fixed-width sequences at that.  Nor does the type even tell us about a sort of mathematical entity that is being stored, because within the worldview presented by our programming language, we aren't storing a mathematical entity, we're representing a data value.  Data values are more abstract than bit sequences, but a lot less abstract than the creatures we'd meet in the math department.  The essential difference, I'd suggest, is that unlike their mathematical cousins, data values carry about with them some of their own representation format, in this case bit-level representation format.
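
To make that concrete, here's a small sketch (Python used purely as a convenient notation; the particular values are illustrative only): the same block of storage bits, read once under the intended floating-point format and once as if it were text.  The type annotation on the cell is what records which reading was meant.

    import struct

    # Eight bytes of storage, written using an IEEE-754 double representation.
    cell = struct.pack('>d', 6.02e23)

    # Read back under the intended format: we recover the number.
    print(struct.unpack('>d', cell)[0])

    # Read under the wrong format, as if the bytes were text, and we get nonsense.
    # The bits don't know what they mean; the format information has to be kept
    # alongside them.
    print(cell.decode('latin-1'))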

A typical further development in typing is user-defined (which is to say, programmer-defined) types.  Each such type is still stored in a sequence of storage bits, and still tells us how the storage is being used to represent a data value, rather than store a mathematical entity.  There is a significant difference from the earlier form of typing, in that the language will (almost certainly) support a practically infinite number of possible user-defined types, so that the types themselves have somewhat the character of mathematical abstract entities, rather than data values (let alone bit sequences).  If, in fact, mathematics gets much of its character by dealing with its abstract entities unfettered by representational issues (mathematics would deal with representation itself as just another abstract domain), a computer scientist who wants that character will prefer to reason as much as possible about types rather than about data values or storage cells.
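
A quick illustration of that (again Python, with ctypes chosen only because it makes the storage layout explicit; nothing here is tied to a particular language): a user-defined type that is still, at bottom, a recipe for how a run of storage bits represents a data value.

    import ctypes

    # A user-defined type: still a description of how storage bits are used.
    class Point(ctypes.Structure):
        _fields_ = [('x', ctypes.c_int32), ('y', ctypes.c_int32)]

    p = Point(3, 4)
    print(ctypes.sizeof(p))   # 8 -- the type fixes the width of the representation
    print(bytes(p).hex())     # the raw bits that the type gives meaning to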

Another possible development in typing, orthogonal to user-defined types, is representation-independence, so that the values constrained by types are understood as mathematical entities rather than data values.  The classic example is type bignum, whose values are conceptually mathematical integers.  Emphasis on runtime efficiency tends to heighten awareness of representational issues, so one expects an inverse relation between that emphasis, and likelihood of representation-independent types.  It's not a coincidence that bignums flourish in Lisp.  Note also that a key twist in the statement of the expression problem is the phrase "without recompiling existing code".
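
Python's integers happen to make a handy illustration of a representation-independent type (any bignum-supporting language would do): the values behave as mathematical integers, and the programmer neither sees nor chooses the representation underneath.

    # Values of a bignum type behave as mathematical integers; the representation
    # grows as needed, out of the programmer's sight.
    n = 2 ** 200
    print(n + 1)            # exact arithmetic, no overflow, no fixed width
    print(n.bit_length())   # 201 -- the storage adapts to the value, not vice versa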

Complicated type systems as crutches

Once we have types, since we're accustomed to thinking about programs, we tend to want to endow our type systems with other properties we know from our programming models.  Parametric types.  Dependent types.  Ultimately, first-class types.

I've felt the lure of first-class types myself, because they abandon the pretense that complicated type systems aren't treating types computationally.  There's an incomplete language design in my files wherein a type is an object with two methods, one for determining membership and one for determining sub/supertyping.  That way leads to unbounded complications — the same train of thought has led me more recently to consider tampering with incompleteness of the continuum (cf. Section 8.4.2 of my dissertation; yet another potential blog topic [later did blog on this, here]).  As soon as I envisioned that type system, I could see it was opening the door to a vast world of bizarre tricks that I absolutely didn't want.  I really wanted my types to behave as mathematical sets, with stable membership and transitive subtyping — and if that's what you want, you probably shouldn't try to get there by first giving the methods Turing power and then backing off from it.
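
For flavor, here's a rough sketch in that spirit (Python as a convenient notation; this is not the actual design from my files, just enough to show the problem): once membership and subtyping are arbitrary methods, nothing forces membership to be stable or subtyping to be transitive.

    # A "type" as an object with two methods: membership and sub/supertyping.
    # Both methods have full Turing power, which is exactly the trouble.
    class FirstClassType:
        def __init__(self, name, contains, subtype_of):
            self.name = name
            self._contains = contains          # value -> bool
            self._subtype_of = subtype_of      # FirstClassType -> bool

        def contains(self, value):
            return self._contains(value)

        def subtype_of(self, other):
            return self._subtype_of(other)

    integer = FirstClassType('integer',
                             lambda v: isinstance(v, int),
                             lambda t: t.name in ('integer', 'number'))
    number  = FirstClassType('number',
                             lambda v: isinstance(v, (int, float)),
                             lambda t: t.name == 'number')

    print(integer.contains(42))         # True
    print(integer.subtype_of(number))   # True -- but only by convention, not by any
                                        # property the system could rely on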

But, from the above, I submit that these complicated type systems are incited, to begin with, when we start down the slippery slope by
  • tangling with data values —halfway between the abstract and concrete worlds— instead of abstract mathematical entities, and
  • placing undue emphasis on types, rather than the things they describe.  This we did in the first place, remember, because types were more nearly mathematical; the irony of that is fairly intense.
In contrast to the muddle of complicated typing in computer science, folks over in the math department deal mostly with sets, a lightweight concept that fades comfortably into the background and only has to be attended carefully under fairly extreme circumstances.  Indeed, contrasting types and sets, a major difference between them is that types have object identity — which is itself a borderline representational concept (able to come down on either side of the line), and jibes with experience that sophisticated types become data structures in their own right.  Yes, there are such things as types that don't have object identity; but somehow it seems we've already crossed the Rubicon on that one, and can no longer escape from the idea even in languages that don't endorse it.
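
The contrast shows up even in miniature (Python again, illustrative only): two structurally identical class definitions yield two distinct types, because a type is an object with an identity; two set expressions with the same members are, for every purpose, the same set.

    # Types carry object identity...
    class A: pass
    class B: pass            # structurally identical to A

    print(A is B)            # False -- two distinct types, whatever their contents

    # ...whereas a set is nothing over and above its members.
    print({1, 2, 3} == {3, 2, 1})    # True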

Where next?

What we need, it seems, is the lightweight character of mathematical reasoning.  There's more to it than mathematical "purity"; Haskell is fairly pure, but to be honest I find it appallingly heavy.  I find no sense of working with simple primitives — it feels to me more like working on a scaffold over an abyss.  In mathematics, there may be several different views of things, any one of which could be used as a foundation from which to build the others.  That's essentially perfect abstraction, in that from any one of these levels, you not only get to ignore what's under the hood, but you can't even tell whether there is anything under the hood.  Going from one level to the next leaves no residue of unhidden details:  you could build B from A, C from B, and A from C, and you've really gotten back to A, not some flawed approximation of it that's either more complicated than the original, more brittle than the original, or both.

Making that happen in a language design should involve some subtle shifts in the way data is conceptualized.  That isn't a digression in a discussion of types, because the way we conceptualize data has deep, not to say insidious, effects on the nature of typing.  As for the types themselves, I suggest we abandon the whole notion of types in favor of a lightweight mathematical notion of sets — and avoid using the word "type" as it naturally drags us back toward the conceptual morass of type theory that we need to escape.

Tuesday, November 8, 2011

Allowing and disallowing

Here are two problems in programming language design that are often treated as if they had to be traded off against each other.  I've found it enormously productive to assume that high-level tradeoffs are accidental rather than essential; that is, to assume that if only we find the right vantage to view the problems, we'll see how to have our cake and eat it too.  A good first step toward finding a fresh vantage on a problem is to eliminate unnecessary details and assumptions from the statement of the problem.  So here are spare, general statements of these two problems.
  • Allow maximally versatile ways of doing things, with maximal facility.
  • Disallow undesirable behavior.
I've been accused of promoting unmanageable chaos because my publicly visible work (on Kernel and fexprs) focuses on the first problem with some degree of merry disregard for the second.  So here I'll explain some of my thoughts on the second problem and its relationship to the first.

How difficult are these problems?  One can only guess how long it will actually take to tame a major problem; there's always the chance somebody could find a simple solution tomorrow, or next week.  But based on their history, I'd guess these problems have a half-life of at least half a century.

Why

To clarify my view of these problems, including what I mean by them, it may help to explain why I consider them important.

Allowing is important because exciting, new, and in any and all senses profitable innovations predictably involve doing things that hadn't been predicted.  Software technology needs to grow exponentially, which is a long-term game; in the long term, a programming language either helps programmers imagine and implement unanticipated approaches, or the language will be left in the dust by better languages.  This is a sibling to the long-term importance of basic research.  It's also a cousin to the economic phenomenon of the Long Tail, in which there's substantial total demand for all individually unpopular items in a given category — so that while it would be unprofitable for a traditional store to keep those items in stock, a business can reap profits by offering the whole range of unpopular items if it can avoid incurring overhead per item.

Disallowing is important because, bluntly, we want our programs to work right.  A couple of distinctions immediately arise.
  • Whose version of "right" are we pursuing?  There's "right" as understood by the programmer, and "right" as understood by others.  A dramatic divergence occurs in the case of a malicious programmer.  Of course, protecting against programmer malfeasance is especially challenging to reconcile with the allowing side of the equation.
  • Some things we are directly motivated to disallow, others indirectly.  Direct motivation means that thing would in itself do something we don't want done.  Indirect motivation means that thing would make it harder to prove the program doesn't do something we don't want done.

How

If allowing were a matter of computational freedom, the solution would be to program in machine code.  It's not.  In practice, a tool isn't versatile or facile if it cannot be used at scale.  What we can imagine doing, and what we can then work out how to implement, depends on the worldview provided by the programming language within which we work; so allowing depends on this worldview.  Nor is the worldview merely a matter of crunching data: it also determines our ability to imagine and implement abstractions within the language, modulating the local worldview within some broader metaphysics.  Hence my interest in abstractive power (on which I should blog eventually [note: eventually I did]).

How ought we to go about disallowing?  Here are some dimensions of variation between strategies — keeping in mind, we are trying to sort out possible strategies, rather than existing strategies (so not to fall into ruts of traditional thinking).
  • One can approach disallowance either by choosing the contours of the worldview within which the programmer works, or by imposing restrictions on the programmer's freedom to operate within the worldview.  The key difference is that if the programmer thinks within the worldview (which should come naturally with a well-crafted worldview), restriction-based disallowance is directly visible, while contour-based disallowance is not.  To directly see contour-based disallowance, you have to step outside the worldview.

    To reuse an example I've suggested elsewhere:  If a Turing Machine is disallowed from writing on a blank cell on the tape, that's a restriction (which, in this case, reduces the model's computational power to that of a linear bounded automaton).  If a Turing Machine's read/write head can move only horizontally, not vertically, that's a contour of the worldview.  (See the code sketch after this list.)

  • Enforcement can be hard vs soft.  Hard enforcement means programs are rejected if they do not conform.  Soft enforcement is anything else.  One soft contour approach is the principle I've blogged about under the slogan dangerous things should be difficult to do by accident.  Soft restriction might, for example, take the form of a warning, or a property that could be tested for (either by the programmer or by the program).

  • Timing can be eager vs lazy.  Traditional static typing is hard and eager; traditional dynamic typing is hard and lazy.  Note, eager–lazy is a spectrum rather than a binary choice.  Off hand, I don't see how contour-based disallowance could be lazy (i.e., I'd think laziness would always be directly visible within the worldview); but I wouldn't care to dismiss the possibility.
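
Here's the sketch promised above, of that first distinction, as a toy tape-machine API (Python, and the details are purely illustrative):

    # Restriction-based disallowance: the operation exists in the worldview,
    # but is checked and rejected -- so the programmer runs into it directly.
    class RestrictedTape:
        def __init__(self, contents):
            self.cells = dict(enumerate(contents))
        def write(self, pos, symbol):
            if pos not in self.cells:                  # a blank cell
                raise ValueError("writing on a blank cell is disallowed")
            self.cells[pos] = symbol

    tape = RestrictedTape("abc")
    tape.write(1, "X")       # allowed: cell 1 was already inscribed
    # tape.write(5, "X")     # would raise: the restriction is hit head-on

    # Contour-based disallowance: the worldview simply has no vertical motion.
    # There is nothing to forbid, and nothing within the worldview to bump into;
    # to see the disallowance at all, you have to step outside the model.
    class HorizontalHead:
        def __init__(self):
            self.pos = 0
        def left(self):
            self.pos -= 1
        def right(self):
            self.pos += 1
        # no up() or down(): a contour of the model, not a restriction within it
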
All of which is pretty straightforward.  There's another dimension I'm less sure how to describe.  I'll call it depth.  Shallow disallowance is based on simple, locally testable criteria.  A flat type system, with a small fixed set of data types that are mutually exclusive, is very shallow.  Deep disallowance is based on more sophisticated criteria that engage context.  A polymorphic function type has a bit of depth to it; a proof system that supports sophisticated propositions about code behavior is pretty deep.
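
Roughly, in code (an illustration only, with sampling standing in for a real proof system):

    # Shallow disallowance: a simple, locally testable criterion.
    def is_int(x):
        return isinstance(x, int)

    # Deeper disallowance: a claim about behavior in context -- here merely
    # spot-checked on sample inputs, where a proof system would establish it
    # for all inputs.
    def appears_to_sort(f, samples):
        return all(f(s) == sorted(s) for s in samples)

    print(is_int(3))                                         # True
    print(appears_to_sort(sorted, [[3, 1, 2], [], [7, 7]]))  # True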

Shallow vs deep tends to play off simplicity against precision.  Shallow disallowance strategies are simple, therefore easily understood, which makes them more likely to be used correctly and —relatively— less likely to interfere with programmers' ability to imagine new techniques (versatility/facility of allowance).  However, shallow disallowance is a blunt instrument that cannot take out a narrow or delicately structured case of bad behavior without removing everything around it.  So some designers turn to very deep strategies —fully articulated theorem-proving, in fact— but thereby introduce conceptual complexity, and the conceptual inflexibility that tends to come with it.

Recalling my earlier remark about tradeoffs, the tradeoffs we expect to be accidental are high-level.  Low-level tradeoffs are apt to be essential.  If you're calculating the reaction mass of a rocket, you'd best accept the tradeoff dictated by F=ma.  On the other hand, if you step back and ask what high-level task you want to perform, you may find it can be done without a rocket.  With disallowance depth, deep implies complex, and shallow implies some lack of versatility; there's no getting around those.  But does complex disallowance imply brittleness?  Does it preclude conceptual clarity?

One other factor that's at play here is level of descriptive detail.  If the programming language doesn't specify something, there's no question of whether to disallow some values of it.  If you just say "sort this list", instead of specifying an algorithm for doing so, there's no question —within the language— of whether the algorithm was specified correctly.  On the other hand, at some point someone specified how to sort a list, using some language or other; whatever level of detail a language starts at, you'll want to move up to a higher level later, and not keep respecifying lower-level activities.  That's abstraction again.  Not caring what sort algorithm is used may entail significantly more complexity, under the hood, than requiring a fixed algorithm — and again, we're always going to be passing from one such level to another, and having to decide which details we can hide and how to hide them.  How all that interacts with disallowance depth may be critical:  can we hide complex disallowance beneath abstraction barriers, as we do other forms of complexity?
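
In miniature (Python, illustrative only): at the higher level of description there is simply no algorithm to get wrong, because the language doesn't ask for one.

    data = [3, 1, 2]

    # "Sort this list": no algorithm is specified, so none can be specified wrongly.
    print(sorted(data))

    # Move down a level of descriptive detail, and every line is a commitment
    # that could have been made incorrectly.
    def insertion_sort(xs):
        out = []
        for x in xs:
            i = 0
            while i < len(out) and out[i] < x:
                i += 1
            out.insert(i, x)
        return out

    print(insertion_sort(data))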

Merry disregard

You may notice I've had far more to say about how to disallow, than about how to allow.  Allowing is so much more difficult, it's hard to know what to say about it.  Once you've chosen a worldview, you have a framework within which to ask how to exclude what you don't want; but finding new worldviews is, rather by definition, an unstructured activity.

Moreover, thrashing about with specific disallowance tactics may tend to lock you in to worldviews suited to those tactics, when what's needed for truly versatile allowing may be something else entirely.  So I reckon that allowing is logically prior to disallowing.  And my publicly visible work does, indeed, focus on allowing with a certain merry disregard for the complementary problem of disallowing.  Disallowing is never too far from my thoughts; but I don't expect to be able to tackle it properly till I know what sort of allowing worldview it should apply to.