[On the phone] There's a man here with some sort of a parasite on his arm, assimilating his flesh at a frightening speed. I may have to get ahead of it and amputate. No... I don't know what it is or where it came from.
— Dr. Hallen, The Blob

I took some tolerably advanced math courses in college and graduate school. My graduate research group of choice was the Theory Umbrella Group (THUG), joint between the math and computer science departments. But one thing I never encountered in any of those courses, nor that-I-recall even in the THUG talks, was a type. Sets aplenty, but not types. Types seem to arise from specifically studying computer science, mathematics proper having no native interest in them. There are the "types" in Russell and Whitehead's Principia Mathematica, but those don't seem to me to have anything really to do with types as experienced in programming.
Yet, over in the computer science department, we're awash in types. They're certainly used for reasoning about programs (both practically and theoretically) — but at some point our reasoning may become more about the types themselves than about the programs they apply to. Type systems can be strikingly reminiscent of bureaucratic red tape when one is getting tangled up in them. So, if they aren't a natively mathematical concept, why are they involved in our reasoning at all? Are they natural to what we're reasoning about (programs), or an unfortunate historical artifact? From the other side, is reasoning in mathematics simpler because it doesn't use types, or does it not need to use types because what it's reasoning about is simpler?
Looking back at the early history of programming, types evidently arose from the need to keep track of what format was being used by a given block of binary data. If a storage cell was assigned a value using a floating-point numerical representation, and you're trying to treat it as a series of ASCII characters, that's probably because you've lost track of what you meant to be doing. So we associate format information with each such cell. Note that we are not, at this point, dealing directly with the abstract entities of mathematics, but with sequences of storage bits, typically fixed-width sequences at that. Nor does the type even tell us about a sort of mathematical entity that is being stored, because within the worldview presented by our programming language, we aren't storing a mathematical entity, we're representing a data value. Data values are more abstract than bit sequences, but a lot less abstract than the creatures we'd meet in the math department. The essential difference, I'd suggest, is that unlike their mathematical cousins, data values carry about with them some of their own representation format, in this case bit-level representation format.
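The bookkeeping problem is easy to demonstrate: the same storage bytes yield entirely different data values depending on which format one assumes. A small sketch using Python's standard struct module (the particular value is illustrative):

```python
import struct

# Pack a 32-bit IEEE-754 float into four storage bytes -- our "storage cell".
cell = struct.pack('<f', 1.5)

# Read the cell back under the format it was written with: the intended value.
as_float = struct.unpack('<f', cell)[0]   # 1.5

# Misread the same four bytes as a sequence of 8-bit characters: gibberish.
as_chars = cell.decode('latin-1')

print(as_float, repr(as_chars))
```

The bits themselves are silent about which reading was meant; the type annotation carried alongside the cell is what supplies that lost context.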
A typical further development in typing is user-defined (which is to say, programmer-defined) types. Each such type is still stored in a sequence of storage bits, and still tells us how the storage is being used to represent a data value, rather than store a mathematical entity. There is a significant difference from the earlier form of typing, in that the language will (almost certainly) support a practically infinite number of possible user-defined types, so that the types themselves have somewhat the character of mathematical abstract entities, rather than data values (let alone bit sequences). If, in fact, mathematics gets much of its character by dealing with its abstract entities unfettered by representational issues (mathematics would deal with representation itself as just another abstract domain), a computer scientist who wants that character will prefer to reason as much as possible about types rather than about data values or storage cells.
Another possible development in typing, orthogonal to user-defined types, is representation-independence, so that the values constrained by types are understood as mathematical entities rather than data values. The classic example is type bignum, whose values are conceptually mathematical integers. Emphasis on runtime efficiency tends to heighten awareness of representational issues, so one expects an inverse relation between that emphasis, and likelihood of representation-independent types. It's not a coincidence that bignums flourish in Lisp. Note also that a key twist in the statement of the expression problem is the phrase "without recompiling existing code".
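Python's built-in integers illustrate the representation-independent case (Python here standing in for the Lisp tradition): the type constrains its values to be mathematical integers, and no fixed width ever surfaces in the language.

```python
# Python's int is a bignum: the value is conceptually a mathematical
# integer, and the underlying representation never leaks into the language.
n = 2 ** 200

# No overflow, no wraparound -- arithmetic is integer arithmetic
# regardless of magnitude.
assert (n + 1) - 1 == n
print(n)   # a 61-digit integer
```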
Complicated type systems as crutches
Once we have types, since we're accustomed to thinking about programs, we tend to want to endow our type systems with other properties we know from our programming models. Parametric types. Dependent types. Ultimately, first-class types.
I've felt the lure of first-class types myself, because they abandon the pretense that complicated type systems aren't treating types computationally. There's an incomplete language design in my files wherein a type is an object with two methods, one for determining membership and one for determining sub/supertyping. That way leads to unbounded complications — the same train of thought has led me more recently to consider tampering with incompleteness of the continuum (cf. Section 8.4.2 of my dissertation; yet another potential blog topic [later did blog on this, here]). As soon as I envisioned that type system, I could see it was opening the door to a vast world of bizarre tricks that I absolutely didn't want. I really wanted my types to behave as mathematical sets, with stable membership and transitive subtyping — and if that's what you want, you probably shouldn't try to get there by first giving the methods Turing power and then backing off from it.
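A minimal sketch of such a type-as-object, in Python (the particulars here are illustrative, not those of the design in my files), shows both the idea and the hazard:

```python
class Type:
    """A type as a first-class object: membership and sub/supertyping
    are ordinary -- and therefore Turing-powerful -- methods."""
    def __init__(self, contains, subtype_of):
        self._contains = contains
        self._subtype_of = subtype_of

    def contains(self, value):
        return self._contains(value)

    def subtype_of(self, other):
        return self._subtype_of(self, other)

# Two well-behaved "types", acting like mathematical sets:
Number  = Type(lambda v: isinstance(v, (int, float)),
               lambda s, o: o is Number)
Integer = Type(lambda v: isinstance(v, int) and not isinstance(v, bool),
               lambda s, o: o is Integer or o is Number)

assert Integer.contains(3) and not Integer.contains(3.5)
assert Integer.subtype_of(Number)

# But nothing forces good behavior: membership may be unstable,
# which is exactly the door to bizarre tricks one doesn't want.
calls = [0]
def flaky_membership(v):
    calls[0] += 1
    return calls[0] % 2 == 1     # the answer alternates per query

Flaky = Type(flaky_membership, lambda s, o: False)
assert Flaky.contains(42) != Flaky.contains(42)   # same question, two answers
```

Once the methods have full computational power, stable membership and transitive subtyping become properties one must hope for rather than properties one gets by construction.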
But, from the above, I submit that these complicated type systems are incited, to begin with, when we start down the slippery slope by
- tangling with data values — halfway between the abstract and concrete worlds — instead of abstract mathematical entities, and
- placing undue emphasis on types, rather than the things they describe. This we did in the first place, remember, because types were more nearly mathematical; the irony of that is fairly intense.
What we need, it seems, is the lightweight character of mathematical reasoning. There's more to it than mathematical "purity"; Haskell is fairly pure, but I find it appallingly heavy. I find no sense of working with simple primitives — it feels to me more like working on a scaffold over an abyss. In mathematics, there may be several different views of things any one of which could be used as a foundation from which to build the others. That's essentially perfect abstraction, in that from any one of these levels, you not only get to ignore what's under the hood, but you can't even tell whether there is anything under the hood. Going from one level to the next leaves no residue of unhidden details: you could build B from A, C from B, and A from C, and you've really gotten back to A, not some flawed approximation of it that's either more complicated than the original, more brittle than the original, or both.
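A classic instance of such mutually constructible views: one can take pairs as primitive and build functions from them (as sets of pairs), or take functions as primitive and build pairs from them. The latter direction can be sketched via Church encoding (a standard construction, here in Python for concreteness):

```python
# Pairs built purely from functions (Church encoding): a pair is a
# function that hands its two components to whatever selector it's given.
def cons(a, b):
    return lambda pick: pick(a, b)

def car(p):
    return p(lambda a, b: a)   # selector choosing the first component

def cdr(p):
    return p(lambda a, b: b)   # selector choosing the second component

p = cons(1, 2)
assert car(p) == 1 and cdr(p) == 2
```

From outside, nothing distinguishes these pairs from "real" ones — which is the point: a perfect abstraction leaves no way to tell what, if anything, is under the hood.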
Making that happen in a language design should involve some subtle shifts in the way data is conceptualized. That isn't a digression in a discussion of types, because the way we conceptualize data has deep, not to say insidious, effects on the nature of typing. As for the types themselves, I suggest we abandon the whole notion of types in favor of a lightweight mathematical notion of sets — and avoid using the word "type" as it naturally drags us back toward the conceptual morass of type theory that we need to escape.