Monday, November 14, 2011

Where do types come from?

[On the phone] There's a man here with some sort of a parasite on his arm, assimilating his flesh at a frightening speed.  I may have to get ahead of it and amputate.  No... I don't know what it is or where it came from.
— Dr. Hallen, The Blob
I took some tolerably advanced math courses in college and graduate school.  My graduate research group of choice was the Theory Umbrella Group (THUG), joint between the math and computer science departments.  But one thing I never encountered in any of those courses, nor that-I-recall even in the THUG talks, was a type.  Sets aplenty, but not types.  Types seem to arise from specifically studying computer science, mathematics proper having no native interest in them.  There are the "types" in Russell and Whitehead's Principia Mathematica, but those don't seem to me to have anything really to do with types as experienced in programming.

Yet, over in the computer science department, we're awash in types.  They're certainly used for reasoning about programs (both practically and theoretically) — but at some point our reasoning may become more about the types themselves than about the programs they apply to.  Type systems can be strikingly reminiscent of bureaucratic red tape when one is getting tangled up in them.  So, if they aren't a natively mathematical concept, why are they involved in our reasoning at all?  Are they natural to what we're reasoning about (programs), or an unfortunate historical artifact?  From the other side, is reasoning in mathematics simpler because it doesn't use types, or does it not need to use types because what it's reasoning about is simpler?

Representation format

Looking back at the early history of programming, types evidently arose from the need to keep track of what format was being used by a given block of binary data.  If a storage cell was assigned a value using a floating-point numerical representation, and you're trying to treat it as a series of ASCII characters, that's probably because you've lost track of what you meant to be doing.  So we associate format information with each such cell.  Note that we are not, at this point, dealing directly with the abstract entities of mathematics, but with sequences of storage bits, typically fixed-width sequences at that.  Nor does the type even tell us about a sort of mathematical entity that is being stored, because within the worldview presented by our programming language, we aren't storing a mathematical entity, we're representing a data value.  Data values are more abstract than bit sequences, but a lot less abstract than the creatures we'd meet in the math department.  The essential difference, I'd suggest, is that unlike their mathematical cousins, data values carry about with them some of their own representation format, in this case bit-level representation format.
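
To make this concrete, here is a small illustrative sketch (in Python, purely as a stand-in for the bit-level picture; nothing about it is specific to any particular machine): the same 32-bit pattern, read back under three different format assumptions.

import struct

# Illustrative sketch only: the same 32 storage bits, interpreted three ways.
cell = struct.pack('>f', 1.25)            # store 1.25 in IEEE-754 float format
print(struct.unpack('>f', cell)[0])       # 1.25 -- read with the intended format
print(struct.unpack('>4B', cell))         # (63, 160, 0, 0) -- read as raw bytes
print(cell.decode('ascii', 'replace'))    # gibberish -- (mis)read as ASCII text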

A typical further development in typing is user-defined (which is to say, programmer-defined) types.  Each such type is still stored in a sequence of storage bits, and still tells us how the storage is being used to represent a data value, rather than store a mathematical entity.  There is a significant difference from the earlier form of typing, in that the language will (almost certainly) support a practically infinite number of possible user-defined types, so that the types themselves have somewhat the character of mathematical abstract entities, rather than data values (let alone bit sequences).  If, in fact, mathematics gets much of its character by dealing with its abstract entities unfettered by representational issues (mathematics would deal with representation itself as just another abstract domain), a computer scientist who wants that character will prefer to reason as much as possible about types rather than about data values or storage cells.

Another possible development in typing, orthogonal to user-defined types, is representation-independence, so that the values constrained by types are understood as mathematical entities rather than data values.  The classic example is type bignum, whose values are conceptually mathematical integers.  Emphasis on runtime efficiency tends to heighten awareness of representational issues, so one expects an inverse relation between that emphasis, and likelihood of representation-independent types.  It's not a coincidence that bignums flourish in Lisp.  Note also that a key twist in the statement of the expression problem is the phrase "without recompiling existing code".

Complicated type systems as crutches

Once we have types, since we're accustomed to thinking about programs, we tend to want to endow our type systems with other properties we know from our programming models.  Parametric types.  Dependent types.  Ultimately, first-class types.

I've felt the lure of first-class types myself, because they abandon the pretense that complicated type systems aren't treating types computationally.  There's an incomplete language design in my files wherein a type is an object with two methods, one for determining membership and one for determining sub/supertyping.  That way leads to unbounded complications — the same train of thought has led me more recently to consider tampering with incompleteness of the continuum (cf. Section 8.4.2 of my dissertation; yet another potential blog topic [later did blog on this, here]).  As soon as I envisioned that type system, I could see it was opening the door to a vast world of bizarre tricks that I absolutely didn't want.  I really wanted my types to behave as mathematical sets, with stable membership and transitive subtyping — and if that's what you want, you probably shouldn't try to get there by first giving the methods Turing power and then backing off from it.
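
For a rough flavor of that abandoned design, here's a hypothetical sketch in Python (the names are invented for illustration; the design itself wasn't in Python): a type is just an object carrying a membership method and a subtyping method, each an arbitrary computation.

# Hypothetical sketch of "type as an object with two methods"; names invented.
# Nothing here constrains either method to behave like a mathematical set:
# membership needn't be stable, subtyping needn't be transitive, and either
# method is free to diverge.
class FirstClassType:
    def __init__(self, membership, subtype):
        self.membership = membership    # value -> bool
        self.subtype = subtype          # other type -> bool

    def has_member(self, value):
        return self.membership(value)

    def is_subtype_of(self, other):
        return self.subtype(other)

number  = FirstClassType(lambda v: isinstance(v, (int, float)),
                         lambda other: other is number)
integer = FirstClassType(lambda v: isinstance(v, int),
                         lambda other: other in (integer, number))

print(integer.has_member(3))           # True
print(integer.is_subtype_of(number))   # True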

But, from the above, I submit that these complicated type systems are incited, to begin with, when we start down the slippery slope by
  • tangling with data values —halfway between the abstract and concrete worlds— instead of abstract mathematical entities, and
  • placing undue emphasis on types, rather than the things they describe.  This we did in the first place, remember, because types were more nearly mathematical; the irony of that is fairly intense.
In contrast to the muddle of complicated typing in computer science, folks over in the math department deal mostly with sets, a lightweight concept that fades comfortably into the background and only has to be attended carefully under fairly extreme circumstances.  Indeed, contrasting types and sets, a major difference between them is that types have object identity — which is itself a borderline representational concept (able to come down on either side of the line), and jibes with experience that sophisticated types become data structures in their own right.  Yes, there are such things as types that don't have object identity; but somehow it seems we've already crossed the Rubicon on that one, and can no longer escape from the idea even in languages that don't endorse it.

Where next?

What we need, it seems, is the lightweight character of mathematical reasoning.  There's more to it than mathematical "purity"; Haskell is fairly pure, but I find it appallingly heavy.  I find no sense of working with simple primitives — it feels to me more like working on a scaffold over an abyss.  In mathematics, there may be several different views of things any one of which could be used as a foundation from which to build the others.  That's essentially perfect abstraction, in that from any one of these levels, you not only get to ignore what's under the hood, but you can't even tell whether there is anything under the hood.  Going from one level to the next leaves no residue of unhidden details:  you could build B from A, C from B, and A from C, and you've really gotten back to A, not some flawed approximation of it that's either more complicated than the original, more brittle than the original, or both.

Making that happen in a language design should involve some subtle shifts in the way data is conceptualized.  That isn't a digression in a discussion of types, because the way we conceptualize data has deep, not to say insidious, effects on the nature of typing.  As for the types themselves, I suggest we abandon the whole notion of types in favor of a lightweight mathematical notion of sets — and avoid using the word "type" as it naturally drags us back toward the conceptual morass of type theory that we need to escape.

33 comments:

  1. With the sort of first-class types I was describing, the type checker is the program — type-checking is just another form of general computation.  If the system fails to decide whether an expression is well-typed, that's because the type-checking computation doesn't terminate; and since nontermination can't, in general, be detected in finite time, the checker isn't barfing, it's hanging.  And it's possible that the determination of well-typing will have bugs in it, introduced either accidentally or deliberately.  It isn't even necessary for a deliberate anomaly in the typing to be malicious; the programmer may have thought it a clever solution to some problem or other.
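
    For instance (a contrived sketch of my own, in Python, just to show the shape of the problem): nothing stops a membership method from searching forever for an answer, so asking whether a value has the type may simply never return.

    # Contrived sketch: a membership check that hangs on some inputs.
    def is_even_natural(value):
        n = 0
        while True:                # search for a witness, never giving up
            if 2 * n == value:
                return True        # found one: value is an even natural number
            n += 1                 # for odd or negative values, loops forever

    print(is_even_natural(6))      # True
    # is_even_natural(7) never returns: the checker doesn't barf, it hangs.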

  2. I think you missed a few types for math: imaginary numbers, composed of a real and an imaginary component, are certainly a type. You could consider that the base type of math is the digit 1; all other numbers can be described as either a 1, the absence of 1 (0), a composite of 1s (2 = 1 + 1), or 1s with additional characteristics (negative, positive). I'm sure I could continue, but the point is that math has been around for so long that the "types" are no longer seen as composites but as "first-class" types. Types in programming languages come about because the world is based on types; when was the last time you had to directly manipulate a quark to get a glass of water?

  3. I didn't miss those things.  I've worked, to some nontrivial extent or other, with complex numbers, quaternions, vectors, matrices, matroids, graphs, and categories (that I recall, off hand).  I simply maintain that those are sets rather than types, with the qualitative differences that sets are abstract while types (as met in actual programming) are in the twilight between abstract and concrete, and sets are lightweight while types are comparatively heavy (weighed down, in a curiously apt metaphor, by all that concrete).

    Note that you could construct complex numbers as pairs of real numbers — but you don't have to.  You could treat complex numbers as primitive in their own right, and safely assume that there is nothing at all under the hood, unless it's convenient for what you're doing to assume that there is something under the hood after all.  That level of clean abstraction generally isn't to be found in our programming languages.

  4. 010101000111100101110000011001010111001100100000011100000111001001101111011000100110000101100010011011000111100100100000011000110110111101101101011001010010000001100110011100100110111101101101001000000111010001101000011001010010000001100110011000010110001101110100001000000111010001101000011000010111010000100000011000110110111101101101011100000111010101110100011001010111001001110011001000000111010101101110011001000110010101110010011100110111010001100001011011100110010000100000001100010010000001100001011011100110010000100000001100000010000001101001011011100010000001100001001000000111000001100001011101000111010001100101011100100110111000100000011000010110111001100100001000000111011101100101001000000110111001100101011001010110010000100000011000010010000001110111011000010111100100100000011101000110111100100000011010010110010001100101011011100111010001101001011001100111100100100000011101110110100001101001011000110110100000100000011001110111001001101111011101010111000001101001011011100110011100100000011011110110011000100000001100010010011101110011001000000110000101101110011001000010000000110000001001110111001100100000011110010110111101110101001000000110000101110010011001010010000001110010011001010110000101101100011011000111100100100000011011000110111101101111011010110110100101101110011001110010000001100001011101000010000001100110011100100110111101101101001000000111010001101000011001010010000001101000011000010111001001100100011101110110000101110010011001010010011101110011001000000111000001100101011100100111001101110000011001010110001101110100011010010111011001100101

    This is what the hardware sees at the circuit level: it is on or off, in patterns of powers of 2. To me, the abstract concept of types has to do with defining what we want these two patterns to represent to us humans.

  5. Haven't you heard about generic interfaces? Well, those are types, but they don't define the representation. That is, they are abstract, and that's how we express "set" in a language where "set" is not a first-class thing. Then you will have subtypes for ordered sets, vectors, matrices and so on. Where is the flaw in this approach? Is it conceptually flawed, are there problems with the implementation, or is it a matter of language selection (you are using LISP, right?)?

    I have thought in the past that we need an algebra for classes (rather than types), so that we can express two different things: "All the roses are red" and "The roses can only be red", where in the first I talk about a common attribute of all the (current) objects of a class (rose), and in the second I talk about an attribute of the class itself [more on that later], so it is not a "coincidence" that we only have red roses in our reference set.

    Take the Syllogism:
    - All men are mortal
    - All Greeks are men
    ----------------------
    - All Greeks are mortal

    Is it a coincidence* that all Greeks are mortal?
    *: a property of the given situation, that is, one derived from known parameters. To word it another way: is the predicate "All Greeks are mortal" a tautology, or does it depend on parameters?

    I don't know if I'm expressing myself well enough... you can define the operation "is instance of" which, for example, given an object and a class, will evaluate to true or false. So we can carry that to first-order logic, and express "For all" (as in: those in a set) and "For all possible" (as in: those of a class, those in the set of all possible objects of a class).

    Then again, in the syllogism, am I talking about the set of all known Greeks, or am I talking about Greeks as a class? The same goes for men.

    I know that when I say "those of a class" it can be confusing; it means to address all the possible objects the class can have (be they a finite or infinite set), and the objects I'm actually working with are just a subset of that.

    I know, you must be thinking: "but it is just another set". Well, yes, the class "has" a set of all the possible objects, but it adds its relationships to other classes, and that is a graph you can reason about (I know that that is also a set... but math is just a literary field if you want to take the point of view of symbol manipulation; if you want to see sets, you have sets). Think about it: you can carry on to define inheritance, association and attributes. In these aspects it will be similar to relational algebra, but freed from the concepts of "primary key" and "foreign key" for a more object-oriented perspective, with the addition of inheritance and, why not, generics. Why free it from those? Because those are implementation details; they are about the representation.

    Maybe, I'm talking about an extension to relational algebra... if that is what you want to see.

    I wonder how far it is plausible to take this concept. I would like to develop it as a theory [or a cool toy... or find out if it has been done], but I've got deadlines to meet, you know?

  6. I think you are saying, could be wrong here, that a set can contain heterogeneous elements. This would allow the construction of numbers with attributes. Take as an example integer numbers being a magnitude combined with a sign attribute, or a ‘set’ of things that represent a number. This is analogous to a ‘struct’ kind of entity where you have combined data without an associated active element. The arithmetic domain language describes the process of adding these numbers as A + B = C. But this is just hiding the messy bits that are happening underneath that + operator. The actual operations are more like:

    If sign of B is positive and sign of A is positive then C is a positive sign combined with the sum of the magnitudes of A and B

    If sign of B is positive and sign of A is negative and magnitude of A is greater than magnitude of B then C is a negative sign combined with the difference of the magnitudes of A and B.

    If sign of B is positive and sign of A is negative and magnitude of B is greater than magnitude of A then C is a positive sign combined with the difference of the magnitudes of A and B.

    There are more rules like this but the point is that the rules are not implicit with the set. You are taught the rules at some point and then do not think of them anymore leading to the assumption that you are dealing with a base type. The same thing happens with programming languages that allow operator overloading. You can create a type, or class, called complex that consists of two real numbers, floats, and define what happens when using the + operator all within the context of the complex type. Once you have this type, adding two complex types is as simple as A + B. In general, people using the complex type are going to forget about the type being a composite thing and use it as if it was a primary type.
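
    For example (sketched here in Python rather than C++, though the mechanism is the same): once + is defined on the composite type, its users write A + B and stop thinking about the parts underneath.

    # Rough sketch of operator overloading hiding a composite representation.
    class Complex:
        def __init__(self, re, im):
            self.re = re            # real part
            self.im = im            # imaginary part

        def __add__(self, other):
            # The messy bits hidden under the + operator:
            return Complex(self.re + other.re, self.im + other.im)

        def __repr__(self):
            return f"({self.re} + {self.im}i)"

    a = Complex(1.0, 2.0)
    b = Complex(3.0, -4.5)
    print(a + b)                    # (4.0 + -2.5i) -- users just write A + B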

    The ability to overload operators and/or provide ‘clean’ interactions between types has been available since at least 1979 with the development of the C++ programming language. There are other languages that also include these features that probably pre-date C++ but I am most familiar with C++. If you are using a programming language that does not provide these features then you may want to reconsider your language choice. We are always trying to develop better domain languages, after all I do not know anybody who wants to do complex math with Roman numerals.

  7. I am fascinated and intrigued by your post, John. So, what you write in this post could lead us to a new programming paradigm? How do you envision this?

  8. You should check Translating mathematics into code: Examples in Java, Python, Haskell and Racket, http://matt.might.net/articles/discrete-math-and-code/

    It tries to solve the exact problem with types you described here.

  9. I agree, John. I guess that the best single data type to start with is just the s-expression. Perhaps even the implementation of integers, as a type distinct from symbols, was a step backward for Lisp. But symbols should somehow be reduced to s-expressions; the dichotomy between processing of symbols (like make-symbol, gensym) and processing of s-expressions isn't elegant.

  10. It seems to me that `math` is inherently lightweight compared to programming. Many concerns of programming do not seem to readily apply in mathematics:
    * Modularity - information hiding, independent development, encapsulation
    * Extension - pluggability, open systems, communication
    * Performance - efficiency, compilation
    * Resource Control - linearity, garbage collection, ownership, quotas, job control
    * Security and Authority - capabilities, sealing
    * Failure Modes and Recovery
    * Licensing & Intellectual Property

    Any of these concerns can influence `types` in various ways that are outside mathematical idealism.

    I agree that a lot of type systems seem like `red tape`.

    Replies
    1. The consequence of ignoring these things is that "while most theorems are true, most proofs have bugs" (Paul Pedersen). Mathematical proof is a social notion: a proof is valid, in practice, if enough of the relevant mathematicians agree that it is. Programs do things, and so it's not enough that everyone agrees your program is bug-free; if it computes the wrong answer compared to the real world, it has a bug.

  11. Some of those concerns are indeed surely more cs than math — performance and licensing seem especially so. But with others, it may be our approach that is un-math-like, and that may be only because we haven't yet risen to the challenge of properly reapplying the mathematical principles to cs. In that regard, modularity especially catches my eye. I won't frame this well, perhaps, as I don't have a pre-practiced smooth articulation of the thought, but — If you're working with complex numbers as primitives, with real numbers defined as a subset of them, and somebody does something based on complex numbers being defined as pairs of real numbers, you can say, "but I'm not defining complex numbers that way". That seems to be information hiding... but it's lightweight.

    A thought that's crossed my mind since I wrote this is that some part of the heaviness of types is because we don't mean types as much for us to reason about programs as for computers to reason about programs — and, as I noted on LtU, we think and are clumsy at computing, while computers compute and are clumsy at thinking. So types by catering to computers become more difficult for us.

  12. This comment has been removed by the author.

  13. This is an interesting discussion.

    The world I write software in is rarely neat and tidy. There are no nice clean sets with unambiguous rule sets. Instead we have semantics that evolve and morph as the understanding of the problem space and the solutions evolve. We have relationships between those semantics that are far from hard and fast. It's not possible to build an unambiguous grammar, because the problem exists in a space full of ambiguity and edge cases.

    The point is that a programming language suitable for describing clean abstract models will inevitably be more math-like than a language that has evolved to describe messy people problem spaces.

    Types, and the structure around them, are a very useful construct that forms a functional bridge between messy semantics and clean data.

  14. The difference between an int32 and an int64 is not visible in 2=1+1 but elsewhere.

    I think that this observation has something to do with the concept of "light weight sets" -- if we want them to be light weight, I think we are sometimes going to be playing fast and loose with the boundary cases.

  15. Part of the heaviness here is that you are working with int32 and int64 at all.  It's the existence of the boundary cases that drags them down.  In contrast, bignum has no boundary cases (short of things like the amount of virtual memory available).
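
    To make the contrast concrete (a little Python sketch of my own; Python's native integers happen to be bignums): the boundary case is exactly where fixed-width behavior diverges, and the bignum simply has no such place.

    # Simulate 32-bit two's-complement addition alongside native bignum addition.
    def add_int32(a, b):
        s = (a + b) & 0xFFFFFFFF
        return s - 0x100000000 if s >= 0x80000000 else s

    print(add_int32(1, 1))            # 2 -- indistinguishable from bignum here
    print(add_int32(2**31 - 1, 1))    # -2147483648 -- the boundary case
    print((2**31 - 1) + 1)            # 2147483648 -- bignum: no boundary to hit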

  16. > Indeed, contrasting types and sets, a major difference between them is that types have object identity

    I'm struggling to understand what you mean by that. How does the Haskell type "Monad m => m a -> (a -> m b) -> m b" or the Java type "List<? extends Map<String, ? extends Number>>" have an object identity?

  17. I'll try to clarify, and then carry the thought one step further than I had above.

    I did note that not all types have this.  What I meant by the term object identity here —a term I believe I picked up years ago from the Brown University database folks— is types with separate identical definitions, which are distinct types because the definitions are separate, rather than the same type because the definitions are identical.  So that when you say "this is of type X" and "that is of type Y", though X and Y are identically defined, you aren't saying the same things about this and that.
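
    To illustrate (a small Python sketch, names invented): two definitions that are textually identical still introduce two distinct types, so membership in one is a different claim than membership in the other.

    # Two identically written definitions, yet two distinct types.
    class X:
        def __init__(self, value):
            self.value = value

    class Y:
        def __init__(self, value):
            self.value = value

    thing = X(42)
    print(isinstance(thing, X))    # True
    print(isinstance(thing, Y))    # False -- same definition, separate identity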

    It does occur to me, though, that some such identity is probably necessary for information hiding, which math reasoning does better than we do. Yet I still think it's weighing down our types. Looking to reconcile those two thoughts, perhaps in lightweight reasoning the identity device should be orthogonal to sets.

  18. I got ya, thanks for clarifying. Nominative vs structural types.

  19. A different way of viewing types is that they are a mechanism for error control, and as such their use is better matched to the realities of human psychology.

    The same comparison can be made with engineering and scientific measurements and their need for well-understood units and dimensions (Mass, Length, Time :: kilogram, metre, second; etc.). These also help detect mistakes when used, though it is common for software to repeat such unit and dimension errors unless the units are embedded (e.g. MathCAD).

  20. Russell and Whitehead's types _are_ connected to types in programming languages! Their type theory was simplified into "simple type theory", which was then formalized by Alonzo Church. If you omit the logical constants from Church's theory, you get the simply-typed lambda-calculus, the prototypical typed programming language.

    This also suggests that types have some motivation independent of binary data formats, since it happened before there were electronic computers. :)

  21. I did choose my wording rather carefully there (I'm not perfect, but I do try to word things cautiously); with emphasis added:  R&W's types "don't seem to me to have anything really to do with types as experienced in programming."  My point is that the actual experience in programming is something quite qualitatively different.

  22. You are probably already familiar with this, but I was very happy when I finally understood that a type represents what we have already proved about a value.
    I don't know if this is how types are used but I am very excited by what they could be.

    Replies
    1. Yes! It's exciting, I agree. Yet I suspect that somehow, the way we've been doing types may not offer any path of incremental migration to that exciting possibility ("you can't get there from here"). Hence my desire to understand better where our approach to types has gone wrong.

    2. I don't know how we've been doing types so I'll agree with you. Treating a class (in object-oriented programming) like a type is wrong, because you could want a type that only includes balanced trees, and you could have a class that represents trees of knitting patterns. When you add a pattern for a sweater to the tree, its class doesn't change but some of its types might.

      I like that you allow polymorphic functions to dispatch on arbitrary predicates. That covers most things that I would want.

  23. I want to focus on something. CS both has an additive bias in its abstraction mechanisms and cares about it. Math doesn't.

    In CS, we have to worry about three things that mathematics doesn't have. First, we have to worry about efficiency in time and space. Every layer of abstraction costs time and space. If we have primitive "numbers" that include complex, then derive "real" via abstraction and "complex" numbers via abstraction on reals, we are using at least twice the representation space we need, and wasting at least 75% of our effort. Even with identical semantics, no computer scientist could possibly be satisfied that we have really reached the "same thing" again.

    That brings us to the Second thing we need to worry about, which is managing abstraction complexity. Mathematicians can show a very complex relationship between two things, and then treat the second thing as a primitive unto itself. CS can do a similar thing with a very complicated implementation of the second thing, but that complicated implementation keeps on costing us overhead long after we discover it.

    We try to reduce complexity by providing a simple interface and hiding information about the implementation from other parts of the program, in order to make an abstraction that can be used additively. As a way of managing complexity, it (at least sort of) works.

    Third, we have to worry about our additive bias -- We can easily see when we can add code to get to something we want, but it is very hard to take unwanted code away in order to get to something we want. In fact the whole OO paradigm is based on trying to make programs by simple addition of code rather than by allowing the deeper integration/understanding that might result in taking code away, because that deeper integration is complex and we have to manage complexity or go stark raving bonkers.

    So this is the pain and vicious cycle of types as seen in CS. We care, and care deeply, about something which, by emulating the abstractive pattern of mathematics as closely as possible, we make ourselves blind to wasting.

    Adding code is "just what you do" when you're faced with a problem. But by the time they've built nine abstractions each on top of the last, a mathematician is making perfectly good progress but the computer programmer gets fired because his or her code is running like ants stuck in honey.

    To achieve parity with mathematics, we need a tool that fully integrates instead of fully separating, eliminates redundant and unnecessary infrastructure, and gives us, for this example, complex numbers that truly *are* the same no matter what pile of abstractions we've gone through to get them. And real-number object code that actually does *less* than the complex-number code it's derived from, rather than doing *more* and ignoring the unwanted parts.

    But, that would be the mythical "sufficiently smart compiler" that Turing proved can't actually exist (see also: Halting). It's a problem.

    Replies
    1. Does math not worry about efficiency? Maybe they just don't call it the same thing. Simple proofs are *elegant*.

  24. Hi! New reader here, a complete Comp.Sci amateur (IT guy, hobbyist programmer), but I've been thinking informally about topics very much like yours for about the last five years. Specifically, in the context of extensible language design for domain specific artificial intelligence languages; much more specifically, in terms of the baroque little steampunk language Inform 7 (www.inform7.com).

    Inform 7 is very intriguing and breaks a lot of programming assumptions as a language because it's English based and much of the "program code" is logical predicates somewhat similar to those of Prolog. (Not entirely; it also has rules, SQL-ish data tables, objects with multidispatch methods, and procedures. It's a very weird kind of beast, and trying to tame its complexity with something simpler was what got me into the logical foundations of programming).

    But the thing with I7 code is that it is extremely sensitive to the program text, much more than you would normally see in code. For example, from this single line:

    The Fonz is a man. He is in the kitchen.

    the I7 compiler can infer the following sort of things:

    Object_1234 exists.
    Object_1234 is of class "man", and inherits the default properties of that class (such as membership of class "human", property "male")
    Object_1234 has property "display_name" with value "The Fonz"
    Object_1234 has property "parse_name" with value "fonz"
    A dictionary word "fonz" is added to the dictionary.
    Object_1234 has property "article" with value "the".
    Object_1234 is singular.

    Object_1235 exists.
    Object_1235 is of class "container" or "room".
    Object_1235 has parse_name "kitchen".
    Object_1235 has a default display_name of "kitchen" (but this is likely to be overridden by something more like "The Kitchen" if there's a better declaration later).

    An instance of the "located_in" relation exists between Object_1234 and Object_1235

    As you can see, there's some extremely fine parsing of the source text going on to extract all this information, and this is something it would be really useful to have available - and do - at runtime. I7's design, being compiler-based, doesn't currently allow this, but that line of thought - 'wait, why can't all programming languages be this dense and expressive' - leads naturally toward wishing to parse arbitrary source text of arbitrary domain specific languages at runtime. And that leads to fexprs, vau, and friends.

    And then with types: the more one looks at languages based on logical declarations - like I7 or Prolog, for instance - the more it feels that 'types' are just another kind of logical declaration, so why is there all this machinery that tries to separate them out as something special? For instance, why should the Prolog assertion:

    integer(x).

    be, a priori, something utterly different from C's:

    int x;

    when they both seem to be saying the same thing: 'the predicate "integer" is asserted of the symbol "x"'? And this is just a chunk of data, isn't it, that ought to be able to be pointed at by any number of other predicates?
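
    (To sketch what I mean, in Python since it's handy; the representation is invented purely for illustration: a 'type declaration' could be stored as just one more assertion, plain data that any other predicate can refer to.)

    # Type declarations as ordinary logical assertions, stored as plain data.
    facts = {
        ("integer", "x"),          # roughly: integer(x).  /  int x;
        ("even", "x"),
        ("integer", "y"),
    }

    def holds(predicate, symbol):
        return (predicate, symbol) in facts

    print(holds("integer", "x"))   # True -- the 'type' of x, queried like any fact
    print(holds("even", "y"))      # False -- just another predicate about a symbol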

    Taking this a bit further, and looking afield at the big wide Internet and the Semantic Web, it seems like almost every bit of data on the Net, from my desktop to my blog to the save game state of my Xbox, could be expressed as predicates in something like RDF (preferably a better language, but something like it), and that - if we really wanted and needed type assertions - couldn't they just be a small set of predicates about other predicates? So if we had logical assertions to start with, why do we need separate type systems at all?

    And that breadcrumb trail leads right into the maw of Russell and the paradoxes. And I'm glad that you're exploring in this direction, because it seems extremely productive to me, and long past due.

    Replies
    1. We do seem to be wandering down similar paths.

      My path took me through the intersection of 'extensible language' and 'grammar' in the early 1990s; you might be interested in my master's thesis, on Recursive Adaptive Grammars. I even speculated at the time about using a fuzzy RAG, where rule sets are fuzzy sets, to tackle natural language parsing; however, I later concluded that grammatical ambiguity is undesirable in a programming language, as reasoned in a later techreport.

      The connection with logic, and the desirability of breaking down the barrier between compile- and runtime, do seem to be common themes between us. I might remark that all the time I was writing these things in LaTeX, I've mulled over a possible redesign of TeX to be based on fexprs instead of macros, since TeX works mostly with unevaluated data, which ought to be fexprs' strong suit, especially when building massive abstractions on top, as LaTeX tries to do.

  25. " type is an object with two methods, one for determining membership and one for determining sub/supertyping." -- is the second method neccesary? You could just let subtyping happen -- whenever type A accepts all members that type B accepts, B is subtype of A.

    Replies
    1. Because the membership method is defined by a computational algorithm, determining whether membership in type T1 implies membership in type T2 cannot be done by examining the definitions of the membership methods; it'd run afoul of the halting problem. So, yes, if you want the system to have access to the subtype relation — and subtyping figures into a lot of common uses of typing — one needs a method to determine it.

  26. I know it's been a while since this was posted, but I'm considering implementing my own dialect of Lisp with gradual types, and wanted to ask: how would you see sets (vs types) being realized in a clean-slate programming language?

    Replies
    1. I wrote this post because I saw the problem; I wasn't sure what a solution might look like, and truthfully I still don't know. Fwiw, imho a clean-slate language design is an *excellent* way to explore whatever PL concept you may have in mind, and I find Lisp an excellent medium in which to set such a concept and let it be itself unapologetically.
