Wednesday, April 6, 2011

Fexpr

Fexpr is a noun.  It's pronounced FEKSper.  A fexpr is a procedure that acts on the syntax of its operands, rather than on the values determined by that syntax.

This is only in the world of Lisp programming languages — not that only Lisps happen to have fexprs, but that having fexprs (and having the related characteristics that make them a really interesting feature) seemingly causes a programming language to be, at some deep level, a dialect of Lisp.  That's a clue to something:  although this acting-on-syntax business sounds superficial, it's a gateway to the deepest nature of the Lisp programming-language model.  Fexprs are at the heart of the rhyming scheme of Lisp.

Data as programs

When Lisp reads a syntax (i.e., source code) expression, it immediately represents the expression as a data structure; or at least, in theory it does.  For fexprs to even make sense, this would have to be true:  a procedure in a program acts on data, so if you're passing the operand syntax expressions to a procedure, those expressions have to be data.  The Lisp evaluator then interprets the syntax expression in data form, and that's the whole of Lisp:  read an expression and evaluate it, read another and evaluate it, and so on.

But Lisp was designed, from the start, specifically for manipulating an especially simple and general kind of data structure, essentially trees (though they can also be viewed as nested lists, hence the name of the language, short for LISt Processing).  And syntax expressions are represented as these same trees that are already Lisp's native data structure.  And the Lisp evaluator algorithm isn't limited to data that started life as a representation of syntax:  any data value can, in principle, be evaluated.  Which means that a fexpr doesn't have to act on syntax.
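
For instance, here's a minimal sketch in Kernel, the Lisp used later in this post (eval and make-environment are standard Kernel applicatives, both of which reappear below):

($define! expr (list * 6 7))    ; a computed structure whose head is the applicative *
(eval expr (make-environment))  ; => 42, though this datum never existed as source text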

The theory of fexprs

One reason it matters that fexprs can act on non-syntax is because of a notorious theoretical result about fexprs.  Proving programs correct usually makes heavy use of determining whether two given source expressions are interchangeable.  When two source expressions may be operands to a fexpr, though, they won't be interchangeable in general unless they're syntactically identical.  So with fexprs in the language, no two distinct expressions are ever universally interchangeable.  This was famously observed by Mitch Wand in a 1998 journal article, The Theory of Fexprs is Trivial.

But while a fexpr can analyze any syntactic operand down to the operand's component atoms, computed operands are a different matter.  It's almost incidental that some computed data structures are encapsulated, and so can't be fully analyzed by fexprs.  The more important point is that, even if the structure resulting from computation can be fully analyzed, the process by which it was computed is not subject to analysis.  If a fexpr is given an operand 42, the fexpr can't tell how that operand was arrived at; it might have been specified in source code, or computed by multiplying 6 times 7, or computed in any of infinitely many other possible ways.
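
Here's a sketch of that in Kernel.  ($vau, Kernel's constructor of operatives, doesn't otherwise appear in this post; the name $opnd is made up for the occasion.)

($define! $opnd ($vau (x) #ignore x))   ; an operative returning its operand, unevaluated

($opnd (* 6 7))                         ; => the list (* 6 7): the source structure is visible

(eval (list $opnd (* 6 7))              ; the operand is computed as 42 before the
      (make-environment))               ;   combination exists; => 42, with no trace
                                        ;   of the multiplication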

So, suppose one sets up a computational calculus, something like lambda-calculus, for describing computation in a Lisp with fexprs.  Source expressions are terms in the calculus, and no two of them are contextually equivalent (i.e., interchangeable as subterms of all larger terms).  But —unless the calculus is constructed pathologically— there are still very many terms in the calculus, representing intermediate states of subcomputations, that are contextually equivalent.

I've developed a calculus like that, by the way.  It's called vau-calculus.

Deep fexprs

We're about to need much better terminology.  The word fexpr is a legacy from the earliest days of Lisp, and procedure is used in the Lisp world with several different meanings.  Here's more systematic terminology, which I expanded from Scheme for use with the Kernel programming language.

A list to be evaluated is a combination; its first element is the operator, and the rest of its elements are operands.  The action designated by the operator is a combiner.  A combiner that acts directly on its operands is an operative.  (Legacy terms: an operative that is a data value is a fexpr, an operative that is not a data value is a special form.)  A combiner that isn't operative is applicative; in that case, the operands are all evaluated, the results of these evaluations are called arguments, and the action is performed on the arguments instead of on the operands.
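
To illustrate the distinction with combiners from Kernel's standard environment (<? is Kernel's numeric less-than predicate):

($define! x 3)
($if (<? x 5) "small" "big")   ; => "small"; $if is operative: only one branch is evaluated
(+ x 5)                        ; => 8; + is applicative: operands x and 5 are first
                               ;   evaluated to arguments 3 and 5
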
It might seem that applicative combinations would be more common, and far more varied, than operative combinations.  Explicitly visible operatives in a Lisp program are largely limited to a small set, used to define symbols (in Kernel, mainly $define! and $let), construct applicatives ($lambda), and do logical branching ($if and $cond) — about half a dozen operatives, used over and over again.  Programmer-defined combiners, in all their riotous variety, are almost all applicative.

But look closely at the above definition of applicative: it implies that every applicative has an operative hiding inside it.  Once an argument list has been computed, it's just another list of data values — and those values are then acted on directly with no further processing, which is what one does when calling an operative!  Applicative +, which evaluates its operands to arguments and then adds the arguments, has an underlying operative that just adds its operands; and so on.

Vau-calculus

In a computational calculus for fexprs, it's a big advantage to represent each applicative explicitly as a wrapper (to indicate the operands are to be evaluated) around another underlying combiner.  That way, the calculus can formally reason about argument evaluation separately from reasoning about the underlying actions.  Vau-calculus works that way.  The whole calculus turns out to have three parts.  There's one part that only represents tree/list structures, and no computation takes place purely within that part.  There's one part that only deals with computations via fexprs.  And then, linking those two, there's the machinery of evaluation, which is where the wrapper-to-induce-operand-evaluation comes in.

Fascinatingly, of these three parts of vau-calculus, the one that deals only with computations involving fexprs is (give or take) lambda-calculus.  One could reasonably claim —without contradicting Mitch Wand's perfectly valid result, but certainly contrasting with it— that the theory of fexprs is lambda-calculus.

(Vau-calculus seems a likely topic for a future blog entry here.  Meanwhile, if you're really feeling ambitious, the place to look is my dissertation.)
[Note:  I've since blogged on vau-calculus here.]
Kernel

What works for a computational calculus also works for a Lisp language:  represent each applicative as a wrapper around an underlying combiner.  The Kernel programming language does this.  An applicative unwrap takes an applicative argument and returns the underlying combiner of that argument; and an applicative wrap takes any combiner at all as an argument, and returns an applicative whose underlying combiner is that argument.
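
A quick sketch of how these behave in a standard environment:

((unwrap +) 1 2)      ; => 3: the underlying operative adds its operands directly
($define! y 1)
(+ y 2)               ; => 3: y is first evaluated to the argument 1
((unwrap +) y 2)      ; error: the operative receives the symbol y itself, not 1
(wrap (unwrap +))     ; an applicative that behaves just like + again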

This makes Kernel a powerful tool for programmers to fluently manipulate the operand-evaluation process, just as the analogous device in vau-calculus allows reasoning about operand-evaluation separately from reasoning about the underlying lambda-calculus computations.

Kernel (evaluator)

Here's the central logic of the Kernel evaluator (coded in Kernel, then put in words).
($define! eval
   ($lambda (expr env)
      ($cond ((symbol? expr)  (lookup expr env))
             ((pair? expr)
                (combine (eval (car expr) env)
                         (cdr expr)
                         env))
             (#t  expr))))

($define! combine
   ($lambda (combiner operands env)
      ($if (operative? combiner)
           (operate combiner operands env)
           (combine (unwrap combiner)
                    (map-eval operands env)
                    env))))
To evaluate an expression in an environment:  If it's a symbol, look it up in the environment.  If it's a pair (the more general structure from which lists are built), evaluate the operator in the environment, and combine the resulting combiner with the operands in the environment.  If it's neither a symbol nor a pair, it evaluates to itself.

To combine a combiner with an operands-object:  If the combiner is operative, cause it to act on the operands-object (and give it the environment, too, since some operatives need that).  If the combiner is applicative, evaluate all the operands in the environment, and recursively call combine with the underlying combiner of the applicative, and the list of arguments.
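
The helpers lookup, operate, and map-eval aren't shown: lookup finds a symbol's value in an environment, operate invokes an operative's action, and map-eval evaluates each element of a list.  In Kernel, map-eval could be sketched as:

($define! map-eval
   ($lambda (operands env)
      (map ($lambda (operand) (eval operand env)) operands)))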

Kernel (fluently doing nothing)

When evaluating syntax read directly from a source file, the default case of evaluation —the self-evaluation case at the end of the description above— is why a literal constant, such as an integer, evaluates to itself.  What makes that case worth emphasizing, though, is that when evaluating computed expressions, it helps keep environments from bleeding into each other (in Lisp terminology, it helps avoid accidental bad hygiene).  Here's a basic example.

Lisp apply overrides the usual rule for calling an applicative, by allowing a single arbitrary computation-result to be used in place of the usual list of arguments.  The first argument to apply is the applicative, and its second argument is the value to be used instead of a list of arguments.  In Kernel, and then in words:
($define! apply
   ($lambda (appv args)
      (eval (cons (unwrap appv) args)
            (make-environment))))
To apply an applicative to an args-object, construct a combination whose operator is the underlying combiner of the applicative, and whose operands-object is the args-object; and then evaluate the constructed combination in a freshly created empty environment.  When the constructed combination is evaluated, its operator evaluates to itself because it's a combiner.  This defaulting operator evaluation doesn't need anything from the environment where the arguments to apply were evaluated, so the constructed combination can be evaluated in an empty environment — and the environment of the call to apply doesn't bleed into the call to the constructed combination.

In a standard Kernel environment, (apply list 2) evaluates to 2.
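
Tracing that example, with list as defined in the Kernel report (essentially (wrap ($vau x #ignore x))):

(apply list 2)
; evaluates (cons (unwrap list) 2) in an empty environment: an improper
; combination whose operator is the underlying operative of list.  The
; operator evaluates to itself, and the operative returns its
; operands-object unchanged, so the result is 2.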

A more impressive illustration is the way $lambda can be defined hygienically in Kernel using more primitive elements of the language.  I should make that a separate post, though.  The earlier parts of this post deliberately didn't assume Lisp-specific knowledge at all, and in the later parts I've tried to ease into Lisp somewhat gently — but $lambda gets into a whole nother level of Lisp sophistication (which is what makes it a worthwhile example), so it just feels logically separate.
[Note: I did later post on Kernel hygiene and $lambda, here.]

14 comments:

  1. Thanks for your lucid explanations. This topic doesn't seem to be very well-covered elsewhere on the internet. I look forward to reading more as your blog develops.

    I've explored expression evaluation from an actor-based perspective. My applicative evaluator is described at http://www.dalnefre.com/wp/2010/08/evaluating-expressions-part-1-core-lambda-calculus/

    I've also built an (unpublished) operative evaluator with actors, and observed that applicatives were a special case of operatives, just as you've written.

  2. Interesting about your operative evaluator. It seems people who mix actors with fexprs tend to reach somewhat similar conclusions; I started with Scheme (which is, of course, historically an exploration of the actors model via Lisp), and I recall an old remark on LtU comparing Kernel to the actor-based language Io.

    The reason fexprs aren't well-covered is that they were on the losing side of a showdown between scientific paradigms in about 1980, and the winning paradigm in such a showdown goes out of its way thereafter to ridicule the loser (the loser becomes "conuclear"). My dissertation sought to make fexprs academically credible again.

  3. I was inspired to revisit the actor-based approach to operative evaluation, based on your Kernel work. I'd be interested to hear what you think.

    http://www.dalnefre.com/wp/2011/11/fexpr-the-ultimate-lambda/

  4. I recently learned that there is the so-called 3-Lisp, which seems quite related to Kernel, except that in Kernel, $vau does not capture the continuation. It seems then Kernel loses some expressiveness, or does it? How do you compare and contrast Kernel with 3-Lisp? Thanks.

  5. 3-Lisp is in the family of reflective Lisps; in fact, it's the ancestor of that entire family. Note that Wand wrote "The Theory of Fexprs is Trivial" coming from the reflective-Lisp tradition; the trivialization of theory he's looking at is really a consequence of reflection, rather than of fexprs, hence its dependence on the degree of reflection rather than the "degree of fexprs" — whatever that could mean: you either have fexprs or you don't, but reflection can be stronger or weaker, which is (in one view of the matter) why the trivialization is a matter of degree rather than all-or-nothing.

    In terms of technical expressiveness, bundling continuation-capture into the basic procedural mechanism would be no more powerful than providing continuation-capture by means of a separate device, which is what Kernel does. The two facilities are logically orthogonal to each other, which favors keeping them separate. Related to the orthogonality, I think using a separate feature has advantages for accident-avoidance, which is a major theme in the Kernel design.

    Philosophically, Kernel doesn't aspire to be reflective. Kernel combiners actively seek to be non-reflective, a point closely related to explicit evaluation versus implicit evaluation; perhaps there's yet another topic for a blog post there (as if I needed more of them; I've got a great pile of posts in various stages of incompletion). One could plausibly claim the continuation module is reflective.

    Replies
    1. Thanks for the answer. Now, according to your explanation, the real issue is the degree of reflection, by which, if I understand correctly, you mean that reflection can be stronger or weaker. Could you be a little clearer about which kind of reflection is strong and which is weak? How does the degree of reflection change from one end to the other? Also, from your description, it seems that one end of the scale (the strong end?) definitely causes trivialization, while the problem becomes less serious toward the other end.

      By the way, I have tried several times to write to you by email, asking more general questions regarding your dissertation, but have not received any answer. I wonder whether the email address you put on your home page is still in use.

  6. Degree of "reflection" is a somewhat careless phrasing on my part; when I'm being careful, I maintain that Kernel is entirely non-reflective.  Yet there's a degree of something involved, though "reflection" isn't quite the right word for it.  This is one of the various topics I mean to blog about, but out of those topics it's not the first or even second highest priority atm.  The trivialization occurs because two terms T1, T2 cannot be considered contextually equivalent ("equal") to each other if there is any context C[ ] that could surround them that would distinguish them from each other; that is, C such that C[T1] is observably different from C[T2].  With fexprs, C could be a context in which T1 and T2 take an operand position in a call to a fexpr.  So if a fexpr can analyze its operand so as to tell which of T1, T2 the operand is, they can't be "equal".  The more terms can be distinguished, the more nearly trivial the theory becomes.  Wand's paper considered a calculus in which every possible term could be observably distinguished from every other possible term, so of course the equational theory of that calculus was completely trivial.  In my vau calculi, every source code expression can be distinguished from every other source code expression, but as soon as you start doing computations, the intermediate states of a computation cannot be distinguished from each other.  Reflection comes into it because reflection is one device that distinguishes things that otherwise wouldn't be distinguishable.  (What is there, besides reflection, that does this?  Well, for one thing, fexprs. :-)
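
    In Kernel terms, a sketch of such a distinguishing context, using a made-up operative $source:

    ($define! $source ($vau (x) #ignore x))
    ($source (+ 1 2))   ; => the list (+ 1 2)
    ($source 3)         ; => 3

    Both (+ 1 2) and 3 evaluate to 3, yet the context ($source [ ]) observably tells them apart; so once fexprs are in the language, those two terms aren't contextually equivalent.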

    The email address I believe you're asking about is deprecated for the long run, but that's not why it wouldn't have been getting a response.  I hope to have effective access to it again soon, after (for various reasons) not accessing it for several months.

  7. Now it is clearer. Thanks for the explanation. By the way, are you aware of the Google Group (https://groups.google.com/d/forum/klisp) around klisp (http://klisp.org/, which aims to be a practical, faithful implementation of Kernel), Kernel, and $vau? You might consider joining the discussion. It would be nice to have Kernel's inventor involved in the growing community. :)

  8. This comment has been removed by a blog administrator.

  9. An idea!! -- What if you could vary the amount of syntax you want to see? That is, what if you could express "interest" at the call site of a fexpr, in order to determine which components of the call are evaluated and which are presented as syntax. For instance, given an expression "1 + 2 * (3 + 1)", and a call site expressing interest only in multiplication, it would see that expression as "(\x -> 1 + x) (2 * 4)". That leads to the property that a caller can only distinguish between operators that are visible in its translation unit; if they are not, it cannot express interest.

  10. OK, I give up. What is the character/font you're using for "$vau-calculus", as seen in your thesis? It looks like a mirrored lower-case f, which I can't find in Unicode. Just want to know how to spell it correctly.

    Replies
    1. It's a transformation of lower-case f in LaTeX using PSTricks; the transformation and its markup are at the end of Appendix C of the dissertation.

  11. Could you clarify how lazy vs. eager evaluation would work? For example, `(if (test) (fun a) (fun b))` -- clearly, the call to `test` must be evaluated immediately, to find out if it's true or not. If it is false, then we do NOT want to evaluate either `a` or `(fun a)`, as this would result in, at best, a waste of CPU resources, and, at worst, an infinite loop. So I like the ideas you are describing here, but when I thought about using these ideas in the system I work in, I am faced with the above dilemma.

    Replies
    1. ... and to be clear, the above is an example. The actual structure might be `(backwards-if (fun a) (phun b) (test))` or `(infix-if (fun a) (test) (fun b))` and so there is no particular way to know which of the operands need to be evaluated immediately, and which are to be deferred. I might not even know which ones are deferred until some of the other arguments have been evaluated. I don't see any particularly simple solution: it seems like I have to tell the programmer to continue doing what they've always done: poke around in the operands (which are trees), maybe evaluate something deep inside one of the trees, look at the resulting new tree, and then maybe do something, or not, in one of the other trees, and then maybe return or maybe tail-recurse.
