each extensible language is surrounded by an envelope of possible extensions reachable by modest amounts of labor by unsophisticated users.
— Thomas A. Standish, "Extensibility in Programming Language Design", *SIGPLAN Notices* 10 no. 7 (July 1975) [Special Issue on Programming Language Design], p. 20.

I said in an earlier post I should blog about abstractive power "eventually". This is it.

This material is very much a work in progress. Throughout this post I'll emphasize insight and intuition — but while the post starts out non-technical, it will get more mathematical by increments as it goes along. The relation between the math and the insights works both ways: insights into abstraction guide the mathematical development, and the mathematical development is pursued partly in hopes of eventual further insights into abstraction. I also hope, by presenting the mathematical development in an intuitive form (as opposed to the much drier form in my 2008 techreport) to get insights *into* the mathematical development. The post ends, as does the current state of the work, with an elementary test case to show feasibility.

Contents

The idea
The goal
There is no semantics
The second derivative of semantics
Recasting expressiveness
Abstractiveness
Test case

The idea

The *extensible languages* movement peaked around 1970, and was on its way out when Standish wrote the above. Extensibility enthusiasts had hoped, frankly, that by means of language-extension mechanisms it would become possible for everyone to use a single base language and transform it into anything anyone needed for any particular purpose. Standish was noting that the extension mechanisms primarily used by the movement — macro preprocessors and perhaps the ability to add new syntax rules — had a limited range, after which it became quite difficult to extend the language further.

Macro preprocessing, in particular, cannot easily be used to build a series of extensions, one on top of another, because as extension follows extension, the programmer is rapidly overcome by accumulating complexity. In order to use a macro, you have to be able to see whatever underlying facility the macro uses. Thus, to add a second layer of macros to a base language, you have to understand and account for the base language *and all of its first layer of macros*; to add a third layer of macros, you have to understand the base language, the first layer of macros, and the second layer of macros; and so on. The visibility of the underlying layers also limits how *different* the extended language can be from the base language.

The extensibility movement was supplanted by the *abstraction* movement, which had a more semantic focus, and came to be dominated — at least for a while — by the Object-Oriented Paradigm. Something of the spirit of the new movement is visible in this remark on lexical scoping from Steele and Sussman's 1978 *The Art of the Interpreter; or, The Modularity Complex* (p. 24):

What is interesting about this is that we can write procedures which *construct* other procedures. This is not to be confused with the ability to construct S-expression *representations* of procedures; that ability is shared by all of the interpreters we have examined. The ability to construct procedures was not available in the dynamically scoped interpreter. In solving the violation of referential transparency we seem to have stumbled across a source of additional abstractive power.

Abstraction helps with the problem of accumulating complexity, because you can — at least, ideally — use an extension without having to worry about all the details of what underlies it. There is still some accumulated complexity, though. I noted this in my earlier blog post on types:

In mathematics, there may be several different views of things any one of which could be used as a foundation from which to build the others. That's essentially perfect abstraction, in that from any one of these levels, you not only get to ignore what's under the hood, but you can't even tell whether there is anything under the hood. Going from one level to the next leaves no residue of unhidden details: you could build B from A, C from B, and A from C, and you've really gotten back to A, not some flawed approximation of it that's either more complicated than the original, more brittle than the original, or both.

The central point of that blog post is that typing, which is evidently meant to help us manage complexity, can easily become itself a source of complexity. The same danger applies to other tools we use to manage complexity; the tools become effectively part of the language, and are thus added complexity and subject to accumulation of further complexity.

Another four decades of experience (since Standish's post-mortem on the extensible languages movement) suggests that all programming languages have their own envelopes of reachable extensions. However, some languages have much bigger, or smaller, envelopes than others. What factors determine the size, and shape, of the envelope? What, in particular, can a programming language designer do to maximize this envelope, and what are the implications of doing so?

As a useful metaphor, I call the breadth of a language's envelope its *radius of abstraction*. Why "abstraction"? Well, consider how the languages in this envelope are reached. Starting from the base language, you incrementally modify the language *by using facilities provided within the language*. That is, the new (extended) language is drawn out from the old (base) language, in which the new language had been latently present. (Latin *abs-*, "out", and *trahere*, "pull/draw") The terminology of *layers of abstraction*, in programming, goes back to the 1970s. One also finds this use of *abstraction* in philosophy, for drawing out something latently present; here's a passage from Locke's 1689 *An Essay Concerning Human Understanding* (you may recognize this, as it's quoted at the front of the Wizard Book):

The acts of the mind, wherein it exerts its power over its simple ideas, are chiefly these three: (1) Combining several simple ideas into one compound one; and thus all complex ideas are made. (2) The second is bringing two ideas, whether simple or complex, together, and setting them by one another, so as to take a view of them at once, without uniting them into one; by which way it gets all its ideas of relations. (3) The third is separating them from all other ideas that accompany them in their real existence: this is called abstraction: and thus all its general ideas are made.

Our programming-language use of the term *abstraction* does take a bit of getting used to, because we usually expect something *abstracted* to be *smaller* than what it was abstracted from. The abstract of a paper is a short summary of it. An abstract thought has left behind the details of concrete instances — though it seems the abstract thought may be somehow "bigger" than the more concrete thoughts from which it was drawn. In our case, the extended language is probably equi-powerful with the base language, and therefore, even if some specific implementation details are hidden during extension, the two languages still feel as if they're the same size. This is not really strange; recall Cantor's definition of an infinite set — a set whose elements can be put in one-to-one correspondence with those of a proper subset of itself. Since we rarely work with finite languages, it shouldn't surprise us if we abstract from one language another language just as "big".

Ironically, though, the reason we're discussing this at all is that, despite our best efforts, the extended language *is* smaller than the base in the sense that its "envelope" of reachable extensions is smaller. We'd really rather it *weren't* smaller.

What general principles govern radius of abstraction? My first candidate is *smoothness*, a term I borrowed from M. D. McIlroy, one of the founders of the extensible languages movement. I mean by it the property of a language that its abstractive facilities apply to the language in a free and uniform way. This concept is also close kin to Strachey's *first-class objects*, and van Wijngaarden's *orthogonality*. I proposed the following principle in my dissertation:

(Smoothness Conjecture) Every roughness (violation of smoothness) in a language design ultimately bounds its radius of abstraction.

When a base language contains a defect of smoothness, I suggest, successive extensions magnify the defect, creating unbounded complexity that drags down the programmer.

The goal

The Smoothness Conjecture is a neat expression of a design priority shared by a number of programming language designers; but, looking at programming language designs over the decades, clearly many designers either don't share the priority, or don't agree on how to pursue it.

What, though, if we could develop a mathematical framework for studying the abstractive power of programming languages — a *theory of abstraction*? One might then have an objective basis for discussing design principles such as the Smoothness Conjecture, which to date have always been largely a matter of taste. I wouldn't expect the Smoothness Conjecture itself to be subject to formalization, let alone proof, at least not until the study of the subject reached quite a mature phase; but the Conjecture may inspire any number of more specific claims that could then be weighed objectively.

This, in my humble opinion, would be very cool and, as an added bonus, immensely useful.

It is not, however, a short-term goal. For a ballpark estimate, say people have been pursuing abstractive power (under whatever name) since the founding of the extensible languages movement, circa 1960. When I got into the game, it had already been going on for about three decades. The more extreme OOP advocates were making claims for it that could have been lifted nearly verbatim from the more extreme extensibility advocates of two decades earlier, and by my assessment then (understandably unpopular with the OOP advocates) there was still more we didn't know than that we did. I didn't expect to tie it all up quickly; but I'm still *excited* about the prospects, because every few years my thinking on it has moved (forward, I hope) slightly — and by my estimate of the difficulty, any progress at all is a very encouraging sign.

Shortly after I started thinking on abstractive power, Matthias Felleisen's classic paper "On the Expressive Power of Programming Languages" was published (in the proceedings of ESOP '90), and I was encouraged by this evidence that I wasn't the only person in the world crazy enough to try to mathematize traditionally informal aspects of language design. Felleisen's treatment has been quite a successful meme in the years since, and has some features of interest for abstraction theory — both features that apply to abstractive power, and features that offer insight because of why they *don't* apply to abstractive power.

Felleisen's expressiveness works roughly thus: A programming language is a set of programs together with a partial mapping from programs to semantic values. Language *A* can express language *B* if there is a simple way to rewrite *B*-programs as *A*-programs that preserves the overall pattern of semantics of programs — not only the semantic values of valid programs, but also which programs are valid (halt) and which are invalid (don't halt). (In this case, the overall pattern of behavior is sufficiently captured by the pattern of halting/not-halting, so one might as well say there is just a single semantic value, or technically replace the "partial mapping from programs to semantic values" with a "subset of programs designated as halting".)

That is, *A* can express *B* when there exists a decidable function φ mapping each *B*-program *p* to an *A*-program φ(*p*) such that φ(*p*) halts iff *p* halts. *A* can *weakly* express *B* when φ(*p*) halts if *p* halts (but not necessarily *only* if *p* halts).

How *readily* *A* can express *B* depends on how disruptively φ is allowed to rearrange the internal structure of *p*. Felleisen's paper particularly focuses on the class of "macro", a.k.a. "polynomial", transformations φ, which correspond to Landin's notion of *syntactic sugar*. Each syntactic operator σ in language *B* is replaced by a polynomial (a "macro", or "template") σ_φ in language *A*; thus,

φ(σ(e_1, ... e_n)) = σ_φ(φ(e_1), ... φ(e_n))

When φ is of this form, one says *A* can macro-express *B*.
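As a concrete illustration of the template scheme, here is a minimal sketch in Python. It is not from the techreport; the operators and the let-desugaring template are invented for the example, with terms modeled as nested tuples.

```python
# A "macro"/polynomial transformation: each operator of language B is
# replaced by a fixed template, applied recursively to the already-
# transformed subterms.  Terms are (operator, subterm, ...) tuples.

def macro_transform(templates, term):
    if not isinstance(term, tuple):      # atoms (names) pass through unchanged
        return term
    op, *args = term
    transformed = [macro_transform(templates, a) for a in args]
    return templates[op](*transformed)

# Invented example: desugar "let x = e1 in e2" into an applied lambda.
templates = {
    "let":    lambda x, e1, e2: ("apply", ("lambda", x, e2), e1),
    "var":    lambda name: ("var", name),
    "lambda": lambda x, body: ("lambda", x, body),
    "apply":  lambda f, a: ("apply", f, a),
}

src = ("let", "x", ("var", "one"), ("var", "x"))
assert macro_transform(templates, src) == \
    ("apply", ("lambda", "x", ("var", "x")), ("var", "one"))
```

The point of the recursion is exactly the displayed equation: the template for σ is applied to the images of the subterms, never to the subterms themselves.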

I described the criterion for *A can express B* as preserving the "overall pattern of semantics of programs". I meant to suggest that this is more than each individual mapping *p* ↦ φ(*p*) preserving semantics; it involves preserving, across mapping φ, how the shape of program texts affects their semantics. This preservation of shape is more apparent when considering macro-expressiveness, which demands similarities of program shape between *p* and φ(*p*), because this implies that the similarities and differences between φ(*p_1*) and φ(*p_2*) would be akin to the similarities and differences between *p_1* and *p_2*; but it's not clear that polynomial/macro rewriting would be the only useful measure of similarity of program shape. (Cf. Steele and Sussman's 1976 *Lambda: The Ultimate Imperative*.) To explore more general aspects of expressiveness, one might parameterize the theory by what class of transformations is allowed.

In preparing for an analogous treatment of abstractiveness, the first thing to recognize is that while expressiveness views each program as inducing a semantic value, abstractiveness views each program as inducing a *programming language*. When comparing two *B*-programs *p_1* and *p_2*, we don't just ask whether they induce the same programming language, because they almost certainly do not. Rather, we want to compare the induced programming languages to each other, probably using some measurement at least as sophisticated as expressiveness.
There is no semantics

Consider what this means for the definition of *programming language*. Picture a base language as the center of a web of languages connected by directed arrows — abstractions — each arrow labeled by a program text. The whole thing is a sort of state machine, where the states are languages, the state transitions are abstractions, and the input "alphabet" is the set of program texts. We could also integrate semantics into this model, by adding transitions labeled with reserved "observable" terms — and then there isn't really any need for the states of the machine at all. Everything we could ever want to know about a given programming language is contained in the set of all possible sequences of labels on paths starting from that language; so we might as well define a language to *be* that set of sequences. That is,

(D1) A *programming language* over set of terms T is a set of sequences of terms P ⊆ T* such that for all sequences of terms x and y, if xy ∈ P then x ∈ P.

This approach also appeals to the recognition that although computation theory tends to look only at computations that *halt*, a great many of our software processes are open-ended.
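To make (D1) concrete, here is a minimal Python sketch of a finite-set model: a language is a set of term sequences (tuples of strings), and the definition demands prefix closure. The term names are invented for illustration.

```python
# Toy model of (D1): a "language" is a prefix-closed set of term sequences.

def is_prefix_closed(P):
    """True iff every prefix of every sequence in P is also in P."""
    return all(x[:k] in P for x in P for k in range(len(x)))

# A tiny finite language: declare a field, query it, receive the report.
L = {(), ("decl",), ("decl", "query"), ("decl", "query", "report")}

assert is_prefix_closed(L)
assert not is_prefix_closed({("decl", "query")})  # missing prefixes () and ("decl",)
```

Languages in the sense of (D1) are of course typically infinite; finite fragments like this are only for checking the definitions mechanically.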

This purely syntactic, unbounded view of programming languages is foundational. The expectation of halting — what one might call the *terminal-semantic assumption* — is ubiquitous: the assumption, hardwired into one's core definitions, that a computation is meant to get an answer and stop. Denotational semantics is a terminal-semantic model. Theory of computation, and complexity theory, are founded on the terminal-semantic assumption.

To my mind, an essential difficulty with the terminal-semantic approach is that, patently, it prefers to disregard properties that relate to unbounded sequences of future developments. Abstractive power is directly concerned with such sequences, but one suspects all computation really should take them into account, as most macroscopic software processes are interactive (in one or another sense) and open-ended rather than self-contained and merely producing a final result. (Cf. Dina Goldin et al.)

The unbounded-syntax approach does not, of course, really "eliminate" semantics; but it does cause semantics to become a dependent concept, grounded in syntax. For abstraction theory as I'm currently developing it, semantics is a set of sequences of terms; in RAGs (from my master's thesis), semantics is a nonterminal symbol of a grammar. (In a modern treatment of RAGs I'd be inclined to replace the term "metasyntax" with "co-semantics"; but I digress... sort-of.)

*Note: I've described RAGs in a later blog post, here.*

The second derivative of semantics

The expressiveness of a programming language — what we ask φ to conserve when mapping *B* to *A* — is about the contours of the overall pattern of semantics of programs. That is, it's about how variations in the text of a program change the semantics induced by the program; in short, expressiveness is the *first derivative of semantics*.

The abstractiveness of a programming language — which we will want conserved when we assert that one language is "as abstractively powerful" as another — is about how variations in the text of a program change the expressiveness of the language induced by the program. Thus, as expressiveness looks at variations in semantics, and is in this sense the first derivative of semantics, abstractiveness looks at variations in expressiveness, and is thus the second derivative of semantics.

As I've set out to mathematize this, I've found the treatment becomes rather off-puttingly elaborate (a trend I mean to minimize in this post). I observe a lesser degree of the same effect even in Felleisen's paper, which was apparently trying to stick to just a few simple ideas and yet somehow got progressively harder to keep track of. Some of this sort of thing is a natural result of breaking new ground: appropriate simplifications may be recognized later. I omitted one substantial complication from my description of Felleisen's treatment, which I suspect was simply a consequence of techniques he'd inherited from other purposes. However, I've also introduced one new complication into expressiveness, namely parameterization by the class of transformations allowed — and I'm about to introduce a second complication, as I adapt Felleisen's treatment to my unbounded-syntax strategy.

Recasting expressiveness

In adapting expressiveness to the unbounded-syntax definition of programming language (D1), the first consideration is that a mapping φ between two languages of this sort has to respect the all-important prefix structure of the languages:

(D2) For programming languages P and Q, a *morphism* from P to Q is a function φ : P → Q that preserves prefixes.

That is, for every xy ∈ P, there exists z such that φ(xy) = φ(x)z. We write P/x for the language reached from language P by text sequence x ∈ P; and similarly, when φ : P → Q, we write φ/x for the corresponding morphism from P/x to Q/φ(x). Thus, P/x = { y | xy ∈ P } and φ(xy) = φ(x) (φ/x)(y).
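In the finite-set model sketched earlier, the derived-language notation is a one-liner; this is illustrative only, not the techreport's formalism.

```python
# P/x = { y | xy in P }: the language reached from P by text sequence x.
# Languages are modeled as finite prefix-closed sets of tuples.

def derive(P, x):
    n = len(x)
    return {y[n:] for y in P if y[:n] == x}

L = {(), ("a",), ("a", "b"), ("a", "c")}
assert derive(L, ("a",)) == {(), ("b",), ("c",)}
assert derive(L, ()) == L          # P/() is P itself
```

Note that P/x is automatically prefix-closed whenever P is, so deriving always yields another language in the sense of (D1).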

As remarked earlier, we parameterize expressiveness relationships by the class of allowable morphisms. We won't allow arbitrary classes of morphisms, though.

(D3) A *category* (over programming languages with terms T) is a set C of morphisms between languages over T, closed under composition and including the identity morphism of each language over T.

For given terms T, some useful categories:

Any = category of all morphisms.
Map = category of morphisms that perform a term transformation uniformly, φ(t_1 ... t_n) = φ(t_1) ... φ(t_n).
Macro = category of map morphisms whose term transformation is macro/polynomial.
Inc = category of inclusion morphisms, φ : P → Q with φ(x) = x.
Id = category of identity morphisms, φ : P → P with φ(x) = x.

Where Felleisen could avoid complicated semantic values by relying on halting behavior, we need a set of observable terms, O ⊆ T.

(D4) Morphism φ : P → Q *respects observables* O if

- φ maps observables, and nothing else, into those same observables — that is, (φ/x)(y) = o iff y = o — and
- φ transforms each derived language into a language with exactly the same observables — that is, o ∈ (P/x) iff o ∈ (Q/φ(x)).

Morphism φ : P → Q *weakly respects observables* O if it satisfies all the conditions for respecting observables O except that o ∈ (P/x) implies o ∈ (Q/φ(x)) (rather than implication going both ways). This gives two more categories:

Obs_O = category of all morphisms that respect observables O.
WObs_O = category of all morphisms that weakly respect observables O.

Our basic expressiveness relation is then

(D5) For category C and languages A and B, A can *C-express* B for observables O if there exists φ : B → A with φ ∈ C ∩ Obs_O.

We say A is as *C,O-expressive* as B. Weak C,O-expressiveness uses WObs_O in place of Obs_O. Evidently, the *as C,O-expressive as* relation is transitive, as is its weak variant. Expressiveness implies weak expressiveness (because Obs_O ⊆ WObs_O).
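The two conditions of (D4) can be checked brute-force over finite languages. In this sketch the morphism is supplied explicitly as a dictionary, "o ∈ P/x" is read as the one-term sequence (o,) belonging to P/x, and all the concrete terms are invented for illustration.

```python
# Brute-force check of (D4) over finite prefix-closed languages.

def derive(P, x):
    """The derived language P/x = { y | xy in P }."""
    return {y[len(x):] for y in P if y[:len(x)] == x}

def respects_observables(P, Q, phi, O):
    single_obs = {(o,) for o in O}
    for x in P:
        fx = phi[x]
        for y in derive(P, x):
            img = phi[x + y][len(fx):]                    # (phi/x)(y)
            # Condition 1: (phi/x)(y) is a lone observable iff y is that observable.
            if (img in single_obs or y in single_obs) and img != y:
                return False
        # Condition 2: P/x and Q/phi(x) offer exactly the same observables.
        if {o for o in O if (o,) in derive(P, x)} != \
           {o for o in O if (o,) in derive(Q, fx)}:
            return False
    return True

O = {"ok"}
P = {(), ("a",), ("a", "ok")}
Q = {(), ("A",), ("A", "ok")}
phi = {(): (), ("a",): ("A",), ("a", "ok"): ("A", "ok")}
assert respects_observables(P, Q, phi, O)

# Renaming the observable on the Q side breaks both conditions.
Q_bad = {(), ("A",), ("A", "no")}
phi_bad = {(): (), ("a",): ("A",), ("a", "ok"): ("A", "no")}
assert not respects_observables(P, Q_bad, phi_bad, O)
```

Dropping condition 2's reverse implication would give the corresponding check for weakly respecting observables.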

There is a simple theorem that the expressiveness relation is preserved if the category is made bigger, and if the set of observables is made smaller. That is, if A is as C_1,O_1-expressive as B, category C_1 ⊆ C_2, and observables O_2 ⊆ O_1, then A is as C_2,O_2-expressive as B.

*Note: Although the notation here is simplified from the techreport, it still poorly handles the codomain of a derived morphism, producing such un-self-explanatory expressions as Q/φ(x). Belatedly I see that by adopting for φ : P → Q the additional notation φ(P) = Q, one would have the more mnemonic φ/x : P/x → φ(P)/φ(x).*

Abstractiveness

As expressiveness depends on φ : *B* → *A* preserving the semantic landscape of *B*, abstractiveness should depend on φ preserving the expressive landscape of *B*. A semantic landscape for *B* arose from specifying a set of observables. An expressive landscape for *B* arises from specifying a category of expressiveness morphisms by which to compare different languages *B*/*x*.

If it wasn't entirely certain what category to use when comparing expressiveness (hence the introduction of parameter *C* into the expressiveness relation, where Felleisen's treatment only looked at two choices of *C*), the choice of expressiveness category becomes much more obscure when considering abstractiveness. Note in Standish's remark the phrase *by modest amounts of labor*; dmbarbour has remarked that multiple degrees of difficulty are of interest, and this suggests that no one choice of category will give us a full picture. One might even imagine that a φ : *B* → *A* of one category would map expressive relations on *B* of a second category onto expressive relations on *A* of yet a third category. The point is that, without prior experience with this mathematics, we have no clear notion which of the myriad imaginable relations we want to look at. We hope that experience working with it may give us insight into what special case we really want, but meanwhile the endeavor seems to require a lot of parameters. It would look rather silly to have an abstractiveness relation with, say, four parameters, so we redistribute the parameters by bundling the choice of expressive category with the language.

(D6) A *programming language with expressive structure* over terms T is a pair A = 〈S,C〉 of a programming language S over T with a category C over programming languages with terms T.

We call these *expressive languages* for short. We may write seq(A) and cat(A) for the components of expressive language A.

For φ to preserve the expressive landscape from *B* to *A* is more involved than preserving the semantic landscape. The expressive landscape of *B*=〈*S*,*C*〉 is a web of relations between languages derived from *S*. Suppose *g*∈*C* with *g* : *S*/*x* → *S*/*y*. Through φ : 〈*S*,*C*〉 → 〈*R*,*D*〉, derived languages *S*/*x* and *S*/*y* correspond to *R*/φ(*x*) and *R*/φ(*y*). Moreover, we already have *three* morphisms between these four derived languages:

g : S/x → S/y
φ/x : S/x → R/φ(x)
φ/y : S/y → R/φ(y)

If only we had a fourth morphism h ∈ D with h : R/φ(x) → R/φ(y), we could ask these four morphisms to commute, h ∘ φ/x = φ/y ∘ g.

(D7) For expressive languages P and Q, a *morphism* from P to Q is a morphism φ : seq(P) → seq(Q) such that

- for all g ∈ cat(P) with g : seq(P)/x → seq(P)/y, there exists h ∈ cat(Q) with h ∘ φ/x = φ/y ∘ g, and
- for all x,y ∈ seq(P), if there is no g ∈ cat(P) with g : seq(P)/x → seq(P)/y, then there is no h ∈ cat(Q) with h : seq(Q)/φ(x) → seq(Q)/φ(y).

*Note: I've corrected a typo in this definition: the second condition read "for all x,y ∈ seq(Q)" rather than "seq(P)". (24 April 2014)*

For category C and expressive languages A and B, A can *C-express* B for observables O if there exists φ : B → A with φ ∈ C ∩ Obs_O. We say A is as *C,O-abstractive* as B; and weak C,O-abstractiveness uses WObs_O in place of Obs_O. The *as C,O-abstractive as* relation (and its weak variant) is transitive, and is preserved by making C bigger or O smaller. Abstractiveness implies weak abstractiveness (because, again, Obs_O ⊆ WObs_O). The techreport proves several other rather generic theorems; for example, if A is as C,O-abstractive as B, and cat(A) respects observables O, then cat(B) respects observables O.

When the expressive languages get their expressive structure from the identity category, *C*,*O*-abstractiveness devolves to *C*,*O*-expressiveness. That is, for all languages *A* and *B*, category *C*, and observables *O*, 〈*A*,**Id**〉 is as *C*,*O*-abstractive as 〈*B*,**Id**〉 iff *A* is as *C*,*O*-expressive as *B*.

Test case

An obvious question at this point is, now that we've got this thing put together, does it work? In the techreport, my sanity check was a test suite of toy languages, chosen to minimally capture the difference between a language in which local declarations are globally visible, and one in which local declarations can be private. The wish list at the end of the techreport, of results to try for, was much more impressive — I targeted encapsulation of procedures; hygienic macros; fexprs; and strong typing — but the sanity check, as the only example thus far actually worked out in full detail, has some facets of interest.

In language L_0, records are declared with constant fields, and the fields can then be queried. When a query occurs in a text sequence, the next text in the sequence reports the value contained in the queried field; these reports are the observable terms. Language L_priv is identical except that individual fields may be declared "private"; private fields cannot be queried, though they can be used when specifying the value of another field in the same record.
It's easy to transform an L_priv text sequence into a valid L_0 sequence: just remove all the "private" keywords from the field declarations. This is a macro/polynomial transformation, and all the queries that worked in L_priv still work and give the same results in L_0. Therefore, L_0 is weakly as **Macro**-expressive as L_priv.
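A finite snapshot of this transformation can be sketched as follows. Only a one-record fragment of each language is modeled, and the concrete declaration/query/report strings are invented; the point is just that stripping "private" maps every L_priv sequence onto an L_0 sequence with the same reports.

```python
# Finite fragment of the L_priv -> L_0 test case: a public field f and an
# optionally-private field g.  strip_private models the remove-"private"
# transformation, term by term.

def strip_private(seq):
    return tuple(t.replace("private ", "") for t in seq)

L_priv = {
    (), ("decl f=1",), ("decl f=1", "query f"),
    ("decl f=1", "query f", "report 1"),
    ("decl private g=2",),           # g is private: no query of g allowed
}
L_0 = {
    (), ("decl f=1",), ("decl f=1", "query f"),
    ("decl f=1", "query f", "report 1"),
    ("decl g=2",), ("decl g=2", "query g"),
    ("decl g=2", "query g", "report 2"),
}

# Every L_priv sequence lands in L_0, with all its reports intact: the
# "weak" direction of Macro-expressiveness.
for x in L_priv:
    assert strip_private(x) in L_0
```

The converse question, whether the extra sequences of L_0 (such as queries on g) disturb *strong* expressiveness, is exactly what the next paragraphs examine.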
Whether or not L_0 is as **Macro**-expressive as L_priv (without the "weakly" qualifier) depends on just how one sets up the languages. Each query is a separate text in the sequence, and the only text that can follow a query is a report of the result from the query. Expressiveness hinges on whether or not it is permissible to query a field that is not visible; and the reason this matters is that although strong expressiveness requires each derived language B/x map into a derived language A/φ(x) that has no extra observable terms, it does not require that there be no extra *non*-observable terms. To see how this works, suppose u ∈ L_priv is a sequence of declarations, after which query q_1 is valid but q_2 attempts to query a private field. Following u, the result of q_1 is r_1, and the result of q_2 would be r_2 if the field were made public. Let v be the result of removing all "private" keywords from u.

In the techreport, a query is not permitted unless its target is visible. Therefore, L_priv sequences include uq_1 and uq_1r_1, but not uq_2. L_0 sequences include vq_1, vq_1r_1, vq_2, and vq_2r_2. But expressiveness does not object to vq_2r_2 ∈ L_0 unless there is some x ∈ L_priv such that φ(x) = vq_2 but xr_2 ∉ L_priv. Since there is no such x, the observable r_2 in vq_2r_2 causes no difficulty. Although it is true that vq_2 ∈ L_0 even though uq_2 ∉ L_priv, this does not interfere with expressiveness because q_2 is not an observable. The upshot is that in the techreport, L_0 is as **Macro**-expressive as L_priv.
Alternatively, one could allow queries regardless of whether they're valid; this is essentially a "dynamic typing" approach, where errors are detected lazily. In that case, uq_2 ∈ L_priv; the allowance of the observable r_2 in L_0 is then a violation of strong expressiveness, and L_0 is **not** as **Macro**-expressive as L_priv (although it's still weakly as **Macro**-expressive).
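The lazy-error variant can be seen in the same finite model. Here the query of the private field is admitted into L_priv but has no report, while its image in L_0 does offer one; the concrete strings are invented, standing in for u, v, q_2, and r_2 from the text.

```python
# "Dynamic typing" variant: uq2 is in L_priv but offers no continuation,
# while its image vq2 in L_0 offers the observable report r2.

def offered(P, x):
    """Terms that can immediately follow sequence x in P."""
    return {y[len(x)] for y in P if len(y) > len(x) and y[:len(x)] == x}

u, v = ("decl private g=2",), ("decl g=2",)
L_priv = {(), u, u + ("query g",)}            # query admitted, but no report
L_0    = {(), v, v + ("query g",), v + ("query g", "report 2")}

# After u+query in L_priv, nothing observable follows; after its image in
# L_0, the report is offered.  So condition 2 of (D4) fails: strong
# Macro-expressiveness is lost, while the weak direction survives.
assert offered(L_priv, u + ("query g",)) == set()
assert offered(L_0, v + ("query g",)) == {"report 2"}
```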

What about **Macro**-abstractiveness? Given the relative triviality of the languages, I chose to use weak category **Inc** for the expressive structure of the languages: 〈L_0, **Inc**〉 and 〈L_priv, **Inc**〉. L_0 ⊆ L_priv, so using the inclusion morphism φ(x) = x from L_0 to L_priv, evidently 〈L_priv, **Inc**〉 is as **Inc**-abstractive as 〈L_0, **Inc**〉; and consequently, as a weaker result, 〈L_priv, **Inc**〉 is as **Macro**-abstractive as 〈L_0, **Inc**〉.

As for the other direction, we've already observed that L_0 is as **Macro**-expressive as L_priv, which is to say that 〈L_0, **Id**〉 is as **Macro**-abstractive as 〈L_priv, **Id**〉. So if 〈L_0, **Inc**〉 isn't as **Macro**-abstractive as 〈L_priv, **Inc**〉, the difference has to be in the additional expressive structure. This expressive structure gives us a g : L/x → L/y when, and only when, L/x ⊆ L/y. Therefore, in order to establish (contrary to our intention) that 〈L_0, **Inc**〉 is as **Macro**-abstractive as 〈L_priv, **Inc**〉, we would have to demonstrate a **Macro** ∩ **Obs**_O morphism φ : L_priv → L_0 that preserves all of the subsetting relationships between derived languages in L_priv.
For example, the transformation remove-all-"private"-keywords, which gave us **Macro**-expressiveness, will not work for **Macro**-abstractiveness, because it washes out subsetting relationships. Suppose u is a sequence of declarations in L_priv that contains some private fields, and v is the result of removing the "private" keywords from u. Since v can be followed by some things that u cannot, L_priv/u is a proper subset of L_priv/v. However, the remove-"private" transformation maps both of these languages into L_0/v, which evidently isn't a proper subset of itself; so remove-"private" doesn't preserve the expressive structure.
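The washout can be exhibited in the finite model too; u and v below play the roles they play in the text, and the concrete strings are again invented.

```python
# Why strip-private washes out subset structure: after the private
# declaration u, fewer continuations are possible than after its public
# counterpart v, yet both map onto the same derived L_0 language.

def derive(P, x):
    return {y[len(x):] for y in P if y[:len(x)] == x}

u = ("decl private g=2",)
v = ("decl g=2",)

L_priv = {(), u, v, v + ("query g",), v + ("query g", "report 2")}
L_0    = {(), v, v + ("query g",), v + ("query g", "report 2")}

after_u = derive(L_priv, u)      # only the empty continuation
after_v = derive(L_priv, v)      # queries on g are allowed
assert after_u < after_v         # proper subset: privacy restricts the future

# Strip-private sends both u and v to v, collapsing the two distinct
# derived languages onto the single language L_0/v.
assert derive(L_0, v) == after_v
```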

In fact, no **Macro** morphism will preserve the expressive structure. Consider a particular private field declaration, call it d. When d occurs within a record declaration in an L_priv sequence, it can affect later expansions of the sequence, through its use in defining other, non-private fields of the same record, changing the results of queries on those non-private fields. The only thing that can affect later queries in L_0 is a field declaration; therefore, a **Macro** morphism φ must map d into one or more field declarations. But if d is embedded in a record declaration *that doesn't use it*, it won't have any effect at all on later queries — and the only way that can be true in L_0 is if φ(d) doesn't declare any fields at all. So there is no choice of φ that preserves the expressive structure (which is to say, the subsetting structure) in all cases.

So 〈L_priv, **Inc**〉 is strictly more **Macro**-abstractive than 〈L_0, **Inc**〉.