Tuesday, March 17, 2020

Irregularity in language

you wouldn't believe the kind of hate mail I get about my work on irregular verbs
Steven Pinker, in an interview with The Guardian, 2007.

Assembling my prototype conlang Lamlosuo transformed my understanding of irregularity in language.  That was unexpected.  The prototype was supposed to be a learning vehicle, yes — for learning about the language model I'd devised.  Irregularity wasn't mentioned on the syllabus.

I set out to create an experimental prototype conlang with radically different semantic underpinnings than human natural languages (natlangs).  (This blog is littered with evidence of my penchant for studying the structure of things by devising alternative structures to contrast with them.)  The prototype was meant as a testbed for trying out features for conlangs based on the envisioned semantics; it had no strong stake in regularity, one way or another, aside from an inclination not to deliberately build in irregularities that would make the testbed less convenient to work with.  The effect of the experiment, though, was rather like scattering iron filings on a piece of paper above a magnet, and thereby revealing unsuspected, ordinarily-invisible structure.  From contemplating the shape of the prototype that emerged, I've both revised my thinking on irregularity in general, and drawn take-away lessons on the character of the language structure the prototype is actually meant to explore.

My first post on Lamlosuo, several years ago now, laid out the premise of the project and a limited set of its structural consequences, while deferring further complications —such as an in-depth discussion of irregularity— to some later post.  This post is its immediate sequel, describing major irregular elements of Lamlosuo as they emerged, as well as what I learned from them about irregularity in general and about the language model in particular.

[Overall insights about the language project are largely —though by no means entirely— concentrated in the final section below.  Insights into irregularity are distributed through the discussion, as they arise from details of the language.]

Vector language
Routine idiosyncrasies
Patterns of variation
Extraordinary idiosyncrasies
Whither Lamlosuo?

From our early-1970s hardcopy Britannica (bought by my parents to support their children's education), I gathered that commonly used words tend to accumulate irregularities, while uncommonly used words tend to accumulate regularity by shedding their irregularities.  From 1990s internet resources on conlanging (published there by scattered conlangers as they reached out through the new medium to form a community), I gathered that irregularity may be introduced into a conlang to make it feel more naturalistic.  All of which I still believe, but these credible ideas can easily morph into a couple of major misapprehensions about irregularity, both of which I was nursing by the time I committed to my first conlanging project, at the turn of the century:  that the only reason natlangs have irregularity is that natlangs evolve randomly in the field, so that a planned conlang would only have irregularity if the designer deliberately put it there; and that irregularity serves no useful function in a language, so that desire for naturalism would be the only reason a conlang designer would put it there.

20 years later, I'd put my current understanding this way:  Irregularity is a natural consequence of the impedance mismatch between the formal structure of language and the sapient semantics communicated through it (a mismatch I last blogged about yonder).  Sapient thought structures are too volatile to fit neatly into a single rigid format; large parts of a language, relatively far from its semantic core, may be tolerably regular, but the closer things get to its semantic core, the more often they call for variant structure.  It may even be advantageous for elements near the core to be just slightly out of tune with each other, so they create (to use another physics metaphor) a complex interference pattern that can be exploited to slip sapient-semantic notions through the formal structure.  Conversely, one may be able to deduce where the semantic core of the language is, from where this effect stirs up irregularity.  By similar feedback, also, structural nonuniformities can orient sapient users of the language as they work intensively with the semantic core; I analogize this with the bumps on the F and J keys of a QWERTY keyboard, which allow a touch-typist to feel when their fingers are in standard position.

These effects are likely to apply as well to programming languages, which are ultimately vehicles for sapient thought.  Note that the most peculiar symbol names of Lisp are concentrated at its semantic core:  lambda, car, cdr.

Vector language

My central goal for this first conlanging project was to entirely eliminate nouns and verbs, in a grammatical sense, by replacing the verb-with-arguments structural primitive of human natlangs with some fundamentally different structural primitive.  The verb-with-arguments structural pattern induces asymmetry between the grammatical functions of the central "verb" and the surrounding "nouns", which afaics is where the grammatical distinction between verbs and nouns comes from.  (My notes also call these "being-doing" languages, as verbs commonly specify "doing" something while nouns specify simply "being" something.)  In the structure I came up with to replace this, each content element would be, uniformly, an act of motion ("going"), understood to entail a thing that goes (the cursor), where it's going from and to, and perhaps some other elements such as the path by which it goes.  For the project as a whole I hoped to have several related languages and some grammatical variance between them, but figured I'd need first to understand better how a language of this sort can work, to understand the kinds of variation possible.  So I set out to build a prototype language, to serve as a testbed for studying whether-and-how the language model could work.

In the prototype language, there is just one open class of vocabulary words, called vectors, each of which has five participant slots, called roles.  The five roles are:  cursor, start, end, path, and pivot.  The name pivot suggests that the action is somehow oriented about the pivot element, but really the pivot role is a sort of catch-all, a place to put an additional object associated with the action in some way.  The pivot role in itself says something about irregularity.  In lexicon building, each vector has definitions for each of its occupied roles.  Defining all these roles for a given vector, I've found, establishes the meaning of the vector with great clarity.  The cursor is the only absolutely mandatory role:  there can't be a going without something that goes.  The start and end are usually clear.  The path is usually fairly straightforward as well, though sometimes occupied by an abstract process rather than a physical route of travel.  But each vector is, in the end, semantically unique; and its uniqueness rebels against being pinned down precisely into a predetermined form —I analogize this to the Heisenberg uncertainty principle, where constraining one part of a particle's description requires greater leeway for another part— so that while the cursor, start, and end are usually quite uniform, and the path has limited flexibility, the pivot provides more significant slack to accommodate the idiosyncrasy of each vector.

For example:  The first meaning I worked out for the language was a vector meaning speak.  This was before the language even had a phonology; it was meant to verify, before investing further in the structural primitive, that it was capable of handling abstracts; and speak, as a meaning in a conlang, was appealingly meta.  In a speech act, it seemed the thing that goes from somewhere to somewhere is the message; so I reckoned the cursor should be the message, the thing said.  The start would then be the speaker; and the end would be whoever receives it, the audience.  It was unclear whether the path would be more usefully assigned to the route by which the message travels, or the transmission medium through which it travels (such as the air carrying sound, or aether carrying radio waves); waiting for a preference to emerge, I toyed with one or the other in my notes but ultimately the path role of that vector has remained unoccupied.  For the pivot, I struck on the idea of making it the language in which the message is expressed (such as English, or Lamlosuo).
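As a rough illustration of the five-role layout, a lexicon entry can be sketched as a small record. This is only a reading aid, not how the language notes are actually stored; the class name `Vector`, the field `gloss`, and the method `occupied_roles` are hypothetical names chosen here.

```python
from dataclasses import dataclass
from typing import Optional

ROLES = ("cursor", "start", "end", "path", "pivot")

@dataclass(frozen=True)
class Vector:
    """Hypothetical lexicon entry: one definition per occupied role."""
    gloss: str
    cursor: str                      # the only mandatory role
    start: Optional[str] = None
    end: Optional[str] = None
    path: Optional[str] = None       # None means the role is unoccupied
    pivot: Optional[str] = None      # catch-all role

    def occupied_roles(self):
        return [r for r in ROLES if getattr(self, r) is not None]

# The "speak" vector as described in the text; its path stays unoccupied.
speak = Vector(
    gloss="speak",
    cursor="message",
    start="speaker",
    end="audience",
    pivot="language of the message",
)
print(speak.occupied_roles())  # ['cursor', 'start', 'end', 'pivot']
```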

The "escape-valve" pattern —regularity with an outlet to accommodate variant structure that doesn't neatly fit the regularity— recurred a number of times in the language design as it gradually emerged.  The various escape mechanisms accommodate different grades of variant structure, and while the relations between these devices are more complex than mere nesting, the whole reminds me somewhat of a set of matryoshka dolls.  With that image in mind, I'm going to try to order my description of these devices from the outside in, from the broadest and mildest irregularities to the narrowest and most extreme.

It's a fair question, btw, where all this emergent structure in the prototype emerges from.  It all comes through my mind; the question is, what was I tapping into?  (I'll set the origin of the vector primitive itself outside the scope of the question, as the initial inspiration seems rather discontinuous whereas the process after that may be somewhat fathomable.)  My intent has been to access the platonic structure of the language model; that's platonic with a lower-case p, meaning, structure independent of particular minds in the same sense that mathematical structure is independent of particular minds.  Given the chosen language primitive, I've tried through the prototype to explore the contours of the platonic structural space around that chosen primitive, letting natural eddies in that space shape the design while, hopefully, reducing perturbations from biases-of-thought enough to let the natural eddies dominate.  (I also have some things I could say on the relationship between platonic structure and sapient thought, which I might blog about at some point if I can figure out how to address it without getting myself, and possibly even those who read it, hopelessly mired in a quag of perspective bias.)


The outermost nesting shell, the outer matryoshka doll as it were, is, in theory, the entirely regular structure of the language.  I shall attempt to enumerate just those parts in this section, as briskly as may be.  This arrangement turns out to be somewhat challenging, both because language features aren't altogether neatly arranged by how regular they are, and because the noted concentration of irregularity toward the semantic core assures there will be some irregularity in nearly all remotely interesting examples in Lamlosuo (on top of the limitations of Lamlosuo's thin vocabulary).  Much of this material, with a bit more detail on some things and less on others, is included in the more leisurely treatment in the earlier post.

Ordinarily, a syllable has five possible onsets, f s l j w (as in fore, sore, lore, yore, wore);  five possible nuclei, i u e o a (close front, close back, mid front, mid back, open; in my idiolect, roughly as in teem, tomb, tame, tome, tom);  and two possible codas, n m (as in nor, more).  In writing a word, if a front vowel (i or e) is followed by j and another vowel, or if a back vowel (u or o) is followed by w and another vowel, the consonant between those vowels is omitted; for example, lamlosuwo would be shortened to lamlosuo.  Two other sounds occasionally arise:  an allophone of f, written as t (the initial sound of thorn);  and one plosive written as an apostrophe, ' (the initial sound of tore).
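The spelling rule can be sketched in a few lines of Python. This is a minimal sketch that assumes words are given in plain ASCII with the five vowels i u e o a; the function name `write_word` is a hypothetical helper, not something from the language notes.

```python
import re

def write_word(spoken: str) -> str:
    """Drop j after a front vowel, or w after a back vowel,
    when another vowel follows (the spelling rule described above)."""
    spoken = re.sub(r"(?<=[ie])j(?=[iueoa])", "", spoken)
    return re.sub(r"(?<=[uo])w(?=[iueoa])", "", spoken)

print(write_word("lamlosuwo"))  # lamlosuo
print(write_word("lawaja"))     # lawaja (j after a is kept)
```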

A basic vector word consists of an invariant stem and a mandatory class suffix.  The stem is two or more consonant-vowel syllables (accent on the first syllable of the stem), and the class suffix is one consonant-vowel syllable.  There are eleven classes:  the neutral class, and ten genders;  a neutral vector is sort-of a lexical verb, an engendered vector is sort-of a lexical noun (though this distinction lacks grammatical punch, as they're all still vectors).  The neutral suffix after a back vowel (u or o) is ‑wa, otherwise it's ‑ja (so, the suffix consonant is omitted unless the stem ends with a).  Genders identify role (one of the five) and volitionality (volitional or non-volitional).  Non-volitional genders use front vowels, volitional genders use back vowels; the onset determines the role:  ‑li/‑lu cursor, ‑ti/‑tu start, ‑se/‑so end, ‑je/‑jo path, ‑we/‑wo pivot.  Somewhat relevant to irregularity, btw:  start and end genders deliberately use different vowels to strengthen their phonological contrast since they have relatively weak semantic contrast; while, on the other hand, an earlier experiment in the language determined that assigning the vowels in consistent sets (either i/u or e/o, never i/o or e/u) is a desirable regularity to avoid confusion.

For example:  The vector meaning speak has stem losu-.  The neutral form is losua; engendered forms are losuli (message, non-volitional), losutu (speaker, volitional), losuso (audience, volitional), losuo/losue (living language/non-living language).  My first thought for the non-volitional pivot, losue, was dead language; but then it occurred to me that that gender would also suit a conlang.
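The class suffixes and the spelling rule together are mechanical enough to sketch in Python. The following is an illustration under the assumption that stems are plain ASCII strings; `GENDERS`, `write_word`, `neutral`, and `engender` are hypothetical names introduced here.

```python
import re

# Gender suffixes as listed in the text: (non-volitional, volitional).
GENDERS = {
    "cursor": ("li", "lu"),
    "start": ("ti", "tu"),
    "end": ("se", "so"),
    "path": ("je", "jo"),
    "pivot": ("we", "wo"),
}

def write_word(spoken: str) -> str:
    # Spelling rule: drop j after i/e, or w after u/o, before a vowel.
    spoken = re.sub(r"(?<=[ie])j(?=[iueoa])", "", spoken)
    return re.sub(r"(?<=[uo])w(?=[iueoa])", "", spoken)

def neutral(stem: str) -> str:
    # Neutral suffix is -wa after a back vowel, otherwise -ja.
    suffix = "wa" if stem[-1] in "uo" else "ja"
    return write_word(stem + suffix)

def engender(stem: str, role: str, volitional: bool) -> str:
    suffix = GENDERS[role][1 if volitional else 0]
    return write_word(stem + suffix)

print(neutral("losu"))                  # losua
print(engender("losu", "start", True))  # losutu
print(engender("losu", "pivot", True))  # losuo
print(engender("losu", "pivot", False)) # losue
```

The pivot forms show the spelling rule at work: losu + wo and losu + we both lose their w because it follows the back vowel u.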

Vector words can also take any of a limited set of prefixes, each of the form consonant-vowel-consonant; as the two coda consonants are very similar (m and n), I try to avoid using two prefixes that differ only by coda.  In ideal principle, each prefix would modify its vector in a uniform way.  A vector prefix can also be detached from the vector it modifies, to become a preposition. 

A simple clause is a chain of vectors, where each pair of consecutive vectors in the chain is connected by means of role alignment.  Generically, one puts between the two vectors first a dominant role particle, which specifies a role of the first vector (the dominant vector in the alignment), then a subordinate role particle specifying a role of the second vector (the subordinate vector in the alignment), indicating that the same object occupies those two roles.  Ordinarily, the dominant role particles are just the volitional gender suffixes, the subordinate role particles are just the non-volitional gender suffixes, all now as standalone words, except using f rather than t for the start particles.  For instance, losua fu li susua would equate the start of losua with the cursor of susua.  If a vector is engendered, one may omit its role particle from an alignment, in which case by default it aligns on its engendered role (though an engendered vector can be explicitly aligned on any of its roles).  There is also a set of combined role particles, using the usual role consonants with vowel a; a combined role particle aligns both vectors on that role.

Each of the fifteen basic role particles (five dominant, five subordinate, five combined) has a restrictive variant; the distinction being that a non-restrictive alignment asserts a relationship between vectors whose meanings are determined by other means, while a restrictive alignment must be taken into account in determining the meanings of the vectors.  Each restrictive role particle prefixes the corresponding non-restrictive particle with its own vowel; thus, ja → a‍ja, etc.
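The particle inventory and the restrictive derivation can be sketched as follows. This is an illustration only: the table and function names are hypothetical, and the combined start particle is assumed to be fa (f rather than t, by analogy with the other start particles), which the text does not state outright.

```python
import re

ROLES = ("cursor", "start", "end", "path", "pivot")
DOMINANT = dict(zip(ROLES, ("lu", "fu", "so", "jo", "wo")))
SUBORDINATE = dict(zip(ROLES, ("li", "fi", "se", "je", "we")))
# Assumption: combined start particle spelled with f, like fu and fi.
COMBINED = dict(zip(ROLES, ("la", "fa", "sa", "ja", "wa")))

def write_word(spoken: str) -> str:
    # Spelling rule: drop j after i/e, or w after u/o, before a vowel.
    spoken = re.sub(r"(?<=[ie])j(?=[iueoa])", "", spoken)
    return re.sub(r"(?<=[uo])w(?=[iueoa])", "", spoken)

def restrictive(particle: str) -> str:
    # Prefix the particle with its own vowel, then apply the spelling rule.
    return write_word(particle[-1] + particle)

def alignment(dom_vec: str, dom_p: str, sub_p: str, sub_vec: str) -> str:
    # Look up which role each particle names and state the equation.
    dom_role = next(r for r, p in DOMINANT.items() if p == dom_p)
    sub_role = next(r for r, p in SUBORDINATE.items() if p == sub_p)
    return f"{dom_role} of {dom_vec} = {sub_role} of {sub_vec}"

print(alignment("losua", "fu", "li", "susua"))  # start of losua = cursor of susua
print(restrictive("ja"), restrictive("wo"))     # aja oo
```

Under these assumptions the restrictive form of wo comes out as oo, since the w of owo drops after the back vowel.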

A clause can be packaged up as an object by preceding it with a subordinate content particle.  A subordinate content particle is simply a single vowel, as a standalone word.  The five subordinate content particles determine the mood of the objectified clause (and can also be used at the front of a sentence to assign a mood to the whole thing):  a, indicative; i, invitational; u, imperative; e, noncommittal; o, tentative.  Having bundled up a clause as an object, one can then treat it as the subordinate half of a role alignment with a dominant vector.  There are also dominant content particles, which package up the dominant vector (just the one vector) as an object to align with some role of the subordinate vector, thus beginning a subordinate relative clause.  Dominant content particles prefix ow- to the corresponding subordinate content particles (the w attaches to the second syllable, and then is dropped since preceded by a back vowel) — with a lone exception for the dominant tentative content particle, which by strictly regular construction should be oo but uses leading vowel u (thus, uo) to avoid confusion with the dominant restrictive pivot particle (oo).  (In crafting that detail, I was reminded of English "its" versus "it's".)
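The derivation of the dominant content particles, including the lone exception, can be sketched like this. The names `MOODS`, `write_word`, and `dominant_content` are hypothetical, and the sketch assumes the spelling rule is applied after prefixing ow-.

```python
import re

MOODS = {
    "a": "indicative",
    "i": "invitational",
    "u": "imperative",
    "e": "noncommittal",
    "o": "tentative",
}

def write_word(spoken: str) -> str:
    # Spelling rule: drop j after i/e, or w after u/o, before a vowel.
    spoken = re.sub(r"(?<=[ie])j(?=[iueoa])", "", spoken)
    return re.sub(r"(?<=[uo])w(?=[iueoa])", "", spoken)

def dominant_content(subordinate: str) -> str:
    if subordinate == "o":
        # Exception: the regular form "oo" would collide with the
        # dominant restrictive pivot particle, so "uo" is used instead.
        return "uo"
    return write_word("ow" + subordinate)

for vowel, mood in MOODS.items():
    print(mood, vowel, dominant_content(vowel))
```

The regular construction for the tentative mood would be `write_word("owo")`, which yields oo, the collision the exception avoids.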

The image of a subordinate content particle packaging up a subordinate clause and objectifying it for alignment with a dominant role seems to have built into it a phrase-structure view of the situation.  Possibly there is a way to view the same thing in a dependency-grammar framework (rather like wave-particle duality in physics); the whole constituency/dependency thing is not yet properly clear to me, and when I designed that part of Lamlosuo I was unaware of the whole controversy:  phrase-structure was the only approach to grammar I'd even seen, somewhat in grade-school and intensively in compiler parsing algorithms.  So, this particular part of the language design might or might not contain an embedded structural bias.

A provector has a stem of the form vowel-consonant and a class suffix.  The provector stems are in- (interrogative), um- (recollective), en- (indefinite), on- (relative), an- (demonstrative).  The recollective provector has an antecedent earlier in the clause, and does not align with its syntactic predecessor; where ordinarily alignment can only align a vector with two others (the one before it and the one after it), as antecedent of a recollective provector it can participate in any number of additional alignments.  (The demonstrative provector, btw, serves the function of a third-person pronoun, using cursor an‍lu/an‍li in general, volitional start an‍tu for a person of the same sex/gender as the speaker, volitional end an‍so for a person of different sex/gender from the speaker; but I digress.)

A vector can incorporate a simple clause.  Position the vector at the front of the simple clause, and join the entire clause together with plosives (') between its words; the whole then aligns as its first vector, with the rest of the incorporated clause aligned to it independent of any other surrounding context.  Recollective provectors may be disambiguated by incorporating a copy of the antecedent vector.

Routine idiosyncrasies

Beyond a vector's definitions of its neutral form and up-to-ten genders, each vector has a number of conventions associated with it that accommodate low-to-medium-grade vector-idiosyncrasies of the sort that occur broadly throughout the vocabulary.  Role alignment is not as simple as "the object that occupies this role of this vector is the same object that occupies that role of that vector":  that isn't always the sort of relation-between-vectors that's wanted, and when it is, there may be refinements needed to clarify what is meant.  The meaning of an alignment is resolved primarily by alignment conventions of the dominant vector.  My notes on the language design suggest that exceptions to the regular sense of alignment are most often associated with vectors corresponding, in a verb-with-arguments language, to conjunctions and helping verbs.

Combined role particles play a significant part in this because, it turns out, the "standard" meaning of the combined role particles —to align the same role of both vectors, thus la = lu li, sa = so se, etc.— is rarely wanted.  The combined role particles are therefore an especially likely choice for reassignment by convention based on more realistic uses of a particular vector.  A given vector often has some practical use, due to the particular meaning of that vector, for alignments that involve multiple roles of each vector (as a simple example, one might equate the cursor of both vectors, and at the same time equate the end of the first vector with the start of the second); or, sometimes, for some other more peculiar alignment strategy appropriate to the vector's particular meaning; and combined role particles are routinely drafted for the purpose.

Several rather ordinary vectors have some role that, by the nature of their meaning, is often a complex information structure described by a subordinate clause, and therefore they use the combined role particle on that role to imply a subordinate noncommittal content particle (e):  losua la — (say that —), lawa‍ja la — (teach that —), susu‍a wa — (dream that —); sofo‍a (deduce) and soo‍a (imply) do this on multiple roles.  A more mundane example of variant alignment conventions (not involving implied content particles) applies to stem seli-, go repeatedly, whose cursor is the set of instances of repeated going.  When dominant in an alignment with combined cursor particle la, the subordinate vector is what is done repeatedly (a restrictive alignment); subordinate start, path, and end are assigned to those dominant roles, while subordinate cursor is assigned to dominant pivot.  Preceding seli- by a number indicates the number of repetitions; for example, siwe‍a seli‍a la jasu‍a = sneeze three times.  (In fact, this can be shortened to siwe seli jasu‍a; see my earlier remark on interesting examples.)

A moderately irregular configuration is two neutral vectors used consecutively in a clause with no explicit particle between them.  The strictly regular language assigns no meaning to this arrangement, as there are no gender suffixes on the vectors to determine default roles when omitting the role particles; the configuration has to depend on conventions of the dominant (or, less likely, the subordinate) vector.  The language notes stipulate that this type of alignment is restrictive.

Patterns of variation

The alignment idiosyncrasies of particular vectors fall into overall patterns.  At the start of Lamlosuo I didn't see this coming, which in retrospect seems part of my general failure to appreciate that irregularity is more than skin deep.  With increasing number of vectors explored in the lexicon, though, I began to sense the shapes of these patterns beneath the surface, and then tried to work out what some of them were.

Because these are patterns that arise in other patterns that arise in the language, they compound the ambiguity between (again) the language's platonic structure versus my latent biases of thinking:  each lexicon entry is subject to this ambiguity, both in the choice of the entry and in its articulation, while the perception of patterns amongst the data points is ambiguous again.  This blog post has a lopsided interest in the platonic structure —my biases would be entirely irrelevant if not for the drive to subtract them from the picture— but I'd recommend here to not stint on even-handed skepticism.  Vulnerable as the process is to infiltration by biases of thinking (the phrase "leaky as a sieve" comes to mind), it should be no less vulnerable to infiltration from the platonic structure of the language.  Influences from the platonic realm can seep in both directly by perturbing interplay of language elements, and indirectly by perturbing biases of thought at any point in the backstory of the thought.  Biased influence can therefore be platonic influence; or, putting the same thing another way, the only biases we'd want to subtract from the picture are those that aren't ultimately caused by the platonic structure.  However murky the process gets, I'd still hope for the emergent patterns to carry evidence of the platonic structure.

Very early on, I'd speculated consecutive neutral vectors might align by chaining sequentially, cursor-to-cursor and end-to-start.  This in its pure form looked less plausible as the lexicon grew, as it became clear that many vectors were of the wrong form.  (For instance, aligning losua susu‍a in this way —susu‍a means sleep— would equate the message with the person who sleeps, and the audience with the act of falling asleep.)  Another early notion was that some vectors would be used to modify other vectors, by aligning in parallel with them — equating cursor-to-cursor, path-to-path, start-to-start, end-to-end.  I've called these modifiers advectors.  Parallel alignment could be assigned, by dominant-vector-driven convention, to consecutive neutral vectors, and perhaps to the combined path particle (ja, which in this case would take on restrictive effect by convention).  The sequential/parallel preference also arises in the semantics of more general alignments, such as the sentence (mentioned earlier) losua fu li susu‍a, which describes a speech act and a sleep act, both by the same person (dominant start, the speaker, is aligned with subordinate cursor, the sleeper); to understand the import of the alignment, one has to know whether the speaking and sleeping events take place in parallel (so that the person is speaking while sleeping) or in series (so that the person speaks and then sleeps).

When the merger of two vectors allows their combination to be treated as a single vector, the vector stems may be concatenated directly, forming a compound stem which can then be engendered after the merge.  For example, stem lolulelo- means father, sequentially combining lolu- (impregnate) and lelo- (give birth to).  According to current notes, btw, lolulelo- has shortened form lole-.

When a series of consecutive neutral vectors forms a compact clause, short of merging into a single compact vector, I've considered a convention that the neutral class suffix may be omitted from all but the last of the sequence: "in all but the most formal modern usage", as the language notes say.  (Evidently I hesitated over this, as the LaTeX document has a boolean flag in it to turn this language feature on or off; but it's currently on.)

Accumulating vocabulary gradually revealed that pivots generally fell into several groupings.  A reference point defining the action (whence the term pivot).  Intermediate point on the path.  Motivation for the action.  Agent causing the action.  Instrument.  Vehicle.  Listing these now, it seems apparent these are the sorts of things that —in a typical European natlang— might well manifest as a clutter of more-or-less-obscure noun cases.  I'd honestly never thought of those sorts of clutters-of-noun-cases as a form of intermediate-grade irregularity (despite having boggled at, say, Finnish locatives); and now I'm wondering why I hadn't.

Eventually, I worked out a tentative system of three logical roles —patient, agent, instrument— superimposed on the five concrete roles.  These logical roles would map to concrete roles identifying the associated noun primarily affected by the action (patient), initiating the action (agent), and implementing the action (instrument).  Of the three, only patient is mandatory; agent and instrument often occur, but sometimes either or both don't.  Afaics, agent and instrument are always distinct from each other, but either may map to the same concrete role as patient.

Patient is usually either cursor or end, though occasionally pivot or start; "path patient", say my notes, "is unattested". Agent is usually either cursor, start, or pivot; if the patient is the cursor, usually the agent is either pivot or cursor.  Instrument is usually either cursor or pivot:  pivot when cursor is agent, cursor when cursor isn't agent.  Patient also correlates with natlang verb valency:  when a vector corresponds to an intransitive verb, its patient is almost always the cursor; when to a transitive verb, its patient is typically the end.

For some time it remained unclear whether the logical roles should be considered a platonic feature.  I've often taken a "try it, see if it works" attitude toward adding things to the language, which is after all meant to be a testbed; the eventual rough indicator of a feature's platonic authenticity (platonicity?) is then how well it takes hold in the language once added.  A few of the things I've added just sat there inertly in the language design, until eventually discarded as failing to resonate with the design (such as a vector equivalent of the English verb be; which in retrospect clashes with the Lamlosuo design, both as copula which is what role particles are for, and as pure being whereas vectors impose motion on everything they portray).  Given some time to settle in, logical roles appear reasonably successful, having deeply integrated into some inner workings of the language:  various sorts of alignments both guide and are guided by logical roles.  Alignment guides logical roles, notably, in restrictive sequential or parallel alignments; for example, an advector inherits the logical roles of the other vector in parallel alignment.  Logical roles guide alignment in the highly irregular vector(s) at the apparent heart of the language, which I'll describe momentarily.

I wondered about aspect —the structure of an activity with respect to time (as opposed to its placement in time, which is tense)— for the prototype language, since aspect is a prominent feature of human natlangs.  Aspect has arisen in Lamlosuo mainly through the principle that the action of a neutral vector is usually supposed by default to happen once, whereas the action of an engendered vector is usually supposed to happen habitually.  Thus, in losua fu li susu‍a someone speaks and then sleeps, whereas in losutu li susu‍a a habitual speaker sleeps.  Usually, in a restrictive alignment, aspect too is inherited by the dominant vector, which affords some games with aspect by particular vectors (deferred to the next section below).  If one wanted more nuanced sorts of aspect in the testbed language, one might introduce them through alignments with particular vectors that exist to embody those aspects; however, I never actually did this.  Allowing myself to be guided by whatever "felt" natural to pursue (so one may speculate what sort of butterfly started the relevant breeze), my explorations led me instead to something... different.  Not relating a vector to time, but rather taking "tangents" to the basic vector at various points and in various abstract-spatial directions.  As the trend became more evident, I dubbed that sort of derived relation attitude.  (My language notes assert, within the fictional narrative, that the emphasis on attitude rather than aspect is a natural consequence of the language speakers' navigational mindset.)  Some rather mundanely regular particular vectors were introduced to support attitudes; looking through the lexicon, I see stems jali- (leave), jeli- (go continuously), joli- (arrive), supporting respectively the inceptive, progressive, and terminative attitudes.

Extraordinary idiosyncrasies

In any given language, it seems, there's likely to be some particular hotspot in the vocabulary where idiosyncrasies cluster.  Hopefully, the location of such a hotspot ought to say something profound about the language model, though as usual there's always potential bias-of-thought to take into account.  The English verb be is a serious contender for the most irregular verb in the language, with do coming in a respectable second to round out this semantic heart of the language structure.  As noted earlier, I've sometimes referred to human languages as "being-doing languages"; and occasionally my notes have called vector languages "going languages".  Early on, I simplistically imagined that a generic vector meaning go might be the center of the language.  Apparently not, though; in the central neighborhood, yes, but not at the very heart.  The stand-out vector that's accumulated irregularity like it's going out of style is fajoa, meaning change state.

A sort of microcosm for this hotspot effect occurs in the finitely bounded set of Lamlosuo's vector prefixes (which, by the phonotactics described earlier, are each consonant-vowel-consonant, so there are at most 5×5×2 = 50 of them, or 25 if no two prefixes differ only in their final consonant; the current lexicon has 12, which is about 50% build-out and feels fairly dense).  Most of the prefixes are fairly straightforward in function (since prefix jun- makes a vector reflexive, junlosua would be talk to oneself; and so on).  The most exceptional prefix, consistently through the evolution of the language, has been lam-, which makes the vector deictic, i.e., makes it refer to the current situation.  The deictic prefix, as I've used it, is rather strongly transformative and I've used it only sparingly, on a few vectors where its effect is especially useful; in particular, stems losu-, sile-, jilu-.  (Though I would expect a fluent speaker to confidently use lam- in less usual ways when appropriate, as fluent speakers are apt to occasionally bend their language to the bafflement of L2 speakers.)
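The prefix arithmetic above can be checked with a minimal Python sketch.  The phoneme inventory here is an assumption, inferred only from the words quoted in this post (five initial consonants, five vowels, two final consonants); the names ONSETS, VOWELS, CODAS and LEXICON_SIZE are hypothetical, made up for the illustration.

```python
from itertools import product

# ASSUMPTION: inventory inferred from words quoted in this post,
# not taken from the language notes themselves.
ONSETS = "fjlsw"   # 5 initial consonants
VOWELS = "aeiou"   # 5 vowels
CODAS = "mn"       # 2 final consonants (as in lam-, jun-, fam-)

# Every consonant-vowel-consonant prefix the phonotactics would allow.
all_prefixes = ["".join(p) for p in product(ONSETS, VOWELS, CODAS)]

# Prefixes that stay distinct if the final consonant is ignored.
distinct_cv = {p[:2] for p in all_prefixes}

LEXICON_SIZE = 12  # prefixes in the current lexicon, per the text
build_out = LEXICON_SIZE / len(distinct_cv)

print(len(all_prefixes))   # 50
print(len(distinct_cv))    # 25
print(f"{build_out:.0%}")  # 48%
```

So the "about 50%" figure is 12 out of the 25 prefixes that differ in more than their final consonant.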

Stem lam‍losu- is the speaking in which the word itself occurs.  Several of its engendered forms are particularly useful; lamlosuo (volitional pivot) is the living language of the speaking in which the word occurs, hence the conlang itself (viewed fictionally as a living language); lam‍losu‍so (volitional end) is the audience, thus the second-person pronoun; lam‍losu‍tu (volitional start) is the speaker, thus the first-person pronoun.  The latter two are contracted (a purely syntactic form of irregularity, motivated by convenience/practicality) to laso and latu.

Stem sile- means experience the passage of time; the cursor is the experiencer; path, time; start, the experiencer's past; end, their future; pivot, the moment of experience.  lam‍sile- is the immediate experience-of-time, whose pivot is now; after working with it for a while, I adopted a convention that the past/present/future might colloquially omit the prefix.  Tense is indicated by aligning a clause with engendered sile‍tu (past), sile‍wo (present, if one wants to specify that explicitly), or sile‍so (future).  Hence, latu fi losua oa sile‍tu = sile‍tu a latu fi losua = I spoke.

Stem jilu- means go or travel in a generic sense (whereas go in a directional sense is wilu-).  lam‍jilu‍a is the going we're now engaged in; its cursor is an inclusive first-person pronoun (we who are going together); path, the journey we're all on (i.e., the activity we're engaged in); pivot, here; end (or occasionally start), there.  With preposition sum indicating a long path, this enters into the formal phrase sum lam‍sile‍tu sum lam‍jilu‍se = long ago and far away.

Now, fajo-.  Change state.  The cursor is the thing whose state changes.  Non-volitional path is the process of state-change, volitional path is the instrument of state-change.  Non-volitional pivot is an intermediate state, volitional pivot is the agent of state-change.  Start and end, both non-volitional, are the state before and after the change.

When dominant fajo- aligns its cursor with some role of a subordinate vector, fajo- is the state change undergone by the aligned subordinate role during the action of the subordinate vector.  Either the dominant role, the subordinate role, or both may be elided; the dominant role when unspecified defaults to cursor —even if fajo- is engendered, an extraordinary exception— while the subordinate role when unspecified defaults to patient, making the meaning of the construct overtly dependent on which concrete role of the subordinate vector is the patient.  Along with all this, the dominant pivot aligns to the subordinate agent, and dominant path to subordinate instrument (when the subordinate vector has those logical roles).  According to the language notes, if the subordinate vector doesn't have an agent, and the subordinate pivot is an intermediate point on the subordinate path (as e.g. for sile-), and the subordinate cursor aligns with the dominant cursor, the dominant pivot is the state of the subordinate cursor as it passes through the subordinate pivot.

One thus has such creatures as fajo‍ti losu‍tu, the state of having not yet spoken; and fajo‍se losu‍so, the state of having been spoken to.  (Notice that these things take many more words to say in English than in Lamlosuo, whereas the past tense took many more words to say in Lamlosuo than in English.)

Cursor-aligned fajo- can also take the form of a preposition fam or prefix fam-, with the difference between the two that engenderment of the vector is applied after a prefix, but before a preposition.  Thus, fam⟨stem⟩⟨gender⟩ = fajo⟨gender⟩ ⟨stem⟩a.  For example, susu‍e = event of dreaming, fam‍susu‍e = fajo‍e susu‍a = state of dreaming.
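The prefix equivalence fam⟨stem⟩⟨gender⟩ = fajo⟨gender⟩ ⟨stem⟩a is mechanical enough to sketch in a few lines of Python.  This is only an illustration of the rewrite rule as stated; the function names fam_form and fajo_form are hypothetical, and the sketch assumes the stem and gender suffix are supplied separately (it makes no attempt to parse a word into its parts).

```python
NEUTRAL = "a"  # neutral class suffix, as in susua, losua

def fam_form(stem: str, gender: str) -> str:
    """Prefix form: engenderment is applied after the prefix fam-."""
    return f"fam{stem}{gender}"

def fajo_form(stem: str, gender: str) -> str:
    """Equivalent two-word form: engendered fajo- plus the neutral vector."""
    return f"fajo{gender} {stem}{NEUTRAL}"

for stem, gender in [("susu", "e"), ("sile", "ti")]:
    print(fam_form(stem, gender), "=", fajo_form(stem, gender))
```

With stem susu and gender e this yields famsusue and fajoe susua, the state-of-dreaming pair from the example above.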

When dominant fajo- aligns its path with a subordinate content clause, fajo- is the state change vector of the complex process described by the content clause.  Combined role particle ja initiates a noncommittal content clause by implying subordinate content particle e.  The dominant cursor is then the situation throughout the process, dominant start the situation before the process, dominant end the situation after the process, dominant pivot the agent of the process.

fajoa has siblings lajoa and wajoa.

lajoa describes a change of mental state.  Dominant path of lajo- doesn't align with a subordinate clause, but dominant cursor aligns similarly to fajo-, describing the change of mental state of whichever participant in the subordinate action; noting, the agent, if not otherwise determined, is the cursor's inclination toward the change (always available in the volitional pivot engendered form, lajo‍o).  For example, recalling sile‍tu = past (earlier point in time), where fajo‍ti sile‍a = fam‍sile‍ti = youth (external state at earlier point in time), lajo‍ti sile‍a = inexperience (internal state at earlier point in time).  When the subordinate vector already describes the mind, fajo- describes mental state, and lajo- is not used; e.g., fam‍susu‍e = state of dreaming is primarily an internal state.

wajoa describes the abstract process of being used as an instrument.  Cursor, instance of use; non-volitional path, process of use; volitional path, person who uses; (volitional/non-volitional) pivot, agent of use; (volitional/non-volitional) start, instrument of use; end, patient of use.  Alignment is similar to fajo-, but subordinate role defaults to instrument rather than patient.  For example, wajo‍o jilu‍a = person who uses a vehicle or riding beast, wajo‍o jilu‍e = person who uses a vehicle, wajo‍o jilu‍o = person who uses a riding beast.

On the periphery of this central knot of irregularity is jilu-, meaning (again) go or travel in a generic sense.  When dominant in an alignment with combined path particle ja, the role particle implies subordinate noncommittal content particle e, and jilu- aligns in parallel (it's an advector) to whatever complex process is described by the following subordinate clause.  (I don't group this with the larger family of mundane vectors using combined role particles to imply subordinate content, because here the alignment is implicitly restrictive and doesn't follow from complexity in the semantics of the vector, as with teach (lawa-), imply (soo-), etc.)  Here the alignment is purely a grammatical device; it unifies a complex process from the subordinate clause into a coherent vector, and objectifies it as the volitional path (engendered form jilu‍jo).  More subtly, jilu‍a ja with an engendered subordinate vector can provide a neutral vector with habitual aspect:  jilu‍a wo jeo‍e = go using a fast vehicle (once), jilu‍a ja jeo‍e = habitually go using a fast vehicle.

One can (btw) also play games with habitual aspect in using fajo-, exactly because it doesn't inherit the aspect of the subordinate clause:  engendering fajo- gives the state change habitual aspect, but gender in the subordinate clause does not.  Thus,
latu we fajoa ja susu‍a lu laso = I (once) cause you to (once) sleep
latu we fajoa ja susu‍lu laso = I (once) cause you to habitually sleep
latu fajo‍o ja susu‍a lu laso = I habitually cause you to (once) sleep
latu fajo‍o ja susu‍lu laso = I habitually cause you to habitually sleep.
(Why I would have this soporific effect, we may suppose is provided by the context in which the sentence occurs.)

Whither Lamlosuo?

After a while —perhaps a year or more of tinkering— Lamlosuo began to take on an increasingly organic texture.  Natlangs draw richness from being shaped by many different people; a personal project, I think, when carried on for a long time starts to accrue richness from essentially the same source:  its single author is never truly the same person twice.  If you set aside the project and come back to it a week or a month later, you're not the same person you were when you set it aside; beside the additional things you've experienced in that time, most people would also no longer be quite immersed in some project details and would likely develop a somewhat different experience of them while reacquiring.  So the personal project really is developed by many people:  all the people that its single author becomes during development.  This enrichment cannot be readily duplicated over a short time, because the author doesn't change much in a short time.  This may be part of why the most impressive conlangs tend to be decades-long efforts; of course total labor adds up, but also, richness adds up.

The most active period of Lamlosuo development tailed off after about three years, due to a two-part problem in the vocabulary — phonaesthetic and semantic.

The phonology and phonotactics of Lamlosuo (whose conception I discussed a bit in the earlier post) are flat-out boring.  There are just-about no internal markers indicating morphological structure within a vector stem (even the class suffix is generally hard to recognize as not part of the stem), so there has been a bias toward two-syllable vector stems; it's been my perception that uniformly two-syllable simple stems help a listener identify the class suffix, so that nonuniform stem lengths (especially, odd-syllable-count stems) can be disorienting.  There are only a rather small number of two-syllable stems possible (basically, 5^4 = 625) and, moreover, packing those stems too close together within the available space not only makes them harder to remember, but harder even to distinguish.  After a while I reformed the lexicon a bit by easing in some informal principles about distance between stems (somewhat akin to Hamming distance) and some mnemonic similarities between semantically related stems.  The most recent version of the language document has 70 simple vector stems.
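To make the stem-space count and the Hamming-distance idea concrete, here is a minimal Python sketch.  It is not the procedure actually used on the lexicon (the principles were informal); the phoneme inventory is an assumption inferred from the stems quoted in this post, and the names CONSONANTS, VOWELS, hamming and sample are hypothetical.

```python
from itertools import combinations, product

# ASSUMPTION: inventory inferred from stems quoted in this post.
CONSONANTS = "fjlsw"
VOWELS = "aeiou"

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length stems differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length stems")
    return sum(x != y for x, y in zip(a, b))

# All consonant-vowel-consonant-vowel stems: 5 * 5 * 5 * 5 = 625.
stem_space = ["".join(p)
              for p in product(CONSONANTS, VOWELS, CONSONANTS, VOWELS)]
print(len(stem_space))

# A handful of stems mentioned in this post.
sample = ["fajo", "lajo", "wajo", "jilu", "wilu", "losu", "susu", "sile"]

# Pairs that differ in exactly one sound.
close_pairs = [(a, b) for a, b in combinations(sample, 2)
               if hamming(a, b) == 1]
print(close_pairs)
```

In this sample the only distance-1 pairs are within the fajo-/lajo-/wajo- family and between jilu- and wilu-, that is, between semantically related stems, which is the sort of mnemonic similarity the reform was after.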

Semantically, a large part of the extant vocabulary is about the mechanics of saying things — attitude, conjunctions, numbers.  One also wants to have something to talk about.  Not wanting to build social biases into a vocabulary that didn't yet have a culture attached to it, I started with vocabulary for rather generic biological functions (eat, sleep...) and navigational maneuvers (go fast/slow, go against the current...) on the —naive— theory this would be "safe".  Later, with the mechanics-oriented vocabulary more complete, a small second wave of content-oriented words ventured into emotional, intellectual, and spiritual matters.  (The notes outline somewhat more ambitious spiritual structure than has been implemented yet; though I do rather like the stems deployed so far (speaking of bias) — fulo-, go wrongly, go contrary to the proper order of things; jolo-, go rightly, go with the proper order of things; wio-, inform emotion with reason; wie-, inspire reason with emotion.)

I did take away some lessons from building content vocabulary for Lamlosuo.  The vector approach has a distinctly dynamic effect on the language's outlook, since it doesn't lend itself to merely labeling things but asks for some sort of "going" to underlie each word.  This led, for instance, to the coining of two different words for blood, depending on what activity it's engaged in — jesa‍lu (circulating blood) and fesa‍lu (spilt blood).  Also, just as the vector concept induces conception of motions for a given noun, the identification of roles for each vector induces conception of participants for a given activity; for instance, in trying to provide a vector corresponding to English adjective fast, one has first advector jeo‍a, go at high speed, from which one then gets jeo‍lu (fast goer), jeo‍e (fast vehicle), jeo‍o (fast riding beast).

The dynamism of everything being in motion is accompanied by a secondary effect playing in counterpoint:  whereas human languages tend to provide each verb with a noun actor, Lamlosuo is more concerned to provide each vector with a noun affected.  This is a rather subtle difference.  The human-language tendency manifests especially in the nominative case (which of course ergative languages don't have, but then, accusative languages are more common); the Lamlosuo tendency is visible in the stipulation that the patient logical role is mandatory while the agent role is optional (keeping in mind, my terms patient and agent for Lamlosuo have somewhat different connotations than those terms as commonly used in human grammar:  affected is not quite the same as acted upon).  The distinction seems to derive from the relatively calm, measured nature of the vector metaphor for activity:  while going is more dynamic than being, it is on balance less dynamic than most forms of doing.  (If there's a bias there in my patterns of thought, I'm not sure its effect on this feature could be distinguished from its effect on the selection of the vector primitive in the first place.)

From time to time as Lamlosuo has developed, I've wondered about personal names.  If even labeling ordinary classes of things requires the conception of underlying motions, how is one to handle a label meant to signify the unique identity of a particular person?  I would resist a strategy for names that felt too much like backsliding into "being-doing" mentality, since much of the point of the exercise is to try something different (and since, into the bargain, any such backsliding would be highly suspect of bias on my part; not that I'd absolutely stonewall such a development, but the case for it would have to be pretty compelling).  Early in the development of Lamlosuo, I was able to simply defer the question, as at that point questions about the language that had answers were the exception, and this was just one more in the general sea of unknowns.  Lately, though, in closely studying the evolution of abstract thought in ancient Greece (reading Bruno Snell's The Discovery of Mind, as part of drafting a follow-up to my post on storytelling), I'm struck by how Lamlosuo's ubiquity of process sits in relation to Snell's analysis of abstract nouns, concrete nouns, and proper nouns (and verbs, to which Snell ascribes a special part in forming potent metaphors).  The larger conlanging project, within which Lamlosuo serves, posits a long timeline of development of the conspeakers' civilization, and as I look at it now this begs the question of how their abstractions evolved.  Mapping out the evolution might or might not provide, or inspire, a solution to the naming problem; at any rate, it's deeply unclear at this point what these factors imply for Lamlosuo, as well as for the larger project.

Avoiding cultural assumptions in the vocabulary created a highly clinical atmosphere (which is why I called the hope of culture-neutrality "naive":  lack of cultural features is a kind of culture; also note, human culture ought to contain traces from the evolution of abstract thought).  Each word tended to be given a rather pure, antiseptic meaning (until late in the game when I started deliberately working in a bit of flavor), heightening a trend already latent in the cut-and-dried mechanics of the language that arose from its early intent, as a testbed, to not bother with naturalism (so, in a way all this traces back to regularity).  For example:  hoping to insulate the various sex-related vocabulary words from lewd overtones, I set out to fashion an advector corresponding to the English adjective obscene, so that one might then claim the various other words weren't obscene without the advector (which of course amounts to making those other words more clinical).  The result took on a life of its own.  Advector josu-, do something obscene (with absolutely no implication whatever as to what is done); start agent; end patient; pivot instrument.  One is naturally led to consider the difference between a non-volitional instrument and a volitional instrument.  Throw in the reflexive prefix and, for extra vitriol, an invitational mood particle, and you've got i jun‍josu‍a, which the language notes translate as "FU", but really it's more precise, more... clinical than that.

One natural next major step for Lamlosuo (if there were to be a next major step, rather than moving on to the other languages it was meant to prepare the way for) would be a push to significantly expand the vocabulary, to allow testing the dynamics of larger discourses.  (I wrote a long post a while back about long discourses.)  However, the bland, narrow vocabulary space seemed an obstacle to this sort of major vocabulary-expansion operation.  A serious naturalistic conlang would combat this sort of problem partly through the richness that, as noted, comes from developing in many sessions over a long time; but ultimately one also has to mix this with some technological methods.  Purely technological methods would always create something with an artificial feel, so one really wants to find ways of using technological methods to amplify whatever sapient richness is input to the system; and that sounds like an appropriate study for a testbed language such as Lamlosuo.  Moreover, I just don't readily track all the complex details of a linguistic project like this, not if it's skipping like a pebble across time, with intervals between development sessions ranging from a few hours to a few years; I therefore imagined some sort of automated system that would help keep track of the parts of the language design, noting which parts are more, or less, conformant to expected patterns, and why.  (I'm very much aware that, in creating such designs, to maintain an authentic sapient pattern you need to be able to explain an exception just once and not have the system keep hounding you about it until you give the answer the automated system favors.)

And at this point, things take an abrupt turn toward fexprs.  (Cf. the law of the instrument.)  My internal document describing the language is written in LaTeX.  Yet, as just described, I'd like it to do more, and do it ergonomically.  As it happens, I have a notion how to approach this, dormant since early in the development of my dissertation:  I've had in mind that, if (as I've been inclined to believe for some years now) fexprs ought to replace macros in pretty much all situations where macros are used, then it follows that TeX, which uses macros as its basic extension mechanism, should be redesigned to use fexprs instead.  LaTeX is a (huge) macro package for TeX.

So, Lamlosuo waits on the speculative notion of a redesign of TeX.  It seems I ought to come out of such a redesign with some sort of deeper understanding of the practical relationship between macro-based and fexpr-based implementations, because Knuth's design of TeX is in essence quite integrated — a daunting challenge to contemplate tampering with.  (One also has to keep in mind that the extreme stability of the TeX platform is one of its crucial features.)  It's rather sobering to realize that a fexpr-based redesign of TeX isn't the most grandiose plan in my collection.
