Language is the dress of thought.
-- Samuel Johnson, The Life of Cowley, 1779(?)
Only a few, out of the hundred, claimed to use mathematical symbology at all. [...] All of them said they did it [math or physics] mostly in imagery or figurative terms. An amazing 30% or so, including Einstein, were down here in the mudpies [doing]. Einstein's deposition said, "I have sensations of a kinesthetic or muscular type." Einstein could feel the abstract spaces he was dealing with, in the muscles of his arms and his fingers [...] almost no adult creative mathematician or physicist uses [symbology] to do it [...] They use this channel to communicate, but not to do their thing.
-- Alan Kay, Doing With Images Makes Symbols, 1987
I've some thoughts to pursue here, drawing together the timeline of human evolution, the nature of human consciousness, human language processing, evolutionary memetics, and a few other odds and ends.
I've been developing various elements of this for years; several of them have featured in earlier posts on this blog. They came together into a big picture of sorts early this year as I've been reading Richard Leakey's 1994 The Origin of Humankind. (I've found the content of great interest, although, for some reason I've been unable to quite place, the writing style often causes me to lose track of what he's just said and have to back up, sometimes by a page or more and sometimes more than once; I was sufficiently interested I was willing to back up for comprehension. Interestingly, the effect seemed less pronounced when the material was read aloud.)
Thinking without language
The non-Cartesian theater
The dress of thought
Leakey's book describes controversies in paleoanthropology. Fascinating stuff if you're interested in how scientific ideas develop (which I am), or in how humans developed (which I also am). Much of it has to do with when in human evolution various peculiarly human traits emerged.
Darwin suggested that three major human traits all co-evolved, developing simultaneously as a package because they complement each other: making and using tools, bipedalism which frees up the hands to make and use tools, and a big brain for figuring out how to make and use tools. Since it's a cogent idea and, into the bargain, appeals to one's sense of human exceptionalism, the co-evolution idea held sway for about a century before people questioned it and it broke down under scrutiny. Bipedalism seems to have emerged around five million years ago, tools maybe two and a half million years ago, and Leakey suggests a bigger brain more or less contemporaneous with tools.
Two other events are also of interest: the emergence of language, and the emergence of recognizably modern human behavior, with art, tools for making tools, tools with artistic flair, tools for making clothing, etc. The latter took place, by the evidence, around 35,000 years ago in Europe, some tens of thousands of years earlier in Africa; the beginning of the Upper Paleolithic, a.k.a. the Late Stone Age. When language emerged is harder to judge, but Leakey suggests it goes back as far as tools and the big brain. (It seems somewhat ironic that Leakey stresses how Darwin's co-evolution theory was wrong, and then Leakey separates out bipedalism but ends up promoting co-evolution of a different set of traits; I suspect he's likely right, but it is a bit bemusing that the theories would reshuffle that way.)
A question that hovers over Leakey's notion of early language development is, why did tool use evolve incredibly slowly for more than two million years, and then accelerate hugely at the beginning of the Upper Paleolithic? As it happens, I have an answer ready to hand, speculation of course but an interesting fit for the purpose (noting, by Leakey's account paleoanthropology has a healthy mix of speculation in it): the onset of the Upper Paleolithic might be the observable manifestation of the transition to Havelock's oral society from what, here in an earlier post, I called verbal society.
Our available example of verbal society — so I conjectured — is the Pirahã society recently discovered in the Amazon. An atypical example, necessarily under the conjecture, because examples typically would have disappeared dozens of millennia ago. To remind from the earlier post, here's a key excerpt from David J. Peterson's enumeration of anomalous properties of the Pirahã:
no temporal vocabulary or tense whatsoever, no number system, and a culture of people who have no oral history, no art, and no appreciation for storytelling.
If early genus Homo had that sort of culture, it would seem to explain rather well why things picked up spectacularly when they got out of that mode and into the more advanced oral culture of Havelock's Preface to Plato.
This in turn offers some insight into a question that arose in the relation of verbal culture to my still earlier blog post on memetic organisms. According to my theory, sciences are a major taxon of memetic organisms specifically adapted to literate society, but they cannot survive in oral society; and religions were a major taxon of memetic organisms in oral society, but cannot survive in verbal society. The first commenter on my verbal-society post asked what sort of memetic organisms would be dominant in verbal society. I suggested language itself as a memetic taxon, a suggestion I'm now more doubtful of but which, in any case, seems at best a rather incomplete answer. If the transition to oral culture is the onset of the Upper Paleolithic, though, we have at least a basis from which to speculate on the memetics of verbal society, because verbal society is then the memetic environment that gives rise to the archaeological record of the first two and a half million years or so of human technology. I've no deep thoughts atm on what specifically one might infer from this archaeological record about the memetic evolution behind it, except perhaps that memetic evolution in verbal society was very slow; but in principle, it offers a place for such memetic investigations to start.
A side-plot in my post on verbal society was the observation that the verbal-to-oral transition was a good fit for the story of Adam and Eve's eating of the fruit of the tree of knowledge, and being expelled for that from the Garden of Eden. I got criticized for that comparison. I'd had the comparison in mind, really, not primarily based on likelihood of it actually being the origin of the story, but because I'd noticed a connection to the backstory of how we know about the Pirahã. As recorded in Don't Sleep, There Are Snakes: Life and Language in the Amazonian Jungle (which I recommend), the Pirahã were studied by Christian missionaries; ultimately, the hope was to bring the Word of God to the Pirahã. The Pirahã culture wouldn't support a religion, which is an oral memetic organism; in fact, one of the missionaries (who wrote the book) ended up converting to atheism. When the comparison occurred to me between verbal-to-oral transition and eating of the fruit of the tree of knowledge, I was struck that this analogy would cast the Christian missionaries in the role of the Serpent in the Garden of Eden. The irony was irresistible, which is how the comparison ended up in the blog post.
In comments, the alternative suggestion was raised that the Fall corresponds to the development of agriculture (the onset of the Neolithic). A theory I'd heard before and that does seem to fit passably well. With several more years to think about it, though, I don't think these theories are mutually exclusive. The story could of course have nothing to do with either of these actual events; but if it has a basis in actual events, there's no reason it should be based on just one real event. In oral culture, elements are introduced into a tradition and then shift around and mutate, tending toward a stable form. So the story of the Fall, which eventually got written down, can contain echoes of multiple major ancient societal shifts that don't have to have happened all at the same time. Reconsidering the comparison now, I'm struck by the inclusion, on Leakey's list of innovations at the beginning of the Upper Paleolithic, of tools for making clothing, recalling that when Adam and Eve ate of the fruit of the tree of knowledge, one of the immediate effects was that they realized they were naked, and... made clothes. Huh.
Driverless cars
Proponents of aviation often cite statistics for how safe it is compared to driving. Statistics, though, even when accurate can often miss the point by presuming what questions ought to be asked. One such difficulty here is that the statistics comparing aviation to driving are per unit distance. I recall an SF novel (looking it up, A Talent for War) describing an interstellar hyperspace-jump technology in which ships sometimes just don't come back out of a jump — but it's the "safest form of travel per passenger-kilometer".
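To see how the choice of denominator can flip a comparison, here is a minimal arithmetic sketch in Python. The rates and trip lengths are invented purely for illustration and are not real accident statistics; the function name `risk_per_trip` is likewise hypothetical, and it treats rate times distance as the expected number of events per trip, which is only a reasonable stand-in for probability when the result is tiny.

```python
def risk_per_trip(events_per_billion_km, trip_km):
    """Convert a per-distance rate into expected events on a single trip."""
    return events_per_billion_km * trip_km / 1e9


# Hypothetical numbers, chosen only to illustrate the arithmetic.
LONG_HAUL_RATE, LONG_HAUL_KM = 0.1, 5000   # very safe per km, very long trips
SHORT_HOP_RATE, SHORT_HOP_KM = 3.0, 20     # riskier per km, very short trips

long_haul = risk_per_trip(LONG_HAUL_RATE, LONG_HAUL_KM)
short_hop = risk_per_trip(SHORT_HOP_RATE, SHORT_HOP_KM)

# Per kilometre the long-haul mode looks far safer...
per_km_advantage = SHORT_HOP_RATE / LONG_HAUL_RATE
# ...but per trip it comes out riskier, because each trip covers more distance.
per_trip_disadvantage = long_haul / short_hop

print(f"per-km advantage of long haul: {per_km_advantage:.0f}x")
print(f"per-trip disadvantage of long haul: {per_trip_disadvantage:.1f}x")
```

With these made-up inputs, the mode that is thirty times safer per kilometre is still several times riskier per trip. Neither figure is wrong; they answer different questions.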
There is a second way those statistics on aviation versus driving can miss the point. Some things aren't even about probabilities. What if someone would rather risk themself in a situation where they have a significant degree of control over events rather than risk themself in a situation where they've completely given up control to someone else? That's a decision based on a philosophical criterion, not a numerical one, and there is nothing inherently irrational nor ignorant about it.
The matter has been further complicated by the advent of fly-by-wire airplane technology. If one could rationally prefer to retain control rather than give it up to another human, how about giving up control to another human versus to a software system? It has long been my impression (and I gather I'm not alone) that the people least willing to trust computers are found amongst those who know most about them. Now, here's a subtle point: it seems to me that when someone is concerned with whether or not they have significant control over events, it's not the routine events they care about, but the exceptional ones. The unforeseen circumstances. And here there does seem to be a qualitative difference between a human pilot and a fly-by-wire system. The fly-by-wire system is programmed some time beforehand by programmers who try to anticipate what situations the system should be able to handle; rather by definition, it doesn't know how to handle unforeseen circumstances. To some extent, at least, the fly-by-wire system incorporates knowledge of past accidents, although there may also be some conscious trade-offs in which not every known possibility is built into the system. The human, on the other hand, draws on their experience to try to cope with the situation — which might sound much the same as the fly-by-wire system incorporating knowledge of past accidents, but I submit is qualitatively different. The fly-by-wire approach is algorithmic, while the human approach is improvisational.
The difference may be clearer in the case of driving, which readers of this blog would seem more likely to have first-hand experience of (the reader is more likely to have driven a car than piloted an airplane).
Driverless cars have been touted lately as a coming technology. Those with a vested interest (be it financial or emotional) in the success of the technology naturally tend to portray it as "safer" than cars driven by humans. Yet we're told driverless cars couldn't be put on the road in numbers unless one also banned human drivers. Why would that be? Presumably because the driverless cars would have trouble with unexpected human behavior. Which raises the question, if humans are so unpredictable, and driverless cars are (going to be) so much better drivers than humans are, why would human drivers be better than driverless cars at coping with unpredictable drivers? It's often seemed to me, when some other driver does something unusual and I compensate for it, or I do something unusual and other drivers compensate for it, that what's really impressive about the statistics on traffic accidents isn't how high the numbers are, but how low the numbers are. Spend some time driving in traffic, and it's a good bet you'll see a bunch of situations where a human's unexpected behavior didn't result in an accident because other human drivers successfully compensated for it, improvisationally.
Humans are really good at coping with free-form unexpected situations. Comparatively. We don't always handle a sudden problem correctly; but we also don't fall off the edge of our programming, either. A computer program that doesn't know what to do really doesn't know what to do, in a way that clearly distinguishes natural intelligence from conventional software. (I'm interested here in the nature of human intelligence; artificial intelligence is not to the current point, though the current discussion might offer some insights into it.)
It seems likely to me that the point of the hominid big brain is to enable individuals to cope with unexpected situations. This might even help to explain our impulse to be in control of our own fate when an emergency happens: we are, as a species, specialists in coping with the unexpected, and it's in the best interests of our selfish genes that we each rely on our own ability to cope, so that the combinations of genes that cope most effectively will be favored in the gene pool. In other words, dispassionately, if somebody is going to be pruned from the population because of a failure to cope with an emergency, our selfish genes are better off if they get pruned for their own failure rather than someone else's. (Yes, social behaviors vastly complicate this; there's definitely a place for good Samaritans, heroes, etc.; but atm I'm commenting on why individuals would desire to control their own fate, not why they'd commit acts of altruism.)
One might take the speculative riff further than that. One of the questions that comes up in archaeology is, why would we have started in East Africa? Leakey suggests the emergence of bipedalism had to do with varying habitats brought on by the advent of the East African Rift. It certainly seems plausible bipedalism would have been enabled and favored by some sort of habitat shift. The subsequent emergence of tools, with co-evolving big brain and language, might then be supposed to follow simply from the opportunity provided by bipedalism; but for my part, I'm inclined to think the tool/brain/language effect may have required more of an evolutionary nudge than that. If it were that easily catalyzed, one might think it would have happened before us, and left evidence that we would have found by now. (Granted, that's easy to poke holes in. Maybe we're just randomly the first. Maybe it only happens once per planet because it rapidly causes the planet to be destroyed. Or —a rather more fun hypothesis— perhaps it has happened and left evidence, and the evidence is staring us in the face but we're assuming something that prevents us from seeing it. As long as you don't bring ancient aliens into it; I've contempt for fake science, though I enjoy exotic serious speculations.) So perhaps the tool/brain/language co-evolution got going, once enabled by bipedalism, because something in the environment was particularly irregular and therefore favored individuals able to individually cope with unforeseen circumstances.
What would this hypothetical environmental factor be, whose irregularity favors individual intelligence to cope with it? Besides the East African Rift, Leakey mentions the social intelligence hypothesis, that intelligence evolved to predict the behaviors of others in a complex social milieu. (This could put an interesting spin on Asperger's syndrome, which one might conjecture would isolate brain power from its usual application to socialization, potentially making a surplus available for other purposes.)
If that's too cerebral for you, here's an alternative theory: Maybe language started as a rather pointless mating display, like so many other exaggerated animal features such as the peacock's tail. So that men trying to chat up women in bars would be what the species originally evolved for. (In all seriousness, the two ideas are not mutually exclusive; language skill might have had some sort of survival value from the outset and consequently it was beneficial for it to be treated as desirable in a mate.)
Thinking without language
The Sapir–Whorf hypothesis says that language influences thought. Benjamin Lee Whorf, writing in the early-to-mid twentieth century, called this the principle of linguistic relativity, alluding to Einstein's theory of relativity since linguistic relativity implies that how we perceive the world varies with what language we use. Modern understanding of the hypothesis distinguishes strong and weak versions: the strong version says that the structure of a language prevents its speakers from non-conforming patterns of thought, the weak version that the structure of a language discourages its speakers from non-conforming patterns of thought. Typically for ideas named after people, Sapir and Whorf never coauthored a paper on the idea, didn't present it as a hypothesis, and didn't make the modern distinction between strong and weak versions.
The strong Sapir–Whorf hypothesis isn't taken very seriously by linguists nowadays, but the weak form is generally accepted to some extent or other. Popular culture enshrines the language–thought connection with tropes "in <language X>, there are forty different words for <Y>" and "in <language X>, there is no word for <Y>". Either claim is ambiguous as to which direction it expects the thought/language influence to go. Out of context, I'd guess forty-different-words-for is typically meant to imply that <language X> speakers think about <Y> a lot, which depends on thought influencing language; while no-word-for is more ambiguous in direction, and seems to me at least as often meant to imply that the speakers cannot even conceive of <Y>. Saying they can't conceive of it could still be using the language as evidence for thought, but seems likely to have a stiff dose of strong Sapir–Whorf mixed in. The glaring flaw in this strong Sapir–Whorf reasoning is that if your language doesn't have a word for <Y>, and you have a need for such a word, you're likely to invent a word for it — borrowing from another language, compounding from existing vocabulary, or whatever sort of coinage <language X> favors.
Word coinage is an example of how thought can, rather than being limited by language, drive expansion of language to encompass new realms of thought. This however raises a subtler question about the relation between language and thought. The meaning <Y> appears to conceptually presage the new word — but how much wiggle room is there between the ability to think of <Y> and the ability to express it? Could you have chosen in the example, as a <language X> speaker, to hold off on inventing a word for <Y>, and think about <Y> for a while without having a name for it? More so, is it possible to think without language? Is there perhaps a threshold of sophisticated thought, beyond which we need language to proceed?
Seems to me there's plenty of evidence of nontrivial thinking without language. It's definitely possible to think in pictures; I sometimes do this, and I've heard of others doing it (and, see the Alan Kay epigraph at the top of this post). One might suggest that pictures, being a concrete representation, are a sort of "language". But to amplify, it's possible to think in abstracts represented by pictures without any accompanying verbal descriptions of the abstracts. I'm confident of the lack of accompanying verbal descriptions because, in general, the abstracts may lack short names, while long, often awkward descriptions would have been conspicuous if present. In such a case, the pictures represent relationships amongst the abstracts, not the abstracts themselves, and possibly not all the relationships amongst them — so even if the pictures qualify as language, they express far less than the whole thought. More broadly, it's a common experience to come up with a deep idea and then have trouble putting it into words — which evidently implies that the two acts are distinct (coming up with it, and putting it into words), and frankly this effect isn't adequately explained by saying the thinker didn't "really" have an idea until they'd put it into words. The idea might become more refined whilst being put into words, and sometimes one finds in the process that the idea doesn't "pan out"... but there has to have been something to be refined, or something to not pan out.
So I don't find it at all plausible that thought arises from language. That, however, does not interfere either with the proposition that language naturally arises from thought, nor that language facilitates thought, both of which I favor. (I recall, a few years ago in conversation with an AI researcher, being mistaken for an advocate of symbolic AI since I'd suggested some reasoning under discussion could be aided by symbols. There's a crucial difference between aiding thought and being its foundation; these ideas need to be kept carefully separate.)
The non-Cartesian theater
Daniel Dennett's 1991 book Consciousness Explained is centrally concerned with debunking a common misapprehension about the internal structure of the human mind. Dennett reckoned the misapprehension is a holdover from the seventeenth-century mind–body dualism of René Descartes. In Cartesian dualism as recounted by Dennett, the body is a machine which, based on input to the senses, constructs a representation of the world in the brain where the mind/soul apprehends it. Although this doesn't actually solve the problem of interfacing between a material body and nonmaterial soul, it does at least simplify it by reducing the interface to a single specialized organ where the mind interacts with the material world (Descartes figured the point of interface was the pineal gland). Modern theorists envision the mind as an emergent behavior of the brain, rather than positing mind–body dualism; but they still have an unfortunate tendency, Dennett observed, to envision the mind as having a particular place — a "Cartesian theater" — where a representation of the world is presented for apprehension by the consciousness. An essential difficulty with this model of mind is that, having given up dualism, we can no longer invoke a supernatural explanation of the audience who watches the Cartesian theater. If a mind is something of type M, and we say that the mind has within it a Cartesian theater, then the audience is... something of type M. So the definition of type M is infinitely recursive, and supposing the existence of a Cartesian theater has no explanatory value at all. To understand consciousness better we have to reduce it to something else, rather than reducing it to the same thing again.
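The regress argument above can be made concrete with a small sketch. This is only an illustration under stated assumptions: the dictionaries, the part names, and the function `bottoms_out` are all hypothetical, and any name with no entry in a dictionary is simply treated as something already explained elsewhere.

```python
def bottoms_out(concept, definitions, in_progress=frozenset()):
    """Return True if expanding `concept` eventually reaches only primitives.

    definitions: map from a concept to the parts it is explained in terms of.
    A concept with no entry is treated as primitive (an assumption of this sketch).
    """
    if concept in in_progress:
        return False  # the concept is being explained in terms of itself
    parts = definitions.get(concept)
    if parts is None:
        return True  # primitive: explained by something outside this model
    still_expanding = in_progress | {concept}
    return all(bottoms_out(part, definitions, still_expanding) for part in parts)


# Cartesian theater: the audience watching the theater is itself a mind.
cartesian = {"mind": ["theater", "audience"], "audience": ["mind"]}

# A reductive alternative (hypothetical): the audience is made of simpler things.
reductive = {"mind": ["theater", "audience"], "audience": ["simple agent"]}

print(bottoms_out("mind", cartesian))  # False
print(bottoms_out("mind", reductive))  # True
```

The first model never finishes explaining "mind" because the expansion leads back to "mind"; the second terminates only because the audience is reduced to something of a different type.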
As an alternative to any model of mind based on the Cartesian theater, Dennett proposes a "multiple drafts" model of consciousness, in which representations of events are assembled in no one unique place (Cartesian theater) and, once assembled, may be used just as any other thoughts, revised, reconciled with each other, etc. Which is all very well but seems to me to be mostly a statement of lack of certain kinds of constraints on how consciousness works, with little to say about how consciousness actually does work.
It occurred to me in reading Dennett's book, though, that in setting about debunking a common misapprehension he was also making a mistake I'd seen before — when reading Richard Dawkins's 1976 book The Selfish Gene. Dawkins's book is another that sets about debunking a common misapprehension, in that case group selection, the idea that natural selection can function by selecting the most fit population of organisms. Dawkins argued compellingly that natural selection is intrinsically the differential survival of genes: whatever genes compete most successfully come to dominate, and only incidentally do these genes assemble and manipulate larger structures as means to achieve that survival, such as individual organisms, or populations of organisms. A hardy individual, or a hardy population, may help to make genes successful, but is not what is being selected; natural selection selects genes, and all else should be analyzed in terms of what it contributes to that end. Along the way, Dawkins illustrates the point with examples where people have fallen into the trap of reasoning in terms of group selection — and this is where, it seemed to me, Dawkins himself fell into a trap. The difference between group selection, where a population of organisms is itself selected for its success, and gene selection, where a population of organisms may promote the success of genes in its gene pool, is a subtle one. I got the impression (though I highly recommend Dawkins' book) that Dawkins was somewhat overeager to suppose thinking about successful populations must be based on the group-selection hypothesis, and thereby he might be led to discount some useful research.
Likewise, Dennett seemed to me overenthusiastic about ascribing a Cartesian theater to any model of consciousness that involves some sort of clearinghouse. That is, in his drive to debunk the Cartesian theater, the possibility of a non-Cartesian theater may have become collateral damage. This struck me as particularly unfortunate because I was already coming to suspect that a non-Cartesian theater might be a useful model of consciousness.
There is a widespread perception (my own first encounter with it was in a class on HCI in the 1980s) that the human mind has a short-term memory whose size is about seven plus-or-minus-two chunks of information. Skipping much historical baggage associated with this idea (it can be a fatal mistake, when looking for new ideas, to take on a set of canonical questions and theories already known to lead to the very conceptual impasse one is trying to avoid), the notion of a short-term memory of 7±2 chunks has interesting consequences when set against the "Cartesian theater". Sure, we can't reduce consciousness by positing a Cartesian theater, but this short-term memory looks like some kind of theater. If short-term memory stores information in chunks, and our brain architecture has an evident discrete aspect to it, and experience suggests thoughts are discrete, perhaps we can usefully envision the audience of this non-Cartesian theater as a collection of agents, each promoting some thought. An agent whose thought relates associatively to something on-stage (one of the 7±2 chunks) gets amplified, and the agents most successfully amplified get to take their turn on-stage, where they can bring their thought to the particular attention of other members of the audience. There are all sorts of opportunities to fine-tune this model, but as a general heuristic it seems to me to answer quite well for a variety of experienced phenomena of consciousness, fundamentally based on a non-Cartesian theater.
The dress of thought
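The audience-and-stage heuristic of the non-Cartesian theater can be sketched as a toy simulation. Everything here is an assumption made for illustration: the stage size is fixed at 7, amplification is a simple count of on-stage chunks an agent's thought relates to, one agent is promoted per cycle, and the oldest chunk is dropped when the stage is full. None of these details comes from the model as described, which leaves them open.

```python
from collections import deque

STAGE_SIZE = 7  # assumption: the "7 plus-or-minus 2" figure, fixed at 7


def step(stage, associations):
    """Run one cycle of the theater; return the promoted thought, or None.

    stage: deque of thoughts currently on-stage (oldest first).
    associations: hypothetical map from each audience agent's thought to the
        set of thoughts it relates to associatively.
    """
    scores = {}
    for thought, related in associations.items():
        if thought in stage:
            continue  # already on-stage, so not competing from the audience
        # Amplification: one point per on-stage chunk this thought relates to.
        scores[thought] = sum(1 for chunk in stage if chunk in related)
    if not scores or max(scores.values()) == 0:
        return None  # nothing in the audience resonates with the stage
    # Most amplified agent wins; ties are broken alphabetically for determinism.
    winner = max(sorted(scores), key=scores.get)
    stage.append(winner)  # a full deque silently drops its oldest chunk
    return winner


associations = {
    "rain": {"cloud"},
    "umbrella": {"rain", "cloud"},
    "picnic": {"sun"},
}
stage = deque(["cloud"], maxlen=STAGE_SIZE)
history = [step(stage, associations) for _ in range(3)]
print(history, list(stage))
```

Starting from a single on-stage chunk, related thoughts pull each other onto the stage in turn, while an unrelated thought stays in the audience. A less crude version might let amplification decay over time or let several agents share a chunk, but that is beyond this sketch.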
If "chunking" information is a key strategy of human thought, this might naturally facilitate representing thought using sequences of symbols with a tree-like structure. Conversely, expressing thought in tree-like linguistic form would naturally afford facile thinking as well as facile expression. Thus, as suggested above, language would naturally arise from thought and would facilitate thought. As an evolutionary phenomenon, development of thought and development of language would thus favor each other, tending to co-evolve.
Treating language as a naturally arising, and facilitating, but incomplete manifestation of thought seems to me quite important for clear thinking about memetic evolution, notably for clear thinking about memetic evolution across my conjectured phase boundaries of culture, from verbal to oral and from oral to literate. The incompletion tells us we should not try to understand memes, nor culture, nor even language evolution, as a purely linguistic phenomenon. It does raise interesting questions about how culture and thought — the stuff of memes — are communicated from person to person (memetically, from host to host). Does the Pirahã language instill the Pirahã culture? Experience suggests that communicating with people in their physical presence is qualitatively more effective than doing so by progressively lower-bandwidth means, with text-only communications — which are extremely widespread now thanks to the internet — way down near the bottom. Emoticons developed quite early in the on-line revolution. Does the internet act as a sort of cultural filter, favoring transmission of some kinds of culture while curtailing others? I don't mean to answer these questions atm; but I do suggest that exploring them, including realizing they should be asked, is facilitated by understanding the thought–language relationship.
Somewhere along the evolutionary curve, co-evolving thought and language become an effective platform for the evolution of memetic lifeforms. From there onward, we're pushed to reconsider how we think about the evolutionary process. I agree with Dawkins that group selection is a mistake in thinking: that groups of organisms, and individual organisms, are essentially phenotypes of genes, and our understanding of their evolution should be grounded in survival of the fittest genes. The relationship between genes and memes, though, is something else again. At best, our genes and our memes enjoy a symbiotic relationship. Increasingly with rising levels of memetic evolution, our genes might be usefully understood as vehicles for our memes, much as Dawkins recommended understanding organisms as vehicles for our genes. This is in keeping with a general trend I've noticed, that thinking about memetic evolution forces us to admit that concepts in evolution have fuzzy boundaries. From the more familiar examples in genetic evolution we may habitually expect hard distinctions between replicator and phenotype, between organism and population; we want to assume an organism has a unique, constant genome throughout its life; we even want to assume, though even without memetics we've been unable to entirely maintain it, that organisms are neatly sorted into discrete species. All these simplifying assumptions are a bit blurry around the edges even in genetic biology, and memetics forces us to pay attention to the blur.
As a parting shot, I note that chunking is an important theme in two other major areas of my blogging interest. In one of several blog posts currently in development relating to physics (pursuant to some thoughts already posted, cf. here), I suggest that while quantum mechanics is sometimes portrayed as saying that in the strange world of the very small, seemingly continuous values turn out to be discrete, perhaps we should be thinking of quantum mechanics the other way around — that is, our notion of discrete entities works in our everyday world but, when we try to push it down into the realm of the very small it eventually cannot be pushed any further, and the pent-up continuous aspect of the world, having been smashed as flat as it can be, is smeared over the whole of creation in the form of quantum wave functions with no finite rate of propagation. In another developing post, I revisit my previously blogged interest in Gödel's Theorem, which arose historically from mathematicians' struggles to use discrete — which is to say, symbolic — reasoning to prove theorems about classical analysis. I don't go in for mysticism: I'm not inclined to suppose we're somehow fundamentally "limited" in our understanding by the central role of information chunking in our thought processes (cf. remarks in a previous blog post, here); but it does seem that information chunking has a complexly interesting interplay with the dynamics of the systems we think about.