Monadic programming is a device by which pure functional languages are able to handle impure forms of computation (side-effects). Depending on who you ask, it's either very natural or preposterously esoteric.
IT was six men of Indostan
To learning much inclined,
Who went to see the Elephant
(Though all of them were blind),
That each by observation
Might satisfy his mind.
— The Blind Men and the Elephant, John Godfrey Saxe
Note: So far, I've found three different ways of applying this epigraph to the following blog post.
I submit there is a systemic problem with monadic programming in its current form. Superficially, the problem appears technical. Just beneath that surface, it appears to be about breaking abstraction barriers. Below that, the abstraction barriers seem to have already been broken by the pure functional programming paradigm. I'll suggest, broadly, an alternative approach to functional programming that might resurrect the otherwise disabled abstraction barriers.
What monads do for pure FP
A pure function has no side-effects. Pure functional programming, that is, programming entirely with pure functions, has some advantages for correctness proofs, which is (arguably) all very well as long as the purpose of the program is to produce a final answer. Producing a final answer is just what a pure function does. Sometimes, though, impurities are part of the purpose of the program. I/O is the canonical example: if the point is to interact with the external world, a program that cannot interact with the external world certainly misses the point.
If the point is to interact with the external world, and you still want to use a pure function to do it, you can write a pure function that (in essence) takes the state of the external world as a parameter, and returns its result together with a consequent state of the external world. Other forms of impurity can be handled similarly, with a pure function that explicitly provides for the particular variety of impurity; monads are "simply" a general pattern for a broad range of such explicit provisions.
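As a minimal sketch of that world-passing idea, here it is in Haskell. Everything named here is a hypothetical stand-in for illustration: `World` is modeled as nothing more than the list of lines printed so far, and `WorldFn`, `say`, and `greet` are made-up names, not part of any library.

```haskell
-- Toy stand-in for "the state of the external world": the lines printed so far.
-- (An assumption for illustration; a real world state cannot be held as a value.)
type World = [String]

-- A pure "output" function: takes a world, returns its result plus the next world.
putLine :: String -> World -> ((), World)
putLine msg world = ((), world ++ [msg])

-- The general pattern: a computation is a function from world to (result, world).
newtype WorldFn a = WorldFn { runWorldFn :: World -> (a, World) }

instance Functor WorldFn where
  fmap f (WorldFn g) = WorldFn $ \w -> let (a, w') = g w in (f a, w')

instance Applicative WorldFn where
  pure a = WorldFn $ \w -> (a, w)
  WorldFn ff <*> WorldFn fa = WorldFn $ \w ->
    let (f, w')  = ff w
        (a, w'') = fa w'
    in (f a, w'')

-- Sequencing threads the world from one step to the next.
instance Monad WorldFn where
  WorldFn g >>= k = WorldFn $ \w ->
    let (a, w') = g w
    in runWorldFn (k a) w'

say :: String -> WorldFn ()
say msg = WorldFn (putLine msg)

-- Note that the monad appears in greet's type signature.
greet :: String -> WorldFn Int
greet name = do
  say ("hello, " ++ name)
  say "goodbye"
  return (length name)

main :: IO ()
main = print (runWorldFn (greet "world") [])
-- prints (5,["hello, world","goodbye"])
```

The program is parameterized by the initial world (here the empty list), while the choice of `WorldFn` is fixed in the type of `greet`.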
Note, however, that while the monadic program is parameterized by the initial state of the external world, the monad itself is hardcoded into the type signature of the pure function.
What goes wrong
The essential difficulty with this approach is that since the monad is hardcoded into the function's type signature, it also gets hardcoded into clients who wish to call that function.
To illustrate the resulting brittleness, suppose some relatively small function f is used by a large program p, involving many functions, with calls to f at the bottom of many layers of nesting of calls. Suppose all the functions in p are pure, but later, we decide each call to f should incidentally output some diagnostic message, which makes no difference to the operation of the program but is meant to be observed by a human operator. That's I/O, and the type signature of f hadn't provided for I/O; so we have to change its type signature by wiring in a suitable I/O monad. But then, each function that directly calls f has to be changed to recognize the new signature of f, and since the calling function now involves I/O, its type signature too has to change. And type signatures change all the way up the hierarchy of nested function calls, until the main function of p gets a different type signature.
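The scenario can be sketched in Haskell with a deliberately tiny three-layer program. The functions `f`, `mid`, and `top` are hypothetical stand-ins for f, an intermediate caller, and the main function of p; the primed versions show what each has to become once f emits a diagnostic, assuming the built-in `IO` monad is the one wired in.

```haskell
-- Before: every layer is pure.
f :: Int -> Int
f x = x * 2

mid :: Int -> Int
mid x = f x + 1

top :: [Int] -> Int
top xs = sum (map mid xs)

-- After: f prints a diagnostic, so its type changes to mention IO...
f' :: Int -> IO Int
f' x = do
  putStrLn ("f called with " ++ show x)
  return (x * 2)

-- ...and so does its direct caller, which does no I/O of its own...
mid' :: Int -> IO Int
mid' x = do
  y <- f' x
  return (y + 1)

-- ...and so does the top of the call hierarchy.
top' :: [Int] -> IO Int
top' xs = do
  ys <- mapM mid' xs
  return (sum ys)

main :: IO ()
main = do
  print (top [1, 2, 3])   -- pure version
  r <- top' [1, 2, 3]     -- same answer, plus one diagnostic line per call to f'
  print r
```

Neither `mid'` nor `top'` performs any I/O itself; the bodies and signatures change only because something beneath them does.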
Every direct or indirect client of f has been forced to provide for the stateful I/O behavior of f. One could ask, though, why this stateful behavior of f should make any difference at all to those clients. They don't do any I/O, and if not for this type-signature business they wouldn't care that f does; so why should f's I/O be any of their business? For them to bother with it at all seems a violation of an abstraction barrier of f.
Actually, this very real abstraction violation was not caused by the introduction of monads into pure functional programming — it was highlighted by that introduction. The violation had already occurred with the imposition of pure functional programming, which denies each component function the right to practice impurities behind an abstraction barrier while merely presenting a pure appearance to its clients.
The introduction of monads also created the distracting illusion that the clients were the ones responsible for violating the abstraction barrier. On the contrary, the clients are merely where the symptoms of the violation appear. The question should not be why the client function cares whether f is internally impure (it doesn't care; its involvement was forced), but rather, who is it who does care, and why?
Monads come from a relatively modern branch of mathematics (it's a mere half-century old) called category theory.
A category is a well-behaved family of morphisms between objects of some uniform kind. The category provides one operation on morphisms: composition, which is defined only when one morphism ends on the same object where the next morphism starts. (Technically, a category is its composition operation, in that two different categories may have the same objects and the same morphisms, and still be different if their composition operation is different.)
The canonical example is category Set, which is the family of mathematical functions between sets — with the usual notion of function composition. That's all possible functions, between all possible sets. This is typical of the scale at which category theory is brought to bear on computation theory: a category represents the universe of all possible computations of interest. The categories involved are then things like all computable pure functions, or all computable functions with a certain kind of side-effect — it should be clearly understood that these categories are big. Staggeringly, cosmologically, big.
Besides well-behaved families of morphisms between objects of a uniform kind, there are also well-behaved families of morphisms from objects of one uniform kind to objects of another uniform kind. These families of heterogeneous morphisms are called adjunctions. An adjunction includes, within its structure, a category of homogeneous morphisms within each of the two kinds of objects: the domain category (from which the heterogeneous morphisms of the adjunction go) and the codomain category (to which they go). The adjunction also projects objects and morphisms of each category onto the other, projects each of its own heterogeneous morphisms as a homogeneous morphism in each of the categories, and requires various relations to hold in each category between the various projections.
The whole massive adjunction structure can be viewed as a morphism from the domain category to the codomain category — and adjunctions viewed this way are, in fact, composable in a (what else?) very well-behaved way, so that one has a category Adj whose objects are categories and whose morphisms are adjunctions. If the categories we're interested in are whole universes of computation, and the adjunctions are massive structures relating pairs of universes, the adjunctive category Adj is mind-numbingly vast. (In its rigorous mathematical treatment, Adj is a large category, which means it's too big to be contained in large categories, which can themselves only contain "small" categories — an arrangement that prevents large categories from containing themselves and thereby avoids Russell's paradox.)
A monad is the result of projecting all the parts of an adjunction onto its domain category; in effect, it is the "shadow" that the adjunction casts in the domain. This allows the entire relation between the two categories to be viewed within the universe of the domain; and in the categorical view of computation, it allows various forms of impure computation to be viewed within the universe of pure computation. This was (to my knowledge) the earliest use of monads in relation to computation: a device for viewing impure computation within the world of pure computation.
A significant limitation in this manner of viewing impure computations is that, although adjunctions are composable, monads in general are not. Here the "shadow" metaphor works tolerably well: two unconnected things may appear from their shadows to be connected. Adjunctions are only composable if the domain of one is the codomain of the other, which is almost certainly not true here, because all our monads have the same domain category (pure computation), while the shadows cast in pure computation all appear to coincide since the distinct codomains have all been collapsed into the domain.
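For a concrete instance of a monad arising from an adjunction, the standard textbook example is the currying adjunction between "pair with a state" and "function from a state", whose shadow is the state monad. The Haskell below is only a sketch at the level of types, not a categorical proof, and the names `unit`, `counit`, `join'`, `bind`, `tick`, and `twoTicks` are chosen here for illustration.

```haskell
-- Left adjoint  F a = (a, s)   : pair a value with a state.
-- Right adjoint G b = s -> b   : read a value out of a state.
-- Their composite G (F a) = s -> (a, s) is the state monad.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

-- Unit of the adjunction, a -> G (F a); it serves as the monad's "return".
unit :: a -> State s a
unit a = State (\s -> (a, s))

-- Counit of the adjunction, F (G b) -> b: apply the function to the state.
counit :: (s -> b, s) -> b
counit (g, s) = g s

-- The monad's "join" is the counit applied underneath G.
join' :: State s (State s a) -> State s a
join' (State m) = State $ \s ->
  let (State inner, s') = m s
  in counit (inner, s')

-- Bind recovered from fmap and join'.
bind :: State s a -> (a -> State s b) -> State s b
bind m k = join' (fmap k m)

-- Return the current counter and increment it.
tick :: State Int Int
tick = State (\n -> (n, n + 1))

twoTicks :: State Int (Int, Int)
twoTicks = tick `bind` \a -> tick `bind` \b -> unit (a, b)

main :: IO ()
main = print (runState twoTicks 10)
-- prints ((10,11),12)
```

Once the composite is packaged as `State s`, nothing in its type records the second category it came from, which matches the loss of information the shadow metaphor describes.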
Who is viewing these various forms of computation, through the shadows they cast on the world of pure computation? Evidently, the programmer — in their role as Creator. A God's eye view. Viewing the totality of p through the universe of pure computation is the point of the exercise; the need for all clients of f to accommodate themselves to f's internal use of I/O is an artifact of the programmer's choice of view.
Rethinking the paradigm
So, here are the points we have to work with.
- A monad represents the physical laws of a computational universe within which functions exist.
- The monad itself exists within the pure computational universe, rather than within the universe whose laws it represents. This is why monads are generally uncomposable: they have forgotten which actual universes they represent, and composition of adjunctions wants that forgotten knowledge.
- Function signatures reflect these computational laws, but serve two different purposes. From the client's eye view, a function signature is an interface abstraction; while from the God's eye view (in the pure computational universe), a function signature is the laws under which the function and everything it uses must operate.
- by synthesis: when function f calls function g, g returns its computational laws along with its result value, and f works out how to knit them all into a coherent behavior and returns its own knit-together computational laws along with its own result value; or
- by inheritance: when function f calls function g, f passes in its computational laws along with its parameters to g, and g works out both how to knit them into its own computational laws internally, and how to present itself to f in terms f can understand.