An introduction to combinatory calculus, by Brent Kerby
We'll now introduce the notion of a "function", a notion which is quite vague. A function can be thought of as an imaginary table (possibly infinite) with two columns, the things on the left being called parameters and the things on the right being called results. A function maps parameters to results, like a dictionary maps words to their definitions.
One example of a function is arithmetic negation, which given a number as a parameter, yields another number (the parameter's opposite) as a result. Negation would, for instance, given "3", yield "-3".
Here are some other examples of functions:
It is often convenient to assign abbreviated names to certain functions. For instance, the function mentioned in the second example we'll call simply "sqr" (an abbreviation for "squaring", appropriate because that function yields the square of its parameter). Then, the square of a particular number "x" we'll write as "sqr(x)" (the "application" of the function "sqr" to parameter "x"). For instance, the number "9" could be written as "sqr(3)".
Now, it should not be thought that we use "sqr(3)" to denote the "action" of finding the square of "3". We use "sqr(3)" simply to denote the square of "3" itself (in other words, "9").
Also, we distinguish between "sqr" and "sqr(x)". The former is the function of squaring; the latter is the square of a particular number "x".
At this point, we offer a few simple problems to ensure that the reader understands our conventions:
All of the functions mentioned so far took a single parameter to yield a result. However, there are many functions that require several parameters to yield a result. For instance, the functions of addition, subtraction, multiplication, and division ("+", "-", "*", "/") each require two parameters.
We'll call "add" the function of addition. Then, we'll use "add(3,5)" to mean the sum of "3" and "5", or in other words, "8".
In general, if "f" is a function that takes two parameters then we write the application of "f" to two parameters "x" and "y" as "f(x,y)".
Whereas functions that take only one parameter are called unary, functions that take two parameters are called binary. Functions that take three parameters are called tertiary. In general, the number of parameters that a function takes is called the function's order.
Up to this point, we have mostly only dealt with functions that take numbers as parameters and yield numbers as results. However, there are many other interesting things that functions could take or yield (for instance: lists of numbers). In particular, a function could take or yield another function. A function that takes or yields a function is called a higher-order function.
One example of a higher-order function is the "composition" function. This function takes two parameters (each parameter being a unary function) and yields a unary function that is a "composite" of the two functions.
By a "composite" of two unary functions "f" and "g", we mean a function that, given a parameter "x", yields "f(g(x))". In other words, the composition of "f" and "g" applies "g" to its parameter and then "f".
As an example use of the composition function (which we'll call "comp"), take "comp(sqr,succ)", the composition of squaring with succession. This is a function that takes a parameter and yields the square of its successor; given a parameter "x", it would yield "sqr(succ(x))" or in other words "(x+1)*(x+1)". Note that "comp(sqr,succ)" is very different from "comp(succ,sqr)".
In general, for composition, we can say that "(comp(f,g))(x)" is the same as "f(g(x))".
There is a way in which unary higher-order functions can be used to emulate multiple-parametered functions. This technique is very interesting and is called "currying". We'll illustrate it with an example:
Consider the function that adds 1 to its parameter (the previously mentioned "succ" function). Now consider the function that adds 2 to its parameter, and the function that adds 3 to its parameter. Clearly, there is many similar functions like these, each adding a certain amount to their parameter. We could capture them all if we had a higher-order function (which we'll call "plus") that given a number, yields a function that adds that number to its parameter. For example, "plus(1)" would be the successor function, while "plus(2)" would be the function that adds 2 to its parameter.
This "plus" function is very similar to the original multiple-parametered function "add". Whereas the "add" function took both parameters at once, the "plus" function takes just one, yielding a function that takes the second parameter, yielding the result. We call "plus" the "curried" version of "add". In general, we can say about "plus", that "(plus(x))(y)" is the same as "x+y", or in other words, "add(x,y)".
This technique of currying can be used with all multiple-parametered functions, including those that take 3 or more parameters. From now on, we will generally not use multiple-parametered functions, in favor of curried functions.
Now that we no longer have to worry about multiple-parametered functions, we can adopt a new notation for function application that will be less cluttery (requiring fewer parentheses).
We illustrate our new notation with a few examples:
Old Notation | New Notation |
---|---|
succ(1) | succ 1 |
(plus(x))(y) | plus x y |
(add(succ(1)))(succ(2)) | add (succ 1) (succ 2) |
Basically, we write the application of "f" to "x" as just "f x", unless "x" is a large expression (itself containing applications), in which case the "x" will be surrounded by parentheses.
Note that even if the left side of the application (the "f") is a large expression, it is not necessary to surround it by parentheses; that is, "plus x y" is understood to mean "(plus x) y". Also, if there was a function "f" that took three parameters through currying, then "f x y z " would be understood to mean "((f x) y) z".
This notation is very convenient. In fact, the convenience of this kind of notation is probably the most important reason to use curried functions instead of multiple-parametered ones (just kidding).
The notion of a "function" has, for its simplicity, been around a
surprisingly short time. It originated, I think, sometime in the 1800s
with mathematicians like G. W. Leibniz and Leonhard Euler. The idea of
applying functions to things other than numbers (in particular, to
other functions) began, it seems, with Gottlob Frege in the late 1800s
and early 1900s.
Frege, it seems, was the first to think of emulating multiple-parametered
functions using higher-order functions, but the technique ("currying")
has come to be named after Haskell Curry, an influential mathematician
who popularized the technique.
In the last section we defined some basic terminology relating to
functions and explained the technique of currying, a way of emulating
multiple-parameter functions using higher-order unary ones.
In this section, we'll describe some very special higher-order
functions called "combinators".
The first combinator we'll describe is the one which has come to
be known simply as "B". In the last section we mentioned a
composition function "comp", which was presented as a binary function
that takes two parameters (each unary functions) and yields their
composite. The "B" combinator is related to the composition function
and, in fact, is basically just a curried version of it.
In the last section, we mentioned that the composition function
had the property (for all "f", "g", and "x")
The "B" combinator has a similar property:
Using extra parentheses and spaces, the statement could be written as:
This property (sometimes called "B"'s "rewrite rule") can be thought of
as the definition of what we mean by "B".
As described in the last section using "comp", the "B" combinator has
many uses. Given a function "plus 2" that adds two, and a function
"times 3" that multiplies by three, we can form a function
"B (plus 2) (times 3)" that multiplies its parameter by three and then
adds two.
In general, "B" is used to sort of "chain" functions together...
Remember that the order of the parameters does matter with "B";
i.e., "Bfg" is not always the same as "Bgf".
Although "B" is often applied to two things together, yielding their
composite, it is also meaningful when applied to just one parameter.
For instance, "B sqr" is a function that is like squaring, except that
instead of taking just one parameter (a number), it takes two parameters
(a function and a number), the function being applied to the number
before being passed to the original squaring function. One might think
of "B sqr" as a version of "sqr" in which the parameter has
been split into two. This function might be called the
"composition of squaring".
We've only used "comp" on functions that are unary; however "Bfg"
may be meaningful even if "f" takes more than one parameter (through
currying). For instance, "B plus sqr" (the composition of addition
with squaring) is meaningful: Even though "plus" can be thought of
as a binary function (taking two numbers through currying and adding
them together), it can also be thought of as a (higher-order) unary
one that takes a number and yields a function that adds that number.
So, if "plus" takes a number and yields a function that adds that number,
then "B plus sqr" takes a number and yields a function that adds
the square of that number. That is:
So, we see that the composition of addition and squaring is an
adding function, except that its first parameter is squared.
In general, when composing a multiple-parametered function "f" with
function "g", only the first parameter of "f" is affected (in speaking
of a "multiple-parametered" function we mean a function that takes
multiple parameters through currying; this is how we will usually
use "multiple-parametered" from now on).
Now we introduce another very interesting higher-order function,
the "C" combinator. Basically, this function swaps the order of
a function's parameters. That is, if "C" is applied to a binary
function "f", then it yields a modification of "f" that is the
like "f", except the order of its two parameters are reversed.
For instance, if "minus" is a subtraction function such that
"minus 3 5" is the same as "2", then "C minus" will be like the
minus function, except that it takes the two parameters in reverse
order; that is, "(C minus) 3 5" is the same as "minus 5 3", and "-2".
In general, we can say for "C" that
or without the extra parentheses:
Another way to think of "C" is that it can be used to postpone
a function's first parameter. Using "C", it is possible to
"sneak in", so to speak, a function's second parameter
without giving the first. Suppose, for instance, that we have "minus"
and want to form a unary function that subtracts a number
from "3"; now, the number to subtract from comes second
in "minus". However, we can sneak in the second parameter using "C";
"C minus 3" is a function that subtracts a number from "3".
(later, we will show how variations of the "C" combinator can
be used to sneak in a function's third, fourth, or other parameter).
We now turn our attention to another interesting combinator,
the W combinator. Basically, this combinator duplicates a
function's parameter. That is, if "W" is applied to a binary
function "f", the result is a unary function that is like "f"
except that its one parameter plays the role of both parameters.
For instance, "W plus" is the doubling function. Whereas "plus"
is a binary function adding the new parameters together,
"W plus" (doubling) is a unary function that adds a parameter
with itself. That is, "(W plus) 5" is the same as "plus 5 5".
In general, we can say about "W" that
Whereas the W combinator is used to construct functions that use their
parameter more than once (so to speak), the K combinator is
used to construct functions that ignore their parameter. In particular,
applying "K" to an "x" yields a function that always results in "x".
For instance, "K 3" is a function that always results in three,
regardless of its parameter. That is, "(K 3) 0" is "3". Also,
"(K 3) W", "(K 3) B", and "(K 3) C" are also "3". The second
parameter to K is always ignored.
In general, we can say about "K" that
We've now mentioned several combinators (B, C, W, and K) and shown
how, given some primitives like "plus" and "times", they can be
used to form other interesting functions (like the doubling function,
the squaring function, the function that doubles and then squares, etc.)
One interesting thing about combinators is that, even without
other primitives like "plus" and "times", very interesting things
can be formed simply by applying combinators to one another.
For instance, consider "WK". "K" can be thought of as a binary function
that always yields the first parameter; so "WK" is then a unary function
whose parameter plays the role of both. The result is that "WK"
always yields its parameter unchanged and is another name for
the "I combinator", for which we can say:
To see this more clearly, note that "WKx" is the same as "Kxx"
(by W's rewrite rule) and thus "x" (by K's rewrite rule). So,
"(WK)x" is the same as "x", and "WK" is a function that results
exactly in its parameter, with no change.
We've just shown how the I combinator can be derived from W and K.
Now, we'll give a much more elaborate example. We'll show how
a combinator called "S" can be derived from B, C, and W.
To begin, by "S", we mean the "S" that has this rewrite rule:
This "S" is actually quite an interesting combinator. It could be
thought of as a generalized version of "W". Remember that "W"
duplicates a parameter of a binary function "f". Similarly, "S"
duplicates a parameter of a binary function "f" but allows the duplicated
parameter to be modified by a function "g" before being passed as
the second parameter to "f".
For example, "W plus" is a doubling function; i.e., it is a function
that adds a parameter with itself. Then, "S plus sqr" is a function that
adds a parameter, not with itself, but with its square. For instance,
"(S plus sqr) 4" is "plus 4 (sqr 4)" or in other words "4 + 42"
or "20".
Now, we'll show how "S" may be constructed from B, C, and W. In
fact, "S" is quite simply
(Read: "the composition of composed duplication with composition
itself and the swapping function". Just kidding.)
At first, this may appear to be jibberish, but a closer look will
show that it is indeed the "S" combinator.
In particular, when this combinator is applied to a parameter "f",
one gets
If we apply this to another parameter "g", we get
Finally, if we apply this to a third parameter "x", we get
So, we can see that "B(BW)(BBC)" is a combinator that, given
three parameters "f", "g", and "x", yields "fx(gx)", and is the
same as the "S" combinator.
In this section, we've described some specific higher-order functions
called "combinators" but have not yet told what exactly we mean
by saying that they are "combinators". We'll now fill in this
gap.
Before giving a precise definition of "combinator", we first introduce
another term, a "combination". We define this term using an example;
here are some combinations of "x", "y", "z":
Basically, to form a combination of a set of things, just pick some
out (possibly picking some more than once), scramble them up a bit,
and insert some parentheses. (This definition of "combination" is
somewhat vague, but it will do for now).
Now, with that in mind, we finally define what we mean by "combinator".
A combinator is a function that, given some parameters, yields
a combination of those parameters.
We can see that B, C, W, K, I, and S are all functions that take some
parameters and always yield a combination of them.
We've described several interesting combinators, including
B, C, W, K, I, and S. Some of these combinators can be constructed
from one another (as "I" can from "W" and "K"), and many interesting
other combinators can be constructed from all these.
Actually, the set of combinators is infinite, and there are lots
of interesting ones. One thing that may be a bit surprising is that
all combinators can be constructed from just B, C, W, and K.
It may also be surprising that all combinators can be constructed
from just S and K. This topic will be discussed more in the next
section, on Completeness.
Note that this function is the same as the function in the
last example. In general, "Bf(Bgh)" is the same as "B(Bfg)h",
in that both, given a parameter "x", result in "f(g(hx))". This
property of "B" is called its associativity.
Answers to Problem Set 1
Answers to Problem Set 2
Answers to Problem Set 3
Answers to Problem Set 4
Answers to Problem Set 5
Historical Note
B combinator
(comp(f,g))(x) = f(g(x))
Bfgx = f(gx)
((B f) g) x = f (g x)
plus x y = x + y
(B plus sqr) x y = x2 + y
Problem Set 1 (Answers)
C combinator
(Cf)xy = fyx
Cfxy = fyx
Problem Set 2 (Answers)
W combinator
(Wf)x = fxx
Problem Set 3 (Answers)
K combinator
Kxy = x
Problem Set 4 (Answers)
I combinator
Ix = x
Problem Set 5 (Answers)
S combinator
Sfgx = fx(gx)
B(BW)(BBC)
B(BW)(BBC)f
BW(BBCf) by "B"'s rewrite rule
BW(B(Cf)) by "B"'s rewrite rule (again)
BW(B(Cf))g
W(B(Cf)g) by "B"'s rewrite rule
W(B(Cf)g)x
B(Cf)gxx by "W"'s rewrite rule
Cf(gx)x by "B"'s rewrite rule
fx(gx) by "C"'s rewrite rule
Problem Set 6 (Answers)
Definition of "combinator"
Conclusion
Answers to Problem Set 1
fx = x2*3 + 1
fx = x2*3 + 1
Answers to Problem Set 2
Answers to Problem Set 3
Answers to Problem Set 4
I | suggested by "identity" |
B | suggested by "substitution"; he was reserving "S" for other uses (possibly for "sum" or "successor"), although he ended up adopting Schonfinkel's use of "S" when he heard of it. |
C | suggested by "converse" |
W | suggested by the letter's appearance; "W" is often rendered as two side-by-side "V"'s, suggesting duplication. |
K | suggested by "constant", which is pronounced with a "k" sound. |
In the last section we showed how using combinators (like B, C, W, and K) and certain primitive functions like "plus" and "times", it is possible to form many other interesting functions. In this section, we'll show how a very different approach, using a construct called a "lambda" instead of combinators, can also be used to form functions.
We illustrate the approach with an example. Consider the function that takes the square of the successor of a number. Using the "B" combinator, this function might be represented as
B sqr succ
On the other hand, using a "lambda", the function could be represented as
x\ sqr (succ x)
(Read: "the function that given 'x', yields the square of the successor of 'x'")
A second example: Consider the function of doubling (adding something with itself). Using the "W" combinator, this function might be represented in terms of "plus" as
W plus
On the other hand, using a lambda, the function could be represented as
x\ plus x x
(Read: "the function that given 'x', yields the sum of 'x' and 'x'")
In general, we say that "x\F" (where "F" is an expression possibly mentioning "x") is the function that, given a parameter "x", yields "F". We call "\" the lambda operator.
To ensure that the reader understands this new notation, we offer a few simple problems:
When using the combinator approach to forming functions, we used currying to model multiple-parametered functions. When using the lambda approach, we can (and will) still use currying.
So far, we have only used lambdas to form unary functions (like the doubling function and the double-and-square function). But lambdas can also be used to form multiple-parametered functions (in the curried sense).
For instance, consider the "minusfrom" function (a subtraction such that "minusfrom 5 3" is "2"). Using the C combinator, this can be expressed in terms of "minus" as
C minus
It can also be expressed using lambdas as
x\ (y\ minus y x)
(Read: "the function that given 'x', yields the function that given 'y', yields the subtraction of 'y' from 'x'")
Note that we are using a lambda within a lambda (i.e., nested lambdas); this is common when using lambdas to form multiple-parametered functions. The parentheses and extra space in the last expression are not necessary and could have been omitted, as in
x\y\ minus y x
(In our notation, a lambda should always be interpreted as extending as far to the right as possible; for instance, "x\plus 2 x" should be interpreted as "x\ (plus 2 x)", and not "(x\plus) 2 x").
So far we've only used lambdas to form functions that actually use their parameter; however, a lambda can also be used to form functions that ignore their parameters. For instance, the function that always results in "3", regardless of its parameter, we can write using a lambda as
x\3
This function could also be formed easily using the K combinator:
K 3A function that ignores its parameter is said to be vacuous.
We have seen that many functions that can be formed using lambdas can also be formed using combinators. However, it is also possible to form the combinators themselves using lambdas. For instance, the I combinator can be formed using a lambda as
x\x
It is the function that, given parameter 'x', yields 'x'. Similarly, it is possible to form the W combinator using lambdas as
f\x\ fxx
It is the function that, given parameter 'f', yields the function that given parameter 'x', yields 'f' applied to 'x' and 'x'. In a similar way, it is possible to form other combinators using lambdas:
K = x\y\ x B = f\g\x\ f(gx) C = f\x\y\ fyx
When discussing combinators, we assigned each combinator a rewrite rule; in particular, we said that (for all "f", "g", "x", "y", and "z")
Ix = x Bfgx = f(gx) Cfxy = fyx Wfx = fxx Sxyz = fx(gx) Kxy = x
These rewrite rules sort of acted as definitions of the combinators. We'll now give a similar rewrite rule that will more precisely define the lambda construct. Before precisely stating the rule, we'll give some instances of it. One instance is that
(x\ sqr (succ x)) 3 = sqr (succ 3)
Since "x\ sqr (succ x)" is the function that, given "x", yields the square of the successor of "x", it makes sense that applying the function to a specific parameter "3" would yield the square of the successor of "3". Some other instances of the rule:
(x\ plus x x) 3 = plus 3 3 (x\ plus x x) (sqr 3) = plus (sqr 3) (sqr 3) (x\ x) 3 = 3 (x\ y\ minus y x) 3 = (y\ minus y 3) (x\ y\ minus y x) (sqr 3) = (y\ minus y (sqr 3)) (x\ y\ x) 3 = (y\ 3) (x\ 1) 3 = 1
Basically, the general rule is that "(x\F) G" is the same as just "F", except that every instance of "x" must be replaced with "G". This rule is called the rule of beta-reduction.
Each time one forms a function using a lambda, it is necessary to make up an arbitrary name (like "x") to stand for the function's parameter. If one is using nested lambdas (say, to form a multiple-parametered function), then when the time comes to pick the name for the second parameter, a name (say "x") will already be used up by the first one; so, it is necessary to pick a different name (like "y"). Obviously, if one picks the same name for both parameters, then problems will occur.
A person is probably not likely to initially pick the same name for two different parameters; however, it is quite easy to make a mistake while doing beta-reductions. It is possible for an expression initially free of naming conflicts to cause a conflict when beta-reduced. For instance, consider the expression
(z\ zz) (x\y\x)
This is the application of the K combinator to itself (mentioned in the last problem). The expression is free of name conflicts, but if we naively beta-reduce it, we get
(x\y\x) (x\y\x)
and then
y\x\y\x
There are now two parameters with the name "y". In this case, the problem is not really too severe, since "y" is not actually referenced; however, there are worse cases (one is mentioned in the next problem), where several referenced parameters end up having the same name.
Note in the second step of the last example, that there is not yet a naming conflict with "(x\y\x) (x\y\x)". Although there are two parameters each with names "x" and "y", the functions with those parameter names are not nested within each other; the expression is easily identifiable as the application of the K combinator to itself. The problem does not occur until the final step.
The problem with the final step could have been solved if the variables in one of the "x\y\x"s had been renamed. For instance, the expression could have been rewritten as
(x\y\x) (f\g\f)
before proceeding. Then the final step would yield
y\f\g\f
which has no naming conflicts.
The most common kind of rewrite used with lambdas is beta-reduction, but there is another kind, alluded to in the examples, known as "eta-reduction". Eta-reduction is the kind of rewrite used when we say that
x\ sqr x
is the same as just "sqr". In general, eta-reduction is the rewriting of a expression of the form "x\ F x" (where "F" contains no references to "x") to just "F".
Eta-reduction is a subset of a more general principle, called the principle of extensionality. The principle of extensionality says that if "Fx = Gx" for an arbitrary "x", then "F = G". In other words, if two functions always yield the same thing given the same parameter, then they are same.
Before proceeding to discuss lambda expressions and their relation to combinators, it may be useful to step back and reflect on our technique of referring to things.
By expression, we mean a symbol (typically a sequence of English characters/puncuators) used to denote something. In a broad sense, we might call the English sentence "The cat jumped over the hat" an expression, but by "expression" usually we mean something like "sqr", "3", "plus 5", "B sqr succ"; i.e., one of those symbols not common in ordinary English but assigned a precise meaning by us.
In fact, throughout this work we've been developing a sort of formal language. Because ordinary English does not have a concise way of precisely referring to things like the B, C, W, and K combinators and the function of squaring (etc.), we have made up our own abbreviated words for referring to them. Our abbreviated words like "B", "C", "W", "K", "sqr", and "plus" that are assigned specific meanings we will call primitive expressions.
There are some expressions that are not primitive expressions. The expression "B sqr succ" we will call a compound expression (and not a primitive expression) because it is formed several parts, parts which individually have meaning.
In English and other natural languages a myriad of strange constructs are used for forming compound expressions. Often a sentence in English appears as a long string of primitive words, possibly with puncuators. Complicated traditions (involving verbs, subjects, direct objects, prepositions, etc.) determine how the meaning of overall sentence is to be drawn from the individual parts.
In contrast, in our little formal language, compound expressions are generally formed from primitives using only the simple construct of function application. An expression in our formal language that is written using only primitives and function application is said to be purely applicative. A purely applicative expression is represented nicely using a "binary tree"; that is, a diagram like this:
* / \ / \ / \ / \ * * / \ / \ / \ sqr 3 plus * / \ sqr 5
Each "*" represents a function application, the thing to the left being the function and the thing on the right being the parameter. This particular diagram represents "plus (sqr 5) (sqr 3)", which, with extra parentheses, is written "(plus (sqr 5)) (sqr 3)".
Because binary trees are a bit awkward to draw, we've adopted the convention of writing expressions as a string of primitives with parentheses that convey the tree structure. Relevant to parentheses, it is always a good idea to keep in mind that by "f x y z" we mean "((f x) y) z", or, using with a binary tree diagram:
* / \ * z / \ * y / \ f x
Now, it should not be thought that all our expressions are purely applicative. In particular, in this last section, we've used the lambda construct, a construct that is neither a primitive nor a function application. An expression that is purely applicative except for uses of lambda will be called a lambda expression.
The lambda construct involves odd things called "variables". Symbolized using letters like "x", "y", "z", these symbols do not have any concrete meaning permanently associated with them, and thus are not primitives. Thus, the lambda construct is quite odd. There is then great interest (and practical use, we suspect, in computing systems) in seeing how lambdas can be eliminated in place of combinators.
Earlier we showed how certain expressions written with lambdas could be rewritten using combinators instead. For instance, the expression
x\ sqr (succ x)
can be rewritten, using the B combinator, as
B sqr succ
Some expressions are apparently more difficult to rewrite, for example,
x\y\ minusfrom (sqr x) (sqr y)
In a moment we'll show how this expression, and in general all lambda expressions, can be written as purely applicative expressions, using combinators. But first, we'll give some more concrete examples, demonstrating some of our techniques.
x\ minus 2 (sqr (succ x))We want to eliminate the lambda in this expression. We can see that this is the composition of three functions, "minus 2", "sqr", and "succ". Thus, the expression could be written without lambdas as
B (minus 2) (B sqr succ)
x\ minus x 3This is the function that subtracts a number from 3. It is the "minus" function with the second parameter "snuck in" to be "3". It can be written without a lambda as
C minus 3
x\ minus (sqr x) 3
This function subtracts the square of a number from 3. It is just like the last example except that the parameter is squared first. So, if we compose the last example with squaring, we get
B (C minus 3) sqr
But there is an alternative solution to this one. Consider the function "B minus sqr"; this is the subtraction function with the first parameter squared. If we snuck in "3" as the second parameter to this function, we would get the function we want, the function that subtracts a number's square from 3:
C (B minus sqr) 3
Now that we've given these examples, we'll give a couple simple techniques that can be used to help eliminate lambdas. The first technique is that if one has an expression of the form
x\ F G
where "G" mentions "x" but "F" does not, then the lambda may be "pushed" deeper into the right, using the B combinator:
B F (x\G)
For instance, if one had the expression
x\ sqr (jiba x 3)then, since the left side ("sqr") does not mention "x", it is possible to "push" the lambda to the right, as in
B sqr (x\ jiba x 3)
This makes sense; the original function does something to "x" (finds "jiba x 3") and then takes the "sqr" of the whole thing. Since "sqr" is being taken of the whole thing, it makes sense that the function is the composition of "sqr" with the other part. Also it can be verified that each function, given an arbitrary parameter "z", yields "sqr (jiba z 3)"; for the original function, this is obvious; for the new function, the result is
B sqr (x\ jiba x 3) z sqr ((x\ jiba x 3) z) (B's rewrite rule) sqr (jiba z 3) (beta-reduction)
So, we see that the technique of pushing a lambda to the right (when the variable is only mentioned on the right) is sound. Now, we'll demonstrate a similar technique; if the only occurence of the variable is on the left, then the lambda can be pushed to the left, using the C combinator. For instance, in the expression
x\ jiba x 3 4
which can be redundantly parenthesized as
x\ (jiba x 3) 4
the left side is "jiba x 3" and the right side is "4". Since only the left side mentions "x", we can push the lambda to the left, using the C combinator:
C (x\ jiba x 3) 4
In the resulting lambda, the left side is "jiba x" and right side is "3". So, the lambda can be pushed left further:
C (C (x\ jiba x) 3) 4
Now, "x\ jiba x" can be eta-reduced to "jiba", so we get
C (C jiba 3) 4
This makes sense; the original function "x\ jiba x 3 4" is like "jiba" except that the second and third parameters are snuck in to be "3" and "4" respectively. Our function is formed by sneaking in "3" as the second parameter ("C jiba 3"), and then sneaking in "4" as the new second parameter. It can be verified that we get the right answers when we apply this function to an arbitrary parameter "z":
C (C jiba 3) 4 z C jiba 3 z 4 (C's rewrite rule) jiba z 3 4 (C's rewrite rule)
These two techniques are often quite helpful ways of eliminating lambdas. Note that in order for these techniques to work, the lambda's variable must be referenced only on one side of the expression or the other (the left or the right); the variable must not be referenced on both sides.
We've given a couple techniques that can help one rewrite lambda expressions as purely applicative ones. We'll now describe a couple more, and then tie them all together into a specific procedure for eliminating lambdas.
We've already shown how a lambda expression of the form "x\ F G" can be simplified if "x" is mentioned by either "F" or "G" (but not both). There are a couple cases we haven't considered:
The first case, if "x" is not mentioned by either one, is very easy. The entire lambda can be eliminated in one swell foop using the K combinator.
The second case is a bit trickier. The thing to do in the second case is to push the lambda inward to both "F" and "G", using the S combinator. Remember that the S combinator has the rewrite rule
Sfgx = fx(gx)
Now, what we're saying is that "x\ F G" could be rewritten as "S (x\F) (x\G)". To see that this is sound (i.e., that "x\ F G" is in all cases the same as "S (x\F) (x\G)"), check that both "x\ F G" and "S (x\F) (x\G)" give them same thing when applied to an arbitrary parameter "z":
The original expression "x\ F G" when applied to an "z" will give "F G", but with every occurence of "x" changed to "z". Also, "S (x\F) (x\G)" when applied to "z" will give "(x\F) z ((x\G) z)", or in other words "F G", with every occurence of "x" within "F" changed to "z" and also every occurence of "x" within "G" changed to "z". So, the technique is sound. We'll give an example of it now, in rewriting this expression:
x\ plus (sqr x) (succ x)
The left side is "plus (sqr x)" and the right is "succ x"; each side mentions "x". So, we use the S combinator to push the lambda inward both ways:
S (x\ plus (sqr x)) (x\ succ x) S (x\ plus (sqr x)) succ (eta-reduction) S (B plus sqr) succ (eliminate remaining lambda)
We can see that this function is the same as the original function (that adds the square of a number with the successor of the number):
S (B plus sqr) succ x B plus sqr x (succ x) (S's rewrite rule) plus (sqr x) (succ x) (B's rewrite rule)
Now that we can see that this technique works, we can tie together a complete procedure for eliminating a lambda:
Suppose that we have a lambda expression of the form "x\Z" (where "Z" is a purely applicative expression; this must be an innermost lambda). Then, the structure of "Z" determines how the expression should be rewritten:
Note that "Z" will always be in at least one of those forms. The first three cases are the base cases. It would be nice to get to one of those cases, because then the lambda can be gotten rid of entirely. However, in the other three cases, the rewrite of the expression still involves a lambda; but, the body of each lambda is smaller than the body of the original lambda, which brings the expression closer to a base case.
Now, as for eliminating lambdas in expressions that have more than one lambda ... Simply pick a lambda that has no lambdas within it (i.e., pick an innermost lambda; for, there will always be at least one innermost lambda if there are any lambdas at all). Eliminate it. Then there will be one less lambda. Repeat the process until there are no more.
We've just shown how using the I, K, B, C, and S combinators it is possible to eliminate lambdas in an otherwise purely applicative expression. Since all combinators can be written as lambda expressions, this means that I, K, B, C, and S can be used to form any combinator. Because of this, we say that {I, K, B, C, S} (i.e., the set containing these five combinators) is a complete combinatory base.
However, this is not necessarily the most special base, or even the smallest. For, if the B and C combinators are omitted from the base, it will still be complete (i.e., {I, K, S} is also complete). In the cases where a lambda is pushed to the left or to the right using "B" or "C", it also works to push it both ways, using "S"; this will cause it to be pushed to a side where it is not needed, but this not a problem, and will just require an extra use of the K combinator. And, in fact, since {I, K, S} is a complete combinator base, it is possible to construct the B and C combinators from I, K, and S. Applying our procedure to the B and C combinators will show that
B = S(KS)K C = S(BBS)(KK) = S(S(K(S(KS)K))S)(KK)
It is also interesting to note that the I combinator can be constructed from just "S" and "K". In fact, "SKx" for any "x" is an identity combinator; for "(SKx)z" rewrites to "Kz(xz)" and thus "z", regardless of what "x" is. So, "SKK" is the identity combinator, then (as is "SKS").
This means that {S, K} is a complete base (and in fact, is a smallest base; no base with just one combinator can be complete, as we'll show in a later section). Still, it is not necessarily the most elegant, or the most useful in all cases.
There might be some interest in a base that does not involve "S". "S" is sort of an odd combinator that might seem kind of complex and thus not well fit to be taken as primitive. It is possible to use "W" in some cases as a primitive instead of "S". In particular, {I, B, C, W, K} is a complete base that does not involve "S". To show that this is complete, it is enough to show that "S" can be constructed from these. It can; remember that
S = B(BW)(BBC)
Also, we might like to know a good procedure for eliminating lambdas using this base (simply using our original procedure with the derived version of "S" would not be a very elegant solution). With "W", it is possible, at the very beginning stage of eliminating lambdas, to change all lambdas with more than one occurence of their variables into ones that only have one occurence (once this is done, there will be no need to use "S" to push a lambda inward, since a variable could not possibly occur on both sides if there is only one occurence). In a lambda "x\Z", if "Z" mentions "x" more than one time, then the lambda can be rewritten as "W (x\y\Z)", with one of the occurences of "x" changed to a "y"; the process may need to be repeated, if "x" still has more than one occurence (i.e., if there were three or more at the beginning).
Now we'll give a specific example of eliminating lambdas with the base {I, B, C, W, K}:
x\y\ times (plus x y) (minusfrom x y)
To begin with, there are multiple occurences of "y"; so, we'll use "W" to split the lambda:
x\W (y\y2\ times (plus x y) (minusfrom x y2))
Now, there are still multiple occurences of "x", so we'll split that lambda also:
W (x\x2\ W (y\y2\ times (plus x y) (minusfrom x2 y2)))
Now it is just a matter of eliminating the lambdas with B and C.
W x\x2\ W y\y2\ times (plus x y) (minusfrom x2 y2) W x\x2\ W y\ B (times (plus x y)) (minusfrom x2) W x\x2\ W (C (B B (B times (plus x))) (minusfrom x2)) W x\B W (B (C (B B (B times (plus x)))) minusfrom) W (B (B W) (C (B B (B C (B (B B) (B (B times) plus))) minusfrom))
Needless to say, our procedure gave quite a messy result in this case.
(The reader may be wondering if this function can be formed
using combinators instead of lambdas; it can, but because
it is a little complicated, we'll delay showing how for a bit.)
(Incidentally, it is a fairly well-known fact of arithmetic that
this function and the function mentioned in problem 2 always yield
the same thing, given the same parameters).
beta-reduces to
and then to
Then we can see that this is the composition of squaring
with succession.
Then, in the new lambda, only the left side is mentioned,
so we can push the lambda left with "C", giving
Then, since "x\ minus x" can be eta-reduced to "minus", we get
We are now left only with one lambda. The only occurence of the "x"
is on the left, so we can push the lambda left with "C":
Now it is just a matter of pushing the lambda right a couple times:
Now, we focus on the next lambda, the one with the "x". Once again,
only the right side mentions "x", so we push right with "B":
Now, we focus on the next lambda, the one with the "g". It can
be gotten rid of immediately using an eta-reduction, leaving
Now, our answer is simply
("Composition composed with composition")
Now, "f" is mentioned on the left ("S f") only, so we rewrite with "C":
This gives us an interesting way of writing the W combinator
in terms of "C", "S", and "I".
This is an interesting combinator which we will call "N". Whereas
"S" was a generalized version of "W", "N" is a more general
form still. From "S"'s rewrite rule,
we see that "S" is like "W" except that it allows the second
"x" to be modified by a function before being passed to the
binary "f". "N", with the rewrite rule,
allows both "x"s to be modified (each by a different function)
before each is passed to the binary "f".
The idea of using a construct like a lambda to refer to functions
began, it seems, with Alonzo Church in the 1930s or so as a device
for his "lambda calculus" (Gottlob Frege I think may have used a similar
notation a bit earlier). Before lambdas were used, functions were
apparently referred to in this sort of way:
This sort of notation requires that the function be given an
arbitrary name "f" (in addition to an arbitrary name "x" for the parameter),
whereas with a lambda construct, a name like "f" is not needed.
There are actually many different notations in use for the lambda
construct. In the most standard notation, the last function would
be written
using "", the greek letter "lambda".
(We've avoided this notation because it is somewhat cluttery and
because it is difficult to get the lambda to match the font size in HTML).
One might also see
where a backslash ("\") is used in place of the lambda symbol. Also,
in the Haskell programming language,
is used.
We've already introduced several interesting combinators,
including I, B, C, W, K, and S. In this section, we'll introduce
a series of combinators called "iterators". These combinators are
very similar to the natural numbers "0", "1", "2", "3", and so forth.
The first iterator we'll introduce is the one which corresponds
to the number two. This combinator, "2", has the rewrite rule:
This combinator takes a function "f" and yields a function that
applies "f" two times. For instance, "2 sqr" would be a function that
squares a number twice (i.e., raises it to the fourth power); applying
"2 sqr" to a number "3" would give "sqr (sqr 3)", or in other words "81".
"2 sqr" is the same as "B sqr sqr".
Now, we introduce the iterator "3", with the rewrite rule:
This combinator take a function "f" and yields a function that
applies "f" three times. In other words, given "f", it yields
"Bf(Bff)", which is the same as "B(Bff)f". Continuing this pattern, we
define "4" to have the rewrite rule:
Combinators that are like "2", "3", and "4" are called iterators.
In an expression "4f", the "f" is called the iterand, and
the the result is called an iteration of "f".
There are, of course, iterators for the two very special numbers
one and zero. "1" is an iterator that takes a function "f" and yields
a function that applies "f" a single time (in other words, it
makes no change on "f"); it would have the rewrite rule:
In extension, the "1" iterator is the same as the "I" combinator,
since it makes no change to its parameter.
Next, the "0" iterator is a combinator that takes a function "f"
and yields a function that applies "f" no times at all:
In the last section we showed how the I, B, C, W, and K combinators
could be used to form all other combinators. Since an iterator is
a kind of combinator, iterators can also be formed from I, B, C, W, and K.
We already mentioned that the "1" iterator is the same as the
I combinator. The "0" iterator when applied to any function yields
the I combinator, so we can say that
Also, the "2" iterator takes a function "f" and yields "Bff"
(composition of "f" with itself). Because of this, we say that
Now, as for the "3" iterator, it is
And, the "4" iterator is
At this point, it looks like "SB" may be a successor function,
since "SB2" is "3" and "SB3" is "4". In fact, "SB" is a successor
function; it may be useful to write it using lambdas:
It takes three parameters; the first one, "n", is the iterator
to take the successor of. The next one is "f", the function to
iterate. The third is "x", the thing that will have "f" applied
to it some number of times. Then, after taking these three parameters,
"SB" yields, not "nfx" (which would be "f" applied to "x" the original
number of times), but "f(nfx)", which is "f" applied to "x" the
original number of times plus one more time.
We found that "SB" is a successor function, a function that adds one.
We'll now show that "NB" is a general addition function, where
"N" is the combinator such that
If "N" is this combinator, then "NB", the addition function, is
Unlike the successor function which took three parameters, this
function takes four parameters; "m" is the iterator that is being added
(i.e., the amount to add); "n" is the original iterator; "f" is
the function being iterated; and "x" is the thing that "f" will
be applied to some number of times. Unlike the successor function
which gave "f(nfx)" ("f" applied to "x" the original number of
times plus one more), the addition function yields "mf(nfx)"
("f" applied to "x" the original number of times plus "m" more times).
As noted in the last problem, "3(2K)" is the same as "6K". And,
in general, "3(2f)" is the same as "6f". Thus, "B 3 2" is the same as "6".
It seems that the B combinator can be used to multiply iterators,
and it can.
This makes sense because in general, if "m" and "n" are iterators,
then "m(nf)" is a function that applies "nf" "m" times, but "nf"
is the applies "f" "n" times, so "m(nf)" applies "f" a total of "m * n"
times. So, in general, "B m n", given a function "f", yields a
function that applies "f" "m * n" times.
It is interesting that the combinator for multiplication is
even simpler than the combinator for addition.
As demonstrated in the last problem, "3 2" is the cube of "2", or
in other words, "8". In general, if "m" and "n" are iterators,
then "m n" is "nm".
This makes sense because in general, if "n" is an iterator,
then "n f" is "f" composed with itself "n" times. And, if "f"
is an iterator, then composing it with itself "n" is the same as
multiplying it with itself "n" times (since composition is used for
multiplication of iterators), and multiplying something by
itself "n" times is what is meant by raising it to the "n"th
exponent.
So, essentially, the "2" iterator acts as a squaring function,
when applied to other iterators. And, "3" acts as a cubing function.
And, "1" acts a function that raises a number to the power of "1"
(i.e., makes no change). And, "0" acts as a function that raises
a number to the power of "0" (i.e., always yields "1").
A note about our notation: although we often omit spaces in
expressions (as in "2Kx"), we will always leave a space between
the names of two iterators. That is, we will not write "32" if
we mean the cube of two; we will instead write "3 2". When we
do write "32", we are referring to the iterator for thirty-two.
There are several properties that numbers are traditionally
taken as having; here are some (where "^" is exponentiation):
These properties hold for all iterators "x", "y", and "z", as nearly
trivial consequences of the definitions we've given of
"+", "*", "^", "0", and "1". We won't bother to show how every
single one can be proven, but we will prove a few...
For instance, the property that
One particularly interesting property is
So, this property is the most fundamental property of "*" of all
(B's rewrite rule), and is a sort of definition of multiplication
in terms of exponentiation.
Similarly, the property
is a fundamental definition of addition in terms of
multiplication and exponentiation.
Now, it should not be thought that all properties of numbers
are trivial to prove with iterators. One that is typically
not-so-easy is
In a theory of iterators, this might be expressed as
Yet, this does not hold for all "x" and "y" (remember that composition
is not commutative). However, it does hold if "x" and "y" are both
iterators, but this is not a simple thing to prove (in fact, we
won't attempt to prove it here at all).
Up until now, we have only used iterators corresponding to
the natural numbers: 0, 1, 2, 3, (etc.). We thought of "2"
as the iterator applies a function twice; "1" is the iterator
that applies a function once; "0" is the iterator that applies
a function no times at all. If we continued the pattern, we
would take "-1" to be the iterator that "takes away" a function
from a parameter. In other words, "-1", given a function, would
yield the function's "inverse", a function that has the opposite
effect as the original.
Unfortunately, although iterators for "0", "1", "2", (etc.) could
be defined in terms of combinators, there is no way that this
can be done for "-1". For, consider our definitions of "3", "2", "1", and "0":
There does not seem to be a way to continue this pattern. However,
if we added an odd extension to our lambda construct, we could, in
this way:
But this is quite an abuse of the lambda construct, as we've currently
defined it.
An addition of a "-1" as a primitive would add great strength to
a language which otherwise could only express combinators, because,
using multiplication, it is possible to form all negative integers
(for instance, "B 3 (-1)" is "-3"). Also, fractions could be formed
by raising a integer to the power of "-1" (for instance, "(-1) 3"
is one third). Then, by raising numbers to fractional powers,
roots of numbers could be expressed (for instance, "(-1) 2 3" is the
square root of three, because "(-1) 2" is the inverse of the squaring
function, the square root function). Roots of negative numbers could be
used to express imaginary numbers.
However, because consideration of "-1" and similar functions would
cause many complications that we are not yet prepared to handle, we will
postpone discussing them further until a later chapter.
Often in this work we'll refer to compositions of various functions.
With our current syntax for composition, this can be quite cluttery.
For instance, the composition of functions "f", "g", "h", and "x"
might be written:
Since this is too cluttery, we introduce a new notation. In our
new notation, the above function would be written:
In general, "Bfg" will be abbreviated to "f * g". The use of "*" is,
of course, suggested by "B"'s use for multiplication.
Also, we will sometimes abbreviate "NBfg" to "f + g". This is
sometimes meaningful even when "f" and "g" are not iterators.
Whereas "*" is used for composing unary functions, "+" is
used to sort of "compose" binary functions. For instance,
"add + mul" is a function that multiplies by number and then
adds by the same number. That is, "(add + mul) 3" is a function
that multiplies by 3 and then adds 3, and is the same as
"add 3 * mul 3". Applying "(add + mul) 3" to, say, the number "2"
would give "add 3 (mul 3 2)", or in other words, "9".
This is the application of "K" upon "x" 3 times and then 2 more times.
In other words, it is "5Kx", or "K(K(K(K(Kx))))". We could have
seen this easily if we'd noticed that "NB 2 3" (the sum of 2 and 3)
is "5".
So, "3(2f)", given an "x", gives "f(f(f(f(f(fx)))))", just as
"6f" does.
The idea of using combinators in this way to represent natural numbers
originated with Alonzo Church in the early 1930s. Actually, Church
did not use the iterator for zero (he began with one); apparently,
he did not like the zero iterator, because it is a function ("KI")
that ignores its parameter (in general Church did not
like functions that ignore their parameters).
We've already mentioned several combinators, including I, B, C, W, K, S,
N, T, and several iterators, with these rewrite rules:
In this section, we'll describe several other interesting combinators,
primarily variations of the above combinators.
We've already used the B combinator extensively. Now we'll focus
on various iterations of "B". For instance, "2B" (read: "B squared"),
or in other words, "BBB" or "B * B", has the rewrite rule:
This is clear, because "(2B)fgxy" is the same as
Whereas "B" is used to compose a function "f" upon a unary
function "g", "2B" can be used to compose "f" upon a binary
function. For instance, "2B sqr plus" is a binary function
that adds two numbers and then squares the result.
Note that "B sqr plus" (as opposed to "2B sqr plus") is not particularly
meaningful; whereas "2B sqr plus" waits for two parameters "x" and "y"
(so to speak), giving them to "plus" before taking the square,
"B sqr plus" only waits for one parameter to give to "plus", and takes
the square (the square is then taken of an adding function then, rather
than a number). This actually could be meaningful after all
if "sqr" is interpreted as the "2" iterator; for sake of clarity,
however, we're using "sqr" and "plus" to mean the functions of
squaring and addition over numbers in the ordinary sense, not iterators.
Now, "3B", as one might expect, has the rewrite rule:
because "(3B)fgxyz" is the same as
"3B" could be useful when composing "f" with a tertiary function
"g" that needs to take 3 parameters before the result is passed
to "f". And, of course, "4B", "5B", (etc.) could be used for
compositions upon functions of 4, 5, and more parameters.
Unlike "B", when "C" is iterated, new interesting combinators
are not formed. For instance, "2C" is the same as "I", because "C(Cf)"
is in all cases the same as "f". This is because taking the converse
of a function one time flips the order of its two parameters, but
taking the converse again restores them to the original order.
Then, "3C" is the same as "C", and "4C" is the same as "I".
In general, if "n" is an iterator, then "nC" is either "C" or "I";
it is "C" if "n" is odd, but "I" if "n" is even.
The W combinator does yield interesting variations when iterated.
For instance, "2W" has the rewrite rule:
This is because "(2W)fx" is the same as
Whereas "W" was used to unify a binary function to a unary one,
"2W" can be used to unify a tertiary function to a unary one.
Then, as would be expected, "3W" can be used to unify 4-ary functions,
and has the rewrite rule:
This is clear because "(3W)fx" is the same as:
One can think of "3W" as making 3 duplicates of its second parameter,
"x" (making a total of 4 instances).
We've already mentioned some iterations of "K" in the last section.
We saw that while "K" can be used to form a function that ignores
a parameter, "2K" can be used to form a function that ignores two
parameters. In particular, "2K" has the rewrite rule:
We know this because "(2K)xyz" is the same as
Also, "3K" can be used to form a function that ignores three
parameters; it has the rewrite rule:
This is clear because "(3K)xyzf" is the same as
We'll now look at what is formed when "B", and iterations of "B",
are applied to "C". We've seen that "C" can be used to swap the
first two parameters of a function that is at least binary.
it has the rewrite rule:
Now, "BC" is similar to "C"; it takes a function and swaps
its second and third parameters (leaving the first intact).
It has the rewrite rule:
Note that the "y" and "z" are interchanged on the right.
To verify that "BC" behaves this way, see that "(BC)fxyz" is
the same as:
Now, "B(BC)", or in other words, "2BC", is also similar to "C".
It takes a function and swaps its third and fourth parameters
(leaving the first and second intact). It has the rewrite rule:
This can be verified by noting that "(2BC)fxyzg" is the same as:
As one might expect, "3BC" is like "2BC" except it postpones
the interchange for one more parameter (i.e., it swaps a function's
fourth and fifth parameters). In general, if "n" is an iterator,
then "nBC" takes a function and swaps its "n+1"th and
"n+2"th parameters.
We'll now look at the results of applying iterations of "B" to "W".
We know that "W" can be used on an at-least binary function to duplicate
its first parameter (i.e., make it take the place of the first and
second parameters); it had the rewrite rule:
This can be verified by noting that "(BW)fxy" is the same as:
As one might expect, "2BW", or in other words, "B(BW)", has the
rewrite rule:
This is clear because "(2BW)fxyz" is the same as:
It is clear that each time "B" is applied the operation of duplication
is postponed another parameter.
The situation with "K" is similar to that of "C" and "W".
We know that "K" can be used on a function (possibly a 0-ary function;
i.e., not a function at all) to yield a function like the original, but
with a "dummy" parameter inserted first in the parameter list. "K" has
the rewrite rule:
Now, "BK" is like "K", except that the second parameter of
the resulting function is ignored, instead of the first. "BK" has the
rewrite rule:
This is clear because "(BK)fxy" is the same as:
As one would expect, applying "2BK" to a function results in
the function but with a dummy parameter inserted after two real ones.
Then, "3BK" takes a function and inserts a dummy parameter after
three real ones.
Iterating "B" upon itself has some interesting results.
Recall that "B" essentially takes a function and yields a
function like the original, but with the first parameter "split"
into two (function and parameter). We see that "BB" is like
"B", except that it splits a function's second parameter
instead of its first:
Similarly, "2BB" would split a function's third parameter
(skipping over 2 of them):
This is clear because "(2BB)fxygz" is the same as:
We've now seen the results of iterating and composing the
basic combinators B, C, W, and K. In summary, iterating each
combinator three times gives:
Remember that by "*" we mean composition, and that iterating
a function yields that function's composition with itself
so many times.
Then, to summarize our discussion of compositions of B, C, W, and K,
note that composing each two times gives:
The B, C, W, and K combinators all take a function "f" and yield
a function like "f", but with the parameter list rearranged
(some parameters possibly duplicated, ignored, or applied
to each other). Because these combinators behave this way, they
are called "regular combinators". Applying "B" to any regular combinator
yields another regular combinator like the original but
with the effect on the parameter list postponed one parameter.
To more precisely define "regular combinator", we say that a
combinator is a regular combinator if and only if
it mentions its first parameter only one time, and at the very front of
the right side of the rewrite rule. The combinators I, B, C, W, K, and S
are all examples of regular combinators. The combinators T and M,
however, are irregular:
We'll now examine interesting variations of the C combinator
which we'll call "pushing" combinators; a pushing combinator is
a regular combinator that when applied to a function yields a function
like the original but with the function's first parameter pushed back
some number of parameters (i.e., the first parameter is postponed).
The C combinator is the most elementary pushing combinator; it postpones
a parameter back one place. The next pushing combinator is:
Essentially, applied to a function, it takes the function's converse,
pushing the first parameter back one place; then, "BC" is taken of the
result, swapping the function's new second parameter with the third,
pushing it back one further.
The next pushing combinator is:
This combinator takes a function, swaps its first parameter
with its second, its new second parameter with its third,
and then its new third parameter with its fourth; in the final
result, the original first parameter is swapped into the fourth place
and the other three are shifted forward.
The pushing combinators are quite useful; thus, we'll use
a special syntax for naming them ... "BC * C" will be called
simply "C2", and "2BC * BC * C" will be called "C3"
and so forth.
There is another kind of variation of "C", "pulling" combinators.
These behave in the opposite way as pushing combinators; whereas
a pushing combinator postpones a parameter some number of places,
a pulling combinator takes a parameter some number of places back
and pulls it to the front. That is, a pulling combinator, when
applied to a function of several parameters, yields a function like
the original but with one of the later parameters taken first.
The C combinator is the most elementary pulling combinator; it
pulls a function's second parameter to the front. The next
pulling combinator is:
When applied to a function, it swaps the function's second and
third parameters, then swaps the new second parameter with the first.
The result is that the third parameter is pulled to the first place
while the original first and second parameters are pushed back.
The next pulling combinator is:
This combinator pulls a function's fourth parameter to the front.
It is interesting that pulling combinators are formed in
quite a similar way to pushing combinators. Also, all pushing
combinators and pulling combinators can be formed from just "B"
and "C". For instance, the pulling combinator just mentioned would
be written without the "*" abbreviation as:
Because pulling combinators are quite useful, we also use
a special syntax for naming them;
will be called "C2-1", while
will be called "C3-1". This is a natural
way of naming pulling combinators, since they are inverses of
corresponding pushing combinators (the inverse of a function
"f" is commonly (and naturally) written "f-1").
So, applying "2W" to this unifies all three parameters, giving
This function could also have been written,
In general, if "m" and "n" are iterators, then "mK(nK)" is
a function that ignores "m" parameters, takes the important
parameter, ignores "n" more parameters, and yields the important
one. It takes a total of "m+n+1" parameters, "m+n" being the
number of parameters that are ignored.
Note that our last example, "2KI", could be written "2K(0K)";
as one would think from the rule stated above, it ignores two
parameters, takes the important one, doesn't ignore any more,
and yields the important one.
Then, applying "BC" to this would interchange the second and third
parameters, giving:
So, applying "BW" to this would duplicate the second parameter "y",
making it take the place of the third parameter "z" also, giving:
Answers to Problem Set 1
Answers to Problem Set 2
Answers to Problem Set 3
Answers to Problem Set 4
Answers to Problem Set 5
(f\g\x\ f(gx)) sqr succ
(g\x\ sqr (gx)) succ
x\ sqr (succ x)
Answers to Problem Set 6
Answers to Problem Set 7
Answers to Problem Set 8
B (times 3) (x\ minus x 3)
B (times 3) (C (x\ minus x) 3)
B (times 3) (C minus 3)
x\ B (minusfrom (sqr x)) (y\ sqr y)
x\ B (minusfrom (sqr x)) sqr (by eta-reduction)
C (x\ B (minusfrom (sqr x))) sqr
C (B B (x\ minusfrom (sqr x))) sqr
C (B B (B minusfrom (x\ sqr x))) sqr
C (B B (B minusfrom sqr)) sqr
f\g\x\ Bf(y\gxy)
f\g\x\ Bf(gx) (eta-reduction)
f\g\ B(Bf)(x\gx)
f\g\ B(Bf)g (eta-reduction)
f\ B(Bf)
BB(f\Bf)
BBB (eta-reduction)
Answers to Problem Set 9
f\ S (x\fx) (x\x)
f\ S f (x\x) (eta-reduction)
f\ S f I ("x\x" is I combinator)
C (f\ S f) I
CSI (eta-reduction)
Answers to Problem Set 10
f\g\h\x\ f(gx)(hx)
f\g\h\ S(Bfg)h
f\g\ S(Bfg)
f\ BS(Bf)
B(BS)B
Sfgx = fx(gx)
Nfghx = f(gx)(hx)
Historical Note
The function, f(x) = 2*x + 3
x. 2*x + 3
\x. 2*x + 3
\x-> 2*x + 3
Common Iterators: 2, 3, 4
2fx = f(fx)
3fx = f(f(fx))
4fx = f(f(f(fx)))
Problem Set 1 (Answers)
One and Zero
1fx = fx
0fx = x
Problem Set 2 (Answers)
Construction of Iterators from Other Combinators
0 = KI
2 = WB
f\x\ f(f(fx))
f\ Bf(Bff)
S(f\Bf)(f\Bff)
SB(WB)
f\x\ f(f(f(fx)))
f\Bf(Bf(Bff))
SB(f\Bf(Bff))
SB(SB(f\Bff))
SB(SB(WB))
The Successor Function
SB
n\f\ Bf(nf)
n\f\x\ f(nfx)
Problem Set 3 (Answers)
Addition
Nfghx = f(gx)(hx)
m\n\f\ B(mf)(nf)
m\n\f\x\ mf(nfx)
Problem Set 4 (Answers)
Multiplication
Problem Set 5 (Answers)
Exponentiation
Problem Set 6 (Answers)
Properties of Iterators
0 + x = x
1 * x = x
0 * x = 0
x ^ 1 = x
x ^ 0 = 1
(x + y) + z = x + (y + z)
(x * y) * z = x * (y * z)
(x + y) * z = (x * z) + (y * z)
z ^ (x + y) = (z ^ x) * (z ^ y)
z ^ (x * y) = (z ^ y) ^ x
(x * y) * z = x * (y * z)
can be proven by showing that "B" is associative, that "B(Bfg)h" is in all
cases the same as "Bf(Bgh)". This is fairly obvious, because
both "B(Bfg)h" and "Bf(Bgh)" give "f(g(hx))" when applied to a
parameter "x".
z ^ (x * y) = (z ^ y) ^ x
Written using "B" for "*", it says:
(Bxy)z = x(yz)
z ^ (x + y) = (z ^ x) * (z ^ y)
x * y = y * x
Bxy = Byx
Negative Numbers
3 = f\x\ f(f(fx))
2 = f\x\ f(fx)
1 = f\x\ fx
0 = f\x\ x
-1 = f\(fx)\ x
-2 = f\(f(fx))\ x
-3 = f\(f(f(fx)))\ x
New Notation for Composition
Bf(Bg(Bhx))
f * g * h * x
Answers to Problem Set 1
Answers to Problem Set 2
Answers to Problem Set 3
SBIfx
Bf(If)x (S's rewrite rule)
Bffx (I's rewrite rule)
f(fx) (B's rewrite rule)
as we'd expect the "2" iterator to.
SB(KI)fx
Bf(KIf)x (S's rewrite rule)
BfIx (K's rewrite rule)
f(Ix) (B's rewrite rule)
fx (I's rewrite rule)
Answers to Problem Set 4
B(2K)(3K)x (N's rewrite rule)
2K(3Kx) (B's rewrite rule)
Answers to Problem Set 5
3(2f)x
2f(2f(2fx))
2f(2f(f(fx)))
2f(f(f(f(fx))))
f(f(f(f(f(fx)))))
Answers to Problem Set 6
Historical Note
Ix = x
Bfgx = f(gx)
Cfxy = fyx
Wfx = fxx
Kxy = x
Sfgx = fx(gx)
Nfghx = f(gx)(hx)
Txf = fx
0fx = x
1fx = fx
2fx = f(fx)
3fx = f(f(fx))
Iterations of B
(2B)fgxy = f(gxy)
2Bfgxy
B(Bf)gxy (2's rewrite rule)
Bf(gx)y (B's rewrite rule)
f(gxy) (B's rewrite rule)
(3B)fgxyz = f(gxyz)
3Bfgxyz
B(B(Bf))gxyz (3's rewrite rule)
B(Bf)(gx)yz (B's rewrite rule)
Bf(gxy)z (B's rewrite rule)
f(gxyz) (B's rewrite rule)
Problem Set 1 (Answers)
Iterations of C
Problem Set 2 (Answers)
Iterations of W
(2W)fx = fxxx
2Wfx
W(Wf)x (2's rewrite rule)
Wfxx (W's rewrite rule)
fxxx (W's rewrite rule)
(3W)fx = fxxxx
(3W)fx
W(W(Wf))x (3's rewrite rule)
W(Wf)xx (W's rewrite rule)
Wfxxx (W's rewrite rule)
fxxxx (W's rewrite rule)
Problem Set 3 (Answers)
Iterations of K
(2K)xyz = x
2Kxyz
K(Kx)yz (2's rewrite rule)
Kxz (K's rewrite rule)
x (K's rewrite rule)
(3K)xyzf = x
3Kxyzf
K(K(Kx))yzf (3's rewrite rule)
K(Kx)zf (K's rewrite rule)
Kxf (K's rewrite rule)
x (K's rewrite rule)
Problem Set 4 (Answers)
Compositions of C
Cfxy = fyx
(BC)fxyz = fxzy
BCfxyz
C(fx)yz (B's rewrite rule)
fxzy (C's rewrite rule)
(2BC)fxyzg = fxygz
2BCfxyzg
B(BC)fxyzg
BC(fx)yzg
C(fxy)zg
fxygz
Problem Set 5 (Answers)
Compositions of W
Wfx = fxx
Similarly, "BW" can be used on an at-least tertiary function to
duplicate its second parameter, leaving the first parameter intact.
It has the rewrite rule:
(BW)fxy = fxyy
BWfxy
W(fx)y (B's rewrite rule)
fxyy (W's rewrite rule)
(2BW)fxyz = fxyzz
2BWfxyz
B(BW)fxyz (2's rewrite rule)
BW(fx)yz (B's rewrite rule)
W(fxy)z (B's rewrite rule)
fxyzz (W's rewrite rule)
Problem Set 6 (Answers)
Compositions of K
Kfx = f
(BK)fxy = fx
BKfxy
K(fx)y (B's rewrite rule)
fx (K's rewrite rule)
Problem Set 7 (Answers)
Compositions of B
(BB)fxgy = fx(gy)
(2BB)fxygz = fxy(gz)
2BBfxygz
B(BB)fxygz (2's rewrite rule)
BB(fx)ygz (B's rewrite rule)
B(fxy)gz (B's rewrite rule)
fxy(gz) (B's rewrite rule)
Problem Set 8 (Answers)
Summary
3B = B * B * B = f\g\x\y\z\ f(gxyz)
3C = C * C * C = f\x\y\ fyx
3W = W * W * W = f\x\ fxxxx
3K = K * K * K = f\x\y\z\ f
2BB = B(BB) = f\x\y\g\z\ fxy(gz)
2BC = B(BC) = f\x\y\z\g\ fxygz
2BW = B(BW) = f\x\y\z\ fxyzz
2BK = B(BK) = f\x\y\z\ fxy
Txf = fx
Mf = ff
Pushing Combinators
BC * C
2BC * BC * C
Problem Set 9 (Answers)
Pulling Combinators
C * BC
C * BC * 2BC
BC(B(BC)(B(BC)))
C * BC
C * BC * 2BC
Problem Set 10 (Answers)
Answers to Problem Set 1
x\y\z\ (x*y) + z
Answers to Problem Set 2
Answers to Problem Set 3
x\y\z\ (x*y) + z
x\ (x*x) + x
N plus sqr I
Answers to Problem Set 4
(2KI)xyz = z
(2K(2K))xyzfg = z
Answers to Problem Set 5
x\y\z\ (x*y) + z
x\y\z\ (x*z) + y
Answers to Problem Set 6
x\y\z\ (x*y) + z
x\y\ (x*y) + y
Answers to Problem Set 7
x\f\y\z\ (x*y) + z
Answers to Problem Set 8
f\x\y\z\ (fx * y) + z
x\f\y\z\ (x * fy) + z
x\y\f\z\ (x * y) + fz