[Sept, 2021] I will use a C-like language throughout, with substantial liberties in its syntax, and I will try to answer "what if" and "how" questions regarding the implementation of some new features that actually cannot be implemented in C due to its limitations. I will examine and highlight those limitations. The scope of this exercise is to better understand Lisp and the power of the abstractions it offers over and above what most languages have, even though there will be no line of Lisp anywhere. For the full experience, you are encouraged to come up with new examples, sketch some exercises with pen and paper and think hard about these problems on your daily walks.
Were there no advantages to be reaped from these studies, beyond the gratification of an innocent curiosity, yet ought not even this to be despised; as being one accession to those few safe and harmless pleasures, which are bestowed on human race.
— David Hume - An Enquiry Concerning Human Understanding (1748)
The function, as an abstraction, is known to every developer. But what is, conceptually speaking, a function? Ignoring any implementation details like stacks and pointers and just taking a higher-level view of the concept of function, we can begin to understand its essence (example slightly adapted),
[…] people who use the word 'function' ordinarily have in mind expressions in which a number is just indicated indefinitely by the letter
x
, e.g.2*x3 + x
People call
x
the argument, and recognize the same function again in2*13 + 1,
2*43 + 4,
2*53 + 5,only with different arguments, viz.
1
,4
, and5
. From this we may discern that it is the common element of these expressions that contains the essential peculiarity of a function; i.e. what is present in2*x^3 + x
over and above the letterx
. We could write this somewhat as follows:2*( )3 + ( )
I am concerned to show that the argument does not belong with a function, but goes together with the function to make up a complete whole; for a function by itself must be called incomplete, in need of supplementation, or unsaturated […].
Thus, e.g., we split up the sentence 'Caesar conquered Gaul' into 'Caesar' and 'conquered Gaul'. The second part is unsaturated - It contains an empty place; only when this place is filled up with a proper name, or with an expression that replaces a proper name, does a complete sense appear. Here too I give the name 'function' to the meaning of this unsaturated part. In this case the argument is Caesar.
— Frege - Function and Concept (1891)
Bluntly, a function is a piece of code that is missing another piece of code to make it whole. To make it complete. We usually specify the missing pieces by giving them names (the arguments) so that we are not confused which is which when the time comes to replace them. We can translate the above example in C,
int f(int ___) { return 2 * cube(___) + ___; }
The essence of this function is the cube
, the multiplication(*
) and the
addition (+
). What is missing, the thing that we need to supplement it with, is a
number on which to apply the cube
, the multiplication and the addition
operations.
Usually we're defining such functions as,
int f(int x) { return 2 * cube(x) + x; }
but notice that it does not matter what the name of the parameter is, which is
nicely exemplified in our previous example, where we've used ___
to highlight
the fact that what we are doing is taking away some piece of code which we
intend to put in later. Thus, x
could be replaced throughout the function with
abc
or my-new-parameter
and it would not matter a bit regarding the
semantics of our intention. It would still be the same function.
Let's use this function and analyze what happens during its call,
/* We can give this whatever name we want */ int a = 5; /* Call the function (i.e. add the missing pieces) */ f(a); /* Inspect the body of the function after the replacement. */ return 2 * cube(5) + 5;
Notice there is no trace of a
inside the body of the function after the
replacement and before the call of the multiplication, the cube
and the
addition. Before the missing pieces are added, or before the function is
supplemented, if you will, the parameter itself gets replaced as well. The call
is f(a);
but the result is not return 2 * cube(a) + a;
. The C implementation
knows that what we really want is the value of a
, which is 5
and not a
itself. This is the good news. The bad news is that it enforces this
replacement without a right to appeal from our side. We will get to the
limitation of this shortly.
If all this seems a bit pedantic, fear not. It is important for what follows. As
we will see, the kind of "missing things" (the ___
's) the we can extract and
talk about in the process of creating (or defining) a function has a direct
relation to the kind of abstractions we can create in a programming
language. They put a high limit on the kind of functions we can define. Or, more
directly, it directly relates to the power of the language.
Essences are those common and recurring qualities essential to all members of a class, conceived of abstractly, that is, in abstraction from all particular problems.
— Arthur F. Holmes - A History of Philosophy, John Locke's Theory of Ideas
These abstractions that we just saw, that involve a kind of 'replacing of
values', are what most programming languages offer, including C. We can define
any and how many functions we might like, but when it comes to defining
functions like or similar to if
and while
, among others, we reach the dead
C. These are all offered by the original designers of the language, specified in
a standard and enacted by all the complying C implementations. They are useful
to us but we cannot implement them, if we wanted to. Or, what it's more
important, we cannot implement anything similar to them for our own needs.
if
statementA simple example,
if (a > b) { result = a-b; } else result = a+b;
I've said above that this is a function, even though in C terms it is a
statement. Indeed, this statement is underneath nothing more than a void
function with a special syntax approved by the standard. While the standard is
untouchable, we can still imagine implementing an if
function (iff
) that
behaves like the original one,
void iff(bool condition, ??? exp1, ??? exp2) { if (condition) exp1; else exp2; }
I can clearly imagine this being the C's way of doing things and it would subtract nothing from its usefulness,
iff(a > b, result = a-b, result = a+b);
iff(a > b, result = a-b, result = a+b);
This is not valid C code, but its a skeleton on which we want to build. It is
code under development, if you will. But the intention is clear: we want our
function to decide if it should evaluate result = a-b
or result = a+b
based
on the value of a > b
, but not both of them.
I've left out the type of the iff
's parameters, since there is actually no C
type that we can use to do what we want. We will come back to this in the next
section when we will discuss in more detail what type those parameters should
really be.
This looks similar to our very first example, the f
function. As we saw then,
before replacing the missing code inside our function body, the C language
evaluates the parameters to get their value. And it is the value of these
parameters and not the parameters themselves that get passed inside our function
body to replace the missing code. To supplement our function and make it
whole. This is pass by value and is used in C as well as in most other
programming languages.
To make the above clear, let us imagine the inside of our function right after it was called and check the value of the parameters,
/* Take some values for a and b */ int a=5; int b=10; int result; /* Make the call. */ iff(a > b, result = a-b, result = a+b); /* Inspect the inside of the function after the call. */ if (false) -5; else 15;
To sum up, all the parameters to any function we might care to define are
evaluated before we can even get a chance to inspect and decide what to do with
them. This is the bad news from our previous section. In the above call, a >
b
, result = a-b
and result = a+b
are all evaluated before entering the
iff
body, in no particular order (the order of evaluation is not specified in
the standard and it is left to each implementation to decide). We would have
wanted the whole expression and we to be the ones that decide what expression is
to be evaluated and when. The point to take is that we do not control the time
of the evaluation nor the order of it, nor we can decide not to evaluate some of
the parameters at all if we so choose.
There is no reason enough to despair for not being able to implement the iff
function. After all, our language already provides this conditional. The
point of the exercise, though, is to extend the language to its breaking
point so as to understand both it and more powerful languages
better. And this, we did. Nonetheless, there are really useful
conditionals out there which, owing to the same limitations that we've
already discovered, cannot really be implement and are not offered by
the language standard either,
cond(a > b, "a is greater than b", a > c, "a is greater than c", weekday(FRIDAY), "let's go home", else, "nothing matched");
Known as cond
in the lisp world, the generic conditional is similar in looks
with the switch
statement, but each case has a brand new condition. I could
imagine a C implementation of a function with an even and variable number of
arguments, where each pair of arguments specifies a condition and a result
returned by the cond
function if that condition is true. All conditions are
evaluated in order until one evaluates to true; otherwise the last one is
returned. Here again we need to control the order of evaluation of the
parameters and choose not to evaluate some of them, if we so wish.
Or, a small update to the standard if
function would let it
return the result of its last evaluation (an expression, that is), in
which case we really do need to be able to implement the if
function,
result = iffv(a > b, a-b, a+b);
Or an extended if
that catches any errors that might be thrown during the
evaluation of its first argument and sends them to its last argument (I've
ignored the actual error here),
if_timeout( elevator_state(), printf("Elevator is running"), printf("Elevator is stopped"), printf("Unknown elevator state, possibly due to bad sensor.") );
Any such or other variants of conditionals specifically designed for our own problem domain would greatly enhance our expressiveness in the language. But all these depend on features we found lacking in our language, for the reasons we've already seen.
We have a similar story in the case of the while
statement. The only
difference is in the way we would like to handle the parameters to it. I will
again assume that while
is an implementable function, and just use it,
while(a > 0, a--, add_to_list(a));
What we want is to evaluate the first argument and if that is true it evaluate
the second and third arguments in order, and terminate otherwise. Somewhat
similar to the if
, in that it selectively evaluates its arguments based on one
of them. But unlike an if
, a while
continues this process over and over
again until its first argument evaluates to false. That is, we would like to
have the ability not only to selectively evaluate what parameters we like, but
also the ability to evaluate some of them more than once. This, our language
again does not offer. If we remember the bad news, any parameter is evaluated
once before entering the function body, after which we only have access to the
evaluation result and no way to repeat that evaluation again, if we so choose.
We cannot have a function, for example, that evaluates an expression a given number of times (10) as long as it returns false, with a timeout (5000 ms) between the evaluations,
try_wait_ms( open_lock(), 5000, 10, error("cannot open lock") );
As above, similar abstractions in the same spirit as a while
function
are imaginable but not expressible with our meager tools.
The while
function evaluates all its parameters more than once until the first
one returned false. This for
function is a mixed bag,
for(a = 1; a < 10; a++; add_to_list(a))
The for
function evaluates its first argument only once, expects the second
argument to be an expression returning a boolean and evaluates the last two as
long as the second expression returns true. Again, not possible for multiple
reasons, one of them being the inability to control when, if and how many times
we can evaluate the parameters given to a function.
Can you think of other useful tools? After all, we're barely scratched the surface while trying, and failing, to implement some of the features our language already offers. Once you're thus liberated, you start imagining and implementing abstractions that were not possible before. And all that in the language you are already using. What usually happens if these abstraction are not available? A differet language or tools get implemented that takes care of current limitations. See The Nature of Lisp for a very beautiful example from the Java world. Ok. I'll be of some help and offer you a few more impossible abstractions. But you'll have to start and think out some new ones yourself. Promise? Let's go!
function
The previous kind of abstractions had to do with the evaluation strategy of the
arguments passed to a function. At least, that's what we've highlighted in our
examples. We saw that some of the arguments can be evaluated multiple times
(while
and for
) or not evaluated at all (if
) and that this was decided
based on the values of arguments or expressions that were also passed to those
functions. We also saw how this was useful if nothing more than for the reason
than these are abstractions that any C developer already uses.
This section stresses a different kind of abstraction. This is already evident from the previous discussion but there we ignored it. This time, we will focus on what can actually pass as a function parameter. Let's take a closer look at what kind of code we can actually implement in C,
int iff(bool cond, int ___, int ______) { if (cond) ___ * 10; else ______ + 5; }
and the kind of code we can't,
int iff(bool cond, ??? ___, ??? ______) { if (cond) ___; else ______; }
I am again using question marks for types the language does not quite offer and
underscores to better reflect what we want to extract from our function to make
it more general, more abstract. In the first case, the parameters are int
's
and what we are extracting from the function are just the places where those
int
's would go. In the second example, the parameters are expressions and what
we are extracting from the function are whole expressions of the
language. Expressions with which we want to supplement our function at a later
date and evaluate them in their respective places. What is another name for one
or more expressions evaluated in order? A function. So the correct type for
these parameters, as well as in our if
, for
and while
examples from above,
are functions,
int iff(bool cond, fn exp1, fn exp2) { if (cond) exp1(); /* Call the function parameter */ else exp2(); /* Call the function parameter */ }
While the type of the parameters are indeed fn
, or function, notice that we're
actually passing expressions. What this means is that we need an ability to say
this to our language,
x + 1
and transmit the intention of creating and passing this function as a parameter, at run-time,
int random_name(int x) { return x + 1; }
Except that functions are not first-class citizens in C. There is no such type
as fn
. That is, we cannot build functions at runtime and pass them, as
objects, to other functions for their use. What we can do, is define a function
statically and pass a pointer to that function to our map function. That means
that, even if we use a function just once, we still have to define it before we
use it. More importantly, the kind of functions we can thus use are limited
since we cannot use and catch variables (more on this below). In short, we've
uncovered another limitation in our language.
Thus far we've only seen what a replacement of sequences of expressions would
mean for our ability to define new type of functions. We've correctly deduced
that these expressions are nothing more than functions of zero parameters which
we can just call at the right place. But what if we could define functions with
parameters of type function
that would themselves have parameters. Let's
consider map
,
int numbers[5] = [1, 2, 3, 4, 5]; int squared_numbers[5] = map(2 * x, numbers); /* newlist == [2, 4, 6, 8, 10] */
Map is a function that accepts two parameters, a function and an array (of ints,
in our case). The result of calling this function is a new list where each
element is replaced with the result of the function applied to the old
element. 2 * x
is a function of one parameter x
, that doubles the value of
its parameter.
We've said above that using the C's "function to pointer" type is rather limited in its ability. Besides the fact that we would need endless functions defined before their use, what if we have code to double all the elements of our list using an extra local variable,
int d = 2; int numbers[5] = [1, 2, 3, 4, 5]; int squared_numbers[5] = map(d * x, numbers); /* newlist == [1, 4, 6, 8, 10] */
But d
would not be defined in our "function pointer" parameter and we cannot
achieve what we want. What we want is a closure, a much powerful construct
directly related to the ability of building functions at run-time, which is
again not offered by C.
So the correct call to our map
function would include creating a function at
runtime, a function that would define what are its bound variables, that is, it
would have to know what needs to be replaced where, and it would also need the
ability to refer to variables around it, so to speak,
int d = 2; int squared_numbers[5] = map(fn(x){d * x}, numbers);
This, again, is not valid C code, but we're just imagining what such a feature could look like.
Even though we're so deep into Lisp territory already that we barely see the C waters now, let's have a mind stretch and think about other possibilities, maintaining, as much as we can, a semblance of C code.
int a = 10; loop(start=1; stop=10, printf("looping..\n")); loop(start=1; stop=a*10, printf("looping..\n")) loop(start=1; stop=10; step=2; index=i, printf("looping %d..\n", i))
These are all looping constructs evaluating the second parameter a given number
of times, based on the first parameter. The first parameter is an expression
where we are allowing some keywords like start
, step
, etc. What is special
here is that we would need the ability not only to add the missing pieces back
into our "unsaturated" function, as we did initially, but we would also need the
ability to parse those parameters, to extract relevant info from them and
possibly to include part of them in other places (see i
, for example). We've
seen this in the for
example, where the first parameter contains an
initialization of a variable that we need to use subsequently in the other
parameters. What we need, in short, that we again don't posses? For the ability
of the C language to parse itself.
So C lacks any means to actually handle and manipulate code within the C language itself, as we saw in the last example. Secondly, it does not have functions as objects in their own right that we can handle, manipulate, pass to functions, create and destroy at run-time. Thirdly, it does not offer a way of avoiding the passing by value of its parameters. That would help us, in special cases, to have a chance to take a look at those parameters, avoid evaluating some of them or evaluating others multiple times, transforming them, and come up, eventually, with new language constructs that the original language designers have not provided us or that are so specific to our use-case that nobody could have come up with such abstractions but us.
All these dead-ends and design choices that do not play well with each another and limit the language extensibility add up. What we're left with are just some limited forms of abstractions. After a while, we hit a wall and we wished we could express something the language has no way of expressing.
If you are thinking that you don't need this ability to implement new
functionality, stop and think what the language already offers in the case of
the if
statement, for example. For a clearer view, consider it as a function,
as we did above. What you are passing to this if
function now are really
expressions. Expressions that do not all of them get evaluated, meaning, the
language you are using has a means of avoiding evaluation of the function call
and selecting which expression to evaluate at a later time. But you do not have
the power to create such a function. Why not? Better to take a walk in the fresh
air.