A Lisp adventure on the calm waters of the dead C

I will use a C-like language throughout, with substantial liberties in its syntax, and I will try to answer "what if" and "how" questions regarding the implementation of some new features that actually cannot be implemented in C due to its limitations. I will examine and highlight those limitations. The scope of this exercise is to better understand Lisp and the power of the abstractions it offers over and above what most languages have, even though there will be no line of Lisp anywhere. For the full experience, you are encouraged to come up with new examples, sketch some exercises with pen and paper and think hard about these problems on your daily walks.

Were there no advantages to be reaped from these studies, beyond the gratification of an innocent curiosity, yet ought not even this to be despised; as being one accession to those few safe and harmless pleasures, which are bestowed on human race.

— David Hume - An Enquiry Concerning Human Understanding (1748)

The basic abstractions of any language, the standard function

The function, as an abstraction, is known to every developer. But what is, conceptually speaking, a function? Ignoring any implementation details like stacks and pointers and just taking a higher-level view of the concept of function, we can begin to understand its essence (example slightly adapted),

[…] people who use the word 'function' ordinarily have in mind expressions in which a number is just indicated indefinitely by the letter x, e.g.

2*x3 + x

People call x the argument, and recognize the same function again in

2*13 + 1,
2*43 + 4,
2*53 + 5,

only with different arguments, viz. 1, 4, and 5. From this we may discern that it is the common element of these expressions that contains the essential peculiarity of a function; i.e. what is present in 2*x^3 + x over and above the letter x. We could write this somewhat as follows:

2*( )3 + ( )

I am concerned to show that the argument does not belong with a function, but goes together with the function to make up a complete whole; for a function by itself must be called incomplete, in need of supplementation, or unsaturated […].

Thus, e.g., we split up the sentence 'Caesar conquered Gaul' into 'Caesar' and 'conquered Gaul'. The second part is unsaturated - It contains an empty place; only when this place is filled up with a proper name, or with an expression that replaces a proper name, does a complete sense appear. Here too I give the name 'function' to the meaning of this unsaturated part. In this case the argument is Caesar.

— Frege - Function and Concept (1891)

Bluntly, a function is a piece of code that is missing another piece of code to make it whole. To make it complete. We usually specify the missing pieces by giving them names (the arguments) so that we are not confused which is which when the time comes to replace them. We can translate the above example in C,

int f(int ___) {
    return 2 * cube(___) + ___;
}   

The essence of this function is the cube, the multiplication(*) and the addition (+). What is missing, the thing that we need to supplement it with, is a number on which to apply the cube, the multiplication and the addition operations.

Usually we're defining such functions as,

int f(int x) {
    return 2 * cube(x) + x;
}    

but notice that it does not matter what the name of the parameter is, which is nicely exemplified in our previous example, where we've used ___ to highlight the fact that what we are doing is taking away some piece of code which we intend to put in later. Thus, x could be replaced throughout the function with abc or my-new-parameter and it would not matter a bit regarding the semantics of our intention. It would still be the same function.

Let's use this function and analyze what happens during its call,

/* We can give this whatever name we want */
int a = 5;

/* Call the function (i.e. add the missing pieces) */
f(a);

/* Inspect the body of the function after the replacement. */
return 2 * cube(5) + 5;  

Notice there is no trace of a inside the body of the function after the replacement and before the call of the multiplication, the cube and the addition. Before the missing pieces are added, or before the function is supplemented, if you will, the parameter itself gets replaced as well. The call is f(a); but the result is not return 2 * cube(a) + a;. The C implementation knows that what we really want is the value of a, which is 5 and not a itself. This is the good news. The bad news is that it enforces this replacement without a right to appeal from our side. We will get to the limitation of this shortly.

If all this seems a bit pedantic, fear not. It is important for what follows. As we will see, the kind of "missing things" (the ___'s) the we can extract and talk about in the process of creating (or defining) a function has a direct relation to the kind of abstractions we can create in a programming language. They put a high limit on the kind of functions we can define. Or, more directly, it directly relates to the power of the language.

Conditional evaluation of parameters

Essences are those common and recurring qualities essential to all members of a class, conceived of abstractly, that is, in abstraction from all particular problems.

— Arthur F. Holmes - A History of Philosophy, John Locke's Theory of Ideas

These abstractions that we just saw, that involve a kind of 'replacing of values', are what most programming languages offer, including C. We can define any and how many functions we might like, but when it comes to defining functions like or similar to if and while, among others, we reach the dead C. These are all offered by the original designers of the language, specified in a standard and enacted by all the complying C implementations. They are useful to us but we cannot implement them, if we wanted to. Or, what it's more important, we cannot implement anything similar to them for our own needs.

The classic if statement

A simple example,

if (a > b) {
    result = a-b;
} else
    result = a+b; 

I've said above that this is a function, even though in C terms it is a statement. Indeed, this statement is underneath nothing more than a void function with a special syntax approved by the standard. While the standard is untouchable, we can still imagine implementing an if function (iff) that behaves like the original one,

void iff(bool condition, ??? exp1, ??? exp2) {
    if (condition) 
        exp1;
    else
        exp2;
}  

I can clearly imagine this being the C's way of doing things and it would subtract nothing from its usefulness,

iff(a > b, result = a-b, result = a+b);
iff(a > b, result = a-b, result = a+b);
                

This is not valid C code, but its a skeleton on which we want to build. It is code under development, if you will. But the intention is clear: we want our function to decide if it should evaluate result = a-b or result = a+b based on the value of a > b, but not both of them.

I've left out the type of the iff's parameters, since there is actually no C type that we can use to do what we want. We will come back to this in the next section when we will discuss in more detail what type those parameters should really be.

This looks similar to our very first example, the f function. As we saw then, before replacing the missing code inside our function body, the C language evaluates the parameters to get their value. And it is the value of these parameters and not the parameters themselves that get passed inside our function body to replace the missing code. To supplement our function and make it whole. This is pass by value and is used in C as well as in most other programming languages.

To make the above clear, let us imagine the inside of our function right after it was called and check the value of the parameters,

/* Take some values for a and b */
int a=5;
int b=10;
int result;

/* Make the call. */
iff(a > b, result = a-b, result = a+b);

/* Inspect the inside of the function after the call. */
if (false) 
    -5;
else
    15;  

To sum up, all the parameters to any function we might care to define are evaluated before we can even get a chance to inspect and decide what to do with them. This is the bad news from our previous section. In the above call, a > b, result = a-b and result = a+b are all evaluated before entering the iff body, in no particular order (the order of evaluation is not specified in the standard and it is left to each implementation to decide). We would have wanted the whole expression and we to be the ones that decide what expression is to be evaluated and when. The point to take is that we do not control the time of the evaluation nor the order of it, nor we can decide not to evaluate some of the parameters at all if we so choose.

Other useful, but impossible, conditionals

There is no reason enough to despair for not being able to implement the iff function. After all, our language already provides this conditional. The point of the exercise, though, is to extend the language to its breaking point so as to understand both it and more powerful languages better. And this, we did. Nonetheless, there are really useful conditionals out there which, owing to the same limitations that we've already discovered, cannot really be implement and are not offered by the language standard either,

cond(a > b,           "a is greater than b",
     a > c,           "a is greater than c",
     weekday(FRIDAY), "let's go home",
     else,            "nothing matched");   

Known as cond in the lisp world, the generic conditional is similar in looks with the switch statement, but each case has a brand new condition. I could imagine a C implementation of a function with an even and variable number of arguments, where each pair of arguments specifies a condition and a result returned by the cond function if that condition is true. All conditions are evaluated in order until one evaluates to true; otherwise the last one is returned. Here again we need to control the order of evaluation of the parameters and choose not to evaluate some of them, if we so wish.

Or, a small update to the standard if function would let it return the result of its last evaluation (an expression, that is), in which case we really do need to be able to implement the if function,

result = iffv(a > b, a-b, a+b);

Or an extended if that catches any errors that might be thrown during the evaluation of its first argument and sends them to its last argument (I've ignored the actual error here),

if_timeout(
    elevator_state(),
    printf("Elevator is running"),
    printf("Elevator is stopped"),           
    printf("Unknown elevator state, possibly due to bad sensor.")
); 

Any such or other variants of conditionals specifically designed for our own problem domain would greatly enhance our expressiveness in the language. But all these depend on features we found lacking in our language, for the reasons we've already seen.

Repeated evaluation of the same parameters

We have a similar story in the case of the while statement. The only difference is in the way we would like to handle the parameters to it. I will again assume that while is an implementable function, and just use it,

while(a > 0, a--, add_to_list(a));
                

What we want is to evaluate the first argument and if that is true it evaluate the second and third arguments in order, and terminate otherwise. Somewhat similar to the if, in that it selectively evaluates its arguments based on one of them. But unlike an if, a while continues this process over and over again until its first argument evaluates to false. That is, we would like to have the ability not only to selectively evaluate what parameters we like, but also the ability to evaluate some of them more than once. This, our language again does not offer. If we remember the bad news, any parameter is evaluated once before entering the function body, after which we only have access to the evaluation result and no way to repeat that evaluation again, if we so choose.

We cannot have a function, for example, that evaluates an expression a given number of times (10) as long as it returns false, with a timeout (5000 ms) between the evaluations,

try_wait_ms(
    open_lock(),
    5000,
    10,
    error("cannot open lock")
);  

As above, similar abstractions in the same spirit as a while function are imaginable but not expressible with our meager tools.

Mixed evaluation of parameters

The while function evaluates all its parameters more than once until the first one returned false. This for function is a mixed bag,

for(a = 1; a < 10; a++; add_to_list(a))

The for function evaluates its first argument only once, expects the second argument to be an expression returning a boolean and evaluates the last two as long as the second expression returns true. Again, not possible for multiple reasons, one of them being the inability to control when, if and how many times we can evaluate the parameters given to a function.

Can you think of other useful tools? After all, we're barely scratched the surface while trying, and failing, to implement some of the features our language already offers. Once you're thus liberated, you start imagining and implementing abstractions that were not possible before. And all that in the language you are already using. What usually happens if these abstraction are not available? A differet language or tools get implemented that takes care of current limitations. See The Nature of Lisp for a very beautiful example from the Java world. Ok. I'll be of some help and offer you a few more impossible abstractions. But you'll have to start and think out some new ones yourself. Promise? Let's go!

The correct type that we're looking for is a function

The previous kind of abstractions had to do with the evaluation strategy of the arguments passed to a function. At least, that's what we've highlighted in our examples. We saw that some of the arguments can be evaluated multiple times (while and for) or not evaluated at all (if) and that this was decided based on the values of arguments or expressions that were also passed to those functions. We also saw how this was useful if nothing more than for the reason than these are abstractions that any C developer already uses.

This section stresses a different kind of abstraction. This is already evident from the previous discussion but there we ignored it. This time, we will focus on what can actually pass as a function parameter. Let's take a closer look at what kind of code we can actually implement in C,

int iff(bool cond, int ___, int ______) {
    if (cond) 
        ___ * 10;
     else
        ______ + 5;
} 

and the kind of code we can't,

int iff(bool cond, ??? ___, ??? ______) {
    if (cond) 
        ___;
    else
        ______;
}   

I am again using question marks for types the language does not quite offer and underscores to better reflect what we want to extract from our function to make it more general, more abstract. In the first case, the parameters are int's and what we are extracting from the function are just the places where those int's would go. In the second example, the parameters are expressions and what we are extracting from the function are whole expressions of the language. Expressions with which we want to supplement our function at a later date and evaluate them in their respective places. What is another name for one or more expressions evaluated in order? A function. So the correct type for these parameters, as well as in our if, for and while examples from above, are functions,

int iff(bool cond, fn exp1, fn exp2) {
    if (cond) 
        exp1();  /* Call the function parameter */
    else
        exp2();  /* Call the function parameter */
}  

While the type of the parameters are indeed fn, or function, notice that we're actually passing expressions. What this means is that we need an ability to say this to our language,

x + 1

and transmit the intention of creating and passing this function as a parameter, at run-time,

int random_name(int x) {
    return x + 1;
}   

Except that functions are not first-class citizens in C. There is no such type as fn. That is, we cannot build functions at runtime and pass them, as objects, to other functions for their use. What we can do, is define a function statically and pass a pointer to that function to our map function. That means that, even if we use a function just once, we still have to define it before we use it. More importantly, the kind of functions we can thus use are limited since we cannot use and catch variables (more on this below). In short, we've uncovered another limitation in our language.

A different kind of abstraction: map

Thus far we've only seen what a replacement of sequences of expressions would mean for our ability to define new type of functions. We've correctly deduced that these expressions are nothing more than functions of zero parameters which we can just call at the right place. But what if we could define functions with parameters of type function that would themselves have parameters. Let's consider map,

int numbers[5] = [1, 2, 3, 4, 5];
int squared_numbers[5] = map(2 * x, numbers);
/* newlist == [2, 4, 6, 8, 10] */ 

Map is a function that accepts two parameters, a function and an array (of ints, in our case). The result of calling this function is a new list where each element is replaced with the result of the function applied to the old element. 2 * x is a function of one parameter x, that doubles the value of its parameter.

We've said above that using the C's "function to pointer" type is rather limited in its ability. Besides the fact that we would need endless functions defined before their use, what if we have code to double all the elements of our list using an extra local variable,

int d = 2;
int numbers[5] = [1, 2, 3, 4, 5];
int squared_numbers[5] = map(d * x, numbers);
/* newlist == [1, 4, 6, 8, 10] */ 

But d would not be defined in our "function pointer" parameter and we cannot achieve what we want. What we want is a closure, a much powerful construct directly related to the ability of building functions at run-time, which is again not offered by C.

So the correct call to our map function would include creating a function at runtime, a function that would define what are its bound variables, that is, it would have to know what needs to be replaced where, and it would also need the ability to refer to variables around it, so to speak,

int d = 2;
int squared_numbers[5] = map(fn(x){d * x}, numbers); 

This, again, is not valid C code, but we're just imagining what such a feature could look like.

Stretching things even further

Even though we're so deep into Lisp territory already that we barely see the C waters now, let's have a mind stretch and think about other possibilities, maintaining, as much as we can, a semblance of C code.

int a = 10;

loop(start=1; stop=10,                  printf("looping..\n"));
loop(start=1; stop=a*10,                printf("looping..\n"))
loop(start=1; stop=10; step=2; index=i, printf("looping %d..\n", i)) 

These are all looping constructs evaluating the second parameter a given number of times, based on the first parameter. The first parameter is an expression where we are allowing some keywords like start, step, etc. What is special here is that we would need the ability not only to add the missing pieces back into our "unsaturated" function, as we did initially, but we would also need the ability to parse those parameters, to extract relevant info from them and possibly to include part of them in other places (see i, for example). We've seen this in the for example, where the first parameter contains an initialization of a variable that we need to use subsequently in the other parameters. What we need, in short, that we again don't posses? For the ability of the C language to parse itself.

Conclusion

So C lacks any means to actually handle and manipulate code within the C language itself, as we saw in the last example. Secondly, it does not have functions as objects in their own right that we can handle, manipulate, pass to functions, create and destroy at run-time. Thirdly, it does not offer a way of avoiding the passing by value of its parameters. That would help us, in special cases, to have a chance to take a look at those parameters, avoid evaluating some of them or evaluating others multiple times, transforming them, and come up, eventually, with new language constructs that the original language designers have not provided us or that are so specific to our use-case that nobody could have come up with such abstractions but us.

All these dead-ends and design choices that do not play well with each another and limit the language extensibility add up. What we're left with are just some limited forms of abstractions. After a while, we hit a wall and we wished we could express something the language has no way of expressing.

If you are thinking that you don't need this ability to implement new functionality, stop and think what the language already offers in the case of the if statement, for example. For a clearer view, consider it as a function, as we did above. What you are passing to this if function now are really expressions. Expressions that do not all of them get evaluated, meaning, the language you are using has a means of avoiding evaluation of the function call and selecting which expression to evaluate at a later time. But you do not have the power to create such a function. Why not? Better to take a walk in the fresh air.