Programming languages are not only tools for commanding computers but also tools for communicating and structuring our own thinking, and each one shapes how we reason about certain problems.

In this post I reflect on a few characteristics of Lisp (and specifically Racket) that capture my mind again and again.

Syntax vs data

When I write, say, Python (which is a great language!), I tend to have this mental image of writing a certain syntax that’s then parsed into the actual logical structure (the abstract syntax tree or AST) of the program by the interpreter / compiler, similar to how HTML is translated into the DOM in memory. Of course, I rarely consciously think about that distinction, but when switching to/from Racket I notice this nuance. The syntax is just an arbitrary set of rules to write out the program in this specific form, and alternative syntaxes for Python exist.

In contrast, when I write Racket, conceptually it feels like I’m typing out a data structure that represents the algorithm(s) of the program (very close to the AST). Obviously there are still many translation steps between the input I provide and the result (a running program) inside the runtime; but the relationship between the source code and the actual program is somewhat special.

The famous parentheses in a Lisp program map directly to its logical tree-like structure.

Let’s take a very primitive example:

Python:

1 + 2 + 3 ** 2 * 4

Racket:

(+ 1 2 (* (expt 3 2) 4))

A few things to note in the Racket example:

  1. The computation is directly dictated by the structure of the expression, delimited by pairs of (...), and not some other factor like operator precedence - i.e., the way it is written expresses how it is computed;
  2. Most expressions follow the pattern: (<OPERATION> [<ARGUMENT> ...]);
  3. Arithmetic operators like +, -, expt are not special syntax but regular functions: the expression (+ 1 2 3) means apply the function + to arguments 1, 2, 3.

Thus, there is a symmetry in Racket between writing (+ 1 2 3) and, say, (build-path "/" "tmp" "somefolder" "somefile.json") - both are expressions applying functions to values, and returning values.

Also, just like any function, + can be passed to another function as an argument. The following expression:

(map (curry + 5) '(5 10 15))

results in '(10 15 20). The single quote before ( is a shorthand for (quote ...) and means that what follows is not evaluated as a function call, but returned as a list instead (we will talk about this more below).
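To make both points a bit more tangible, here is a small sketch of my own (the particular values are only for illustration). It shows that the quote shorthand and the long form produce the same list, and that + can be handed around like any other function:

```racket
#lang racket

;; '(...) is shorthand for (quote (...)): both produce the same list.
(equal? '(5 10 15) (quote (5 10 15)))   ; #t

;; + is an ordinary function value, so it can be passed to other functions.
(apply + '(1 2 3))                      ; 6
(foldl + 0 '(1 2 3))                    ; 6
(map (curry + 5) '(5 10 15))            ; '(10 15 20)
```

Nothing here needs special operator syntax: apply, foldl and map simply receive + as one of their arguments.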

Code as a tree

Let’s look at another example, say, checking whether a given number is a Fibonacci number:

Python:

from math import isqrt

def is_fibonacci(num):
    if num >= 0 and float(num).is_integer():
        num = int(num)  # isqrt() accepts only integers, so turn 5.0 into 5
        intermediate = 5 * num ** 2  # save some repetition
        if (is_square(intermediate + 4)
            or is_square(intermediate - 4)):
            return True
    return False

def is_square(num):
    return num == isqrt(num) ** 2

Racket:

(define (fibonacci? num)
  (cond                                        ; conditional
    [(and (integer? num) (>= num 0))           ; clause
     (define intermediate (* 5 (expt num 2)))
     (or (square? (+ intermediate 4))
         (square? (- intermediate 4)))]
    [else #f]))                                ; clause

(define (square? num)
  (equal? num (expt (integer-sqrt num) 2)))

Again, in the Racket program nested parentheses define the “shape” of the computation. Pairs of [...] are syntactically equivalent to (...) and used only by convention.

Conditionals like cond (a generalized if), and, or, etc., follow the same pattern: (<OPERATION> <INPUTS>), and they also produce a value. However, each conditional expects its inputs in a specific shape; for instance, cond takes one or more clauses mapping tests to values.

Since conditionals are expressions that evaluate to values, there is no need for explicit return - the evaluated expression is the value:

  • (and ...) tests if none of its subexpressions are false (#f) and returns the value of the last one or #f otherwise;
  • (or ...) returns the value of the first non-false subexpression, or #f if there is none;
  • and cond returns whatever the last expression of the matching clause returns.

Functions likewise return the value of the last expression in their body.
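The rules above are easy to check by hand. The following is a minimal sketch; describe-sign is a made-up helper for this illustration, not something from the examples above:

```racket
#lang racket

(and 1 2 3)        ; 3   (no subexpression is #f, so the last value is returned)
(and 1 #f 3)       ; #f
(or #f "a" "b")    ; "a" (the first non-false value)
(or #f #f)         ; #f

;; cond yields the value of the last expression in the matching clause,
;; and a function yields the value of the last expression in its body.
(define (describe-sign n)   ; hypothetical helper, for illustration only
  (cond
    [(positive? n) 'positive]
    [(negative? n) 'negative]
    [else 'zero]))

(describe-sign -7)          ; 'negative
```

Note that and and or return the actual values they encounter, not just #t or #f, which is why they compose so naturally with other expressions.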

Overall, the program looks more like a tree of expressions that produce values to give the final answer, rather than a chain of actions or commands. This mental attitude is something I admire in Lisp.

Code as data, data as code

One last bit I’d like to touch on here. The source code in Lisp is actually a data structure, not just a particularly shaped long string.

For instance, with this command (executed in a Unix shell):

$ echo '(+ 1 2 3)' | racket -e '(read)'
'(+ 1 2 3)

…I have just parsed a chunk of source code "(+ 1 2 3)" into data - a list consisting of the symbol + and the numbers 1, 2 and 3 - but have not executed it, just printed it back as data again. This small data structure can also be executed:

$ echo '(+ 1 2 3)' | racket -e '(eval (read))' # this is what happens normally
6

Notice how, conceptually, it’s not the source code that is executed, but the data structure - a list of items - that is read from the source code.

Reading is a separate, distinct operation from executing.
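The same two steps can be sketched inside a Racket program rather than in the shell. This is only an illustration under a couple of assumptions of mine: the source text comes from a string port instead of standard input, and, because the code sits in a module, eval is given an explicit namespace created with make-base-namespace:

```racket
#lang racket

;; Step 1: read. Text goes in, a plain list comes out; nothing is computed.
(define datum (read (open-input-string "(+ 1 2 3)")))

(list? datum)            ; #t
(first datum)            ; '+  (a symbol, not yet the addition function)
(rest datum)             ; '(1 2 3)

;; Step 2: eval. Inside a module, eval needs an explicit namespace;
;; make-base-namespace supplies the racket/base bindings, including +.
(define ns (make-base-namespace))
(eval datum ns)          ; 6
```

Between the two steps, datum is an ordinary list that can be inspected with first and rest like any other.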

Nothing stops you from doing something with the data before executing it. For instance, running the following program would produce the familiar "hello world":

(define my-program
  ;; the single quote before the list means `read` that thing
  ;; as data but do not `eval` it yet.
  '(string-append "world" " " "hello"))

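;; An explicit namespace lets `eval` resolve string-append even when
;; this code lives in a module rather than at the REPL.
(eval (cons (first my-program) (reverse (rest my-program)))
      (make-base-namespace))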

In the above example I reversed all but the first element of my-program before executing it.

Now, this is obviously not what most Lisp programmers would normally do. But the fact that your program is a data structure changes how you think about it.

You can also read things you don’t plan to execute at all. For instance, you can point the reading machinery of the interpreter at a configuration file without ever executing what’s inside, just using it as passive data.
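As a rough sketch of that idea, the snippet below reads a made-up configuration (the keys host, port and debug are hypothetical, and I use a string where a real program would open a file) and then looks values up in the resulting list:

```racket
#lang racket

;; Hypothetical configuration text; in practice it would come from a file
;; opened with, say, open-input-file or call-with-input-file.
(define config-text "((host . \"localhost\") (port . 8080) (debug . #f))")

;; read turns the text into an association list; nothing is evaluated.
(define config (read (open-input-string config-text)))

(cdr (assq 'host config))   ; "localhost"
(cdr (assq 'port config))   ; 8080
```

No eval appears anywhere: the configuration stays plain data from start to finish.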

Nevertheless, writing programs that transform themselves before being executed is very common in the Racket world, although this is done in a much cleaner and more declarative way using macros.

But why, you ask? Because programmers love abstractions, and macros provide a way to abstract out patterns of code that emerge in this or that specific domain (or across domains!).

I’ll end this post with a final example: a quite popular instance of macro use in Racket, providing a nice way to “pipe” values sequentially through a chain of operations:

#lang racket  ; selecting the main Racket dialect

(require threading)

(~> "world hello"
    (string-split " ")
    (reverse)
    (string-join " ")
    (string-append "💜"))

The (~> ...) fragment actually translates to:

(string-append
 (string-join
  (reverse
   (string-split "world hello" " "))
  " ")
 "💜")

The ~> form is not a language feature or special built-in syntax - it is a macro coming from a third-party package. Macros are Lisp functions that transform Lisp code before it is executed.
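To give a feel for how little is needed, here is a deliberately simplified sketch of a piping macro. The name thread-first is my own invention, and this is not how the threading package implements ~>; it only handles steps written as parenthesized calls:

```racket
#lang racket

;; thread-first is a hypothetical, simplified stand-in for ~>;
;; it is not how the threading package is implemented.
(define-syntax thread-first
  (syntax-rules ()
    ;; No steps left: the expression is just the value.
    [(_ value) value]
    ;; Insert the value as the first argument of the next step,
    ;; then keep threading the result through the remaining steps.
    [(_ value (f arg ...) more ...)
     (thread-first (f value arg ...) more ...)]))

(thread-first "world hello"
              (string-split " ")
              (reverse)
              (string-join " "))   ; "hello world"
```

The rewriting happens before the program runs: by the time anything is executed, the thread-first form has already been replaced by the nested calls.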

As you can see, the language allows great creativity and flexibility in applying itself (code) to itself (data), so to speak.

(Recursion in general is also practiced quite often by Lispers, but that is a topic for a dedicated blog post).