Core Developer @ Hudson River Trading
On 7/6/2021, 6:46:07 PM
Return to blog
This post is about Chapter 3.1-3 of SICP. Ironically (in a post about the meaning of functional programming), these sections are about mutation and state, which are more commonly associated with non-functional paradigms.
The purpose of this post is to describe what functional programming is. In the small group discussions we hold for this independent study, Prof. S1 and I both had some revelations about what it means according to Abelson and Sussman, as opposed to what we had originally thought.
Prof. S had worked with Scheme in the past, and his impression was that functional programming primarily dealt with (linked-)list processing -- as opposed to dealing with more concrete data structures like structs and arrays.2
My previous intuition was that functional programming was all about dealing with functions as first-class objects, like lambda calculus. This means a lot of clean code and aggregate operations like
As we went through the first two chapters3, we were always pulling parallels to grogramming constructs that we see in other languages, especially object-oriented languages. This is almost definitely because we are most familiar with these languages, like C++ and Java. By the end of Chapter 2, we see things that look quite a bit like objects: we design primitive "type systems" and consider inheritance patterns, and we see object-like structures: functions that bind state to a variable. These objects have "constructors" and "selectors" ("getters"). There are no "mutators" ("setters") mentioned before Chapter 3, but that is okay -- we can simply construct a new "object" if we want to change the value of a "field," and all is well.
At multiple times throughout the independent study thus far, Prof. S has said something along the lines of: "I see how [templates/polymorphism/lambdas] is powerful, but I still feel like it's a lot easier to do this using [C++/Java]." And I agree, but I also think that it's not the point of this book -- rather, SICP is about considering the design of programs and the design of programming languages, in order to become a better programmer. I think the question keeps coming up because the stark minimialism of Scheme still hasn't registered as "normal" with Prof. S, and thus we want to relate to concepts in more familiar languages.
But up to Chapter 3, the comparisons we made were implicitly between a functional programming language and an imperative or object-oriented one. The authors are cunning and deliberate4: they sneak in important concepts by means of example and connect names to those concepts only after the examples have practically spoken for themselves.5 Most noticeably, the "object" still felt very different from the standard idea of objects in OOP languages, perhaps because of the lack of mutators, and we don't have advanced data structures such as hashtables that require mutation of memory cells6. Actually, we haven't memory cells at all -- maybe only some hand-wavy mentions of "pointers." What gives?
The missing piece is found in Chapter 3. Suddenly, we are handed the
set! function (and the similar
set-cdr!). The first or second concept we learn in a conventional programming class are variables. We learn how to set a variable to one variable, and then change its value to another. We learn that
x = x+1 is not an equation but rather than an assignment (or we learn the walrus operator, which has the same effect), and we accept it immediately. It makes sense, right?
x stores a value, and maybe that value can change over time as we mutate it.
(At this point, we should be very clear what we mean by assignment versus mutation. Assignment means the binding of a value to a variable, whereas mutation means changing the value that a variable is bound to. In Scheme, we can think of every variable as a pointer to some memory location. Assignment or reassignment means setting the pointer to point to some value in memory, and this is what
let do. Mutation means to change the value in the memory cell that the variable is pointing to (via a pointer dereference), and this is what
Without assignment, we've been able to survive two interesting chapters which build increasingly complex abstractions. We are given the
let operators, but these only allow you to bind a name once in a scope7. It's as if we lived in the world of ANSI C89, where declarations are either top-level declarations or must be located at the top of a scope, and all variables were declared
const8. And so far, the only composite data structures we've worked with are the lambda (whose closure can store values) and
cons, as if we had a generic
Pair, ?> type in Java as the only basic type. With these two basic building blocks, we've been able to write binary search trees, an N-queens solver, paintings, a type system, and pseudo-objects. Everything that can be achieved with mutation can be achieved with immutable bindings9 10 11.
Despite the fact that we still can achieve the same things without mutation, mutation feels very useful. With mutation we can be satisfied with making a "true" object, where we change the values of variables by mutating their memory cells, as the authors show in the example about a bank account. We can now have a closed form for a PRNG, which is somewhat impossible in a purely functional form, because we have to store the previous state.12 As I've mentioned in the footnotes multiple times by now, we also get all the array-based data structures, which are extremely useful because of the way RAM is linear and constant-time random-access.
We've just described what we gain -- and this is intuitive, because we already deal with mutation all the time and see that it is a very practical concept. But what do we lose? What happens if we don't allow mutation, if we only allow for immutable bindings?13 The concrete examples of why immutability is helpful are the following:
'()). These symbols can be thought of as variables or values.
These practical points above are specific examples of general considerations that I understand to be the design principles of immutable design:
==would mean the same as
.equals()). This makes it easier to reason about our code. A value (a memory cell) has no concept of time: it always holds the same value when someone accesses it. This also has to do with eliminating race conditions. When we encounter streams in section 3.5, we show that we can emulate "time" by having a history of immutable states, and can act on this like a sequence.
Okay, I've talked long enough about immutability, but what does this have to do with functional programming? And it turns out that this is functional programming. Abelson and Sussman provide us the following definition of FP:
Programming without any use of assignments, as we did throughout the first two chapters of this book, is accordingly known as functional programming.
(Here, they use the word "assignments" as I used the word "mutation". They may call the use of
let "binding" rather than "assignment.")
This definition surprised both Prof. S and me.
Compare this to the description provided by Wikipedia:
In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that map values to other values, rather than a sequence of imperative statements which update the running state of the program.
In functional programming, functions are treated as first-class citizens, meaning that they can be bound to names (including local identifiers), passed as arguments, and returned from other functions, just as any other data type can. This allows programs to be written in a declarative and composable style, where small functions are combined in a modular manner.
Functional programming is sometimes treated as synonymous with purely functional programming, a subset of functional programming which treats all functions as deterministic mathematical functions, or pure functions. When a pure function is called with some given arguments, it will always return the same result, and cannot be affected by any mutable state or other side effects. This is in contrast with impure procedures, common in imperative programming, which can have side effects (such as modifying the program's state or taking input from a user).
This description (of non-pure FP) is somewhat closer to what I had in mind: composing functions and focused on expressions more than statements. It's probably closer to what Prof. S had in mind too, as we often map certain datatypes more than others, with
list being the most basic type other than
cons. This description also makes it sound like SICP's definition is closer to a definition of purely functional programming.
I would argue that, given these definitions, immutability plays as much a role in functional programming than the composition of functions, the namesake of FP. I guess you could say that enforcing pure functions leads to immutability, and immutability leads to pure functions16. And all of the other properties of FP are due to these core concepts.
The antithesis of this would be, of course, imperative programming, which we can now define as code that mutates state. This idea comes bundled naturally with the idea of statements and highly linear code, which is often easier for people to reason about and for computers to use, at the cost of FP's idealism.
Now that many modern languages support some degree of functional programming, it simply feels like another option, and it's easy to take the imperative route instead, especially if it's the more familiar one.
The final footnote in section 3.1 mentions that:
In view of [problems related to careless statement ordering in imperative programming], it is ironic that introductory programming is most often taught in a highly imperative style. This may be a vestige of a belief ... that programs that call procedures must inherently be less efficient than programs that perform assignments. ... Alternatively it may reflect a view that step-by-step assignment is easier for beginners to visualize than procedure call. Whatever the reason, it often saddles beginning programmers with "should I set this variable before or after that one" concerns that can complicate programming and obscure the important ideas.
at which point Prof. S asked me if I thought that this was true. Is functional programming (e.g., in Scheme) better and/or easier for a beginner than imperative programming?
My response was that I don't think it was easier for me at first, if only because I had a background in procedural and OOP programming before reading this book and the syntax felt alien to me. Now, it feels almost as natural as writing code in most languages, provided that equivalent libraries are provided and I don't have to deal directly with memory pointers or allocation. Perhaps it might be easier on mathematics students, because Scheme resembles lambda calculus and algebraic forms.
But even if I can't reset my programming knowledge and start learning programming from scratch using this book, I definitely appreciate why the first two chapters were taught in a functional style. It seems that the authors are determined to show that FP is more fundamental to computation -- building up a complex computation from primitives -- and has its benefits over imperative programming that are often missing in the common introductory course to programming. And only when its readers have a foundation of functional paradigm is imperative programming revealed, in such a way that the reader can appreciate its glory and understand its flaws.
1. This is the convention I used in my previous blog when referring to real people, and I will stick to it.
2. Somewhat related: When I proposed this independent study, his understanding was that this book uses Lisp because it teaches functional programming, and suggested that we use a more modern functional language like Python or Haskell. My understanding was that this book was not about functional programming at all, but rather that it chooses Lisp as a lens for studying program design for its minimalism and metaprogramming capabilities, and it happens to be a mostly-functional language. It took a while to convince him that the independent study was not about Lisp or functional programming, but about the shape of programs.
3. Chapter 1 deals with procedural abstractions, and Chapter 2 with data abstractions.
4. Schemers, if you will.
5. While this is perhaps the best part about SICP, this is also one of my criticisms -- for the unaware reader, it may seem like the book is throwing you examples before you're ready to deal with them. Sometimes the tools needed are not given to you! The authors sometimes don't give implementations of crucial functions, such as
get in 2.4 (these require material from Chapter 3), some of the concurrency primitives in 3.4 (this involves non-standard Scheme to support asm instructions), and
make-delay in 3.5 (this requires macros). This is great for the inquisitive reader with some background in programming, but it requires a decent amount of prior knowledge. I believe this is the most common criticism of the book.
6. This is a common criticism of functional languages. See this.
7. In a REPL or a non-standard program,
define may allow redefinitions in the same scope. But in a standards-compliant program, that's not allowed.
8. It would be very interesting and probably very ugly to code like this in C.
9. If you allow enough nested scopes.
10. This is a laymen's way of saying that lambda calculus and Turing machines are equivalent models of computation.
11. Efficiency may not be equivalent without mutation for certain data structures and algorithms, such as in arrays and array-based structures (e.g., heaps, hashtables) or the Sieve of Eratosthenes. See the earlier footnote. But we can still emulate them in functional programming. The authors are very deliberate not to mention hashtables at all (at least up to Chapter 4). They mention the Sieve of Eratosthenes in 3.5 when mentioning streams, which is still a functional programming concept, but their implementation is very inefficient. I'll mention this in a later post about
12. Actually, we can achieve this using streams (section 3.5), which actually stores the entire history of a variable's state. In other words, the stream is an infinite list of immutable states. More to come in a future post about streams.
13. This is supposed to be the climax of the post, but I can't think of a climactic way to display everything, so it's gonna go in a boring list. These are my key takeaways from these sections and are, in general, what I believe to be the strengths of immutability and functional programming, but it is of course not exhaustive.
14. There is a more precise definition of referential transparency than what I present here, which is just for the general idea.
15. However, there are also performance tradeoffs as mentioned many times before, but SICP suggests that the widespread belief is more pessimistic about functional programming performance than the reality. Some Scheme flavors and other functional languages like OCaml or Haskell can achieve very close to C speeds.
16. In a purely functional language like Haskell, pure functions are enforced. For example, in Haskell, non-pure functions due to I/O are "marked" with the
17. Haskell is the only purely functional language I know of without having to look up purely functional languages, which says something of their rarity.
© Copyright 2023 Jonathan Lam