Jonathan Lam

Core Developer @ Hudson River Trading


Blog

SICP notes: types of polymorphism

On 6/13/2021, 8:29:55 AM

Return to blog


When writing the previous post, I realized that I didn't really understand what polymorphism or generic procedures are, and I was a little afraid to use the terms incorrectly. Here I will try to consolidate what I understand to be their definitions, and various subtypes of polymorphism, with my own examples.

The problem I have with understanding these terms is that most definitions immediately jump to OOP as an example. While the OOP structure certainly has textbook examples in polymorphism, it is a bad definition for those who want to understand the general concept (i.e., in the spirit of SICP).

First things first -- what is a data type (or simply "type")? We have to distinguish between an abstract data type (ADT) and a concrete data type. An ADT is a description of an interface of a particular kind of data, such as a stack or queue, but it can be implemented in several different ways by different concrete data types. In this post (as in the previous post2), we'll potentially need to differentiate between multiple implementations of the same ADT. Note also that we are using the term "interface" very loosely -- unlike the very concrete meaning it has in Java and TypeScript and some other OOP languages, I mean it in the most general sense.

Using Wikipedia's definition3 polymorphism is "the provision of a single interface to entities of different types or the use of a single symbol to represent multiple different types." These might seem like two very different things, but we can invoke the wand of the Wizard Book and say that the boundary between procedures and data is somewhat fuzzy. Both provide an "interface"4, and if these interfaces can handle multiple types, then the procedure or datum is polymorphic.

This is an extremely broad definition and it applies to many things. We can characterize polymorphism in several ways5:

To illustrate some of these examples in a less-conventional context, we can use the examples from SICP, which use both ad-hoc and parametric (generic) polymorphism (although the authors do not use these names). Consider the following two functions from section 2.4:

(define (cplx-real-part z)
  ;; get the real part of a number
  (cond ((cplx-rectangular? z) 
	 (cplx-real-part-rectangular (contents z)))
	((cplx-polar? z)
	 (cplx-real-part-polar (contents z)))
	(else (error 'cplx-real-part "Unknown type" z))))

(define (complex+ z1 z2)
  ;; add two complex numbers
  (make-from-real-imag (+ (cplx-real-part z1)
			  (cplx-real-part z2))
		       (+ (cplx-imag-part z1)
			  (cplx-imag-part z2))))

cplx-real-part uses ad-hoc dynamic single-dispatch polymorphism -- we define separate implementations to handle different types. In a language with overloading support, this would most likely be compile-time polymorphism. complex+, on the other hand, is a generic function. Because z1 and z2 both implement the cplx-real-part and cplx-imag-part interface, we don't have to write multiple versions of complex+. Note that this is closely tied to the notion of abstraction barriers, as Abelson and Sussman call it: because we define the lower-level interface cplx-real-part and cplx-real-part, higher-level interfaces can use this interface generically. However, we do need to implement these for each underlying representation, and that is where the ad-hoc polymorphism comes in.

Later on, in section 2.5 when Abelson and Sussman extend this system to another level of abstraction by implementing the arithmetic system, we see another level of ad-hoc polymorphism: each algebraic field must register their implementation of the add, subtract, multiply, and divide in the dispatch table. There, we implement a very simple version of multiple dispatch and the authors suggest a coercion and a hierarchy, which might introduce a subtype polymorphism system.

One last important note to make about SICP's suggested hierarchy system: unlike typical hierarchical systems where we look to a superclass for an implementation, in the most general sense we can also look to subclasses for an implementation. This would require narrowing (downcasting) the type to that of one of its subclasses (or perhaps it may not be considered a hierarchy at all, but some other type of relation) which is typically an error in hierarchical systems, but it is an alternative design12.


Miscellaneous topics

Polymorphism was the main confusion point for me when looking up details of type systems, but here are some other notes that stemmed from writing the previous post:


Footnotes

1. I.e., sets of objects that can be added, subtracted, multiplied, and divided commutatively. They used complex numbers, reals, rationals, polynomials, and rational functions.

2. We had to differentiate between different implementations of the complex number interface, and between several implementations of the algebraic field1 interface.

3. Most of these definitions will come from Wikipedia, but I'll try to explain the definitions in my own words.

4. The interface of a procedure is the set of arguments it takes and the values it returns, and the interface of a datum is the set of arguments it[s constructors and methods] take(s) and the values it[s selectors and methods] return(s).

5. Of course, this list is neither absolute nor complete, and you may find many other characterizations online. This is only my own list based on the concepts that were useful for understanding the type concepts introduced in SICP 2.5. I also don't have a name for each characterization.

6. SICP section 2.4

7. This last point is a little dependent on implementation: in C++, templates are instantiated separately for each class it is compiled on, which is a pain point for the programmer because the template depends on the applied types. This breaks separate compilation. However, in Java, we can have separate compilation because the parameter type is enforced by restrictions on the type parameter. See more here, and see the section in this post about duck typing.

8. As far as I know, this is the way subclassing is implemented in other languages as well.

9. This only applies when you have multiple arguments, which wouldn't happen with data polymorphism you consider the message-passing model with multiple operations ("messages"). But I don't have an example of this in mind.

10. I say other because methods are usually implemented with an implicit this or self parameter, on which the polymorphism applies.

11. This has multiple problems but is a simple illustration of multiple dispatch.

12. See is-a relationships. I believe this is the typical semantic interpretation of subclassing, and in its perspective downcasting doesn't make sense. Downcasting also probably violates the Liskov substitution principle, but I don't know much about that.


© Copyright 2023 Jonathan Lam