Baby steps to functional programming—lists

A typical list sum function written in Haskell

I have always been asked what is a good starting point in learning functional programming. The answer I always have is — lists. They are built-in for almost all functional programming (FP) languages and has the most gentle learning curve for beginners.

Lists in object-oriented programming (OOP) — in the form of arrays, list classes — act very differently than the one commonly used in FP. In C or JavaScript, you might commonly see code like:

Here you can see although the list reference is immutable, its content is not. Every call to Add or push mutates the list and add more elements to it.

Let us look at the very definition of the list datatype in Haskell. It is literally one line and 25 characters (space included).

To FP beginners, these cryptic symbols might look very daunting, and yet this is exactly why FP languages are succinct and expressive at the same time. I will go slow and try to break down the definition into digestible pieces:

  1. The data keyword lets us define a new data type. In this case, the name of the data type is [].
  2. The a next to data [] is a type variable. It tells us that [] is a parametrically polymorphic type. In common OOP languages, this is more usually called a generic type (i.e. the list does not care what it stores, whether it is integers, strings, or objects, but they have to be of the same type). This differs with dynamically-typed languages like JavaScript, where arrays can contain elements of mixed types.
  3. The = symbol separates the type and data constructors: everything left of the = tells us about how the list type is constructed and everything right of the = tells us how list data is constructed.
  4. There are two data constructors of the [] type: the first of which is []. Yes, it has the same name as the type but this represents the value of an empty list.
  5. The | symbol separates the two data constructors.
  6. a : [a] is the second data constructor of the [] type, and it is a recursive data constructor. It says that you can construct a list by combining a value of type a, the : operator (also commonly known as the cons operator), and another list of type a.

Point 6 is where most beginners struggle with, so perhaps let us examine a few examples. With Glasgow Haskell Compiler Interactive (GHCi), it is extremely easy to run interactive examples to let the compiler verify the types and correctness for us.

From now on, you can simply copy everything on the right of the lambda prompt (λ>) and hit enter. GHCi will return a value to you.

λ> []
[]

Congratulations, you have just constructed a list using a data constructor! This is actually the empty list mentioned in point 5 above. But wait, what is its type? You can use the :t command to help you:

λ> :t []
[] :: [a]

This actually tells you two pieces of information:

  1. [] :: [a] means ‘[] is of type [a] ’, or ‘empty list is of the type: list of a
  2. Why the a? Because Haskell is a non-strict language, the compiler cannot infer a concrete type that fits in the execution environment when you type [] in GHCi. That is why GHCi did not bother inferring the type and leave it as an unknown type variable a.

An empty list is not very useful, isn’t it? How can we construct a list with the three integer elements 1, 2, and 3 in it just like the examples in C and JavaScript?

λ> 1 : 2 : 3 : []
[1,2,3]

This corresponds to how we make use of the recursive data constructor shown in point 6 above. To make the thought process a bit clearer, we can break it down into baby steps.

If you are familiar with basic computer science data structures, you might find the cons operator has a very close resemblance to how linked lists are constructed, where : kind of resembling the next pointer.

You can also examine the type of the list we just constructed:

λ> :t 1 : 2 : 3 : []
1 : 2 : 3 : [] :: Num a => [a]

Woh, what is happening there? What is that scary Num a => [a]? To put it simply, this is how Haskell represents number-like types when no concrete floating-point arithmetic is required. You can mentally replace Num a => [a] with [Int] if it is distracting you.

In order to make the syntax a bit terser, there is also syntactic sugar to construct the same list:

λ> [1, 2, 3]
[1, 2, 3] :: Num a => [a]
λ> [1, 2, 3] = 1 : 2 : 3 : []
True

The second command simply tests the equality between [1, 2, 3] and 1 : 2 : 3 : [] and GHCi will happily return True, signifying they do both represent the same list.

As a simple exercise, let me show you how to calculate the sum of its elements in an integer list in Haskell. Let us call the function mySum:

Line by line explanation:

  1. This is an optional type signature of mySum. It is not strictly necessary but it is considered good practice. It means the mySum function takes an [Int] (integer list) as an input argument and returns an Int (integer).
  2. This line pattern matches against an empty list ([]). It is almost similar to an OOP if statement (if the input is an empty list…). The right side of the = sign is the return value 0. This is saying ‘if the input is an empty list, return 0’. This forms the base case in the recursion.
  3. This line does another pattern matching against a non-empty list, where x is the first element in the list (also commonly known as the head) and xs is the remainder of the list (also commonly known as the tail). The right side of the = sign does two things: summing x with the recursive sum of the remainder of the list xs, by calling mySum.

It might be easier to break down the thought process as well. Let us take mySum [1,2,3] as an example:

-- Initial input
mySum [1,2,3]
-- [1,2,3] is not empty, thus does not match against the mySum [] pattern on line 2
mySum (1:[2,3]) = 1 + mySum [2,3]
-- [2,3] is not empty, thus does not match against the mySum [] pattern on line 2
mySum (1:[2,3]) = 1 + (2 + mySum [3])
-- [3] is not empty, thus does not match against the mySum [] pattern on line 2
mySum (1:[2,3]) = 1 + (2 + (3 + mySum []))
-- [] is empty, it now matches against the mySum [] pattern on line 2. mySum [] returns 0
mySum (1:[2,3]) = 1 + (2 + (3 + 0))
-- Thus mySum [1,2,3] = 6

This is a very common pattern in traversing a list using recursion in FP languages instead of loops in OOP. Thus this should be repetitively practised until it becomes second nature.

Congratulations! We have gone through examples of how to construct and manipulate lists in Haskell through two toy examples.

I hope this baby step walkthrough sheds some light on how FP views and uses lists. Please give me a clap if you like this article!

As a side note, I’d assume readers of this article are still uncomfortable with functional programming in general. Thus I recommend deliberate and constant practice until you have transcended from unconsciously incompetent (i.e. you do not know what you do not know) to a level of conscious incompetence (i.e. you know what you do not know).

Programmer | Watch enthusiast