I recently came across this fairly basic, run-of-the-mill interview question that involves writing algorithms.
Given an array of elements that are positve numbers, write an algorithm to find the second largest element
I personally loathe interview questions involving algorithms because they never reflect what am I expected to actually do on the job, but I reckon it would be fun to use this question to compare the difference between C♯, a popular OOP language, and Haskell, an FP language.
If I were to write it in C♯, I would have probably written it like this:
This seems reasonably well-implemented:
- It is generic: it works on all number types such as
decimaland all types of collections, as
IEnumerable<T>is the base type of all collections
- It is reasonably performant: it does not do too many unnecessary allocations without sacrificing readability
- It is reasonably readable: I took advantage of the latest and greatest language features when I think they improve comprehension, e.g. the use of
ValueTupleon line 6
Now contrast the way if I were to write it in Haskell:
Even if we discount the syntactic noise like braces and access modifiers in the C♯ version, the Haskell version still looks way more terse and concise than the C♯ version. Now granted Haskell was designed to be highly adapted to writing number algorithms like this, I cannot help but feel every keystroke I made on the Haskell version was more purposeful than the tedious ceremony I was required to comply with in order to make the C♯ version compile.
The Haskell version is also reasonably well-implemented (pardon my basic knowledge of Haskell that may limit my ability to judge what is well-implemented):
- It is generic: it works for all number types as well due to the
Numtypeclass constraint. It is even more generic than the C♯ version because this function would work on any
Foldablestructure, lists and arrays are just some examples of
- It is reasonably performant: its usage of
foldl'means it is strict and tail-recursive, thus avoiding stack overflows and stack frame allocation
- It is reasonably readable: to untrained eyes, this might look like alien language but I can guarantee this is idiomatic Haskell and it is written professionally
But what makes the Haskell version so elegant and yet so scary to non-FP programmers? I would like to try and break it down in this article.
Deep Dive into Haskell
The first line is an import statement that imports the
Data.Foldable module. It is synonymous with how C♯’s
using statements bring in namespaces. Nothing too difficult here.
The third line declares the type signature of the
secondLargest function. It is not strictly necessary because Haskell’s compiler would have been able to figure that out itself but it is nevertheless good practice for human readers to look at the signature and have a vague idea of what the function is trying to achieve.
secondLargest :: (Num a, Ord a, Foldable t) => t a -> a
This means we are declaring a function that has two generic parameters,
a, the input to the function is of the type
t a and the output to the function is of the type
a. The two generic parameters have different kinds.
thas a kind of
* -> *, which means it is a type constructor that takes another type to form a proper type. You can think of it as a
List<>(open generic type) waiting for a generic argument to form a fully constructed type
ahas a kind of
*, which means it is already a proper type
There are also a total of three generic constraints applied to
a, specified before the fat arrow,
tmust have a
Foldableinstance, which resembles C♯ generic constraints to require this generic argument to implement interfaces/abstract classes. This requires
tto implement either
amust have both a
Numinstance and a
Ordinstance, meaning the generic argument used must be a ‘number’ and must be able to be ‘ordered’
Let us move on to the actual implementation starting from line 4.
secondLargest = snd . foldl' ...
We are immediately greeted with a peculiarity. I mentioned in the previous section that this function has one input of type
t a, and one output of type
a, but line 4 clearly contradicts my statement! Nothing was between
secondLargest and the
= sign which is where our parameters should have been placed!
This, in fact, is a very common way to write Haskell and is called the point-free style. We achieved this by using the
. operator between
. operator, a.k.a the function composition operator, is the bread-and-butter of functional programming. It is defined as:
It composes two functions
g, returning a third function that takes
x as the input that first runs through
f, returning the result.
Exactly the same is happening with
secondLargest: we compose
foldl' ... with
snd, returning a function that takes a new parameter. That is how we are able to say
secondLargest is a function that takes one parameter without explicitly putting that parameter in its definition.
On to the implementation
The Haskell version of the algorithm actually looks very similar to its C♯ counterpart. They both keep track of the largest and second-largest elements found so far while traversing the list.
In Haskell’s case, the traversing is done via
foldl' using a two-element tuple
(0, 0), with the left element representing the largest element so far and the right element as the second-largest so far, as the starting state. The result is then fed to
snd, which is a built-in function that returns the right element in the tuple, in our case, the second largest element found.
The Left Fold
First, let us examine the type signature of
foldl' :: (Foldable t) => (b -> a -> b) -> b -> t a -> b
It requires three arguments:
- A function of type
b -> a -> b, which is called every time when the structure is traversed, taking the current state, of type
b, and the currently traversed element, of type
a, returning the next state, of type
b. We passed a helper function named
secondLargest'here to make it clearer how the next state is computed
- A value of type
b, which is the starting state. We passed
(0, 0)as discussed above
Foldablevalue of type
t a. This is not explicitly passed in our implementation because we are leveraging currying and function composition to avoid explicitly mentioning this argument.
The fully eta-expanded version can be written as but it is usually good practice to eta-reduce when possible.
The folding function passed into
foldl' as the first argument is
secondLargest'. It is very common for Haskllers to suffix an inner helper function with an apostrophe so that is what I have done.
secondLargest' is also used before it is defined. This is allowed in Haskell with the use of the
where keyword, where the definition of
secondLargest' is defined after the keyword.
This helper function is defined using pattern matching and guard clauses. On line 5, we pattern match on the two-element tuple that is passed in (as the state). The elements are aptly named
largest (for the left element), and
second (for the right element) to allow easy differentiation.
I have then employed guard clauses on lines 6–8 to implement the actual algorithm of determining the next state:
- On line 6, if the current element
xis larger than the largest element found so far, obviously
xis now the largest element and the original
second, so return
(x, largest)as the next state
- On line 7, else if the current element
xis larger than the second-largest element found so far, then we keep
xas the second-largest found so far, thus we return
(largest, x)as the next state
- On line 8, otherwise,
xis smaller or equal to
second, no update of state is necessary, we return the original
foldl' is complete, we now have a tuple containing the actual largest and second-largest elements found in the
Foldable. We can finally move on to the last bit.
The Final Stretch
This part is easy. We have our tuple and we feed it to a function named
It simply returns the second element of a two-element tuple, i.e. the second-largest element from the
Foldable, completing the algorithm.
Phew, that was a lot of words to explain 7 lines of Haskell code! I hope you like my explanation and you have learnt yourself some Haskell today!