OOP vs FP: How To Validate Input

Rex Ng
May 6, 2018

I was working with a toy example today and thought it would be fun to write a comparison on how to validate user input in object-oriented programming (using C♯) and functional programming (using F♯).

Let’s dive right into it.

The problem domain

Given a simple JSON configuration file with three fields:

{
"Url": "",
"Name": "",
"Age": 42
}

How can we validate the config so that consumers can use it without having to validate it themselves?

To scope down the problem to fit a blog post, we will make a few assumptions:

  • The JSON configuration file always exists
  • The JSON configuration file always contains these three fields only
  • The source code that constructs the Config model is not available to you, i.e. you must rely on the compiler and the API exposed to you

The only validation we want to perform is on the values of those three fields. Specifically, we want to enforce the following:

  1. Url and Name must not be null or empty
  2. Url must contain a valid HTTP/HTTPS URL
  3. Age must be greater than 18 (imagine we are building an application for adults :))

First up, let’s see how we can do so in OOP.

OOP: Good old exceptions

If you are familiar with any mainstream OOP language, one obvious choice is to throw exceptions that halt the application when the JSON configuration is invalid.
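A minimal sketch of this approach. It uses F♯ class syntax rather than C♯, and the specific exception types and messages are assumptions made for illustration:

```fsharp
open System

// Hypothetical Config class: the constructor throws on the first rule that fails.
type Config(url: string, name: string, age: int) =
    do
        if String.IsNullOrEmpty url then
            raise (ArgumentNullException("url"))
        if String.IsNullOrEmpty name then
            raise (ArgumentNullException("name"))
        // Uri.TryCreate returns (success flag, parsed Uri) when called from F#.
        let ok, uri = Uri.TryCreate(url, UriKind.Absolute)
        if not ok || (uri.Scheme <> Uri.UriSchemeHttp && uri.Scheme <> Uri.UriSchemeHttps) then
            raise (ArgumentException("Url must be a valid HTTP/HTTPS URL", "url"))
        // Rule 3 as stated above: Age must be greater than 18.
        if age <= 18 then
            raise (ArgumentOutOfRangeException("age", "Age must be greater than 18"))

    member this.Url = url
    member this.Name = name
    member this.Age = age
```

Nothing in the constructor's signature tells the caller that it may throw.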

This should be the bread and butter of every OO programmer and would not need much introduction. There are a few problems with this approach though:

  1. No errors are caught at compile time. Callers invoking new Config(null, null, 0) have no way of knowing before runtime that this might throw. (This of course does not hold if you have access to the source code of Config.)
  2. Errors do not aggregate. If the caller invokes new Config("http://www.google.com", null, 10) and gets an ArgumentNullException (because name is null), fixing the name and calling the constructor a second time causes another exception (because age < 18). You have to fix the age parameter and run the code a third time to get a valid Config instance. Why can’t I get all the errors on my first call so that I can fix them in one go?

OOP: Error aggregation in constructor

A second OOP approach would be to have some sort of error reporting when constructing the Config model:
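A rough sketch of what such a model could look like, again in F♯ class syntax rather than C♯. The Errors and IsValid members are named in this post; the error messages and the isHttpUrl helper are assumptions:

```fsharp
open System

// Hypothetical Config class: the constructor never throws; it records every failed rule.
type Config(url: string, name: string, age: int) =
    // Assumed helper: true only for absolute http/https URLs.
    let isHttpUrl (s: string) =
        let ok, uri = Uri.TryCreate(s, UriKind.Absolute)
        ok && (uri.Scheme = Uri.UriSchemeHttp || uri.Scheme = Uri.UriSchemeHttps)

    // Every rule is checked, so all failures are collected in one pass.
    let errors =
        [ if String.IsNullOrEmpty url then yield "Url must not be null or empty"
          elif not (isHttpUrl url) then yield "Url must be a valid HTTP/HTTPS URL"
          if String.IsNullOrEmpty name then yield "Name must not be null or empty"
          if age <= 18 then yield "Age must be greater than 18" ]

    member this.Url = url
    member this.Name = name
    member this.Age = age
    member this.Errors = errors
    member this.IsValid = List.isEmpty errors
```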

This might not look like the typical OOP code you see, but it does give you the benefit that calling new Config(null, null, 0) will always succeed. Examining the Errors property also gives you all errors aggregated so that you can fix the problematic parameters in one go. However, this also poses some problems:

  1. It relies on the caller to examine the IsValid or Errors property after the constructor finishes. Since the constructor now always returns a Config instance without throwing, callers must examine IsValid or Errors to check for validity. This is extremely error-prone.
  2. Temporal coupling between the Errors property and the other properties of the Config model. Consumers of the Config model must always examine IsValid before reading config.Name or config.Url because either or both may be null, leading to unclear trust boundaries and error checking spread all over the system.

OOP: Return null config model

The last OOP validation I can think of is to return null when trying to construct an invalid Config instance. Since constructors cannot have return statements, we must use a factory method:
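A minimal sketch of such a factory method, written in F♯ class syntax rather than C♯. The Create name and the isHttpUrl helper are assumptions, and F♯ needs the AllowNullLiteral attribute before a class can be null at all:

```fsharp
open System

// Hypothetical Config class. [<AllowNullLiteral>] is required because F# classes
// reject null by default; the private constructor forces callers through Create.
[<AllowNullLiteral>]
type Config private (url: string, name: string, age: int) =
    member this.Url = url
    member this.Name = name
    member this.Age = age

    // Returns null when any rule fails, so the reason for the failure is lost.
    static member Create(url: string, name: string, age: int) : Config =
        let isHttpUrl (s: string) =
            let ok, uri = Uri.TryCreate(s, UriKind.Absolute)
            ok && (uri.Scheme = Uri.UriSchemeHttp || uri.Scheme = Uri.UriSchemeHttps)
        let valid =
            not (String.IsNullOrEmpty url)
            && not (String.IsNullOrEmpty name)
            && isHttpUrl url
            && age > 18
        if valid then Config(url, name, age) else null
```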

The constructor is marked as private to deliberately force callers to use the factory method instead. I personally think this is the worst OOP approach:

  1. Error information is lost because a single value (null) represents every failure. Callers will not know which parameter caused the failure and will have to guess.
  2. Still no help from the compiler if null is returned from the factory method. A NullReferenceException will still occur at runtime if you forget to do a null check after calling the factory method.

FP: A config model, maybe?

One advantage of using a functional language like F♯ is that you get very expressive generic types built into the core library itself. One of these types is called Option, which is how F♯ represents the ‘absence’ of data.

This is especially important because, unlike C♯, F♯ does not allow null by default (except when interoperating with other .NET languages). The only way to represent missing data is to use this wrapper type, Option.

Since I assume my readers are unfamiliar with FP, I have redefined the Option type here for your reference. The focus, however, is the makeConfig function: it returns an Option<Config> instead of a plain Config.
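A minimal sketch of an Option-returning makeConfig. It uses the built-in Option type (its definition is shown in a comment) instead of redefining it, and the record layout and the isHttpUrl helper are assumptions:

```fsharp
open System

// For reference, the core library defines Option roughly as:
//   type Option<'T> = None | Some of 'T

type Config = { Url: string; Name: string; Age: int }

// Assumed helper: true only for absolute http/https URLs.
let isHttpUrl (s: string) =
    let ok, uri = Uri.TryCreate(s, UriKind.Absolute)
    ok && (uri.Scheme = Uri.UriSchemeHttp || uri.Scheme = Uri.UriSchemeHttps)

// Returns Some config when every rule passes, None otherwise.
let makeConfig (url: string) (name: string) (age: int) : Option<Config> =
    if String.IsNullOrEmpty url || String.IsNullOrEmpty name then None
    elif not (isHttpUrl url) then None
    elif age <= 18 then None
    else Some { Url = url; Name = name; Age = age }
```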

The difference now is that you can no longer access the members of Config directly after calling makeConfig:

Contrast the above with the following where the Config model is constructed directly without calling the makeConfig function:

You can see that IntelliSense gives different results because the two pieces of code return different types: Option<Config> from the first and Config from the second.

Why is it a good thing? Because now the compiler is able to catch the error for us! In order to consume the Option<Config> returned, we now have to perform an additional step to extract the underlying config out of the Option type. This forces the caller to be aware of the possible failure!

The caller has to perform ‘pattern-matching’ in order to extract the underlying config from the return value. Otherwise, the code will not compile.

Does not compile when Age is accessed directly from Option
Pattern-matching by the caller
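The two cases can be sketched roughly as follows. The makeConfig here is a simplified stand-in that only checks the age rule, and describe is a hypothetical caller:

```fsharp
type Config = { Url: string; Name: string; Age: int }

// Simplified stand-in for makeConfig: only the age rule is checked here.
let makeConfig (url: string) (name: string) (age: int) : Config option =
    if age > 18 then Some { Url = url; Name = name; Age = age } else None

let result = makeConfig "http://www.google.com" "Rex" 42

// result.Age   // does not compile: Option<Config> has no Age member

// The caller must handle both cases to reach the Config inside.
let describe (maybeConfig: Config option) =
    match maybeConfig with
    | Some config -> sprintf "%s is %d years old" config.Name config.Age
    | None -> "invalid config"
```

Calling describe result yields "Rex is 42 years old", while describe (makeConfig "" "" 0) yields "invalid config".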

However, the Option type has its own limitations when it comes to validation:

  1. We have lost all error information from the caller’s perspective. We do not know what caused the makeConfig function to return None.
  2. Errors do not aggregate either. We do not know whether one, two, or more errors occurred.

So FP seems to be worse than the OOP approaches shown above? What gives?

Do not worry, we will explore stronger types in the following FP techniques: the Result type and ‘a better Result type’.

FP: A better option: the Result type

Apart from Option, F♯ offers another, stronger type called Result:

Don’t worry if the type definition looks confusing. Just focus on the makeConfig function. The point here is to demonstrate how FP handles validation, not to dive deep into the intricate details of monads.
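A minimal sketch of a Result-returning makeConfig. It relies on the Result type built into F♯ 4.1 and later (shown in a comment) rather than a hand-written definition; the error messages and the isHttpUrl helper are assumptions:

```fsharp
open System

// For reference, the core library (F# 4.1 and later) defines Result roughly as:
//   type Result<'T, 'TError> = Ok of 'T | Error of 'TError

type Config = { Url: string; Name: string; Age: int }

// Assumed helper: true only for absolute http/https URLs.
let isHttpUrl (s: string) =
    let ok, uri = Uri.TryCreate(s, UriKind.Absolute)
    ok && (uri.Scheme = Uri.UriSchemeHttp || uri.Scheme = Uri.UriSchemeHttps)

// Stops at the first rule that fails and reports it in the Error case.
let makeConfig (url: string) (name: string) (age: int) : Result<Config, string> =
    if String.IsNullOrEmpty url then Error "Url must not be null or empty"
    elif String.IsNullOrEmpty name then Error "Name must not be null or empty"
    elif not (isHttpUrl url) then Error "Url must be a valid HTTP/HTTPS URL"
    elif age <= 18 then Error "Age must be greater than 18"
    else Ok { Url = url; Name = name; Age = age }
```

With these assumed messages, makeConfig "http://www.google.com" "" 10 evaluates to Error "Name must not be null or empty": only the first failure is reported, even though the age is also invalid.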

Now things get a bit interesting. We get a major improvement over using Option: the errors are back! And they pinpoint exactly what went wrong with the config, as shown in a simple interactive session playing with the code:

Error from calling the makeConfig function

We get all the benefits of using a wrapper type like Option, and changing Option to Result lets us clearly show what went wrong with the input parameters of makeConfig.

However, we still cannot aggregate errors into a single place. Can we do better? Yes!

FP: Monoidal validation

Let’s first take a step back and split the validation into separate helper functions, one sub-validation each:

We now have three separate functions, one for each field in the Config model: validateName, validateAge, and validateUrl. Nothing too special, but it will be easier to understand after we make the next change.
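The three helpers might look roughly like this. The function names come from the text above; the bodies and error messages are assumptions:

```fsharp
open System

// Each helper checks one field and returns either the value or a single error message.
let validateName (name: string) : Result<string, string> =
    if String.IsNullOrEmpty name then Error "Name must not be null or empty"
    else Ok name

let validateAge (age: int) : Result<int, string> =
    if age > 18 then Ok age else Error "Age must be greater than 18"

let validateUrl (url: string) : Result<string, string> =
    if String.IsNullOrEmpty url then
        Error "Url must not be null or empty"
    else
        let ok, uri = Uri.TryCreate(url, UriKind.Absolute)
        if ok && (uri.Scheme = Uri.UriSchemeHttp || uri.Scheme = Uri.UriSchemeHttps) then Ok url
        else Error "Url must be a valid HTTP/HTTPS URL"
```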

We will now try to solve the problem where errors are not aggregated properly when there is more than one. How? By storing the errors in a list, of course!

Notice the sub-validation functions now all return a Result<T, string list> instead of a Result<T, string>.

This allows us to concatenate, and thus aggregate, all the errors that occurred in the helper functions. We can then modify the makeConfig function to pattern-match on all possible cases and return the aggregated errors to the caller.

The pattern-matching looks so intimidating because we have to cater for every possible combination of where the errors occur, i.e. whether they arise from url and/or name and/or age.
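A sketch of the list-based helpers and the exhaustive match, assuming the same rules and illustrative error messages as before:

```fsharp
open System

type Config = { Url: string; Name: string; Age: int }

// Each helper now wraps its error message in a list so that errors can be concatenated.
let validateName (name: string) : Result<string, string list> =
    if String.IsNullOrEmpty name then Error [ "Name must not be null or empty" ]
    else Ok name

let validateAge (age: int) : Result<int, string list> =
    if age > 18 then Ok age else Error [ "Age must be greater than 18" ]

let validateUrl (url: string) : Result<string, string list> =
    if String.IsNullOrEmpty url then
        Error [ "Url must not be null or empty" ]
    else
        let ok, uri = Uri.TryCreate(url, UriKind.Absolute)
        if ok && (uri.Scheme = Uri.UriSchemeHttp || uri.Scheme = Uri.UriSchemeHttps) then Ok url
        else Error [ "Url must be a valid HTTP/HTTPS URL" ]

// All eight combinations of success and failure are spelled out by hand.
let makeConfig (url: string) (name: string) (age: int) : Result<Config, string list> =
    match validateUrl url, validateName name, validateAge age with
    | Ok u, Ok n, Ok a -> Ok { Url = u; Name = n; Age = a }
    | Error e1, Ok _, Ok _ -> Error e1
    | Ok _, Error e2, Ok _ -> Error e2
    | Ok _, Ok _, Error e3 -> Error e3
    | Error e1, Error e2, Ok _ -> Error (e1 @ e2)
    | Error e1, Ok _, Error e3 -> Error (e1 @ e3)
    | Ok _, Error e2, Error e3 -> Error (e2 @ e3)
    | Error e1, Error e2, Error e3 -> Error (e1 @ e2 @ e3)
```

With these assumed messages, makeConfig "" "" 10 returns an Error carrying all three messages, in the order url, name, age.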

And finally! We got what we wanted:

To recap what is achieved:

  1. Type-safe error checking. Callers of makeConfig cannot ignore a possible validation failure because the compiler prevents them from directly accessing the Name, Age, and Url fields.
  2. Meaningful and domain-specific error messages. Since validation functions are small and target specific fields, we can build very specific error messages that guide the caller to what is considered valid input.
  3. Aggregated error reporting. Callers of makeConfig get all the aggregated validation errors in one go. Subsequent errors are just ‘appended’ to the end of the original error list.

Some final words

All FP techniques I demonstrated here seem to be jumping through multiple hoops to get some very simple validation done. Is it worth it?

In my humble opinion, yes, it is, because we do not need to write most of this boilerplate pattern-matching code by hand. The pattern has already been captured by a concept called applicative functors. If this seems like a very scary term, yes, it is. Fortunately, for most software engineers this is already a solved problem, and we can stand on the shoulders of giants and reuse well-known solutions:

Most of the heavy pattern-matching noise is gone by using two additional custom operators <!> and <*>. These two operators are well-known in the FP world as fmap and apply respectively. They come from the concept of applicative functors mentioned above.
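A sketch of how the two operators could be defined for Result with a list of errors, and how makeConfig shrinks as a consequence. The operator symbols come from the text above; the createConfig helper and the error messages are assumptions:

```fsharp
open System

type Config = { Url: string; Name: string; Age: int }

// Curried constructor so that it can be applied one argument at a time.
let createConfig url name age = { Url = url; Name = name; Age = age }

// fmap: apply a plain function to the value inside an Ok.
let (<!>) f x = Result.map f x

// apply: apply a wrapped function to a wrapped value, concatenating errors from both sides.
let (<*>) fResult xResult =
    match fResult, xResult with
    | Ok f, Ok x -> Ok (f x)
    | Error e, Ok _ -> Error e
    | Ok _, Error e -> Error e
    | Error e1, Error e2 -> Error (e1 @ e2)

let validateName (name: string) =
    if String.IsNullOrEmpty name then Error [ "Name must not be null or empty" ] else Ok name

let validateAge (age: int) =
    if age > 18 then Ok age else Error [ "Age must be greater than 18" ]

let validateUrl (url: string) =
    if String.IsNullOrEmpty url then
        Error [ "Url must not be null or empty" ]
    else
        let ok, uri = Uri.TryCreate(url, UriKind.Absolute)
        if ok && (uri.Scheme = Uri.UriSchemeHttp || uri.Scheme = Uri.UriSchemeHttps) then Ok url
        else Error [ "Url must be a valid HTTP/HTTPS URL" ]

// The eight-case match collapses into a single expression.
let makeConfig url name age =
    createConfig <!> validateUrl url <*> validateName name <*> validateAge age
```

The error-combining logic now lives in one place, the definition of <*>, so adding a fourth field would not add any new match cases to makeConfig.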

Compare how the same function is called in F♯ with ‘normal’ parameters and with parameters wrapped in Result:
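A small side-by-side sketch, assuming a curried createConfig function and the operator definitions described earlier (repeated here so the snippet stands alone):

```fsharp
type Config = { Url: string; Name: string; Age: int }

let createConfig url name age = { Url = url; Name = name; Age = age }

let (<!>) f x = Result.map f x

let (<*>) fResult xResult =
    match fResult, xResult with
    | Ok f, Ok x -> Ok (f x)
    | Error e, Ok _ -> Error e
    | Ok _, Error e -> Error e
    | Error e1, Error e2 -> Error (e1 @ e2)

// 'Normal' parameters: plain function application.
let plain = createConfig "http://www.google.com" "Rex" 42

// Parameters wrapped in Result: the same shape, with the operators in between.
let wrapped : Result<Config, string list> =
    createConfig <!> Ok "http://www.google.com" <*> Ok "Rex" <*> Ok 42
```

Here wrapped evaluates to Ok plain, since every argument is an Ok.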

You can see that having these operators makes the transition extremely clean. In fact, this pattern is so common in FP that Haskell provides these operators for every instance of its Applicative typeclass.

I hope this article is useful to those who want to explore the FP paradigm through a meaningful, worked example.
