Pure functions are essential to understanding functional programming and writing clean code. But what does it mean for a function to be pure?
Let's take a step back and consider for a second how most functions work. A function is nothing more than a process that takes some input, called an argument, calculates something using that argument, and returns a value as output.
When creating a function, our goal is to make it as stable and predictable as possible.
Consider, for example, the add()
function:
function add(a, b) {
return a + b;
}
add(1, 3); // 4
This function takes two arguments as input, sums them together, and returns a value as output. If you try to pass the same numbers as input, you will always get the same result. No matter how many times you call it, the result will not change if the input is the same.
In fact, while computing the output, the add()
function only cares about the argument it receives and nothing else. It does not care about other values or functions outside of it. We call this type of function deterministic.
What is a deterministic system?
A system is said to be deterministic when it produces the same output given the same starting conditions. This definition is quite simple, and you can immediately see how easy it is to predict the outcome of a deterministic system.
If a deterministic function is easy to predict, by definition, a non-deterministic function will be harder to understand and estimating its output will require more effort.
To understand a non-deterministic function, we can think of a function that generates a random number. In JavaScript, this is done using:
Math.random(); // 0.6978083732222371
Math.random(); // 0.5916538880120235
Math.random(); // 0.2321295881301506
No matter how many times you run this function, the result will vary and that is the purpose of a random number generator.
The above function is fairly simple, but another example can be a basic weather application. This application will contain an input field where you type in the name of a city or a zip code. If you try to enter the same input, for example "Berlin", every 30 minutes, you will always get a different result. In fact, after entering the name of the city as input, a function will connect via internet to a weather service, making an API request, and will return a different result every time (since the weather always changes).
As we can see from the example of Math.random
and the weather application, not all functions need to be deterministic, but if a function can be deterministic, then it should be.
Side effects
You have seen up to this point how easy it is to predict the outcome of a deterministic function, such as the add()
function. When something is easy to predict, it is also easy for us to understand its side effects. The add()
function has no side effects, since it does not produce changes that affect the outside world.
Whenever a function interacts with a variable that is outside of its scope, the function is considered to have a side effect. We can see this in action in the example below:
const originalCars = ["tesla", "audi", "ferrari"];
function addCar(carsArray, car) {
carsArray.push(car);
return carsArray;
}
const newCars = addCar(originalCars, "bmw");
console.log("original ->", originalCars); // ["tesla", "audi", "ferrari", "bmw"]
console.log("new ->", newCars); // ["tesla", "audi", "ferrari", "bmw"]
The addCar()
function will add a new car in the originalCars
array using the push()
method.
This is a perfect example of an impure function and, as we can see from the result printed on the console
, it produces side effects.
In fact, the addCar()
function is able to add a new car, but it also modifies the originalCars
array which is outside its scope.
A better solution would be to make a copy of the existing array and add a new car to that copy, instead of modifying the existing array.
This can be done by replacing the push()
method with the spread operator
, as shown below:
const originalCars = ["tesla", "audi", "ferrari"];
function addCar(carsArray, car) {
return [...carsArray, car];
}
const newCars = addCar(originalCars, "bmw");
console.log("original ->", originalCars); // ["tesla", "audi", "ferrari"]
console.log("new ->", newCars); // ["tesla", "audi", "ferrari", "bmw"]
However, when we have a function that has no side effects and is deterministic, we call that function a pure function. This type of function is easier to read, debug, and test. Essentially, dealing with pure functions is easier to reason about.
We have seen the add()
function in action, but we can look at the square function as well:
function square(num) {
return num ** 2;
}
square(2); // 4
square(2); // 4
The square()
function above will always give the same result for the same input. Also, by calculating the square of num
, it will not change the value of num
itself.
Conclusion
Having side effects is not a bad thing. In fact, if our application doesn't change anything on the outside, it won't be very useful.
If you think again about the weather app, which we mentioned above, you will realize that we want to contact the weather service (API), so we can get up-to-date information about the weather forecast. That's why our goal is to minimize the side effects as much as possible and not to eliminate them completely.
Within a system, pure functions do not depend on anything else. That is why the order in which they are called does not matter and they can even be called simultaneously. Basically, there is no need to think about time and what happened before the function is called or what will happen after.
The notion of time, however, is added with non-deterministic functions that have side effects. In this case, the time at which the function is called is important (e.g., weather app). Adding an additional dimension, time, makes the function more difficult to read, debug, and test. It is this kind of increase in complexity that is often so unnecessary.