Developers – Make Everything Idempotent

“Idempotent” – probably one of my favorite words. What does it mean?

Oxford Dictionary:

Denoting an element of a set which is unchanged in value when multiplied or otherwise operated on by itself.

https://en.oxforddictionaries.com/definition/idempotent

The definition above is a bit cryptic and reflects the origin of the word as it relates to mathematics. For developers, we can simplify the definition to the following:

No matter how many times you execute a distinct operation, it will achieve the same result.
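As a small illustration of both definitions, here is a minimal Python sketch. The variable names are made up for the example: `abs` matches the mathematical sense (applying it to its own result changes nothing), adding to a set matches the developer sense, and appending to a list is a counterexample.

```python
# Mathematical sense: applying the operation to its own result changes nothing.
once = abs(-5)
twice = abs(abs(-5))

# Developer sense: adding the same member to a set three times
# leaves the set in the same state as adding it once.
tags = set()
for _ in range(3):
    tags.add("urgent")

# Counterexample: every append changes the list again.
log = []
for _ in range(3):
    log.append("urgent")

print(once == twice)  # True
print(len(tags))      # 1
print(len(log))       # 3
```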

There’s a strong case for creating RESTful APIs that are idempotent. Create, read, update, and delete operations within APIs should be safe to execute repeatedly, in the same fashion, without ill effects. For example, I should be able to call create twice on a record and not end up with two records. The create function should use some unique identifier to check whether the record already exists, and should not create it again if it does.
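A minimal sketch of such a create function might look like the following. It assumes an in-memory store and a caller-supplied unique identifier; `RecordStore`, `record_id`, and the sample order are hypothetical names for illustration, not part of any particular framework.

```python
class RecordStore:
    """In-memory stand-in for a table keyed by a unique identifier (assumption)."""

    def __init__(self):
        self._records = {}

    def create(self, record_id, data):
        """Create the record only if it is missing; return (record, created)."""
        existing = self._records.get(record_id)
        if existing is not None:
            # Already there: return it instead of creating a duplicate.
            return existing, False
        record = dict(data, id=record_id)
        self._records[record_id] = record
        return record, True

    def count(self):
        return len(self._records)


store = RecordStore()
first, created_first = store.create("order-1001", {"total": 25})
second, created_second = store.create("order-1001", {"total": 25})

print(created_first, created_second, store.count())  # True False 1
```

Returning a `created` flag alongside the record is one possible design choice: the caller can tell a fresh insert from a repeat without the repeat being treated as an error.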

The transient nature of connections over the web is another reason APIs should be idempotent. Sometimes we aren’t sure whether an operation completed successfully and we need to retry it. It is critical that the retry causes no ill effects if the operation did in fact complete the first time.
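The retry scenario can be sketched as follows. This is a simulation under stated assumptions, not a real client: `FlakyPaymentService`, `charge_with_retry`, and the `request_key` parameter are hypothetical, and the service is rigged so that the first call does its work and then loses the response.

```python
class FlakyPaymentService:
    """Hypothetical service: the first response is lost after the work is done."""

    def __init__(self):
        self.charges = {}  # request_key -> amount
        self.calls = 0

    def charge(self, request_key, amount):
        self.calls += 1
        # Idempotent: a key that was already processed is not charged again.
        if request_key not in self.charges:
            self.charges[request_key] = amount
        if self.calls == 1:
            # Simulate the connection dropping after the charge was recorded.
            raise ConnectionError("response lost")
        return self.charges[request_key]


def charge_with_retry(service, request_key, amount, attempts=3):
    """Retry on connection errors, reusing the same key on every attempt."""
    last_error = None
    for _ in range(attempts):
        try:
            return service.charge(request_key, amount)
        except ConnectionError as error:
            last_error = error
    raise last_error


service = FlakyPaymentService()
result = charge_with_retry(service, "req-42", 100)

print(result, service.calls, len(service.charges))  # 100 2 1
```

The caller cannot tell whether the first attempt succeeded, but because the service keys its work on `request_key`, the second attempt is harmless.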

I’m here to argue that not just RESTful APIs, but all of our programming, can benefit from thinking about idempotency. It protects us, and the other developers who interact with our code, from mistakes. Let’s take a look at a couple of examples:

Batch Imports:

Bad:

Good:

Batch import implementations are one of the places where I’ve most often seen the need for this type of thought process. While the code in the first example is extremely simple, it means that if the same records get imported more than once we may end up with duplicates. We can’t assume the database level will have constraints that handle this, so if duplicates are not our intention we should handle them in code.

Notice also that the second example gives us more control over the process, and in doing so is clearer about the intentions of this import: we do not want to create duplicate records, and we may or may not want to update the existing records.
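The contrast between the two approaches can be sketched roughly like this. It is an illustration under assumptions rather than a definitive implementation: the table is a plain list, `sku` is a hypothetical unique key, and `import_naive`, `import_idempotent`, and `update_existing` are names invented for the example.

```python
def import_naive(table, rows):
    """Appends every incoming row, so re-running the import duplicates records."""
    for row in rows:
        table.append(dict(row))


def import_idempotent(table, rows, update_existing=False):
    """Insert rows whose 'sku' is new; optionally update rows that already exist."""
    by_sku = {record["sku"]: record for record in table}
    for row in rows:
        existing = by_sku.get(row["sku"])
        if existing is None:
            record = dict(row)
            table.append(record)
            by_sku[record["sku"]] = record
        elif update_existing:
            existing.update(row)
        # Otherwise the record exists and is deliberately left untouched.


rows = [{"sku": "A1", "price": 10}, {"sku": "B2", "price": 20}]

naive_table = []
import_naive(naive_table, rows)
import_naive(naive_table, rows)  # same batch imported twice

safe_table = []
import_idempotent(safe_table, rows)
import_idempotent(safe_table, rows)  # second run is a no-op

print(len(naive_table))  # 4
print(len(safe_table))   # 2
```

The `update_existing` flag makes the choice between skipping and updating an explicit decision at the call site instead of an accident of the implementation.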

Updating a Sale Status:

Bad:

Good:

Without knowing what the “updateSaleStatus” function in this example does, we should always protect ourselves in a situation like this by first checking whether the state change is necessary. This protects us not only from unwanted side effects of updateSaleStatus today (we can’t assume it is as idempotent as our own code), but also in the future, in case updateSaleStatus changes to operate differently.
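A minimal sketch of that guard is shown below. The body of `update_sale_status` is a hypothetical stand-in for updateSaleStatus (its real behavior is unknown here); it is written to have a side effect on every call so the effect of the check is visible. `mark_sale` and the `calls` list are likewise invented for the example.

```python
calls = []


def update_sale_status(sale, status):
    """Hypothetical stand-in for updateSaleStatus with a side effect per call."""
    sale["status"] = status
    calls.append((sale["id"], status))  # e.g. a notification or audit entry


def mark_sale(sale, status):
    """Trigger the update only when the status would actually change."""
    if sale["status"] == status:
        return False
    update_sale_status(sale, status)
    return True


sale = {"id": 7, "status": "pending"}
first = mark_sale(sale, "complete")
second = mark_sale(sale, "complete")  # no state change needed, so nothing runs

print(first, second, len(calls))  # True False 1
```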

There’s a theme in these two examples: ensure that a state change is necessary before you initiate it, and initiate state changes in a safe, repeatable manner (like the possible update in the import example).

While the examples above may seem like common sense when you look at them side by side, too often I’ve seen developers interested in just getting a data point created or updated the first time rather than thinking about what happens if the same code runs on the same records again. There’s also sometimes the thought that the brevity of the code in the bad examples makes them better, which is not the case in this instance.

While this design principle may not apply to every single scenario, in my experience it applies to most. If we keep it front and center in our coding and testing process, we end up with safer code that is clearer about its intentions.