Sunday, April 13, 2014

There is always a single point of failure

Everyone these days is (or at least wants to appear) concerned about system availability. And, as we all know, the best way to ensure availability is through redundancy. So most of the architecture documents that have crossed my desk in recent years are careful to point out the absence of the dreaded single point of failure. But appearances can be deceiving. For example, try to spot the single point of failure in the following classic 3-tier application:

A classic 3-tier application

Come on! It’s easy – it’s right there in red:

A classic 3-tier application in real life

If an airplane falls from the sky or the nearby river floods the datacenter, your entire application is gone. OK, no problem: we can design around that by spreading the application across two datacenters, but then the connectivity between them becomes the weak link as the CAP theorem rears its ugly head. And on and on it goes…
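To put rough numbers on this, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption made for illustration: the availability figures are invented, the helper names `parallel` and `series` are hypothetical, and failures are treated as independent, which real outages often are not. It only shows that redundancy inside the tiers cannot lift the application above the availability of the one datacenter they all share.

```python
def parallel(availability: float, replicas: int) -> float:
    """Availability of a redundant group: it is up if at least one replica is up.

    Assumes replicas fail independently of each other.
    """
    return 1.0 - (1.0 - availability) ** replicas


def series(*availabilities: float) -> float:
    """Availability of a chain: it is up only if every link is up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


# Hypothetical figures, chosen only for illustration.
NODE = 0.99         # a single server in any tier
DATACENTER = 0.999  # the building, power and network everything sits in

# Three tiers (web, application, database), each with two redundant nodes.
tiers = series(parallel(NODE, 2), parallel(NODE, 2), parallel(NODE, 2))

# The datacenter is in series with all three tiers.
whole_app = series(tiers, DATACENTER)

# Piling on replicas does not help: the datacenter still caps the result.
many_replicas = series(
    parallel(NODE, 10), parallel(NODE, 10), parallel(NODE, 10), DATACENTER
)

print(f"tiers alone:       {tiers:.6f}")
print(f"with datacenter:   {whole_app:.6f}")
print(f"ten replicas each: {many_replicas:.6f}")
```

Under these assumed numbers the redundant tiers on their own look better than the datacenter, yet the application as a whole comes out worse than the datacenter alone, however many replicas are added.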

The moral of the story is: beware of architects who claim their system does not have a single point of failure. They were probably just not paying attention. There is always a single point of failure (and sometimes more than one). We need to make sure that we have identified all of them and are ready to live with the risk they pose. Otherwise we go back to the drawing board and design a solution to mitigate them, which will likely introduce another one or, if we are not careful, more than one.
