Updated: Nov 13, 2020
When contending with risk, it is important to understand the nature of uncertainty, the root cause of risk. There are several types of uncertainty, but the two most critical are epistemic and aleatory uncertainty.
Epistemic uncertainty arises from a lack of knowledge. The risks it creates are often characterized in terms of their likelihood of occurrence and the severity of their impact. We can predict outcomes with some level of confidence, which facilitates decisions about "buying down" these risks by reducing the likelihood, mitigating the effects, or both. We call these reducible risks.
Aleatory uncertainty has to do with chance. Its effects can also be characterized using probabilities; however, specific outcomes cannot be predicted with any certainty. This kind of uncertainty is considered irreducible, although its effects can be mitigated by introducing margins such as extra resources, time, and capacity. What we cannot do is improve the accuracy of our predictions.
For risk management to be effective, it must adequately contend with both kinds of uncertainty. However, in highly regulated, high-risk industries, it is aleatory uncertainty that is foremost on everyone's mind, as it presents a significant source of risk in the form of low-occurrence, high-impact events, often called unknown unknowns or "black swans". These cannot be predicted and belong to the domain of randomness, chaos, complexity, and disorder: aleatory uncertainty.
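The difference between the two kinds of uncertainty can be illustrated with a small simulation. This is only a sketch under stated assumptions: the incident rate of 0.1, the function names, and the sample sizes are all hypothetical and chosen for illustration. It shows that gathering more observations narrows our estimate of the underlying rate (the epistemic part), while the variability of any single future outcome (the aleatory part) stays the same.

```python
import random
import statistics


def observed_rate(true_rate, n_obs, rng):
    """Fraction of n_obs simulated periods in which an incident occurred."""
    hits = sum(1 for _ in range(n_obs) if rng.random() < true_rate)
    return hits / n_obs


def estimate_spread(true_rate, n_obs, trials, rng):
    """Epistemic part: how much the estimated rate varies from study to study."""
    estimates = [observed_rate(true_rate, n_obs, rng) for _ in range(trials)]
    return statistics.pstdev(estimates)


def outcome_variance(true_rate):
    """Aleatory part: variance of one yes/no outcome, even with the rate known exactly."""
    return true_rate * (1 - true_rate)


# Hypothetical incident rate, assumed purely for illustration.
TRUE_RATE = 0.1
rng = random.Random(42)

for n_obs in (10, 100, 1000):
    spread = estimate_spread(TRUE_RATE, n_obs, trials=200, rng=rng)
    print(f"n={n_obs:5d}  spread of estimate={spread:.4f}  "
          f"variance of next outcome={outcome_variance(TRUE_RATE):.4f}")
```

More data shrinks the first column of numbers but leaves the second unchanged, which is why margins, rather than better forecasts, are the usual response to aleatory uncertainty.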
The solution to aleatory uncertainty
In the book "Antifragile", the author Nassim Nicholas Taleb, who also wrote "The Black Swan", proposes that the solution to aleatory uncertainty is not greater margins or safeguards but instead the development of what he calls antifragile properties.
Taleb defines antifragility as going beyond resilience and robustness. A resilient system resists shocks and stays the same, whereas an antifragile system gets better; it improves.
He suggests that uncertainty, disorder, and the unknown are completely equivalent in their effects and can therefore be addressed in the same manner. Instead of trying to predict the future, which is not possible under aleatory uncertainty, steps are taken to measure and reduce the level of fragility, which is easier to do and results in greater utility.
Fragile systems break down easily in the presence of uncertainty. The solution is not to build more robust systems, as we might think. Resilient, robust systems neither break nor improve, and therein lies the rub. The opposite of fragile is not robust; it is something we have no word for, so Taleb uses "antifragile": things that gain from disorder.
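One way to make the fragile, robust, and antifragile distinction concrete is to look at how a system's outcome responds to shocks of different sizes. The sketch below is an illustration, not a method given in this article: it assumes that fragility shows up as a concave response (harm accelerates as shocks grow) and antifragility as a convex one (gains accelerate). The response functions, the `level` and `delta` values, and all names are hypothetical.

```python
def second_difference(response, level, delta):
    """Curvature of the response around `level`: negative is concave, positive is convex."""
    return response(level + delta) + response(level - delta) - 2 * response(level)


def classify(response, level=1.0, delta=0.5, tolerance=1e-9):
    """Label a response curve using the sign of its second difference."""
    curvature = second_difference(response, level, delta)
    if curvature < -tolerance:
        return "fragile"
    if curvature > tolerance:
        return "antifragile"
    return "robust"


# Hypothetical response curves: outcome as a function of shock size.
def harm_accelerates(shock):
    return -shock ** 2   # larger shocks hurt disproportionately more


def unaffected(shock):
    return 0.0           # outcome does not change with the shock


def gain_accelerates(shock):
    return shock ** 2    # larger shocks help disproportionately more


for name, curve in [("harm_accelerates", harm_accelerates),
                    ("unaffected", unaffected),
                    ("gain_accelerates", gain_accelerates)]:
    print(name, "->", classify(curve))
```

The point of the sketch is that this check needs no forecast of when a shock will arrive, only an idea of how the system responds to shocks of varying size.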
Offshore drilling safety example
A few years ago, a safety assessment of offshore drilling platforms was conducted for operations in the North Sea.
Each platform had written procedures, some of which were followed and some of which were not. Each had a positive safety culture (more or less), each had commitment from senior leadership, and so on. In terms of practice, compliance, and other categories of assessment, there were no differences that stood out other than their safety performance.
Some of the platforms had experienced no incidents for a long period of time, while others were contending with multiple but mostly minor ones. The question asked was: which platforms are the safest to work on?
The platforms that had no incidents for a long time were considered the most unsafe, which may be surprising to some. While these platforms had excellent past performance, other indicators caused concern, such as signs of complacency and overconfidence, to name a few. Using past performance to predict future incidents suggested that these platforms would be the safest. However, their current behaviors suggested otherwise.
The platforms considered safest were the ones dealing with minor incidents. They had a heightened level of awareness and, from an "antifragile" perspective, were improving with each incident. Everyone was looking out for each other and not resting on the achievements of the past. You might get "injured" but you would not be harmed.
Lack of volatility is not the goal
Seeking stability by inhibiting fluctuations (you might say incidents) tends to produce the opposite of what we had intended. According to Taleb, overly constrained systems become prone to Black Swan events. Such environments tend to experience massive blowups, catching everyone off guard and undoing years of stability almost all at once.
It is for this reason that overregulation (mandatory or voluntary) and a preponderance of prescriptive rules can create greater levels of fragility, which in turn increases risk. It is no wonder that some have criticized the pursuit of vision zero targets (zero defects, zero incidents, zero fatalities, and so on). The low occurrence of these events is not sufficient to drive improvements and create the necessary behaviors.
Antifragile companies learn from their own errors and from the errors of others. With every plant failure, worker injury, and failed objective, the industry as a whole becomes safer, but only if we learn from what has happened. That is why it is so important for companies to share not only their best practices but, more importantly, their failures; otherwise the "sacrifices" made by others will be for nought. Unfortunately, sharing failures is considered by many to be foolishness when in fact it is the behavior of the wise.
Continuous improvement as a means to introduce volatility
Over the last several decades, the adoption of continuous improvement (CI) has helped to transform many organizations, first and foremost in the automotive industry. However, you will now find it applied in almost every sector. The reasons companies give for adopting CI usually have to do with improving quality, increasing efficiency, or lowering costs. However, is that all that is happening?
Continuous improvement at its core is an intervention strategy to facilitate change. These changes, made in small increments over time, create the capacity for even greater changes in the future; they make companies less fragile. This is precisely what is behind the principle of "fail fast, fail often." Although CI for many focuses on failures of the past, it still creates the benefits associated with contending with volatility.
If you were to ask, "Which company is most likely to succeed in the presence of uncertainty?" my answer would not be the largest or the most robust. It would be the ones practicing continuous change in any of its forms, be it Lean, Agile, CI, or others. These are the companies that embrace uncertainty and become stronger in the process. Instead of being surprised by negative black swans, they anticipate, and are delighted to see, the positive black swan.