Ludwig Wittgenstein was one of the many philosophers of the early 20th century (and a lesser-known one, which is a travesty). He is best known for his language games, and one of the questions he posed was: ‘Are you using a ruler to measure the table or the table to measure the ruler?’ This is more than just a semantic question, and the point he was making is profound: if you believe that the ruler is accurate, you will use the ruler to measure the table. But if you don’t trust the ruler, you might be using the table to measure the accuracy of the ruler. In other words, in the absence of trust (or more formally, full information) about an entity, any observation should be used to reveal more information about the entity itself, and not just about what the entity is supposed to measure. If this sounds meta, it is. But if you are inclined to think that this is some philosophical sleight-of-hand with few practical implications, you might want to reserve your judgment. After all, a surprisingly large chunk of philosophy has real, practical applications.
Yet another Bias: Wittgenstein’s Ruler bias
There is no dearth of biases that have been listed: once Amos Tversky and Daniel Kahneman formalized the first set with their seminal work, behavioral scientists went on to build an impressive list (and it is still growing). Here’s a list: https://thedecisionlab.com/biases/
And to that, I would add Wittgenstein’s Ruler bias. As Nassim Taleb has said, you must avoid the W-bias, and the principle for doing so goes something like this:
1. When you use a ruler to measure the table, know that you are also using the table to measure the ruler
2. The more unexpected the measurement, the more you apply W’s ruler
In other words, be skeptical. And skepticism should increase directly as you are presented with ‘extreme’ observations.
Let’s say your analyst labels an observation as an ‘extreme event’ (she calls it a ‘5-sigma’ event for effect: in other words, under a normal distribution it occurs roughly 1 in 1.7 million times, counting both tails). This is when you might want to start with the ‘W ruler’ principle and ask the question: given that you have seen a 5-sigma event, what is the probability that the data itself follows a normal distribution, compared to the alternative (i.e. a non-normal distribution)? It turns out to be extremely unlikely that the underlying distribution is normal; it is far more likely to be a fat-tailed distribution (e.g. a power law). Taleb even suggests that, heuristically, you should reject the Gaussian (normal) distribution in the presence of any event beyond 4 standard deviations. And with that, the event is most likely not as extreme as your analyst would have you believe, which in turn helps frame the right response to an outlier observation.
For instance, in Retail Bank Operations, an ‘off-the-charts’ spike in the collection default rate is probably not as much of an exception as it looks: it is quite likely that the underlying distribution is itself non-normal. Instead of just writing it off as a ‘black swan event’, the Operations team should apply W’s rule and use that observation to better understand the underlying distribution itself. In other words, use the table to measure the ruler as well.
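To make the 5-sigma reasoning concrete, here is a minimal sketch of the Bayesian comparison in Python. Everything specific in it is my own assumption for illustration, not Taleb’s exact calculation: the fat-tailed alternative is a toy power law with tail P(|X| > k) = k^(-alpha) and alpha = 3, and the prior belief that the data is Gaussian is set at a generous 99%. The function names are hypothetical.

```python
import math


def gaussian_tail(k: float) -> float:
    """Two-sided tail P(|Z| > k) for a standard normal Z."""
    return math.erfc(k / math.sqrt(2))


def power_law_tail(k: float, alpha: float = 3.0) -> float:
    """Toy fat-tailed alternative (an assumption): P(|X| > k) = k**-alpha for k >= 1."""
    if k < 1:
        return 1.0
    return k ** -alpha


def posterior_gaussian(k: float, prior_gaussian: float = 0.99, alpha: float = 3.0) -> float:
    """Bayes' rule: P(data is Gaussian | an event beyond k sigma was observed)."""
    weight_gaussian = prior_gaussian * gaussian_tail(k)
    weight_fat_tail = (1 - prior_gaussian) * power_law_tail(k, alpha)
    return weight_gaussian / (weight_gaussian + weight_fat_tail)


if __name__ == "__main__":
    for k in (3, 4, 5):
        print(f"{k}-sigma: P(Gaussian | event) = {posterior_gaussian(k):.4f}")
```

Even starting from a 99% belief in the Gaussian, a single 5-sigma observation drops that belief to under 1% in this toy setup. The exact figure depends heavily on the assumed alternative and prior, so treat the direction of the result, not the number, as the point.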
And so, the key takeaway: extreme events may not always be all that extreme. As we continue to live through these uncertain times, it becomes tempting to think of exceptions as extreme outliers. That may be oversimplifying. Worse, you may lose an important signal for understanding the underlying phenomena that really drive your business.
And I am beginning to find that W’s ruler principle could apply in all sorts of situations: whenever I come across someone who professes to be authoritative on a topic, I have begun to apply W’s rule, because the claim says more about the person than about the topic itself. This is especially true when the topic is not close to an area where the person would have spent the proverbial 10,000 hours. It is generally a useful rule on, say, WhatsApp groups, which seem to have made self-styled polymaths out of just about everyone.
I am now a firm advocate of moving away from what is probably one of the most commonly made assumptions when working with data: that of a normal (Gaussian) distribution. Time and again, the occurrence and influence of ‘extreme events’ point to the new ‘non-normal’ normal. Is this itself a manifestation of the ever-increasing complexity in just about everything around us? That is a topic for another day.
If you are craving a formal proof on extreme events and normal distributions, here’s a short video from Nassim Taleb: https://youtu.be/k_lYeNuBTE8