#71 Dealing with Uncertainty: Vast and fast data

You would have to be living under a rock to be unaware of the nervous uncertainty that has gripped the world today. Inevitably, this uncertainty has spilled over into the business world. Companies, big and small, are increasingly nervous about making bets on the future and, even more so, are trying their best not to be hit by some ‘black swan’ event. Ask any corporate executive and a key operational question top of their mind would very likely be: ‘How can we get better signals, faster, to help us react to a rapidly changing environment?’ Translated for the Data organization, this question should read: ‘For any given business function, how can we put data to work not just to monitor key metrics, but to derive insights and surface actionable signals with greater velocity?’ Given the (not insignificant) investments made in data platforms accumulating petabytes of data over the last several years, it is a question of the HOW. And for the CDO, this is an opportunity like no other to grab the moment and make a meaningful impact on the business. It will, of course, require intense focus and a rapid execution mindset across multiple fronts. In this post, I want to focus on three key tenets and design hacks that are going to be important for decision making in this new normal (certainly neither comprehensive nor mutually exclusive):

1/ Look at multiple signals: Data from a single source is unreliable when it comes to explaining anomalous behavior in a metric. For instance, outcome metrics like Revenue trends, or even trends in audience metrics like impressions and engagements, are highly correlated and, in any case, lagging. It is well known by now that early warning signals can be gleaned from customers by exploiting data from social media feeds, customer care call transcripts etc. The challenge has always been extracting signal from all the noise. Design hack #1: Consider developing a ‘company uncertainty index’ by mining categorical data from a variety of external and internal sources. See the Appendix below for some interesting work that can help get started.
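To make the idea of blending multiple signals concrete, here is a minimal sketch of one common approach: standardize each source's series (z-scores) and average them into a single composite index. The source names and numbers are purely illustrative, and a real index would need careful weighting and de-noising per source.

```python
import statistics

def zscore(series):
    """Standardize a series to mean 0, standard deviation 1."""
    mu = statistics.mean(series)
    sigma = statistics.stdev(series)
    return [(x - mu) / sigma for x in series]

def composite_index(sources):
    """Average the z-scored series from each source into one index.
    `sources` maps a source name to an equal-length numeric series."""
    standardized = [zscore(s) for s in sources.values()]
    n = len(standardized[0])
    return [statistics.mean(z[i] for z in standardized) for i in range(n)]

# Illustrative weekly 'uncertainty mention' counts per source --
# note how all three sources spike together in weeks 4-5
signals = {
    "social_media": [120, 135, 150, 410, 395],
    "care_transcripts": [30, 28, 33, 90, 85],
    "news_articles": [55, 60, 58, 140, 150],
}
index = composite_index(signals)
```

Because each source is standardized before averaging, no single noisy feed dominates the index, which is exactly the point of looking at multiple signals.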

2/ Historical data is no longer a good predictor: Recent events (starting with the pandemic) have shown that the standard paradigm of static models, trained offline and then deployed in production to make predictions (e.g. forecasting, churn prediction etc.), is no longer effective. Design hack #2: Try working with dynamic models, which are designed to evolve as our understanding of the world improves over time. System dynamics models are gaining traction and worth exploring. Caution: easier said than done, since dynamic models have methodological constraints – mainly, the methods for validating them are not as well established as traditional model monitoring methods. See earlier blogs on System Dynamics and Learning Systems.
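For readers new to system dynamics, here is a deliberately tiny stock-and-flow sketch: a customer 'stock' changes via acquisition and churn 'flows', and an uncertainty driver feeds back into both. All parameters are made-up illustrations; in practice they would be re-estimated as new data arrives, which is what makes the model dynamic rather than static.

```python
def simulate(periods, dt=1.0):
    """Minimal stock-and-flow simulation using Euler integration.
    Returns the customer stock trajectory over `periods` steps."""
    customers = 1000.0   # the stock
    uncertainty = 0.1    # exogenous driver, illustrative starting value
    history = []
    for _ in range(periods):
        acquisition = 50.0 * (1 - uncertainty)            # inflow slows as uncertainty rises
        churn = customers * (0.02 + 0.05 * uncertainty)   # outflow accelerates with uncertainty
        customers += (acquisition - churn) * dt
        uncertainty = min(1.0, uncertainty * 1.05)        # assumed upward drift
        history.append(customers)
    return history

trajectory = simulate(24)
```

The value of such a model is less in its point forecasts and more in letting decision makers stress-test assumptions (here, the uncertainty drift) and watch how the system responds.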

3/ Speed over accuracy: Data teams have had to deal with this trade-off for many years now. But the need to reduce latency is greater than ever – managers are increasingly demanding rapid insights with ever shorter feedback loops. That makes it harder for predictive, causal models, which need to learn over time, to be effective, especially during volatile times. Design hack #3: In the correlation vs. causation debate, it might be time to lean on correlation as a quick and effective way to surface key signals and use them to iteratively navigate the decision landscape. See an earlier blog on the topic of Correlation and Causation.
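A quick illustration of why correlation is the fast option: a rolling correlation between a candidate signal and a business metric costs almost nothing to compute and refresh, whereas a causal model needs time and data to learn. The two series below are fabricated for illustration.

```python
import statistics

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denom = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return cov / denom

def rolling_corr(signal, metric, window):
    """Correlation over a sliding window -- cheap to compute,
    cheap to refresh as each new data point lands."""
    return [pearson(signal[i:i + window], metric[i:i + window])
            for i in range(len(signal) - window + 1)]

mentions = [10, 12, 15, 30, 45, 60, 58, 40]   # e.g. complaint volume (illustrative)
revenue  = [100, 99, 98, 90, 80, 70, 72, 85]  # lagging outcome (illustrative)
corrs = rolling_corr(mentions, revenue, window=4)
```

A persistently strong (here, negative) rolling correlation does not prove causation, but it is exactly the kind of fast signal the hack suggests acting on while slower causal analysis catches up.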

Making meaningful progress on this will require us to solve two fundamental problems: 1/ Data, data everywhere – but how do you get the data to the people who can use it most, with minimal latency? 2/ Data analysis tools have been built for data analysts/scientists, not for the people who are actually tasked with making operating decisions on a daily basis. How can we empower a true self-service paradigm? More on that in subsequent posts.

Appendix: Measuring Uncertainty

In 2016, three economists published a paper describing a methodology for developing uncertainty indices based on the frequency of words in newspaper articles. Social media marketing teams have done text analyses for several years now, but have typically lacked the statistical rigor to account for issues like reliability, accuracy, bias and consistency (e.g. we all know how social data is skewed towards a small cohort of vocal customers triggered by negative experiences). This paper offers a methodology to work around these challenges. The authors now publish a series of uncertainty indices at https://www.policyuncertainty.com/ – highly recommended, not just for the methodology but also as a great case study in distilling complexity into a simple, effective index that can actually be used.
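The core counting step of that methodology is simple: an article qualifies only if it contains at least one term from each of several categories, and the index tracks the share of qualifying articles per period. Here is a toy sketch of that step; the term sets and articles are illustrative stand-ins (the paper specifies the actual term lists and a further standardization and rescaling to mean 100).

```python
TERM_SETS = [  # an article must match at least one term from EVERY set
    {"economy", "economic"},
    {"policy", "regulation", "congress", "legislation"},
    {"uncertain", "uncertainty"},
]

def matches(article):
    """True if the article contains a term from every category."""
    words = set(article.lower().split())
    return all(words & term_set for term_set in TERM_SETS)

def article_shares(articles_by_period):
    """Share of qualifying articles per period."""
    return {period: sum(matches(a) for a in arts) / len(arts)
            for period, arts in articles_by_period.items()}

sample = {
    "2020-03": ["economic policy uncertainty rises sharply",
                "markets calm as economy grows"],
    "2020-04": ["congress debates economic relief amid uncertainty"],
}
shares = article_shares(sample)
```

Requiring a hit in every category (rather than any single keyword) is what keeps the raw counts from being swamped by articles that merely mention one buzzword.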

Further reading:

1/ The paper on measuring Uncertainty: Measuring Economic Policy Uncertainty

2/ System Dynamics: https://systemdynamics.org/ (there is a huge body of knowledge)
