Data science grew out of the world of statistics, where you are taught hypothesis-driven feature selection and engineering. In other words, a priori business understanding drives the feature selection. That in itself creates the problem of availability bias. On top of that, a typical statistical model will favor stronger features over weaker ones. Take customer churn: any Customer Success Manager will tell you that factors like price discounts and the quality of customer service are going to be important in predicting the risk of churn. And your model will, in all likelihood, faithfully confirm that. Except that these insights merely confirm intuition and, in many cases, are not really actionable (e.g. improving the quality of customer service is nebulous to begin with and will require long-drawn-out, often expensive process changes to make a difference).
Features: Strong and Weak
So, the question is: how can we know what we don't know? In a world where the information value of the things we know has been more or less fully extracted, how do we look beyond the obvious to the long tail of factors that can predict outcomes? Can we go beyond the strong features and look at the weak ones? Back to the churn problem: instead of relying on headline factors like customer feedback from a service request, what if we look at all the underlying behavioral inputs by taking the entire dataset around service requests, and so on? While AI holds out that promise, with unsupervised techniques and the compute power to pursue a discovery-driven approach, the onus shifts to data. Do we have the quality and quantity of data to tease out the nuggets hidden in the weak features?
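To make the strong-versus-weak distinction concrete, here is a minimal sketch on synthetic data. It assumes a hypothetical churn setup where one strong driver (say, discount level) sits alongside ten behavioral signals that are individually near-useless but collectively informative; the feature names, data, and model choice are all illustrative assumptions, not anything from a real system.

```python
# Hypothetical sketch: individually weak behavioral signals can add
# predictive power on top of one strong feature. All data is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 2000

# One "strong" feature (e.g. discount level) and ten "weak" behavioral
# signals (e.g. service-request counts, login gaps) -- all hypothetical.
strong = rng.normal(size=n)
weak = rng.normal(size=(n, 10))

# Churn depends on the strong feature plus the SUM of the weak signals,
# so each weak signal alone carries very little information.
logit = 1.5 * strong + 0.4 * weak.sum(axis=1)
churn = (logit + rng.normal(size=n) > 0).astype(int)

X_strong = strong.reshape(-1, 1)
X_all = np.hstack([X_strong, weak])

clf = RandomForestClassifier(n_estimators=200, random_state=0)
auc_strong = cross_val_score(clf, X_strong, churn, cv=5, scoring="roc_auc").mean()
auc_all = cross_val_score(clf, X_all, churn, cv=5, scoring="roc_auc").mean()
print(f"strong-only AUC: {auc_strong:.3f}, all-features AUC: {auc_all:.3f}")
```

On this synthetic setup, the model using all features should score a noticeably higher AUC than the strong-feature-only model, which is the point of the argument: the long tail of weak signals pays off only when you have enough data to learn from it.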
How much data is enough?
Here lies the rub for many B2B companies. They have traditionally not invested in systems that capture data systematically across the value chain, with the right ontologies, granularity and frequency. That is the price many of them are paying now as they attempt to leverage AI to better monetize data, and the gap is widest in Sales and Marketing. It is also one of the primary reasons Enterprise AI is lagging behind consumer AI. The data landscape is handicapped in two broad ways:
- Insufficient observations: In a typical Sales and Customer Success lifecycle, many of the interactions happen offline. Much of those interactions, and the customer response to them, never makes it back as structured data. That impairs the rate at which AI can learn and, subsequently, influence customer journeys.
- Data quality: This is, unfortunately, a bigger problem, and one that could have been avoided entirely with some foresight. Most of the data, especially Pipeline and Opportunity activities, is captured in disparate systems, often as unstructured data. Stitching that together to build a cogent customer journey ends up being a frustrating exercise, often with results that are hard to justify against the investment.
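The stitching step above can be sketched in a few lines. This is a toy illustration, assuming two hypothetical sources (a CRM log and a support-ticket log) with made-up customer IDs, timestamps and event names; a real pipeline would also have to reconcile identifiers and parse unstructured fields, which is where the pain actually lives.

```python
# Hypothetical sketch: merging events from two disparate systems into
# one time-ordered per-customer journey. All data is illustrative.
import pandas as pd

crm = pd.DataFrame({
    "customer_id": ["A", "A", "B"],
    "ts": pd.to_datetime(["2024-01-05", "2024-02-10", "2024-01-20"]),
    "event": ["demo_call", "renewal_discussion", "demo_call"],
})
support = pd.DataFrame({
    "customer_id": ["A", "B"],
    "ts": pd.to_datetime(["2024-01-15", "2024-03-01"]),
    "event": ["ticket_opened", "ticket_opened"],
})

# Tag each record with its source system before combining.
crm["source"] = "crm"
support["source"] = "support"

# One chronological event stream per customer.
journey = (
    pd.concat([crm, support], ignore_index=True)
      .sort_values(["customer_id", "ts"])
      .reset_index(drop=True)
)
print(journey)
```

Once events sit in one ordered stream per customer, the "weak" behavioral features discussed earlier (counts, gaps, sequences of events) fall out of simple group-by aggregations.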
Better late than never: most companies have recognized the potential value of data and are putting processes in place to capture and consolidate their data assets. Like everything else, it is a journey, and while it unfolds, it is equally important to continuously evolve the models to keep improving the effectiveness (i.e. actionability) of predictions.