Scaling AI Systems: What can Cognitive Psychology teach us?

I have often discussed the challenge of taking AI models (actually, all analytical models) out of the one-off Proof-of-Concept stage and into a scalable engineering paradigm. Needless to say, this topic is not new: most organizations are struggling with it. Which raises the question: why is it so hard to scale these systems? It could be that this is not an implementation question but a design question: how can we design these systems so that they are built for scale?

I have just started reading about Cognitive Psychology (a fascinating area with lots of interesting work going on), and one name keeps coming up: David Marr (some have called him the ‘Einstein of Neuroscience’). He wrote extensively on vision, which is clearly a spectacular feat of cognitive intelligence. For most animals, vision depends on continuous active learning and on speed. There is a very good evolutionary reason: our survival has depended on the ability to process vast amounts of visual signals and make decisions rapidly. David Marr took an information processing view of vision, and his fundamental insight was that information processing systems must be understood at three distinct, complementary levels:

Computational level: what does the system do? Why does it exist?
Algorithmic level: how does the system do what it does?
Implementation level: how is the system physically realized?

And his fundamental, brilliant insights (at least to me) are:

Each level has a distinctive role to play.
Each level makes an important, non-redundant contribution to the system as a whole.

In other words, all three levels are integral strands of the overall fabric, and if we attempt to understand any one level in isolation, without thinking of the others, we are hopelessly lost. This has been baked into the design of all of our cognitive systems, and it is most likely the reason why these systems have been so anti-fragile.

And herein is the fundamental lesson for AI systems: we should be thinking of them as information processing systems:

Computational level: what problems would we like to solve? E.g., improve case resolution times in call centers by giving support agents better and faster access to contextual answers to product-related questions.
Algorithmic level: how do we want to solve them? E.g., connect all the assets (product documentation, case notes, chat transcripts, etc.) into a knowledge graph with a semantic search engine for retrieval.
Implementation level: how are the models deployed and integrated into the physical information flow? E.g., interactive chatbots that help agents navigate problem triage and resolution; improve the knowledge graph on an ongoing basis as the corpus of cases grows. Furthermore, continuously evaluate and prune the graph structures through instrumentation and, where possible, experiment with alternatives.
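
To make the separation of levels concrete, here is a minimal Python sketch of the call-center example. Everything in it is an assumption made for illustration: the class name SupportSearch, the sample documents, and above all the toy word-overlap score, which merely stands in for a real knowledge graph with semantic search. The point is only that the question being answered, the method used to answer it, and the instrumentation around it are designed together.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SupportSearch:
    """Hypothetical toy stand-in for the call-center assistant."""

    # Algorithmic level: documents are scored by plain word overlap.
    # This is a placeholder for semantic search over a knowledge graph.
    documents: dict
    # Implementation level: instrumentation that counts how often each
    # document is served, so unused entries can be reviewed or pruned.
    served: Counter = field(default_factory=Counter)

    def answer(self, question: str) -> Optional[str]:
        """Computational level: return the id of the most relevant document."""
        query_words = set(question.lower().split())
        best_id, best_score = None, 0
        for doc_id, text in self.documents.items():
            score = len(query_words & set(text.lower().split()))
            if score > best_score:
                best_id, best_score = doc_id, score
        if best_id is not None:
            self.served[best_id] += 1
        return best_id

    def never_served(self) -> list:
        """List document ids that have not answered any question so far."""
        return sorted(d for d in self.documents if self.served[d] == 0)


search = SupportSearch({
    "reset-guide": "how to reset the router password",
    "billing-faq": "how to update billing details",
    "warranty": "warranty terms for hardware returns",
})
top = search.answer("customer cannot reset router password")
print(top, search.never_served())
```

In this sketch the usage counter is part of the design from the start rather than bolted on later, which is the property the implementation level is meant to capture.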

All too often, Data Science projects focus disproportionately on the computational and algorithmic levels. The implementation level is not a design feature and, even when it is, it is all too often an afterthought. This often leads to under-investment in the required compute infrastructure, which in turn inhibits the speed and effectiveness with which the system can learn and improve the quality of its answers. That is not how an AI system that seeks to learn and improve over time should be designed, even more so if the goal is to move beyond narrow AI to problem-solving or reasoning tasks.
