I just finished reading a fascinating short story by Jorge Luis Borges called ‘Funes the Memorious’. The story is about a person who has a ‘perfect’ memory and can recall every minute detail about everything around him. “… he had reconstructed a whole day; he never hesitated, but each reconstruction had required a whole day” … “He was, let us not forget, almost incapable of ideas of a general, Platonic sort. Not only was it difficult for him to comprehend that the generic symbol dog embraces so many unlike individuals of diverse size and form; it bothered him that the dog at 3.14pm (seen from the side) should have the same name as the dog at 3.15pm (seen from the front)”. It is a fascinating premise, and it goes to the very heart of the concept of learning. Towards the end of the story, Borges puts it beautifully: “I suspect, however, that he was not very capable of thought. To think is to forget differences, generalize, make abstractions. In the teeming world of Funes, there were only details, almost immediate in their presence”. I was stunned by how deeply Borges had thought about this whole idea of information and learning: after all, over a span of 5 pages, he had struck at the very heart of some of the core issues around Data and Machine Learning that are top of mind in most Enterprises today.
Data and Analysis: For the longest time, enterprises were forced to rely on data analysts who came up with the hypotheses and asked questions of the data using, say, SQL. On top of that, the cost of storage and compute, and the fact that we were limited by the analyst’s ability to process and interpret data, meant that Enterprise Data Warehouses (EDWs) were designed around aggregations. Everyone was painfully aware that this meant ‘you can never know what you don’t know’, and that you had to let go of the hidden nuggets that lay in the granular data.
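The aggregate-only bind above can be sketched in a few lines. The table, regions, and numbers below are entirely hypothetical, just to show how a pre-aggregated summary erases the granular nugget that a hypothesis-led query can no longer see:

```python
import sqlite3

# A minimal sketch of hypothesis-led analysis against an aggregate-only
# warehouse; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Granular facts: every individual transaction (the "Funes" view of data).
cur.execute("CREATE TABLE sales (day TEXT, region TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2023-01-01", "north", 100.0),
        ("2023-01-01", "south", 20.0),
        ("2023-01-02", "north", 110.0),
        ("2023-01-02", "south", 990.0),  # a hidden nugget: a one-day spike
    ],
)

# The EDW keeps only the aggregate its designers thought to ask for.
cur.execute(
    "CREATE TABLE sales_summary AS "
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
)

# A hypothesis-led question: which region sold more overall?
rows = cur.execute(
    "SELECT region, total FROM sales_summary ORDER BY total DESC"
).fetchall()
print(rows)  # the one-day spike in 'south' is no longer visible as an event
```

Once the summary table is built, the answer to the question you planned for is cheap, but the spike that might have mattered is gone: you can never know what you don’t know.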
In other words, every organization had a Funes (the sum total of all its databases), but as with Funes, it was impossible to comprehend all this data because organizations lacked, like Funes, the ability to go beyond the detailed facts. And so it came to pass: hypothesis-led analysis with aggregate data was the only way out. The enterprise world had collectively resigned itself to this definition of learning, even as everyone could see that the human brain had figured out a way to solve this problem through – as Borges says – the ability to “forget, generalize, make abstractions”: the very problem the AI community has been trying to solve since the 1950s.
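Borges’s contrast between recall and thought can be made concrete with a toy sketch. The observations below are made-up numbers: a lookup table (Funes) recalls every point it has seen but is mute on anything unseen, while a simple least-squares line “forgets” the individual points, keeping only an abstraction that can answer new questions:

```python
# A minimal sketch contrasting Funes-style recall with learning as
# abstraction, on a toy 1-D regression task; all numbers are made up.
observations = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.1), (4.0, 8.0)]  # (x, y)

# Funes: perfect recall of every detail, but no answer for what was
# never literally seen.
memory = dict(observations)
assert memory.get(2.5) is None  # the unseen point has no entry

# Learning: "forget differences" and keep an abstraction, here an
# ordinary least-squares line y = a*x + b fitted to the observations.
n = len(observations)
sx = sum(x for x, _ in observations)
sy = sum(y for _, y in observations)
sxx = sum(x * x for x, _ in observations)
sxy = sum(x * y for x, y in observations)
a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = (sy - a * sx) / n

# The abstraction can now answer for the unseen point.
prediction = a * 2.5 + b
print(prediction)
```

The lookup table is the bigger object yet the weaker learner: it stores four facts and knows nothing else, while the two fitted numbers generalize to any x.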
And so here we are today – we have the compute power and the algorithmic muscle to finally emulate the essence of learning: extracting general patterns from specific observations. Which brings us to the question the AI community is trying to answer: how do we learn? How can we teach machines to learn the way we humans have fine-tuned over thousands of years? And, at a more practical level, what does it mean for how we think about data? Topic for the next blog!