Deep Knowledge as the Key to Higher Machine Intelligence

Gadi_Singer · 04-18-2022

Third in a series on Cognitive Computing Research – The Age of Knowledge Emerges

Published January 26th, 2021

Gadi Singer is Vice President and Director of Emergent AI Research at Intel Labs leading the development of Third Wave of AI capabilities.

The Next Big Thing - Cognitive AI

To reach a higher level of machine intelligence that can truly ‘think’ in new situations, AI needs a categorical transition beyond the current paradigm, which is based on processing shallow and broad data. Yoshua Bengio, Francois Chollet, and DARPA are among those who have defined a new category of cognitive capabilities for AI.

Leaning on the paradigm defined by Daniel Kahneman in his book Thinking, Fast and Slow, Bengio defines “System 2” as a category of AI that is slow, logical, sequential, conscious, and algorithmic, such as the capabilities needed in planning and reasoning. This contrasts with his characterization of today’s deep learning (DL) as intuitive, fast, unconscious, and habitual, which he attributes to Kahneman’s “System 1”. Francois Chollet, in Abstraction & Reasoning in AI Systems, describes the progression of AI capabilities and calls this emergent new phase broad generalization (“Flexible AI”), capable of adapting to unknown unknowns within a broad domain. DARPA’s Third Wave of AI is similarly characterized by contextual adaptation, abstraction, reasoning, and explainability, with systems constructing contextual explanatory models for classes of real-world phenomena.

These three perspectives signal a major inflection point: an upcoming categorical shift toward AI with substantially different competencies, including comprehension of the real world, abstraction, reasoning, contextualization, logical and reasoned decision making, and the ability to adapt to completely new circumstances (unknown unknowns). These competencies cannot be addressed just by playing back past experiences. I will use the term “Cognitive AI” to refer to this new phase of AI.

While not expected to reach the goals of open-ended artificial general intelligence (AGI), AI with higher cognitive capabilities will play a new role in technology and business, both through this set of shared cognitive competencies and by navigating through the shared values that are fundamental in the relationship between humans and machines. AI that can make reliable decisions in unforeseen environments will eventually be trusted with higher autonomy and become significant in areas such as robotics and autonomous transportation, as well as in control points of logistics, industrial, and financial systems. Finally, an increased level of human-machine collaboration can be expected as AI becomes an active and persistent agent that communicates and collaborates with people as it serves and learns from them.

Role of Structured Knowledge in Cognitive AI

There is a divide in the field of AI between those who believe Cognitive AI can be achieved by advancing DL further and those who see the need for incorporating additional fundamental mechanisms. I fall in the latter camp – let me explain why.

DL masters the underlying statistics-based mapping from a given input, through the multidimensional structures in the embedding space, to a predicted output, and it excels in the classification of broad and shallow data (for example, a sequence of words or pixels/voxels in an image). The input data carry limited positional information (such as the location of a pixel, voxel, or character in relation to others) and limited structural depth; it is the task of the AI system to discover the features, structures, and relationships. DL is equally effective in indexing very large sources (such as Wikipedia) and retrieving answers from the best matching places in the corpus, as demonstrated in benchmarks such as Natural QA or EfficientQA. For System 1 tasks, which rely on a mapping function created during training, DL delivers.

In contrast, structured knowledge is the key to unlocking Cognitive AI or System 2 type capabilities. Knowledge that is structured, explicit, and intelligible has several key properties that make it essential for higher intelligence. While DL primarily exploits correlation, knowledge constructs can capture and represent the full richness associated with human intelligence.

One essential knowledge construct is the ability to capture declarative knowledge about elements and concepts and to encode abstract notions such as hierarchical property inheritance among classes. For example, knowledge about birds, with added particulars on passerine species, plus specifics on sparrows, provides a wealth of implied information about chestnut sparrows even when not specifically spelled out. Other knowledge constructs include causal and predictive models. These constructs rely on explicit concepts and well-identified, overtly defined relations rather than machine embeddings in the latent space, and the resulting models have more extensive potential for explanations and predictions, well beyond a “ready mapping function”. By capturing an underlying model of relationships between factors and forces, current events can be worked backward for causality or forward for predicted outcomes. A model of the underlying dynamics can be richer with context, internal states, and simulated sequences that have never been encountered.
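The bird example above can be sketched in a few lines of Python. This is only a minimal illustration of hierarchical property inheritance, not a description of any particular knowledge-base system: the `KNOWLEDGE` dictionary, the `inherited_properties` function, and the listed property values are all hypothetical placeholders chosen for the example.

```python
# Hypothetical toy knowledge base (illustrative values only): each concept
# names its parent class and lists only the properties stated directly
# about that concept.
KNOWLEDGE = {
    "bird": {"parent": None, "props": {"has_feathers": True, "lays_eggs": True}},
    "passerine": {"parent": "bird", "props": {"perches": True}},
    "sparrow": {"parent": "passerine", "props": {"diet": "seeds"}},
    "chestnut sparrow": {"parent": "sparrow", "props": {"color": "chestnut"}},
}


def inherited_properties(concept, kb=KNOWLEDGE):
    """Collect the properties of a concept and of all its ancestors.

    Values stated on a more specific concept override inherited ones.
    """
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = kb[concept]["parent"]
    merged = {}
    for name in reversed(chain):  # most general first, most specific last
        merged.update(kb[name]["props"])
    return merged


print(inherited_properties("chestnut sparrow"))
```

Nothing about feathers or eggs is stored on the chestnut sparrow entry itself; those facts are implied by its position in the class hierarchy, which is the point of the passage above.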

In the human brain, the prefrontal cortex demonstrates the ability to ‘imagine’, simulate and assess potential futures. This provides an evolutionary advantage to human intelligence. In a complex world, individuals have to make choices involving scenarios that haven’t yet been experienced. Mental simulations of possible future episodes within environments not bounded by clear rules are based on an underlying model of world dynamics and provide great adaptive value in planning and problem-solving. The resulting ability for humans to adapt and make choices uses a different part of the brain not available to other mammals, which are tasked primarily with fast and automatic ready mapping functions.

Essential for higher cognition, procedural modeling mechanisms are based on covert mathematical, physical, or psychological principles beyond input-to-output observable statistical correlations. For example, a physics model can capture the phenomenon of hydroplaning and provide a concise predictor of the motion of a car under various conditions. Addressing hydroplaning through physical modeling instead of leveraging statistical extrapolation from measured training data allows for effective handling of out-of-distribution circumstances and the long tail of real-life eventualities. On a higher level, having a model of special relativity that states “E=mc^2” captures the genius of Albert Einstein in expressing the fundamental relationship between elements, rather than the statistical correlation function extracted from multiple tests. Such a procedural model can be combined with a DL-based approach to expand current capabilities.
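As a rough illustration of the difference between an explicit model and a mapping derived from samples, the sketch below encodes E=mc^2 directly and compares it with a deliberately crude nearest-sample lookup. The lookup is not a deep learning model; it is a simplistic stand-in for any mapping that can only replay what it has seen. The names `rest_energy_joules`, `OBSERVED`, and `nearest_observation` are hypothetical, and the "observations" are synthetic values generated from the formula, not measured data.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact value, fixed by the SI definition


def rest_energy_joules(mass_kg):
    """Explicit model: E = m * c**2."""
    return mass_kg * SPEED_OF_LIGHT_M_PER_S ** 2


# Synthetic "observations" generated from the formula for illustration only.
OBSERVED = {m: rest_energy_joules(m) for m in (1, 2, 3)}


def nearest_observation(mass_kg):
    """Crude stand-in for a sample-based mapping: return the energy
    recorded for the closest mass present in OBSERVED."""
    closest = min(OBSERVED, key=lambda m: abs(m - mass_kg))
    return OBSERVED[closest]


mass = 1000  # far outside the range of the observed masses
print(rest_energy_joules(mass))   # follows the stated principle
print(nearest_observation(mass))  # stuck at the value recorded for 3 kg
```

The explicit model gives an answer for any mass because it states the relationship itself, while the lookup can only return the nearest case it has already seen.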

Knowledge bases can capture (implicit) commonsense assumptions and the underlying logic not always overtly presented in the training data of DL systems. This implied and “obvious” understanding of the world and its dynamics is highly instrumental in addressing many tasks of higher machine intelligence. Finally, well-structured knowledge representation can support disambiguation (separating the attributes of ‘club’ as a playing bat, weapon, card type, or place for parties), contextualization, and aggregation of content.
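The ‘club’ example can be made concrete with a small sketch. It assumes a hand-written sense inventory in which each sense carries a few cue words; the `SENSES` table, its cue words, and the `disambiguate` function are hypothetical and far simpler than a real knowledge representation.

```python
# Hypothetical sense inventory for the word "club"; cue words are illustrative.
SENSES = {
    "playing bat": {"golf", "swing", "ball"},
    "weapon": {"attack", "hit", "armed"},
    "card type": {"cards", "suit", "deck"},
    "place for parties": {"dance", "night", "music"},
}


def disambiguate(sentence, senses=SENSES):
    """Pick the sense whose cue words overlap most with the sentence.

    Returns None when no cue word appears. Splitting is on whitespace only,
    so punctuation attached to a word would prevent a match.
    """
    words = set(sentence.lower().split())
    best = max(senses, key=lambda s: len(senses[s] & words))
    return best if senses[best] & words else None


print(disambiguate("She drew the last club from the deck of cards"))
print(disambiguate("We went to the club to dance all night"))
```

Because each sense is an explicit entry with its own attributes, the choice the system makes can be inspected and explained by pointing at the cue words that matched.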

Research for Cognitive AI

In the coming years, major advances in what was described as current AI or “System 1” can be expected, as underlying shallow mapping functions become significantly more elaborate and knowledgeable. Cognitive AI will bring an additional level of more sophisticated capabilities, as described above. There are promising early efforts to evolve DL to a more generative and dynamic system (for example, Dynamic Neuro-Symbolic Knowledge Graph Construction for Zero-shot Commonsense Question Answering by Antoine Bosselut, Ronan Le Bras, and Yejin Choi). Overall, the emerging focus on symbolic-based approaches founded on overt, structured knowledge leads me to believe that a new set of Cognitive AI competencies will emerge by 2025.

Current DL provides an exceptional solution for perception and prediction tasks based on ready mapping functions reflecting multidimensional statistical correlations. The next level of machine intelligence will require reasoning over deep knowledge structures, including facts and deep structures of declarative (know-that), causal (know-why), conditional/contextual (know-when), relational (know-with), and other types of models. The capture and use of deep knowledge can address other fundamental challenges of current AI, namely exploding DL model size and gaps in model robustness, extensibility, and scaling. This is why I assert that Cognitive AI is signaling the arrival of the Age of Knowledge for machines.

At the Cognitive Computing Research (CCR) Group at Intel Labs, we look forward to building next-generation AI systems that will one day understand this blog series and other informative content – and deliver even greater benefits to our lives.
