Discovering the Hidden Structure of Knowledge
Though we don’t know how the human brain transcends the data processing layer (where the brain, too, is just processing low-level data coming in from the senses) to reach the realm of knowledge, we can, through introspection, examine the structure of the end product of our thought processes: knowledge itself.
What we find is a collection of ideas that are connected through various relationships that are themselves ideas. While many of these ideas represent specific objects in the real world (that tree, this car, and so forth), many are abstractions (trees, cars). Each idea is connected to many others, some of which define its properties and some its relationships to other ideas. The power of abstract ideas, as opposed to ideas representing particular things, is that they are reusable. They can become components of new ideas. Complex concepts are built out of fundamentals.
As the material world is composed of atoms, our knowledge of the world is composed of ideas. The English language has over a million words, each referring to an idea. Without some notion that only a small portion of these ideas are fundamental (atoms) and that they can only be combined in certain ways, the task of putting knowledge in machines is overwhelming.
Democritus is known as the “laughing philosopher.” There is speculation that he was laughing at his critics who clearly had not thought things out as well as he had. He said: “Nothing exists except atoms and empty space; everything else is opinion.”
But where does one start to model such intricate complexity in a computer program? If we look at the history of science, we find a powerful parallel that can show us the way. Around 400 BC, Democritus advanced the notion that material things are made out of atoms. He reasoned that there were different types of atoms and that their properties gave rise to material properties. Hard things must be composed of atoms that stick to one another – perhaps they have hooks that interlock. The atoms of liquids must slide past one another.
His model was simplistic from our current perspective but essentially correct. It predicted that a small number of different kinds of atoms, combining in set ways, are responsible for the intricate complexity of the material world.
The practical difficulty of the problem is illustrated by Cyc, an artificial intelligence project that has attempted to assemble a comprehensive ontology and knowledge base of everyday common-sense knowledge, with the goal of enabling AI applications to perform human-like reasoning. It is essentially a huge rule-based “expert” system.
The project was started in 1984 at the Microelectronics and Computer Technology Corporation (MCC) and has been ongoing ever since. Cyc has compiled a knowledge base containing over 3 million assertions. The project’s own estimate is that this number of assertions represents 2% of what an average human knows about the world. By this approach, then, the knowledge base of a functional AGI would consist of about 150 million assertions (3 million divided by 0.02).
For centuries alchemists labored to produce more valuable materials from more basic ones. Lacking knowledge of the specific categories of atoms and their principles of combination, they worked in the dark, with no way to predict what would happen when they combined substances.
The project can hardly be considered a success. MIT’s Open Mind Common Sense AI project uses a semantic network instead of an expert system architecture, but it suffers from the same failings: it has over 1 million facts or assertions. These projects bring to mind the plight of the medieval alchemists, whose knowledge of the material world could only be acquired one experiment at a time.
Today we know the specific categories of atoms, we know their number and we know which will combine with which. That knowledge is elegantly displayed in the periodic table of the elements.
What we have discovered is that the core building blocks of knowledge exist in a number of discrete categories, and that instances within these categories may only be combined with other instances to create more complex concepts according to fixed rules. This epistemological framework allows the software to assemble the core building blocks into models that accurately represent reality, that is, models that make sense rather than nonsense.
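The idea of categorized building blocks combined under fixed rules can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the three categories (THING, PROPERTY, ACTION), the rule table ALLOWED, and the names Idea, can_combine, and combine are hypothetical and are not taken from the framework described here, whose actual categories and rules are not given in this text.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Category(Enum):
    # Hypothetical categories, chosen only for illustration.
    THING = auto()
    PROPERTY = auto()
    ACTION = auto()


@dataclass(frozen=True)
class Idea:
    name: str
    category: Category


# Hypothetical combination rules: the (head, modifier) category pairs
# that are allowed to form a more complex idea.
ALLOWED = {
    (Category.THING, Category.PROPERTY),  # a thing may have a property
    (Category.ACTION, Category.THING),    # an action may take a thing as its object
}


def can_combine(head: Idea, modifier: Idea) -> bool:
    """Return True if the rule table permits this pairing of categories."""
    return (head.category, modifier.category) in ALLOWED


def combine(head: Idea, modifier: Idea) -> Idea:
    """Build a composite idea, or raise ValueError if the pairing breaks the rules."""
    if not can_combine(head, modifier):
        raise ValueError(
            f"cannot combine {head.category.name} with {modifier.category.name}"
        )
    # The composite keeps the category of its head, so it can be reused
    # as a component of further ideas.
    return Idea(f"{head.name}({modifier.name})", head.category)


car = Idea("car", Category.THING)
red = Idea("red", Category.PROPERTY)
drive = Idea("drive", Category.ACTION)

red_car = combine(car, red)                 # allowed: THING + PROPERTY
drive_red_car = combine(drive, red_car)     # allowed: ACTION + THING
```

In this sketch, combining a thing with a property succeeds and the result can itself be reused as a component, while a pairing absent from the rule table, such as combine(red, drive), raises ValueError. The point is only that a small rule table over categories, rather than a list of individual facts, decides what counts as a well-formed combination.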
As chemistry is knowledge about materials, epistemology is knowledge about knowledge, or meta-knowledge. Can all the complexity of human knowledge be constructed from a manageable number of fundamental concepts? Can these concepts be placed into an even smaller number of categories that determine which kinds can be combined with other kinds to create sense rather than nonsense?
The answer is yes. Our approach is based upon “applied epistemology,” which, like mathematics, is partly discovered and partly invented. It rests on the insight that, like materials, knowledge itself has a hidden structure that can be exploited to create a compact specification for the construction of arbitrarily complex knowledge models from a relatively small number of core conceptual building blocks.