Intelligence as a cognitive algorithm

Formalization of intelligence as a cognitive evolution algorithm: variation by external inputs, & selection of input coordinates by projecting predictive patterns.

Intelligence is an ability to predict/self-predict (plan) by discovering & projecting patterns. This definition & the following opinions are mine, as the alternatives are scarce. For an excellent high-level discussion see "On Intelligence" by Jeff Hawkins, though it lacks consistency.
General (scalable) intelligence must recursively self-improve: continuously develop new algorithms. This requires a criterion of improvement, & to be universal it must come from the very definition of intelligence. There is an opinion that intelligence can be recognized but not defined, which is absurd because recognition *is* a match between an input & a definition.
I think the lack of functional definition is the main reason for the failure of general AI attempts over the last half-century, although Algorithmic Information Theory and Bayesian logic are a start.


We know one mechanism that produced an intelligence (although a pretty messed-up one): evolution. Initially very simple algorithmically, evolution changes heritable traits at random & evaluates the results for reproductive fitness. But biological evolution is obscenely inefficient because intelligence is only one element of reproductive fitness, & selection is extremely coarse-grained: on the level of a whole genome rather than individual traits.

It follows from the definition (an ability to predict/plan by discovering & projecting patterns) that a fitness function specific to intelligence is the predictive correspondence of recorded input patterns.
Correspondence is a representational analog of reproduction, maximized by an internalized evolution:
the heritable traits for predictions are past inputs, variation is a change of their coordinates & resolution, selected by their fitness: cumulative match to the following inputs. This match is discovered by comparison: an iterative subtraction, which also adds syntactic variation by derivation.

Match (fitness) should be quantified on the lowest level of comparison: this makes the selection process more incremental & the subsequent search exponentially more efficient.
This lowest level is the comparison between two single-variable inputs, & the match is partial identity: a complement of the difference, or the smaller of the comparands. This is also a measure of analog compression: a sum of bitwise AND between uncompressed comparands (represented by strings of ones). This adds a whole new dimension to Bayesian logic: I quantify partial match or occurrence (a micro-dimension of prediction), just as Bayesian logic adds quantified partial probability (a macro-dimension of prediction) to classical logic.
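
A minimal sketch of this lowest-level comparison, assuming non-negative integer inputs. The function names (compare, unary_and_match) are illustrative and do not come from the original text. It shows that the smaller comparand equals the sum of bitwise AND over unary (strings-of-ones) encodings, and that it is the complement of the absolute difference within the larger comparand.

```python
def compare(a: int, b: int) -> tuple[int, int]:
    """Compare two single-variable inputs (assumed non-negative integers).

    Returns (match, difference): match is the smaller comparand,
    difference is the signed result of subtraction.
    """
    if a < 0 or b < 0:
        raise ValueError("this sketch assumes non-negative inputs")
    match = min(a, b)
    difference = a - b
    return match, difference


def unary_and_match(a: int, b: int) -> int:
    """Match computed as a sum of bitwise AND over unary encodings.

    Each comparand is written as a string of ones padded with zeros
    to a common width, e.g. 3 -> 11100 when compared with 5 -> 11111.
    """
    if a < 0 or b < 0:
        raise ValueError("this sketch assumes non-negative inputs")
    width = max(a, b)
    unary_a = [1] * a + [0] * (width - a)
    unary_b = [1] * b + [0] * (width - b)
    return sum(x & y for x, y in zip(unary_a, unary_b))


match, difference = compare(5, 3)
# The match is the complement of the absolute difference within the larger comparand.
complement = max(5, 3) - abs(difference)
```

Here compare(5, 3) gives a match of 3 and a difference of 2, and both the unary AND sum and the complement agree with that match.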

To speed up, the search algorithm must incorporate increasingly complex shortcuts to discover better predictions. Speed is what it’s all about; otherwise we can just sit back & let biological evolution do the job. More complex predictions (patterns) & pattern discovery algorithms (meta-patterns) are derived from the past inputs with incremental comparison range & derivation depth. This includes introspective cognition: reinput & comparison among the higher derivatives.

The most basic predictive shortcuts are based on the assumption that the environment is not random:
- past patterns are decreasingly predictive with the distance from subsequent inputs.
- each pattern is increasingly predictive with the accumulated match among constituent inputs, & decreasingly predictive with the difference between them.
A core algorithm based on these assumptions would be an iterative step that selectively increases range of search & resulting complexity of the patterns in proportion to their projected cumulative match:

The original inputs are single variables produced by senses, such as pixels in case of visual perception.
Their subsequent comparison by iterative subtraction generates new variables: length & aggregate value of partial match & of partial miss (derivatives) for each variable of the comparands. The inputs are integrated into patterns (higher-level inputs) if the additional projected match is greater than the system's average for the computational resources necessary for additional syntactic complexity. This compressive syntax expands with every new level of search: each variable of an input pattern conditionally forms its own pattern.
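
The comparison and integration step above can be sketched as follows, under loud assumptions: inputs form a one-dimensional sequence of non-negative integers, only consecutive inputs are compared, and the "system's average" is reduced to a fixed threshold, average_match. The Pattern class and form_patterns function are hypothetical names, and the sketch does not model computational cost or recursive pattern formation per variable.

```python
from dataclasses import dataclass


@dataclass
class Pattern:
    """A span of consecutive comparisons on the same side of the average."""
    above_average: bool
    length: int = 0      # number of comparisons in the span
    match_sum: int = 0   # aggregate partial match
    diff_sum: int = 0    # aggregate signed difference (miss)


def form_patterns(inputs: list[int], average_match: int) -> list[Pattern]:
    """Compare consecutive inputs and group the results into patterns.

    average_match is a stand-in (an assumption of this sketch) for the
    system's average match per unit of computational resources.
    """
    patterns: list[Pattern] = []
    current = None
    for prev, cur in zip(inputs, inputs[1:]):
        match = min(prev, cur)   # partial match: the smaller comparand
        diff = cur - prev        # derivative produced by subtraction
        above = match > average_match
        # Start a new pattern whenever the comparison crosses the average.
        if current is None or current.above_average != above:
            current = Pattern(above_average=above)
            patterns.append(current)
        current.length += 1
        current.match_sum += match
        current.diff_sum += diff
    return patterns


patterns = form_patterns([5, 6, 7, 1, 0, 1, 8, 9], average_match=3)
```

With this input the sketch yields three patterns: an above-average span of length 2, a below-average span of length 4, and an above-average span of length 1. The length, match_sum & diff_sum fields are the new variables that a higher level of search would compare in turn.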

On the other hand, if predictive value (projected match) falls below the system's average, the input pattern is aggregated with adjacent "subcritical" patterns by iterative addition, into a lower-resolution input. Aggregation results in a "fractional" projection range for constituent inputs, as opposed to "multiple" range for matching inputs. By increasing magnitude of the input it also increases average projected match: a subset of the magnitude. Aggregation generates averages to evaluate match of the future inputs, & to determine their coordinates & resolution. So, alternative integrated|aggregated representations of inputs are produced by iterative subtraction|addition (the neural analogs are inhibition & excitation). Again, the choice between the two is determined by prior accumulated comparison of the respective inputs.
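
A rough sketch of aggregation by iterative addition, assuming each input already carries a projected match and that the system's average is a fixed threshold. The function name aggregate_subcritical and the (value, span) output format are illustrative choices, not part of the original text.

```python
def aggregate_subcritical(
    values: list[int], matches: list[int], average_match: int
) -> list[tuple[int, int]]:
    """Merge runs of adjacent below-average inputs into summed inputs.

    Returns (value, span) pairs: span is the number of original inputs
    covered, so span > 1 marks a lower-resolution, aggregated input.
    """
    if len(values) != len(matches):
        raise ValueError("values and matches must have the same length")
    out: list[tuple[int, int]] = []
    acc_value, acc_span = 0, 0
    for value, match in zip(values, matches):
        if match < average_match:
            # Subcritical: add into the running aggregate.
            acc_value += value
            acc_span += 1
        else:
            # Close any open aggregate, then keep this input at full resolution.
            if acc_span:
                out.append((acc_value, acc_span))
                acc_value, acc_span = 0, 0
            out.append((value, 1))
    if acc_span:
        out.append((acc_value, acc_span))
    return out


aggregated = aggregate_subcritical(
    values=[4, 1, 2, 1, 9, 1], matches=[5, 0, 1, 0, 6, 1], average_match=3
)
```

In this example the three adjacent subcritical inputs 1, 2, 1 are summed into one input of magnitude 4 spanning three positions, while the above-average inputs 4 & 9 keep their original resolution.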

Thus, cognition is a form of evolution where variation does not proceed by altering the inputs directly: a prediction can only be derived from experience. Rather, variation redefines the source & resolution of inputs, & generates derivatives (higher syntax / higher level inputs) by comparing them. Such variation is incremental rather than random: the resolution & syntactic complexity of inputs increase or decrease in proportion to their relative cumulative match which is the selection criterion. I consider this to be a higher phase of metaevolution, a subject of my other knol.  Cognition is driven by predictive fitness, where the patterns themselves are dispensable, compared to biological evolution driven by reproductive fitness, where the patterns (genomes) are the end in themselves.

The biggest hangup people usually have is that this kind of algorithm is obviously very simple, while working intelligence is obviously very complex. But, as I tried to explain, additional complexity is learnable and should only improve speed, rather than change the "direction" of cognitive evolution (although it may save some eons). The main criterion for such an algorithm is the ratio of benefit (predictive power) to cost (complexity).

I would summarize such an algorithm as Comparison-Projection, a more constructive analog to Jeff Hawkins' Memory-Prediction. Hope this makes sense. I have a far more advanced work-in-progress, but need feedback on these core premises first. If correct, they are already way ahead of any other approach, to my knowledge.

Comments

How to filter out the improbable seems to me to be the key

Generation of a plethora of possible near-futures seems possible, but how to filter out the staggering majority which are improbable, or illegal in terms of the physical laws of the universe, seems complicated. Also, how to collapse possibilities that are so similar as to be essentially the same probabilistically? Then, your discussion of probability ranking the remaining possibilities makes sense.

In any case, it would be a delight to hear from you Vitya/Burya. rick at bunkerplanet dot com.

Last edited Jul 17, 2009 5:12 AM

AGI

Interesting. Very nice to see more people working on Artificial General Intelligence.

I have written a few articles on self improving AI here: http://seedai.blogspot.com/2007_08_01_archive.html
In those, I agree with much of what is written here, for example "If we want to talk about improving programs, we have to define what it means to improve one's intelligence, and thus what it means to be intelligent. We want intelligent systems to be useful. Useful intelligence is, just as science, about prediction, planning and pattern recognition. These are all so intertwined as to be more or less the same thing."

You are very welcome to read and post your thoughts on my articles.

Last edited Aug 11, 2008 10:28 PM
Boris Kazachenko
cognitive algorithm
Boston, MA