Like the biological brains that inspired them, neural networks process information in a manner that is both massively parallel AND massively interconnected. What does this mean, and how does this compare to digital computation?

The computational behavior of neural networks is a collective property that results from having many computing elements act on one another in a richly interconnected system. The collective properties have been studied using simplified model neurons for over a quarter century.

To understand how collective circuits work, we must take a wide view of computation. Any computing entity, whether it is a digital or analog device or a collection of nerve cells, begins with an initial state and moves through a series of changes to arrive at a state that corresponds to an “answer”. The process can be visualized as a path, from beginning state to answer, through the physical “configuration space” of the computer as it changes over time.

For a digital computer, this configuration space is defined by the set of voltages for its devices (usually transistors). The input data and program provide starting values for these voltage settings, which change as the computation proceeds and eventually reach a final configuration, which is reported to an output device such as a screen or printer.

For any computer, there are two questions of central importance: How does it determine the overall path through its configuration space? And how does it restore itself to the correct path when physical fluctuations and other “noise” cause the computation to drift off course? In a digital computer the path is broken down into logical steps that are embodied in the computer’s program. In addition, each computing unit protects against voltage fluctuations by treating a range of voltages, rather than just the exact voltage, as being equal to a nominal value; for example, signals between 0.8 volt and 1.2 volts can all be restored to 1.0 volt after each logical step in the computation.
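The restoration step described above can be sketched in a few lines. This is only an illustration: the function name `restore`, the two nominal levels, and the 0.2-volt tolerance are assumptions chosen to match the 0.8 to 1.2 volt example, not a description of any real logic family.

```python
def restore(voltage, nominal_levels=(0.0, 1.0), tolerance=0.2):
    """Snap a noisy voltage to the nominal level it lies within `tolerance` of.

    The levels and tolerance are illustrative assumptions, not real device specs.
    """
    for level in nominal_levels:
        if abs(voltage - level) <= tolerance:
            return level
    raise ValueError(f"{voltage} V is not close enough to any nominal level")


# Noisy signals after one logical step (made-up values for illustration).
noisy_signals = [0.85, 1.15, 0.1, -0.05]
restored = [restore(v) for v in noisy_signals]
print(restored)
```

Because every signal is pulled back to a nominal value after each step, small errors cannot accumulate from one step to the next.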

In collective-decision circuits the process of computation is very different. Collective computation is an analog process, not a digital process. The overall progress of the computation is determined not by step-by-step instructions but by the rich structure of connections among computing devices. Instead of advancing and then restoring the computational path at discrete intervals, the circuit channels or focuses it in one continuous process. The two styles of computation resemble two ways a committee can reach a decision: in the digital style, members vote yes or no one after another and cannot change a vote once it is cast; in the collective style, members express a range of opinions at the same time, hear one another, and adjust until a consensus emerges.
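As a rough sketch of what "one continuous process" means, the snippet below simulates a hypothetical two-unit analog circuit in which each unit inhibits the other. The dynamics `du_i/dt = -u_i + sum_j w_ij * tanh(u_j)`, the weights, and the step size are all simplifying assumptions made for illustration; the small Euler steps only approximate what a physical circuit would do continuously.

```python
import math

# Hypothetical two-unit circuit: each unit inhibits the other (weight -2.0).
WEIGHTS = [[0.0, -2.0],
           [-2.0, 0.0]]


def step(u, dt=0.01):
    """Advance the internal states u by one small Euler step of
    du_i/dt = -u_i + sum_j w_ij * tanh(u_j)."""
    outputs = [math.tanh(x) for x in u]
    return [
        u_i + dt * (-u_i + sum(w * v for w, v in zip(row, outputs)))
        for u_i, row in zip(u, WEIGHTS)
    ]


def settle(u, steps=2000):
    """Let the circuit evolve from internal state u and return the unit outputs."""
    for _ in range(steps):
        u = step(u)
    return [math.tanh(x) for x in u]


# A slight initial preference for unit 0 is amplified into a clear decision.
final_outputs = settle([0.1, -0.1])
print(final_outputs)
```

No instruction sequence appears anywhere in this sketch: the outcome (one unit ends up strongly on, the other strongly off) is fixed entirely by the connection weights and the starting state.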

In other words, in a neural network the software is the structure: both the pattern of interconnections and the strength of each connection matter to the computed result. Each of what could be a great many inputs helps determine the output of a neural node. This resembles the way an operational amplifier in electronics produces an output from a weighted combination of its inputs, which is why analog hardware versions of artificial neural networks can be built from interconnected operational amplifiers.

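The behavior of a neural network is best pictured as motion over a computational energy surface: every configuration of the circuit has an energy, and as the circuit evolves it moves downhill until it settles in a valley.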
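A minimal sketch of this idea follows, assuming a simplified network of binary units (each +1 or -1) with symmetric connections, updated one unit at a time. The energy formula `E = -1/2 * sum_ij w_ij * s_i * s_j`, the stored pattern, and the function names are illustrative assumptions; the analog circuits discussed here use continuous-valued units rather than binary ones.

```python
def energy(weights, state):
    """Computational energy E = -1/2 * sum_ij w_ij * s_i * s_j."""
    n = len(state)
    return -0.5 * sum(weights[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))


def settle(weights, state, max_sweeps=100):
    """Update one unit at a time until nothing changes; record the energy."""
    state = list(state)
    n = len(state)
    trace = [energy(weights, state)]
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
                trace.append(energy(weights, state))
        if not changed:
            break
    return state, trace


# Carve one valley into the surface by storing a single (made-up) pattern.
pattern = [1, -1, 1, -1, 1, 1]
n = len(pattern)
weights = [[0 if i == j else pattern[i] * pattern[j] for j in range(n)]
           for i in range(n)]

# Start from a corrupted copy of the pattern and let the network roll downhill.
noisy = list(pattern)
noisy[0] = -noisy[0]
final_state, trace = settle(weights, noisy)
print(final_state, trace)
```

In this sketch the energy never increases from one update to the next, so the state can only move downhill, and the corrupted input ends in the valley that the connection weights carved out for the stored pattern.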

Collective computation is well suited to problems that involve global interactions between different parts of the problem. Many of the problems that neural networks excel at can be described as “optimization problems”, such as the task assignment problem, in which each of several tasks must be matched to a different worker so that the overall result is as good as possible.

Perception can also be expressed as an optimization, in that our interpretation of sensory information is constrained by what we already know. Our senses constantly gather great quantities of information about the external world, and that information is often imprecise and noisy. The edge of an object might be hidden behind another object, for example. However, we know that the edges of objects are continuous, and a hidden edge doesn’t make us wonder whether the object has changed its shape.

This knowledge can often be represented as a set of constraints, similar to those in a task assignment problem, and expressed in a computational energy function.
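To make the connection between constraints and energy concrete, here is a small sketch for a three-worker, three-task assignment. The cost table, the penalty weight, and the function names are hypothetical, and the exhaustive search over all 512 configurations merely stands in for the downhill motion of a collective circuit; it is there to show where the deepest valley lies, not how a network would find it.

```python
from itertools import product

# Hypothetical cost of giving task j to worker i (lower is better).
COST = [[9, 2, 7],
        [6, 4, 3],
        [5, 8, 1]]
PENALTY = 20  # assumed weight; must exceed any single cost entry


def energy(assign):
    """Total cost plus a penalty for every broken constraint.

    assign[i][j] is 1 if worker i takes task j, else 0. The constraints are
    one task per worker (row sums of 1) and one worker per task (column sums of 1).
    """
    cost = sum(COST[i][j] * assign[i][j] for i in range(3) for j in range(3))
    row_violation = sum((sum(row) - 1) ** 2 for row in assign)
    col_violation = sum((sum(assign[i][j] for i in range(3)) - 1) ** 2
                        for j in range(3))
    return cost + PENALTY * (row_violation + col_violation)


def all_configurations():
    """Yield every 3x3 matrix of 0s and 1s, valid assignment or not."""
    for bits in product((0, 1), repeat=9):
        yield [list(bits[0:3]), list(bits[3:6]), list(bits[6:9])]


best = min(all_configurations(), key=energy)
print(best, energy(best))
```

With these numbers, the lowest-energy configuration breaks no constraint and is also the cheapest valid assignment, so the knowledge encoded in the penalty terms shapes the surface so that its deepest valley is the answer.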

The perceptual problem then becomes equivalent to finding the deepest valley in the computational energy surface. For example, problems in computer vision can be expressed as optimization problems and solved by a collective-decision circuit in which knowledge of the real world has been imposed as a set of constraints. This approach can be used to take incomplete depth information about a 3-D world and reconstruct missing information such as the location of the edges of objects.

One of the outstanding features of neural networks is that they converge on a good solution rapidly, usually within a few multiples of the response time of the computing devices. That is often less than a microsecond, for a problem on which a digital computer running even the most efficient algorithm would spend millions of cycles.

Here we can begin to discern the phenomenal power of neural networks, which again is a result of both their massive parallelism and massive interconnectedness.
