Table of Contents


Chapter 1: Introduction to Artificial Intelligence and Neural Networks
What is Artificial Intelligence?
What is a Neural Network?
How Do Neural Networks and Expert Systems Differ?
What are Neurons?
How is a Neural Network Built, What Do They Look Like?
How Does a Neuron Work?
How Do Neural Networks Learn?
What Can Neural Networks Learn?
 
Chapter 2: Neural Networks and Machine Learning
Types of Neural Networks
Machine Learning
Types of Neural Network Learning
Specific Neural Network Uses and Applications
How to Determine If an Application is a Candidate for a Neural Network
List of Figures
List of Tables
 
Chapter 1
Introduction to Artificial Intelligence and Neural Networks
What is Artificial Intelligence?
What is a Neural Network?
How Do Neural Networks and Expert Systems Differ?
What are Neurons?
How is a Neural Network Built, What Do They Look Like?
How Does a Neuron Work?
How Do Neural Networks Learn?
What Can Neural Networks Learn?

What is Artificial Intelligence?

There is no single, simple, or universally agreed upon definition of artificial intelligence; as a general definition, however, artificial intelligence is the science and engineering of making intelligent machines, especially intelligent computer programs. Intelligence in this regard refers to biological (human) intelligence, but artificial intelligence does not have to confine itself to methods that are biologically observable. Traditionally, computers use a type of logic that requires every condition of a problem to be specified in its entirety. Computers use algorithms, precise step-by-step sequences of operations that solve a problem and guarantee a solution. Humans generally use heuristics, general guidelines that allow us to solve most problems very quickly based on our knowledge of the world. The benefit of algorithms is that a solution is always correct and completely deterministic. The drawback is that algorithms can only do what they are designed to do; they cannot handle unexpected or unforeseen occurrences, and they are frequently very time consuming compared to heuristics. Heuristics, by contrast, are very flexible. The advantage of this type of problem solving is that it is fast and tolerant of imprecise or unseen information. The drawback is that heuristics are not guaranteed to produce the best solution, or even an acceptable one.
Humans tend to use heuristics because of the number and complexity of tasks that must be performed every second of every day. The amount of information with which a human must deal, and the fact that we have limited short-term memory, make the use of strict algorithms in day-to-day activities practically impossible. Computers, on the other hand, have effectively unlimited short-term memory and deal with very specific problems one at a time. This makes algorithmic solutions the perfect choice for computers. The need for artificial intelligence arises when a computer must solve a problem that is normally solved in a heuristic fashion. Left to strict algorithms, a computer must make precise, step-by-step decisions in which every step is verifiable and accurate. For some problems this method would make a solution so time consuming that, for all practical purposes, the problem is unsolvable by a computer. Classification problems are a classic example of this situation. Humans can classify objects very quickly and fairly accurately after seeing only a few examples of the objects in question. For example, after learning a few general rules and seeing a few examples of mammals, a human could classify most animals as mammals or non-mammals very easily. For a computer to do the same, every animal in the world would have to be cataloged and input to the computer so that it could recall that information when asked. In other words, no "thinking" is involved, only massive input, memorization, and perfect recall. Humans can perform the classification task with minimal input and imperfect memorization and recall. The trade-off is between accuracy on one side (computers) and speed and efficiency on the other (humans). The solution to this problem for computers is artificial intelligence. Artificial intelligence seeks to bridge the gap between human and machine by using heuristics within computer programs. The computer gives up some small measure of accuracy in exchange for flexibility and speed.
Some artificial intelligence techniques include neural networks, expert systems, symbolic manipulation, search and planning strategies, and genetic algorithms. Each has its advantages and disadvantages but today, the most widely used and successful techniques are expert systems, search and planning strategies, and very recently neural networks.
 
What is a Neural Network?

Artificial Neural Networks (ANN) are a relatively new approach to computing that uses an interconnected assembly of simple processing elements loosely based on the neuron, a specialized biological cell found in the animal brain. A generally accepted basic definition of an ANN is a network of many simple processors. These simple processing elements are referred to as units, nodes, or neurons. The units are connected by communication channels referred to as "connections," which carry numeric data between nodes (see Figure 3). Each unit operates only on its local data and on the inputs it receives via its connections. The processing ability of the network as a whole is stored in the inter-unit connection strengths, or weights. These weights are obtained by a process of adaptation to a set of training patterns, similar to the way neural connections in the human brain are strengthened or weakened by some stimulus. Another name for this model is connectionist architecture. This approach differs greatly from the more traditional symbolic or expert system approach to artificial intelligence. Neural nets have the ability to learn and derive meaning from complex, imprecise, or noisy data, extracting patterns that would otherwise be imperceptible by other means. A trained neural network can be thought of as an "expert" in the category of information it has been given to analyze, and it can then be asked "what if" questions about that information. The greatest power of a neural network comes from its ability to generalize from information it has seen to similar patterns of information that it has not seen.

 
How Do Neural Networks and Expert Systems Differ?

Neural networks differ from both the expert system approach to artificial intelligence and the traditional algorithmic approach to computing. Expert systems use rules and facts to offer solutions to complex problems that would normally require a human expert. These rule-based and symbolic solutions have a common thread in that they all address relatively well defined problems solvable by some procedural method. In other words, rule-based systems perform high level reasoning tasks. An example of such a system is MYCIN, an expert system for diagnosing and recommending treatment of bacterial infections of the blood, developed by Shortliffe and associates at Stanford University. To create such a system, hundreds or thousands of facts are entered into the expert system. In addition to these facts, hundreds or thousands of rules that operate on those facts are also entered. The facts and the rules that operate on them are kept essentially separate, and any fact can affect any rule and vice versa. The system operates by taking in facts representing a current problem, applying applicable rules to those facts, generating new facts to which further rules are applied, and eventually producing a conclusion to the initial set of facts. Expert systems are very powerful tools in that any number of facts and rules can be entered into the system in any order. Conflicting facts and rules may also be entered, assuming an appropriate conflict resolution scheme exists within the expert system. Theoretically an expert system can solve any high level reasoning task, provided the rules and facts of the problem have been entered. The drawback to such a system is that the rules and facts must be known ahead of time and they must be specified to the system. For some problems this is impossible. For example, the most extreme high level artificial intelligence problem is common sense reasoning. For a rule-based system to perform common sense reasoning, every fact and rule even remotely connected to a common sense problem would have to be entered into the system. The possible number of rules, facts, and potential conflicts makes this impossible with current programming tools. In fact, no artificial intelligence technique has been able to solve this problem, although recent research has pointed to a partial solution using hybrid models, a combination of neural nets and expert systems.
Neural networks take an entirely different approach to artificial intelligence. Neural networks seek to model (on a very rudimentary level) the biological action of the animal brain. Neural networks operate on the idea that the conceptual (high level) representation of information is not important. A neural network seeks to represent data in a distributed fashion across many simple processing elements so that no single piece of the network contains any meaningful information; only the network as a whole has the ability to process, store, and produce information and make decisions. Because of this, very little information can be gained by observing the network itself; only the actions of the network are meaningful.
Each technique has its strengths and weaknesses, and although they are often thought of as competitors, this is not really the case. Each is well suited to a particular type of artificial intelligence problem. Expert systems are very well suited to well defined problems with facts and rules. Where these rule-based techniques fall short is on low level perceptual tasks such as vision, speech recognition, complex pattern matching, and signal processing. Rule-based techniques also have difficulty dealing with fuzzy, imprecise, or incomplete data. Data in a rule-based system must be in a precise format. Noisy or incomplete data may confuse an expert system unless specific steps are taken to account for such variability. This is where neural networks can do what expert systems cannot. Neural networks distribute the representation of data across the whole network of neurons so that no one part of the network is responsible for any one fact or rule. This enables the network to deal with errors in data and allows it to learn complex patterns that no human expert could perceive and quantify in simple rule/fact form.

 
What are Neurons?

The power and flexibility of a neural network follow directly from the connectionist architecture. This architecture begins with simple neuron-like processing elements. A real neuron is a specialized biological cell, found in the animal brain, that processes information and presumably stores data. As shown in Figure 1, a neuron is composed of a cell body and two types of outreaching tree-like branches, the axon and the dendrites. A neuron receives information from other neurons through its dendrites and transmits information through the axon, which eventually branches into strands and substrands. At the end of these substrands is the synapse, the functional unit between two neurons. When an impulse (information) reaches a synapse, chemicals are released that enhance or inhibit the receiver's tendency to emit electrical impulses.
 

Figure 1: Biological Neuron

A synapse's effectiveness can be adjusted by the impulses passing through it, so that synapses can learn from the activities in which they participate. This dependence on a specific sequence of impulses acts as a memory, possibly accounting for human memory, and forms the basis for artificial neural network technology. Dendrites and axons form the inputs and outputs of the neuron, respectively. A neuron does nothing unless the collective influence of all its inputs reaches some threshold level. When the threshold is reached, the neuron produces a pulse that proceeds from the body to the axon branches. Stimulation at some synapses encourages a neuron to fire, while at others firing is discouraged.
An artificial neuron, as conceptually shown in Figure 2, is structured to simulate a real neuron, with inputs (x1, x2, ... xn) entering the unit and then being multiplied by corresponding weights (w1, w2, ... wn) to indicate the strength of the "synapse." The weighted signals are summed to produce an overall unit activation value. This activation value is compared to a threshold level; if the activation exceeds the threshold, the neuron passes on its data. This is the simplest form of the artificial neuron and is known as a perceptron.

Figure 2: Artificial Neuron (perceptron)

Another interesting property of biological neurons is the way they also encode information in terms of frequency. Real neurons do not simply pass on information as individual electrical pulses; the rate at which the pulses are emitted also encodes information. This is a major difference between simple perceptrons (artificial neurons) and real neurons. The difference can be partially overcome by allowing the artificial neuron to pass on a graded output based on a mathematical function known as an activation function. An activation function allows the artificial neuron to loosely simulate the frequency characteristics of a real neuron's electrical signals.
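To make this concrete, here is a minimal sketch of the perceptron just described, written in Python. The inputs, weights, and threshold are arbitrary illustrative values, and the sigmoid shown as an activation function is just one common choice, not one prescribed by the text.

```python
import math

# A minimal perceptron: multiply each input by its weight, sum the results,
# and compare the total (the activation value) to a threshold.
def perceptron(inputs, weights, threshold):
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation > threshold else 0

# A unit with a sigmoid activation function instead gives a graded output
# between 0 and 1 rather than an all-or-nothing pulse.
def sigmoid_unit(inputs, weights):
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-activation))

print(perceptron([1, 0, 1], [0.4, 0.9, 0.7], threshold=1.0))  # 1, since 0.4 + 0.7 = 1.1 > 1.0
print(sigmoid_unit([1, 0, 1], [0.4, 0.9, 0.7]))               # roughly 0.75
```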

 
How is a Neural Network Built, What Do They Look Like?

The single neuron described earlier can be structured to solve very simple problems, but it will not suffice for complex ones. The solution to complex problems involves the use of multiple neurons working together; this is known as a neural network. The artificial neuron is a simple element that can be made part of a large collection of neurons in which each neuron's output is the input to the next neuron in line. These collections of neurons usually form layers, as shown in Figure 3. Although this multi-layer structure can take on virtually any shape, the most common structure is called a feedforward network and is pictured in Figure 3. The term feedforward comes from the pattern of information flow through the network.

Figure 3: Example Multi-layer Perceptron

Data enters the bottom layer, called the input layer, where it is distributed forward to the next layer. This second layer, called a hidden layer, collects the information from the input layer, transforms the data according to some activation function, and passes the data forward to the next layer. The third layer, called the output layer, collects the information from the hidden layer, transforms the data a final time, and then outputs the results.
The 3-layer structure shown in Figure 3 is a standard feedforward network, although many variations of this network exist. For example, feedforward networks may have two or more hidden layers, although the basic idea of any feedforward network is that information passes from bottom to top only. Feedforward networks may have any number of neurons per layer, although it is very common for networks to have a pyramid shape in which the input layer is larger than the hidden layer, which in turn is larger than the output layer.
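The following sketch shows how data might flow through a small three-layer feedforward network of the kind pictured in Figure 3. The layer sizes and weight values are arbitrary placeholders chosen only for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights):
    # Each row of 'weights' holds one unit's connection strengths to the layer below.
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

# Pyramid-shaped example: 3 input units -> 2 hidden units -> 1 output unit.
hidden_weights = [[0.2, -0.5, 0.8],
                  [0.7,  0.1, -0.3]]
output_weights = [[0.6, -0.4]]

inputs = [1.0, 0.5, 0.25]
hidden = layer_forward(inputs, hidden_weights)   # input layer feeds forward to the hidden layer
output = layer_forward(hidden, output_weights)   # hidden layer feeds forward to the output layer
print(output)
```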

 
How Does a Neuron Work?

Artificial neural networks are built up from the simple idea of the perceptron, or artificial neuron. To understand the network it is necessary to understand the neuron. One neuron is able to solve very simple problems, for example a simple logic problem known as the logical AND. The logical AND problem assumes two premises and says that something is true if and only if both premises are true. For example, if it is raining AND I go outside, then I will get wet. The two premises are 1) it is raining, 2) I go outside. For "I get wet" to be true, both premises must be true first. If either one is true but the other is not, or if both are false, I will not get wet. This type of problem can be applied directly to a single neuron, and a single neuron can classify all the possible cases in the problem. Table 1 shows all the possible cases. Notice there is only one possible way for the conclusion "I get wet" to be true, which is listed first in the table.
 

Premise 1             Premise 2               Conclusion
It Is Raining         I Go Outside            I Get Wet
It Is Not Raining     I Go Outside            I Do Not Get Wet
It Is Raining         I Do Not Go Outside     I Do Not Get Wet
It Is Not Raining     I Do Not Go Outside     I Do Not Get Wet
 
Table 1: Logical AND Problem (1)

Now consider a single neuron structured as shown in Figure 4. Assume that if a premise is true it is equal to the number 1, and if it is false it is equal to 0. Also assume that the neuron sends out a "signal" of 1 if the answer is "get wet" or a 0 if the answer is "do not get wet." We can set a simple, arbitrary activation function that says if the neuron receives a combined signal higher than 1.5 it will send out a 1 (get wet); otherwise it will send out a 0 (do not get wet). This is all that is needed to solve this logic problem.

Figure 4: Simple Neuron Problem

If "it is raining" and "I go outside" are both true the neuron in Figure 4 will receive 1 + 1 = 2 which is greater than 1.5 and it will send out a signal of 1 which means "I get wet." For all other cases the neuron will receive a total of only 1 or 0 and it will send out a 0 and "I will not get wet" will be the conclusion. In this way a conceptual problem such as "will I get wet?" has been transformed to a mathematical problem. This type of conversion from a conceptual problem that humans understand to a numerical problem computers understand is termed encoding . Much of artificial intelligence is concerned with encoding and data representation. Table 1 can now be shown numerically as Table 2.
 

Premise 1    Premise 2    Conclusion
1            1            1
0            1            0
1            0            0
0            0            0

 
Table 2: Logical AND Problem (2)

By adjusting the strength of the inputs (the weighting on the connections) and the way the collective influence of the inputs is used (the activation function), any simple problem such as the one described here can be encoded and solved. For a more complex problem a single neuron will not suffice. More complex problems require several neurons working together as a neural network. Neural networks operate similarly to the single neuron except that they combine their outputs to handle complex problems.
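As a sketch of this encoding, the neuron of Figure 4 can be written out directly in Python. The weights of 1 and the threshold of 1.5 come from the discussion above, and the loop simply reproduces the conclusions of Table 2.

```python
# The neuron of Figure 4: both connections have weight 1 and the threshold is 1.5,
# so only the case 1 + 1 = 2 exceeds the threshold and produces the "get wet" signal.
def and_neuron(it_is_raining, i_go_outside):
    activation = 1 * it_is_raining + 1 * i_go_outside
    return 1 if activation > 1.5 else 0

for premise_1, premise_2 in [(1, 1), (0, 1), (1, 0), (0, 0)]:
    print(premise_1, premise_2, "->", and_neuron(premise_1, premise_2))
# Prints the same conclusions as Table 2: 1 only for (1, 1), otherwise 0.
```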

 
How Do Neural Networks Learn?

Neural networks are known for their ability to learn a task when given the proper stimulus. Usually neural networks learn through a process called supervised learning. This type of learning requires sets of data where the inputs and correct outputs are known ahead of time. For example, suppose a neural network were to be taught to recognize handwritten characters. Several examples of each letter, written by different people, could be given to the network. As the teacher, the neural programmer has several examples of all characters in the alphabet (the inputs) and knows to which category ('A', 'B', 'C', etc.) each character belongs (the output). Inputs (characters in this case) are then given to the network. The network will produce some kind of output (probably wrong, e.g. it will say an 'A' is an 'O'). Initially, the network's responses are essentially random and most likely incorrect. When the neural network produces an incorrect decision, the connections in the network are weakened so it will be less likely to produce that answer again. Similarly, when the network produces a correct decision, the connections in the network are strengthened so it will be more likely to produce that answer again. Through many iterations of this process, giving the network hundreds or thousands of examples, the network will eventually learn to classify all the characters it has seen. This process is called supervised learning since the programmer guides the network's learning through the type and quality of data given to the network. For this reason neural networks are said to be data driven, and it is critical that the data given to the network be carefully selected to represent the information the network is to learn.
The real power of a neural network lies not in what it can learn but in what it can do with that information. A trained neural network will not only identify and classify data it has seen but will also generalize to similar data it has not seen. In the handwriting example, if the network is given examples of characters from 20 different people it will then be able to correctly identify characters written by almost any person, whether it has seen those particular instances of characters or not. This represents a major difference between artificial intelligence programming and conventional programming. In conventional programming, every character and every variation on every character would have to be programmed into a computer before that computer could identify all characters written by all people. Artificial intelligence techniques such as neural networks work by generalizing from specific patterns to general patterns. This is similar to human problem solving in that we very often reason from the specific to the general (inductive reasoning). In this way neural networks can learn to classify groups of data, match patterns in data, and approximate mathematical functions.
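A rough sketch of the strengthen/weaken cycle described above is the classic perceptron learning rule, shown below. The training data is a toy stand-in (the logical AND cases from Chapter 1) rather than real handwriting data, and the learning rate and number of passes are arbitrary choices.

```python
# Toy supervised learning loop using the perceptron learning rule: connections are
# adjusted only when the network's answer is wrong.
training_data = [([1, 1], 1), ([1, 0], 0), ([0, 1], 0), ([0, 0], 0)]
weights = [0.0, 0.0]
bias = 0.0
rate = 0.1

for epoch in range(50):                        # many passes over the same examples
    for inputs, target in training_data:
        activation = sum(w * x for w, x in zip(weights, inputs)) + bias
        output = 1 if activation > 0 else 0
        error = target - output                # 0 if correct, +1 or -1 if wrong
        weights = [w + rate * error * x for w, x in zip(weights, inputs)]
        bias += rate * error

for inputs, target in training_data:
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    print(inputs, 1 if activation > 0 else 0, "target:", target)  # outputs should match the targets
```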

 
What Can Neural Networks Learn?

Theoretically, neural networks can learn any computable function, whether or not that function can be identified by the programmer or a mathematician. Neural networks are especially useful for classification and function approximation problems that are tolerant of some imprecision, that have lots of data available, and to which hard and fast rules cannot easily be applied. Neural networks work by finding a best match between inputs (premises) and outputs (conclusions) based on what they have seen in the past. In this way, neural networks do not give a perfect solution; they give a "best" solution given the information at hand. Neural networks, like all artificial intelligence techniques, are based on the assumption that a slightly less than perfect solution that is acceptable is better than a perfect solution which may be practically impossible to find and implement. For example, in the handwriting example given earlier, the neural network will learn to identify most characters written by most people (greater than 99% accuracy) but it will fail a small percentage of the time. This inaccuracy is considered acceptable because the alternative is to use conventional programming and create a database of every possible character that has ever been or will ever be written by every human. Just creating, let alone using, such a database is practically impossible. By giving up a small measure of accuracy, an artificial intelligence technique such as a neural network can be implemented in a matter of hours by a single programmer using a small fraction of the total information to be learned. Artificial intelligence is inspired by biological intelligence in that it is considered more important to have a fast, general, and very robust solution than to have a perfect but time consuming solution.
Neural networks are basically function approximators and pattern matchers, and in general all neural networks perform only these functions. Neural networks may, however, be applied to a variety of problems that can make use of their pattern matching and approximation ability. Neural networks are not only used to classify and match data directly but also for vision and speech recognition, prediction and forecasting, data mining and extraction, and process control and optimization. Each of these tasks is accomplished through the creative use of the neural network's pattern matching ability. Once trained, a neural network can be inserted as the heart of any decision making system where the patterns of inputs and outputs are fed to a problem-specific application that can interpret and process that data. For example, a neural network on its own has no process control logic or process control ability; however, it can be given process control data for a particular system and it can learn the workings of that control system. Once that system has been learned, the neural network can be inserted as the decision making part of the control system. The neural network will not only replicate the control system rules it has seen but will also generalize to unknown conditions. This means that when the system as a whole is presented with new and unseen situations, the neural network will extrapolate from known conditions to unknown conditions and provide a "best match" decision based on this new information.
 

Chapter 2
Neural Networks and Machine Learning
Types of Neural Networks
Machine Learning
Types of Neural Network Learning
Specific Neural Network Uses and Applications
How to Determine If an Application is a Candidate for a Neural Network
 
 
 
Types of Neural Networks

The type and variety of artificial neural networks is virtually limitless, although neural networks are generally classified according to two factors: the topology (shape) of the network and the learning method used to train the network. For example, the most widely used topology is the feedforward network and the most common learning method is the backpropagation of errors. Backpropagation is a form of supervised learning in which a network is given input and the network's actual output is compared to the correct output. The network's connections are adjusted to minimize the error between the actual and the correct output. Feedforward networks that use backpropagation learning are so common that they are often referred to as "backpropagation networks," although this terminology is not strictly correct. "Multi-layer feedforward" refers to the topology and pattern of information flow in the network. "Backpropagation" refers to a specific type of learning algorithm in which errors in the output layer are fed back through the network. It is possible to use a feedforward architecture without backpropagation, or to use backpropagation with another type of architecture. In any case, it has become commonly accepted to call this combination of topology and learning method simply a backpropagation network.
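The sketch below shows backpropagation on a tiny feedforward network, assuming a squared-error measure and sigmoid activations throughout. The network size (2-2-1), learning rate, number of passes, and toy data set are all illustrative choices, not values taken from the text.

```python
import math, random

random.seed(1)
sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))

# A tiny 2-2-1 feedforward network with illustrative random starting weights.
w_hid = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(2)]
b_hid = [0.0, 0.0]
w_out = [random.uniform(-0.5, 0.5) for _ in range(2)]
b_out = 0.0
rate = 0.5
data = [([1, 1], 1), ([1, 0], 0), ([0, 1], 0), ([0, 0], 0)]   # the logical AND cases again

def forward(x):
    hid = [sigmoid(sum(w * xi for w, xi in zip(w_hid[j], x)) + b_hid[j]) for j in range(2)]
    out = sigmoid(sum(w * h for w, h in zip(w_out, hid)) + b_out)
    return hid, out

for epoch in range(5000):
    for x, target in data:
        hid, out = forward(x)                                  # forward pass
        d_out = (target - out) * out * (1 - out)               # error signal at the output layer
        d_hid = [d_out * w_out[j] * hid[j] * (1 - hid[j]) for j in range(2)]  # error fed back to the hidden layer
        # Adjust every connection in proportion to its share of the error.
        w_out = [w + rate * d_out * hid[j] for j, w in enumerate(w_out)]
        b_out += rate * d_out
        for j in range(2):
            w_hid[j] = [w + rate * d_hid[j] * xi for w, xi in zip(w_hid[j], x)]
            b_hid[j] += rate * d_hid[j]

for x, target in data:
    print(x, round(forward(x)[1], 2), "target:", target)   # trained outputs should approach 1, 0, 0, 0
```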
Another common network structure is the recurrent or feedback network. Recurrent networks are usually similar in shape to feedforward networks, although data may pass backward through the net or between nodes in the same layer (see Figure 5). Networks of this type operate by allowing neighboring neurons to adjust other nearby neurons in either a positive or negative direction.

 
Figure 5: Recurrent/Feedback Network

This allows the network to reorganize the strength of its connections not only by comparing actual output against correct output but also through the interaction of neighboring neurons. Recurrent networks are generally slower to train and to implement than feedforward networks, although they present several interesting possibilities, including the idea of unsupervised learning. In unsupervised learning the network is given only input, with no output, and neurons are allowed to compete or cooperate to extract meaningful information from the data. This is especially useful when analyzing data in search of some pattern when no specific pattern is known to exist ahead of time.
A third network structure, also based on the feedforward architecture, is the functional link network. This type of network, as shown in Figure 6, duplicates the input signal and applies some type of transformation to it. For example, consider a network designed to take in a series of past stock prices and output a predicted future stock price. This network may have as input four past stock prices (last month, last week, and the past two days). The output may be a single value such as tomorrow's stock price. In a functional link network, additional inputs are also given to the network which are some form of the original inputs. These additional inputs may be various products of the original inputs, or they may be high and low values from the whole input set, or they may be virtually any combination of mathematical functions that are deemed to contain value for this set of inputs. In Figure 6 a functional link network is shown with four actual inputs and two additional functional links, which in this example are products of the first two and second two inputs. In this network the functional links are connected directly to the output layer, although they may instead be directed toward the hidden layer. The idea behind this type of network is to give the network as much information as possible about the original input set by also giving it variations of that set.

Figure 6: Functional Link Network
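A small sketch of the preprocessing step a functional link network implies: the original inputs are kept and augmented with transformed copies before the pattern is presented to the network. The products of the first two and last two inputs follow the Figure 6 example; the price values themselves are made up for illustration.

```python
# Functional link preprocessing: keep the original inputs and append extra inputs,
# here the products of the first two and the last two values, as in Figure 6.
def add_functional_links(inputs):
    x1, x2, x3, x4 = inputs
    return inputs + [x1 * x2, x3 * x4]

past_prices = [101.5, 99.0, 103.2, 104.8]     # made-up example values
print(add_functional_links(past_prices))      # six values are presented to the network
```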

 
Machine Learning

In discussions of neural networks, and of artificial intelligence in general, the topic of learning is a central theme. True human-like learning is beyond all artificial intelligence techniques, although some learning techniques have been developed that allow machines to mimic human intelligence. These techniques, which allow computers to acquire information with some degree of autonomy, are collectively known as machine learning. Machine learning is the field of artificial intelligence that attempts to mimic or actually duplicate human learning. There are many artificial intelligence techniques that do not employ any type of learning, such as search and planning strategies. These types of artificial intelligence methods rely on sophisticated search methods that can examine massive amounts of data and very quickly pick out important information without searching the entire set of data. These strategies do not learn, but they do mimic a human's ability to quickly investigate different paths and select the one that seems most productive. These kinds of techniques are fairly static: as long as the information they are given does not change, they will always behave exactly the same.
Theoretically, neural networks fall into the category of machine learning. Neural networks are specifically designed to program themselves based on the information they are given. A neural programmer's job is to set up the structure and learning ability of the network and then provide the network with good information. If the network is designed correctly and the information input to the network is of acceptable quantity and quality, the network will adapt to understand that information. In a sense neural networks exhibit the ability to learn in a fashion similar to animal learning: they have a given structure (topology and learning method), they are presented with stimuli (inputs), and they adapt to those stimuli.
Most practically implemented neural networks do not continue learning once they have been trained and placed in service. Neural networks are usually designed to be taught once, and then the network is put to use. While in service they remain fixed and do not adapt to changing conditions, so in this sense they are not truly "intelligent." There are some examples of on-line adaptive networks that "learn as they go" and continually adapt to changing conditions. Most on-line learning neural networks are experimental, although a few have been constructed and put into practical service. These networks continually retrain in small increments to adapt to changing conditions. This is an exciting area in neural network technology, since a network that can reliably learn in an on-line fashion can be put into service for a virtually indefinite period of time and will continue to acquire information and adapt to its environment. The most sophisticated of these on-line networks can also adjust the number of nodes in their hidden layer(s), although at the moment this is still largely experimental.
In practice, expert systems are not generally considered to be in the category of machine learning, since they are built with a certain set of facts and rules, put into service, and generally do not adapt on their own. By this definition, however, most neural networks must also be excluded from machine learning, since neural network training can be considered analogous to entering rules and facts into an expert system, and both systems are then simply put into service, where they usually remain static. Expert systems can, however, be updated at any time by entering new facts and rules, which again is analogous to a neural network that observes new conditions and is allowed to update itself to those conditions while in service. Since both require a human to carefully prepare new information and either enter that information or explicitly allow the system to acquire it, they may both be considered in the category of machine learning. This is especially true for the few theoretical and experimental expert systems that have the ability to create new facts and rules autonomously by combining the facts and rules already given. Some experimental and "toy" expert systems have been designed with the ability to add information to themselves from what they observe during operation and from the interaction of the current set of rules and facts. These systems are not considered completely practical at this time, but there is no reason to believe they will not eventually be brought into practical service.
Unfortunately, neural networks and expert systems are like all artificial intelligence techniques in that they can only solve problems for which they were designed; they have no ability to change problem domains, cross-reference learning, or restructure themselves for a new problem. For example, a neural network that has been designed and trained to drive a car (there are several examples of this) cannot learn to do character recognition. If it is restructured for another task, it will not be able to perform the original task. In addition, a neural network that learns one task such as driving a car will have no ability to drive a motorcycle, a similar but different task. Neural networks, like all current artificial intelligence techniques, are highly task specific (narrow domain). The ability to combine learning from different domains and acquire truly new information from that combination is beyond all machine learning techniques. In narrow domains with relatively stable conditions, however, there are many neural network and machine learning solutions that perform extremely well and can learn.
 
 
Types of Neural Network Learning

There are generally three different approaches to neural network learning: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning requires the programmer to give the network examples of inputs and the correct output for each given input. In this way the network can compare what it has output against what it should have output, and it can correct itself. Figure 7 shows the backpropagation method. Backpropagation is the most widely used method for neural network training because it is the easiest to implement and to understand and it works reasonably well for most problems.
Unsupervised learning provides input but no correct output. A network using this type of learning is given only inputs, and the network must organize its connections and outputs without direct feedback. There are several ways in which this type of learning is accomplished; one is Hebbian learning and another is competitive learning. Hebbian learning states that if the neurons on both sides of a synapse are selectively and repeatedly stimulated, the strength of the synapse is increased. This type of learning is well suited to data extraction and analysis in which a pattern is known to exist in some data but the type and location of the pattern is unknown. Competitive learning uses a "winner take all" strategy in which output neurons compete to decide which is the strongest and should remain active for a given input while all others remain passive. This type of learning is used most often for categorization, where categories of data are thought to exist within a set of data but the exact nature of the categories is unknown. Unsupervised learning is not as well understood or as widely implemented as supervised learning, but its possibilities are very promising.
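The Hebbian idea reduces to a very small update rule, sketched here with an arbitrary learning rate and arbitrary activity values: a connection grows whenever the units on both sides of it are active together.

```python
# Hebbian update sketch: strengthen a connection when the units on both sides of the
# "synapse" are active at the same time.
def hebbian_update(weight, pre_activity, post_activity, rate=0.1):
    return weight + rate * pre_activity * post_activity

w = 0.2
w = hebbian_update(w, pre_activity=1.0, post_activity=0.8)   # both sides active, so the weight grows
print(round(w, 2))   # 0.28
```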
Reinforcement learning is a method halfway between supervised and unsupervised learning, although it is usually considered a subtype of supervised learning. In reinforcement learning a network is given input, and although no specific target output is provided (as in supervised learning), the network is "punished" when it does poorly and "rewarded" when it does well. Punishment and reward in this sense take the form of weakening or strengthening the connections between neurons. This means that during the learning phase of a network's life there are three possibilities for the adaptation of the neurons in the network: connections may be selectively strengthened, selectively weakened, or left unchanged, depending on how the network performs. In this type of learning the network is given input and the output is observed. Output neurons are then categorized as being either right, wrong, or neutral. Output neurons that are judged incorrect, and all neurons that provided input to them, have their connections weakened. Similarly, output neurons that are judged correct, and all neurons that provided input to them, have their connections strengthened. Output neurons that are neither right nor wrong are left unchanged. With this learning method no specific output is targeted. The network does not know what it should do, only that when it does something it is either right, wrong, or neutral. In this way the network is allowed to find information in data without being told what the information is, but at the same time it is guided toward a solution. This type of learning has been successfully applied to search problems in which a path to some goal must be identified but the exact path is not known ahead of time. Theoretically, these types of networks are good candidates for on-line learning in a variable environment. Networks employing reinforcement learning can be placed into environments where decisions must be made and the outcomes of those decisions can be judged, but where the exact decisions that need to be made are not known ahead of time.
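A bare-bones sketch of the reward/punish adjustment just described; the judgement labels and step size are purely illustrative, and a practical reinforcement learning scheme would be considerably more elaborate.

```python
# Reinforcement-style adjustment: strengthen a connection when the outcome is judged
# correct, weaken it when judged incorrect, and leave it unchanged when neutral.
def reinforce(weight, judgement, step=0.05):
    if judgement == "correct":
        return weight + step      # reward
    if judgement == "incorrect":
        return weight - step      # punishment
    return weight                 # neutral: no change

print(round(reinforce(0.4, "correct"), 2))    # 0.45
print(round(reinforce(0.4, "incorrect"), 2))  # 0.35
print(round(reinforce(0.4, "neutral"), 2))    # 0.4
```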
 

Figure 7: Backpropagation Learning

 
Specific Neural Network Uses and Applications

Neural networks are essentially function approximators, pattern matchers, and categorizers. They do very little outside of these basic functions, although these tasks can be employed in a wide variety of powerful and complex applications. The following are some common, practically implemented solutions using neural networks.
Neural networks have been used very successfully in speech recognition tasks. Verbal speech is encoded mathematically and input to the network, and the network responds with an action. Using a neural network for this purpose allows one or more people to speak with different tones and voices while the verbal command is still understood by the network despite variations in tone, pitch, quality, etc.
Character recognition is accomplished by presenting the network with many examples of handwritten characters and allowing the network to learn those characters. Once trained, networks used for this task are remarkably accurate not only on the characters they have seen but also on characters they have never seen before.
Neural networks have been constructed that process image data such as photographs or x-ray images. In the case of photographs, neural networks have been trained to pick out details in the photograph and identify portions of the image as specific objects. With x-ray images, neural networks have been used to construct composite and 3-D images from several flat x-ray images taken from different angles of the same bone structure.
Pattern recognition and categorization are, of course, the most straightforward uses for a neural network. Neural networks can take virtually any set of data that contains one or more patterns or categories and extract those patterns. This is extremely useful for any application that must sort data by category or make decisions based on some pattern of information.
Signal processing is closely related to pattern recognition, and neural networks have been used very successfully to reduce noise in corrupt electrical signals and to separate various signals from transmissions that contain multiple signals. Signal processing neural networks have been used in a wide variety of problems. Two examples of this use include noise reduction in phone lines and detecting misfires in engines that can run as high as 10,000 RPM.
One of the newest and most important neural net uses is in process control and optimization. Neural networks have been trained by allowing them to observe some system, such as a piece of machinery, and then take over control of that system. Not only will the neural net control the system in normal operation, but it will also control that system during unforeseen occurrences. Neural networks have been put to this use in tests at NASA's Dryden Flight Research Center in Edwards, California using a modified F-15 aircraft. In this application a neural network was allowed to study normal flight operations, so the network learned how a correctly flying aircraft should behave. Then, if the aircraft suffers some type of damage, the flight control system enables the neural net and allows the network to correct mismatches between data on the plane's airspeed, bearing, and the forces on its body and what the network thinks the data should be if the plane were flying normally. In this way the pilot can continue to fly a damaged aircraft by controlling the plane as if it were undamaged. The neural network does the job of transforming the pilot's actions from normal operation to the operations necessary given that the plane is damaged in some way. The network was tested in high performance maneuvers, such as tracking a target or performing a 360 degree roll, and it managed to keep disabled planes under control even at supersonic speeds.
Process optimization is similar to process control in that a neural net is trained by allowing it to observe some type of system in operation. In process control the inputs are the system state and the outputs are the control positions that affect the system. In process optimization the inputs and outputs are similar, but additional inputs and/or outputs are also specified to represent some target state for the system. For example, consider a vehicle that takes in fuel and air and produces some speed. In operating this vehicle there are several factors that may be important at any given moment, such as speed, fuel consumption, wear on the vehicle, safety, etc. Targeting one or more of these factors as most important requires a careful balance of fuel, air intake, and other mechanical settings (e.g. if it is decided the vehicle must run at minimal fuel usage, it probably cannot operate at maximum speed). Neural networks are used to balance system settings so that one or more system factors can be maximized, minimized, or stabilized. In the vehicle example a neural net could be set up to minimize fuel consumption by carefully adjusting air intake, speed, and other mechanical settings that affect fuel consumption. Process optimization represents one of the most challenging neural network application areas. Expert systems have also been successfully applied to both process control and optimization, and they are the older and more traditional way of applying an artificial intelligence solution in this domain. Expert systems, however, have a few drawbacks: they still require a human expert to provide input to the system, they must be tailor made for each system, and they do not deal well with unseen or imprecise data. Neural networks have the advantage that they program themselves, provided of course they are given the proper input. Neural networks also have the advantage of being very robust and dealing well with unseen data. An expert system faced with unknown or corrupt facts will not do anything, whereas a neural network faced with unseen or corrupt data will respond with a "best guess" answer. Provided the new data does not stray too far from the original conditions shown to the network, neural networks perform very well and can extrapolate from the new information to a reasonable solution.


How to Determine If an Application is a Candidate for a Neural Network

There are several requirements and conditions a problem must meet if it is to be an acceptable candidate for a neural network solution. First and foremost, the problem must be tolerant of some level of imprecision. All artificial intelligence techniques sacrifice some small measure of precision in favor of speed and tractability. This imprecision may be very small, much less than one percent, or it may be relatively large, such as ten percent. Neural network error rates tend to be below two percent, and for certain applications error rates can be as low as a very small fraction of one percent. Any application that has zero tolerance for imprecision cannot be solved with any artificial intelligence technique, including neural networks. For example, digital data transmission algorithms must be perfectly precise. If even the tiniest portion of a digital data transmission (e.g. sending a file over a network from computer to computer) is corrupted, the entire transmission may be ruined. Conversely, something like an analog voice or video transmission is very tolerant of error. If a fraction of a second of a video or audio transmission is lost or damaged, it may never be noticed by the observer. There are many such examples of processes that can tolerate some small measure of error with no appreciable impact on the problem.
Another requirement for a neural network solution is that abundant high quality data exists for both training and testing purposes. A neural network must be able to observe the problem at hand and it must be tested on that problem once it is trained but before it is put into service. This may require massive amounts of training and test data depending on the complexity of the problem.
Related to the error tolerance requirement, neural networks (like all artificial intelligence methods) work best when there exist one or more acceptable solutions to a problem that are not necessarily the best solution. There are many problems for which finding an acceptable solution is easy but finding the perfect solution requires a practically impossible amount of resources. For example, there may be a time-dependent problem in which "fast enough" is just as good as "fastest."
At the heart of any problem for which a neural network is the solution there must be a pattern matching or categorization problem. This is not a difficult requirement to meet since pattern matching and categorization are inherent to a wide variety of problems. Most of what humans do that is considered "intelligent" is really the ability to quickly categorize and match what we see versus what we know and make a decision based on that match.
If a neural network is used in a process optimization or control application, economics plays an important part in the neural network's usage. Neural networks used in this area tend to be of marginal benefit; in other words, they provide benefits at the edges of existing performance. In any system where small increases in performance and efficiency translate to large changes in economic gain, a neural network will prove very useful. This is especially true in systems where the small gain in performance is very difficult to achieve but provides large benefits when it is achieved.
 


Lists

 
List of Figures

 
Figure 1: Biological Neuron
Figure 2: Artificial Neuron (perceptron)
Figure 3: Example Multi-layer Perceptron
Figure 4: Simple Neuron Problem
Figure 5: Recurrent/Feedback Network
Figure 6: Functional Link Network
Figure 7: Backpropagation Learning