Artificial Neural Networks Intro

Neural Networks are a fascinating method of solving various problems. Fascinating partly because of their curious name :)

Warning: this text is very general and verbose, and is going to be improved in the future. If you already have some understanding of the topic, feel free to skip it and go straight to solving the corresponding tasks: task 1, task 2

But what are they? And are they good enough to solve any problem? In this article we are going to answer a few general questions about them, while a couple of problems will help to understand how exactly they work.

About the name

Properly we call them Artificial Neural Networks to distinguish them from networks of real neurons in living organisms. However, the latter meaning is not so widely used, and thus "artificial" is often omitted. The abbreviations are NN and ANN.

The name comes from the fact that an NN is represented by a network (or graph) of small objects (really, math functions) passing values between them. So they distantly resemble the connections of neurons (nerve cells) in a living organism, which pass signals from the brain to the muscles, for example.

This resemblance, together with the fact that NNs are supposed to solve some tasks, led to the choice of such a name.

Beware of marketing!!!

A part (probably a large one) of the fame that NNs enjoy is due to the name. The general public tends to consider them "something ultra-clever" because of it. However, the technology is about 50 years old, and nowadays different methods are often preferred.

However, it is harder to impress the public with Support Vector Machines than with Neural Networks (or, likewise, Genetic Algorithms).


What is a Neural Network

In short: it is just a mathematical function converting some input values to some output values.

What makes NNs magical is that we don't need to know how to write the necessary function ourselves: we can "train" the function using sample data. As we often don't know the exact dependencies between things in real problems, this is convenient in many applications.

Let's look at a simple example. Consider a tram:

[image: old tram in Saint Petersburg]

Suppose we want to create an intelligent safety stopping algorithm for such a tram.

There will be an ultrasonic sensor in front, which detects when something (some careless pedestrian) suddenly appears in the way. At this moment we know two parameters: the speed of the tram V and the distance S available for deceleration. Our goal is to calculate the voltage U to apply to the motors of the tram (the tram stops by applying reverse voltage to its motors).

We don't want too much voltage (it damages the motors, and an instant stop may hurt passengers). Let it be just enough to stop in exactly S meters.

The problem is complicated because applying the voltage won't immediately make the motors rotate backwards: they have their own inertia, there is friction, and the inductance of the motor coils won't allow the current to change instantly.

So, again, we want some function that calculates the proper voltage from the inputs:

U  =  f(S, V)

However, it is very difficult to figure out the precise mathematical law.

Another Example

As many of the first electronic and mechanical calculators were created for military purposes, let's discuss a simplified task of hitting an enemy aircraft with an explosive shell:

[image: cannon shooting at the UFO]

The cannon is at the origin, and we detect an aircraft at distance S and height H, approaching with speed V. These three values will be the inputs.

We want to know the angle A to which the barrel of the cannon should be raised and the timeout T to which the timer on the shell should be set, so that the explosion happens in proximity of the aircraft.

There is the braking force of the atmosphere, which depends non-linearly on the projectile speed. Also, gravity decreases with height. There could be other factors. So we need a function with two outputs:

(A, T) = f(S, H, V)

This function becomes complex. Add a third dimension to this picture, along with the ability of the aircraft to fly non-horizontally, and the matter becomes even worse.


How a Neural Network Helps

We won't discuss here how an NN is built internally (this can be learnt further from our problems), but consider that it has many parameters, or coefficients, inside, which we can manipulate to tune it, i.e. to change its behavior to our needs.
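To make "a function with coefficients inside" a bit more concrete, here is a minimal Python sketch. The layout (two inputs, two hidden units with tanh, one output) and all the coefficient values are assumptions picked purely for illustration; they are not taken from this article or its problems.

```python
import math

def tiny_network(inputs, weights, biases, out_weights, out_bias):
    """One hidden layer with tanh activation, followed by one linear output."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]
    return sum(w * h for w, h in zip(out_weights, hidden)) + out_bias

# Hypothetical, untrained coefficients: 2 inputs -> 2 hidden units -> 1 output.
weights = [[0.5, -0.2], [0.1, 0.3]]
biases = [0.0, 0.1]
out_weights = [1.0, -1.0]
out_bias = 0.0

print(tiny_network([1.0, 2.0], weights, biases, out_weights, out_bias))
```

Changing any number in weights, biases, out_weights or out_bias changes the output for the same inputs. These numbers are what training adjusts.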

Then it becomes a question of how to "tune" these coefficients to make the NN "work well" for our problem. This is done by the process of training. With the tram example it may look like this:

  1. Initialize the internal coefficients of the Neural Network to some values, perhaps random ones or zeroes.
  2. Choose some set of value pairs for distance and speed (S, V).
  3. For every such pair (i.e. input sample), calculate the neural network output (i.e. voltage U), perform an experiment on the live system with these parameters, and remember the error E, for example the square of the difference between the real travel distance Sreal and the expected S.
  4. Find the average error over all pairs (S, V) included in our training set.
  5. If the average error is acceptable, stop the training: our NN is already in a good state.
  6. Otherwise change the coefficients somehow in an attempt to reduce the error.
  7. Repeat from step 3.
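The steps above can be sketched in Python roughly as follows. Everything specific here is an assumption made for illustration: experiment() is a toy formula standing in for the live tram, model() is a two-coefficient function standing in for a real network, and the "change coefficients somehow" step is done by random nudges that are kept only when the error drops.

```python
import random

K = 0.05  # hypothetical constant of the toy tram: deceleration = K * U

def experiment(U, V):
    """Stand-in for the live experiment: distance travelled before stopping."""
    U = max(U, 1e-6)  # guard against zero or negative voltage in the toy formula
    return V * V / (2 * K * U)

def model(coeffs, S, V):
    """Stand-in for the network: a function of (S, V) with tunable coefficients."""
    a, b = coeffs
    return a * V * V / S + b

def average_error(coeffs, samples):
    # Steps 3-4: squared difference between real and expected distance, averaged.
    total = 0.0
    for S, V in samples:
        s_real = experiment(model(coeffs, S, V), V)
        total += (s_real - S) ** 2
    return total / len(samples)

def train(samples, acceptable=0.01, max_rounds=5000, seed=1):
    rng = random.Random(seed)
    coeffs = [1.0, 0.0]                                       # step 1
    error = average_error(coeffs, samples)
    for _ in range(max_rounds):                               # step 7
        if error <= acceptable:                               # step 5
            break
        candidate = [c + rng.gauss(0, 0.5) for c in coeffs]   # step 6
        cand_error = average_error(candidate, samples)
        if cand_error < error:                                # keep only improvements
            coeffs, error = candidate, cand_error
    return coeffs, error

# Step 2: a small training set of (S, V) pairs.
samples = [(S, V) for S in (10.0, 20.0, 40.0) for V in (5.0, 10.0, 15.0)]
start_error = average_error([1.0, 0.0], samples)
coeffs, final_error = train(samples)
print(start_error, final_error)
```

Because a nudge is accepted only when it lowers the average error, the error never grows during training. Whether it reaches the acceptable level depends on the number of rounds and the size of the nudges.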

The main question is, of course, how to "change coefficients to reduce the error". One of the popular algorithms used with NNs is "backpropagation", i.e. backward propagation of the error according to the derivatives of the internally used mathematical functions.
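As a hedged illustration of what "according to derivatives" means, here is the idea reduced to a single neuron y = tanh(w*x + b) with a squared error. The neuron, the sample values and the step size are all assumptions for this sketch. Real backpropagation applies the same chain rule layer by layer through the whole network.

```python
import math

def forward(w, b, x):
    return math.tanh(w * x + b)

def gradients(w, b, x, target):
    """Derivatives of the squared error (y - target)^2 with respect to w and b."""
    y = forward(w, b, x)
    d_error_d_y = 2 * (y - target)
    d_y_d_z = 1 - y * y                   # derivative of tanh(z), where z = w*x + b
    d_error_d_z = d_error_d_y * d_y_d_z   # chain rule
    return d_error_d_z * x, d_error_d_z

w, b = 0.0, 0.0
x, target = 1.5, 0.5   # one hypothetical training sample
rate = 0.1             # hypothetical step size
for _ in range(200):
    dw, db = gradients(w, b, x, target)
    w -= rate * dw     # move each coefficient against its derivative
    b -= rate * db

print(forward(w, b, x))
```

After the loop the printed output should be very close to the target 0.5, since every step moves w and b in the direction that reduces the error.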

However, we can manage even without complicated math. This will be demonstrated in the second problem on NNs.


Input and output values

As the neural network is a math function, its input and output values are numbers. This raises an important point.

Indeed, how to choose the inputs and how to convert them to numbers is a large part of solving a problem with a neural network. It is never enough just to say "use NN": one should explain how it could be conveniently used.

For example, if we want to create a character recognition algorithm, we should first create a separate algorithm for splitting the image into text characters, then convert them to monochrome and scale them to some small grid (say, 8*8 pixels). Let's treat a white pixel as value 0 and a black one as 1 (while gray shades fall in between).

So we shall have 64 inputs in the range 0 ... 1, but this is probably still not a complete solution. For example, we may want to choose an optimal inner structure of the NN so that we don't waste time on unrelated calculations.
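A possible way to turn one scaled character image into network inputs is sketched below in Python. The pixel encoding (brightness 0 for black up to 255 for white) and the sample image are assumptions made for this example only.

```python
def to_inputs(grid):
    """Flatten a grid of 0..255 brightness values into inputs in the range 0..1."""
    # Assumed encoding: brightness 255 is white (input 0), 0 is black (input 1).
    return [(255 - pixel) / 255 for row in grid for pixel in row]

# A hypothetical 8*8 image: white background with a black vertical bar.
image = [[0 if col in (3, 4) else 255 for col in range(8)] for row in range(8)]
inputs = to_inputs(image)
print(len(inputs), min(inputs), max(inputs))  # prints: 64 0.0 1.0
```

An 8*8 grid gives one input per pixel, which is 64 values, and gray pixels land strictly between 0 and 1.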

But enough considerations - let's go and study Neural Networks in practice: