Understanding Neural Networks – Part One

Part One in a series…

Welcome to part one in a series helping to demystify neural networks. The aim of this series is to give you a solid understanding of neural networks and Deep Learning (DL) so that you can develop the skills to build simple DL models that solve Information Security problems. We will start with the basics, but by the end of this series you should understand how a neural network works, how it learns, and the different types of neural networks and when you might use them. Let's get started with the simplest building block:

The Neuron

The idea behind deep learning and Artificial Neural Networks (ANNs) is to mimic the way a brain works. There are limits to how closely we can mimic those biological systems, which in any case we don't fully understand, but the basic idea of neurons connected together in networks has proven to produce some exciting results, and neural networks are now in everyday use, from your phone to the latest self-driving car.

At the heart of the network is the artificial neuron, also called a node. Each neuron has some number of inputs and an output. Neurons in the hidden and output layers get their inputs from other neurons, but the input layer of neurons gets its inputs directly from the input parameters.

Neurons are arranged in layers, and the "Deep" in Deep Learning refers to the fact that there can be very many layers of neurons between the input layer and the final output layer. A layer is either an input layer, an output layer or a hidden layer; all layers that are not input or output are by definition hidden. The connections between neurons are analogous to synapses in a brain, and we will hear a lot more not just about the neurons but about the synapses too.

For a network, the inputs are called independent variables and represent the parameters that you are feeding to the network. If you have twenty different parameters then you need twenty input neurons for the network; another way of saying that is that there are twenty neurons in the input layer. Hidden layers will generally have at least as many neurons in each layer as there are in the preceding layer, and often have some multiple of that number. Output layers are different and might have as few as a single neuron, or for categorical processing they may have one neuron per possible output value. We will see how this works in a later article…

You can see from this that even a very simple network with 20 parameters and a single hidden layer will have at least 41 neurons: 20 in the input layer, at least 20 in the hidden layer and 1 in the output layer. This represents a very simple network indeed. Given that some networks might have hundreds or even thousands of input nodes and many, many hidden layers, it is easy to see that some neural networks can be very large.
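As a concrete illustration, here is a minimal sketch of exactly that 20-20-1 network using Keras. The library choice and the activation functions are our assumptions for illustration, not something this series has covered yet:

```python
from tensorflow import keras
from tensorflow.keras import layers

# A 20-input, 20-hidden-neuron, 1-output network: 41 neurons in total
model = keras.Sequential([
    keras.Input(shape=(20,)),               # input layer: 20 independent variables
    layers.Dense(20, activation="relu"),    # hidden layer: 20 neurons
    layers.Dense(1, activation="sigmoid"),  # output layer: a single neuron
])
model.summary()  # prints each layer and its weight counts
```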

When considering input variables it is very important to understand that you must normalise them, scaling them so that they range in value from 0 to 1 (or standardise them, rescaling them to a mean of 0 and a standard deviation of 1). From this it should be clear that you will need to do some considerable work to preprocess your input data before feeding it to your neural network. Categorical variables will also need to be split out into a range of additional input variables in order to encode them. Data preprocessing is one of the most important and most time-consuming tasks for the data scientist when first developing a model. This need to scale input variables is not limited to Deep Learning but is also required in conventional machine learning, and we will learn more about the requirements to do this in a subsequent article.
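To make this concrete, here is a sketch of both preprocessing steps using scikit-learn. The feature names and values are made up for illustration:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Hypothetical numeric features, e.g. bytes transferred and session duration
X_num = np.array([[1200.0,  35.0],
                  [ 980.0,  12.0],
                  [50000.0, 240.0]])

# Normalisation: rescale each column into the range 0 to 1
X_scaled = MinMaxScaler().fit_transform(X_num)

# A categorical feature, e.g. protocol, split into one binary input per value
X_cat = np.array([["tcp"], ["udp"], ["tcp"]])
X_encoded = OneHotEncoder().fit_transform(X_cat).toarray()

print(X_scaled)   # every value now lies between 0 and 1
print(X_encoded)  # [[1. 0.] [0. 1.] [1. 0.]]
```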

The output value of an individual neuron is generally either a continuous value, normally between 0 and 1, or a binary value which, for the output layer of a network, may also be used to represent categorical outputs. Each time you present an observation at the input layer, the signal propagates through the network as a whole and produces some output. Training a neural network is essentially about adjusting the network to give you the outputs you want for a set of inputs. Once trained, a neural network can then be used to generate an output for any input, whether it has previously seen that particular input or not.
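Continuing the hypothetical Keras model sketched above, training and prediction look roughly like this. The data here is random noise, purely to show the shape of the workflow:

```python
import numpy as np

# Made-up data: 100 observations of 20 normalised inputs, each with a binary label
X = np.random.rand(100, 20)
y = np.random.randint(0, 2, size=100)

model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X, y, epochs=5, verbose=0)  # training adjusts the network's weights
print(model.predict(X[:3]))           # outputs for any inputs, seen or unseen
```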

The way the network learns is to do with the inputs from the previous layer. Each input, or synapse, has a weight, and this weight determines how strongly the signal from a neuron in the preceding layer is passed on. By adjusting these input synapse weights we can adjust how each individual neuron is affected by signals on those synapses, and how much it is influenced by the neurons it is directly connected to in the previous layer. You might now be getting an insight into why you might develop a network with many layers: it gives you more fine-grained control over how influence flows through the network. Some forms of neural network vary how connected neurons can be, with layers that are connected sideways or even backwards, and some forms of network can be fully connected, but we are getting ahead of ourselves here.

We will talk about how these weights get adjusted later, but let's now look at a single neuron and understand what is happening, as in the sketch below. Each neuron sums up the weighted values arriving on its inputs and passes that sum through an activation function. The result of this function decides whether the neuron fires its output (or to what extent). In this way, whether an individual neuron fires depends both upon the inputs it receives and the weights on each of those inputs. By adjusting those weights we can train a neuron to fire for none, some or all of the inputs it has.
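Here is a minimal sketch of a single neuron in plain Python. The weights and inputs are hand-picked for illustration, and the step activation used here is just one possible choice:

```python
import numpy as np

def neuron(inputs, weights, bias=0.0):
    """A single artificial neuron: a weighted sum of the inputs,
    passed through a simple step activation (fire if positive)."""
    z = np.dot(inputs, weights) + bias  # weighted sum of the input signals
    return 1 if z > 0 else 0            # binary fire / don't-fire output

inputs = np.array([0.5, 0.9, 0.1])     # signals from the previous layer
weights = np.array([0.4, 0.7, -0.2])   # one weight per input synapse
print(neuron(inputs, weights))         # -> 1 (the weighted sum is 0.81)
```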

One of the things we will learn to do is to select an appropriate activation function for a layer of neurons (each layer should have the same activation function, but neurons in different layers can, and often do, have different functions). You should be able to see that a binary neuron needs to fire or not fire, presenting a 0 or a 1 on its output synapse, whereas a continuous neuron needs to output a value between zero and one which is infinitely variable, at least within the constraints of numeric resolution. The activation function we choose must output the right type of signal for the effect we are looking for.
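As a taste of part two, here are two common activation functions that produce exactly those two kinds of signal, a binary step and the continuous sigmoid, sketched in plain Python:

```python
import numpy as np

def step(z):
    """Threshold activation: a binary 0-or-1 output."""
    return np.where(z > 0, 1, 0)

def sigmoid(z):
    """Sigmoid activation: a continuous output between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.5, 3.0])  # example weighted sums
print(step(z))     # [0 1 1]                      -- fire / don't fire
print(sigmoid(z))  # approx [0.119 0.622 0.953]   -- a graded output
```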

In part two of this series we will take an in-depth look at activation functions, and we will start to look at how the network as a whole works and, in particular, how it learns.

