6 steps to build a neural network in OpenNN

In this tutorial, we walk through the main ingredients needed to build a neural network model in a few steps using OpenNN. The full script for this example can be found on GitHub.

The central goal here is to design a model which makes good classifications for new data or, in other words, one which exhibits good generalization.

If you want to know more about the concepts covered in this tutorial, you can read this neural network tutorial or the machine learning blog created by Neural Designer.

Contents:

  1. Data set.
  2. Neural network.
  3. Training strategy.
  4. Model selection.
  5. Testing analysis.
  6. Model deployment.

1. Data set

The first step is to prepare the data set, which is the source of information for the classification problem. For that, we need to configure the data source, the variables, and the instances.

The data source is the file iris_flowers.csv. It contains the data for this example in comma-separated values (CSV) format and can be loaded as

DataSet data_set("path_to_source/iris_flowers.csv",',',true);

The number of columns is 5 and the number of rows is 150. The variables in this problem are the sepal length, sepal width, petal length and petal width of each flower, which are the inputs, and the iris class (setosa, versicolor or virginica), which is the categorical target.

OpenNN recognizes categorical variables and transforms them into numerical ones. In this example, the single categorical variable (the iris class) is replaced by three binary variables, setosa, versicolor and virginica, each taking the value 1 when the instance belongs to that class and 0 otherwise.

The data set then contains 7 numerical variables. Once the data are ready, we retrieve information about the variables, such as their names and statistical descriptives

const Vector<string> inputs_names = data_set.get_input_variables_names();
const Vector<string> targets_names = data_set.get_target_variables_names();
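
As a quick sanity check, you can print these names to the console. This is a minimal sketch; it assumes the stream output operator that OpenNN's Vector class provides, and that <iostream> is included with the std namespace in scope:

cout << "Input variables: " << inputs_names << endl;   // sepal and petal measurements
cout << "Target variables: " << targets_names << endl; // the three binary class variables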

The instances are divided into training, selection and testing subsets. They represent 60% (90), 20% (30) and 20% (30) of the original instances, respectively, and are split at random using the following command

data_set.split_instances_random();
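
To verify the split, you can query the subset sizes. This is a sketch; the getter names below are assumptions based on the OpenNN DataSet API and may vary between versions:

cout << "Training instances: " << data_set.get_training_instances_number() << endl;   // expected 90
cout << "Selection instances: " << data_set.get_selection_instances_number() << endl; // expected 30
cout << "Testing instances: " << data_set.get_testing_instances_number() << endl;     // expected 30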

To make the neural network work in the best possible conditions, we scale the data set. In our case, we choose the minimum-maximum scaling method

const Vector<Descriptives> inputs_descriptives = data_set.scale_inputs_minimum_maximum();

where we obtain the statistical descriptives of each input: maximum, minimum, mean and standard deviation. In this case, we do not scale the targets because their values are already 0 or 1, which the neural network handles well.

For more information about the data set methods, see Data set class.

2. Neural network

The second step is to choose a suitable neural network architecture. For classification problems, the network is usually composed of a scaling layer, one or more perceptron layers, and a probabilistic layer.

We define the architecture by

const size_t inputs_number = 4;
const size_t hidden_neurons_number = 6;
const size_t outputs_number = 3;

and build the architecture vector

const Vector<size_t> architecture = {inputs_number,hidden_neurons_number,outputs_number};

The NeuralNetwork class is responsible for building the neural network and organizing the layers of neurons appropriately, using the following constructor. If you need more complex architectures, see the NeuralNetwork class.

NeuralNetwork neural_network(NeuralNetwork::Classification,architecture);
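
The architecture vector generalizes to deeper networks: each interior element adds another hidden layer. As a hypothetical sketch, a network with two hidden layers of 10 and 6 neurons would be built with the same two constructs already shown above:

const Vector<size_t> deep_architecture = {inputs_number, 10, 6, outputs_number};
NeuralNetwork deep_network(NeuralNetwork::Classification, deep_architecture);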

Once the neural network has been created, we can set information in its layers

neural_network.set_inputs_names(inputs_names);
neural_network.set_outputs_names(targets_names);

In the case of the scaling layer, we need to set the descriptives and the scaling method computed previously

ScalingLayer* scaling_layer_pointer = neural_network.get_scaling_layer_pointer();
scaling_layer_pointer->set_descriptives(inputs_descriptives);
scaling_layer_pointer->set_scaling_method(ScalingLayer::MinimumMaximum);

The model is now fully defined, so we can proceed to the learning process with the TrainingStrategy class.

3. Training strategy

The third step is to set the training strategy, which is composed of two components: a loss index, which measures how well the neural network fits the data, and an optimization algorithm, which searches for the parameters that minimize the loss.

Firstly, we construct the training strategy object

TrainingStrategy training_strategy(&neural_network, &data_set);

secondly, set the error term

training_strategy.set_loss_method(TrainingStrategy::NORMALIZED_SQUARED_ERROR);

and finally the optimization algorithm

training_strategy.set_optimization_method(TrainingStrategy::OptimizationMethod::QUASI_NEWTON_METHOD);

Note that this part is optional: by default, OpenNN builds the training strategy object with the quasi-Newton method as the optimization algorithm and the normalized squared error as the loss method. We can now start the training process by using the command

training_strategy.perform_training();
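
Relying on those defaults, the whole training step can reduce to just two lines, a minimal sketch using only the calls already shown above:

TrainingStrategy default_training_strategy(&neural_network, &data_set); // quasi-Newton + normalized squared error by default
default_training_strategy.perform_training();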

If we need to go further, OpenNN allows finer control of the optimization, for example

QuasiNewtonMethod* quasi_Newton_method_pointer = training_strategy.get_quasi_Newton_method_pointer();
quasi_Newton_method_pointer->set_minimum_loss_decrease(1.0e-6);
quasi_Newton_method_pointer->set_loss_goal(1.0e-3);
quasi_Newton_method_pointer->set_minimum_parameters_increment_norm(0.0);
quasi_Newton_method_pointer->perform_training();

For more information about the training strategy methods, see TrainingStrategy class.

4. Model selection

The fourth step is to set the model selection, which is composed of a neurons selection algorithm and an inputs selection algorithm.

If you are not sure you have chosen the right architecture, the ModelSelection class aims to find the network architecture with the best generalization properties, that is, the one that minimizes the error on the selection instances of the data set.

We first construct the model selection object

ModelSelection model_selection(&training_strategy);

In this example, we want to optimize the number of neurons in the network architecture using the neurons selection algorithm

model_selection.perform_neurons_selection();

Once the algorithm finishes, our model has the architecture with the best selection error found for this problem.

For more information about the model selection methods, see ModelSelection class.

5. Testing analysis

The fifth step is to evaluate our model. For that purpose, we use the TestingAnalysis class, whose purpose is to validate the generalization performance of the model. Here, we compare the outputs provided by the neural network with the corresponding targets in the testing instances of the data set.

First of all, we reverse the scaling applied to the inputs, so the testing data are back in their original units

data_set.unscale_inputs_minimum_maximum(inputs_descriptives);

Now we are ready to test our model. As in the previous steps, we start by building the testing analysis object

TestingAnalysis testing_analysis(&neural_network, &data_set);

and perform the testing. In our case, we compute the confusion matrix

const Matrix<size_t> confusion = testing_analysis.calculate_confusion();

In the confusion matrix, the rows represent the targets (or real values) and the columns the outputs (or predicted values). The diagonal cells show the correctly classified cases, and the off-diagonal cells show the misclassified ones.
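
From this matrix you can derive aggregate metrics by hand. Below is a minimal sketch, assuming the confusion matrix is a Matrix<size_t> with the usual get_rows_number, get_columns_number and (i,j) accessors of OpenNN's Matrix class:

size_t correctly_classified = 0;
size_t total_instances = 0;

for(size_t i = 0; i < confusion.get_rows_number(); i++)
{
    for(size_t j = 0; j < confusion.get_columns_number(); j++)
    {
        total_instances += confusion(i,j);                      // every testing instance appears once
        if(i == j) correctly_classified += confusion(i,j);      // diagonal cells are correct classifications
    }
}

const double accuracy = static_cast<double>(correctly_classified)/static_cast<double>(total_instances);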

For more information about the testing analysis methods, see TestingAnalysis class.

6. Model deployment

Once our model is completed, the neural network is now ready to predict outputs for inputs that it has never seen. This process is called model deployment.

In order to generate predictions for new data, you use the calculate_outputs method, which takes a tensor of new inputs and returns the corresponding outputs.

For instance, suppose the new flower has a sepal length of 5.84 cm, a sepal width of 3.05 cm, a petal length of 3.76 cm and a petal width of 1.20 cm. In OpenNN, we can write this as

Tensor<double> new_inputs(Vector<size_t>{1, inputs_number}, 0.0);

new_inputs[0] = 5.84; // sepal length (cm)
new_inputs[1] = 3.05; // sepal width (cm)
new_inputs[2] = 3.76; // petal length (cm)
new_inputs[3] = 1.20; // petal width (cm)

const Tensor<double> outputs = neural_network.calculate_outputs(new_inputs);
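
The outputs are the probabilities of each class, so the predicted class is the index of the largest output. A plain-C++ sketch, assuming only that the returned tensor supports size() and operator[], as the snippet above already does:

// Index of the most probable class: 0 = setosa, 1 = versicolor, 2 = virginica.
size_t predicted_class = 0;

for(size_t i = 1; i < outputs.size(); i++)
{
    if(outputs[i] > outputs[predicted_class]) predicted_class = i;
}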

Alternatively, you can save the model's mathematical expression for later implementation in Python, PHP, and other languages.

neural_network.save_expression("../data/expression.txt");
neural_network.save_expression_python("../data/expression.py");
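
Besides exporting the expression, you can also serialize the whole network and reload it later, a sketch assuming the save and load methods of NeuralNetwork, which store the model as XML:

neural_network.save("../data/neural_network.xml");

NeuralNetwork loaded_neural_network;
loaded_neural_network.load("../data/neural_network.xml");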
