6 steps to build a neural network in OpenNN

This tutorial shows the main steps for building a neural network model using OpenNN. You can find the script for the example we will use on GitHub.

The central goal here is to design a model that makes reasonable classifications for new data or, in other words, one that exhibits good generalization.

If you want to know more about the concepts in this tutorial, you can read this neural network tutorial or the machine learning blog created by Neural Designer.

Contents:

  1. Data set.
  2. Neural network.
  3. Training strategy.
  4. Model selection.
  5. Testing analysis.
  6. Model deployment.

1. Data set

The first step is to prepare the data set, which is the source of information for the classification problem. For that, we need to configure the following concepts:

  • Data source.
  • Variables.
  • Instances.

The data source is the file iris_flowers.csv. Despite the .csv extension, its values are separated by semicolons rather than commas, so the separator is passed explicitly when loading the file:

// Dataset
DataSet data_set("path_to_source/iris_flowers.csv", ";", true, false);
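
To make the file layout concrete, the sketch below parses one line of such a file using only the C++ standard library. It is not how OpenNN reads the file. The names split_line, parse_sample, and IrisSample are hypothetical, and the column order (four measurements followed by the class) is assumed from the variable list in this section.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split one line of a delimited text file into its fields.
// The separator defaults to ';', matching the file used in this tutorial.
std::vector<std::string> split_line(const std::string& line, char separator = ';')
{
    std::vector<std::string> fields;
    std::stringstream stream(line);
    std::string field;

    while (std::getline(stream, field, separator))
        fields.push_back(field);

    return fields;
}

// Hypothetical container for one row: four measurements and the class label.
struct IrisSample
{
    std::array<double, 4> measurements;
    std::string species;
};

// Convert one data line (not the header) into an IrisSample.
IrisSample parse_sample(const std::string& line)
{
    const std::vector<std::string> fields = split_line(line);
    assert(fields.size() == 5); // four inputs plus the class column

    IrisSample sample;
    for (std::size_t i = 0; i < 4; ++i)
        sample.measurements[i] = std::stod(fields[i]);
    sample.species = fields[4];

    return sample;
}
```

Calling parse_sample("5.1;3.5;1.4;0.2;iris_setosa") fills the four measurements and stores the label as a string.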

There are five columns and 150 rows. The variables in this problem are:

  • sepal_length: Sepal length, used as input.
  • sepal_width: Sepal width, used as input.
  • petal_length: Petal length, used as input.
  • petal_width: Petal width, used as input.
  • class: Iris Setosa, Versicolor, or Virginica, used as target.

OpenNN recognizes categorical variables and converts them to numerical values. In this example, the class variable is encoded with one binary value per category:

  • iris_setosa: 1 0 0.
  • iris_versicolor: 0 1 0.
  • iris_virginica: 0 0 1.
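
The encoding listed above is commonly called one-hot encoding. The following sketch shows the idea with the standard library only. The function name one_hot is hypothetical and is not part of OpenNN, which performs this conversion internally.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Return a vector with a 1 at the position of `category` in `categories`
// and 0 everywhere else.
std::vector<int> one_hot(const std::string& category,
                         const std::vector<std::string>& categories)
{
    std::vector<int> encoded(categories.size(), 0);
    bool found = false;

    for (std::size_t i = 0; i < categories.size(); ++i)
    {
        if (categories[i] == category)
        {
            encoded[i] = 1;
            found = true;
        }
    }

    assert(found); // the category must be one of the known categories
    return encoded;
}
```

With the categories ordered as iris_setosa, iris_versicolor, iris_virginica, the call one_hot("iris_versicolor", categories) yields 0 1 0.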

After this conversion, the data set has 7 numerical variables: 4 inputs and 3 targets. Once the data is ready, we can retrieve information about the variables, such as their names:

const vector<string> inputs_names = data_set.get_variable_names("Input");
const vector<string> targets_names = data_set.get_variable_names("Target");

The instances are divided into training, selection, and testing subsets. They represent 60% (90), 20% (30), and 20% (30) of the original instances, respectively, and are split randomly using the following command:

// Split data set into training, selection and testing samples
data_set.split_samples_random();
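
As an illustration of what a random 60/20/20 split does, the sketch below shuffles the sample indices and cuts them into three groups. It is a minimal sketch rather than OpenNN's implementation. The names SampleSplit and split_samples and the fixed seed are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical result type: indices of the samples assigned to each subset.
struct SampleSplit
{
    std::vector<std::size_t> training;
    std::vector<std::size_t> selection;
    std::vector<std::size_t> testing;
};

// Shuffle the indices 0..samples_number-1 and assign them to the three subsets.
// The testing subset receives whatever is left after training and selection.
SampleSplit split_samples(std::size_t samples_number,
                          double training_ratio = 0.6,
                          double selection_ratio = 0.2,
                          unsigned seed = 0)
{
    assert(training_ratio >= 0.0 && selection_ratio >= 0.0);
    assert(training_ratio + selection_ratio <= 1.0);

    std::vector<std::size_t> indices(samples_number);
    std::iota(indices.begin(), indices.end(), std::size_t{0});

    std::mt19937 generator(seed);
    std::shuffle(indices.begin(), indices.end(), generator);

    const double n = static_cast<double>(samples_number);
    const std::size_t training_number =
        static_cast<std::size_t>(std::lround(training_ratio*n));
    const std::size_t selection_number =
        std::min(static_cast<std::size_t>(std::lround(selection_ratio*n)),
                 samples_number - training_number);

    SampleSplit split;
    for (std::size_t i = 0; i < samples_number; ++i)
    {
        if (i < training_number)
            split.training.push_back(indices[i]);
        else if (i < training_number + selection_number)
            split.selection.push_back(indices[i]);
        else
            split.testing.push_back(indices[i]);
    }

    return split;
}
```

For 150 samples, split_samples(150) produces subsets of 90, 30, and 30 indices, matching the proportions described above.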

To get the number of input and target variables, we use the following commands:

const Index input_variables_number = data_set.get_variables_number("Input");
const Index target_variables_number = data_set.get_variables_number("Target");

We scale the dataset to ensure the neural network operates in the best possible conditions. In our case, we will choose the minimum-maximum scaling method.

// Scaling input variables
Tensor<string, 1> scaling_inputs_methods(input_variables_number);
scaling_inputs_methods.setConstant("MinimumMaximum");
const vector<Descriptives> inputs_descriptives = data_set.scale_variables("Input");

This call also returns the statistical descriptives for each input: maximum, minimum, mean, and standard deviation. In this case, we do not scale the targets because they are either 0 or 1, which already works well for the neural network.
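
The sketch below illustrates minimum-maximum scaling for a single column of values. It is a standalone example, not OpenNN code. The names ColumnDescriptives, calculate_descriptives, and scale_minimum_maximum are hypothetical, the sample standard deviation (dividing by n - 1) is an assumption, and the target range of [-1, 1] is also an assumption that may differ from the range OpenNN uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical summary of one column of data.
struct ColumnDescriptives
{
    double minimum;
    double maximum;
    double mean;
    double standard_deviation;
};

// Compute minimum, maximum, mean and sample standard deviation of a column.
ColumnDescriptives calculate_descriptives(const std::vector<double>& values)
{
    assert(!values.empty());

    const auto bounds = std::minmax_element(values.begin(), values.end());
    const double n = static_cast<double>(values.size());
    const double mean = std::accumulate(values.begin(), values.end(), 0.0)/n;

    double squared_deviations = 0.0;
    for (const double value : values)
        squared_deviations += (value - mean)*(value - mean);

    const double standard_deviation =
        values.size() > 1 ? std::sqrt(squared_deviations/(n - 1.0)) : 0.0;

    return {*bounds.first, *bounds.second, mean, standard_deviation};
}

// Map each value linearly from [minimum, maximum] to [new_minimum, new_maximum].
// A constant column (zero range) is mapped to new_minimum, an arbitrary choice.
std::vector<double> scale_minimum_maximum(const std::vector<double>& values,
                                          const ColumnDescriptives& descriptives,
                                          double new_minimum = -1.0,
                                          double new_maximum = 1.0)
{
    const double range = descriptives.maximum - descriptives.minimum;

    std::vector<double> scaled;
    scaled.reserve(values.size());

    for (const double value : values)
    {
        scaled.push_back(range > 0.0
            ? new_minimum + (new_maximum - new_minimum)*(value - descriptives.minimum)/range
            : new_minimum);
    }

    return scaled;
}
```

Keeping the descriptives alongside the scaled values matters because the same minimum and maximum are needed later to scale new inputs or to undo the scaling.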

For more information about the data set methods, see the DataSet class.

2. Neural network

The second step is to choose the correct neural network architecture. For classification problems, it is usually composed of:

  • A scaling layer.
  • Two perceptron layers.

This architecture is already defined in OpenNN as ClassificationNetwork, which can be created as follows:

// Neural network architecture
const Index neurons_number = 3;
ClassificationNetwork neural_network(
    {input_variables_number},
    {neurons_number},
    {target_variables_number}
);

The NeuralNetwork class is responsible for building the neural network and organizing its layers of neurons; here, that happens through the constructor shown above. If you need more complex architectures, see the NeuralNetwork class.

Once the neural network has been created, we can add information to its layers, starting with the names of the inputs and outputs:

// Set input and output names
neural_network.set_feature_names(inputs_names);
neural_network.set_output_names(targets_names);

For the scaling layer, it is necessary to enter the descriptives and the scaling method calculated previously:

// Configure scaling layer
Scaling<2>* scaling_layer_pointer =
    static_cast<Scaling<2>*>(neural_network.get_first("Scaling2d"));

scaling_layer_pointer->set_descriptives(inputs_descriptives);
scaling_layer_pointer->set_scalers("MinimumMaximum");
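
To make the architecture in this section more concrete, here is a minimal sketch of the forward pass through two fully connected layers applied to already scaled inputs. It does not use OpenNN. The names DenseLayer, combine, softmax, and forward are hypothetical, and the activation functions (hyperbolic tangent in the hidden layer, softmax in the output layer) are assumptions, since the tutorial does not state which ones ClassificationNetwork uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>; // one row of weights per neuron

// Hypothetical fully connected layer: weights[j][i] links input i to neuron j.
struct DenseLayer
{
    Matrix weights;
    Vector biases;
};

// Compute bias + weighted sum of the inputs for every neuron in the layer.
Vector combine(const DenseLayer& layer, const Vector& inputs)
{
    assert(layer.weights.size() == layer.biases.size());

    Vector combinations(layer.biases);
    for (std::size_t j = 0; j < layer.weights.size(); ++j)
    {
        assert(layer.weights[j].size() == inputs.size());
        for (std::size_t i = 0; i < inputs.size(); ++i)
            combinations[j] += layer.weights[j][i]*inputs[i];
    }

    return combinations;
}

// Turn arbitrary scores into positive values that sum to 1.
// Subtracting the maximum first avoids overflow in std::exp.
Vector softmax(Vector values)
{
    assert(!values.empty());

    const double maximum = *std::max_element(values.begin(), values.end());
    double sum = 0.0;
    for (double& value : values)
    {
        value = std::exp(value - maximum);
        sum += value;
    }
    for (double& value : values)
        value /= sum;

    return values;
}

// Hidden layer with tanh activation followed by an output layer with softmax.
Vector forward(const DenseLayer& hidden, const DenseLayer& output, const Vector& scaled_inputs)
{
    Vector hidden_outputs = combine(hidden, scaled_inputs);
    for (double& value : hidden_outputs)
        value = std::tanh(value);

    return softmax(combine(output, hidden_outputs));
}
```

For the iris problem, the hidden layer would hold 3 neurons with 4 weights each, and the output layer 3 neurons with 3 weights each, one output per class.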

We now have a complete model, so we can proceed to the learning process with TrainingStrategy.

3. Training strategy

The third step is to set the training strategy, which is composed of:

  • Loss index.
  • Optimization algorithm.

Firstly, we construct the training strategy object

// Training strategy
TrainingStrategy training_strategy(&neural_network, &data_set);

then, set the error term

// Loss function
training_strategy.set_loss_index("NormalizedSquaredError");

and finally, the optimization algorithm

// Optimization algorithm
training_strategy.set_optimization_algorithm("AdaptiveMomentEstimation");

Note that setting these two options explicitly is optional: by default, OpenNN builds the training strategy object with the quasi-Newton method as the optimization algorithm and the normalized squared error as the loss index. The calls above are needed only to choose something else, as we do here with adaptive moment estimation (Adam). We can now start the training process with the command:

// Train the model
training_strategy.train();
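
For reference, the normalized squared error mentioned above can be sketched as the sum of squared errors divided by the sum of squared deviations of the targets from their mean. This is one common definition, written here for a single flattened list of values. The function name is hypothetical and OpenNN's exact normalization may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sum of squared errors divided by the spread of the targets around their mean.
// A value of 0 is a perfect fit; a value of 1 is what a model that always
// predicts the mean of the targets would obtain.
double normalized_squared_error(const std::vector<double>& outputs,
                                const std::vector<double>& targets)
{
    assert(outputs.size() == targets.size() && !targets.empty());

    const double targets_mean =
        std::accumulate(targets.begin(), targets.end(), 0.0)
        /static_cast<double>(targets.size());

    double squared_error = 0.0;
    double normalization = 0.0;

    for (std::size_t i = 0; i < targets.size(); ++i)
    {
        squared_error += (outputs[i] - targets[i])*(outputs[i] - targets[i]);
        normalization += (targets[i] - targets_mean)*(targets[i] - targets_mean);
    }

    assert(normalization > 0.0); // the targets must not all be identical
    return squared_error/normalization;
}
```

Because of the normalization, the value does not depend on the scale of the targets, which makes a fixed loss goal easier to interpret.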

If we need to go further, OpenNN allows control of the optimization, for example:

// Configure Adam optimizer
AdaptiveMomentEstimation* adam =
    dynamic_cast<AdaptiveMomentEstimation*>(
        training_strategy.get_optimization_algorithm()
    );

adam->set_loss_goal(type(1.0e-3));
adam->set_maximum_epochs_number(10000);
adam->set_display_period(1000);

// Train the model
training_strategy.train();
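
To show how settings such as a loss goal and a maximum number of epochs interact with the optimizer, the sketch below applies the standard Adam update rule to a function of one parameter. It is an illustration, not OpenNN's implementation. The names AdamResult and minimize_with_adam are hypothetical, the hyperparameters (0.9, 0.999, 1e-8) are the values commonly used with Adam, and the learning rate is an arbitrary choice for the example.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>

// Hypothetical summary of an optimization run.
struct AdamResult
{
    double parameter;
    double loss;
    std::size_t epochs;
};

// Minimize a one-parameter loss with Adam. Training stops when the loss
// reaches loss_goal or when maximum_epochs updates have been made.
AdamResult minimize_with_adam(const std::function<double(double)>& loss,
                              const std::function<double(double)>& gradient,
                              double parameter,
                              double loss_goal,
                              std::size_t maximum_epochs,
                              double learning_rate = 0.05)
{
    const double beta_1 = 0.9;    // decay rate of the first moment
    const double beta_2 = 0.999;  // decay rate of the second moment
    const double epsilon = 1.0e-8;

    double first_moment = 0.0;
    double second_moment = 0.0;
    double current_loss = loss(parameter);
    std::size_t epoch = 0;

    while (current_loss > loss_goal && epoch < maximum_epochs)
    {
        ++epoch;
        const double g = gradient(parameter);

        // Exponential moving averages of the gradient and its square.
        first_moment = beta_1*first_moment + (1.0 - beta_1)*g;
        second_moment = beta_2*second_moment + (1.0 - beta_2)*g*g;

        // Bias correction compensates for the zero initialization.
        const double t = static_cast<double>(epoch);
        const double corrected_first = first_moment/(1.0 - std::pow(beta_1, t));
        const double corrected_second = second_moment/(1.0 - std::pow(beta_2, t));

        parameter -= learning_rate*corrected_first/(std::sqrt(corrected_second) + epsilon);
        current_loss = loss(parameter);
    }

    return {parameter, current_loss, epoch};
}
```

For example, minimizing (x - 3)^2 from x = 0 with a loss goal of 1.0e-3 stops as soon as the loss drops below that goal, well before a limit of 10000 epochs is reached.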

For more information about the training strategy methods, see the TrainingStrategy class.

4. Model selection

The fourth step is to set the model selection, which is composed of:

  • Input selection algorithm.
  • Neuron selection algorithm.

If you’re unsure about your architecture choice, the model selection class helps identify the architecture with the best generalization, that is, the one that minimizes the error on the selection instances of the data set.

The first step is to construct the model selection object:

// Model selection
ModelSelection model_selection(&training_strategy);

In this example, we want to optimize the number of neurons in the network architecture using the "neuron selection" algorithm:

// Perform neurons selection
model_selection.perform_neurons_selection();

Once the algorithm is finished, our model will have the number of neurons that gave the lowest selection error among the architectures tried.
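
The idea behind neuron selection can be sketched as a search over candidate sizes that keeps the one with the lowest selection error. The code below is a simplified illustration, not OpenNN's algorithm. The names NeuronsSelectionResult and select_neurons are hypothetical, and selection_error_for stands in for "train a network with this many hidden neurons and return its error on the selection instances".

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>

// Hypothetical result: the best number of neurons found and its selection error.
struct NeuronsSelectionResult
{
    std::size_t neurons_number;
    double selection_error;
};

// Try every size from minimum_neurons to maximum_neurons and keep the one
// whose selection error is lowest. Ties keep the smaller network.
NeuronsSelectionResult select_neurons(
    std::size_t minimum_neurons,
    std::size_t maximum_neurons,
    const std::function<double(std::size_t)>& selection_error_for)
{
    assert(minimum_neurons >= 1 && minimum_neurons <= maximum_neurons);

    NeuronsSelectionResult best{minimum_neurons,
                                std::numeric_limits<double>::infinity()};

    for (std::size_t neurons = minimum_neurons; neurons <= maximum_neurons; ++neurons)
    {
        const double error = selection_error_for(neurons);
        if (error < best.selection_error)
            best = {neurons, error};
    }

    return best;
}
```

Using the selection instances rather than the training instances for this comparison is what guards against picking a network that is too large and overfits.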

For more information about the model selection methods, see the ModelSelection class.

5. Testing analysis

The fifth step is to evaluate our model. For that purpose, we need to use the testing analysis class, which is designed to validate the model’s generalization performance. Here, we compare the neural network outputs to the corresponding targets in the testing instances of the data set.

First, we must undo the scaling applied earlier to the input variables of the data set:

// Unscale input variables
data_set.unscale_variables("Input", inputs_descriptives);

We are ready to test our model. As in the previous cases, we start by building the testing analysis object

// Testing analysis
TestingAnalysis testing_analysis(&neural_network, &data_set);

and perform the testing. In our case, we use a confusion matrix:

// Confusion matrix
Tensor<Index, 2> confusion = testing_analysis.calculate_confusion();

In a confusion matrix, rows represent targets (or real values), and columns represent outputs (or predicted values). The diagonal cells show correctly classified cases, and the off-diagonal cells show misclassified cases.
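
The layout described above can be reproduced with a few lines of standard C++. This sketch assumes classes are given as integer indices. The names ConfusionMatrix, confusion_matrix, and accuracy are hypothetical and unrelated to the TestingAnalysis implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using ConfusionMatrix = std::vector<std::vector<std::size_t>>;

// Count how often each target class (row) was predicted as each output class (column).
ConfusionMatrix confusion_matrix(const std::vector<std::size_t>& targets,
                                 const std::vector<std::size_t>& outputs,
                                 std::size_t classes_number)
{
    assert(targets.size() == outputs.size());

    ConfusionMatrix matrix(classes_number, std::vector<std::size_t>(classes_number, 0));

    for (std::size_t i = 0; i < targets.size(); ++i)
    {
        assert(targets[i] < classes_number && outputs[i] < classes_number);
        ++matrix[targets[i]][outputs[i]];
    }

    return matrix;
}

// Fraction of samples on the diagonal, that is, correctly classified.
double accuracy(const ConfusionMatrix& matrix)
{
    std::size_t correct = 0;
    std::size_t total = 0;

    for (std::size_t row = 0; row < matrix.size(); ++row)
    {
        for (std::size_t column = 0; column < matrix[row].size(); ++column)
        {
            total += matrix[row][column];
            if (row == column)
                correct += matrix[row][column];
        }
    }

    assert(total > 0);
    return static_cast<double>(correct)/static_cast<double>(total);
}
```

With six made-up test samples whose targets are 0 0 1 1 2 2 and whose predictions are 0 0 1 2 2 2, the only off-diagonal count is in row 1, column 2, and the accuracy is 5/6.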

For more information about the testing analysis methods, see the TestingAnalysis class.

6. Model deployment

Once our model is completed, the neural network can predict outputs for inputs it has never seen. This process is called model deployment.

To generate predictions with new data, you can use:

// Calculate network outputs
neural_network.calculate_outputs();

For instance, the new inputs are:

  • Sepal length: 5.10 cm.
  • Sepal width: 3.50 cm.
  • Petal length: 1.40 cm.
  • Petal width: 0.20 cm.

In OpenNN, we can write it as:

// Model inference
Tensor<type, 2> inputs(1, 4);
inputs.setValues({{type(5.1), type(3.5), type(1.4), type(0.2)}});

// Keep the result so the predicted class scores can be inspected
const auto outputs = neural_network.calculate_outputs<2, 2>(inputs);

// Unscale input variables
data_set.unscale_variables("Input", inputs_descriptives);

Alternatively, we can export the model as source code for later use in other languages, such as C, Python, or PHP:

// Model export
ModelExpression model_expression(&neural_network);

model_expression.save_c(
    "../data/expression.txt",
    data_set.get_raw_variables()
);

// Use a separate file so the C export above is not overwritten
model_expression.save_python(
    "../data/expression.py",
    data_set.get_raw_variables()
);
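
Once the network outputs for a new flower are available, as in the inference example in this section, a common way to read them is to pick the class with the highest score. The sketch below shows that step with the standard library. The function name predicted_class is hypothetical, and the scores in the usage note are made-up values rather than results from the trained model.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Return the name of the class whose output score is highest.
// If several scores tie, the first one wins.
std::string predicted_class(const std::vector<double>& outputs,
                            const std::vector<std::string>& class_names)
{
    assert(!outputs.empty() && outputs.size() == class_names.size());

    const auto best = std::max_element(outputs.begin(), outputs.end());
    const std::size_t index =
        static_cast<std::size_t>(std::distance(outputs.begin(), best));

    return class_names[index];
}
```

For instance, with hypothetical scores of 0.97, 0.02, and 0.01 and the class names in the order used for the targets, the function returns iris_setosa.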