OpenNN
Open-source neural networks library
opennn::StochasticGradientDescent Class Reference (final)

Mini-batch SGD with optional momentum, Nesterov acceleration and learning-rate decay. More...

#include <stochastic_gradient_descent.h>

Inheritance diagram for opennn::StochasticGradientDescent:

Public Types

enum  DataSlot { ParameterUpdate , LastParameterUpdate }
 Slot indices into OptimizerData::views used by SGD. More...
 
- Public Types inherited from opennn::Optimizer
enum class  StoppingCondition {
  None , MinimumLossDecrease , LossGoal , MaximumSelectionErrorIncreases ,
  MaximumEpochsNumber , MaximumTime
}
 Reasons that can terminate a training run. More...
 

Public Member Functions

 StochasticGradientDescent (Loss *loss=nullptr)
 Constructs the optimizer.
 
void set_default ()
 Resets all hyperparameters to their default values.
 
void set_batch_size (const Index)
 Sets the mini-batch size used during training.
 
Index get_samples_number () const
 Returns the number of training samples in the bound dataset.
 
void set_initial_learning_rate (const float)
 Sets the base learning rate (before any decay).
 
void set_initial_decay (const float)
 Sets the per-epoch learning-rate decay factor.
 
void set_momentum (const float)
 Sets the momentum coefficient.
 
void set_nesterov (bool)
 Toggles Nesterov accelerated gradient.
 
void update_parameters (BackPropagation &back_propagation, OptimizerData &data, float learning_rate) const
 Applies one SGD parameter update.
 
TrainingResults train () override
 Runs SGD to completion.
 
void from_JSON (const JsonDocument &) override
 Loads optimizer hyperparameters from a parsed JSON document.
 
void to_JSON (JsonWriter &) const override
 Writes optimizer hyperparameters to a streaming JSON writer.
 
- Public Member Functions inherited from opennn::Optimizer
 Optimizer (Loss *loss=nullptr)
 Constructs an optimizer bound to a loss function.
 
virtual ~Optimizer ()=default
 Virtual destructor.
 
const Loss * get_loss () const
 Read-only access to the loss being optimized.
 
bool get_display () const
 Whether progress should be printed to stdout during training.
 
void set (Loss *new_loss)
 Re-initializes the optimizer by setting its loss pointer.
 
virtual void set_loss (Loss *new_loss)
 Updates the loss pointer; subclasses may override to refresh cached state derived from the loss.
 
virtual void set_display (bool new_display)
 Toggles per-epoch progress printing.
 
void set_display_period (const Index new_display_period)
 Sets how often progress is printed.
 
void set_maximum_epochs (const Index new_maximum_epochs)
 Sets the maximum number of epochs.
 
void set_maximum_time (const float new_maximum_time)
 Sets the maximum wall-clock training time.
 
void set_loss_goal (const float new_loss_goal)
 Sets the training-loss goal.
 
void set_maximum_validation_failures (const Index new_maximum_validation_failures)
 Sets the maximum number of consecutive validation-error increases tolerated.
 
const string & get_name () const
 Canonical name of the optimizer (set by subclasses).
 
virtual void print () const
 Prints a human-readable summary of the optimizer to stdout.
 
void save (const filesystem::path &) const
 Saves the optimizer state to a file.
 
void load (const filesystem::path &)
 Loads the optimizer state from a file.
 

Additional Inherited Members

- Static Public Member Functions inherited from opennn::Optimizer
static float get_elapsed_time (const time_t &beginning_time)
 Computes the elapsed wall-clock time since a reference instant.
 
- Protected Member Functions inherited from opennn::Optimizer
void set_names ()
 Subclass hook to refresh layer name caches after a loss change.
 
void set_scaling ()
 Subclass hook to install the dataset-derived input scalers.
 
void set_unscaling ()
 Subclass hook to install the dataset-derived output unscalers.
 
bool check_stopping_condition (TrainingResults &results, Index epoch, float elapsed_time, float training_error, Index validation_failures) const
 Evaluates every stopping criterion and updates the result accordingly.
 
void write_common_xml (JsonWriter &) const
 Writes the common Optimizer fields to JSON.
 
void read_common_xml (const Json *)
 Reads the common Optimizer fields from JSON.
 
void setup_device_training ()
 Allocates the CUDA stream and events used for batch prefetching.
 
void teardown_device_training ()
 Releases the CUDA stream and events allocated by setup_device_training().
 
void prefetch_batch (Batch &batch, Index sample_count, int slot)
 Asynchronously prefetches the next training batch into a slot.
 
void wait_prefetch (int slot)
 Waits for the prefetch into a given slot to finish.
 
void sync_device ()
 Synchronizes the device on the optimizer's CUDA stream.
 
bool should_display (Index epoch) const
 Whether the current epoch should print progress.
 
EpochStats train_epoch (bool is_classification, ForwardPropagation &forward_propagation, BackPropagation &back_propagation, ThreadSafeQueue< Batch * > &empty_queue, ThreadSafeQueue< Batch * > &ready_queue, const vector< vector< Index > > &batches, const vector< Index > &input_feature_indices, const vector< Index > &decoder_feature_indices, const vector< Index > &target_feature_indices, const std::function< void(BackPropagation &)> &update)
 Runs a single training epoch over all batches.
 
EpochStats evaluate_epoch (bool is_classification, ForwardPropagation &forward_propagation, ThreadSafeQueue< Batch * > &empty_queue, ThreadSafeQueue< Batch * > &ready_queue, const vector< vector< Index > > &batches, const vector< Index > &input_feature_indices, const vector< Index > &decoder_feature_indices, const vector< Index > &target_feature_indices)
 Runs a single evaluation pass over all batches without updating parameters.
 
- Static Protected Member Functions inherited from opennn::Optimizer
static void clip_gradient_norm (Buffer &gradient, float max_norm)
 In-place gradient norm clipping.
 
- Protected Attributes inherited from opennn::Optimizer
Loss * loss = nullptr
 Loss being optimized; not owned.
 
float training_loss_goal = 0.0f
 Training stops when the training loss reaches this value.
 
Index maximum_validation_failures = numeric_limits<Index>::max()
 Maximum number of consecutive validation-error increases tolerated.
 
Index maximum_epochs = 10000
 Maximum number of training epochs.
 
float maximum_time = 360000.0f
 Maximum wall-clock training time in seconds.
 
Index display_period = 10
 Number of epochs between progress prints.
 
bool display = true
 Whether progress should be printed to stdout during training.
 
string name
 Canonical name of the optimizer (set by subclasses).
 
cudaStream_t memory_stream = nullptr
 CUDA stream used to prefetch batches into device memory.
 
cudaEvent_t batch_ready_event [2] = {nullptr, nullptr}
 CUDA events signaling when each prefetched batch is ready.
 

Detailed Description

Mini-batch SGD with optional momentum, Nesterov acceleration and learning-rate decay.

Updates parameters as theta -= lr * grad. With momentum, accumulates a velocity vector v = momentum * v + grad and steps along v; the Nesterov variant evaluates the gradient at theta - lr * momentum * v to obtain a lookahead update.
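The update rules above can be sketched in plain C++. This is an illustrative standalone sketch, not OpenNN's internal implementation: the function names `sgd_step` and `sgd_momentum_step` are hypothetical, and the Nesterov branch uses the common reformulation that steps along momentum * v + grad rather than re-evaluating the gradient at the lookahead point.

```cpp
#include <cstddef>
#include <vector>

// Plain SGD step: theta -= lr * grad.
void sgd_step(std::vector<float>& theta, const std::vector<float>& grad, float lr)
{
    for (std::size_t i = 0; i < theta.size(); ++i)
        theta[i] -= lr * grad[i];
}

// Momentum variant: accumulate velocity v = momentum * v + grad and step along v.
// With nesterov == true, step along momentum * v + grad (lookahead reformulation).
void sgd_momentum_step(std::vector<float>& theta, std::vector<float>& velocity,
                       const std::vector<float>& grad,
                       float lr, float momentum, bool nesterov)
{
    for (std::size_t i = 0; i < theta.size(); ++i)
    {
        velocity[i] = momentum * velocity[i] + grad[i];
        const float update = nesterov ? momentum * velocity[i] + grad[i]
                                      : velocity[i];
        theta[i] -= lr * update;
    }
}
```

On the first step the velocity buffer is zero, so momentum and plain SGD coincide; the two diverge from the second step onward as past gradients accumulate in the velocity.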

Member Enumeration Documentation

◆ DataSlot

Slot indices into OptimizerData::views used by SGD.

Enumerator
ParameterUpdate 

Current parameter increment.

LastParameterUpdate 

Previous increment (used by momentum).

Constructor & Destructor Documentation

◆ StochasticGradientDescent()

opennn::StochasticGradientDescent::StochasticGradientDescent ( Loss * loss = nullptr)

Constructs the optimizer.

Parameters
loss	Loss to optimize; may be nullptr if the loss is set later.

Member Function Documentation

◆ from_JSON()

void opennn::StochasticGradientDescent::from_JSON ( const JsonDocument & )
overridevirtual

Loads optimizer hyperparameters from a parsed JSON document.

Reimplemented from opennn::Optimizer.

◆ get_samples_number()

Index opennn::StochasticGradientDescent::get_samples_number ( ) const

Returns the number of training samples in the bound dataset.

Returns
Sample count, or 0 if the optimizer is not bound to a loss.

◆ set_batch_size()

void opennn::StochasticGradientDescent::set_batch_size ( const Index )

Sets the mini-batch size used during training.

Receives the number of samples per gradient update.

◆ set_default()

void opennn::StochasticGradientDescent::set_default ( )

Resets all hyperparameters to their default values.

◆ set_initial_decay()

void opennn::StochasticGradientDescent::set_initial_decay ( const float )

Sets the per-epoch learning-rate decay factor.

Receives the decay rate; the effective learning rate at epoch t is lr / (1 + decay * t).
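The documented schedule can be written out directly. A minimal sketch of the stated formula, with a hypothetical helper name:

```cpp
// Time-based decay, mirroring the documented schedule:
// effective learning rate at epoch t is lr / (1 + decay * t).
float effective_learning_rate(float initial_lr, float decay, int epoch)
{
    return initial_lr / (1.0f + decay * static_cast<float>(epoch));
}
```

With decay = 0, the learning rate stays at its initial value for the whole run; larger decay values shrink it hyperbolically with the epoch index.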

◆ set_initial_learning_rate()

void opennn::StochasticGradientDescent::set_initial_learning_rate ( const float )

Sets the base learning rate (before any decay).

Receives the learning rate used at the first epoch.

◆ set_momentum()

void opennn::StochasticGradientDescent::set_momentum ( const float )

Sets the momentum coefficient.

Receives the momentum value (0 disables momentum).

◆ set_nesterov()

void opennn::StochasticGradientDescent::set_nesterov ( bool )

Toggles Nesterov accelerated gradient.

Receives true to enable Nesterov, false to use vanilla momentum.

◆ to_JSON()

void opennn::StochasticGradientDescent::to_JSON ( JsonWriter & ) const
overridevirtual

Writes optimizer hyperparameters to a streaming JSON writer.

Reimplemented from opennn::Optimizer.

◆ train()

TrainingResults opennn::StochasticGradientDescent::train ( )
overridevirtual

Runs SGD to completion.

Returns
Per-epoch error history and the stopping condition that fired.

Implements opennn::Optimizer.

◆ update_parameters()

void opennn::StochasticGradientDescent::update_parameters ( BackPropagation & back_propagation,
OptimizerData & data,
float learning_rate ) const

Applies one SGD parameter update.

Parameters
back_propagation	Gradient buffer for the current batch.
data	Mutable optimizer state (last update buffer).
learning_rate	Effective learning rate for this epoch.