opennn::StochasticGradientDescent Class Reference
Mini-batch SGD with optional momentum, Nesterov acceleration and learning-rate decay. More...
#include <stochastic_gradient_descent.h>
Public Types | |
| enum | DataSlot { ParameterUpdate , LastParameterUpdate } |
| Slot indices into OptimizerData::views used by SGD. More... | |
Public Types inherited from opennn::Optimizer | |
| enum class | StoppingCondition { None , MinimumLossDecrease , LossGoal , MaximumSelectionErrorIncreases , MaximumEpochsNumber , MaximumTime } |
| Reasons that can terminate a training run. More... | |
Public Member Functions | |
| StochasticGradientDescent (Loss *loss=nullptr) | |
| Constructs the optimizer. | |
| void | set_default () |
| Resets all hyperparameters to their default values. | |
| void | set_batch_size (const Index) |
| Sets the mini-batch size used during training. | |
| Index | get_samples_number () const |
| Returns the number of training samples in the bound dataset. | |
| void | set_initial_learning_rate (const float) |
| Sets the base learning rate (before any decay). | |
| void | set_initial_decay (const float) |
| Sets the per-epoch learning-rate decay factor. | |
| void | set_momentum (const float) |
| Sets the momentum coefficient. | |
| void | set_nesterov (bool) |
| Toggles Nesterov accelerated gradient. | |
| void | update_parameters (BackPropagation &back_propagation, OptimizerData &data, float learning_rate) const |
| Applies one SGD parameter update. | |
| TrainingResults | train () override |
| Runs SGD to completion. | |
| void | from_JSON (const JsonDocument &) override |
| Loads optimizer hyperparameters from a parsed JSON document. | |
| void | to_JSON (JsonWriter &) const override |
| Writes optimizer hyperparameters to a streaming JSON writer. | |
Public Member Functions inherited from opennn::Optimizer | |
| Optimizer (Loss *loss=nullptr) | |
| Constructs an optimizer bound to a loss function. | |
| virtual | ~Optimizer ()=default |
| Virtual destructor. | |
| const Loss * | get_loss () const |
| Read-only access to the loss being optimized. | |
| bool | get_display () const |
| Whether progress should be printed to stdout during training. | |
| void | set (Loss *new_loss) |
| Re-initializes the optimizer by setting its loss pointer. | |
| virtual void | set_loss (Loss *new_loss) |
| Updates the loss pointer; subclasses may override to refresh cached state derived from the loss. | |
| virtual void | set_display (bool new_display) |
| Toggles per-epoch progress printing. | |
| void | set_display_period (const Index new_display_period) |
| Sets how often progress is printed. | |
| void | set_maximum_epochs (const Index new_maximum_epochs) |
| Sets the maximum number of epochs. | |
| void | set_maximum_time (const float new_maximum_time) |
| Sets the maximum wall-clock training time. | |
| void | set_loss_goal (const float new_loss_goal) |
| Sets the training-loss goal. | |
| void | set_maximum_validation_failures (const Index new_maximum_validation_failures) |
| Sets the maximum number of consecutive validation-error increases tolerated. | |
| const string & | get_name () const |
| Canonical name of the optimizer (set by subclasses). | |
| virtual void | print () const |
| Prints a human-readable summary of the optimizer to stdout. | |
| void | save (const filesystem::path &) const |
| Saves the optimizer state to a file. | |
| void | load (const filesystem::path &) |
| Loads the optimizer state from a file. | |
Additional Inherited Members | |
Static Public Member Functions inherited from opennn::Optimizer | |
| static float | get_elapsed_time (const time_t &beginning_time) |
| Computes the elapsed wall-clock time since a reference instant. | |
Protected Member Functions inherited from opennn::Optimizer | |
| void | set_names () |
| Subclass hook to refresh layer name caches after a loss change. | |
| void | set_scaling () |
| Subclass hook to install the dataset-derived input scalers. | |
| void | set_unscaling () |
| Subclass hook to install the dataset-derived output unscalers. | |
| bool | check_stopping_condition (TrainingResults &results, Index epoch, float elapsed_time, float training_error, Index validation_failures) const |
| Evaluates every stopping criterion and updates the result accordingly. | |
| void | write_common_xml (JsonWriter &) const |
| Writes the common Optimizer fields to JSON. | |
| void | read_common_xml (const Json *) |
| Reads the common Optimizer fields from JSON. | |
| void | setup_device_training () |
| Allocates the CUDA stream and events used for batch prefetching. | |
| void | teardown_device_training () |
| Releases the CUDA stream and events allocated by setup_device_training(). | |
| void | prefetch_batch (Batch &batch, Index sample_count, int slot) |
| Asynchronously prefetches the next training batch into a slot. | |
| void | wait_prefetch (int slot) |
| Waits for the prefetch into a given slot to finish. | |
| void | sync_device () |
| Synchronizes the device on the optimizer's CUDA stream. | |
| bool | should_display (Index epoch) const |
| Whether the current epoch should print progress. | |
| EpochStats | train_epoch (bool is_classification, ForwardPropagation &forward_propagation, BackPropagation &back_propagation, ThreadSafeQueue< Batch * > &empty_queue, ThreadSafeQueue< Batch * > &ready_queue, const vector< vector< Index > > &batches, const vector< Index > &input_feature_indices, const vector< Index > &decoder_feature_indices, const vector< Index > &target_feature_indices, const std::function< void(BackPropagation &)> &update) |
| Runs a single training epoch over all batches. | |
| EpochStats | evaluate_epoch (bool is_classification, ForwardPropagation &forward_propagation, ThreadSafeQueue< Batch * > &empty_queue, ThreadSafeQueue< Batch * > &ready_queue, const vector< vector< Index > > &batches, const vector< Index > &input_feature_indices, const vector< Index > &decoder_feature_indices, const vector< Index > &target_feature_indices) |
| Runs a single evaluation pass over all batches without updating parameters. | |
Static Protected Member Functions inherited from opennn::Optimizer | |
| static void | clip_gradient_norm (Buffer &gradient, float max_norm) |
| In-place gradient norm clipping. | |
Protected Attributes inherited from opennn::Optimizer | |
| Loss * | loss = nullptr |
| Loss being optimized; not owned. | |
| float | training_loss_goal = 0.0f |
| Training stops when the training loss reaches this value. | |
| Index | maximum_validation_failures = numeric_limits<Index>::max() |
| Maximum number of consecutive validation-error increases tolerated. | |
| Index | maximum_epochs = 10000 |
| Maximum number of training epochs. | |
| float | maximum_time = 360000.0f |
| Maximum wall-clock training time in seconds. | |
| Index | display_period = 10 |
| Number of epochs between progress prints. | |
| bool | display = true |
| Whether progress should be printed to stdout during training. | |
| string | name |
| Canonical name of the optimizer (set by subclasses). | |
| cudaStream_t | memory_stream = nullptr |
| CUDA stream used to prefetch batches into device memory. | |
| cudaEvent_t | batch_ready_event [2] = {nullptr, nullptr} |
| CUDA events signaling when each prefetched batch is ready. | |
Mini-batch SGD with optional momentum, Nesterov acceleration and learning-rate decay.
Updates parameters as theta -= lr * grad. With momentum, accumulates a velocity vector v = momentum * v + grad and steps along v; the Nesterov variant evaluates the gradient at theta - lr * momentum * v to obtain a lookahead update.
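The update rule above can be written as a short, self-contained sketch. The function below mirrors the arithmetic on plain std::vector<float> buffers; the buffer names and the Nesterov reformulation theta -= lr * (grad + momentum * v), the usual practical form of the lookahead step, are illustrative and not taken from OpenNN's internal code.

    #include <cstddef>
    #include <vector>

    // Expository stand-alone step; OpenNN applies the same arithmetic to its own
    // parameter and gradient buffers inside update_parameters().
    void sgd_step(std::vector<float>& theta,          // parameters
                  const std::vector<float>& grad,     // gradient of the current batch
                  std::vector<float>& v,              // velocity, persists across calls
                  float lr, float momentum, bool nesterov)
    {
        for (std::size_t i = 0; i < theta.size(); ++i)
        {
            if (momentum == 0.0f)
            {
                theta[i] -= lr * grad[i];                             // theta -= lr * grad
                continue;
            }
            v[i] = momentum * v[i] + grad[i];                         // v = momentum * v + grad
            theta[i] -= lr * (nesterov ? grad[i] + momentum * v[i]   // lookahead (Nesterov) step
                                       : v[i]);                      // classical momentum step
        }
    }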
Slot indices into OptimizerData::views used by SGD.
| Enumerator | |
|---|---|
| ParameterUpdate | Current parameter increment. |
| LastParameterUpdate | Previous increment (used by momentum). |
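A simplified view of how the two slots can carry the state that momentum needs: one view holds the increment applied in the current step, the other the increment from the previous step. The ToyOptimizerData struct and the helper below are stand-ins; OpenNN's real OptimizerData stores device-side views rather than std::vector buffers.

    #include <cstddef>
    #include <vector>

    struct ToyOptimizerData
    {
        std::vector<float> views[2];   // [0] = ParameterUpdate, [1] = LastParameterUpdate
    };

    // Computes the new parameter increment from the previous one (illustrative only).
    void momentum_increment(ToyOptimizerData& data,
                            const std::vector<float>& grad,
                            float lr, float momentum)
    {
        std::vector<float>& update = data.views[0];    // current increment
        std::vector<float>& last   = data.views[1];    // increment of the previous batch
        update.assign(grad.size(), 0.0f);
        last.resize(grad.size(), 0.0f);

        for (std::size_t i = 0; i < grad.size(); ++i)
            update[i] = momentum * last[i] - lr * grad[i];   // new step built from the old one

        last = update;   // current increment becomes "last" for the next batch
    }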
| opennn::StochasticGradientDescent::StochasticGradientDescent | ( | Loss * | loss = nullptr | ) |
Constructs the optimizer.
| loss | Loss to optimize; may be nullptr if set later. |
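A hedged end-to-end usage sketch built only from the members documented on this page; how the Loss object itself is created and bound to a model and dataset is outside the scope of this page, so it is passed in as an already-configured pointer.

    #include <stochastic_gradient_descent.h>

    using namespace opennn;

    TrainingResults train_with_sgd(Loss* loss)   // assumed already bound to a model and data
    {
        StochasticGradientDescent sgd(loss);

        sgd.set_batch_size(128);
        sgd.set_initial_learning_rate(0.01f);
        sgd.set_initial_decay(1e-4f);
        sgd.set_momentum(0.9f);
        sgd.set_nesterov(true);

        sgd.set_maximum_epochs(1000);   // inherited from opennn::Optimizer
        sgd.set_loss_goal(1e-3f);       // inherited from opennn::Optimizer

        return sgd.train();             // runs SGD to completion
    }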
| void opennn::StochasticGradientDescent::from_JSON | ( | const JsonDocument & | ) |
override virtual
Loads optimizer hyperparameters from a parsed JSON document.
Reimplemented from opennn::Optimizer.
| Index opennn::StochasticGradientDescent::get_samples_number | ( | ) | const |
Returns the number of training samples in the bound dataset.
| void opennn::StochasticGradientDescent::set_batch_size | ( | const Index | ) |
Sets the mini-batch size used during training.
Receives the number of samples per gradient update.
| void opennn::StochasticGradientDescent::set_default | ( | ) |
Resets all hyperparameters to their default values.
| void opennn::StochasticGradientDescent::set_initial_decay | ( | const float | ) |
Sets the per-epoch learning-rate decay factor.
Receives the decay rate; the effective learning rate at epoch t is lr / (1 + decay * t).
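For concreteness, the inverse-time schedule quoted above can be tabulated with a tiny expository helper (not part of the library):

    // Effective learning rate under the decay rule lr / (1 + decay * t).
    float decayed_learning_rate(float initial_lr, float decay, int epoch)
    {
        return initial_lr / (1.0f + decay * static_cast<float>(epoch));
    }

    // Example with initial_lr = 0.01 and decay = 0.1:
    //   epoch  0 -> 0.0100
    //   epoch 10 -> 0.0050
    //   epoch 90 -> 0.0010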
| void opennn::StochasticGradientDescent::set_initial_learning_rate | ( | const float | ) |
Sets the base learning rate (before any decay).
Receives the learning rate used at the first epoch.
| void opennn::StochasticGradientDescent::set_momentum | ( | const float | ) |
Sets the momentum coefficient.
Receives the momentum value (0 disables momentum).
| void opennn::StochasticGradientDescent::set_nesterov | ( | bool | ) |
Toggles Nesterov accelerated gradient.
Receives true to enable Nesterov, false to use vanilla momentum.
| void opennn::StochasticGradientDescent::to_JSON | ( | JsonWriter & | ) | const |
override virtual
Writes optimizer hyperparameters to a streaming JSON writer.
Reimplemented from opennn::Optimizer.
| TrainingResults opennn::StochasticGradientDescent::train | ( | ) |
override virtual
Runs SGD to completion.
Implements opennn::Optimizer.
| void opennn::StochasticGradientDescent::update_parameters | ( | BackPropagation &back_propagation, OptimizerData &data, float learning_rate | ) | const |
Applies one SGD parameter update.
| back_propagation | Gradient buffer for the current batch. |
| data | Mutable optimizer state (last update buffer). |
| learning_rate | Effective learning rate for this epoch. |