float initial_learning_rate;
float momentum = 0.0f;
bool nesterov = false;
Unified loss container supporting MSE, cross-entropy, Minkowski, weighted, and regularized variants.
Definition loss.h:24
Optimizer(Loss *=nullptr)
Constructs an optimizer optionally bound to a Loss instance.
void set_batch_size(const Index)
Sets the minibatch size used by train().
void set_nesterov(bool)
Enables or disables Nesterov-accelerated momentum.
void set_initial_decay(const float)
Sets the learning-rate decay applied each epoch.
void to_JSON(JsonWriter &) const override
Serializes hyperparameters to JSON.
void set_default()
Resets all hyperparameters (learning rate, decay, momentum, Nesterov) to library defaults.
void update_parameters(BackPropagation &, OptimizerData &, float) const
Applies one SGD update to the network parameters using the gradient and current learning rate.
void set_initial_learning_rate(const float)
Sets the initial learning rate eta_0.
Index get_samples_number() const
Returns the number of training samples in the bound dataset.
TrainingResults train() override
Runs the SGD training loop and returns the recorded error history.
StochasticGradientDescent(Loss *=nullptr)
Constructs SGD optionally bound to a Loss instance.
void from_JSON(const JsonDocument &) override
Restores hyperparameters from a JSON document.
DataSlot
Slot index into the optimizer scratch buffer (momentum velocity).
Definition stochastic_gradient_descent.h:25
Velocity
Definition stochastic_gradient_descent.h:25
void set_momentum(const float)
Sets the momentum coefficient (0 disables momentum).
Definition adaptive_moment_estimation.h:14
Workspace holding parameter gradients and per-layer deltas during a backward pass.
Definition back_propagation.h:21
Per-optimizer scratch state (moments, directions, iteration counter) backing the update step.
Definition optimizer.h:182
History and final metrics produced by a training run.
Definition optimizer.h:204
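
The entries above describe the hyperparameters (initial learning rate, decay, momentum, Nesterov) and a velocity slot, but not the arithmetic of a single update. The following is a minimal sketch of one common formulation, not the library's implementation. SgdConfig, decayed_learning_rate, and sgd_step are hypothetical names introduced for illustration. The inverse-time decay schedule and the Nesterov variant shown are assumptions, and the library may compute them differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the hyperparameters listed above; not the library's type.
struct SgdConfig {
    float initial_learning_rate = 0.01f;
    float initial_decay = 0.0f;
    float momentum = 0.0f;   // 0 disables momentum
    bool nesterov = false;
};

// Assumed inverse-time decay schedule: eta = eta_0 / (1 + decay * epoch).
inline float decayed_learning_rate(const SgdConfig& cfg, std::size_t epoch) {
    return cfg.initial_learning_rate /
           (1.0f + cfg.initial_decay * static_cast<float>(epoch));
}

// One in-place SGD step over flat parameter, gradient, and velocity buffers.
// All three vectors are assumed to have the same length.
inline void sgd_step(std::vector<float>& params,
                     const std::vector<float>& grad,
                     std::vector<float>& velocity,
                     const SgdConfig& cfg,
                     float learning_rate) {
    for (std::size_t i = 0; i < params.size(); ++i) {
        if (cfg.momentum <= 0.0f) {
            // Plain gradient descent: step against the gradient.
            params[i] -= learning_rate * grad[i];
            continue;
        }
        // Velocity accumulates a decaying sum of past scaled gradients.
        velocity[i] = cfg.momentum * velocity[i] - learning_rate * grad[i];
        // Nesterov (assumed variant) looks ahead along the updated velocity;
        // classical momentum applies the velocity directly.
        params[i] += cfg.nesterov
            ? cfg.momentum * velocity[i] - learning_rate * grad[i]
            : velocity[i];
    }
}
```

In this sketch the velocity buffer plays the role that the Velocity slot is described as playing above: it is the only state carried between steps, and it stays untouched when momentum is 0.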