OpenNN
Open-source neural networks library
BFGS quasi-Newton optimizer with line search.
#include <quasi_newton_method.h>
Public Types | |
| enum | DataSlot { OldParameters , ParameterDifferences , ParameterUpdates , OldGradient , GradientDifference , OldInverseHessianDotGradientDifference , BFGS , InverseHessian , OldInverseHessian } |
| Slot indices into OptimizerData::views used by BFGS. | |
Public Types inherited from opennn::Optimizer | |
| enum class | StoppingCondition { None , MinimumLossDecrease , LossGoal , MaximumSelectionErrorIncreases , MaximumEpochsNumber , MaximumTime } |
| Reasons that can terminate a training run. | |
Public Member Functions | |
| QuasiNewtonMethod (Loss *loss=nullptr) | |
| Constructs the optimizer. | |
| void | set_default () |
| Resets all hyperparameters to their default values. | |
| void | set_minimum_loss_decrease (const float new_minimum_loss_decrease) |
| Sets the minimum acceptable loss decrease between iterations. | |
| void | update_parameters (const Batch &batch, ForwardPropagation &forward_propagation, BackPropagation &back_propagation, OptimizerData &data) |
| Applies one BFGS parameter update. | |
| TrainingResults | train () override |
| Runs BFGS to completion. | |
| void | from_JSON (const JsonDocument &) override |
| Loads optimizer hyperparameters from a parsed JSON document. | |
| void | to_JSON (JsonWriter &) const override |
| Writes optimizer hyperparameters to a streaming JSON writer. | |
Public Member Functions inherited from opennn::Optimizer | |
| Optimizer (Loss *loss=nullptr) | |
| Constructs an optimizer bound to a loss function. | |
| virtual | ~Optimizer ()=default |
| Virtual destructor. | |
| const Loss * | get_loss () const |
| Read-only access to the loss being optimized. | |
| bool | get_display () const |
| Whether progress should be printed to stdout during training. | |
| void | set (Loss *new_loss) |
| Re-initializes the optimizer by setting its loss pointer. | |
| virtual void | set_loss (Loss *new_loss) |
| Updates the loss pointer; subclasses may override to refresh cached state derived from the loss. | |
| virtual void | set_display (bool new_display) |
| Toggles per-epoch progress printing. | |
| void | set_display_period (const Index new_display_period) |
| Sets how often progress is printed. | |
| void | set_maximum_epochs (const Index new_maximum_epochs) |
| Sets the maximum number of epochs. | |
| void | set_maximum_time (const float new_maximum_time) |
| Sets the maximum wall-clock training time. | |
| void | set_loss_goal (const float new_loss_goal) |
| Sets the training-loss goal. | |
| void | set_maximum_validation_failures (const Index new_maximum_validation_failures) |
| Sets the maximum number of consecutive validation-error increases tolerated. | |
| const string & | get_name () const |
| Canonical name of the optimizer (set by subclasses). | |
| virtual void | print () const |
| Prints a human-readable summary of the optimizer to stdout. | |
| void | save (const filesystem::path &) const |
| Saves the optimizer state to a file. | |
| void | load (const filesystem::path &) |
| Loads the optimizer state from a file. | |
Additional Inherited Members | |
Static Public Member Functions inherited from opennn::Optimizer | |
| static float | get_elapsed_time (const time_t &beginning_time) |
| Computes the elapsed wall-clock time since a reference instant. | |
Protected Member Functions inherited from opennn::Optimizer | |
| void | set_names () |
| Subclass hook to refresh layer name caches after a loss change. | |
| void | set_scaling () |
| Subclass hook to install the dataset-derived input scalers. | |
| void | set_unscaling () |
| Subclass hook to install the dataset-derived output unscalers. | |
| bool | check_stopping_condition (TrainingResults &results, Index epoch, float elapsed_time, float training_error, Index validation_failures) const |
| Evaluates every stopping criterion and updates the result accordingly. | |
| void | write_common_xml (JsonWriter &) const |
| Writes the common Optimizer fields to JSON. | |
| void | read_common_xml (const Json *) |
| Reads the common Optimizer fields from JSON. | |
| void | setup_device_training () |
| Allocates the CUDA stream and events used for batch prefetching. | |
| void | teardown_device_training () |
| Releases the CUDA stream and events allocated by setup_device_training(). | |
| void | prefetch_batch (Batch &batch, Index sample_count, int slot) |
| Asynchronously prefetches the next training batch into a slot. | |
| void | wait_prefetch (int slot) |
| Waits for the prefetch into a given slot to finish. | |
| void | sync_device () |
| Synchronizes the device on the optimizer's CUDA stream. | |
| bool | should_display (Index epoch) const |
| Whether the current epoch should print progress. | |
| EpochStats | train_epoch (bool is_classification, ForwardPropagation &forward_propagation, BackPropagation &back_propagation, ThreadSafeQueue< Batch * > &empty_queue, ThreadSafeQueue< Batch * > &ready_queue, const vector< vector< Index > > &batches, const vector< Index > &input_feature_indices, const vector< Index > &decoder_feature_indices, const vector< Index > &target_feature_indices, const std::function< void(BackPropagation &)> &update) |
| Runs a single training epoch over all batches. | |
| EpochStats | evaluate_epoch (bool is_classification, ForwardPropagation &forward_propagation, ThreadSafeQueue< Batch * > &empty_queue, ThreadSafeQueue< Batch * > &ready_queue, const vector< vector< Index > > &batches, const vector< Index > &input_feature_indices, const vector< Index > &decoder_feature_indices, const vector< Index > &target_feature_indices) |
| Runs a single evaluation pass over all batches without updating parameters. | |
Static Protected Member Functions inherited from opennn::Optimizer | |
| static void | clip_gradient_norm (Buffer &gradient, float max_norm) |
| In-place gradient norm clipping. | |
Protected Attributes inherited from opennn::Optimizer | |
| Loss * | loss = nullptr |
| Loss being optimized; not owned. | |
| float | training_loss_goal = 0.0f |
| Training stops when the training loss reaches this value. | |
| Index | maximum_validation_failures = numeric_limits<Index>::max() |
| Maximum number of consecutive validation-error increases tolerated. | |
| Index | maximum_epochs = 10000 |
| Maximum number of training epochs. | |
| float | maximum_time = 360000.0f |
| Maximum wall-clock training time in seconds. | |
| Index | display_period = 10 |
| Number of epochs between progress prints. | |
| bool | display = true |
| Whether progress should be printed to stdout during training. | |
| string | name |
| Canonical name of the optimizer (set by subclasses). | |
| cudaStream_t | memory_stream = nullptr |
| CUDA stream used to prefetch batches into device memory. | |
| cudaEvent_t | batch_ready_event [2] = {nullptr, nullptr} |
| CUDA events signaling when each prefetched batch is ready. | |
BFGS quasi-Newton optimizer with line search.
Maintains an approximate inverse Hessian H_k, updated at each iteration with the BFGS formula from the parameter and gradient differences between consecutive iterations. The search direction is d_k = -H_k * grad_k, and a line search selects the step length along it.
Best suited to small and medium networks, where evaluating the full gradient over the dataset is cheap. Not suited to very large models: the dense inverse Hessian approximation needs memory that grows quadratically with the number of parameters.
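The update described above can be illustrated with a minimal, self-contained sketch. It is not OpenNN's implementation: the `Vector` and `Matrix` aliases and the helpers `dot`, `multiply`, `search_direction` and `bfgs_update` are hypothetical names introduced for illustration, and the curvature safeguard (skipping the update when s·y is not positive) is a common convention assumed here rather than something the documentation states.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper types for this sketch (not OpenNN types).
using Vector = std::vector<double>;
using Matrix = std::vector<Vector>; // dense, square, row-major

double dot(const Vector& a, const Vector& b)
{
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
    return sum;
}

Vector multiply(const Matrix& m, const Vector& v)
{
    Vector result(m.size(), 0.0);
    for (std::size_t i = 0; i < m.size(); ++i) result[i] = dot(m[i], v);
    return result;
}

// Search direction d = -H * grad.
Vector search_direction(const Matrix& inverse_hessian, const Vector& gradient)
{
    Vector d = multiply(inverse_hessian, gradient);
    for (double& x : d) x = -x;
    return d;
}

// In-place BFGS update of a symmetric inverse Hessian approximation H:
//   H += (1 + y'Hy / s'y) * (s s') / s'y  -  (H y s' + s y' H) / s'y
// s = parameter difference, y = gradient difference.
// Returns false and leaves H unchanged when s'y is not safely positive
// (assumed safeguard; keeps H positive definite).
bool bfgs_update(Matrix& h, const Vector& s, const Vector& y)
{
    const double sy = dot(s, y);
    if (sy <= 1e-12) return false;

    const Vector hy = multiply(h, y); // H*y; since H is symmetric, y'H = (H*y)'
    const double scale = (1.0 + dot(y, hy) / sy) / sy;

    for (std::size_t i = 0; i < h.size(); ++i)
        for (std::size_t j = 0; j < h.size(); ++j)
            h[i][j] += scale * s[i] * s[j] - (hy[i] * s[j] + s[i] * hy[j]) / sy;

    return true;
}
```

After a successful update the new matrix satisfies the secant condition H * y = s, which is a quick way to check an implementation of this formula.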
| enum opennn::QuasiNewtonMethod::DataSlot |
Slot indices into OptimizerData::views used by BFGS.
| opennn::QuasiNewtonMethod::QuasiNewtonMethod | ( | Loss * | loss = nullptr | ) |
Constructs the optimizer.
| loss | Loss to optimize; may be nullptr if set later. |
| void opennn::QuasiNewtonMethod::from_JSON | ( | const JsonDocument & | ) | override virtual |
Loads optimizer hyperparameters from a parsed JSON document.
Reimplemented from opennn::Optimizer.
| void opennn::QuasiNewtonMethod::set_default | ( | ) |
Resets all hyperparameters to their default values.
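| void opennn::QuasiNewtonMethod::set_minimum_loss_decrease | ( | const float | new_minimum_loss_decrease | ) | inline |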
Sets the minimum acceptable loss decrease between iterations.
| new_minimum_loss_decrease | Loss-decrease threshold; training stops when the decrease between iterations falls below it. |
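As a rough illustration of this stopping criterion, the check can be sketched as a comparison between consecutive loss values. The function name `loss_decrease_below_threshold` is hypothetical, and the strict `<` comparison is an assumption based on the parameter description; the library's own test may differ in detail.

```cpp
#include <cassert>

// Hypothetical helper: true when the loss improved by less than the
// configured minimum between two consecutive iterations.
// The strict comparison is an assumption, not confirmed library behavior.
bool loss_decrease_below_threshold(float old_loss,
                                   float new_loss,
                                   float minimum_loss_decrease)
{
    assert(minimum_loss_decrease >= 0.0f);
    const float loss_decrease = old_loss - new_loss;
    return loss_decrease < minimum_loss_decrease;
}
```

With a threshold of 1e-3, a drop from 1.0 to 0.5 would let training continue, while a drop from 1.0 to 0.9999 would trigger the stop.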
| void opennn::QuasiNewtonMethod::to_JSON | ( | JsonWriter & | ) | const override virtual |
Writes optimizer hyperparameters to a streaming JSON writer.
Reimplemented from opennn::Optimizer.
| TrainingResults opennn::QuasiNewtonMethod::train | ( | ) | override virtual |
Runs BFGS to completion.
Implements opennn::Optimizer.
| void opennn::QuasiNewtonMethod::update_parameters | ( | const Batch & | batch, |
| ForwardPropagation & | forward_propagation, | ||
| BackPropagation & | back_propagation, | ||
| OptimizerData & | data ) |
Applies one BFGS parameter update.
| batch | Current training batch. |
| forward_propagation | Forward intermediates for the batch. |
| back_propagation | Gradient buffer for the batch. |
| data | Mutable optimizer state (inverse Hessian, history). |
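The step-length selection within a single update can be sketched in isolation. The documentation does not say which line search the class uses, so the example below swaps in a simple backtracking (Armijo) search as a stand-in. The function `backtracking_step`, its parameters and their default values are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Vector = std::vector<double>; // hypothetical alias for this sketch

// Backtracking (Armijo) line search, used here as a stand-in for the
// optimizer's line search. Starting from initial_step, the step is shrunk
// until loss(p + step * d) <= loss(p) + c1 * step * (grad . d).
// Returns 0.0 if no acceptable step is found within max_trials.
double backtracking_step(const std::function<double(const Vector&)>& loss,
                         const Vector& parameters,
                         const Vector& gradient,
                         const Vector& direction,
                         double initial_step = 1.0,
                         double shrink = 0.5,
                         double c1 = 1e-4,
                         int max_trials = 50)
{
    assert(parameters.size() == gradient.size());
    assert(parameters.size() == direction.size());

    // Directional derivative along d; negative for a descent direction.
    double slope = 0.0;
    for (std::size_t i = 0; i < gradient.size(); ++i)
        slope += gradient[i] * direction[i];

    const double initial_loss = loss(parameters);
    double step = initial_step;
    Vector trial(parameters.size());

    for (int t = 0; t < max_trials; ++t)
    {
        for (std::size_t i = 0; i < trial.size(); ++i)
            trial[i] = parameters[i] + step * direction[i];

        if (loss(trial) <= initial_loss + c1 * step * slope)
            return step; // sufficient decrease reached

        step *= shrink;
    }

    return 0.0;
}
```

Once a step is accepted, the parameters move to `parameters + step * direction`, and the resulting parameter and gradient differences feed the next inverse Hessian update.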