<p>Hyperparameter tuning is an important, but often difficult and computationally intensive task. Changing the architecture of a neural network or the learning rate of an optimizer can have a significant impact on the performance.</p>
345
345
<p>The goal of hyperparameter tuning is to optimize the hyperparameters in a way that improves the performance of the machine learning or deep learning model. The simplest, but also most computationally expensive, approach uses manual search (or trial-and-error <span class="citation" data-cites="Meignan:2015vp">(<a href="references.html#ref-Meignan:2015vp" role="doc-biblioref">Meignan et al. 2015</a>)</span>). Commonly encountered are simple random search, i.e., random and repeated selection of hyperparameters for evaluation, and lattice search (“grid search”). In addition, directed search methods and other model-free algorithms, i.e., algorithms that do not explicitly rely on a model, e.g., evolution strategies <span class="citation" data-cites="Bart13j">(<a href="references.html#ref-Bart13j" role="doc-biblioref">Bartz-Beielstein et al. 2014</a>)</span> or pattern search <span class="citation" data-cites="Torczon00">(<a href="references.html#ref-Torczon00" role="doc-biblioref">Lewis, Torczon, and Trosset 2000</a>)</span>, play an important role. Also, “hyperband”, i.e., a multi-armed bandit strategy that dynamically allocates resources to a set of random configurations and uses successive halving to stop configurations with poor performance <span class="citation" data-cites="Li16a">(<a href="references.html#ref-Li16a" role="doc-biblioref">Li et al. 2016</a>)</span>, is very common in hyperparameter tuning. The most sophisticated and efficient approaches are Bayesian optimization and surrogate-model-based optimization methods, which are based on the optimization of cost functions determined by simulations or experiments.</p>
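<p>To make the difference between lattice (“grid”) search and random search concrete, the following is a minimal sketch in plain Python. It does not use <code>spotPython</code>; the function <code>toy_loss</code>, the parameter names <code>lr</code> and <code>hidden</code>, and all value ranges are assumptions chosen purely for illustration and stand in for an expensive training-and-validation run.</p>

```python
import itertools
import random


def toy_loss(lr, hidden):
    """Hypothetical stand-in for a validation loss (minimum at lr=0.01, hidden=64)."""
    return (lr - 0.01) ** 2 * 1e4 + ((hidden - 64) / 64) ** 2


def grid_search(lrs, hiddens):
    """Evaluate every point of the lattice; return (best_loss, (lr, hidden))."""
    return min(
        (toy_loss(lr, h), (lr, h)) for lr, h in itertools.product(lrs, hiddens)
    )


def random_search(n_trials, seed=0):
    """Evaluate n_trials randomly drawn configurations; return the best one."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(-4, -1)  # learning rate drawn log-uniformly
        hidden = rng.randint(8, 256)    # number of hidden units, inclusive bounds
        candidate = (toy_loss(lr, hidden), (lr, hidden))
        if best is None or candidate < best:
            best = candidate
    return best


# Both strategies get the same budget of nine evaluations.
grid_best = grid_search([0.001, 0.01, 0.1], [16, 64, 256])
random_best = random_search(n_trials=9)
print("grid search:  ", grid_best)
print("random search:", random_best)
```

<p>In this sketch the grid happens to contain the optimum, which is rarely the case in practice. Random search covers each individual hyperparameter with more distinct values for the same budget, which is one reason it is a common baseline.</p>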
346
346
<p>Below, we consider a surrogate-model-based hyperparameter tuning approach that uses the Python version of SPOT (“Sequential Parameter Optimization Toolbox”) <span class="citation" data-cites="BLP05">(<a href="references.html#ref-BLP05" role="doc-biblioref">Bartz-Beielstein, Lasarczyk, and Preuss 2005</a>)</span>, which is suitable for situations where only limited resources are available. This may be due to limited availability and cost of hardware, or due to the fact that confidential data may only be processed locally, e.g., due to legal requirements. Furthermore, in our approach, the understanding of algorithms is seen as a key tool for enabling transparency and explainability. This can be enabled, for example, by quantifying the contribution of machine learning and deep learning components (nodes, layers, split decisions, activation functions, etc.). Understanding the importance of hyperparameters and the interactions between multiple hyperparameters plays a major role in the interpretability and explainability of machine learning models. SPOT provides statistical tools for understanding hyperparameters and their interactions. Finally, the SPOT software code is available in the open source <code>spotPython</code> package on GitHub<a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a>, allowing replicability of the results. This tutorial describes the Python variant of SPOT, which is called <code>spotPython</code>. The R implementation is described in <span class="citation" data-cites="bart21i">Bartz et al. (<a href="references.html#ref-bart21i" role="doc-biblioref">2022</a>)</span>. 
SPOT is an established open source software that has been maintained for more than 15 years <span class="citation" data-cites="BLP05">(<a href="references.html#ref-BLP05" role="doc-biblioref">Bartz-Beielstein, Lasarczyk, and Preuss 2005</a>)</span> <span class="citation" data-cites="bart21i">(<a href="references.html#ref-bart21i" role="doc-biblioref">Bartz et al. 2022</a>)</span>.</p>
347
-
<p>This tutorial is structured as follows. The concept of the hyperparameter tuning software <code>spotPython</code> is described in <a href="#sec-spot"><span>Section 1.1</span></a>. <a href="14_spot_ray_hpt_torch_cifar10.html"><span>Chapter 12</span></a> describes the execution of the example from the tutorial “Hyperparameter Tuning with Ray Tune” <span class="citation" data-cites="pyto23a">(<a href="references.html#ref-pyto23a" role="doc-biblioref">PyTorch 2023</a>)</span>. The integration of <code>spotPython</code> into the <code>PyTorch</code> training workflow is described in detail in the following sections. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-setup-14"><span>Section 12.1</span></a> describes the setup of the tuners. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-data-loading-14"><span>Section 12.3</span></a> describes the data loading. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-selection-of-the-algorithm-14"><span>Section 12.5</span></a> describes the model to be tuned. The search space is introduced in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-search-space-14"><span>Section 12.5.3</span></a>. Optimizers are presented in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-optimizers-14"><span>Section 12.6.1</span></a>. How to split the data in train, validation, and test sets is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-data-splitting-14"><span>Section 12.7.1</span></a>. The selection of the loss function and metrics is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-loss-functions-14"><span>Section 12.7.5</span></a>. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-prepare-spot-call-14"><span>Section 12.8.1</span></a> describes the preparation of the <code>spotPython</code> call. The objective function is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-the-objective-function-14"><span>Section 12.8.2</span></a>. 
How to use results from previous runs and default hyperparameter configurations is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-default-hyperparameters"><span>Section 12.8.3</span></a>. Starting the tuner is shown in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-call-the-hyperparameter-tuner-14"><span>Section 12.8.4</span></a>. TensorBoard can be used to visualize the results as shown in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-tensorboard-14"><span>Section 12.9</span></a>. Results are discussed and explained in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-results-14"><span>Section 12.10</span></a>.</p>
348
-
<p><span class="quarto-unresolved-ref">?sec-hyperparameter-tuning-lightning-30</span> shows the integration of <code>spotPython</code> into the <code>PyTorch Lightning</code> training workflow.</p>
347
+
<p>This document is structured as follows. The concept of the hyperparameter tuning software <code>spotPython</code> is described in <a href="#sec-spot"><span>Section 1.1</span></a>. <a href="14_spot_ray_hpt_torch_cifar10.html"><span>Chapter 12</span></a> describes the execution of the example from the tutorial “Hyperparameter Tuning with Ray Tune” <span class="citation" data-cites="pyto23a">(<a href="references.html#ref-pyto23a" role="doc-biblioref">PyTorch 2023</a>)</span>. The integration of <code>spotPython</code> into the <code>PyTorch</code> training workflow is described in detail in the following sections. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-setup-14"><span>Section 12.1</span></a> describes the setup of the tuners. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-data-loading-14"><span>Section 12.3</span></a> describes the data loading. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-selection-of-the-algorithm-14"><span>Section 12.5</span></a> describes the model to be tuned. The search space is introduced in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-search-space-14"><span>Section 12.5.3</span></a>. Optimizers are presented in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-optimizers-14"><span>Section 12.6.1</span></a>. How to split the data in train, validation, and test sets is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-data-splitting-14"><span>Section 12.7.1</span></a>. The selection of the loss function and metrics is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-loss-functions-14"><span>Section 12.7.5</span></a>. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-prepare-spot-call-14"><span>Section 12.8.1</span></a> describes the preparation of the <code>spotPython</code> call. The objective function is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-the-objective-function-14"><span>Section 12.8.2</span></a>. 
How to use results from previous runs and default hyperparameter configurations is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-default-hyperparameters"><span>Section 12.8.3</span></a>. Starting the tuner is shown in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-call-the-hyperparameter-tuner-14"><span>Section 12.8.4</span></a>. TensorBoard can be used to visualize the results as shown in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-tensorboard-14"><span>Section 12.9</span></a>. Results are discussed and explained in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-results-14"><span>Section 12.10</span></a>.</p>
348
+
<p><a href="31_spot_lightning_csv.html"><span>Chapter 17</span></a> shows the integration of <code>spotPython</code> into the <code>PyTorch Lightning</code> training workflow.</p>
349
349
<p><a href="14_spot_ray_hpt_torch_cifar10.html#sec-summary"><span>Section 12.11</span></a> presents a summary and an outlook.</p>