|
304 | 304 | <a href="./25_spot_torch_vbdp.html" class="sidebar-item-text sidebar-link"> |
305 | 305 | <span class="menu-text"><span class="chapter-number">20</span> <span class="chapter-title">HPT: PyTorch With VBDP</span></span></a> |
306 | 306 | </div> |
| 307 | +</li> |
| 308 | + <li class="sidebar-item"> |
| 309 | + <div class="sidebar-item-container"> |
| 310 | + <a href="./30_spot_lightning_csv.html" class="sidebar-item-text sidebar-link"> |
| 311 | + <span class="menu-text"><span class="chapter-number">21</span> <span class="chapter-title">HPT PyTorch Lightning: VBDP</span></span></a> |
| 312 | + </div> |
307 | 313 | </li> |
308 | 314 | <li class="sidebar-item"> |
309 | 315 | <div class="sidebar-item-container"> |
310 | 316 | <a href="./99_spot_doc.html" class="sidebar-item-text sidebar-link"> |
311 | | - <span class="menu-text"><span class="chapter-number">21</span> <span class="chapter-title">Documentation of the Sequential Parameter Optimization</span></span></a> |
| 317 | + <span class="menu-text"><span class="chapter-number">22</span> <span class="chapter-title">Documentation of the Sequential Parameter Optimization</span></span></a> |
312 | 318 | </div> |
313 | 319 | </li> |
314 | 320 | <li class="sidebar-item"> |
@@ -362,7 +368,9 @@ <h1 class="title"><span id="sec-hyperparameter-tuning" class="quarto-section-ide |
362 | 368 | <p>Hyperparameter tuning is an important, but often difficult and computationally intensive task. Changing the architecture of a neural network or the learning rate of an optimizer can have a significant impact on the performance.</p> |
363 | 369 | <p>The goal of hyperparameter tuning is to optimize the hyperparameters in a way that improves the performance of the machine learning or deep learning model. The simplest, but also most computationally expensive, approach uses manual search (or trial-and-error <span class="citation" data-cites="Meignan:2015vp">(<a href="references.html#ref-Meignan:2015vp" role="doc-biblioref">Meignan et al. 2015</a>)</span>). Commonly encountered is simple random search, i.e., random and repeated selection of hyperparameters for evaluation, and lattice search (“grid search”). In addition, methods that perform directed search and other model-free algorithms, i.e., algorithms that do not explicitly rely on a model, e.g., evolution strategies <span class="citation" data-cites="Bart13j">(<a href="references.html#ref-Bart13j" role="doc-biblioref">Bartz-Beielstein et al. 2014</a>)</span> or pattern search <span class="citation" data-cites="Torczon00">(<a href="references.html#ref-Torczon00" role="doc-biblioref">Lewis, Torczon, and Trosset 2000</a>)</span> play an important role. Also, “hyperband”, i.e., a multi-armed bandit strategy that dynamically allocates resources to a set of random configurations and uses successive bisections to stop configurations with poor performance <span class="citation" data-cites="Li16a">(<a href="references.html#ref-Li16a" role="doc-biblioref">Li et al. 2016</a>)</span>, is very common in hyperparameter tuning. The most sophisticated and efficient approaches are the Bayesian optimization and surrogate model based optimization methods, which are based on the optimization of cost functions determined by simulations or experiments.</p> |
364 | 370 | <p>We consider below a surrogate model based optimization-based hyperparameter tuning approach based on the Python version of the SPOT (“Sequential Parameter Optimization Toolbox”) <span class="citation" data-cites="BLP05">(<a href="references.html#ref-BLP05" role="doc-biblioref">Bartz-Beielstein, Lasarczyk, and Preuss 2005</a>)</span>, which is suitable for situations where only limited resources are available. This may be due to limited availability and cost of hardware, or due to the fact that confidential data may only be processed locally, e.g., due to legal requirements. Furthermore, in our approach, the understanding of algorithms is seen as a key tool for enabling transparency and explainability. This can be enabled, for example, by quantifying the contribution of machine learning and deep learning components (nodes, layers, split decisions, activation functions, etc.). Understanding the importance of hyperparameters and the interactions between multiple hyperparameters plays a major role in the interpretability and explainability of machine learning models. SPOT provides statistical tools for understanding hyperparameters and their interactions. Last but not least, it should be noted that the SPOT software code is available in the open source <code>spotPython</code> package on GitHub<a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a>, allowing replicability of the results. This tutorial describes the Python variant of SPOT, which is called <code>spotPython</code>. The R implementation is described in <span class="citation" data-cites="bart21i">Bartz et al. (<a href="references.html#ref-bart21i" role="doc-biblioref">2022</a>)</span>.
SPOT is an established open source software that has been maintained for more than 15 years <span class="citation" data-cites="BLP05">(<a href="references.html#ref-BLP05" role="doc-biblioref">Bartz-Beielstein, Lasarczyk, and Preuss 2005</a>)</span> <span class="citation" data-cites="bart21i">(<a href="references.html#ref-bart21i" role="doc-biblioref">Bartz et al. 2022</a>)</span>.</p> |
365 | | -<p>This tutorial is structured as follows. The concept of the hyperparameter tuning software <code>spotPython</code> is described in <a href="#sec-spot"><span>Section 1.1</span></a>. <a href="14_spot_ray_hpt_torch_cifar10.html"><span>Chapter 14</span></a> describes the execution of the example from the tutorial “Hyperparameter Tuning with Ray Tune” <span class="citation" data-cites="pyto23a">(<a href="references.html#ref-pyto23a" role="doc-biblioref">PyTorch 2023</a>)</span>. The integration of <code>spotPython</code> into the <code>PyTorch</code> training workflow is described in detail in the following sections. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-setup-14"><span>Section 14.1</span></a> describes the setup of the tuners. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-data-loading-14"><span>Section 14.3</span></a> describes the data loading. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-selection-of-the-algorithm-14"><span>Section 14.5</span></a> describes the model to be tuned. The search space is introduced in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-search-space-14"><span>Section 14.5.3</span></a>. Optimizers are presented in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-optimizers-14"><span>Section 14.6.1</span></a>. How to split the data in train, validation, and test sets is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-data-splitting-14"><span>Section 14.7.1</span></a>. The selection of the loss function and metrics is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-loss-functions-14"><span>Section 14.7.5</span></a>. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-prepare-spot-call-14"><span>Section 14.8.1</span></a> describes the preparation of the <code>spotPython</code> call. The objective function is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-the-objective-function-14"><span>Section 14.8.2</span></a>. 
How to use results from previous runs and default hyperparameter configurations is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-default-hyperparameters"><span>Section 14.8.3</span></a>. Starting the tuner is shown in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-call-the-hyperparameter-tuner-14"><span>Section 14.8.4</span></a>. TensorBoard can be used to visualize the results as shown in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-tensorboard-14"><span>Section 14.9</span></a>. Results are discussed and explained in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-results-14"><span>Section 14.10</span></a>. Finally, <a href="14_spot_ray_hpt_torch_cifar10.html#sec-summary"><span>Section 14.11</span></a> presents a summary and an outlook.</p> |
| 371 | +<p>This tutorial is structured as follows. The concept of the hyperparameter tuning software <code>spotPython</code> is described in <a href="#sec-spot"><span>Section 1.1</span></a>. <a href="14_spot_ray_hpt_torch_cifar10.html"><span>Chapter 14</span></a> describes the execution of the example from the tutorial “Hyperparameter Tuning with Ray Tune” <span class="citation" data-cites="pyto23a">(<a href="references.html#ref-pyto23a" role="doc-biblioref">PyTorch 2023</a>)</span>. The integration of <code>spotPython</code> into the <code>PyTorch</code> training workflow is described in detail in the following sections. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-setup-14"><span>Section 14.1</span></a> describes the setup of the tuners. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-data-loading-14"><span>Section 14.3</span></a> describes the data loading. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-selection-of-the-algorithm-14"><span>Section 14.5</span></a> describes the model to be tuned. The search space is introduced in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-search-space-14"><span>Section 14.5.3</span></a>. Optimizers are presented in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-optimizers-14"><span>Section 14.6.1</span></a>. How to split the data in train, validation, and test sets is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-data-splitting-14"><span>Section 14.7.1</span></a>. The selection of the loss function and metrics is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-loss-functions-14"><span>Section 14.7.5</span></a>. <a href="14_spot_ray_hpt_torch_cifar10.html#sec-prepare-spot-call-14"><span>Section 14.8.1</span></a> describes the preparation of the <code>spotPython</code> call. The objective function is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-the-objective-function-14"><span>Section 14.8.2</span></a>. 
How to use results from previous runs and default hyperparameter configurations is described in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-default-hyperparameters"><span>Section 14.8.3</span></a>. Starting the tuner is shown in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-call-the-hyperparameter-tuner-14"><span>Section 14.8.4</span></a>. TensorBoard can be used to visualize the results as shown in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-tensorboard-14"><span>Section 14.9</span></a>. Results are discussed and explained in <a href="14_spot_ray_hpt_torch_cifar10.html#sec-results-14"><span>Section 14.10</span></a>.</p> |
| 372 | +<p><a href="30_spot_lightning_csv.html"><span>Chapter 21</span></a> shows the integration of <code>spotPython</code> into the <code>PyTorch Lightning</code> training workflow.</p> |
| 373 | +<p><a href="14_spot_ray_hpt_torch_cifar10.html#sec-summary"><span>Section 14.11</span></a> presents a summary and an outlook.</p> |
366 | 374 | <div class="callout callout-style-default callout-note callout-titled"> |
367 | 375 | <div class="callout-header d-flex align-content-center"> |
368 | 376 | <div class="callout-icon-container"> |
@@ -446,7 +454,7 @@ <h3 data-number="1.3.1" class="anchored" data-anchor-id="the-objective-function- |
446 | 454 | <div class="cell" data-execution_count="5"> |
447 | 455 | <div class="sourceCode cell-code" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>spot_0.run()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> |
448 | 456 | <div class="cell-output cell-output-display" data-execution_count="5"> |
449 | | -<pre><code><spotPython.spot.spot.Spot at 0x106e5c370></code></pre> |
| 457 | +<pre><code><spotPython.spot.spot.Spot at 0x12cb58490></code></pre> |
450 | 458 | </div> |
451 | 459 | </div> |
452 | 460 | <div class="cell" data-execution_count="6"> |
@@ -498,7 +506,7 @@ <h2 data-number="1.4" class="anchored" data-anchor-id="spot-parameters-fun_evals |
498 | 506 | <p><img src="01_spot_intro_files/figure-html/cell-10-output-2.png" width="600" height="449"></p> |
499 | 507 | </div> |
500 | 508 | <div class="cell-output cell-output-display" data-execution_count="9"> |
501 | | -<pre><code><spotPython.spot.spot.Spot at 0x133855360></code></pre> |
| 509 | +<pre><code><spotPython.spot.spot.Spot at 0x12cc0c730></code></pre> |
502 | 510 | </div> |
503 | 511 | </div> |
504 | 512 | </section> |
|