1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ We aim to provide a dynamic resource where users can find the latest optimizatio
- [Redis](software/similarity-search/redis/README.md)
- [Spark](software/spark/README.md)
- [scikit-learn](software/scikit-learn/README.md)
- [scikit-learn-intelex](software/scikit-learn-intelex/README.md)
- [MySQL & PostgreSQL](software/mysql-postgresql/README.md)
- [Envoy](software/envoy/README.md)
- [Kafka](software/kafka/README.md)
48 changes: 48 additions & 0 deletions software/common/README.md
@@ -0,0 +1,48 @@
The following system configuration recommendations help workloads achieve maximum performance.

## Energy Performance Bias (EPB)

Energy Performance Bias (EPB) is an Intel Xeon hardware setting that controls the trade-off between power consumption and processing performance. For the best performance, it is recommended to set it to `0` (Performance mode).

### On Windows

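Run the following commands in `cmd`. The second command re-applies the current power scheme so that the new value takes effect:

```
powercfg -setacvalueindex scheme_current sub_processor PERFEPP 0
powercfg -setactive scheme_current
```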

[More info about `powercfg`](https://learn.microsoft.com/en-us/windows-hardware/customize/power-settings/options-for-perf-state-engine-perfenergypreference).

### On Linux

To check the current value of EPB, run:
```
sudo cpupower info
```

To set EPB to Performance mode:
```
sudo cpupower set -b 0
```
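
As a complement to `cpupower info`, the per-CPU value can also be read from sysfs. The following is a minimal sketch, assuming the kernel exposes the `power/energy_perf_bias` file under `/sys/devices/system/cpu/cpuN/` (it may be absent on some kernels or CPUs). The function name `read_epb` is hypothetical and used only for illustration.

```python
from pathlib import Path


def read_epb(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_index: epb_value} for every CPU that exposes energy_perf_bias.

    Assumption: the kernel publishes the value as a decimal integer (0-15),
    where 0 corresponds to Performance mode. An empty dict is returned when
    the file is not available.
    """
    values = {}
    for path in Path(cpu_root).glob("cpu[0-9]*/power/energy_perf_bias"):
        cpu_index = int(path.parts[-3][3:])  # directory name "cpu12" -> 12
        values[cpu_index] = int(path.read_text().strip())
    return values


def cpus_not_in_performance_mode(epb_values):
    """List the CPUs whose EPB value differs from 0 (Performance mode)."""
    return sorted(cpu for cpu, value in epb_values.items() if value != 0)
```

Calling `cpus_not_in_performance_mode(read_epb())` after `sudo cpupower set -b 0` should return an empty list if the setting was applied to every CPU.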

## CPU Frequency Scaling

CPU Frequency Scaling is a technique that dynamically adjusts the processor clock speed based on workload demands. It lowers CPU core frequencies during idle periods to reduce power consumption. For better performance, it is recommended to select a power mode or scaling governor that keeps the cores at a higher frequency.

### On Windows 11

Select **Start** > **Settings** > **System** > **Power & battery**.

Under [**Power mode**](https://support.microsoft.com/en-us/windows/change-the-power-mode-for-your-windows-pc-c2aff038-22c9-f46d-5ca0-78696fdf2de8#category=windows_11), choose the **Best performance** option for **Plugged in** or **On battery**.

### On Linux

Set the CPU scaling governor and the energy-performance policy to `performance`:

```
sudo cpupower frequency-set --governor performance
sudo x86_energy_perf_policy -c all performance
```

**Note:** If the maximum CPU frequency cannot be achieved, check the [BIOS limitations](https://wiki.archlinux.org/title/CPU_frequency_scaling#BIOS_frequency_limitation).
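
To confirm that the governor change was applied to every core, the active governor can be read back from the standard cpufreq sysfs files. This is a minimal sketch, assuming the cpufreq driver exposes `cpufreq/scaling_governor` for each CPU. The function names are hypothetical.

```python
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_index: governor_name} for every CPU that exposes cpufreq."""
    governors = {}
    for path in Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_governor"):
        cpu_index = int(path.parts[-3][3:])  # directory name "cpu3" -> 3
        governors[cpu_index] = path.read_text().strip()
    return governors


def cpus_without_performance_governor(governors):
    """List the CPUs whose active governor is not 'performance'."""
    return sorted(cpu for cpu, name in governors.items() if name != "performance")
```

An empty result from `cpus_without_performance_governor(read_governors())` suggests that all cores use the `performance` governor. It does not by itself show that the maximum frequency is reachable.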
126 changes: 126 additions & 0 deletions software/scikit-learn-intelex/README.md
@@ -0,0 +1,126 @@
This chapter describes practices that improve the performance of scikit-learn-intelex on Intel CPUs.

Before reading further, review the [common recommendations](../common/README.md) about the system configuration.

## Hyper-threading (HT)

Hyper-threading (HT) is Intel's simultaneous multithreading implementation that can improve the parallelization of computations. When HT is enabled, for each processor core that is physically present, the operating system addresses two logical cores and shares the workload between them when possible. In this case, the logical cores located on a single physical core share the same resources. For resource-demanding workloads such as those run with scikit-learn-intelex, it is recommended to disable HT in the BIOS settings or to restrict the process, through its affinity settings, to one logical core per physical core.

### On Windows

Hyper-threading can be detected in **Task Manager**. Open it and navigate to the **Performance** > **CPU** tab.

The number of physical cores and logical processors is listed in the bottom right corner of the tab. If the number of logical processors is greater than the number of cores, HT is enabled:

![Task Manager CPU tab showing the number of cores and logical processors](images/cpu-ht.png)

According to this picture, hyper-threading is enabled on two P-cores. Here is an illustration of the locations of the bits corresponding to those P-cores in the affinity mask of the system:

<img src="images/cpu-cores-indices-ht.png" alt="drawing" style="width:600px;"/>

To disable hyper-threading for a process, the affinity mask in binary format should look like:

<img src="images/cpu-affinity-ht.png" alt="drawing" style="width:600px;"/>

This mask is equivalent to `2BFF` in hexadecimal format. Run the following command to disable HT for the workload on Windows:

```
start /affinity 2BFF cmd /c python <workload.py>
```
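
The hexadecimal value passed to `start /affinity` can be derived from the list of allowed logical processors instead of being converted by hand. The following is a minimal sketch. The helper name `affinity_mask` is hypothetical, and the index list is simply the set of bits that are `1` in the `2BFF` mask (0-9, 11, and 13); on another CPU the indices must be taken from its own layout.

```python
def affinity_mask(cpu_indices):
    """Build an affinity bitmask: bit i is set when logical processor i is allowed."""
    mask = 0
    for index in cpu_indices:
        mask |= 1 << index
    return mask


# Bits 10 and 12 are left cleared, matching the mask shown above.
allowed = list(range(0, 10)) + [11, 13]
print(format(affinity_mask(allowed), "X"))  # prints 2BFF
```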

### On Linux

Hyper-threading can be detected by running the `lscpu` utility: `lscpu -e=cpu,core`. Here is example output for the Intel® Core™ Ultra 7 165U:

```
CPU CORE
0 0
1 0
2 1
3 1
4 2
5 3
6 4
7 5
8 6
9 7
10 8
11 9
12 10
13 11
```

The output shows that four logical processors (0, 1, 2, 3) share two physical cores (0, 1), so HT is enabled on those cores. Every other core hosts a single logical processor.
To run the process on one logical processor per physical core, use one of the following commands:

```
numactl -C 0,2,4-13 python <workload.py>
```
or
```
taskset -c 0,2,4-13 python <workload.py>
```

The list of logical processors can also be built programmatically in Bash. The following commands select the first logical processor of each physical core:

```bash
cpus=$(lscpu -e=cpu,core | awk 'NR>1 && !seen[$2]++ {print $1}' | paste -sd,)
numactl -C "$cpus" python <workload.py>
```
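
The same selection can be made from Python, for example at the top of the workload script. This is a minimal sketch, assuming the two-column `lscpu -e=cpu,core` output format shown above. The function names are hypothetical.

```python
import subprocess


def first_cpu_per_core(lscpu_text):
    """Pick the first logical CPU listed for each physical core."""
    seen_cores = set()
    cpus = []
    for line in lscpu_text.strip().splitlines()[1:]:  # skip the header row
        cpu, core = line.split()[:2]
        if core not in seen_cores:
            seen_cores.add(core)
            cpus.append(int(cpu))
    return cpus


def physical_core_cpus():
    """Run lscpu and return one logical CPU per physical core (Linux only)."""
    output = subprocess.run(
        ["lscpu", "-e=cpu,core"], capture_output=True, text=True, check=True
    ).stdout
    return first_cpu_per_core(output)
```

On Linux, `os.sched_setaffinity(0, physical_core_cpus())` would then pin the current process to the selected logical processors, which is an in-process alternative to launching the workload through `numactl` or `taskset`.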

## Low Power Efficient Cores (LPE cores)

Low Power Efficient Cores (LPE cores) are a type of core available on modern Intel Core processors, designed to manage lightweight background processes independently. This allows the main compute tiles to be powered down, saving battery life on mobile devices.

For the best performance, it is recommended to exclude LPE cores from the list of CPU cores on which the workload is running. The affinity settings of the process have to be modified to achieve this.

### On Windows

Check the CPU specification on the Intel [products page](https://www.intel.com/content/www/us/en/products/overview.html). Here is an example for the Intel® Core™ Ultra 7 165U processor:

![alt text](images/cpu-specifications.png)

The location of the LPE cores in the system affinity mask in this case would be:

<img src="images/cpu-lpe-cores-indices.png" alt="drawing" style="width:600px;"/>

The recommended affinity mask that disables both hyper-threading and LPE cores would be `2BFC`:

<img src="images/cpu-affinity-lpe-cores.png" alt="drawing" style="width:600px;"/>

Run the following command to disable HT and LPE cores on Windows:

```
start /affinity 2BFC cmd /c python <workload.py>
```
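
To double-check which logical processors a mask allows, the mask can be decoded back into a list of indices. The following is a minimal sketch with a hypothetical helper name. It compares the HT-only mask `2BFF` with the `2BFC` mask used here.

```python
def cpus_in_mask(mask):
    """List the logical processor indices whose bits are set in an affinity mask."""
    return [index for index in range(mask.bit_length()) if (mask >> index) & 1]


ht_only = cpus_in_mask(0x2BFF)
ht_and_lpe = cpus_in_mask(0x2BFC)

print(ht_and_lpe)  # prints [2, 3, 4, 5, 6, 7, 8, 9, 11, 13]
# Indices that 2BFC excludes in addition to those excluded by 2BFF:
print(sorted(set(ht_only) - set(ht_and_lpe)))  # prints [0, 1]
```

The only difference between the two masks is that bits 0 and 1 are cleared in `2BFC`.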

### On Linux

LPE cores can be detected by running the `lscpu -e=cpu,core,maxmhz` command. Here is the example output:

```
CPU CORE MAXMHZ
0 0 4900.0000
1 0 4900.0000
2 1 4900.0000
3 1 4900.0000
4 2 3800.0000
5 3 3800.0000
6 4 3800.0000
7 5 3800.0000
8 6 3800.0000
9 7 3800.0000
10 8 3800.0000
11 9 3800.0000
12 10 2100.0000
13 11 2100.0000
```

The logical processors with the lowest maximum frequency (12, 13) are running on LPE cores.

Run the following command to disable HT and LPE cores on Linux:

```
numactl -C 0,2,4-11 python <workload.py>
```
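
The CPU list for this command can also be derived from the `lscpu -e=cpu,core,maxmhz` output. This is a minimal sketch, assuming the three-column format shown above and assuming that the cores with the lowest maximum frequency are LPE cores. That assumption holds for the example output, but it should be confirmed against the CPU specification first, because on a CPU without LPE cores the same rule would drop regular cores. The function name is hypothetical.

```python
def performance_cpus(lscpu_text):
    """Return one logical CPU per core, skipping cores with the lowest MAXMHZ."""
    rows = []
    for line in lscpu_text.strip().splitlines()[1:]:  # skip the header row
        cpu, core, maxmhz = line.split()[:3]
        rows.append((int(cpu), core, float(maxmhz)))

    # Assumption: the lowest frequency tier corresponds to the LPE cores.
    lowest = min(mhz for _, _, mhz in rows)

    seen_cores = set()
    selected = []
    for cpu, core, mhz in rows:
        if mhz == lowest or core in seen_cores:
            continue  # skip LPE cores and hyper-threading siblings
        seen_cores.add(core)
        selected.append(cpu)
    return selected
```

Joining the result with commas, for example `",".join(map(str, performance_cpus(text)))`, produces a list that can be passed to `numactl -C` or `taskset -c`.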

Binary file added software/scikit-learn-intelex/images/cpu-ht.png