Bring a fresh machine to a state where PACT can run. Do this before
building ../src/ or using ../run/.

| Subdir | Purpose |
|---|---|
| `kernel/` | Build, install, and boot vanilla Linux 6.3, then build and load the two modules the tiering subsystem requires: `tierinit` (registers the slow tier and demotion targets) and `kswapdrst` (keeps kswapd/demotion from stalling). |
| `env/` | Prepare the machine for controlled runs: uncore-frequency pinning, CXL/NUMA layout, performance governor, and disabling turbo / THP / KSM / NUMA-balancing. |
| `perf/` | Build the PAC-patched `perf` used for PAC sampling (`install-perf.sh` clones Linux v5.15, applies `pac_perf.patch`, and builds the `perf` binary). |

- Kernel - `kernel/`: `setup_kernel.sh` → reboot into 6.3 → `build_modules.sh`.
- Environment - `env/`: `sudo ./prepare_environment.sh` (or let the runner do it via `run_setup_config=true ./run-pact.sh ...`).

The env scripts locate and source each other relative to their own location, so
they can be invoked from any working directory (e.g. the runner calls
`../setup/env/prepare_environment.sh`).
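
The location-relative sourcing described above is commonly done by resolving the script's own directory first. The following is a minimal sketch of that idiom, not the code from the env scripts: `resolve_dir` and `SCRIPT_DIR` are hypothetical names, and the commented-out `source` line only illustrates where a sibling script would be pulled in.

```shell
#!/usr/bin/env bash
# Sketch only: resolve_dir is a hypothetical helper, not taken from the PACT scripts.

# Print the absolute directory that contains the given script path.
resolve_dir() {
    (cd "$(dirname "$1")" >/dev/null && pwd)
}

# In bash, BASH_SOURCE names the file being executed or sourced;
# fall back to $0 in shells that do not set it.
SCRIPT_DIR="$(resolve_dir "${BASH_SOURCE:-$0}")"

# A sibling script can now be found regardless of the caller's working directory:
# source "${SCRIPT_DIR}/modify-uncore-freq.sh"
echo "scripts live in: ${SCRIPT_DIR}"
```

Because every path is built from `SCRIPT_DIR` rather than from `$PWD`, calling the script as `../setup/env/prepare_environment.sh` or by absolute path resolves the same sibling files.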

The slow tier is emulated by a remote NUMA node with its CPUs taken offline
and its uncore frequency pinned low. `check_cxl_conf` now aborts if Node 1
still has online CPUs, because an unvalidated layout silently produces invalid
results. Two things are not auto-validated and you should confirm them:

- Topology - after `prepare_environment.sh`, the slow-tier node must be CPU-less: in the output of `numactl --hardware`, the `node 1 cpus:` line should be empty.
- Latency - the emulated tiers should land near the paper's targets (~90 ns local DRAM, ~140 ns remote NUMA, ~190 ns CXL-emulated; see `../modeling/README.md`). Measure with a pointer-chase latency tool (e.g. Intel MLC `--latency_matrix`, or the `ptr_chase` microbenchmark) and adjust the uncore-frequency targets in `env/modify-uncore-freq.sh` / `UNCORE_ARGS` until the slow tier matches. Pinning the frequency without checking the resulting latency is the most common source of invalid tiering results.
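
The two manual checks above can be scripted. The sketch below is an illustration under assumptions, not part of the repository: `node_is_cpuless` and `latency_near` are hypothetical helpers, and the 10% tolerance used in the usage note is an arbitrary placeholder since the text only says the tiers should land "near" the targets. The first helper parses `numactl --hardware` output from stdin; the second compares a latency you measured yourself against a target.

```shell
#!/usr/bin/env bash
# Sketch only: both helpers are hypothetical, not taken from the PACT scripts.

# node_is_cpuless NODE
# Reads `numactl --hardware` output on stdin. Succeeds only if a
# "node NODE cpus:" line exists and lists no CPU ids after the colon.
node_is_cpuless() {
    awk -v n="$1" '
        $1 == "node" && $2 == n && $3 == "cpus:" { found = 1; if (NF > 3) bad = 1 }
        END { exit (found && !bad) ? 0 : 1 }
    '
}

# latency_near MEASURED_NS TARGET_NS PCT
# Succeeds if MEASURED_NS is within PCT percent of TARGET_NS (integers only).
latency_near() {
    diff=$(( $1 - $2 ))
    [ "$diff" -lt 0 ] && diff=$(( -diff ))
    [ $(( diff * 100 )) -le $(( $2 * $3 )) ]
}

# Run the topology check only where numactl is installed.
if command -v numactl >/dev/null 2>&1; then
    if numactl --hardware | node_is_cpuless 1; then
        echo "node 1 is CPU-less"
    else
        echo "node 1 still has online CPUs (or is missing)" >&2
    fi
fi
```

For the latency side, read the slow-tier figure from your measurement tool and pass it in by hand, for example `latency_near 195 190 10` for a measured 195 ns against the ~190 ns CXL-emulated target. If the check fails, adjust the uncore-frequency targets and measure again.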

Use `memmap` on the GRUB cmdline to reserve DRAM and shrink the fast tier.

Example: CloudLab `c220g5` has 96 GB DRAM per socket. Adding `memmap=76G!2G` to `GRUB_CMDLINE_LINUX` reserves 76 GB starting at 2 GB. After `update-grub` and reboot: `node 0 size: 20730 MB`; `node 0 free: 18746 MB`. The values vary by hundreds of MB from boot to boot, so tune carefully.

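
The arithmetic behind the example is simply total DRAM minus the desired fast-tier size. A small sketch, assuming a hypothetical helper `memmap_arg` (not part of the repository) that takes the per-socket DRAM, the target fast-tier size, and the start offset, all in whole GB:

```shell
#!/usr/bin/env bash
# Sketch only: memmap_arg is a hypothetical helper, not taken from the PACT scripts.

# memmap_arg TOTAL_GB FAST_GB START_GB
# Print a memmap= option that reserves (TOTAL_GB - FAST_GB) GB starting at
# START_GB, leaving roughly FAST_GB usable on that node.
memmap_arg() {
    reserve=$(( $1 - $2 ))
    if [ "$reserve" -le 0 ]; then
        echo "fast tier must be smaller than total DRAM" >&2
        return 1
    fi
    printf 'memmap=%sG!%sG\n' "$reserve" "$3"
}

# 96 GB per socket, ~20 GB fast tier, reservation starting at 2 GB.
memmap_arg 96 20 2   # prints memmap=76G!2G
```

The printed option goes into `GRUB_CMDLINE_LINUX`. Since the usable size drifts between boots, treat the result as a starting point and confirm the actual node 0 size after rebooting.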
- The current scripts assume a 2-NUMA-node server (one fast tier, one CXL-like slow tier).