From c74c035a08998e0b154c9f475523846d722ebe2e Mon Sep 17 00:00:00 2001
From: Stafford Horne <shorne@gmail.com>
Date: Sun, 22 Mar 2026 06:50:07 +0000
Subject: [PATCH 1/6] QEMU: cleanup the tutorial

 - Fix line wrappings
 - Try to fix some run-on sentences
---
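Note for reviewers: the tutorial text in this patch explains that a binary
built without `-static` fails under `qemu-or1k-static` because the dynamic
loader is missing. Below is a minimal Python sketch of how one could check for
this before running. It assumes a 32-bit ELF and only looks for a `PT_INTERP`
program header. The name `needs_interpreter` is made up for illustration and
is not part of QEMU or the toolchain.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def needs_interpreter(elf: bytes) -> bool:
    """Return True if a 32-bit ELF image asks for a dynamic loader."""
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if elf[4] != 1:
        raise ValueError("only 32-bit ELF is handled in this sketch")
    # e_ident[5] == 2 means big-endian (MSB), which is what or1k uses
    endian = ">" if elf[5] == 2 else "<"
    (e_phoff,) = struct.unpack_from(endian + "I", elf, 28)
    e_phentsize, e_phnum = struct.unpack_from(endian + "HH", elf, 42)
    for i in range(e_phnum):
        # p_type is the first field of every program header
        (p_type,) = struct.unpack_from(endian + "I", elf, e_phoff + i * e_phentsize)
        if p_type == PT_INTERP:
            return True
    return False
```

For example, `needs_interpreter(open("hello", "rb").read())` should return
`False` for the statically linked `hello` built in the tutorial.
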
 platform/qemu.md | 79 +++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 61 insertions(+), 18 deletions(-)

diff --git a/platform/qemu.md b/platform/qemu.md
index d56e589..2a3f5b5 100644
--- a/platform/qemu.md
+++ b/platform/qemu.md
@@ -5,43 +5,69 @@ nav_order: 2
 parent: Platforms
 ---
 
-The objective of this tutorial is to run a Hello World program on QEMU. The prerequisite is to have a [cross-compiler](https://openrisc.io/software) for OpenRISC 1000 (or1k). And don't forget to add it to the `PATH` .
+The objective of this tutorial is to run a Hello World program on QEMU. The
+prerequisite is to have a [cross-compiler](https://openrisc.io/software) for
+OpenRISC 1000 (or1k). And don't forget to add it to the `PATH`.
 
-# Intro
-QEMU is a generic emulator that supports various target architectures. [It can be used to emulate a 32-bit OpenRISC CPU](https://www.qemu.org/docs/master/system/target-openrisc.html).\
-QEMU has two different modes of emulation: user mode and system emulation mode. The user mode only allows you to run programs compiled for the target architecture, while the system mode emulates the complete hardware. 
+# Intro
+QEMU is a generic emulator that supports various target architectures. [It can be used to emulate a 32-bit OpenRISC CPU](https://www.qemu.org/docs/master/system/target-openrisc.html).
+
+QEMU has two different modes of emulation: user mode and system emulation mode.
+The user mode only allows you to run programs compiled for the target
+architecture, while the system mode emulates the complete hardware.
 
 # User Mode
+
 ## Install QEMU
-Install the pre-built package from your distribution's package manager. On Ubuntu, run\
-`sudo apt install qemu-user-static`  
-for running statically linked binaries OR\
-`sudo apt install qemu-user`  
+
+Install the pre-built package from your distribution's package manager. On Ubuntu, run:
+
+```bash
+sudo apt install qemu-user-static
+```
+
+for running statically linked binaries OR
+
+```bash
+sudo apt install qemu-user
+```
+
 for running dynamically linked binaries. Personally, I am using the static version for this tutorial.
-The way this mode of emulation works is that QEMU captures the target's (or1k in this case) system calls and translates them into your host system's. 
+
+The way this mode of emulation works is that QEMU captures the target's (or1k in
+this case) system calls and translates them before passing them to your host system.
 
 ## Cross-compile the Program
+
 `hello.c` is included in this directory. Compile it using the cross-compiler you have. For me, the command is\
 `or1k-none-linux-musl-gcc hello.c -static -o hello`.
 
 And check if the output file type looks correct using `file` command.
-```
+
+```bash
 file hello
 hello: ELF 32-bit MSB executable, OpenRISC, version 1 (SYSV), statically linked, with debug_info, not stripped
 ```
 
-Note that `-static` flag was used when compiling. Without this, the output will be a dynamically linked ELF, which will give an error, ```qemu-or1k-static: Could not open '/lib/ld-musl-or1k.so.1': No such file or directory```, if we try to run it using `qemu-or1k-static`.
+Note that the `-static` flag was used when compiling. Without it, the output
+is a dynamically linked ELF, and running it with `qemu-or1k-static` fails with
+the error
+`qemu-or1k-static: Could not open '/lib/ld-musl-or1k.so.1': No such file or directory`.
 
 ## Run the Program
-```
+
+```bash
 qemu-or1k-static hello
 ```
+
 If the output is `Hello World!`, then everything is working correctly.
 
 # System Emulation
-The following exercise will make a very simple bare-metal or1k system with our hello.c program as the "kernel". 
+
+The following exercise will make a very simple bare-metal or1k system with our `hello.c` program as the "kernel".
 
 ## Install QEMU (From Source)
+
 Running QEMU full-system emulation means you need to run a different QEMU binary than what you used for the user mode. On Ubuntu, you can run\
 `sudo apt install qemu-system`. **However**, this will download a quite old version of QEMU (likely version 8.2.2) and this does not work with this tutorial. This tutorial was tested to be working on QEMU version 9.2.2.
 
@@ -74,16 +100,33 @@ export PATH=$PATH:<path of or1k-elf bin folder>
 
 
 Then we compile the code, *but* for a specific board. This is done by passing `-mboard` option. There are two boards that can work: `or1ksim-uart` or `ordb1a3pe1500`. The full list can be found [here](https://github.com/openrisc/newlib/tree/or1k/libgloss/or1k/boards).
-```
+
+```bash
 or1k-elf-gcc -mboard=ordb1a3pe1500 hello.c -o hello.qemu
-// OR
+# OR
 or1k-elf-gcc -mboard=or1ksim-uart hello.c -o hello.qemu
 ```
 
-## Run the Program
+## Running the Program
+
 ```
 ./qemu-9.2.2/build/qemu-system-or1k -cpu or1200 -serial mon:stdio -kernel hello.qemu -nographic
 ```
-The expected result of the run should be `Hello World!` being printed out and then hanging. 
 
-> If you compiled the program without passing `-mboard=` flag or with something other than 2 boards mentioned above, you may not see any output. [QEMU by default uses or1ksim board](https://www.qemu.org/docs/master/system/target-openrisc.html#choosing-a-board-model) when the board is not specified using `-M` flag like we did above. And it uses specific memory layout and configuration that can be found [here](https://github.com/qemu/qemu/blob/master/hw/openrisc/openrisc_sim.c). In order for serial output to be captured and displayed properly by the QEMU or1ksim, its UART configuration (memory-mapped address, baud rate and IRQ) should match that of the binary. [`or1ksim-uart`](https://github.com/openrisc/newlib/blob/or1k/libgloss/or1k/boards/or1ksim-uart.S) and [`ordb1a3pe1500`](https://github.com/openrisc/newlib/blob/or1k/libgloss/or1k/boards/ordb1a3pe1500.S) happen to have the matching configuration and allow the compiler to generate binaries that can work well with QEMU (if I have to pick the _best_ one for this, I would pick `ordb1a3pe1500`, because it has 20MHz clock frequency just like QEMU or1ksim as opposed to 100MHz). 
+The expected result is that `Hello World!` is printed, after which the emulator hangs.
+
+> If you compiled the program without passing the `-mboard=` flag, or with a
+> board other than the two mentioned above, you may not see any output.
+> [QEMU by default uses the or1ksim board](https://www.qemu.org/docs/master/system/target-openrisc.html#choosing-a-board-model)
+> when no board is specified with the `-M` flag, as in the command above. It uses
+> a specific memory layout and configuration that can be found in QEMU's [or1k-sim.c](https://github.com/qemu/qemu/blob/master/hw/or1k/or1k-sim.c).
+> For serial output to be captured and displayed properly by the QEMU
+> or1ksim board, the binary's UART configuration (memory-mapped address, baud
+> rate and IRQ) must match that of the board.
+> [`or1ksim-uart`](https://github.com/openrisc/newlib/blob/or1k/libgloss/or1k/boards/or1ksim-uart.S)
+> and
+> [`ordb1a3pe1500`](https://github.com/openrisc/newlib/blob/or1k/libgloss/or1k/boards/ordb1a3pe1500.S)
+> happen to have a matching configuration and allow the compiler to generate
+> binaries that work well with QEMU (if I had to pick the _best_ one for
+> this, I would pick `ordb1a3pe1500`, because it has a 20MHz clock frequency just
+> like the QEMU or1ksim board, as opposed to 100MHz).

From f4bf81e58d270c8f547d3b967a63336e7d0f78c3 Mon Sep 17 00:00:00 2001
From: Stafford Horne <shorne@gmail.com>
Date: Sun, 22 Mar 2026 06:51:14 +0000
Subject: [PATCH 2/6] Linux: Rewrite the linux intro

The old intro was all over the place and just explained
how to build Linux, which we now do in the other tutorials.
Rewrite it to be an introduction to the basics of OpenRISC
embedded Linux.
---
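Note for reviewers: the Physical Addresses section added here says the data
cache covers a 31-bit address width, so everything from `0x80000000` upward is
uncached IO space. A minimal Python sketch of that rule is below. The helper
name `is_cached` is hypothetical and exists only for this illustration. The
UART address is the `0x90000000` mentioned in the device tree text.

```python
IO_BASE = 0x80000000  # first address outside the 31-bit cached window


def is_cached(phys_addr: int) -> bool:
    """Return True when a 32-bit physical address lies in the cached 2GB."""
    if not 0 <= phys_addr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    return phys_addr < IO_BASE


print(is_cached(0x00000100))  # True: RAM, cached
print(is_cached(0x90000000))  # False: UART in IO space, not cached
```

The check is just a test of bit 31, which is why the split between RAM and IO
space falls exactly at the 2GB boundary.
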
 docs/Linux.md | 224 +++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 183 insertions(+), 41 deletions(-)

diff --git a/docs/Linux.md b/docs/Linux.md
index 66a1178..27a3ab4 100644
--- a/docs/Linux.md
+++ b/docs/Linux.md
@@ -4,77 +4,219 @@ layout: page
 nav_order: 5
 ---
 
-## Build OpenRISC linux image
+## OpenRISC Linux
 
-### Prerequisites
+Linux on OpenRISC is the essence of [embedded Linux](https://elinux.org/Main_Page).
+From FPGA-based SoCs and simulators to toolchains, kernel and software, it provides a complete
+open-source software and hardware stack.
 
-#### Software
+In this tutorial we cover the basics of OpenRISC embedded Linux before diving
+into our Linux on OpenRISC tutorials.  We will cover:
 
-* Linux source code
-* `or1k-elf` or `or1k-linux` toolchain [Releases](https://github.com/stffrdhrn/or1k-toolchain-build/releases)
-* or1ksim (optional)
+ * Memory layout - we will explain how devices, Linux and our user processes
+   share memory space
+ * Boot loaders - we need to get Linux onto the system, we will explain how this
+   is done.
+ * Device tree - how does Linux know what hardware is available in the system
+ * Toolchains - We covered this before, but a quick refresher on linux
+   specific toolchains
+ * Rootfs - Applications
 
-### Setting up
+If you wish to skip this you can continue directly with our tutorials:
 
-OpenRISC is officially supported in the Linux kernel since 2011 and can be downloaded from https://www.kernel.org.
+ * [Linux on or1ksim](linux-on-or1ksim.html) - Our instruction level simulator
+ * [Linux on QEMU](linux-on-qemu.html) - The QEMU emulator
+ * [Linux on De0 Nano](linux-on-de0nano.html) - An FPGA Development Board
+ * And more (see left panel).
 
-#### Get Linux source code
+### Memory Layout
+
+Before diving into Linux, boot loaders, the device tree a basic understanding of
+the memory layout is helpful.
+
+The OpenRISC is able to address up to 32-bits of address space giving us up
+to 4GB of addressable memory.
+
+#### Physical Addresses
+
+In Linux SoCs our data caches are configured with a 31-bit address width.
+This means only the first 2GB of memory addresses are cached.  This is useful
+as it guarantees that all operations on addresses at or above `0x80000000` are not cached.
+We use these upper address ranges for IO devices, which we do not want to be
+cached.
+
+```
+Address Range      | Description
+-------------------+---------------------------
+0x80000000 ~ (2GB) | IO space, not cached
+-------------------+---------------------------
+0x00000000 ~ (2GB) | Memory space, cached
+```
+
+#### Virtual Memory
+
+Virtual memory in Linux is split between kernel space and user space as shown below.
+There is 1GB reserved for the kernel, 2GB reserved for user space and a 1GB hole
+which we reserve for other purposes.
+
+OpenRISC uses 8kb pages.
 
 ```
-git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
+Address |
++---
+| 0xffffffff - Top of address space
+|               ^
+|      1GB kernel space (30-bits)
+|               V
+| 0xc0000000 - Linux kernel base       (KERNELBASE 0xc0000000)
++--
+| 0xbc000000 - 0xbfffffff (VMALLOC_START - VMALLOC_END) 64MB vmalloc/ioremap (64MB)
+| 0x80000000 - 0xbbffffff
++--
+| 0x7fffffff - Top of user space (stack)
+!
+|       1GB User space                 (TASK_SIZE  0x80000000)
+|
+| 0x00002000 - Bottom of address space
+| 0x00000000 - Unmapped page (NULL)
++----
 ```
 
-#### Set up Linux source code
+If we look at the Linux kernel ELF binary we see the following.
+
+```
+~ # cat /proc/1/maps
+00002000-00168000 r-xp 00000000 00:03 7          /bin/busybox
+00168000-0016a000 r--p 00164000 00:03 7          /bin/busybox
+0016a000-0016c000 rw-p 00166000 00:03 7          /bin/busybox
+0016e000-00170000 ---p 00000000 00:00 0          [heap]
+00170000-00172000 rwxp 00000000 00:00 0          [heap]
+30000000-300de000 r-xp 00000000 00:03 114        /lib/libc.so
+300de000-300e0000 r--p 000dc000 00:03 114        /lib/libc.so
+300e0000-300e2000 rw-p 000de000 00:03 114        /lib/libc.so
+300e2000-300e4000 rwxp 00000000 00:00 0
+7ff84000-7ffa6000 rw-p 00000000 00:00 0          [stack]
+```
 
 ```
-    cd linux
-    export ARCH=openrisc
-    export CROSS_COMPILE=or1k-elf-
+readelf -S vmlinux
+There are 26 section headers, starting at offset 0x677ef20:
+
+Section Headers:
+  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
+  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
+  [ 1] .text             PROGBITS        c0000000 002000 549344 00  AX  0   0 4096
+  [ 2] .rodata           PROGBITS        c054a000 54c000 0c1098 00  WA  0   0 32
+  [ 3] .eh_frame         PROGBITS        c060b098 60d098 00005c 00   A  0   0  4
+  [ 4] __param           PROGBITS        c060b0f4 60d0f4 000744 00   A  0   0  4
+  [ 5] __modver          PROGBITS        c060b838 60d838 000024 00   A  0   0  4
+  [ 6] .notes            NOTE            c060b85c 60d85c 000054 00   A  0   0  4
+  [ 7] .data             PROGBITS        c060c000 60e000 029240 00  WA  0   0 8192
+  [ 8] __ex_table        PROGBITS        c0635240 637240 0009f0 00   A  0   0  2
+  [ 9] .init.text        PROGBITS        c0636000 638000 02a808 00  AX  0   0 8192
+  [10] .init.data        PROGBITS        c0660820 662820 36b68c 00  WA  0   0 32
+  [11] .data..percpu     PROGBITS        c09cc000 9ce000 003ea0 00  WA  0   0 16
+  [12] .bss              NOBITS          c09cfea0 9d1ea0 013d80 00  WA  0   0 16
+  [13] .debug_aranges    PROGBITS        00000000 9d1ea0 009078 00      0   0  1
+  [14] .debug_info       PROGBITS        00000000 9daf18 3e546ac 00      0   0  1
+  [15] .debug_abbrev     PROGBITS        00000000 482f5c4 201846 00      0   0  1
+  [16] .debug_line       PROGBITS        00000000 4a30e0a f3cf27 00      0   0  1
+  [17] .debug_frame      PROGBITS        00000000 596dd34 0c020c 00      0   0  4
+  [18] .debug_str        PROGBITS        00000000 5a2df40 16d574 01  MS  0   0  1
+  [19] .debug_line_str   PROGBITS        00000000 5b9b4b4 007811 01  MS  0   0  1
+  [20] .debug_loclists   PROGBITS        00000000 5ba2cc5 9d990a 00      0   0  1
+  [21] .debug_rnglists   PROGBITS        00000000 657c5cf 132676 00      0   0  1
+  [22] .comment          PROGBITS        00000000 66aec45 000012 01  MS  0   0  1
+  [23] .symtab           SYMTAB          00000000 66aec58 066db0 10     24 14480  4
+  [24] .strtab           STRTAB          00000000 6715a08 069418 00      0   0  1
+  [25] .shstrtab         STRTAB          00000000 677ee20 0000ff 00      0   0  1
+```
+
+For fat kernels the rootfs is built into the `.init.data` section, as we can see below.
+
 ```
+$ nm vmlinux | grep __irf_
+c09cbea4 d __irf_end
+c06654a4 d __irf_start
+
+$ printf "%d\n" $(((0xc09cbea4 - 0xc06654a4) / 1024))
+3482
+```
+
+The `__irf_*` symbols mark the start and end of the initramfs, which we
+include using the `CONFIG_INITRAMFS_SOURCE` kernel configuration option.
+In the above example, we can see the included data is about 3.4 MB in size.
+The rootfs is included in the kernel image using the Makefile and tools
+in the `usr/` directory of the kernel source tree.
+
+```
+Program Headers:
+  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
+  LOAD           0x002000 0xc0000000 0x00000000 0x549344 0x549344 R E 0x2000
+  LOAD           0x54c000 0xc054a000 0x0054a000 0x485ea0 0x499c20 RWE 0x2000
+  NOTE           0x60d85c 0xc060b85c 0x0060b85c 0x00054 0x00054 R   0x4
+  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
+```
+
+```
+f0000000  - devices
+c0000000  - kernel space
+b0000000  - user space
+```
+
+
+### Device tree
+
+Rootfs loaded to memory for embedded systems / no sd card
 
 The device tree file (.dts) is used to specify hardware configuration settings, such as base addresses and interrupt numbers
 for peripherals, memory sizes, the numbers of CPUs in the system and other things. If no custom device tree is used, a default one
 is enabled that contains a UART at address 0x90000000, 32MB RAM and one CPU. These parameters should be available for any
 OpenRISC system and can therefore be safely used. To enable more options, a separate device tree must be used
 
-To add a new device tree
+### Boot loaders
 
-Copy `$NAME.dts` to arch/openrisc/boot/dts/
+The job of the [boot loader](https://en.wikipedia.org/wiki/Bootloader) is to prepare the operating system to boot
+and then boot it.  In the most simple sense this means loading the operating system kernel into memory and then
+jumping to the entry point.  Traditionally the popular Linux boot loader is [GRUB](https://www.gnu.org/software/grub/).
+However, on embedded Linux platforms like OpenRISC Linux more simple loaders are used.  These include:
 
-(Later, when you run `make menuconfig`, you can specify your device tree under Processor type and features->Builtin DTB)
+ - For Simulators - or1ksim and QEMU provide built in boot loaders
+ - FPGA Boards - For larger FPGA boards with litex support we use the litex bios
+ - Tiny FPGA Boards - For tiny FPGA boards we use GDB as a simple boot loader
 
-The defconfig files is used to customize the Linux kernel for a certain hardware, e.g. enable
-extra device drivers, networking, filesystems etc. For basic usage it is enough to use the built-in default
-configuration. This will be enough to boot the kernel and communicate with it through a UART.
+Simulators like `or1ksim` and `QEMU` have the ability to be passed a kernel ELF image from the command
+line.  When the simulator is initialized they can read the ELF binary and load the bits directly into the simulator memory.
+In `QEMU` it will additionally generate and load a device tree to describe to the kernel what hardware
+is available, dynamically.  After the system and memory are initialized the simulator CPU will jump to `0x100`
+the entry point of the OpenRISC platform.
 
-To use the built-in default configuration
+On typical FPGA boards there is storage available to store a bootloader and devices available to store the operating system.
+For example on the [Digilent Arty](https://digilent.com/shop/arty-a7-100t-artix-7-fpga-development-board/) when
+the FPGA bitstream is programmed a ROM is programmed with the [litex bios](https://github.com/enjoy-digital/litex/blob/master/litex/soc/software/bios/main.c).
+This firmware plus boot loader will train DDR3 RAM before loading and jumping to the kernel entry point.
+The litex bios can load the operating system from an SD-card or from TFTP over a network connection.
 
-`make defconfig`
+On very Tiny FPGA boards like a base De0 Nano lacking non-volatile storage,
+there is no means to load an OS via SD-card or network.  We use GDB, a debugger
+typically used to read and write CPU and memory state.  We can leverage this to
+load ELF kernel images into memory over the JTAG debug interface.  Once, memory
+is loaded we can reset the CPU to have it jump to `0x100` and boot the kernel.
 
-To use a customized default configuration
+### Toolchains
 
-Copy `$NAME_defconfig` to `arch/openrisc/configs/`
+Linux toolchain vs baremetal toolchains.
 
-`make $NAME_defconfig`
+ Libc - musl
+ libc - glibc
 
-To make further customizations
 
-`make menuconfig`
+### Rootfs
 
-To build the kernel
+The rootfs is like the Linux distribution for an embedded linux.
 
-`make`
+We provide some prebuilt rootfs images here https://github.com/stffrdhrn/or1k-rootfs-build
 
-To load the kernel and start running it in openOCD run
-
-```
-init
-reset
-halt; load_image vmlinux; reg r3 0; reg npc 0x100; resume
-```
+  buildroot
+  busybox
 
-The kernel image is now available as an elf file called `vmlinux`. This file can be used as any other bare-metal program for OpenRISC. To test the Linux image, you can:
-* Run it in the reference C simulator (or1ksim)
-* Run it on a simulated RTL model (Most likely extremely slow, unless using verilator)
-* [Load it to RAM on an FPGA board with a debugger](Debugging.html)
-* Program it to non-volatile flash on an FPGA board

From 6711586b1970ecc6fa7318262f9e0296111d48d6 Mon Sep 17 00:00:00 2001
From: Stafford Horne <shorne@gmail.com>
Date: Fri, 27 Mar 2026 08:00:11 +0900
Subject: [PATCH 3/6] Linux: fix up memory layout descriptions a bit

---
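Note for reviewers: the paging notes added in this patch list the kernel macros
that carve a 32-bit virtual address into an 8-bit PGD index, an 11-bit PTE
index and a 13-bit page offset. A minimal Python sketch of that arithmetic is
below. The constants mirror the macros quoted in the notes. The function name
`split_vaddr` is made up for illustration and is not a kernel API.

```python
PAGE_SHIFT = 13                                  # 8KB pages
PGDIR_SHIFT = PAGE_SHIFT + (PAGE_SHIFT - 2)      # 13 + 11 = 24
PGDIR_SIZE = 1 << PGDIR_SHIFT                    # 16MB covered per PGD entry
PTRS_PER_PTE = 1 << (PAGE_SHIFT - 2)             # 2048 entries per PTE page
PTRS_PER_PGD = 1 << (32 - PGDIR_SHIFT)           # 256 entries in the PGD
TASK_SIZE = 0x80000000
USER_PTRS_PER_PGD = TASK_SIZE // PGDIR_SIZE      # 128 entries for user space


def split_vaddr(vaddr):
    """Split a 32-bit virtual address into (pgd index, pte index, page offset)."""
    pgd_index = vaddr >> PGDIR_SHIFT
    pte_index = (vaddr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)
    page_offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    return pgd_index, pte_index, page_offset


print(split_vaddr(0xc0000000))  # (192, 0, 0): first kernel PGD entry
print(split_vaddr(0xfffe0000))  # (255, 2032, 0): the address from the diagram
```

PGD indexes below `USER_PTRS_PER_PGD` (128) belong to user space, which matches
the note that entries 0-128 are zeroed for a new process and 128-256 are copied
from the kernel's `swapper_pg_dir`.
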
 docs/Linux.md | 361 +++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 270 insertions(+), 91 deletions(-)

diff --git a/docs/Linux.md b/docs/Linux.md
index 27a3ab4..2239508 100644
--- a/docs/Linux.md
+++ b/docs/Linux.md
@@ -13,14 +13,14 @@ open-source software and hardware stack.
 In this tutorial we cover the basics of OpenRISC embedded Linux before diving
 into our Linux on OpenRISC tutorials.  We will cover:
 
- * Memory layout - we will explain how devices, Linux and our user processes
-   share memory space
  * Boot loaders - we need to get Linux onto the system, we will explain how this
    is done.
  * Device tree - how does Linux know what hardware is available in the system
  * Toolchains - We covered this before, but a quick refresher on linux
    specific toolchains
  * Rootfs - Applications
+ * Memory layout - we explain how devices, Linux and our user processes
+   share memory space
 
 If you wish to skip this you can continue directly with our tutorials:
 
@@ -29,13 +29,204 @@ If you wish to skip this you can continue directly with our tutorials:
  * [Linux on De0 Nano](linux-on-de0nano.html) - An FPGA Development Board
  * And more (see left panel).
 
-### Memory Layout
 
-Before diving into Linux, boot loaders, the device tree a basic understanding of
-the memory layout is helpful.
+### Boot loaders
+
+The job of the [boot loader](https://en.wikipedia.org/wiki/Bootloader) is to prepare the operating system to boot
+and then boot it.  In the simplest sense this means loading the operating system kernel into memory and then
+jumping to the entry point.  Traditionally the popular Linux boot loader is [GRUB](https://www.gnu.org/software/grub/).
+However, on embedded Linux platforms like OpenRISC Linux, simpler loaders are used.  These include:
+
+ - Simulators - or1ksim and QEMU provide built-in boot loaders
+ - FPGA Boards - For larger FPGA boards with litex support we use the litex bios
+ - Tiny FPGA Boards - For tiny FPGA boards we use GDB as a simple boot loader
+
+Simulators like `or1ksim` and `QEMU` can be passed a kernel ELF image on the command
+line.  When the simulator is initialized it reads the ELF binary and loads its contents directly into the simulator memory.
+`QEMU` additionally generates and loads a device tree, dynamically, to describe to the kernel what hardware
+is available.  After the system and memory are initialized the simulator CPU jumps to `0x100`,
+the entry point of the OpenRISC platform.
+
+On typical FPGA boards there is storage available for a boot loader and devices available to hold the operating system.
+For example, on the [Digilent Arty](https://digilent.com/shop/arty-a7-100t-artix-7-fpga-development-board/), when
+the FPGA bitstream is programmed, a ROM is programmed with the [litex bios](https://github.com/enjoy-digital/litex/blob/master/litex/soc/software/bios/main.c).
+This firmware plus boot loader trains the DDR3 RAM before loading the kernel and jumping to its entry point.
+The litex bios can load the operating system from an SD-card or from TFTP over a network connection.
+
+On very tiny FPGA boards like a base De0 Nano lacking non-volatile storage,
+there is no means to load an OS via SD-card or network.  We use GDB, a debugger
+typically used to read and write CPU and memory state.  We can leverage this to
+load ELF kernel images into memory over the JTAG debug interface.  Once memory
+is loaded, we can reset the CPU to have it jump to `0x100` and boot the kernel.
+
+### Device tree
+
+The device tree file (.dts) is used to specify hardware configuration settings,
+such as base addresses and interrupt numbers for peripherals, main memory, the
+number of CPUs in the system and other things. OpenRISC Linux always needs a
+device tree to boot.  The device tree can be built into the kernel or passed as
+a boot parameter via register `r3`.
+
+Below is a very simple device tree source file describing an OpenRISC system
+with:
+ - 1 CPU
+ - 1 UART at 0x90000000
+ - 32 MB main memory at address 0x0
+ - 20 MHz clock
+
+The device tree is compiled down to a `.dtb` binary file using the device
+tree compiler (`dtc`) during the build process.  During the boot process the
+kernel uses the device tree definitions to initialize devices and memory.
+
+```
+/dts-v1/;
+/ {
+	compatible = "opencores,or1ksim";
+	#address-cells = <1>;
+	#size-cells = <1>;
+	interrupt-parent = <&pic>;
+
+	aliases {
+		uart0 = &serial0;
+	};
+
+	chosen {
+		stdout-path = "uart0:115200";
+	};
+
+	memory@0 {
+		device_type = "memory";
+		reg = <0x00000000 0x02000000>;
+	};
+
+	cpus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		cpu@0 {
+			compatible = "opencores,or1200-rtlsvn481";
+			reg = <0>;
+			clock-frequency = <20000000>;
+		};
+	};
+
+	pic: pic {
+		compatible = "opencores,or1k-pic";
+		#interrupt-cells = <1>;
+		interrupt-controller;
+	};
+
+	serial0: serial@90000000 {
+		compatible = "opencores,uart16550-rtlsvn105", "ns16550a";
+		reg = <0x90000000 0x100>;
+		interrupts = <2>;
+		clock-frequency = <20000000>;
+	};
+};
+```
+
+### Toolchains
+
+To compile the Linux kernel itself the choice of toolchain is not very important,
+as the kernel doesn't depend on any toolchain runtime features.  You can use
+any OpenRISC toolchain to build the kernel.
+However, if you want to build userspace applications, choosing the correct
+toolchain requires some thought.  The main choices are:
+
+ - [musl](../musl.html) - A lightweight and efficient toolchain
+ - [glibc](../glibc.html) - A fully featured application runtime with C++ and FPU support
+
+The musl toolchain is good enough for most purposes.  Whichever toolchain
+you choose to build your applications, be sure to use a rootfs with a compatible
+runtime installed.
+
+### Rootfs
+
+The rootfs is like the Linux distribution for an embedded Linux system.
+
+We provide some [prebuilt rootfs images](https://github.com/stffrdhrn/or1k-rootfs-build) to
+help get you started. The main choices are:
+
+ - buildroot - a fully featured rootfs ideal for boards with an SD-card, with
+   well known utilities like `bash`.
+ - busybox - a lightweight single binary rootfs, coming in at under 3MB
+
+### Memory Layout
 
 The OpenRISC is able to address up to 32-bits of address space giving us up
-to 4GB of addressable memory.
+to 4GB of addressable memory.  The space is shared between user space, the
+kernel and hardware devices.
+
+#### Paging
+
+OpenRISC uses 2-level paging.
+
+```
+      _ 11 bits for pte offset
+     /
+     | __-- 13 bit pages
+     |/  \
+     |    |
+    / \   |
+ 0xfffe0000
+   \/
+    \_ top 8 bit used for pgd
+```
+
+Notes for or1k PGD
+
+```
+PGD - dir      top 8 bits - 256 entries pgd_offset
+PMD - mid      1
+PTE - entry    least sig 11 bits of page - 2048 entries in PTE page
+
+  pte_offset
+    return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
+
+
+                        13 + 13-2 => 24
+                        1 << 24
+
+#define PGDIR_SHIFT     (PAGE_SHIFT + (PAGE_SHIFT-2))
+#define PGDIR_SIZE      (1UL << PGDIR_SHIFT)
+
+                        1 << 8
+
+1 Page per PTE / 4 => 2048
+#define PTRS_PER_PTE    (1UL << (PAGE_SHIFT-2))
+
+                        2048
+
+#define PTRS_PER_PGD    (1UL << (32-PGDIR_SHIFT))
+
+                       256
+
+#define USER_PTRS_PER_PGD       (TASK_SIZE/PGDIR_SIZE)
+                       128
+
+swapper_pg_dir[PTRS_PER_PGD];
+
+        if (ret) {
+                memset(ret, 0, USER_PTRS_PER_PGD * sizeof(pgd_t));
+                memcpy(ret + USER_PTRS_PER_PGD,
+                       swapper_pg_dir + USER_PTRS_PER_PGD,
+                       (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
+
+        }
+
+     0-128 - zeroed for users
+   128-256 - copied from kernel
+
+page
+ 31 ... 14
+
+ 31 ... 10
+
+ * An OR32 PTE looks like this:
+ *
+ * |  31 ... 10 |  9  |  8 ... 6  |  5  |  4  |  3  |  2  |  1  |  0  |
+ *  Phys pg.num    L     PP Index    D     A    WOM   WBC   CI    CC
+ *
+```
 
 #### Physical Addresses
 
@@ -50,7 +241,7 @@ Address Range      | Description
 -------------------+---------------------------
 0x80000000 ~ (2GB) | IO space, not cached
 -------------------+---------------------------
-0x00000000 ~ (2GB) | Memory space, cached
+0x00000000 ~ (2GB) | Physical RAM space, cached
 ```
 
 #### Virtual Memory
@@ -62,41 +253,22 @@ which we reserver for other purposes.
 OpenRISC uses 8kb pages.
 
 ```
-Address |
-+---
-| 0xffffffff - Top of address space
-|               ^
-|      1GB kernel space (30-bits)
-|               V
-| 0xc0000000 - Linux kernel base       (KERNELBASE 0xc0000000)
-+--
-| 0xbc000000 - 0xbfffffff (VMALLOC_START - VMALLOC_END) 64MB vmalloc/ioremap (64MB)
-| 0x80000000 - 0xbbffffff
-+--
-| 0x7fffffff - Top of user space (stack)
-!
-|       1GB User space                 (TASK_SIZE  0x80000000)
-|
-| 0x00002000 - Bottom of address space
-| 0x00000000 - Unmapped page (NULL)
+| Address Range           | Defines                       | Size  | Usage
++-------------------------+-------------------------------+-------+------
+| 0xffffc000 - 0xffffffff |                               | 16KB  | 2 page hole
+| 0xf7fc0000 - 0xffffbfff | FIXADDR_START to FIXADDR_TOP  | 256KB | 32 fixmap slots
+| 0xc0000000 - 0xf7fbffff | KERNELBASE                    | ~1GB  | direct mapped, kernel space (30-bits)
++-------------------------+-------------------------------+-------+------
+| 0xbc000000 - 0xbfffffff | VMALLOC_START to VMALLOC_END  | 64MB  | vmalloc/ioremap
+| 0x80000000 - 0xbbffffff |                               | ~1GB  | hole
++-------------------------+-------------------------------+-------+------
+| 0x00002000 - 0x7fffffff | TASK_SIZE                     | ~2GB  | User space
+| 0x00000000 - 0x00001fff |                               | 8KB   | Unmapped page, NULL catch
 +----
 ```
 
 If we look at the Linux kernel ELF binary we see the following.
 
-```
-~ # cat /proc/1/maps
-00002000-00168000 r-xp 00000000 00:03 7          /bin/busybox
-00168000-0016a000 r--p 00164000 00:03 7          /bin/busybox
-0016a000-0016c000 rw-p 00166000 00:03 7          /bin/busybox
-0016e000-00170000 ---p 00000000 00:00 0          [heap]
-00170000-00172000 rwxp 00000000 00:00 0          [heap]
-30000000-300de000 r-xp 00000000 00:03 114        /lib/libc.so
-300de000-300e0000 r--p 000dc000 00:03 114        /lib/libc.so
-300e0000-300e2000 rw-p 000de000 00:03 114        /lib/libc.so
-300e2000-300e4000 rwxp 00000000 00:00 0
-7ff84000-7ffa6000 rw-p 00000000 00:00 0          [stack]
-```
 
 ```
 readelf -S vmlinux
@@ -158,65 +330,72 @@ Program Headers:
   GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
 ```
 
-```
-f0000000  - devices
-c0000000  - kernel space
-b0000000  - user space
-```
-
-
-### Device tree
-
-Rootfs loaded to memory for embedded systems / no sd card
-
-The device tree file (.dts) is used to specify hardware configuration settings, such as base addresses and interrupt numbers
-for peripherals, memory sizes, the numbers of CPUs in the system and other things. If no custom device tree is used, a default one
-is enabled that contains a UART at address 0x90000000, 32MB RAM and one CPU. These parameters should be available for any
-OpenRISC system and can therefore be safely used. To enable more options, a separate device tree must be used
-
-### Boot loaders
-
-The job of the [boot loader](https://en.wikipedia.org/wiki/Bootloader) is to prepare the operating system to boot
-and then boot it.  In the most simple sense this means loading the operating system kernel into memory and then
-jumping to the entry point.  Traditionally the popular Linux boot loader is [GRUB](https://www.gnu.org/software/grub/).
-However, on embedded Linux platforms like OpenRISC Linux more simple loaders are used.  These include:
+If we have a look at the ELF binary of a user space process we see the
+following:
 
- - For Simulators - or1ksim and QEMU provide built in boot loaders
- - FPGA Boards - For larger FPGA boards with litex support we use the litex bios
- - Tiny FPGA Boards - For tiny FPGA boards we use GDB as a simple boot loader
-
-Simulators like `or1ksim` and `QEMU` have the ability to be passed a kernel ELF image from the command
-line.  When the simulator is initialized they can read the ELF binary and load the bits directly into the simulator memory.
-In `QEMU` it will additionally generate and load a device tree to describe to the kernel what hardware
-is available, dynamically.  After the system and memory are initialized the simulator CPU will jump to `0x100`
-the entry point of the OpenRISC platform.
-
-On typical FPGA boards there is storage available to store a bootloader and devices available to store the operating system.
-For example on the [Digilent Arty](https://digilent.com/shop/arty-a7-100t-artix-7-fpga-development-board/) when
-the FPGA bitstream is programmed a ROM is programmed with the [litex bios](https://github.com/enjoy-digital/litex/blob/master/litex/soc/software/bios/main.c).
-This firmware plus boot loader will train DDR3 RAM before loading and jumping to the kernel entry point.
-The litex bios can load the operating system from an SD-card or from TFTP over a network connection.
-
-On very Tiny FPGA boards like a base De0 Nano lacking non-volatile storage,
-there is no means to load an OS via SD-card or network.  We use GDB, a debugger
-typically used to read and write CPU and memory state.  We can leverage this to
-load ELF kernel images into memory over the JTAG debug interface.  Once, memory
-is loaded we can reset the CPU to have it jump to `0x100` and boot the kernel.
-
-### Toolchains
-
-Linux toolchain vs baremetal toolchains.
+```
+$ readelf -e ../../busybox-rootfs/initramfs/bin/busybox
+There are 23 section headers, starting at offset 0x19478c:
 
- Libc - musl
- libc - glibc
+Section Headers:
+  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
+  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
+  [ 1] .hash             HASH            00000134 000134 000ac4 04   A  3   0  4
+  [ 2] .gnu.hash         GNU_HASH        00000bf8 000bf8 000048 04   A  3   0  4
+  [ 3] .dynsym           DYNSYM          00000c40 000c40 001a80 10   A  4   9  4
+  [ 4] .dynstr           STRTAB          000026c0 0026c0 000e60 00   A  0   0  1
+  [ 5] .rela.dyn         RELA            00003520 003520 003e40 0c   A  3   0  4
+  [ 6] .rela.plt         RELA            00007360 007360 0011f4 0c  AI  3  20  4
+  [ 7] .init             PROGBITS        00008554 008554 000014 00  AX  0   0  1
+  [ 8] .plt              PROGBITS        00008568 008568 001800 04  AX  0   0  4
+  [ 9] .text             PROGBITS        00009d68 009d68 16023c 00  AX  0   0  4
+  [10] .fini             PROGBITS        00169fa4 169fa4 000014 00  AX  0   0  1
+  [11] .rodata           PROGBITS        00169fb8 169fb8 027036 00   A  0   0  8
+  [12] .interp           PROGBITS        00190fee 190fee 000017 00   A  0   0  1
+  [13] .eh_frame_hdr     PROGBITS        00191008 191008 00002c 00   A  0   0  4
+  [14] .eh_frame         PROGBITS        00191034 191034 0000c0 00   A  0   0  4
+  [15] .init_array       INIT_ARRAY      00192510 192510 000004 04  WA  0   0  4
+  [16] .fini_array       FINI_ARRAY      00192514 192514 000004 04  WA  0   0  4
+  [17] .data.rel.ro      PROGBITS        00192518 192518 0019fc 00  WA  0   0  8
+  [18] .dynamic          DYNAMIC         00193f14 193f14 0000e8 08  WA  4   0  4
+  [19] .data             PROGBITS        00194000 194000 00001d 00  WA  0   0  4
+  [20] .got              PROGBITS        00194020 194020 0006b8 04  WA  0   0  4
+  [21] .bss              NOBITS          001946d8 1946d8 0005cc 00  WA  0   0  8
+  [22] .shstrtab         STRTAB          00000000 1946d8 0000b1 00      0   0  1
+Key to Flags:
+  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
+  L (link order), O (extra OS processing required), G (group), T (TLS),
+  C (compressed), x (unknown), o (OS specific), E (exclude),
+  D (mbind), p (processor specific)
 
+Program Headers:
+  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
+  PHDR           0x000034 0x00000034 0x00000034 0x00100 0x00100 R   0x4
+  INTERP         0x190fee 0x00190fee 0x00190fee 0x00017 0x00017 R   0x1
+      [Requesting program interpreter: /lib/ld-musl-or1k.so.1]
+  LOAD           0x000000 0x00000000 0x00000000 0x1910f4 0x1910f4 R E 0x2000
+  LOAD           0x192510 0x00192510 0x00192510 0x021c8 0x02794 RW  0x2000
+  DYNAMIC        0x193f14 0x00193f14 0x00193f14 0x000e8 0x000e8 RW  0x4
+  GNU_EH_FRAME   0x191008 0x00191008 0x00191008 0x0002c 0x0002c R   0x4
+  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
+  GNU_RELRO      0x192510 0x00192510 0x00192510 0x01af0 0x01af0 R   0x1
 
-### Rootfs
+```
 
-The rootfs is like the Linux distribution for an embedded linux.
+When this is running we can see it maps into user space as follows.
 
-We provide some prebuilt rootfs images here https://github.com/stffrdhrn/or1k-rootfs-build
+```
+~ # cat /proc/1/maps
+00002000-00168000 r-xp 00000000 00:03 7          /bin/busybox
+00168000-0016a000 r--p 00164000 00:03 7          /bin/busybox
+0016a000-0016c000 rw-p 00166000 00:03 7          /bin/busybox
+0016e000-00170000 ---p 00000000 00:00 0          [heap]
+00170000-00172000 rwxp 00000000 00:00 0          [heap]
+30000000-300de000 r-xp 00000000 00:03 114        /lib/libc.so
+300de000-300e0000 r--p 000dc000 00:03 114        /lib/libc.so
+300e0000-300e2000 rw-p 000de000 00:03 114        /lib/libc.so
+300e2000-300e4000 rwxp 00000000 00:00 0
+7ff84000-7ffa6000 rw-p 00000000 00:00 0          [stack]
+```
 
-  buildroot
-  busybox
 

From 2eec82ebe3ed56e35bc61592f11da99fb293b028 Mon Sep 17 00:00:00 2001
From: Stafford Horne <shorne@gmail.com>
Date: Wed, 1 Apr 2026 00:17:10 +0100
Subject: [PATCH 4/6] Linux: Add comment about correct bit structure

---
 docs/Linux.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/docs/Linux.md b/docs/Linux.md
index 2239508..85cb2ec 100644
--- a/docs/Linux.md
+++ b/docs/Linux.md
@@ -170,6 +170,8 @@ Openrisc uses 2-level paging
  0xfffe0000
    \/
     \_ top 8 bit used for pgd
+
+
 ```
 
 Notes for or1k PGD
@@ -182,6 +184,7 @@ PTE - entry    least sig 11 bits of page - 2048 entries in PTE page
   pte_offset
     return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
 
+  [ 8 ][ 11 ][ 13 ]
 
                         13 + 13-2 => 24
                         1 << 24
@@ -217,7 +220,7 @@ swapper_pg_dir[PTRS_PER_PGD];
    128-256 - copied from kernel
 
 page
- 31 ... 14
+ 31 ... 13 - this is what it should be
 
  31 ... 10
 
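Review note: the `[ 8 ][ 11 ][ 13 ]` split added above can be checked with a small model. This is a minimal Python sketch, not kernel code. The constants mirror the kernel macros `PAGE_SHIFT`, `PGDIR_SHIFT`, `PTRS_PER_PTE` and `PTRS_PER_PGD`, while `split_vaddr` is a hypothetical helper introduced only for illustration.

```python
PAGE_SHIFT = 13                                # 8KB pages
PGDIR_SHIFT = PAGE_SHIFT + (PAGE_SHIFT - 2)    # 24
PTRS_PER_PTE = 1 << (PAGE_SHIFT - 2)           # 2048
PTRS_PER_PGD = 1 << (32 - PGDIR_SHIFT)         # 256


def split_vaddr(vaddr):
    """Return (pgd_index, pte_index, page_offset) for a 32-bit address.

    Hypothetical helper: bits [31:24] index the PGD, bits [23:13] index
    the PTE page and bits [12:0] are the offset inside the 8KB page.
    """
    pgd_index = (vaddr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1)
    pte_index = (vaddr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)
    page_offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    return pgd_index, pte_index, page_offset


# 0xfffe0000 is the example address used in the notes being patched.
print([hex(field) for field in split_vaddr(0xFFFE0000)])
```

The three field widths add up to 32 bits, which is why the page frame occupies bits 31 down to 13 as the corrected note says.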

From cd00e550647a201879cb26cfb67f3421ecd92927 Mon Sep 17 00:00:00 2001
From: Stafford Horne <shorne@gmail.com>
Date: Wed, 1 Apr 2026 19:48:37 +0800
Subject: [PATCH 5/6] Linux: try to improve coherency of page table and memory
 layout sections

---
 docs/Linux.md | 232 ++++++++++++++++++++++++++++----------------------
 1 file changed, 129 insertions(+), 103 deletions(-)

diff --git a/docs/Linux.md b/docs/Linux.md
index 85cb2ec..ba1ed62 100644
--- a/docs/Linux.md
+++ b/docs/Linux.md
@@ -16,7 +16,7 @@ into our Linux on OpenRISC tutorials.  We will cover:
  * Boot loaders - we need to get Linux onto the system, we will explain how this
    is done.
  * Device tree - how does Linux know what hardware is available in the system
- * Toolchains - We covered this before, but a quick refresher on linux
+ * Toolchains - We covered this before, but a quick refresher on Linux
    specific toolchains
  * Rootfs - Applications
  * Memory layout - we explain how devices, Linux and our user processes
@@ -32,32 +32,38 @@ If you wish to skip this you can continue directly with our tutorials:
 
 ### Boot loaders
 
-The job of the [boot loader](https://en.wikipedia.org/wiki/Bootloader) is to prepare the operating system to boot
-and then boot it.  In the most simple sense this means loading the operating system kernel into memory and then
-jumping to the entry point.  Traditionally the popular Linux boot loader is [GRUB](https://www.gnu.org/software/grub/).
-However, on embedded Linux platforms like OpenRISC Linux more simple loaders are used.  These include:
+The job of the [boot loader](https://en.wikipedia.org/wiki/Bootloader) is to
+prepare the operating system to boot and then boot it.  In the most simple sense
+this means loading the operating system kernel into memory and then jumping to
+the entry point.  Traditionally the popular Linux boot loader is
+[GRUB](https://www.gnu.org/software/grub/).  However, on embedded Linux
+platforms like OpenRISC Linux more simple loaders are used.  These include:
 
  - For Simulators - or1ksim and QEMU provide built in boot loaders
  - FPGA Boards - For larger FPGA boards with litex support we use the litex bios
  - Tiny FPGA Boards - For tiny FPGA boards we use GDB as a simple boot loader
 
-Simulators like `or1ksim` and `QEMU` have the ability to be passed a kernel ELF image from the command
-line.  When the simulator is initialized they can read the ELF binary and load the bits directly into the simulator memory.
-In `QEMU` it will additionally generate and load a device tree to describe to the kernel what hardware
-is available, dynamically.  After the system and memory are initialized the simulator CPU will jump to `0x100`
-the entry point of the OpenRISC platform.
-
-On typical FPGA boards there is storage available to store a bootloader and devices available to store the operating system.
-For example on the [Digilent Arty](https://digilent.com/shop/arty-a7-100t-artix-7-fpga-development-board/) when
-the FPGA bitstream is programmed a ROM is programmed with the [litex bios](https://github.com/enjoy-digital/litex/blob/master/litex/soc/software/bios/main.c).
-This firmware plus boot loader will train DDR3 RAM before loading and jumping to the kernel entry point.
-The litex bios can load the operating system from an SD-card or from TFTP over a network connection.
-
-On very Tiny FPGA boards like a base De0 Nano lacking non-volatile storage,
-there is no means to load an OS via SD-card or network.  We use GDB, a debugger
+Simulators like `or1ksim` and `QEMU` have the ability to be passed a kernel ELF
+image from the command line.  When the simulator is initialized they will read
+the ELF binary and load the binary content directly into the simulator memory.
+In `QEMU` it will additionally generate and load a device tree to describe to
+the kernel what hardware is available, dynamically.  After the system and memory
+are initialized the simulator CPU will jump to `0x100`, the entry point of the
+OpenRISC platform.
+
+On typical FPGA boards there is storage available to store a bootloader and
+devices available to store the operating system.  For example on the [Digilent Arty](https://digilent.com/shop/arty-a7-100t-artix-7-fpga-development-board/)
+when the FPGA bitstream is programmed a ROM is programmed with the [litex bios](https://github.com/enjoy-digital/litex/blob/master/litex/soc/software/bios/main.c).
+This firmware plus boot loader will train DDR3 RAM before loading and jumping to
+the kernel entry point.  The litex bios can load the operating system from an
+SD-card or from TFTP over a network connection.
+
+On very tiny FPGA boards like a base De0 Nano lacking non-volatile storage,
+there may be no means to load an OS via SD-card or network.  We use GDB, a debugger
 typically used to read and write CPU and memory state.  We can leverage this to
-load ELF kernel images into memory over the JTAG debug interface.  Once, memory
+load ELF kernel images into memory over a JTAG debug interface.  Once memory
 is loaded we can reset the CPU to have it jump to `0x100` and boot the kernel.
+Address `0x100` is the OpenRISC default reset vector.
 
 ### Device tree
 
@@ -69,13 +75,14 @@ a boot parameter via register `r3`.
 
 The below is a very simple device tree source file describing an OpenRISC system
 with:
+
  - 1 CPU
  - 1 UART at 0x90000000
  - 32 MB main memory at address 0x0
  - 20 Mhz clock
 
 The device tree will be compiled down to a `.dtb` binary file using the device
-tree compiler (`dtc`) durig the build processes.  During the boot process the
+tree compiler (`dtc`) during the build processes.  During the boot process the
 kernel uses the device tree definitions to initialize devices and memory.
 
 ```
@@ -128,8 +135,9 @@ kernel uses the device tree definitions to initialize devices and memory.
 
 To compile the Linux kernel itself the toolchain used is not very important,
 as the kernel doesn't depend on any toolchain runtime features.  You can use
-any toolchain to build the kernel.
-However, if you want to build userspace applications choosing the correct
+any toolchain to build the kernel, as long as it is a recent OpenRISC
+toolchain.
+However, if you want to build user space applications choosing the correct
 toolchain requires some thought.  The main choices are:
 
  - [musl](../musl.html) - A lightweight and efficient toolchain
@@ -141,104 +149,98 @@ runtime installed.
 
 ### Rootfs
 
-The rootfs is like the Linux distribution for an embedded linux.
+The rootfs is like the Linux distribution for an embedded Linux.
 
 We provide some [prebuilt rootfs images](https://github.com/stffrdhrn/or1k-rootfs-build) to
-help get you started. The main choices are:
+help get you started. The top choices are:
 
  - buildroot - a fully featured rootfs ideal for boards with and sd-card, with
-   well known utilties like `bash`.
- - busybox - a lightweight single binary rootfs, comming in at under 3MB
+   well known utilities like `bash`.
+ - busybox - a lightweight single binary rootfs, coming in at under 3MB
 
 ### Memory Layout
 
 The OpenRISC is able to address up to 32-bits of address space giving us up
 to 4GB of addressable memory.  The space is shared between user space, the
-kernel and hardware devices.
+kernel and hardware devices.  Memory protection between processes is achieved
+using the OpenRISC memory management unit **MMU**.
 
-Paging
+The OpenRISC MMU uses 8KB (13-bits) pages leaving the most significant 19-bits
+for indexing into a software page table.  The architecture uses a 2-level [page table](linux/mm/page_tables.rst)
+using 8-bits to index a 256 entry page directory and 11-bits to index 2048 page table entry leaf nodes.
 
-Openrisc uses 2-level paging
+The **page global directory** or **pgd** looks like the following in OpenRISC:
 
 ```
-      _ 11 bits for pte offset
-     /
-     | __-- 13 bit pages
-     |/  \
-     |    |
-    / \   |
- 0xfffe0000
-   \/
-    \_ top 8 bit used for pgd
-
-
+        PGD (256 entries)
+
+  --> +-----+           PTE (2048 entries)
+      | ptr |-------> +-----+
+      | ptr |-        | ptr |-------> PAGE
+      | ptr | \       | ptr |
+      | ptr |  \        ...
+      | ... |   \
+      | ptr |    \         PTE
+      +-----+     +----> +-----+
+                         | ptr |-------> PAGE
+                         | ptr |
+                           ...
+
+ PMD, PUD and P4D are folded up on OpenRISC
 ```
 
-Notes for or1k PGD
+Virtual address bits are used to index into the page table
+and derive the physical address as below:
 
 ```
-PGD - dir      top 8 bits - 256 enties pgd_offset
-PMD - mid      1
-PTE - entry    least sig 11 bits of page - 2048 entries in PTE page
-
-  pte_offset
-    return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
-
-  [ 8 ][ 11 ][ 13 ]
-
-                        13 + 13-2 => 24
-                        1 << 24
-
-#define PGDIR_SHIFT     (PAGE_SHIFT + (PAGE_SHIFT-2))
-#define PGDIR_SIZE      (1UL << PGDIR_SHIFT)
++--------+--------+--------+--------+
+| 31  24 | 23  16 | 15   8 | 7    0 |
++--------+--------+--------+--------+
+ |         |          |
+ |         |          v
+ |         |         [12:0] in-page offset
+ |         +-------> [23:13] PTE index
+ +-----------------> [31:24] PGD index
+```
 
-                        1 << 8
+These are defined in `page.h` and `pgtable.h` as follows:
 
-1 Page per PTE / 4 => 2048
-#define PTRS_PER_PTE    (1UL << (PAGE_SHIFT-2))
+From page.h:
 
-                        2048
+```
+#define PAGE_SHIFT      13                               // 8KB
+```
 
-#define PTRS_PER_PGD    (1UL << (32-PGDIR_SHIFT))
+From pgtable.h:
 
-                       256
+```
+#define PGDIR_SHIFT     (PAGE_SHIFT + (PAGE_SHIFT-2))    // 24
+#define PTRS_PER_PTE    (1UL << (PAGE_SHIFT-2))          // 2048
+#define PTRS_PER_PGD    (1UL << (32-PGDIR_SHIFT))        // 256
 
+#define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
 #define USER_PTRS_PER_PGD       (TASK_SIZE/PGDIR_SIZE)
-                       128
-
-swapper_pg_dir[PTRS_PER_PGD];
-
-        if (ret) {
-                memset(ret, 0, USER_PTRS_PER_PGD * sizeof(pgd_t));
-                memcpy(ret + USER_PTRS_PER_PGD,
-                       swapper_pg_dir + USER_PTRS_PER_PGD,
-                       (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
-
-        }
-
-     0-128 - zeroed for users
-   128-256 - copied from kernel
-
-page
- 31 ... 13 - this is what it should be
-
- 31 ... 10
-
- * An OR32 PTE looks like this:
- *
- * |  31 ... 10 |  9  |  8 ... 6  |  5  |  4  |  3  |  2  |  1  |  0  |
- *  Phys pg.num    L     PP Index    D     A    WOM   WBC   CI    CC
- *
 ```
 
+The definition of `USER_PTRS_PER_PGD` evaluates to 128. This macro is used to
+reserve the first 128 PGD entries (0 to 127) for user space, leaving entries
+128 to 255 for kernel space.
+
 #### Physical Addresses
 
 In Linux SoC's our data caches are configured with a 31-bit addresses width.
-This means only the first 2GB of memory addresses are cached.  This is useful
+This means only the first 2GB of physical memory space addresses are cached.  This is useful
 as it guarantees that all operations on addresses above `0x80000000` are not cached.
 We use these upper address ranges for IO devices which we do not want to be
 cached.
 
+This means that technically OpenRISC systems cannot have more than 2GiB of main
+memory. However, due to the OpenRISC kernel not supporting highmem and some
+other reserved address space, the main memory limit is about 768MiB; which is
+plenty for OpenRISC embedded system.
+
+The physical address space looks like the follow:
+
 ```
 Address Range      | Description
 -------------------+---------------------------
@@ -250,8 +252,8 @@ Address Range      | Description
 #### Virtual Memory
 
 Virtual memory in Linux is split between kernel space and user space as below.
-There is 1GB reserved for the kernel, 2GB reserved for userspace and a 1GB hole
-which we reserver for other purposes.
+There is 1GB reserved for the kernel, 2GB reserved for user space and a 1GB hole
+which we reserve for other purposes.
 
 OpenRISC uses 8kb pages.
 
@@ -270,8 +272,8 @@ OpenRISC uses 8kb pages.
 +----
 ```
 
-If we look at the Linux kernel ELF binary we see the following.
-
+We can see how this works in practice if we look at a Linux kernel ELF binary as
+below:
 
 ```
 readelf -S vmlinux
@@ -305,9 +307,22 @@ Section Headers:
   [23] .symtab           SYMTAB          00000000 66aec58 066db0 10     24 14480  4
   [24] .strtab           STRTAB          00000000 6715a08 069418 00      0   0  1
   [25] .shstrtab         STRTAB          00000000 677ee20 0000ff 00      0   0  1
+
+Program Headers:
+  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
+  LOAD           0x002000 0xc0000000 0x00000000 0x549344 0x549344 R E 0x2000
+  LOAD           0x54c000 0xc054a000 0x0054a000 0x485ea0 0x499c20 RWE 0x2000
+  NOTE           0x60d85c 0xc060b85c 0x0060b85c 0x00054 0x00054 R   0x4
+  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
 ```
 
-For fat kernels the rootfs is built into `.data` section.  As we can see below.
+Notice the **Program Headers** reveal that only some of the sections
+are loaded into memory.  Many of the ELF binary sections above are used
+for debugging.  The main executable section `.text` is loaded starting at physical address `0x0`.
+The other sections are added after that.  The virtual addresses
+of the sections have a base of `0xc0000000`.
+
+For *"fat"* kernels a rootfs is built into the `.data` section, as we can see below.
 
 ```
 $ nm vmlinux | grep __irf_
@@ -324,15 +339,6 @@ In the above example, we can see the included data is about 3.4 MB in size.
 The rootfs is included into the kernel image using the Makefile and tools
 in the `usr/` directory of kernel source tree.
 
-```
-Program Headers:
-  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
-  LOAD           0x002000 0xc0000000 0x00000000 0x549344 0x549344 R E 0x2000
-  LOAD           0x54c000 0xc054a000 0x0054a000 0x485ea0 0x499c20 RWE 0x2000
-  NOTE           0x60d85c 0xc060b85c 0x0060b85c 0x00054 0x00054 R   0x4
-  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
-```
-
 If we have a look at the ELF binary of a user space process we see the
 following:
 
@@ -385,7 +391,10 @@ Program Headers:
 
 ```
 
-When this is running we can see it maps into user space as follows.
+Notice how the virtual addresses of the loaded sections have a base address
+of `0x00000000`, not `0xc0000000` as we saw in the Linux kernel binary above.
+
+When this binary is running we can see it maps into user space as follows.
 
 ```
 ~ # cat /proc/1/maps
@@ -401,4 +410,21 @@ When this is running we can see it maps into user space as follows.
 7ff84000-7ffa6000 rw-p 00000000 00:00 0          [stack]
 ```
 
+We can see a few things looking at this map:
+
+ - The first page is not mapped; mapping starts at 0x2000. This
+   allows accesses to `0x0` to throw a null pointer exception.
+ - The binary sections are loaded into executable, read only and read write
+   protected regions.
+ - A dynamic heap has been allocated.
+ - Shared libraries are mapped into memory space around the `0x30000000`
+   range.
+ - The stack is high in the virtual memory address space around `0x7fffffff`.
+   It grows down.
+
+### Conclusion
 
+We have gone over some of the internals of the OpenRISC Linux implementation.
+We hope this helps you in the understanding of the fundamentals of embedded
+Linux and will improve your understanding of the Linux bring up tutorials that
+follow.
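
Review note: the observations about the `/proc/1/maps` listing can be reproduced with a short script. This is a minimal Python sketch under stated assumptions: `parse_maps_line` is a hypothetical helper, and `SAMPLE` is a subset of the listing quoted in the patch rather than live output.

```python
def parse_maps_line(line):
    """Parse one /proc/<pid>/maps line into a small dict (hypothetical helper)."""
    fields = line.split()
    start, end = (int(part, 16) for part in fields[0].split("-"))
    return {
        "start": start,
        "end": end,
        "perms": fields[1],
        # The path column is absent for anonymous mappings.
        "path": fields[5] if len(fields) > 5 else "",
    }


# A subset of the busybox listing quoted in the patch.
SAMPLE = """\
00002000-00168000 r-xp 00000000 00:03 7          /bin/busybox
0016a000-0016c000 rw-p 00166000 00:03 7          /bin/busybox
30000000-300de000 r-xp 00000000 00:03 114        /lib/libc.so
300e2000-300e4000 rwxp 00000000 00:00 0
7ff84000-7ffa6000 rw-p 00000000 00:00 0          [stack]
"""

regions = [parse_maps_line(line) for line in SAMPLE.splitlines()]
lowest = min(region["start"] for region in regions)
stack = next(region for region in regions if region["path"] == "[stack]")

print(f"lowest mapping starts at {lowest:#x}")
print(f"stack size: {(stack['end'] - stack['start']) // 1024} KB")
```

The lowest mapping starting at `0x2000` confirms that the first 8KB page is left unmapped, and the stack region ends just below the 2GB user space limit.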

From 69a0747030badd0b791fb3f9966583ac019930b9 Mon Sep 17 00:00:00 2001
From: Stafford Horne <shorne@gmail.com>
Date: Thu, 2 Apr 2026 19:17:29 +0100
Subject: [PATCH 6/6] Linux: fixup some links and punctuation

---
 docs/Linux.md | 44 ++++++++++++++++++++++----------------------
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/docs/Linux.md b/docs/Linux.md
index ba1ed62..a87dedb 100644
--- a/docs/Linux.md
+++ b/docs/Linux.md
@@ -7,7 +7,7 @@ nav_order: 5
 ## OpenRISC Linux
 
 Linux on OpenRISC is the essence of [embedded Linux](https://elinux.org/Main_Page).
-From the FPGA based SoC's, simulators, toolchains, kernel and software it provides a complete
+From the FPGA based SoCs, simulators, toolchains, kernel and software it provides a complete
 open-source software and hardware stack.
 
 In this tutorial we cover the basics of OpenRISC embedded Linux before diving
@@ -15,12 +15,12 @@ into our Linux on OpenRISC tutorials.  We will cover:
 
  * Boot loaders - we need to get Linux onto the system, we will explain how this
    is done.
- * Device tree - how does Linux know what hardware is available in the system
+ * Device tree - how does Linux know what hardware is available in the system?
  * Toolchains - We covered this before, but a quick refresher on Linux
-   specific toolchains
- * Rootfs - Applications
+   specific toolchains.
+ * Rootfs - Applications.
  * Memory layout - we explain how devices, Linux and our user processes
-   share memory space
+   share memory space.
 
 If you wish to skip this you can continue directly with our tutorials:
 
@@ -40,7 +40,7 @@ the entry point.  Traditionally the popular Linux boot loader is
 platforms like OpenRISC Linux more simple loaders are used.  These include:
 
  - For Simulators - or1ksim and QEMU provide built in boot loaders
- - FPGA Boards - For larger FPGA boards with litex support we use the litex bios
+ - FPGA Boards - For larger FPGA boards with LiteX support we use the LiteX BIOS
  - Tiny FPGA Boards - For tiny FPGA boards we use GDB as a simple boot loader
 
 Simulators like `or1ksim` and `QEMU` have the ability to be passed a kernel ELF
@@ -69,7 +69,7 @@ Address `0x100` is the OpenRISC default reset vector.
 
 The device tree file (.dts) is used to specify hardware configuration settings,
 such as base addresses and interrupt numbers for peripherals, main memory, the
-numbers of CPUs in the system and other things. OpenRISC Linux always needs a
+number of CPUs in the system and other things. OpenRISC Linux always needs a
 device tree to boot.  The device tree can be built into the kernel or passed as
 a boot parameter via register `r3`.
 
@@ -79,7 +79,7 @@ with:
  - 1 CPU
  - 1 UART at 0x90000000
  - 32 MB main memory at address 0x0
- - 20 Mhz clock
+ - 20 MHz clock
 
 The device tree will be compiled down to a `.dtb` binary file using the device
 tree compiler (`dtc`) during the build processes.  During the boot process the
@@ -154,7 +154,7 @@ The rootfs is like the Linux distribution for an embedded Linux.
 We provide some [prebuilt rootfs images](https://github.com/stffrdhrn/or1k-rootfs-build) to
 help get you started. The top choices are:
 
- - buildroot - a fully featured rootfs ideal for boards with and sd-card, with
+ - buildroot - a fully featured rootfs ideal for boards with an SD card, with
    well known utilities like `bash`.
  - busybox - a lightweight single binary rootfs, coming in at under 3MB
 
@@ -166,7 +166,7 @@ kernel and hardware devices.  Memory protection between processes is achieved
 using the OpenRISC memory management unit **MMU**.
 
 The OpenRISC MMU uses 8KB (13-bits) pages leaving the most significant 19-bits
-for indexing into a software page table.  The architecture uses a 2-level [page table](linux/mm/page_tables.rst)
+for indexing into a software page table.  The architecture uses a 2-level [page table](https://docs.kernel.org/mm/page_tables.html)
 using 8-bits to index a 256 entry page directory and 11-bits to index 2048 page table entry leaf nodes.
 
 The **page global directory** or **pgd** looks like the following in OpenRISC:
@@ -228,34 +228,34 @@ space.
 
 #### Physical Addresses
 
-In Linux SoC's our data caches are configured with a 31-bit addresses width.
+In Linux SoCs our data caches are configured with a 31-bit address width.
 This means only the first 2GB of physical memory space addresses are cached.  This is useful
 as it guarantees that all operations on addresses above `0x80000000` are not cached.
 We use these upper address ranges for IO devices which we do not want to be
 cached.
 
-This means that technically OpenRISC systems cannot have more than 2GiB of main
+This means that technically OpenRISC systems cannot have more than 2GB of main
 memory. However, due to the OpenRISC kernel not supporting highmem and some
-other reserved address space, the main memory limit is about 768MiB; which is
+other reserved address space, the main memory limit is about 768MB; which is
 plenty for OpenRISC embedded system.
 
 The physical address space looks like the follow:
 
 ```
-Address Range      | Description
--------------------+---------------------------
-0x80000000 ~ (2GB) | IO space, not cached
--------------------+---------------------------
-0x00000000 ~ (2GB) | Physical RAM space, cached
+Address Range       | Description
+--------------------+---------------------------
+0x80000000 ~ (2GB)  | IO space, not cached
+--------------------+---------------------------
+0x00000000 ~ (2GB)  | Physical RAM space, cached
 ```
 
 #### Virtual Memory
 
 Virtual memory in Linux is split between kernel space and user space as below.
-There is 1GB reserved for the kernel, 2GB reserved for user space and a 1GB hole
+There is 1GB reserved for the kernel, 2GB reserved for user space and a 1GB hole
 which we reserve for other purposes.
 
-OpenRISC uses 8kb pages.
+OpenRISC uses 8KB pages.
 
 ```
 | Address Range           | Defines                       | Size  | Usage
@@ -265,7 +265,7 @@ OpenRISC uses 8kb pages.
 | 0xc0000000 - 0xf7fbffff | KERNELBASE                    | ~1GB  | direct mapped, kernel space (30-bits)
 +-------------------------+-------------------------------+-------+-------
 | 0xbc000000 - 0xbfffffff | VMALLOC_START to VMALLOC_END) | 64MB  | vmalloc/ioremap
-| 0x80000000 - 0xbbffffff |                               | ~1GB  | hole
+| 0x80000000 - 0xbbffffff |                               | ~1GB  | Reserved
 +-------------------------+-------------------------------+-------+------
 | 0x00002000 - 0x7fffffff | TASK_SIZE                     | ~2GB  | User space
 | 0x00000000 - 0x00001fff |                               | 8KB   | Unmapped page, NULL catch
@@ -426,5 +426,5 @@ We can see a few things looking at this map:
 
 We have gone over some of the internals of the OpenRISC Linux implementation.
 We hope this helps you in the understanding of the fundamentals of embedded
-Linux and will improve your understanding of the Linux bring up tutorials that
+Linux and will improve your understanding of the Linux bring-up tutorials that
 follow.
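
Review note: the relationship between the kernel's virtual and physical load addresses, and the cached versus uncached split of the physical address space, can be modelled as below. This is a minimal Python sketch, not kernel code. `KERNELBASE` is taken from the virtual memory table in the document; `IO_BASE`, `kernel_virt_to_phys` and `is_cached` are hypothetical names, and the sketch ignores the upper bound of the direct-mapped region.

```python
KERNELBASE = 0xC0000000   # base of the kernel direct mapping (from the table)
IO_BASE = 0x80000000      # hypothetical name: physical addresses from here up are uncached


def kernel_virt_to_phys(vaddr):
    """Translate a direct-mapped kernel virtual address to a physical one."""
    if vaddr < KERNELBASE:
        raise ValueError("not a direct-mapped kernel address")
    return vaddr - KERNELBASE


def is_cached(paddr):
    """The data caches cover only the low 2GB of physical address space."""
    return paddr < IO_BASE


# Second LOAD segment of vmlinux: VirtAddr 0xc054a000, PhysAddr 0x0054a000.
print(hex(kernel_virt_to_phys(0xC054A000)))
# The UART in the example device tree sits at 0x90000000, in uncached IO space.
print(is_cached(0x90000000))
```

Subtracting `KERNELBASE` reproduces the `PhysAddr` column of the kernel program headers, which is what "direct mapped" means in the virtual memory table.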