
wasi: add cat/sort/tail/touch integration coverage and related fixes#11712

Open
DePasqualeOrg wants to merge 12 commits into uutils:main from DePasqualeOrg:wasi-support

Conversation

DePasqualeOrg (Contributor) commented Apr 8, 2026

This PR includes comprehensive wasm32-wasi platform improvements, enabling many more tools to compile for WASI beyond the existing feat_wasm set. It includes a full synchronous fallback for sort on targets without thread support, and symlink support for cp via WASI's symlink_path.

These changes underwent many rounds of review and refinement with Claude Code and Codex. I've tested them extensively running on Wasmtime and Wasmer.

Motivation

Recent PRs (#11574, #11568, #11569, #11573, #11595, #11624) added initial WASI support – platform stubs, FileInformation, tail feature-gating, and a basic single-threaded sort path. This PR builds on that work with three improvements:

  1. Atomics-aware cfg guards: uses target_feature = "atomics" to distinguish wasm32-wasip1 (no threads) from wasm32-wasip1-threads, so the threaded sort path is preserved on runtimes that support it.
  2. Full synchronous sort pipeline: replaces the in-memory-only fallback with a proper chunked sort-write-merge strategy (ext_sort, merge, check) that handles large inputs via temp files.
  3. Real symlink support for cp: uses symlink_path instead of returning errors. (ln symlink support is in a separate PR.)

This PR supports the following WASI targets:

  • wasm32-wasip1: universal baseline, works on all WASI runtimes
  • wasm32-wasip1-threads: supports threads via the atomics and bulk-memory proposals

Changes

uucore

| File | Change |
| --- | --- |
| lib.rs | Add #![cfg_attr(all(target_os = "wasi", feature = "fs"), feature(wasi_ext))]; only std::os::wasi::fs needs the unstable gate, std::os::wasi::ffi is stable |
| features/fs.rs | Add WASI variant to FileInformation using std::os::wasi::fs::MetadataExt for nlink(), PartialEq (dev+ino), Hash (dev+ino). Add is_stdin_directory, path_ends_with_terminator |
| features/fsext.rs | Remove outer #[cfg(not(target_os = "wasi"))] from read_fs_list so the inner WASI block (returning empty Vec) is reachable |
| features/mode.rs | Add WASI get_umask() returning default 0o022 |
| mods/io.rs | WASI into_stdio() converts through File first (no direct Stdio::from(OwnedFd) on WASI) |
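
To illustrate the dev+ino identity rule described for FileInformation, here is a minimal sketch. The `FileId` struct and its field types are hypothetical stand-ins, not the actual uucore type; the only point is that equality and hashing ignore everything except the device and inode numbers.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the identity data a FileInformation-like type keeps.
#[derive(Debug, Clone, Copy)]
struct FileId {
    dev: u64,
    ino: u64,
    nlink: u64,
}

// Two records describe the same file when device and inode match;
// the link count is deliberately left out of the comparison.
impl PartialEq for FileId {
    fn eq(&self, other: &Self) -> bool {
        self.dev == other.dev && self.ino == other.ino
    }
}

impl Eq for FileId {}

// Hash must agree with PartialEq, so it also uses only dev and ino.
impl Hash for FileId {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.dev.hash(state);
        self.ino.hash(state);
    }
}

fn main() {
    let a = FileId { dev: 1, ino: 42, nlink: 1 };
    // Same file observed again after a hard link was added.
    let b = FileId { dev: 1, ino: 42, nlink: 2 };
    let c = FileId { dev: 1, ino: 43, nlink: 1 };

    let unique: HashSet<FileId> = [a, b, c].into_iter().collect();
    println!(
        "a == b: {}, distinct files: {}, links on b: {}",
        a == b,
        unique.len(),
        b.nlink
    );
}
```

Keeping `Hash` consistent with `PartialEq` is what lets such a type be used as a set or map key when detecting that two paths refer to the same file.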

sort: synchronous fallback for WASI without threads

On wasm32-wasip1, sort crashes because it unconditionally spawns threads via std::thread::spawn and rayon. This PR adds synchronous code paths gated on #[cfg(all(target_os = "wasi", not(target_feature = "atomics")))] so that sort works on both WASI targets.
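
The gating pattern can be sketched as two definitions of the same function, only one of which is compiled for a given target. The function name `sort_lines` is hypothetical and the bodies are far simpler than the real sort code; only the cfg predicate is taken from the description above.

```rust
/// Threaded variant: compiled everywhere except WASI targets without atomics.
#[cfg(not(all(target_os = "wasi", not(target_feature = "atomics"))))]
fn sort_lines(mut lines: Vec<String>) -> Vec<String> {
    // Do the sorting on a worker thread, then hand the result back.
    std::thread::spawn(move || {
        lines.sort_unstable();
        lines
    })
    .join()
    .expect("sorter thread panicked")
}

/// Synchronous variant: compiled only on WASI targets without atomics,
/// where spawning a thread would fail at runtime.
#[cfg(all(target_os = "wasi", not(target_feature = "atomics")))]
fn sort_lines(mut lines: Vec<String>) -> Vec<String> {
    lines.sort_unstable();
    lines
}

fn main() {
    let sorted = sort_lines(vec!["pear".into(), "apple".into(), "fig".into()]);
    println!("{sorted:?}");
}
```

Callers see a single `sort_lines` either way, so the choice between the threaded and synchronous path stays local to the gated definitions.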

The sort command uses threads in four places, each with a synchronous alternative:

| File | Threaded path | Synchronous fallback |
| --- | --- | --- |
| ext_sort/threaded.rs | Sorter thread for parallel chunk sorting | Sequential read-sort-write loop with same chunked strategy |
| merge.rs | Reader thread for async file I/O during merge | SyncFileMerger that reads on demand |
| check.rs | Reader thread for async file I/O during order checking | check_sync that reads and checks inline |
| sort.rs | Rayon par_sort_by / par_sort_unstable_by | sort_by / sort_unstable_by |

The ext_sort module unconditionally compiles the threaded module, which handles both cases via internal cfg guards. The existing separate wasi.rs (a read-all-into-memory fallback from #11624) is removed in favor of this more complete implementation, along with its unused parse_into_chunk helper.
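
As a rough illustration of a single-threaded chunked sort-write-merge strategy, the sketch below sorts fixed-size chunks, spills each one to a temp file, and then merges the files with a min-heap that reads one line at a time from each chunk. All names (`ext_sort_sync`, `write_chunk`, `merge_chunks`), the line-count chunking, and the temp file naming are assumptions made for this example; the PR's code works on byte buffers and supports sort's comparison options.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::fs::{self, File};
use std::io::{self, BufRead, BufReader, BufWriter, Lines, Write};
use std::path::{Path, PathBuf};

/// Sort `input` in chunks of at most `chunk_lines` lines, spill every sorted
/// chunk to a file in `tmp_dir`, then merge the chunk files into `out`.
fn ext_sort_sync<R: BufRead, W: Write>(
    input: R,
    out: &mut W,
    tmp_dir: &Path,
    chunk_lines: usize,
) -> io::Result<()> {
    let mut chunk_files: Vec<PathBuf> = Vec::new();
    let mut chunk: Vec<String> = Vec::new();
    for line in input.lines() {
        chunk.push(line?);
        if chunk.len() >= chunk_lines.max(1) {
            let index = chunk_files.len();
            chunk_files.push(write_chunk(&mut chunk, tmp_dir, index)?);
        }
    }
    if !chunk.is_empty() {
        let index = chunk_files.len();
        chunk_files.push(write_chunk(&mut chunk, tmp_dir, index)?);
    }
    merge_chunks(&chunk_files, out)?;
    for path in &chunk_files {
        fs::remove_file(path)?;
    }
    Ok(())
}

/// Sort one chunk in memory, write it to a temp file, and empty the buffer.
fn write_chunk(chunk: &mut Vec<String>, tmp_dir: &Path, index: usize) -> io::Result<PathBuf> {
    chunk.sort_unstable();
    let path = tmp_dir.join(format!("sort-chunk-{index}.tmp"));
    let mut writer = BufWriter::new(File::create(&path)?);
    for line in chunk.drain(..) {
        writeln!(writer, "{line}")?;
    }
    writer.flush()?;
    Ok(path)
}

/// K-way merge: keep the current head line of every chunk in a min-heap and
/// read the next line of a chunk only when its head has been written out.
fn merge_chunks<W: Write>(paths: &[PathBuf], out: &mut W) -> io::Result<()> {
    let mut readers: Vec<Lines<BufReader<File>>> = Vec::new();
    for path in paths {
        readers.push(BufReader::new(File::open(path)?).lines());
    }
    let mut heap = BinaryHeap::new();
    for (i, reader) in readers.iter_mut().enumerate() {
        if let Some(line) = reader.next() {
            heap.push(Reverse((line?, i)));
        }
    }
    while let Some(Reverse((line, i))) = heap.pop() {
        writeln!(out, "{line}")?;
        if let Some(next) = readers[i].next() {
            heap.push(Reverse((next?, i)));
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join(format!("ext-sort-sketch-{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let input = io::Cursor::new("delta\nbravo\nalpha\ncharlie\necho\n");
    let mut out = Vec::new();
    ext_sort_sync(input, &mut out, &dir, 2)?;
    print!("{}", String::from_utf8_lossy(&out));
    fs::remove_dir_all(&dir)
}
```

Because the merge step pulls lines on demand, peak memory is bounded by the chunk size plus one buffered line per chunk file rather than by the total input size.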

Both the synchronous ext_sort and merge paths emit a warning and fall back to uncompressed temp files if --compress-program is passed, since process spawning is not available on WASI without threads.
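
One way to picture this fallback is a small helper that drops the requested compression program when processes cannot be spawned. This is only a sketch: the helper name `effective_compress_program`, the `CAN_SPAWN` constant, and the warning text are all hypothetical and do not come from the PR.

```rust
/// True when the target can spawn a compression subprocess. The predicate
/// mirrors the no-threads WASI gate used elsewhere in the sort crate.
const CAN_SPAWN: bool = !cfg!(all(target_os = "wasi", not(target_feature = "atomics")));

/// Return the compression program to use for temp files, or `None` when
/// temp files should be written uncompressed.
fn effective_compress_program(requested: Option<String>, can_spawn: bool) -> Option<String> {
    match requested {
        Some(prog) if !can_spawn => {
            // Hypothetical wording; the real message may differ.
            eprintln!("sort: warning: cannot run '{prog}' on this platform; using uncompressed temp files");
            None
        }
        other => other,
    }
}

fn main() {
    let chosen = effective_compress_program(Some("gzip".to_string()), CAN_SPAWN);
    println!("compress program in effect: {chosen:?}");
}
```

Degrading to uncompressed temp files keeps the sort result identical; only the disk footprint of the intermediate files changes.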

The chunks.rs file extracts read_to_chunk() from read() so both threaded and synchronous code paths share the same chunk-reading logic.

Other tool crates

| Tool | File | Change |
| --- | --- | --- |
| cat | platform/mod.rs | Add WASI is_unsafe_overwrite stub (returns false) |
| cp | cp.rs | Add WASI symlink using std::os::wasi::fs::symlink_path; return error for timestamp preservation on WASI (filetime panics in from_last_access_time/from_last_modification_time). handle_preserve suppresses it for optional (-a) and reports it for required (--preserve=timestamps) |
| env | native_int_str.rs | Add #[cfg(target_os = "wasi")] use std::os::wasi::ffi::{OsStrExt, OsStringExt} |
| mktemp | mktemp.rs | Gate permissions with #[cfg(unix)] instead of #[cfg(not(windows))] |
| sort | Cargo.toml | Exclude ctrlc crate on WASI (no signal handling); make rayon conditional on atomics support |
| sort | tmp_dir.rs | Add WASI no-op signal handler; gate ctrlc usage |
| sort | sort.rs | Check TMPDIR env var before calling env::temp_dir() (panics on WASI) |
| tail | platform/mod.rs | Add WASI stubs for Pid, ProcessChecker (#[allow(dead_code)], since follow mode is disabled on WASI), supports_pid_checks |
| tail | paths.rs | Add WASI file_id_eq (returns false; no stable inode API on WASI yet) |
| tail | text.rs, args.rs | Add WASI backend name and help text |
| touch | touch.rs | Return UnsupportedPlatformFeature error for touch - on WASI (no /dev/stdout path) |

What's stubbed vs. fully functional

Most WASI stubs are for Unix concepts that don't exist in WASI's capability-based security model:

  • Stubbed (no-op): signal handling, PID monitoring, umask, file ownership checks, hostname, Unix permission display, touch - (returns error – WASI has no /dev/stdout path), timestamp preservation in cp (filetime crate panics on WASI)
  • Fully functional: file I/O, directory operations, sorting (threaded or synchronous), text processing, symlinks (relative targets only – absolute targets are rejected by WASI's capability-based sandbox), temp files, environment variables

Build requirements

  • Rust nightly (for std::os::wasi::fs extensions, gated with #![cfg_attr(all(target_os = "wasi", feature = "fs"), feature(wasi_ext))])
  • wasm32-wasip1: cargo +nightly build --target wasm32-wasip1 --release
  • wasm32-wasip1-threads: cargo +nightly build --target wasm32-wasip1-threads -Zbuild-std=std,panic_abort --release

Testing

  • Host (macOS): cargo build --release compiles with no warnings
  • wasm32-wasip1: compiles, uses synchronous paths for sort
  • wasm32-wasip1-threads: compiles, uses threaded paths for sort
  • All newly enabled tools verified working on Wasmer (wasip1-threads) and Wasmtime (wasip1)
  • sort requires TMPDIR environment variable on WASI (Rust's std::env::temp_dir() panics on WASI – this is a Rust std library issue, not specific to coreutils)

Note on cfg alias commit

The last commit adds a build.rs to the sort crate that defines a wasi_no_threads cfg alias, replacing ~49 instances of the verbose #[cfg(all(target_os = "wasi", not(target_feature = "atomics")))] with #[cfg(wasi_no_threads)]. This is a readability improvement only and can be reverted if maintainers prefer the explicit predicates.
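
A build script that defines such an alias could look roughly like the sketch below. It is an assumption about the shape of the build.rs, not a copy of it: the helper `is_wasi_no_threads` is hypothetical, while `CARGO_CFG_TARGET_OS` and `CARGO_CFG_TARGET_FEATURE` are the environment variables Cargo sets for build scripts.

```rust
use std::env;

/// Decide whether the alias applies, given the target OS and the
/// comma-separated target feature list that Cargo passes to build scripts.
fn is_wasi_no_threads(target_os: &str, target_features: &str) -> bool {
    let has_atomics = target_features.split(',').any(|f| f.trim() == "atomics");
    target_os == "wasi" && !has_atomics
}

fn main() {
    // Declare the custom cfg so newer toolchains do not warn about an
    // unexpected cfg name (assumption: a Cargo version that understands
    // the check-cfg instruction).
    println!("cargo:rustc-check-cfg=cfg(wasi_no_threads)");

    let target_os = env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    let target_features = env::var("CARGO_CFG_TARGET_FEATURE").unwrap_or_default();

    if is_wasi_no_threads(&target_os, &target_features) {
        // Enables #[cfg(wasi_no_threads)] throughout the crate.
        println!("cargo:rustc-cfg=wasi_no_threads");
    }
}
```

With the alias in place, a gated item reads `#[cfg(wasi_no_threads)]` or `#[cfg(not(wasi_no_threads))]`, and the long predicate lives in exactly one file.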

oech3 (Contributor) commented Apr 8, 2026

Could you split the PR (at least the symlink support)?

DePasqualeOrg force-pushed the wasi-support branch 2 times, most recently from 9bced35 to aa75a81, on April 8, 2026 11:08
DePasqualeOrg (Contributor, Author) commented

I separated the symlink changes out into #11713.

DePasqualeOrg changed the title from "wasi: atomics-aware threading, synchronous sort pipeline, and symlink support" to "wasi: atomics-aware threading and synchronous sort pipeline" on Apr 8, 2026
DePasqualeOrg (Contributor, Author) commented

I added a new commit to address an issue that would cause CI failures.

The WASI CI uses stable Rust, but std::os::wasi::fs requires the unstable feature(wasi_ext) gate. This commit replaces all unstable APIs with stable libc equivalents:

  • uucore/fs.rs: FileInformation now stores libc::stat instead of std::fs::Metadata on WASI, using libc::fstat/libc::stat/libc::lstat for dev/ino/nlink access
  • cp.rs: std::os::wasi::fs::symlink_path replaced with libc::symlink
  • uucore/lib.rs: #![cfg_attr(..., feature(wasi_ext))] removed entirely
  • is_enotsup_error() updated to use libc::EOPNOTSUPP on WASI instead of a hardcoded value

DePasqualeOrg (Contributor, Author) commented

I resolved the linter error in CI.

github-actions bot commented Apr 8, 2026

GNU testsuite comparison:

GNU test failed: tests/tail/tail-n0f. tests/tail/tail-n0f is passing on 'main'. Maybe you have to rebase?
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/expand/bounded-memory is now passing!
Congrats! The gnu test tests/printf/printf-surprise is now passing!

github-actions bot commented Apr 8, 2026

GNU testsuite comparison:

Skip an intermittent issue tests/tail/symlink (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/cp/link-heap is now being skipped but was previously passing.
Note: The gnu test tests/pr/bounded-memory is now being skipped but was previously passing.
Congrats! The gnu test tests/expand/bounded-memory is now passing!
Congrats! The gnu test tests/printf/printf-surprise is now passing!

github-actions bot commented Apr 8, 2026

GNU testsuite comparison:

Skip an intermittent issue tests/cut/bounded-memory (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/expand/bounded-memory is now passing!
Congrats! The gnu test tests/printf/printf-surprise is now passing!

oech3 (Contributor) commented Apr 8, 2026

Is it possible to split the PR per utility? The diff is still too big. How about splitting out sort?

DePasqualeOrg (Contributor, Author) commented

The changes here are interdependent, and splitting out sort would result in PRs of ~560 and ~110 lines each, which would need to be sequenced properly and rebased on upstream changes. They should also probably wait for #11717, in which I've enabled integration tests. If/when that PR is merged, I can enable more tests for these tools in this PR. Keeping them in one PR would reduce the cognitive load for me, since I'm now keeping track of three related PRs.

Add #[cfg(target_os = "wasi")] blocks alongside existing unix and windows
platform code. No changes to existing platform behavior.

Enables compilation to wasm32-wasip1 and wasm32-wasip1-threads targets
for running in WASI-compatible runtimes like WasmKit and Wasmer.

On wasm32-wasip1 (no atomics), sort crashes because ext_sort, merge,
check, and rayon all spawn threads unconditionally. Add synchronous
code paths gated on cfg(all(target_os = "wasi", not(target_feature
= "atomics"))) so sort works on both wasip1 (sync) and
wasip1-threads (threaded).

Key changes:
- Extract read_to_chunk() from chunks::read() for shared use
- Add synchronous ext_sort with chunked sort-write-merge flow
- Add SyncFileMerger for threadless merge operations
- Add synchronous check for order verification
- Gate rayon par_sort with sequential fallback

DePasqualeOrg (Contributor, Author) commented

I've added WASI integration coverage for four more tools, plus the fixes needed to make those tests pass.

New integration tests

Added test_cat, test_sort, test_tail, test_touch to wasi.yml. Every skip carries a specific reason; those reasons are categorized in docs/src/wasi-test-gaps.md.

Fixes

touch: filetime's WASI backend is stubbed out (FileTime::from_last_{access,modification}_time panics; set_file_times returns "Wasm not implemented"). Swapped for rustix::fs::utimensat, with Metadata::{accessed,modified} converted via FileTime::from_system_time. 31 tests fixed.

sort: /tmp isn't a WASI preopen by default, so the hardcoded fallback broke external sort. Added uucore::fs::wasi_default_tmp_dir (gated to cfg(target_os = "wasi")) which returns /tmp if visible, else the current directory. --tmp-dir and TMPDIR still override. 4 tests fixed.
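
The lookup order described for sort's temp directory can be sketched as a pure function. The name `resolve_tmp_dir` and its parameters are hypothetical (the PR's helper is uucore::fs::wasi_default_tmp_dir, whose signature is not shown here), and the precedence of --tmp-dir over TMPDIR is an assumption; the text above only says that both override the default.

```rust
use std::path::{Path, PathBuf};

/// Pick the directory for sort's temp files.
/// Assumed order: explicit --tmp-dir, then a non-empty TMPDIR, then /tmp if
/// the sandbox exposes it, otherwise the current directory.
fn resolve_tmp_dir(
    cli_tmp_dir: Option<&Path>,
    env_tmpdir: Option<&str>,
    tmp_is_visible: bool,
) -> PathBuf {
    if let Some(dir) = cli_tmp_dir {
        return dir.to_path_buf();
    }
    if let Some(dir) = env_tmpdir.filter(|d| !d.is_empty()) {
        return PathBuf::from(dir);
    }
    if tmp_is_visible {
        PathBuf::from("/tmp")
    } else {
        // /tmp is not preopened, so stay inside a directory we can access.
        PathBuf::from(".")
    }
}

fn main() {
    let env_tmpdir = std::env::var("TMPDIR").ok();
    let tmp_is_visible = Path::new("/tmp").is_dir();
    let dir = resolve_tmp_dir(None, env_tmpdir.as_deref(), tmp_is_visible);
    println!("temp files go to {}", dir.display());
}
```

Passing the visibility check in as a boolean keeps the decision logic testable on a host system, where /tmp almost always exists.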

cp: replaced CString + unsafe libc::symlink with rustix::fs::symlink (matches the ln pattern). Implemented --preserve=timestamps on WASI via rustix::fs::utimensat (mirroring touch) and reverted the Preserve::Yes { required: false } downgrade so cp -a no longer silently drops timestamps.

uucore: FileInformation's WASI arm hand-rolled fstat/stat/lstat via libc + CString. Collapsed into the unix arm – rustix::fs::Stat is a typedef to libc::stat on WASI, and rustix::fs::{stat, lstat, fstat} work there without any cfg gate. Removed ~35 lines of unsafe code.

cat: stub stays – when stdout is inherited from a host file descriptor, wasmtime reports its fstat as all-zero so dev/inode comparison can never match. Comment updated to reflect that accurately.

Deferred

test_cp is not added here. ~41 of its failures still cluster around WASI symlink handling that needs follow-up: under wasmtime, symlinks with absolute targets (for example bar -> /foo) fail in ways that do not match POSIX, while equivalent relative symlinks work. The same limitation also affects readlink, realpath, and other symlink-heavy paths. Separately, some remaining failures are test-harness expectation issues because the WASI guest sees the per-test tempdir as /, not as the host absolute temp path. I documented both gaps in docs/src/wasi-test-gaps.md for follow-up.

Verification

Integration tests pass on macOS host and on Linux in an Ubuntu 24.04 Docker container (1592 passed / 0 failed / 176 ignored).

DePasqualeOrg changed the title from "wasi: atomics-aware threading and synchronous sort pipeline" to "wasi: add cat/sort/tail/touch integration coverage and related fixes" on Apr 12, 2026
github-actions bot commented

GNU testsuite comparison:

Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/dd/no-allocate is now passing!
Congrats! The gnu test tests/printf/printf-surprise is now passing!

github-actions bot commented

GNU testsuite comparison:

Skip an intermittent issue tests/cut/bounded-memory (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/tail-n0f (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tty/tty-eof (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/dd/no-allocate is now passing!
Congrats! The gnu test tests/printf/printf-surprise is now passing!

github-actions bot commented

GNU testsuite comparison:

Skip an intermittent issue tests/tty/tty-eof (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/dd/no-allocate is now passing!
Congrats! The gnu test tests/printf/printf-surprise is now passing!

github-actions bot commented

GNU testsuite comparison:

Skip an intermittent issue tests/cut/bounded-memory (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tty/tty-eof (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/basenc/bounded-memory is now being skipped but was previously passing.
Congrats! The gnu test tests/dd/no-allocate is now passing!
Congrats! The gnu test tests/printf/printf-surprise is now passing!
