[tests] Improve DoNotLeakWeakReferences reliability#11615
Merged
jonathanpeppers merged 4 commits intoJun 10, 2026
Conversation
Under stock NUnit 3 (after the dotnet test migration), an unhandled NUnit AssertionException thrown from inside a raw 'new Thread (...)' body becomes an unhandled exception on the .NET runtime side, which SIGABRTs the whole process. The instrumentation host then never gets a Finish() call and 'dotnet test' reports 'Zero tests ran' for the entire run -- masking the real failure (Assert #2 inside the worker) and taking out every other test in the assembly with it. Capture exceptions from the worker thread in a local, then rethrow on the test thread after Join() so an Assert failure (or any other exception) becomes a normal NUnit test failure instead of a process abort. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Build 14275720 had this test fail in 6 lanes simultaneously with two failure modes: #2 Expected: 36/37 But was: 37/38 (+1 extra peer at allocation) #3 Expected: 36/37 But was: 34/35 (1-3 fewer peers after GC) Build 14212378 on main (pre-PR) hit the same #2 failure, so this is a pre-existing flake -- not a regression. Before ac81297, mode #2 escaped from new Thread() and SIGABRT'd the process (manifesting as 'Zero tests ran' for the whole lane); the marshal-exception fix now surfaces it as a real test failure, which is why we suddenly see it. Runtime.GetSurfacedObjects() is process-global: NUnit/MTP per-test infrastructure (TestExecutionContext, listeners, Console capture) allocates and releases Java.Lang.Object peers around every test, so the count drifts by a few entries between snapshots. Observed drift across recent builds is in the range -3..+1; allow +/-5 tolerance which is plenty tight to still detect a real leak (which would dwarf this) while removing the noise. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On the CoreCLRTrimmable lane, the surfaced-object count drifted by 7 (expected 24 +/- 5, actual 31). Bump tolerance from 5 to 10 so transient GC/NUnit infrastructure drift doesn't fail the test. A real leak would still produce far higher counts. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR improves the reliability of the flaky JnienvTest.DoNotLeakWeakReferences device test by making failures from a worker thread observable and by tolerating small process-global drift in Runtime.GetSurfacedObjects() counts between snapshots.
Changes:
- Add a tolerance window when asserting surfaced peer counts to account for per-test infrastructure churn.
- Capture exceptions from the worker thread and fail the test on the main thread when they occur.
simonrozsival
approved these changes
Jun 10, 2026
jonathanpeppers
added a commit
that referenced
this pull request
Jun 10, 2026
## Summary Migrate Mono.Android device test infrastructure from the legacy NUnitLite stack to `dotnet test` with the Microsoft Testing Platform (MTP), following the pattern established for the `androidtest` template in 031246d. Callers now run device tests with stock `dotnet test`, no custom MSBuild targets required. **NUnitLite is removed entirely** from the workload — both as a framework assembly and as a vendored copy. ## Changes ### New: MTP-based `TestInstrumentation` base class - `tests/TestRunner.Core/TestInstrumentation.cs` is now an MTP-based on-device runner using stock NUnit (`NUnitTestAssemblyRunner`) plus `Microsoft.Testing.Platform`. - Mirrors the `androidtest` template: tests run synchronously on the instrumentation thread (`NumberOfTestWorkers = 0`, see comment in code), TRX is written to the app data dir, and the path is sent back through the instrumentation bundle so the host adapter can pick it up. ### Updated test projects - **Mono.Android.NET-Tests**, **Java.Interop-Tests**, **JcwGen-Tests**: now reference stock NUnit `3.13.3` from NuGet (no NUnitLite / TestRunner.NUnit), set `IsTestingPlatformApplication=true`, and use the new `TestInstrumentation` base. ### Deleted (~41K lines) - `src/Xamarin.Android.NUnitLite/` — the `Xamarin.Android.NUnitLite.dll` framework assembly (NUnitLite wrapper, `TestSuiteInstrumentation`, GUI `TestSuiteActivity` / `TestResultActivity` / `OptionsActivity`, layouts). - `src-ThirdParty/NUnitLite/` — the embedded NUnitLite source we used to compile into the above. - `tests/TestRunner.NUnit/` — `NUnitTestRunner` / `NUnitTestInstrumentation`. - `tests/TestRunner.xUnit/` — `XUnitTestRunner` / `XUnitTestInstrumentation`. - `Xamarin.Android.NUnitLite.dll/.pdb/.xml` removed from `build-tools/installers/create-installers.targets` (no longer shipped in the framework). - Various custom `RunTestApp` / device-test MSBuild targets. The `XA4313` deprecation message in `Common.targets` (which suggested customers migrate `Xamarin.Android.NUnitLite` → `Xamarin.Legacy.NUnitLite`) is preserved as the existing escape hatch; the `Xamarin.Legacy.NUnitLite` NuGet is published from a separate repo and is unaffected by these changes. ### NativeAOT trimmer fix NUnit 3.13.3's `EquatablesComparer` walks every value via reflection: `type.GetInterfaceMap(typeof(IEquatable<T>))` then `MethodInfo.Invoke` on the result. Under stock NativeAOT this throws `Arg_NotSupportedException` because ILC strips the dispatch-slot mapping. NUnit 3.14+ guards this with try/catch, but we cannot bump globally (it regresses the Mono Interpreter `AwaitAdapter` lane). Real fix, no `[Category("NativeAOTIgnore")]` shortcuts: - `tests/Mono.Android-Tests/Mono.Android-Tests/Mono.Android.NET-Tests.csproj`: under `'$(PublishAot)' == 'true'`, set `<IlcGenerateCompleteTypeMetadata>true</IlcGenerateCompleteTypeMetadata>` and pull in a new `NativeAOT.rd.xml`. - `tests/Mono.Android-Tests/Mono.Android-Tests/NativeAOT.rd.xml` roots the closed `IEquatable<T>` instantiations the test value types need (`String`, `Boolean`, `IntPtr`, `TimeSpan`, `DateTime`) plus the concrete types themselves as `Dynamic="Required All"`, so ILC keeps the slot mapping. Also roots `TaskAwaitAdapter+GenericAdapter<VoidTaskResult>` for NUnit's async adapter. Result on CI build [14325032](https://devdiv.visualstudio.com/DevDiv/_build/results?buildId=14325032): NativeAOT lane 230 passed / 0 failed (43 legitimate category exclusions). ### CI updates (`apk-instrumentation.yaml`) - Build with `-t:Install`, run with `dotnet test --no-build --report-trx`. - Test results are published as VSTest (TRX) instead of NUnit XML. - Added `adb logcat -d` capture so device-test crashes can be diagnosed without re-running CI. ### Test fixes / bookkeeping - Centralized `JavaSystem.LoadLibrary ("reuse-threads")` in `TestInstrumentation.OnCreate` so it runs on the main Java thread instead of inside individual tests. - Synced `ExcludedTestNames` / `ExcludedCategories` to match `main`. ## Out of scope - `EmbeddedDSO` and `Locale-Tests` — old-style csproj that still `<Reference Include="Xamarin.Android.NUnitLite" />`. They need an SDK migration before this PR can fully delete that assembly's identity from the build (the `XA4313` deprecation will fire on these projects today). Tracked separately. ## Already split out These improvements were carved out of this PR and merged separately to `main`: - #11614 — `[tests] Use Android.Util.Log in JnienvTest.ThreadReuse callback` - #11615 — `[tests] Improve DoNotLeakWeakReferences reliability` - #11609 — `Bump to dotnet/java-interop@b881d21` — picks up dotnet/java-interop#1437 (modernize NUnit attributes), which is why no `NUnitCompatibility.cs` shim is needed in this PR. ## Related - Template work: 031246d - #11523 — fixed `UrlEscaping_Bug43411` on `main` (test had been silently dead under NUnitLite) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Improves reliability of the flaky
JnienvTest.DoNotLeakWeakReferencestest by consolidating three small fixes:Thread, so assertion failures inside it produce a real test failure instead of being swallowed.Runtime.GetSurfacedObjects()is process-global, and NUnit/MTP infrastructure (per-testTestExecutionContext, listeners,Consolecapture, etc.) creates and releases transientJava.Lang.Objectpeers around every test. UseIs.EqualTo(...).Within(tolerance)instead ofAssert.AreEqualso the count can drift by a few entries between snapshots.Split out of #11224 (branch
jonathanpeppers/dogfood-dotnet-test) so it can land independently of the larger MTP /dotnet testdogfooding work. Only touchestests/Mono.Android-Tests/Mono.Android-Tests/Java.Interop/JnienvTest.cs.