
DrawingCanvas API: Replace imperative extension methods with stateful canvas-based drawing model#377

Open
JimBobSquarePants wants to merge 205 commits into main from js/canvas-api

Member

@JimBobSquarePants JimBobSquarePants commented Mar 1, 2026

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following the existing coding patterns and practices as demonstrated in the repository. These follow strict StyleCop rules 👮.
  • I have provided test coverage for my change (where applicable)

Breaking Changes: DrawingCanvas API

Fix #106
Fix #244
Fix #344
Fix #367

This is a major breaking change. The library's public API has been completely redesigned around a canvas-based drawing model, replacing the previous collection of imperative extension methods.

What changed

The old API surface — dozens of IImageProcessingContext extension methods like DrawLine(), DrawPolygon(), FillPolygon(), DrawBeziers(), DrawImage(), DrawText(), etc. — has been removed entirely. These methods were individually simple but suffered from several architectural limitations:

  • Each call was an independent image processor that rasterized and composited in isolation, making it impossible to batch or reorder operations.
  • State (blending mode, clip paths, transforms) had to be passed to every single call.
  • There was no way for an alternate rendering backend to intercept or accelerate a sequence of draw calls.

The new model: DrawingCanvas

All drawing now goes through IDrawingCanvas / DrawingCanvas<TPixel>, a stateful canvas that queues draw commands and flushes them as a batch.

Via Image.Mutate() (most common)

using SixLabors.ImageSharp.Drawing;
using SixLabors.ImageSharp.Drawing.Processing;

image.Mutate(ctx => ctx.ProcessWithCanvas(canvas =>
{
    // Fill a path
    canvas.Fill(Brushes.Solid(Color.Red), new EllipsePolygon(200, 200, 100));

    // Stroke a path
    canvas.Draw(Pens.Solid(Color.Blue, 3), new RectangularPolygon(50, 50, 200, 100));

    // Draw a polyline
    canvas.DrawLine(Pens.Solid(Color.Green, 2), new PointF(0, 0), new PointF(100, 100));

    // Draw text
    canvas.DrawText(
        new RichTextOptions(font) { Origin = new PointF(10, 10) },
        "Hello, World!",
        brush: Brushes.Solid(Color.Black),
        pen: null);

    // Draw an image
    canvas.DrawImage(sourceImage, sourceRect, destinationRect);

    // Save/Restore state (options, clip paths)
    canvas.Save(new DrawingOptions
    {
        GraphicsOptions = new GraphicsOptions { BlendPercentage = 0.5f }
    });
    canvas.Fill(brush, path);
    canvas.Restore();

    // Apply arbitrary image processing to a path region
    canvas.Process(path, inner => inner.Brightness(0.5f));

    // Commands are flushed on Dispose (or call canvas.Flush() explicitly)
}));

Standalone usage (without Image.Mutate)

DrawingCanvas<TPixel> can be constructed directly against an image frame:

using var canvas = DrawingCanvas<Rgba32>.FromRootFrame(image, new DrawingOptions());

canvas.Fill(brush, path);
canvas.Draw(pen, path);
canvas.Flush();

// Or target a specific frame of the image by index:
using var canvas = DrawingCanvas<Rgba32>.FromImage(image, frameIndex: 0, new DrawingOptions());
// ...

// Or target an existing frame instance:
using var canvas = DrawingCanvas<Rgba32>.FromFrame(frame, new DrawingOptions());
// ...

Canvas state management

The canvas supports a save/restore stack (similar to HTML Canvas or SkCanvas):

int saveCount = canvas.Save();             // push current state
canvas.Save(options, clipPath1, clipPath2); // push and replace state

canvas.Restore();              // pop one level
canvas.RestoreTo(saveCount);   // pop to a specific level

State includes DrawingOptions (graphics options, shape options, transform) and clip paths. SaveLayer creates an offscreen layer that composites back on Restore.
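The save/restore semantics can be illustrated with a toy model. This is only a sketch: `ToyCanvas` and `ToyState` are hypothetical stand-ins rather than the PR's types, and it assumes (following the SkCanvas convention) that `Save()` returns the stack depth before the push, so that `RestoreTo(saveCount)` returns to the state that was active when that `Save()` was called.

```csharp
using System;
using System.Collections.Generic;

var canvas = new ToyCanvas();
int saveCount = canvas.Save();                       // push a copy of the current state
canvas.Save(new ToyState(BlendPercentage: 0.5f));    // push, then replace the state
canvas.Save(new ToyState(BlendPercentage: 0.25f));
Console.WriteLine(canvas.Current.BlendPercentage);   // innermost state (0.25)

canvas.Restore();                                    // pop one level
Console.WriteLine(canvas.Current.BlendPercentage);   // back to 0.5

canvas.RestoreTo(saveCount);                         // pop back to the first Save()
Console.WriteLine(canvas.Current.BlendPercentage);   // back to the initial state (1)

// Hypothetical stand-in for the state a canvas tracks (options, clip paths).
public sealed record ToyState(float BlendPercentage);

public sealed class ToyCanvas
{
    private readonly Stack<ToyState> saved = new();

    public ToyState Current { get; private set; } = new ToyState(1f);

    // Pushes the current state; returns the depth before the push (assumed convention).
    public int Save()
    {
        int depthBefore = this.saved.Count;
        this.saved.Push(this.Current);
        return depthBefore;
    }

    // Pushes the current state, then replaces it.
    public int Save(ToyState replacement)
    {
        int depthBefore = this.Save();
        this.Current = replacement;
        return depthBefore;
    }

    // Pops one level; a no-op when nothing has been saved.
    public void Restore()
    {
        if (this.saved.Count > 0)
        {
            this.Current = this.saved.Pop();
        }
    }

    // Pops until the stack depth equals the given save count.
    public void RestoreTo(int saveCount)
    {
        while (this.saved.Count > saveCount)
        {
            this.Restore();
        }
    }
}
```

Whether the real `Save()` returns the depth before or after the push is not stated in the description, so treat the return-value convention here as illustrative.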

IDrawingBackend — bring your own renderer

The library's rasterization and composition pipeline is abstracted behind IDrawingBackend. This interface has the following methods:

| Method | Purpose |
| --- | --- |
| FlushCompositions<TPixel> | Flushes queued composition operations for the target. |
| TryReadRegion<TPixel> | Reads pixels back from the target (needed for Process() and DrawImage()). |

The library ships with DefaultDrawingBackend (CPU, tiled fixed-point rasterizer). An experimental WebGPU compute-shader backend (ImageSharp.Drawing.WebGPU) is also available, demonstrating how alternate backends plug in. Users can provide their own implementations — for example, GPU-accelerated backends, SVG emitters, or recording/replay layers.

Backends are registered on Configuration:

configuration.SetDrawingBackend(myCustomBackend);

Migration guide

| Old API | New API |
| --- | --- |
| ctx.Fill(color, path) | ctx.ProcessWithCanvas(c => c.Fill(Brushes.Solid(color), path)) |
| ctx.Fill(brush, path) | ctx.ProcessWithCanvas(c => c.Fill(brush, path)) |
| ctx.Draw(pen, path) | ctx.ProcessWithCanvas(c => c.Draw(pen, path)) |
| ctx.DrawLine(pen, points) | ctx.ProcessWithCanvas(c => c.DrawLine(pen, points)) |
| ctx.DrawPolygon(pen, points) | ctx.ProcessWithCanvas(c => c.Draw(pen, new Polygon(new LinearLineSegment(points)))) |
| ctx.FillPolygon(brush, points) | ctx.ProcessWithCanvas(c => c.Fill(brush, new Polygon(new LinearLineSegment(points)))) |
| ctx.DrawText(text, font, color, origin) | ctx.ProcessWithCanvas(c => c.DrawText(new RichTextOptions(font) { Origin = origin }, text, Brushes.Solid(color), null)) |
| ctx.DrawImage(overlay, opacity) | ctx.ProcessWithCanvas(c => c.DrawImage(overlay, sourceRect, destRect)) |
| Multiple independent draw calls | Single ProcessWithCanvas block; commands are batched and flushed together |

Other breaking changes in this PR

  • AntialiasSubpixelDepth removed — The rasterizer now uses a fixed 256-step (8-bit) subpixel depth. The old AntialiasSubpixelDepth property (default: 16) controlled how many vertical subpixel steps the rasterizer used per pixel row. The new fixed-point scanline rasterizer integrates area/cover analytically per cell rather than sampling at discrete subpixel rows, so the "depth" is a property of the coordinate precision (24.8 fixed-point), not a tunable sample count. 256 steps gives ~0.4% coverage granularity — more than sufficient for all practical use cases. The old default of 16 (~6.25% granularity) could produce visible banding on gentle slopes.
  • GraphicsOptions.Antialias — now controls RasterizationMode (antialiased vs aliased). When false, coverage is snapped to binary using AntialiasThreshold.
  • GraphicsOptions.AntialiasThreshold — new property (0–1, default 0.5) controlling the coverage cutoff in aliased mode. Pixels with coverage at or above this value become fully opaque; pixels below are discarded.
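The arithmetic behind the granularity figures, and the thresholding rule for aliased mode, can be checked with a few lines. This is a minimal sketch: `FixedPoint`, `ToFixed24Dot8` and `SnapCoverage` are hypothetical names, and rounding to nearest in the fixed-point conversion is an assumption, not necessarily what the rasterizer does.

```csharp
using System;

// Coverage granularity for N subpixel steps is 1/N.
Console.WriteLine(100.0 / 16);   // old default, 16 steps: 6.25 (percent)
Console.WriteLine(100.0 / 256);  // fixed 8-bit depth, 256 steps: 0.390625 (percent)

// 24.8 fixed point: 24 integer bits and 8 fractional bits, i.e. 256 steps per pixel.
int fixedX = FixedPoint.ToFixed24Dot8(10.5f);
Console.WriteLine(fixedX);       // 10.5 * 256 = 2688

// Aliased mode: coverage at or above the threshold becomes fully opaque,
// anything below is discarded.
Console.WriteLine(FixedPoint.SnapCoverage(0.5f, threshold: 0.5f));   // 1
Console.WriteLine(FixedPoint.SnapCoverage(0.49f, threshold: 0.5f));  // 0

public static class FixedPoint
{
    // Converts a pixel coordinate to 24.8 fixed point (rounding mode is assumed).
    public static int ToFixed24Dot8(float value) => (int)MathF.Round(value * 256f);

    // Snaps antialiased coverage in [0, 1] to a binary value.
    public static float SnapCoverage(float coverage, float threshold)
        => coverage >= threshold ? 1f : 0f;
}
```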

Benchmarks

All benchmarks run under the following environment.

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.26200
Unknown processor
.NET SDK=10.0.103
  [Host] : .NET 8.0.24 (8.0.2426.7010), X64 RyuJIT

Toolchain=InProcessEmitToolchain  InvocationCount=1  IterationCount=40
LaunchCount=3  UnrollFactor=1  WarmupCount=40

DrawPolygonAll - Renders a 7200x4800px path of the state of Mississippi with a 2px stroke.

| Method | Mean | Error | StdDev | Median | Ratio | RatioSD |
| --- | --- | --- | --- | --- | --- | --- |
| SkiaSharp | 42.20 ms | 2.197 ms | 6.976 ms | 38.18 ms | 1.00 | 0.00 |
| SystemDrawing | 44.10 ms | 0.172 ms | 0.538 ms | 44.05 ms | 1.07 | 0.16 |
| ImageSharp | 12.09 ms | 0.083 ms | 0.269 ms | 12.06 ms | 0.29 | 0.05 |
| ImageSharpWebGPU | 12.47 ms | 0.291 ms | 0.940 ms | 12.71 ms | 0.30 | 0.05 |

FillParis - Renders 1096x1060px scene containing 50K fill paths.

| Method | Mean | Error | StdDev | Ratio | RatioSD |
| --- | --- | --- | --- | --- | --- |
| SkiaSharp | 104.46 ms | 0.356 ms | 1.145 ms | 1.00 | 0.00 |
| SystemDrawing | 148.53 ms | 0.327 ms | 1.033 ms | 1.42 | 0.02 |
| ImageSharp | 66.32 ms | 0.999 ms | 3.083 ms | 0.64 | 0.03 |
| ImageSharpWebGPU | 41.95 ms | 0.457 ms | 1.368 ms | 0.40 | 0.01 |

@JimBobSquarePants
Member Author

CPU is now faster than SkiaSharp on my machine

That superiority is only part of the picture because it assumes many available cores and a single user. As always, I'm skeptical about algorithm-level parallelization on CPU because it may actually hurt perf for services under high load. IMO the blog post describing Blaze lacks rigor by omitting the analysis on how the algorithm scales with no. of threads, which makes my skepticism even stronger. The author doesn't seem to consider server-side applications, I assume it's not his area of focus, but for us, the main application is likely still server.

We need empirical proof that the algorithm has good parallel efficiency/speedup. Ideally there should be a parametric benchmark showing how the algorithm scales as threads are added. There should be a sweet spot where efficiency is good enough, and I don't think the sweet spot is as high as ProcessorCount because the scaling is rarely linear. If I am proven wrong, that would be good news of course.
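A parametric scaling measurement of the kind requested could look roughly like the sketch below. It is not the project's benchmark: `ScalingProbe` and `BusyWork` are hypothetical, the workload is a synthetic CPU-bound placeholder rather than the rasterizer, and a real measurement would use BenchmarkDotNet with warmup and multiple iterations. Speedup is the single-thread time divided by the time at N threads, and efficiency is speedup divided by N.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

const int WorkItems = 256;
double baselineMs = 0;

for (int threads = 1; threads <= Environment.ProcessorCount; threads *= 2)
{
    double ms = ScalingProbe.Measure(threads, WorkItems);
    if (threads == 1)
    {
        baselineMs = ms;
    }

    double speedup = baselineMs / ms;
    double efficiency = speedup / threads;
    Console.WriteLine($"threads={threads} time={ms:F1}ms speedup={speedup:F2} efficiency={efficiency:F2}");
}

public static class ScalingProbe
{
    // Times one Parallel.For pass over a synthetic, CPU-bound workload.
    public static double Measure(int maxDegreeOfParallelism, int workItems)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };
        var results = new double[workItems];

        var stopwatch = Stopwatch.StartNew();
        Parallel.For(0, workItems, options, i => results[i] = BusyWork(i));
        stopwatch.Stop();

        return stopwatch.Elapsed.TotalMilliseconds;
    }

    // Placeholder for "rasterize one row band"; swap in the real work under test.
    private static double BusyWork(int seed)
    {
        double acc = seed;
        for (int k = 0; k < 200_000; k++)
        {
            acc = Math.Sqrt(acc + k);
        }

        return acc;
    }
}
```

The thread count at which efficiency starts to fall off noticeably is the "sweet spot" candidate; running the same sweep with several instances in parallel would approximate a loaded server.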

@antonfirsov

OK... I've wired the backend up so it respects MaxDegreeOfParallelism from the configuration everywhere

I've also removed the multiple parallel steps in the CPU scene builder. There are now only two that always run, plus one optional step when clipping or dashing is required.

  • Command preparation (path clipping, dashing)
  • Geometry linearization (building retained rasterizable data)
  • Row-band execution (rasterization + brush composition)

All three use the same pattern:

int partitionCount = Math.Min(
    workItemCount,
    requestedParallelism == -1 ? Environment.ProcessorCount : requestedParallelism);

Parallel.For(
    0,
    count,
    new ParallelOptions { MaxDegreeOfParallelism = partitionCount },
    ...);

Given that the parallel approach we take is used for almost all image processing operations I think that if this model caused problems under server load, it would have shown up years ago across the entire library.

On top of that I've managed to massively improve performance and reduce the task workload by doing a few things.

  • Fused stroking and transforming with rasterization. We no longer pay the up-front cost of mapping to and from the clipping library's types and of stroking during batch preparation. This is true for both GPU and CPU.
  • For CPU I've removed the per-row cost of renting and returning the Vector4 buffer during pixel blending by adding an overload to the API in ImageSharp that allows passing a buffer. Our solid brush does not need a color buffer anymore either.
  • For GPU I've copied the exact memory setup Vello uses for our initial memory arenas, and we scale them based on the exact requirements returned by the processor pipeline should the allocation be too small. We also chunk processing for massive scenes. This matches some ongoing work that the Vello team are doing. The arenas are also shared across flushes in a thread-safe manner, which improves processing time.

I'm getting good competitive numbers across all test scenarios.

@antonfirsov
Member

antonfirsov commented Apr 10, 2026

Given that the parallel approach we take is used for almost all image processing operations I think that if this model caused problems under server load, it would have shown up years ago across the entire library.

As seen in SixLabors/ImageSharp#3111, most but not all processors react well to parallelization. To be on the safe side, I want to run those benchmarks against the new rasterizer.

@JimBobSquarePants
Member Author

Given that the parallel approach we take is used for almost all image processing operations I think that if this model caused problems under server load, it would have shown up years ago across the entire library.

As seen in SixLabors/ImageSharp#3111, most but not all processors react well to parallelization. To be on the safe side, I want to run those benchmarks against the new rasterizer.

This should scale better than the poor examples given the sheer amount of work that takes place per task, but it would be good to see.

Member

@antonfirsov antonfirsov left a comment

Tests:

I'm not sure if all test output looks as expected, and I'm somewhat worried about whether the new test suite gives enough confidence in the implementation, since it seems to be much smaller than the deleted tests.

It shouldn't be hard to migrate the existing tests, or at least a reasonable subset, to the new APIs with LLMs. That coverage is valuable and helps protect us with changes of this size.

/// <returns>
/// A new array containing the elements of both source arrays.
/// </returns>
public static T[] Merge<T>(this T[] source1, T[] source2)
Member

This should be Concat in .NET terminology. I usually associate "merging arrays" with this.

Comment on lines +29 to +37
for (int i = 0; i < source1.Length; i++)
{
    target[i] = source1[i];
}

for (int i = 0; i < source2.Length; i++)
{
    target[i + source1.Length] = source2[i];
}
Member

This could be replaced by array or span copy.
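For illustration, the two loops could collapse into two block copies. This is a sketch of the suggestion, not the code in the PR: the class name `ArrayExtensions` is assumed, the method is named `Concat` as proposed in the earlier comment, and null checks are omitted for brevity.

```csharp
using System;

int[] joined = ArrayExtensions.Concat(new[] { 1, 2 }, new[] { 3, 4, 5 });
Console.WriteLine(string.Join(", ", joined)); // prints "1, 2, 3, 4, 5"

public static class ArrayExtensions
{
    // Returns a new array containing the elements of both source arrays, in order.
    public static T[] Concat<T>(this T[] source1, T[] source2)
    {
        T[] target = new T[source1.Length + source2.Length];

        // Block copies replace the element-by-element loops.
        Array.Copy(source1, target, source1.Length);
        Array.Copy(source2, 0, target, source1.Length, source2.Length);

        return target;
    }
}
```

A span-based variant (`source1.AsSpan().CopyTo(target)`) would work equally well for the common case.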

@@ -0,0 +1,120 @@

Microsoft Visual Studio Solution File, Format Version 12.00
Member

@antonfirsov antonfirsov Apr 10, 2026

I don't see many benefits from this isolation: there aren't enough projects to make IDEs struggle, and if changing product code while working in the other solution breaks the samples, they would need to be updated manually.

Member

@antonfirsov antonfirsov Apr 10, 2026

This looks odd to me. Are you sure it's the correct output? If it's wrong, is it possible that there are more reference images that do not represent our intention?

Member

Is this supposed to be empty?

namespace SixLabors.ImageSharp.Drawing.Tests.Processing;

[GroupOutput("Drawing")]
public partial class DrawingCanvasTests
Member

@antonfirsov antonfirsov Apr 10, 2026

The number of pictures in the output is very low compared to what we had previously. I understand these tests try to be more "dense", but there was significant value in the previous per-functionality testing.

I think we are losing a lot of coverage by deleting that test suite entirely.

Member

@antonfirsov antonfirsov left a comment

This review mostly focuses on WebGPU utility code.

There is one unexplored feature: how would one create a NativeSurface/Canvas around an existing WindowHandle? This is important if someone wants to render to an existing area within a Windows application (WinForms, WPF, WinUI) via GPU.

I also noticed many odd behaviors and code quality issues introduced by LLMs. IMO the number of those that slip through to a release (and potentially to APIs) is proportional to the speed you want to go at, so I wonder how much compromise you are willing to make in order to get things released quickly. I'm worried that, given the scale, there might still be many issues in this PR, which may create significant technical debt if not dealt with now. This is basically the million dollar question of AI-assisted coding today.

/// <param name="workItemCount">The total number of work items available for partitioning.</param>
/// <returns>The number of partitions to schedule.</returns>
public static int GetPartitionCount(int maxDegreeOfParallelism, int workItemCount)
    => maxDegreeOfParallelism == -1 ? workItemCount : Math.Min(maxDegreeOfParallelism, workItemCount);
Member

Do I understand correctly that when workItemCount = scene.RowCount, it is the number of rows to render? I don't see why one would need MaxDegreeOfParallelism to be that high; am I missing something?

See my comment on SixLabors/ImageSharp#3110.
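To make the concern concrete, the sketch below contrasts the reviewed expression with the processor-capped variant described earlier in the thread. The class and method names (`Partitioning`, `Uncapped`, `CappedToProcessors`) are hypothetical, and the row count is an illustrative value.

```csharp
using System;

int rows = 4800; // one work item per row of a tall image (illustrative)

Console.WriteLine(Partitioning.Uncapped(-1, rows));           // 4800 partitions requested
Console.WriteLine(Partitioning.CappedToProcessors(-1, rows)); // at most Environment.ProcessorCount
Console.WriteLine(Partitioning.CappedToProcessors(4, rows));  // 4
Console.WriteLine(Partitioning.CappedToProcessors(8, 3));     // 3: never more partitions than work items

public static class Partitioning
{
    // Mirrors the expression under review: -1 means "one partition per work item".
    public static int Uncapped(int maxDegreeOfParallelism, int workItemCount)
        => maxDegreeOfParallelism == -1
            ? workItemCount
            : Math.Min(maxDegreeOfParallelism, workItemCount);

    // Alternative: treat -1 as Environment.ProcessorCount, then clamp to the work item count.
    public static int CappedToProcessors(int maxDegreeOfParallelism, int workItemCount)
    {
        int requested = maxDegreeOfParallelism == -1
            ? Environment.ProcessorCount
            : maxDegreeOfParallelism;

        return Math.Max(1, Math.Min(requested, workItemCount));
    }
}
```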

/// <inheritdoc />
protected override void OnFrameApply(ImageFrame<TPixel> source)
{
    using DrawingCanvas<TPixel> canvas = source.CreateCanvas(this.definition.Options);
Member

This does not propagate this.Configuration, so anything we pass to Mutate/Clone is lost. It might make sense to test the propagation too.

/// reinitialized later by calling <see cref="Acquire"/> again.
/// </remarks>
/// <exception cref="InvalidOperationException">Thrown when runtime leases are still active.</exception>
public static void Shutdown()
Member

@antonfirsov antonfirsov Apr 10, 2026

This is unused; consequently, leaseCount and the whole Lease mechanism are pointless.

/// Canvas frame adapter that exposes both a CPU region and a native surface.
/// </summary>
/// <typeparam name="TPixel">The pixel format.</typeparam>
public sealed class HybridCanvasFrame<TPixel> : ICanvasFrame<TPixel>
Member

I'm failing to find the intended use-case. Is there any test or example code using these?

public static void Main()
{
    // FIFO is the safest sample default: it presents in display order with normal v-sync behavior.
    using WebGPUWindow<Bgra32> window = new(new WebGPUWindowOptions
Member

Can one make this fullscreen?

/// <param name="destination">The destination image that receives the readback pixels.</param>
/// <param name="error">Receives the failure reason when readback cannot complete.</param>
/// <returns><see langword="true"/> when readback succeeds; otherwise <see langword="false"/>.</returns>
public bool TryReadbackInto(Image<TPixel> destination, [NotNullWhen(false)] out string? error)
Member

It is very unusual to pass around error information like this in .NET. There should be either error codes or exceptions.
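One possible shape for the error-code option is sketched below. Everything here is hypothetical (`ReadbackStatus`, `Readback.TryCopy`, and the failure cases themselves); it only shows a status enum replacing the `out string? error` parameter, using a plain byte copy in place of the GPU readback.

```csharp
using System;

byte[] source = { 1, 2, 3, 4 };
byte[] destination = new byte[2];

ReadbackStatus status = Readback.TryCopy(source, destination);
Console.WriteLine(status); // prints "DestinationTooSmall"

// Hypothetical failure reasons; the real set would come from the backend.
public enum ReadbackStatus
{
    Success,
    DestinationTooSmall,
    SourceUnavailable,
}

public static class Readback
{
    // Returns a status code instead of a bool plus an error string.
    public static ReadbackStatus TryCopy(ReadOnlySpan<byte> source, Span<byte> destination)
    {
        if (source.IsEmpty)
        {
            return ReadbackStatus.SourceUnavailable;
        }

        if (destination.Length < source.Length)
        {
            return ReadbackStatus.DestinationTooSmall;
        }

        source.CopyTo(destination);
        return ReadbackStatus.Success;
    }
}
```

Callers can branch on the enum without parsing text, and a throwing counterpart can translate the same status into an exception where that fits better.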

return;
}

WebGPUTextureTransfer.Release(this.TextureHandle, this.TextureViewHandle);
Member

@antonfirsov antonfirsov Apr 10, 2026

This dispose is problematic, since it would leak native handles when missed. The old-school practice to deal with this is to implement the dispose pattern; however, this is actively discouraged today. The preferred way is to wrap handles in a SafeHandle, since it allows the refcount to be incremented before a P/Invoke, so killing a handle during an outstanding P/Invoke would not lead to a crash.

The fact that this is likely an LLM-made mistake is very alarming to me. It's not something we would normally miss.
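As a rough illustration of the SafeHandle approach, the sketch below wraps a native resource so that release happens exactly once, even if Dispose is missed. `NativeBufferHandle` is a hypothetical type, and unmanaged memory from `Marshal.AllocHGlobal` stands in for the WebGPU texture handle; the real release call would be whatever the native API requires.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

using (NativeBufferHandle handle = NativeBufferHandle.Allocate(64))
{
    Console.WriteLine(handle.IsInvalid); // False: the allocation succeeded

    // P/Invoke signatures should take the SafeHandle itself rather than a raw IntPtr,
    // so the marshaller can hold a reference for the duration of the native call.
}

// Stand-in for a native texture handle; unmanaged memory plays the role of the GPU resource.
public sealed class NativeBufferHandle : SafeHandleZeroOrMinusOneIsInvalid
{
    private NativeBufferHandle()
        : base(ownsHandle: true)
    {
    }

    public static NativeBufferHandle Allocate(int byteCount)
    {
        var result = new NativeBufferHandle();
        result.SetHandle(Marshal.AllocHGlobal(byteCount));
        return result;
    }

    // Runs once, from Dispose or the finalizer, after outstanding references are released.
    protected override bool ReleaseHandle()
    {
        Marshal.FreeHGlobal(this.handle);
        return true;
    }
}
```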

@antonfirsov
Member

antonfirsov commented Apr 10, 2026

API

I find the ProcessWithCanvas method oddly verbose and hard to discover. Since the user actually wants to draw things here, how about naming the method Draw? Other simple names may also do the job.

image.Mutate(ctx => ctx.Draw(canvas =>
{
    // Fill a path
    canvas.Fill(Brushes.Solid(Color.Red), new EllipsePolygon(200, 200, 100));

    // Stroke a path
    canvas.Draw(Pens.Solid(Color.Blue, 3), new RectangularPolygon(50, 50, 200, 100));

    // Draw a polyline
    canvas.DrawLine(Pens.Solid(Color.Green, 2), new PointF(0, 0), new PointF(100, 100));
}));

Also, the image.Mutate(ctx => ctx.ProcessWithCanvas(canvas => line together with the closing parentheses is somewhat of an ergonomics killer. I think in most cases, Drawing users only want to draw things in their Mutate calls without chaining in further processing calls. It would be nice to provide helpers to avoid using double delegates; for example, .MutateDraw() would look like this:

image.MutateDraw(canvas =>
{
    // Fill a path
    canvas.Fill(Brushes.Solid(Color.Red), new EllipsePolygon(200, 200, 100));

    // Stroke a path
    canvas.Draw(Pens.Solid(Color.Blue, 3), new RectangularPolygon(50, 50, 200, 100));

    // Draw a polyline
    canvas.DrawLine(Pens.Solid(Color.Green, 2), new PointF(0, 0), new PointF(100, 100));
});

/// <param name="next">The next block in the retained chain.</param>
public unsafe LineArrayX32Y16Block(LineArrayX32Y16Block? next)
{
    this.lines = (PackedLineX32Y16*)NativeMemory.Alloc((nuint)LineCount, (nuint)sizeof(PackedLineX32Y16));
Member

@antonfirsov antonfirsov Apr 10, 2026

Since the size is constant, this could be an InlineArray to avoid unsafe native allocations and memory fragmentation.
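A sketch of what the InlineArray suggestion might look like follows. It assumes a .NET 8 / C# 12 target; `PackedLine`, `LineBuffer` and `LineBlock` are hypothetical stand-ins for the PR's types, and the element count of 32 is an assumed value for LineCount. The storage then lives inside the managed block object, so no NativeMemory.Alloc/Free pair or unsafe pointer is needed.

```csharp
using System;
using System.Runtime.CompilerServices;

var block = new LineBlock(next: null);
block.Set(0, new PackedLine(X0: 1, Y0: 2, X1: 3, Y1: 4));
Console.WriteLine(block.Get(0).X1); // prints "3"

// Hypothetical stand-in for PackedLineX32Y16; the real layout is not shown in the thread.
public readonly record struct PackedLine(int X0, short Y0, int X1, short Y1);

// Fixed-size inline storage: the runtime lays out 32 consecutive PackedLine elements.
[InlineArray(32)]
public struct LineBuffer
{
    private PackedLine element0;
}

public sealed class LineBlock
{
    // Stored inline in the object; no native allocation to free.
    private LineBuffer lines;

    public LineBlock(LineBlock? next) => this.Next = next;

    // The next block in the retained chain.
    public LineBlock? Next { get; }

    public PackedLine Get(int index) => this.lines[index];

    public void Set(int index, PackedLine value) => this.lines[index] = value;
}
```

The trade-off is that each block becomes a larger managed object, so whether this beats native allocation for very large chains would need measuring.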


2 participants