CP-313264: introduce faster migration datapath using kernel TLS (kTLS) instead of stunnel #7154

Open
mg12 wants to merge 2 commits into xapi-project:master from mg12:cp-313264-migration-ktls
mg12 commented Jun 30, 2026

Member

stunnel is an external process that provides secure TLS transport for the xenguest VM-migrate stream. Because it runs separately from xenguest, the dom0 kernel has to pipe the data between the two processes, which wastes CPU time and memory bandwidth. The result is slower VM migration and, consequently, slower host evacuation for the user.

This change adds an option to replace stunnel's TLS with kernel TLS (kTLS), which removes the inefficient data pipe between stunnel and xenguest. When the ktls option is enabled, the kernel transparently encrypts the data produced by xenguest, with no need to modify xenguest and no need to use stunnel.
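For readers unfamiliar with kTLS, here is a minimal sketch of how a userspace helper can hand an already-negotiated TLS 1.2 AES-256-GCM transmit key to the Linux kernel. It assumes the standard `<linux/tls.h>` interface. The names `tx_secrets`, `fill_crypto_info` and `enable_ktls_tx` are hypothetical and are not taken from this PR; the helper in the PR may be structured differently.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <linux/tls.h>

/* Fallbacks for older libc headers that lack these constants. */
#ifndef SOL_TLS
#define SOL_TLS 282
#endif
#ifndef TCP_ULP
#define TCP_ULP 31
#endif

/* Hypothetical container for the write-side secrets that a userspace
 * TLS 1.2 handshake (for example via OpenSSL) would produce. */
struct tx_secrets {
    unsigned char key[TLS_CIPHER_AES_GCM_256_KEY_SIZE];     /* 32 bytes */
    unsigned char salt[TLS_CIPHER_AES_GCM_256_SALT_SIZE];   /* 4-byte implicit IV */
    unsigned char iv[TLS_CIPHER_AES_GCM_256_IV_SIZE];       /* 8-byte explicit nonce */
    unsigned char seq[TLS_CIPHER_AES_GCM_256_REC_SEQ_SIZE]; /* 8-byte record sequence */
};

/* Copy the handshake secrets into the structure the kernel expects. */
static void fill_crypto_info(struct tls12_crypto_info_aes_gcm_256 *ci,
                             const struct tx_secrets *s)
{
    memset(ci, 0, sizeof(*ci));
    ci->info.version = TLS_1_2_VERSION;
    ci->info.cipher_type = TLS_CIPHER_AES_GCM_256;
    memcpy(ci->key, s->key, sizeof(ci->key));
    memcpy(ci->salt, s->salt, sizeof(ci->salt));
    memcpy(ci->iv, s->iv, sizeof(ci->iv));
    memcpy(ci->rec_seq, s->seq, sizeof(ci->rec_seq));
}

/* Attach the "tls" upper-layer protocol to a connected TCP socket and
 * install the transmit key. Returns 0 on success, -1 with errno set. */
static int enable_ktls_tx(int fd, const struct tx_secrets *s)
{
    struct tls12_crypto_info_aes_gcm_256 ci;

    if (setsockopt(fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls")) < 0)
        return -1; /* e.g. ENOENT when the tls module is not loaded */
    fill_crypto_info(&ci, s);
    return setsockopt(fd, SOL_TLS, TLS_TX, &ci, sizeof(ci));
}
```

Once `enable_ktls_tx` succeeds, plain `write` calls on the socket are framed and encrypted as TLS records by the kernel, which is why the sender does not need to know that TLS is involved.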

Measurements indicate significant improvements:

  • host-evacuate time: ~1.50x faster (stunnel time / kTLS time), depending on guest load. The higher the load, the larger the gain, because more guest data pages are transmitted (see details in the design page migration-tls.md).
  • dom0 CPU usage: ~20% less (see details in the design page migration-tls.md).

Activation uses a new xenopsd.conf flag, so the existing stunnel datapath remains the default.

Tested with:

  • performance tests
  • ring3 BVT

mg12ctx added 2 commits June 29, 2026 18:24
stunnel is an external process that provides secure TLS transport for the
xenguest VM-migrate stream. It's a separate process from xenguest, and
the dom0 kernel pipes the data between these processes, wasting cpu and
memory throughput. The result is a slower VM-migrate experience, and
consequently a slower host-evacuate experience for the user.

This change removes this inefficient data pipe between stunnel and xenguest,
with a new option to replace stunnel's TLS with kernel TLS (kTLS). When
this ktls option is enabled, the kernel transparently uses TLS to securely
transport the data produced by xenguest, without the need to modify
xenguest and without the need to use stunnel.

This new ktls option uses a small C helper that:
  (a) uses the same OpenSSL library as stunnel to perform the TLS
      handshake and install the symmetric key into the kernel for kTLS, and
  (b) hands the kTLS-enabled socket fd back to xenopsd over SCM_RIGHTS,
      which stunnel can't do.
xenopsd then treats the fd as an ordinary TCP socket; the kernel encrypts
on the way out so the stunnel pipe and the extra data copies disappear
from the xenguest sender side. The receiver side for now still uses stunnel
on the destination host, and can be updated to use kTLS as well in the
future.
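
As an illustration of step (b), here is a minimal sketch of passing a file descriptor between two processes over a Unix-domain socket with SCM_RIGHTS. The function names `send_fd` and `recv_fd` are hypothetical and the helper in this PR may differ; only the standard `sendmsg`/`recvmsg` control-message interface is assumed.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send `fd` across the connected Unix-domain socket `chan`.
 * Returns 0 on success, -1 on error. */
static int send_fd(int chan, int fd)
{
    char byte = 'F'; /* one data byte travels with the control message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { /* union guarantees cmsghdr alignment of the buffer */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof(int));

    return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor sent by send_fd(). Returns the new fd, or -1. */
static int recv_fd(int chan)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    int fd = -1;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(chan, &msg, 0) != 1)
        return -1;
    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    if (c == NULL || c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS)
        return -1; /* no descriptor arrived with the data byte */
    memcpy(&fd, CMSG_DATA(c), sizeof(int));
    return fd;
}
```

The receiving process gets its own descriptor referring to the same open socket, so any kTLS state already installed on that socket stays in effect after the handover.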

Measurements indicate significant improvements:
* host-evacuate time: ~1.50x faster (stunnel/kTLS), depending on the load
        of the guests (the higher the load the better, as more guest data
        pages are transmitted; see details in the design page).
* dom0 cpu usage: ~20% less (see details in the design page).

Activation uses a new xenopsd.conf flag so the existing path remains the
default and the two paths can be A/B compared on the same build:

    migration-tls = "ktls"      # use the helper
    migration-tls = "stunnel"   # explicit default
    migration-tls = ""          # currently defaults to original "stunnel"
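
The mapping from flag value to datapath can be sketched as follows. This is an illustration in C only: the real parsing lives in xenopsd, and the names `migration_tls` and `parse_migration_tls` are hypothetical. How unrecognised values are treated is an assumption, since the commit message does not say.

```c
#include <string.h>

enum migration_tls { MIGRATION_TLS_STUNNEL, MIGRATION_TLS_KTLS };

/* Map the migration-tls value onto a datapath. "" and "stunnel" select the
 * existing stunnel path. ASSUMPTION: unrecognised values also select
 * stunnel; the source does not describe how they are handled. */
static enum migration_tls parse_migration_tls(const char *value)
{
    if (value != NULL && strcmp(value, "ktls") == 0)
        return MIGRATION_TLS_KTLS;
    return MIGRATION_TLS_STUNNEL;
}
```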

If the helper fails for any reason before the fd is handed to the
migration (binary missing, TLS handshake error, tls.ko not loaded, cipher
rejected by the kernel, SCM_RIGHTS message lost, helper timeout)
then xenopsd logs a single warn line:

    migration-tls=ktls connect failed for <host>:<port> (<reason>);
    falling back to stunnel for this connection

and transparently falls back to the original Open_uri.with_open_uri so
the migration still proceeds using stunnel as before for the sender.
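
The fallback behaviour described above can be sketched like this. The real logic is in xenopsd (OCaml); the C below is only an illustration, and `connect_fn`, `connect_with_fallback` and the two stub connectors are hypothetical names. The warning text follows the log line quoted above.

```c
#include <stdio.h>

/* Hypothetical connector type: returns a connected fd, or -1 after storing
 * a short failure reason in *reason. */
typedef int (*connect_fn)(const char *host, int port, const char **reason);

/* Try the kTLS helper first. On any failure, log one warning and fall back
 * to the stunnel connector for this connection only. */
static int connect_with_fallback(connect_fn ktls, connect_fn stunnel,
                                 const char *host, int port,
                                 int *used_fallback)
{
    const char *reason = "unknown";
    int fd = ktls(host, port, &reason);

    *used_fallback = 0;
    if (fd >= 0)
        return fd;
    fprintf(stderr,
            "migration-tls=ktls connect failed for %s:%d (%s); "
            "falling back to stunnel for this connection\n",
            host, port, reason);
    *used_fallback = 1;
    return stunnel(host, port, &reason);
}

/* Stand-in connectors used only to exercise the sketch. */
static int stub_ktls_fail(const char *host, int port, const char **reason)
{
    (void)host;
    (void)port;
    *reason = "tls.ko not loaded";
    return -1;
}

static int stub_stunnel_ok(const char *host, int port, const char **reason)
{
    (void)host;
    (void)port;
    (void)reason;
    return 42; /* pretend fd */
}
```

Calling `connect_with_fallback(stub_ktls_fail, stub_stunnel_ok, ...)` logs the single warning and returns the stunnel descriptor, so the caller never sees the kTLS failure.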

Signed-off-by: Marcus Granado <marcus.granado@citrix.com>
Signed-off-by: Marcus Granado <marcus.granado@citrix.com>

andyhhp left a comment

Collaborator

The headline improvement here speaks for itself, and the proposed approach seems like an obvious way forward.

How applicable is this to other streams? Migrate might be the bulkiest data but it's not the only bulk data which is bouncing around userspace pipes. Storage migration also comes to mind, and RRDs too.

Comment on lines +34 to +36
socket with `setsockopt(SOL_TLS, ...)`, the kernel encrypts/decrypts every
subsequent `read`/`write` transparently, using AES-NI, producing the same
byte stream stunnel produces today.

Collaborator

What visibility/debuggability do we have on the kernel's choice of algorithm?

This approach gets more valuable when we ensure that dom0 can see and use all hardware accelerations. i.e. we probably want to be more proactive at checking details like this (and fixing if necessary) as part of supporting new CPUs.

Comment on lines +118 to +122
Conclusion: design C (kTLS) seems superior, as it reaches the goal with fewer
changes, is toolstack-only (no xen-devel upstream loop, no future xenguest
security maintenance), is a small focused tool rather than a change to the
highly-complex xenguest, leaves the stunnel option in place, and opens the way
to kTLS hardware offload later.

Collaborator

I agree that option C is the obvious choice. Option A is going to encounter firm resistance upstream.

stream is byte-identical to stunnel's and is accepted by the unchanged
destination stunnel.

* same handshake: TLS 1.2, cipher `ECDHE-RSA-AES256-GCM-SHA384`, ECDHE group

Collaborator

Perhaps not relevant to this PR, but why are we wasting time/effort with SHA384?

It's the SHA512 computation with the upper bits discarded at the end, so is strictly worse (security wise) without recovering any performance

| idle | 185s | 163s | 1.13x |
| medium (windows apps) | 249s | 171s | 1.46x |
| high (synthetic page thrasher) | 1341s | 835s | 1.60x |

Collaborator

This has times, but no information about the OS or VM size. I presume Win11, but the VM size matters greatly for measurements like this

Comment on lines +245 to +246
* kTLS hardware-offload NICs could move the bulk encryption off the cpu and make
the transfer faster still.

Collaborator

Can you expand on this some more? What should we be looking out for?

and is never more permissive. *)
["--no-verify"]
| Some cfg ->
["--ca"; cfg.Stunnel.cert_bundle_path]

Member

Should this be able to use peer (pinned) certificates as well as CA ones? This is probably going to be useful for cross-pool migrations.

See https://github.com/xapi-project/xen-api/blob/master/doc/content/design/trusted-certificates.md?plain=1#L228

let close_ignore fd =
Xapi_stdext_pervasives.Pervasiveext.ignore_exn (fun () -> Unix.close fd)

let finally = Xapi_stdext_pervasives.Pervasiveext.finally

Member

This could be reframed to avoid indentation on use:

let protect ~finally protected =
  Xapi_stdext_pervasives.Pervasiveext.finally protected finally

let ( let@ ) f x = f x

then on use:

  let@ () = protect ~finally:(fun () -> close_ignore sock_xenopsd) in
  let received_fd =
    try ...

stream is byte-identical to stunnel's and is accepted by the unchanged
destination stunnel.

* same handshake: TLS 1.2, cipher `ECDHE-RSA-AES256-GCM-SHA384`, ECDHE group

Member

Do the allowed ciphers still come from the same single source of truth?

4 participants