x64: Refactor backend condition handling #11097
Merged
fitzgen merged 1 commit into bytecodealliance:main on Jun 24, 2025
This commit refactors the x64 backend's handling of condition codes for
various instructions, with the goal of simplifying the conversion of a
`Value` to a condition code used for `select` and `brif`. It
additionally tries to deduplicate handling of condition codes across,
for example, `icmp` and `fcmp`.

Previously the x64 backend had types such as `IcmpCondResult` and
`FcmpCondResult` which represented the condition codes for `icmp` and
`fcmp` values. When used in `select`, `brif`, or `trap{n,}z`, each
lowering had to special-case `icmp` and `fcmp` to create these
`*CondResult` values and call specialized helpers to lower the value
back. Furthermore, for the `select` instruction this created a matrix of
ways-to-produce-the-condition-code crossed with
ways-to-handle-the-condition-code to conditionally move GPR, XMM, or
128-bit values. The end result was a fair bit of duplication across
rules, and optimizations applied to some rules but not others. For
example, `brif` on a `vall_true` value was optimized but `select` was not.

The design implemented in this commit is modeled after what the riscv64
backend currently does. There is a top-level helper `is_nonzero_cmp`
which takes a `Value` and produces a `CondResult`. This `CondResult`
merges `IcmpCondResult` and `FcmpCondResult` into one type. This
centralizes the ability to take any value and produce condition codes;
for example, it handles `icmp`, `fcmp`, 128-bit integers,
`v{all,any}_true`, etc. Once the condition is produced it's then handled
separately at each location for jumps, selects, traps, etc. In effect
`CondResult` serves as a "narrow waist" through which all production of
condition codes flows, meaning that if an optimization is added for
production of condition codes, all instructions benefit instead of each
one having to be updated by hand.
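
The "narrow waist" idea can be sketched in plain Rust. This is only a toy model, not the backend's code: the real rules are written in ISLE, and everything below except the names `is_nonzero_cmp` and `CondResult` is hypothetical (the `Cc` and `Value` enums, the `And`/`Or` variants, and both consumer functions are assumptions made for illustration).

```rust
/// Toy stand-in for an x64 condition code (hypothetical, not Cranelift's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Cc {
    Equal,
    NotEqual,
    Parity,
    NotParity,
}

/// Toy stand-in for a value that some instruction wants to test for "nonzero".
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Icmp(Cc), // integer compare: a single condition code suffices
    FcmpEq,   // float equality: "not parity AND equal" on x64
    FcmpNe,   // float inequality: "parity OR not equal" on x64
    Other,    // anything else: compare the value itself against zero
}

/// One result type shared by every producer of condition codes.
/// The variant set here is an assumption for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CondResult {
    Cc(Cc),
    And(Cc, Cc),
    Or(Cc, Cc),
}

/// The narrow waist: every consumer goes through this one function, so a
/// new pattern added here is picked up by all of them.
fn is_nonzero_cmp(v: &Value) -> CondResult {
    match v {
        Value::Icmp(cc) => CondResult::Cc(*cc),
        Value::FcmpEq => CondResult::And(Cc::NotParity, Cc::Equal),
        Value::FcmpNe => CondResult::Or(Cc::Parity, Cc::NotEqual),
        Value::Other => CondResult::Cc(Cc::NotEqual),
    }
}

/// Consumer 1: a conditional branch, rendered as text for illustration.
fn lower_brif(v: &Value) -> String {
    match is_nonzero_cmp(v) {
        CondResult::Cc(cc) => format!("jmp_if {:?}", cc),
        CondResult::And(a, b) => format!("jmp_if_and {:?} {:?}", a, b),
        CondResult::Or(a, b) => format!("jmp_if_or {:?} {:?}", a, b),
    }
}

/// Consumer 2: how many conditional moves a select would need in this model.
fn select_cmov_count(v: &Value) -> usize {
    match is_nonzero_cmp(v) {
        CondResult::Cc(_) => 1,
        CondResult::And(..) | CondResult::Or(..) => 2,
    }
}
```

With this shape, the number of match arms grows as producers plus consumers rather than producers times consumers, which is the duplication the refactor set out to remove.
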

The goal of this refactoring was to change only internals and not any
lowerings, but it exposed a few non-optimal lowerings in the backend,
which were improved as a result of this change. In the future I plan to
add more ways to pattern-match "produces a condition", which will now
benefit all of these locations equally.
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request on Jul 15, 2025
This commit fixes an accidental regression from bytecodealliance#11097 where a `select_spectre_guard` with a boolean condition that and'd two CCs together would fail to lower and cause a panic during lowering. This was reachable when explicit bounds checks are enabled from wasm, for example. The fix here is to handle the `And` condition in the same way that lowering `select` does which is to model that as it flows into the select helper.
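
As a rough illustration of the failure mode, the sketch below models a guard-style select whose condition may be the conjunction of two condition codes. It is a semantic toy under stated assumptions, not the actual lowering: the `Cond` enum, the function names, and the choice to model `And` as two chained conditional moves are all hypothetical, and the booleans stand in for already-evaluated flags.

```rust
/// Toy condition: either one flag test or the AND of two (hypothetical type).
#[derive(Debug, Clone, Copy)]
enum Cond {
    Single(bool),
    And(bool, bool),
}

/// Conditional move: yields `src` when `take` is set, otherwise keeps `dst`.
fn cmov(take: bool, src: u64, dst: u64) -> u64 {
    if take {
        src
    } else {
        dst
    }
}

/// Incomplete shape: the `And` case is not covered. The real regression was
/// a panic during lowering; here the gap is reported as `None` instead.
fn select_guard_partial(cond: Cond, if_true: u64, if_false: u64) -> Option<u64> {
    match cond {
        Cond::Single(c) => Some(cmov(!c, if_false, if_true)),
        Cond::And(..) => None,
    }
}

/// Complete shape: `And` is handled with two conditional moves, each taking
/// the false value when its own condition fails.
fn select_guard(cond: Cond, if_true: u64, if_false: u64) -> u64 {
    match cond {
        Cond::Single(c) => cmov(!c, if_false, if_true),
        Cond::And(a, b) => {
            let tmp = cmov(!a, if_false, if_true);
            cmov(!b, if_false, tmp)
        }
    }
}
```

The point of the sketch is only that every consumer of a shared condition type has to cover every variant the producer can emit.
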
github-merge-queue bot pushed a commit that referenced this pull request on Jul 15, 2025
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request on Jul 15, 2025: …liance#11242)
alexcrichton added a commit that referenced this pull request on Jul 22, 2025: …11248)
bongjunj pushed a commit to prosyslab/wasmtime that referenced this pull request on Oct 20, 2025
bongjunj pushed a commit to prosyslab/wasmtime that referenced this pull request on Oct 20, 2025: …liance#11242)
cfallin added a commit to cfallin/wasmtime that referenced this pull request on Jan 13, 2026
We had a set of rules introduced in bytecodealliance#11097 that attempted to optimize the case of testing the result of an `icmp` for a nonzero value. This allowed optimization of, for example, `(((x == 0) == 0) == 0 ...)` to a single level, either `x == 0` or `x != 0` depending on even/odd nesting depth.

Unfortunately this kind of recursion in the backend has a depth bounded only by the user input, hence creates a DoS vulnerability: the wrong kind of compiler input can cause a stack overflow in Cranelift at compilation time. This case is reachable from Wasmtime's Wasm frontend via the `i32.eqz` operator (for example) as well.

Ideally, this kind of deep rewrite is best done in our mid-end optimizer, where we think carefully about bounds for recursive rewrites. The left-hand sides for the backend rules should really be fixed shapes that correspond to machine instructions, rather than ad-hoc peephole optimizations in their own right.

This fix thus simply removes the recursion case that causes the blowup.

The patch includes two tests: one with optimizations disabled, showing correct compilation (without the fix, this case fails to compile with a stack overflow), and one with optimizations enabled, showing that the mid-end properly cleans up the nested expression and we get the expected one-level result anyway.
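
To make the depth problem concrete, here is a small Rust sketch of the nested `(((x == 0) == 0) == 0 ...)` shape. It is not the backend rule and not the fix itself (the fix removes the recursive case and relies on the mid-end); the `Expr` and `ZeroTest` types and all function names are hypothetical. It only contrasts a rewrite that uses one stack frame per nesting level with one that walks the same structure in constant stack space.

```rust
/// Toy expression: a variable, or a comparison of an inner expression with zero.
enum Expr {
    Var,
    IsZero(Box<Expr>),
}

/// What "is this expression nonzero?" collapses to, in terms of `x`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ZeroTest {
    EqZero, // equivalent to `x == 0`
    NeZero, // equivalent to `x != 0`
}

impl ZeroTest {
    fn flip(self) -> ZeroTest {
        match self {
            ZeroTest::EqZero => ZeroTest::NeZero,
            ZeroTest::NeZero => ZeroTest::EqZero,
        }
    }
}

/// Recursive shape: stack depth equals nesting depth, which the input controls.
fn collapse_recursive(e: &Expr) -> ZeroTest {
    match e {
        Expr::Var => ZeroTest::NeZero,
        Expr::IsZero(inner) => collapse_recursive(inner).flip(),
    }
}

/// Iterative shape: same answer, but stack usage does not grow with depth.
fn collapse_iterative(mut e: &Expr) -> ZeroTest {
    let mut result = ZeroTest::NeZero;
    while let Expr::IsZero(inner) = e {
        result = result.flip();
        e = &**inner;
    }
    result
}

/// Builds `depth` layers of `== 0` around a variable.
fn nest(depth: usize) -> Expr {
    let mut e = Expr::Var;
    for _ in 0..depth {
        e = Expr::IsZero(Box::new(e));
    }
    e
}
```

An odd number of layers collapses to `x == 0` and an even number to `x != 0`, matching the even/odd behaviour described above.
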
github-merge-queue bot pushed a commit that referenced this pull request on Jan 13, 2026
* Cranelift: x64: fix user-controlled recursion in cmp emission.
* Preserve codegen on branches. This change works by splitting a rule so that the entry point used by `brif` lowering can still peel off one layer of `icmp` and emit it directly, without entering the unbounded structural recursion. It also adds a mid-end rule to catch one case that we were previously catching in the backend only: `fcmp(...) != 0`.
cfallin added a commit to cfallin/wasmtime that referenced this pull request on Jan 13, 2026: …odealliance#12333)
cfallin added a commit that referenced this pull request on Jan 14, 2026: … (#12338)
cfallin added a commit that referenced this pull request on Jan 14, 2026: … (#12341)
cfallin added a commit that referenced this pull request on Jan 14, 2026: … (#12339)
cfallin added a commit that referenced this pull request on Jan 14, 2026: … (#12342)