Commit graph

220914 commits

Author SHA1 Message Date
Marek Olšák
2aa9ec5018 amd/packets: fix the size of 1-bit bitfields
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15137

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40588>
2026-04-10 03:42:44 +00:00
Alexander Koskovich
f560760b27 freedreno/common: add support for the Adreno 810
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Add support for the Adreno 810 found on the SM7635 (milos).

Signed-off-by: Alexander Koskovich <akoskovich@pm.me>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40613>
2026-04-10 01:24:59 +00:00
Faith Ekstrand
432a298f67 pan/bi: Vectorize more conversions
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40769>
2026-04-09 18:08:40 -04:00
Faith Ekstrand
37dcfcc6d0 pan/bi: Handle vector 16-bit extract_[ui]8
The old implementation only worked for 16-bit because we assumed scalar
so we could stomp the whole destination as if it was 32-bit.  This
version works for v2i16.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40769>
2026-04-09 18:08:40 -04:00
Faith Ekstrand
91e6507665 nir: Add a nir_alu_src_comp_as_uint() helper
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40769>
2026-04-09 18:08:40 -04:00
Faith Ekstrand
567bc7a8df pan/bi: Simplify extract_i8 handling
Now that bi_byte() does the right thing, we can just use it and not
worry about the rest.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40769>
2026-04-09 18:08:40 -04:00
Faith Ekstrand
a88e724b6e pan/nir: Use minimum-width constants instead of scalar
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40769>
2026-04-09 18:08:39 -04:00
Faith Ekstrand
f138d13672 pan/nir: Stop being so conservative about phi scalarizing
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40769>
2026-04-09 18:08:25 -04:00
Faith Ekstrand
eca0575069 pan/nir: Stop doing manual optimization after resize_varying_io
We call bi_optimize_nir() a few lines later.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40769>
2026-04-09 18:07:53 -04:00
Faith Ekstrand
c8f61b6b0e pan/bi: Handle arbitrary size constants
This isn't hard to do and it gives us a lot more flexibility in what NIR
we can consume.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40769>
2026-04-09 18:05:21 -04:00
Faith Ekstrand
0e5626d717 pan/bi: Use nir_src_is_zero()
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40769>
2026-04-09 18:05:20 -04:00
Faith Ekstrand
fb347b8458 nir: Add a couple is_zero() helpers
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40769>
2026-04-09 18:05:20 -04:00
Faith Ekstrand
fd5c6d1223 pan/bi: Support all the swizzles in the packer
Add asserts this time that we don't miss any and that the buckets
actually match the enum in bifrost/compiler.h.

Fixes: 82328a5245 ("pan/bi: Generate instruction packer for new IR")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40769>
2026-04-09 18:05:20 -04:00
Faith Ekstrand
ab285efd1b pan/bi: Add BI_SWIZZLE_NONE
Fixes: 82328a5245 ("pan/bi: Generate instruction packer for new IR")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40769>
2026-04-09 18:05:20 -04:00
Faith Ekstrand
48b2e6b551 pan/bi: Delete BI_SWIZZLE_1123
It appears nowhere so I don't know why we have it in the enum.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40769>
2026-04-09 18:05:20 -04:00
Kenneth Graunke
0b99c88337 nir, brw: lower scratch in NIR
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This will let us share a common scratch swizzling between brw and jay.

Changes by Ken:
- Use an immediate SIMD width when known so we don't need to re-lower
- Switch to load_simd_width_intel because it may not match
  info->api_subgroup_size on Vulkan without VK_EXT_subgroup_size_control
- Stop using DWord Scattered Write messages for scratch.  These take an
  offset in DWords, and our offsets are now always in bytes.  This also
  means that we no longer create MEMORY_OPCODE_* IR with inconsistent
  units of either bytes or dwords.  Yikes.  We use byte scattered
  messages now.

fossil-db stats on Battlemage:

   Instrs: 500477504 -> 500450056 (-0.01%); split: -0.01%, +0.00%
   CodeSize: 7807432368 -> 7806786192 (-0.01%); split: -0.01%, +0.00%
   Cycle count: 62404008370 -> 62398437734 (-0.01%); split: -0.01%, +0.00%
   Fill count: 546690 -> 546695 (+0.00%); split: -0.00%, +0.00%
   Max live registers: 141257956 -> 141258100 (+0.00%); split: -0.00%, +0.00%
   Non SSA regs after NIR: 72350283 -> 72336544 (-0.02%)

   Totals from 99 (0.01% of 1581969) affected shaders:
   Instrs: 366593 -> 339145 (-7.49%); split: -7.58%, +0.09%
   CodeSize: 6425936 -> 5779760 (-10.06%); split: -10.06%, +0.00%
   Cycle count: 2412009876 -> 2406439240 (-0.23%); split: -0.26%, +0.03%
   Fill count: 19675 -> 19680 (+0.03%); split: -0.02%, +0.04%
   Max live registers: 17600 -> 17744 (+0.82%); split: -0.09%, +0.91%
   Non SSA regs after NIR: 37894 -> 24155 (-36.26%)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
2026-04-09 21:02:16 +00:00
Alyssa Rosenzweig
140616d26a brw: scalarize even 64-bit scratch access
No, I don't know how this worked before, thanks for asking.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
2026-04-09 21:02:16 +00:00
Alyssa Rosenzweig
15b11635a2 brw: Move intel_nir_opt_peephole_imul32x16 later in compilation
(Split by Ken out of a patch authored by Alyssa.)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
2026-04-09 21:02:16 +00:00
Kenneth Graunke
e5598166b0 brw: Have brw_nir_apply_key call brw_nir_lower_simd for all stages
brw_nir_apply_key typically knows the dispatch width (it's fixed for
geometry stages, and we clone the NIR for compute and mesh shaders).
For compute/mesh, this was the very next thing called.  For the others,
if we know the width, there's no reason not to lower it.

Scratch lowering will start using load_simd_width_intel soon, so we
need it to work in all stages.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
2026-04-09 21:02:16 +00:00
Kenneth Graunke
765d74eebe brw: Set nir->info.{min,max}_subgroup_size in brw_nir_apply_key
This records the actual SIMD width we selected for the shader, in
all cases except fragment shaders, where we don't know it yet.

MR 37258 notes that "Backends can update [these fields] when they make
new decisions about the subgroup size" - which is what we now do.

Note that nir->info.api_subgroup_size may be different than min/max
subgroup size on Vulkan prior to SPV1.6/VK_EXT_subgroup_size_control,
so we do not alter that.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
2026-04-09 21:02:16 +00:00
Kenneth Graunke
d7d2d7aceb brw: Support load_simd_width_intel for fragment shaders
This lets us emit NIR code based on the SIMD size.  For non-fragment
stages, we'll replace it with a constant and optimize, but for FS,
we delay it until the backend.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
2026-04-09 21:02:16 +00:00
Kenneth Graunke
cac9f670d1 intel/compiler: Use nir_static_workgroup_size helper
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
2026-04-09 21:02:16 +00:00
Connor Abbott
82b3db7e06 tu: Enable multiviewGeometryShader
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40153>
2026-04-09 20:34:58 +00:00
Connor Abbott
73ab56fd2e tu: Lower maxMultiviewViewCount to 6
With multiview, the HW has to dispatch at least ViewCount * 6 fibers per
primitive, since there are up to 6 VS threads per primitive. The HW can
launch multiple GS waves per VS wave but one VS wave must contain the
entire primitive. With ViewCount = 16 there are 96 fibers per
primitive, which is more than we can launch in one wave. To fix this,
lower the maximum view count.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40153>
2026-04-09 20:34:58 +00:00
Connor Abbott
3433e53da7 tu: Fill GS/DS ViewID register fields
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40153>
2026-04-09 20:34:58 +00:00
Connor Abbott
f497e3913b tu: Adjust multiview lowering for GS
When there is a GS, run multiview lowering for the VS and multiply the
per-primitive varying stride by the view count since all outputs are now
per-view.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40153>
2026-04-09 20:34:58 +00:00
Connor Abbott
be84cb6211 ir3: Support multiview in GS lowering
With GS+multiview, the VS will loop over each view in the shader while
each GS invocation only corresponds to a single view. Varyings for each
view will be stored next to each other in local memory. Implement view
index calculations when lowering VS outputs/GS inputs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40153>
2026-04-09 20:34:58 +00:00
Connor Abbott
bc72ef2ee9 ir3: Implement ViewIndex for GS
For GS, the ViewIndex is passed through from the DS/VS in a
similar manner to PrimID.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40153>
2026-04-09 20:34:58 +00:00
Connor Abbott
2d4bb4cdc6 freedreno: Name GS/DS ViewID register fields
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40153>
2026-04-09 20:34:58 +00:00
Olivia Lee
31ddfe26eb panfrost: don't try to emit varying shader stats on v12+
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
On v12+, IDVS no longer has separate position and varying variants, so
we only need to emit stats for one binary. Attempting to emit stats for
the nonexistent varying shader breaks shader-db.

Fixes: 7819b103fa ("pan/bi: Add support for IDVS2 on Avalon")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40810>
2026-04-09 18:21:12 +00:00
Olivia Lee
43b85b151b panvk/csf: enable allow_merging_workgroups when possible
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Now that all of the additional cases are handled, we can hook up the
allow_merging_workgroups flag in panvk.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Caterina Shablia <caterina.shablia@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38586>
2026-04-09 17:53:46 +00:00
Olivia Lee
a5a3036972 panvk/csf: lower divergent values introduced by merged workgroups
Mali does not support divergent operands in some cases, and we are
already using lower_non_uniform_access to handle this for descriptor
indexing. We can extend this to handle merged workgroups by just tagging
every intrinsic as nonuniform and then letting divergence analysis sort
out which ones can actually be nonuniform in opt_non_uniform_access.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38586>
2026-04-09 17:53:46 +00:00
Olivia Lee
e9ca69b807 panvk/csf: take merged workgroups into account for divergence
Merging workgroups affects divergence analysis, since subgroups can now
contain extra threads from other workgroups. We already have divergence
analysis flags to handle this case, but since the compiler options memory
is static, we need to define an entirely separate option set for merged
vs non-merged workgroups.

In gallium, we don't have to switch options because opengl requires
uniformity over the entire dispatch in application shaders.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38586>
2026-04-09 17:53:46 +00:00
Olivia Lee
c42e124a66 pan/va: don't merge workgroups when subgroups are used
Vulkan guarantees that all subgroup invocations will be part of the same
workgroup, so we need to disable merging workgroups for shaders where
the subgroup layout is observable.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38586>
2026-04-09 17:53:46 +00:00
Olivia Lee
a0f6c6d84d pan/va: move allow_merging_workgroups decision to drivers
In panvk, we will need to decide whether we are merging workgroups early
in shader compilation, before calling nir_lower_non_uniform_access. This
is because nonuniform lowering introduces new subgroup intrinsics which
would otherwise inhibit workgroup merging, and because the set of
instructions that need to be lowered may be different with merged
workgroups.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38586>
2026-04-09 17:53:46 +00:00
Olivia Lee
1f75299ebb pan/va: weaken barrier requirements for allow_merging_workgroups
The only requirement for barriers is that the hardware doesn't support
allow_merging_workgroups with actual BARRIER instructions. We only emit
these for workgroup execution barriers though, so are safe to merge
workgroups when the shader uses memory barriers or subgroup execution
barriers.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Caterina Shablia <caterina.shablia@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38586>
2026-04-09 17:53:46 +00:00
Dhruv Mark Collins
46aac5abaf tu: Only emit preempt optimization ambles when active
This avoid unnecessarily emitting the switch back/away ambles when
they aren't actually used due to preemption optimization being
disabled. This alleviates unnecessary overhead when not running with
the mitigation for kernel drivers which support it.

Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40852>
2026-04-09 17:29:18 +00:00
Dhruv Mark Collins
18437c7a65 tu: Disable features using performance counter for KGSL
KGSL doesn't support reading of performance counters by writing to
the selector registers directly from a userspace CS, instead these
requests need to be routed via the KGSL uAPI for perf counters.

Certain Turnip features which use performance counters such as
KHR_performance_query as well as preempt-optimize mode in autotune
are now explicitly disabled to reflect this.

Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40852>
2026-04-09 17:29:18 +00:00
Samuel Pitoiset
63d55d84a3 radv: replace remaining occurrences of VK_ACCESS_xxx
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40845>
2026-04-09 15:58:06 +00:00
Samuel Pitoiset
f33757416b radv/meta: remove an outdated comment in vkCmdClearAttachments()
This is no longer true since
"ae84d41d483 - radv/meta: Rework saving/restoring state".

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40845>
2026-04-09 15:58:06 +00:00
Mike Blumenkrantz
166c68914b zink: use EXT_primitive_restart_index
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40784>
2026-04-09 15:37:29 +00:00
Roland Scheidegger
76cdbcb96a llvmpipe: disable denorms in compute shaders on x86/sse
For consistency with other shader stages (required by d3d, neither GL nor
Vulkan really care). A bit awkward since we don't want to disable them for
things like rusticl, which we should be able to distinguish with shader type.
Note that to satisfy d3d requirements, disabling denorms in general is not
sufficient, due to d3d requiring them to be disabled for single precision
opcodes, but enabled for double precision ones, and x86 can't switch that
individually (hence will need per-instruction tracking and switching inside
the shader).

Reviewed-by: Brian Paul <brian.paul@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40787>
2026-04-09 15:10:00 +00:00
Roland Scheidegger
c20106f1e4 llvmpipe: don't rely on cpu denorms for float to smallfloat conversion
Similar to what we already do for smallfloats to floats, handle denorms
and normals separately with bit manipulation stuff rather than rely on
a rescale mul which depends on cpu denorms.
This is a bit more complex, but on the upside we don't need to track
fpstate for denorms anymore in llvmpipe backend. (With modern x86 cpus
this is essentially only really relevant for r11g11b10 float format,
since f16 formats are using f16c instructions.)

Reviewed-by: Brian Paul <brian.paul@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40787>
2026-04-09 15:10:00 +00:00
Roland Scheidegger
2ba62d1502 llvmpipe: get rid of unused code in float to small float code
We really don't want to use the slightly faster but slightly incorrect
(wrt NaN preservation etc.) code.

Reviewed-by: Brian Paul <brian.paul@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40787>
2026-04-09 15:10:00 +00:00
Olle Lögdahl
c69da756d1 aco/isel: added test-case for iterative cf visitor
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
isel.cf.deep_traversal is a new ACO test that verifies
that the iterative nir cf visitor allows arbitrary depth.
A depth of 10000 would cause a stack overflow on x86-64 linux
(4096 kB stack) for the old recursive code. This test
is by default not enabled.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40364>
2026-04-09 13:46:23 +00:00
Olle Lögdahl
aa49d69ea0 aco/isel: use iterative visitor during traversal
When iterating control-flow recursively, we always run the
risk of causing a stack overflow if the control-flow
depth is too large. This patch resolves this by visiting
control-flow nodes in an iterative way, managing an explicit
stack on the heap.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40364>
2026-04-09 13:46:23 +00:00
Daniel Schürmann
37e2deab74 aco/isel: Remove if_context* parameter from begin_if() / end_if() helper functions
We can transparently create the context inside the functions, now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40364>
2026-04-09 13:46:23 +00:00
Daniel Schürmann
53836320a9 aco/isel: Remove loop_context* parameter from begin_loop() / end_loop() helper functions
We can transparently create the context inside the functions, now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40364>
2026-04-09 13:46:22 +00:00
Olle Lögdahl
5c1dea7ee4 aco/isel: move if_context and loop_context to heap
if_context and loop_context are large structs and may cause
stack overflows during CF traversal. This fix moves them to
the heap.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40364>
2026-04-09 13:46:22 +00:00
Collabora's Gfx CI Team
909e0026d8 Uprev VVL to cb2acdf7f49053406770ae73cbb315229a9131eb
adfdda5b66...cb2acdf7f4

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40618>
2026-04-09 12:59:00 +00:00