Commit graph

219708 commits

Author SHA1 Message Date
Mary Guillemard
6a8d09972e nir: Add isbewr_nv intrinsic and extends isberd_nv
Adds a new intrinsic allowing to do raw write in the various ISBE spaces
where attributes are stored.

This also adapt isberd_nv to map to what we have since SM70+.

This will be used to support mesh shaders.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39716>
2026-03-11 19:41:33 +00:00
Marek Olšák
796e1749c6 ac: replace some packet field definitions in sid.h by generated ones
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40183>
2026-03-11 18:54:20 +00:00
Marek Olšák
1c75cd958f ac: enable the new auto-generated CP packet parser
This keeps old packets that were removed from newer HW, packets that set
registers, and packets using non-trivial custom code.

It preserves address checking that was done in print_addr.

Packet names still used the old generator.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40183>
2026-03-11 18:54:20 +00:00
Marek Olšák
c2d01d6fd5 amd: generate a packet parser/printer automatically from packet definitions
The next commit will enable this.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40183>
2026-03-11 18:54:20 +00:00
Marek Olšák
aeab5057a8 amd/packets: remove non-existent CLEAR_STATE from gfx12 definitions
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40183>
2026-03-11 18:54:20 +00:00
Georg Lehmann
07d5c2cbaa radv: set no_signed_zero for FS store_output when format doesn't care
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Or when blending is enabled, because blending doesn't require IEEE math.

Foz-DB Navi21:
Totals from 367 (0.32% of 114627) affected shaders:
MaxWaves: 7884 -> 7876 (-0.10%); split: +0.20%, -0.30%
Instrs: 354948 -> 354386 (-0.16%); split: -0.16%, +0.00%
CodeSize: 1905980 -> 1903172 (-0.15%); split: -0.15%, +0.00%
VGPRs: 20208 -> 20216 (+0.04%); split: -0.08%, +0.12%
Latency: 1855670 -> 1854973 (-0.04%); split: -0.06%, +0.02%
InvThroughput: 540792 -> 539688 (-0.20%); split: -0.20%, +0.00%
PreSGPRs: 18426 -> 18366 (-0.33%)
PreVGPRs: 17213 -> 17249 (+0.21%); split: -0.05%, +0.26%
VALU: 258793 -> 258237 (-0.21%); split: -0.22%, +0.00%
SALU: 35168 -> 35166 (-0.01%); split: -0.01%, +0.01%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40323>
2026-03-11 16:47:15 +00:00
Georg Lehmann
769606e2e6 nir/opt_fp_math_ctrl: handle input/output no_signed_zero flag
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40323>
2026-03-11 16:47:15 +00:00
Georg Lehmann
0d747eee88 nir: add no_signed_zero flag to io semantics
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40323>
2026-03-11 16:47:15 +00:00
Georg Lehmann
26f5a6d6cc nir: fix nir_intrinsic_copy_const_indices for large indices
Fixes: 4ba581887e ("nir: support intrinsic indicies larger than 32 bits")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40323>
2026-03-11 16:47:15 +00:00
Mike Blumenkrantz
c9b2986607 egl/device: fix the fix for explicit sw rejection in non-sw EGL_PLATFORM=device
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
"explicit sw" means llvmpipe, which cannot be a real drm device. this requires also
returning only a single device so as to avoid leaking non-sw drivers

should fix LIBGL_ALWAYS_SOFTWARE=1 eglinfo

Fixes: 8a339cdebc ("egl: fix sw fallback rejection in non-sw EGL_PLATFORM=device")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40069>
2026-03-11 16:03:19 +00:00
Job Noorman
5e4a7d01fe ir3: don't predicate vote_all/vote_any
These get lowered to control flow which isn't allowed inside predicated
blocks.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 39088571f0 ("ir3: add support for predication")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40345>
2026-03-11 15:13:07 +00:00
Job Noorman
f88e8b778d ir3: update context builder after ir3_get_predicate
If we are currently inserting instructions after the src of the
predicate conversion, uses of the predicate will be inserted before its
def (the conversion). Fix this by updating the context builder to point
to after the conversion.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fda91b49d7 ("ir3: refactor builders to use ir3_builder API")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15043
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40336>
2026-03-11 14:19:42 +00:00
Christian Gmeiner
65216acd82 panvk: Implement VK_EXT_memory_budget support
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40246>
2026-03-11 08:36:05 +00:00
Christian Gmeiner
0d75cb479e docs/features: VK_VALVE_mutable_descriptor_type: Add missing version info
VK_EXT_mutable_descriptor_type is exposed on panvk/v9+ and the same applies
to VK_VALVE_mutable_descriptor_type.

Fixes: 266160fe4e ("panvk: Advertise VK_VALVE_mutable_descriptor_type")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40316>
2026-03-11 08:20:13 +00:00
Samuel Pitoiset
dfdaf6a277 radv: rewrite a comment explaining why PFP waits for ME with streamout
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40327>
2026-03-11 07:34:45 +00:00
Samuel Pitoiset
d9420eed9e radv: fix missing L2 cache invalidation with streamout on GFX12
COPY_DATA emitted from the CP isn't coherent with L2, in case the
buffer filled size needs to be copied.

This fixes rare and random flickering with Mafia 3 Definitive Edition
on RDNA4.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14697
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40327>
2026-03-11 07:34:45 +00:00
Marek Olšák
88a675c1f3 radeonsi: remove AMD_TEST=blitperf
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
It's not needed anymore. It has been ported to gpu-ratemeter, improved, and
has Vulkan support.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pelloux@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40311>
2026-03-11 00:08:05 -04:00
Marek Olšák
1c3883bf09 radeonsi: add debug options forcing fast clear, gfx and compute blits
so that we can isolate their performance for gpu-ratemeter

Reviewed-by: Pierre-Eric Pelloux-Prayer <pelloux@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40311>
2026-03-11 00:08:04 -04:00
Marek Olšák
51b2e6f4c2 radeonsi: don't fail si_compute_blit for compressed/subsampled formats properly
Reviewed-by: Pierre-Eric Pelloux-Prayer <pelloux@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40311>
2026-03-11 00:08:02 -04:00
Nanley Chery
eb8883f3ef intel/blorp: Redescribe surfaces for copies
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
When copying data between two surfaces, independently increase the size
of each surface's format (bits-per-pixel) as alignment constraints
allow. Adjust the other surface parameters and blorp_copy() parameters
accordingly.

This fixes copies between the 16bpp YCRCB formats and 32bpp formats:
dEQP-VK.ycbcr.single_plane_copy.linear.linear.r8g8b8a8_to_g8b8g8r8_422
This new test failure was reported by Iván Briano.

More generally, this increases the efficiency of our copies. As shown in
the configuration pages of the PRMs, our sampler is able to fetch texels
at a fixed rate of texels / clock regardless of the texel size
(presumably our rendering hardware has similar behavior). By using the
largest texel size possible, we can transfer more bits / clock.

Improves the performance of a number of traces in the performance CI for
BMG:

* TotalWarWarhammer3 +2.24%
* Payday3 	     +1.87%
* BaldursGate3 	     +1.34%
* Control 	     +1.25%
* TotalWarPharaoh    +1.22%

Four additional traces are helped between +0.44% and +0.96%.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
2026-03-11 00:36:19 +00:00
Nanley Chery
73796c7245 intel/blorp: Add blorp_surf_convert_to_single_level_tile()
Convert a Tile64/Yf/Ys surface to a single level or a single miptail.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
2026-03-11 00:36:19 +00:00
Nanley Chery
9351dbfb25 intel/blorp: Use stencil hardware less for CPB copies
Don't use it without ISL_AUX_USAGE_STC_CCS. With a future patch, this
will allow blorp_copy() calls to increase the size of the surface format
for CPB.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
2026-03-11 00:36:19 +00:00
Nanley Chery
20bf27f2a8 intel/blorp: Make blorp_copy() format queries aux-dependent
blorp_copy() will soon start changing the format in a way which drivers
cannot rely on to do things like manage the texture cache (see iris).

Narrow down the scope of blorp_copy_get_formats() and
blorp_copy_get_color_format() such that the returned value can only be
trusted if compression would be enabled on each image.

By doing this (and adapting iris to reflect this), we'll get the
required flushes on the platforms which need
WaSamplerCacheFlushBetweenRedescribedSurfaceReads:

* On the platforms which need the workaround for all formats,
  blorp_copy() will stick with the queried format on compressed
  surfaces.
* On the platforms which need the workaround when switching from ASTC
  and non-ASTC formats, blorp_copy() may actually change the queried
  format on compressed surfaces. This is not a problem, because
  surfaces which may be read with ASTC formats are not compressible.

Prevents gfx9 from failing tests under:
* KHR-GL46.copy_image.functional_src_target_texture_2d_array_src_format_r3_g3_b2*
* KHR-GL46.copy_image.functional_src_target_texture_2d_array_src_format_rgb5*
* KHR-GL46.copy_image.functional_src_target_texture_2d_array_src_format_rgba2*
* KHR-GL46.copy_image.functional_src_target_texture_2d_array_src_format_rgba4*

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
2026-03-11 00:36:18 +00:00
Nanley Chery
7fffd67803 anv: Add WaSamplerCacheFlushBetweenRedescribedSurfaceReads
With upcoming blorp_copy() changes, this avoids the following failures
with zink on gfx9:
* dEQP-GLES3.functional.texture.specification.basic_teximage3d.r8_2d_array
* dEQP-GLES3.functional.texture.specification.basic_teximage3d.r8_snorm_2d_array
* dEQP-GLES3.functional.texture.specification.basic_teximage3d.r8i_2d_array
* dEQP-GLES3.functional.texture.specification.basic_teximage3d.r8ui_2d_array

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
2026-03-11 00:36:18 +00:00
Nanley Chery
465c186fc5 anv: Prepare for format width changes in blorp_copy()
blorp_copy() will soon gain the ability to increase the format bpb.
Prepare anv by replicating the clear color pixel on gfx12.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
2026-03-11 00:36:18 +00:00
Nanley Chery
d993e0dc47 intel/blorp: Add blorp_surf::has_replicated_pixel
This allows blorp_copy() to widen a surface format width in some cases.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
2026-03-11 00:36:17 +00:00
Nanley Chery
a77f79f21e intel/blorp: Lower bit-casting code in blorp_copy()
We're going to add code between calling
blorp_surf_convert_to_uncompressed() and bit-casting determination.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
2026-03-11 00:36:17 +00:00
Nanley Chery
e0859f5ca1 intel/isl: Use a fixed alignment for single slices
We're going to start changing the surface format during blorp_copy().
Changing the surface format could lead to incorrect image alignment
parameters, so return a fixed halign and valign for images with a single
subresource. That's all that will be needed for the upcoming
blorp_copy() changes.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
2026-03-11 00:36:17 +00:00
Nanley Chery
b16b9b5591 intel/isl: Relax some alignments in get_image_surf()
Aux-tt alignment only applies to the beginning of the resource. Drop it
if we're pointing to an image that is not in the first tile of the
image. Likewise for the alignment we add for sequential multi-engine
access.

We allow sparse on 1D images. When getting an image from such a surface,
the alignment likely won't be aligned to 64KB. So, in this case, remove
the flag to avoid the alignment expectation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
2026-03-11 00:36:16 +00:00
Nanley Chery
8d82d06cbc intel/isl: Generalize and move some Yf/Ys miptail limits
Increase the scope of Yf/Ys miptail workarounds to drop the dependency
on format type (compressed or uncompressed) and make this information
more publically accessible. If I recall correctly, the affected tests
only performed blorp_copy() uploads and downloads and never accessed
images with compressed formats. So, we likely should be increasing the
scope.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
2026-03-11 00:36:16 +00:00
Nanley Chery
9cbf14690f intel/isl: Increase 3D miptail workaround scope
Fixes the following test case on ICL:

$ INTEL_DEBUG=noccs ./deqp-vk -n
  dEQP-VK.api.image_clearing.core.clear_color_image.3d.optimal.
  single_layer.r32g32b32a32_uint

Fixes: 78e24605db ("intel/isl: Reduce scope of Yf-disabling workaround")
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
2026-03-11 00:36:15 +00:00
Nanley Chery
27d515772e intel/isl: Replace mc_format with aux_format
We're going to be changing the surface format of images but need to
maintain a consistent render compression format to properly
encode/decode. Generalize and use the field that was previously specific
to ISL_AUX_USAGE_MC.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>
2026-03-11 00:36:15 +00:00
Sagar Ghuge
cb423ee636 anv: Fix Wa_14021821874, Wa_14018813551, Wa_14026600921
WA states that we need to allocate maximum number of stackIDs per DSS
from RT_DISPATCH_GLOBALS to 2048.

We can still throttle/control the CFE_STATE::StackID to be in range
specified by the field.

This does impact performance having CFE_STATE::stackIDs capped to 2K
by default. More the outstanding ray queries, larger the working set and
have more impact on cache hit rate.

This affect performance on Xe2+ onwards:
* Boundary Benchmark:            36.2%
* Solar Bay extreme:             9.8%
* Hitman world of assassination: 3.9%

Fixes: c1a44e8d43 ("anv: force StackIDControl value for Wa_14021821874")
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40310>
2026-03-10 22:41:54 +00:00
Mike Blumenkrantz
72bf7ad701 vk/cmd_queue: generate CmdPushConstants2
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40268>
2026-03-10 21:49:18 +00:00
Mike Blumenkrantz
47e78e60e3 vk/cmd_queue: generate the rest of the descriptor functions
this special cases the pData for template updating since it's a weird
one-off case where all the data needs to be copied

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40268>
2026-03-10 21:49:18 +00:00
Mike Blumenkrantz
c35499e99d vk/cmd_queue: move pipeline layout refs into builder
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40268>
2026-03-10 21:49:18 +00:00
Mike Blumenkrantz
dbbd5360f0 vk/cmd_queue: generate CmdPushDescriptorSet
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40268>
2026-03-10 21:49:17 +00:00
Mike Blumenkrantz
5f8b244ce6 vk/cmd_queue: generate CmdBindDescriptorSets
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40268>
2026-03-10 21:49:17 +00:00
Mike Blumenkrantz
f10f9d45bc vk/cmd_queue: pass command to struct copying methods
no functional changes

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40268>
2026-03-10 21:49:16 +00:00
Mike Blumenkrantz
9f272af038 vk/cmd_queue: return cmd instead of error code
the error code is always the same, and this is a bit more logical

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40268>
2026-03-10 21:49:15 +00:00
Mike Blumenkrantz
e3be0e23f9 vk/cmd_queue: handle descriptor layout refcounting
this enables supporting pnexted VkPipelineLayoutCreateInfo from vkCmdPushDescriptorSetWithTemplate2

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40268>
2026-03-10 21:49:15 +00:00
Mike Blumenkrantz
0c22630ba2 vk/cmd_queue: use arrays to directly manage refcounting
this allows deleting driver_free_cb and elimination of another iteration
of each cmdbuf on reset/free

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40268>
2026-03-10 21:49:14 +00:00
Faith Ekstrand
c0947d1cb4 pan/bi: Delete the b32csel special case and assert sizes match
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40307>
2026-03-10 20:54:44 +00:00
Faith Ekstrand
08c437f644 pan/bi: Be more careful about bit sizes in b2f lowering
Fixes: 21bdee7bcc ("pan/bi: Switch to lower_bool_to_bitsize")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40307>
2026-03-10 20:54:44 +00:00
Faith Ekstrand
5de5987678 nir,panfrost: Move lower_bool_to_bitsize to panfrost
It's the only driver that uses the pass so it may as well go there.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40307>
2026-03-10 20:54:44 +00:00
Faith Ekstrand
3fd471dca5 nir/lower_bool_to_bitsize: Make all bN_csel sources match
Previously, we assumed that the selector for bcsel could be whatever,
regardless of the bit sizes of the data and we'd just fix it in the
back-end.  This works okay for scalars but falls over the moment we
vectorize because all our vector handling assumes bit sizes match.
Since matching bit sizes is what the hardware wants anyway, it's better
to do the right thing in NIR and hope copy-propagation can fold in
conversions if needed.

Unfortunately, copy prop isn't that smart yet so this does hurt a bit:

    Instrs: 1193679 -> 1198086 (+0.37%); split: -0.06%, +0.43%
    CodeSize: 11915136 -> 11950592 (+0.30%); split: -0.05%, +0.34%
    Full: 160985 -> 160941 (-0.03%); split: -0.04%, +0.01%
    Estimated normalized CVT cycles: 4456.938557000181 -> 4480.876069000186 (+0.54%); split: -0.13%, +0.67%
    Estimated normalized SFU cycles: 6350.9375 -> 6392.21875 (+0.65%)
    Estimated normalized Load/Store cycles: 205773.0 -> 205795.0 (+0.01%)
    Maximum number of threads: 12864 -> 12863 (-0.01%)
    Number of spill instructions: 22487 -> 22489 (+0.01%)
    Number of fill instructions: 52179 -> 52219 (+0.08%)

Hurt shaders:

    google-meet-clvk/BgBlur
    google-meet-clvk/Relight
    parallel-rdp/small_subgroup
    parallel-rdp/small_uber_subgroup

The proper solution here is to teach copy-prop about this stuff so that
it can propagate swizzles into ALU ops when they're supported:
https://gitlab.freedesktop.org/panfrost/mesa/-/issues/265

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14945
Cc: mesa-stable
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40307>
2026-03-10 20:54:43 +00:00
Faith Ekstrand
6fb3995659 etnaviv: Call lower_bool_to_int32 not to_bitsize
It calls both for some reason but never handles any other booleans than
32-bit.  This was probably a mistake.

Fixes: e63a7882a0 ("etnaviv: call nir_lower_bool_to_bitsize")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40307>
2026-03-10 20:54:43 +00:00
Mary Guillemard
8f2eeee7ba vulkan: Do not override the shader_flags in case of no task shader
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This should be doing a or and not an assign.
This fixes issues on NVK with mesh stages on DGC.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 9308e8d90d ("vulkan: Add generic graphics and compute VkPipeline implementations")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40266>
2026-03-10 20:03:56 +00:00
Antonino Maniscalco
d526bbc29b zink: don't care about generated gs output primitive
Zink uses the output primitive of the last vertex stage when deciding
the raster primitive. When we generate the gs the output primitive
depends on the raster primitive.

Not only does the generated gs output primitive have no value in chosing
the raster primitive, it can also get us stuck with the last raster
primitve which is of course incorrect.

Ignore it for generated shaders.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32399>
2026-03-10 19:21:08 +00:00
Lionel Landwerlin
df06d117c5 anv: fix internal compute shader constant data pull
Forgot to update this path that must now use the new intrinsic.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15012
Fixes: 9f2215b480 ("anv/brw: remove push constant load emulation from the backend compiler")
Tested-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40308>
2026-03-10 18:24:04 +00:00