Commit graph

219338 commits

Author SHA1 Message Date
Collabora's Gfx CI Team
dfb4f90a51 Uprev Vulkan Validation Layers
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
https://github.com/KhronosGroup/Vulkan-ValidationLayers/compare/snapshot-2026wk07...f020266adee4bb87e8fde219f6fb31f8f141213e

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40032>
2026-03-03 17:24:54 +00:00
Jordan Justen
059dea672e intel/genxml: Update README notes on hardware version numbers
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40103>
2026-03-03 17:03:15 +00:00
Jordan Justen
22796f6cb1 intel/genxml: Start Xe3P (GFX_VERx10 == 350) support (xe3p.xml, xe3p_rt.xml)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40103>
2026-03-03 17:03:15 +00:00
Jordan Justen
475bc596ea intel/genxml: Rename Xe3 genxml to xe3.xml and xe3_rt.xml
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40103>
2026-03-03 17:03:15 +00:00
Jordan Justen
edff8c3ffa intel/genxml: Rename Xe2 genxml to xe2.xml and xe2_rt.xml
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40103>
2026-03-03 17:03:14 +00:00
Jordan Justen
e0f7f44ab8 intel/decoder: Use array of filenames in get_embedded_xml_data_by_name()
The code is a bit simpler to follow. We add new files so rarely, that
it shouldn't be much of a burden add new items.

Reworks:
 * Iván Briano recommended to use array of names rather than more
   complicated filename parsing.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40103>
2026-03-03 17:03:14 +00:00
Karol Herbst
bba6e40ea7 asahi: support subgroup_rotate
This enables cl_khr_subgroup_rotate

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37169>
2026-03-03 16:31:32 +00:00
Karol Herbst
ca30514389 rusticl: support more subgroup extensions
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37169>
2026-03-03 16:31:32 +00:00
Karol Herbst
e167bdf4ac zink: implement subgroup rotate
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37169>
2026-03-03 16:31:32 +00:00
Karol Herbst
db6394babc ac/llvm: handle int8 inside ac_build_optimization_barrier
Rusticl wants to support more subgroup operations and with CL they use
8bit integer sources as well.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37169>
2026-03-03 16:31:31 +00:00
Karol Herbst
dee90bf617 zink: handle drivers with multiple subgroup sizes correctly
This makes use of VK_KHR_pipeline_executable_properties and
VK_EXT_subgroup_size_control to query supported subgroup sizes and to
query which one the driver picks for a given pipeline.

This fixes OpenCL subgroups on drivers with multiple supported subgroup
sizes.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37169>
2026-03-03 16:31:31 +00:00
Lionel Landwerlin
38ef732169 anv: dirty all push constant stages in simple shader
Above we're reprogramming push constants as well at a couple of
workarounds that require dirtying all stages.

cmd_buffer->state.gfx.push_constant_stages was already set in the
above function.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4fa1eddb4c ("anv: optimize binding table flushing")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14953
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40198>
2026-03-03 16:08:12 +00:00
Pavel Ondračka
85fdbd9bf2 r300/ci: enable HiZ in CI
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This might introduce some flakiness due to the inherent "only one process
can have HyperZ lock" feature. So it might not catch regressions reliably,
but probably better than no coverage at all.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39914>
2026-03-03 15:00:36 +00:00
Pavel Ondračka
b0f019f8cf r300: disable HiZ for PIPE_FUNC_ALWAYS
AMD docs support this:
R5xx Acceleration v1.5 says safest handling for ZFUNC changes is to disable
HiZ except specific LESS/LEQUAL and GREATER/GEQUAL transitions.
ATI OpenGL Programming and Optimization Guide advises avoiding ALWAYS when
trying to benefit from HiZ so that would imply fglrx also disables HiZ
there.

On RV530 this fixes the following dEQPs:
dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.43
dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.74

Fixes: 12dcbd5954 ("r300g: enable Hyper-Z by default on r500")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8093
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39914>
2026-03-03 15:00:36 +00:00
David Rosca
55bab89951 vl: Also disable MPEG2 Main profile when mpeg12 decode is disabled
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes: f4959c16c8 ("meson: add mpeg12dec as a video-codec")
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39672>
2026-03-03 12:15:20 +00:00
Konstantin Seurer
d363381d18 vulkan/cmd_queue: Don't explicitly set struct members to NULL
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
They are already initialized by the memcpy above.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39902>
2026-03-03 10:19:31 +00:00
Konstantin Seurer
f2bb6103c3 vulkan/cmd_queue: Rework copy codegen
The new code handles pNext chanis correctly.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39902>
2026-03-03 10:19:31 +00:00
Jose Maria Casanova Crespo
ecb6c5d555 vc4: flush write jobs before BO replacement in DISCARD_WHOLE path
The DISCARD_WHOLE_RESOURCE path in vc4_map_usage_prep() replaces the
resource's BO with vc4_resource_bo_alloc(). As the RCL resolves
rsc->bo at job submit in vc4_submit_setup_rcl_surface(), any pending
write job would store to the new BO instead of the old one, corrupting
the new written data.

This is the same bug that was fixed in v3d in the previous commit.

Fixes: 18ccda7b86 ("vc4: When asked to discard-map a whole resource, discard it.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40180>
2026-03-03 09:32:47 +00:00
Jose Maria Casanova Crespo
1eaa46da09 v3d: flush write jobs before BO replacement in DISCARD_WHOLE path
The DISCARD_WHOLE_RESOURCE path in v3d_map_usage_prep() replaces the
resource's BO with v3d_resource_bo_alloc(). As the RCL resolves
rsc->bo at job submit in emit_rcl() any pending write job would
store to the new BO instead of the old one, corrupting the new
written data.

This is adressed by flushing all pending write jobs affecting the
resource before replacing its BO.

This fixes multiple tests where data copied to a renderbuffer was
overwritten by a previos GPU clear. Test are from the subgroup:
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.*

Fixes: 45bb8f2957 ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40180>
2026-03-03 09:32:47 +00:00
Rhys Perry
a5f63424c7 ac/gpu_info: print most of ac_compiler_info
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Excluding fields which are copied in ac_gpu_info.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042>
2026-03-03 08:50:12 +00:00
Rhys Perry
5c3b5688a1 amd: rename ac_cu_info to ac_compiler_info
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042>
2026-03-03 08:50:12 +00:00
Rhys Perry
ab4b3b7737 ac/nir/ngg: add ac_cu_info shortcut
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042>
2026-03-03 08:50:12 +00:00
Rhys Perry
a65089dfce ac/nir: pass ac_cu_info to ac_nir_compute_tess_wg_info
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042>
2026-03-03 08:50:11 +00:00
Rhys Perry
8801ca188d ac/nir: don't pass radeon_info to ac_nir_set_options
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042>
2026-03-03 08:50:10 +00:00
Rhys Perry
5a8a7dbb22 ac/nir: don't pass radeon_info to NGG lowering
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042>
2026-03-03 08:50:09 +00:00
Rhys Perry
36feec61c8 ac/nir: use ac_nir_lower_ngg_options for ac_nir_lower_ngg_mesh
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042>
2026-03-03 08:50:09 +00:00
Rhys Perry
2c7b8e6786 ac/gpu_info: move some NGG flags to ac_cu_info
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042>
2026-03-03 08:50:09 +00:00
Maaz Mombasawala
2a3c9d5ac6 svga: Use gfx-ci kernel in CI
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
We currently use a custom kernel in blu/linux to run the ci for svga,
instead switch to the kernel used by other drivers.

Signed-off-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40016>
2026-03-03 00:09:59 +00:00
irql-notlessorequal
74e73b0e21 docs/features: Mark VK_KHR_maintenance6 complete for hasvk
Also does the same for VK_KHR_maintenance5 as that was missed.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37434>
2026-03-02 23:16:11 +00:00
irql-notlessorequal
0988c68eeb hasvk: Advertise VK_KHR_maintenance6
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37434>
2026-03-02 23:16:11 +00:00
irql-notlessorequal
971cca33df hasvk: Add support for Cmd*DescriptorSet*2KHR
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37434>
2026-03-02 23:16:11 +00:00
irql-notlessorequal
9b8cb10e1d hasvk: Handle VkBindMemoryStatusKHR on buffer/image memory bind
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37434>
2026-03-02 23:16:11 +00:00
irql-notlessorequal
da9e9329ec hasvk: Remove no longer valid assert
VK_KHR_maintenance6 allows creating uncompressed views of compressed images with multiple layers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37434>
2026-03-02 23:16:11 +00:00
irql-notlessorequal
879bbf4e37 hasvk: Allow NULL index buffers
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37434>
2026-03-02 23:16:10 +00:00
Rob Clark
dcbb16a488 ir3: Initialize debug once
Reduce tsan spam.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40178>
2026-03-02 22:47:27 +00:00
Rob Clark
23d28e605e freedreno: Initialize debug once
Technically overwriting fd_mesa_debug with the same value should be
harmless.  But tsan doesn't know this.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40178>
2026-03-02 22:47:27 +00:00
Rob Clark
8bfe7b32c9 freedreno/drm: bo cache logging vs tsan
Make unlocked update of counters conditional on cache logging being
enabled, to reduce tsan noise.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40178>
2026-03-02 22:47:27 +00:00
Daivik Bhatia
bb02e4bc5c v3d/v3dv: drop manual log2_tile_width/height asserts.
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Move the log2_tile_width/height asserts to pack header functions.

Reviewed-by: Iago Toral Quiroga itoral@igalia.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40108>
2026-03-02 21:45:41 +00:00
Christian Gmeiner
4a5123241f panvk: Advertise VK_ARM_scheduling_controls on CSF
Expose the extension, feature bit, and properties for
VK_ARM_scheduling_controls on CSF architectures (>= v10).

The only supported scheduling control flag is
VK_PHYSICAL_DEVICE_SCHEDULING_CONTROLS_SHADER_CORE_COUNT_ARM, which
allows applications to limit the number of shader cores used by a
queue via VkDeviceQueueShaderCoreControlCreateInfoARM.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40063>
2026-03-02 20:22:07 +00:00
Christian Gmeiner
fff1410c8a panvk: Use per-queue shader core count for CSF group creation
Extract VkDeviceQueueShaderCoreControlCreateInfoARM from the queue
create info pNext chain and use shaderCoreCount to limit
max_compute_cores and max_fragment_cores in the panthor group create
ioctl. The core masks remain unchanged, letting the kernel pick which
cores to schedule on.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40063>
2026-03-02 20:22:07 +00:00
Dylan Baker
a8ba682919 anv: assert we haven't gone over the maximum number of push_buffers
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Coverity notes that we break out of the loop walking the analysis_ranges
early if n_push_ranges >= max_push_buffers, so it notes that
n_push_ranges could already be 3 or 4 (depending on whether we're doing
mesh), and that then if we need the padding we insert another, which
would write past the end of the array.

I don't think this is actually possible in practice, but we can add an
assert to both keep coverity happy and detect that this has actually
happened.

CID: 1681478
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40147>
2026-03-02 19:50:17 +00:00
Dylan Baker
d7f67171ef intel/tools: Don't allocate in noop_drm_shim until after error checking
This avoids the potential to have allocated memory that needs to be
freed, but currently isn't.

CID: 1681055
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40148>
2026-03-02 19:18:33 +00:00
Jesse Natalie
9e277ed2b6 d3d12: Fix importing external resources
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes: 97061dd7 ("d3d12: Add support for Xbox GDK.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40152>
2026-03-02 17:54:49 +00:00
Georg Lehmann
7194dfcc2c nir/opt_algebraic: optimize b2i(a) * b to bcsel
Foz-DB Navi48:
Totals from 3180 (2.77% of 114655) affected shaders:
MaxWaves: 85526 -> 85446 (-0.09%)
Instrs: 2681446 -> 2678641 (-0.10%); split: -0.17%, +0.07%
CodeSize: 14295536 -> 14284628 (-0.08%); split: -0.13%, +0.05%
VGPRs: 174792 -> 174636 (-0.09%); split: -0.16%, +0.07%
SpillSGPRs: 306 -> 308 (+0.65%)
Latency: 14078973 -> 14070122 (-0.06%); split: -0.07%, +0.01%
InvThroughput: 2774242 -> 2764051 (-0.37%); split: -0.37%, +0.00%
VClause: 41744 -> 41734 (-0.02%); split: -0.10%, +0.07%
SClause: 58176 -> 58154 (-0.04%); split: -0.05%, +0.01%
Copies: 222967 -> 223108 (+0.06%); split: -0.14%, +0.20%
Branches: 57317 -> 57322 (+0.01%)
PreSGPRs: 140454 -> 140451 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 131649 -> 131540 (-0.08%); split: -0.09%, +0.01%
VALU: 1509318 -> 1505443 (-0.26%); split: -0.26%, +0.00%
SALU: 384419 -> 385838 (+0.37%); split: -0.01%, +0.38%
VOPD: 13272 -> 13286 (+0.11%); split: +0.14%, -0.03%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40160>
2026-03-02 15:58:30 +00:00
Georg Lehmann
9f1a446107 ci: update expectations
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:36 +00:00
Georg Lehmann
3d304d5647 nir/opt_algebraic: remove is_used_once on outer instruction
This just prevents useful optimizations.
is_used_once only makes sense on inner instructions, to prevent
creating more new instructions than will be removed.

Foz-DB Navi48:
Totals from 16989 (14.82% of 114655) affected shaders:
MaxWaves: 434379 -> 434353 (-0.01%); split: +0.01%, -0.01%
Instrs: 29030794 -> 29022514 (-0.03%); split: -0.07%, +0.04%
CodeSize: 155293092 -> 155262816 (-0.02%); split: -0.05%, +0.03%
VGPRs: 1093980 -> 1094088 (+0.01%); split: -0.01%, +0.02%
SpillSGPRs: 9801 -> 9803 (+0.02%); split: -0.03%, +0.05%
Latency: 356327270 -> 356283384 (-0.01%); split: -0.03%, +0.02%
InvThroughput: 58239439 -> 58229374 (-0.02%); split: -0.03%, +0.01%
VClause: 451716 -> 451815 (+0.02%); split: -0.07%, +0.09%
SClause: 654614 -> 654556 (-0.01%); split: -0.03%, +0.03%
Copies: 1809805 -> 1809297 (-0.03%); split: -0.20%, +0.17%
Branches: 552382 -> 552384 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 947188 -> 947224 (+0.00%); split: -0.01%, +0.02%
PreVGPRs: 879583 -> 880173 (+0.07%); split: -0.01%, +0.08%
VALU: 16317859 -> 16309975 (-0.05%); split: -0.07%, +0.02%
SALU: 4256121 -> 4259315 (+0.08%); split: -0.05%, +0.12%
SMEM: 1067069 -> 1067070 (+0.00%)
VOPD: 440855 -> 440792 (-0.01%); split: +0.05%, -0.07%

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:36 +00:00
Georg Lehmann
41878e5714 nir_opt_algebraic: remove unneeded is_not_const
These were needed when we didn't constant fold inside nir_search,
to prevent infinite loops.
But now all they do is slow down pattern matching.

Foz-DB Navi48:
Totals from 107 (0.09% of 114655) affected shaders:
Instrs: 162439 -> 162481 (+0.03%); split: -0.01%, +0.03%
CodeSize: 943056 -> 942988 (-0.01%); split: -0.03%, +0.02%
Latency: 971667 -> 970865 (-0.08%); split: -0.09%, +0.00%
InvThroughput: 164452 -> 164521 (+0.04%); split: -0.02%, +0.07%
Copies: 7980 -> 7982 (+0.03%)
VALU: 103572 -> 103566 (-0.01%); split: -0.05%, +0.04%
SALU: 12825 -> 12878 (+0.41%)
VOPD: 5235 -> 5190 (-0.86%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:36 +00:00
Georg Lehmann
374cbc17a4 nir_opt_algebraic: reassociate fadd into ffma where one factor is a constant
This restriction doesn't really make sense, probably an accident.

Foz-DB Navi48:
Totals from 2290 (2.00% of 114655) affected shaders:
MaxWaves: 57496 -> 57510 (+0.02%); split: +0.06%, -0.03%
Instrs: 2817419 -> 2816209 (-0.04%); split: -0.12%, +0.08%
CodeSize: 15218816 -> 15220576 (+0.01%); split: -0.09%, +0.10%
VGPRs: 147456 -> 147384 (-0.05%); split: -0.07%, +0.02%
Latency: 13757114 -> 13751833 (-0.04%); split: -0.13%, +0.09%
InvThroughput: 2463343 -> 2462482 (-0.03%); split: -0.07%, +0.04%
VClause: 40137 -> 40153 (+0.04%); split: -0.07%, +0.11%
SClause: 57351 -> 57385 (+0.06%); split: -0.12%, +0.18%
Copies: 135482 -> 136258 (+0.57%); split: -0.22%, +0.79%
Branches: 30886 -> 30894 (+0.03%)
PreSGPRs: 113470 -> 113462 (-0.01%); split: -0.03%, +0.02%
PreVGPRs: 117554 -> 117591 (+0.03%); split: -0.01%, +0.04%
VALU: 1682734 -> 1681557 (-0.07%); split: -0.10%, +0.03%
SALU: 390685 -> 391301 (+0.16%); split: -0.07%, +0.22%
VOPD: 6159 -> 6254 (+1.54%); split: +1.72%, -0.18%

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:36 +00:00
Georg Lehmann
b949122908 nir/opt_algebraic: remove loops for b2f/b2i equality handling
The feq/fneu patterns already existed, and there is no reason to use bit size based
loops here.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:36 +00:00
Georg Lehmann
83091276f8 nir_opt_algebraic: remove more specific cmp+bcsel opts
Only some minimal difference from pattern ordering:

Foz-DB Navi48:
Totals from 3 (0.00% of 114655) affected shaders:
Instrs: 4556 -> 4533 (-0.50%)
CodeSize: 23716 -> 23608 (-0.46%)
Latency: 27424 -> 26336 (-3.97%)
InvThroughput: 4674 -> 4672 (-0.04%)
SClause: 107 -> 105 (-1.87%)
Copies: 351 -> 346 (-1.42%)
Branches: 130 -> 126 (-3.08%)
VALU: 2598 -> 2595 (-0.12%)
SALU: 561 -> 555 (-1.07%)
SMEM: 169 -> 167 (-1.18%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:36 +00:00